Question: RNA Normalization from DESeq to RPKM
alpha09 (Norway) wrote 4.2 years ago:

I have RNA-Seq data normalized with DESeq. I want to convert it into RPKM.

Kindly let me know what I should do.

Is there an R package for this?

Thank you.

rna-seq next-gen R • 2.6k views

Total gene read counts were normalized for library size using the DESeq method (size factors):

Feature_ID           M1_1              M1_2              M2_1              M2_2              M3_1              M3_2              M4_1              M4_2              M5_1              M5_2              M6_1              M6_2              M7_1              M7_2              M8_1              M8_2              M9_1              M9_2              M10_1             M10_2
AT1G01010            82.76             155.63            82.04             120.97            96.56             89.69             148.62            88.95             56.90             112.33            122.04            127.13            119.41            107.63            125.20            105.24            98.49             94.63             55.92             41.66
AT1G01020            287.06            233.44            232.45            326.93            261.60            194.49            478.16            424.96            108.23            241.24            296.39            327.82            405.37            459.73            333.87            300.69            318.63            356.01            539.58            682.54
AT1G01030            20.69             1.35              28.49             26.03             17.96             19.15             10.77             28.33             1.12              7.37              33.62             27.32             10.47             8.21              1.39              8.10              39.59             34.92             50.03             77.97
AT1G01040            1100.40           521.02            961.69            1202.83           898.21            1179.01           1620.81           1694.58           706.82            511.95            1414.69           1671.66           1864.49           1848.93           1295.15           1350.78           1352.74           1394.73           1789.44           1843.60
AT1G01046            0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00              0.00
AT1G01050            1705.56           2801.34           2171.78           1633.89           1871.64           1609.30           1285.88           1171.45           2309.58           2438.19           1463.26           1272.39           1210.87           1132.89           1499.65           1391.25           1760.20           1696.66           1372.49           1527.43

This is the file I am using, Devon. It is A. thaliana data. So if I simply divide each count by the transcript length in kb, will it be converted to RPKM?

modified 24 days ago by RamRS • written 4.2 years ago by alpha09

Divide by a million too, that'll be the M part in RPKM.

modified 24 days ago by RamRS • written 4.2 years ago by Devon Ryan

Thank you, Devon.

written 4.2 years ago by alpha09

Thank you so much.

It helped me a lot.

written 4.2 years ago by alpha09

Devon Ryan (Freiburg, Germany) wrote 4.2 years ago:

Take the counts, divide them by the gene length in kb (you can probably download this; if not, search for how to generate it from a GTF file), and then divide by the number of mapped reads in millions.
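One way to generate gene lengths from a GTF file is to merge the exon intervals of each gene and sum the merged lengths. The sketch below assumes a standard 9-column, tab-separated GTF with `exon` features and a `gene_id` attribute. The function name `gene_lengths_from_gtf` and the example records are made up for illustration and are not real annotation. "Gene length" can also be defined in other ways (for example, the longest transcript), so treat this as one reasonable choice.

```python
import collections


def gene_lengths_from_gtf(lines):
    """Return {gene_id: length in bp}, using the union of exon intervals per gene."""
    exons = collections.defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features contribute to the length
        start, end = int(fields[3]), int(fields[4])  # GTF is 1-based, inclusive
        gene_id = None
        for attr in fields[8].split(";"):
            parts = attr.strip().split(None, 1)
            if len(parts) == 2 and parts[0] == "gene_id":
                gene_id = parts[1].strip('"')
                break
        if gene_id is not None:
            exons[gene_id].append((start, end))

    lengths = {}
    for gene_id, intervals in exons.items():
        intervals.sort()
        total = 0
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:      # overlapping or adjacent: extend
                cur_end = max(cur_end, end)
            else:                         # gap: close the current block
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
        lengths[gene_id] = total
    return lengths


# Hypothetical records, not real coordinates.
example_gtf = [
    'chr1\tsrc\texon\t100\t199\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1";',
    'chr1\tsrc\texon\t150\t249\t.\t+\t.\tgene_id "G1"; transcript_id "G1.2";',
    'chr1\tsrc\texon\t300\t349\t.\t+\t.\tgene_id "G1"; transcript_id "G1.1";',
    'chr1\tsrc\tCDS\t120\t180\t.\t+\t0\tgene_id "G1"; transcript_id "G1.1";',
    'chr1\tsrc\texon\t10\t19\t.\t-\t.\tgene_id "G2"; transcript_id "G2.1";',
]
lengths = gene_lengths_from_gtf(example_gtf)
print(lengths)
```

For a real file, pass an open file handle instead of the list, e.g. `gene_lengths_from_gtf(open(path))`, where `path` is your own annotation file.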

For what it's worth, edgeR provides an rpkm() function, though once again you'll need to supply the gene lengths.
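The arithmetic itself is short enough to write out by hand. The Python sketch below only illustrates the formula described above and is not edgeR's implementation; the function names, counts, gene lengths and library sizes are all hypothetical.

```python
def rpkm(count, gene_length_bp, mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    length_kb = gene_length_bp / 1_000          # gene length in kb
    reads_millions = mapped_reads / 1_000_000   # library size in millions
    return count / length_kb / reads_millions


def rpkm_table(counts, lengths_bp, library_sizes):
    """Apply rpkm() to a {gene: [raw count per sample]} table."""
    return {
        gene: [rpkm(c, lengths_bp[gene], n) for c, n in zip(row, library_sizes)]
        for gene, row in counts.items()
    }


# Hypothetical raw counts for two samples, gene lengths in bp,
# and number of mapped reads per sample.
raw_counts = {"geneA": [500, 1000], "geneB": [0, 50]}
lengths_bp = {"geneA": 2000, "geneB": 500}
library_sizes = [20_000_000, 40_000_000]

result = rpkm_table(raw_counts, lengths_bp, library_sizes)
print(result)
```

In this toy example geneA has twice the raw count in the second sample, but that sample also has twice as many mapped reads, so both samples end up with the same value.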


Devon, in that case, would the normalized read count be the base mean value generated by DESeq for each experimental/control comparison?

modified 4.2 years ago • written 4.2 years ago by tiago211287

The normalized counts are per-sample.

written 4.2 years ago by Devon Ryan

BTW, I should note that if you input normalized counts then you can just divide by a million rather than number of mapped reads in millions. It's best to not adjust for library-size differences twice...
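Taken literally, that variant might look like the Python sketch below: the already-normalized count is divided by the gene length in kb and by a fixed constant, with no second per-sample library-size term. The function name `length_scaled`, the `divisor` parameter and the gene length are assumptions for illustration; the two input values are the AT1G01010 entries for M1_1 and M1_2 from the table earlier in the thread.

```python
def length_scaled(normalized_count, gene_length_bp, divisor=1_000_000):
    """Scale a size-factor-normalized count by gene length (kb) and a constant."""
    length_kb = gene_length_bp / 1_000
    # The divisor is the same for every sample, so library size is not
    # adjusted for a second time.
    return normalized_count / length_kb / divisor


# Normalized counts for AT1G01010 in samples M1_1 and M1_2 (from the table).
m1_1, m1_2 = 82.76, 155.63
gene_length_bp = 2000  # hypothetical length, NOT the real AT1G01010 length

scaled_1 = length_scaled(m1_1, gene_length_bp)
scaled_2 = length_scaled(m1_2, gene_length_bp)

# Between-sample ratios are unchanged, because length and divisor are shared.
print(scaled_1 / scaled_2, m1_1 / m1_2)
```

Because every sample is divided by the same constant, the comparison between samples is exactly the one the normalized counts already gave; only the comparison between genes of different lengths changes.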

written 4.2 years ago by Devon Ryan

If you input DESeq normalized counts, then the result is not RPKM (Reads Per Kilobase of transcript per Million mapped reads) but something like "Reads Per Kilobase of transcript per Million mapped reads on exons". I'm not saying it is a bad metric, but don't call this RPKM, to avoid confusion!

written 4.2 years ago by Carlo Yague

I hate to break it to you but it's quite likely that most published RPKM values are calculated in this manner. I agree that a different term should probably be used, but that ship has already sailed.

written 4.2 years ago by Devon Ryan

While this might be true, I don't think we should encourage the use of inaccurate or imprecise terms. This just adds to the general confusion around FPKM/RPKM/TPM and the like.

written 4.2 years ago by Carlo Yague