Question: On calculating RPKM values for RNA-seq data
Asked 2.8 years ago by Yongjie Zhang (UC Berkeley, USA / Shanxi Univ, China):

Hello,

I performed RNA-seq on a sample and want to calculate RPKM values. I have two questions.

1) Since paired-end Illumina sequencing was performed, can I use either single reads or fragments (i.e. paired reads) to calculate RPKM?

2) Suppose the total number of clean reads is A, of which B are mitochondrial reads. If I only want to calculate RPKM for mitochondrial genes, should I use A or B as the library size?

Thanks for any comments.

Yongjie


Comment by EagleEye (2.8 years ago):

To avoid bias, always consider the complete profile (all genes) and use the total number of reads mapped to those genes as the library size when calculating RPKM/FPKM.
Answer by dariober (WCIP, Glasgow, UK), 2.8 years ago:

"Can I use either single reads or fragments (i.e. paired reads) to calculate RPKM?"

What you definitely do not want to do is count read 1 and read 2 as two separate reads. Counting fragments (i.e. both mates mapping to the same gene in the correct orientation) should be more appropriate, but in practice I think counting only read 1 should give essentially the same result.
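To make the three counting schemes concrete, here is a minimal sketch on toy data. The fragment IDs, gene names and tuple layout are all hypothetical, and mate orientation is not modeled; a real analysis would use a read-counting tool on the aligned reads rather than this.

```python
from collections import Counter

# Toy alignments: (fragment_id, mate_number, gene). All values are hypothetical.
alignments = [
    ("frag1", 1, "geneA"), ("frag1", 2, "geneA"),
    ("frag2", 1, "geneA"), ("frag2", 2, "geneA"),
    ("frag3", 1, "geneB"), ("frag3", 2, "geneB"),
    ("frag4", 1, "geneA"), ("frag4", 2, "geneB"),  # mates hit different genes
]

# Scheme to avoid: every mate counted as its own read (double counting).
both_mates = Counter(gene for _, _, gene in alignments)

# Read 1 only: one count per fragment, taken from the first mate.
read1_only = Counter(gene for _, mate, gene in alignments if mate == 1)

# Fragments: count once, only when both mates map to the same gene.
mates_by_fragment = {}
for frag, mate, gene in alignments:
    mates_by_fragment.setdefault(frag, {})[mate] = gene

fragments = Counter(
    mates[1]
    for mates in mates_by_fragment.values()
    if 1 in mates and 2 in mates and mates[1] == mates[2]
)

print("both mates:", dict(both_mates))
print("read 1 only:", dict(read1_only))
print("fragments:", dict(fragments))
```

In this toy set, read 1 counts and fragment counts differ only for the fragment whose mates land on different genes, while counting both mates roughly doubles everything.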

"If I only want to calculate RPKM for mitochondrial genes, should I use A or B during calculation?"

I guess it depends on whether you are interested in the concentration of a mitochondrial gene among mitochondrial genes or among all genes. In the first case use the mitochondrial reads (B); otherwise use all reads (A).

For example, suppose one sample has 1M reads across the genome, 10k reads on chrM and 100 reads on gene X on chrM. Another sample has 1M reads across the genome, 1k reads on chrM and 100 reads on gene X. Genome-wide, the concentration of X is the same in the two samples (100/1M in both). But relative to the mitochondrial genome, the second sample is much richer in gene X (100/10k in the first sample vs 100/1k in the second).
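The same numbers can be run through the usual RPKM formula (reads on the gene × 10^9 / (gene length in bp × library size)) to see how the choice of denominator changes the result. This is only a sketch: the gene length of 1000 bp is an assumption made for illustration, since the example does not give one, and the `rpkm` helper is a hypothetical name.

```python
def rpkm(gene_reads, gene_length_bp, library_size):
    """Reads per kilobase of gene per million reads in the chosen library."""
    return gene_reads * 1e9 / (gene_length_bp * library_size)

# Assumed gene length; not stated in the example above.
GENE_X_LENGTH_BP = 1000

# Read counts taken from the example: genome-wide, chrM, and gene X.
samples = {
    "sample 1": {"genome": 1_000_000, "chrM": 10_000, "gene_x": 100},
    "sample 2": {"genome": 1_000_000, "chrM": 1_000, "gene_x": 100},
}

results = {}
for name, s in samples.items():
    results[name] = {
        # Library size = all reads (A in the question).
        "genome_wide": rpkm(s["gene_x"], GENE_X_LENGTH_BP, s["genome"]),
        # Library size = mitochondrial reads only (B in the question).
        "chrM_only": rpkm(s["gene_x"], GENE_X_LENGTH_BP, s["chrM"]),
    }
    print(name, results[name])
```

With the genome-wide denominator the two samples get the same value for gene X; with the chrM-only denominator the second sample comes out ten times higher, which is the difference described above.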


Reply by Yongjie Zhang (2.8 years ago):

Thank you, dariober. I noticed that some papers consider a gene expressed if its RPKM is > 0.2. Do you think this threshold (0.2) is widely accepted in the community? In my case, the RPKM of some mitochondrial genes will be lower than 0.2 if I use all reads instead of only the mitochondrial reads.