Question: Which Expression Units To Use, FPKM Or RPKM?
5.2 years ago by
biorepine1.4k
Spain
biorepine1.4k wrote:

Dear Biostars! I think this is one of the common problems in RNA-Seq expression analysis: which expression units to use, FPKM or RPKM? People who use Cufflinks end up with FPKM, and people who use ERANGE end up with RPKM. The Cufflinks documentation has a nice explanation of why FPKM saves us from the skewed expression values reported by other software, especially with paired-end read data:

They're almost the same thing. RPKM stands for Reads Per Kilobase of transcript per Million mapped reads. FPKM stands for Fragments Per Kilobase of transcript per Million mapped reads. In RNA-Seq, the relative expression of a transcript is proportional to the number of cDNA fragments that originate from it. Paired-end RNA-Seq experiments produce two reads per fragment, but that doesn't necessarily mean that both reads will be mappable. For example, the second read is of poor quality. If we were to count reads rather than fragments, we might double-count some fragments but not others, leading to a skewed expression value. Thus, FPKM is calculated by counting fragments, not reads.
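The double-counting effect described above can be sketched with a toy example. This is only an illustration under made-up assumptions: the gene names, lengths, and mapping outcomes are hypothetical, and the helper `per_kilobase_per_million` is introduced here, not taken from Cufflinks or ERANGE. A fragment is counted once if at least one of its mates maps; reads are counted once per mapped mate.

```python
# Toy illustration: counting reads vs. fragments for paired-end data.
# Gene names, lengths and mapping outcomes are invented for this sketch.

def per_kilobase_per_million(count, length_bp, total_count):
    """Count per kilobase of transcript per million mapped counts."""
    return count * 1e9 / (length_bp * total_count)

# Each fragment is a pair of flags: (mate1_mapped, mate2_mapped).
fragments = {
    "geneA": [(True, True)] * 100,   # both mates map
    "geneB": [(True, False)] * 100,  # second mate fails to map
}
lengths_bp = {"geneA": 1000, "geneB": 1000}

# One fragment per pair with at least one mapped mate.
frag_counts = {
    g: sum(1 for m1, m2 in pairs if m1 or m2)
    for g, pairs in fragments.items()
}
# One read per mapped mate, so fully mapped pairs count twice.
read_counts = {
    g: sum(int(m1) + int(m2) for m1, m2 in pairs)
    for g, pairs in fragments.items()
}

total_frags = sum(frag_counts.values())
total_reads = sum(read_counts.values())

fpkm = {
    g: per_kilobase_per_million(frag_counts[g], lengths_bp[g], total_frags)
    for g in fragments
}
rpkm = {
    g: per_kilobase_per_million(read_counts[g], lengths_bp[g], total_reads)
    for g in fragments
}

for g in fragments:
    print(g, "FPKM =", round(fpkm[g]), "RPKM =", round(rpkm[g]))
```

Both toy genes produced the same number of fragments, so their FPKM values are equal, while read counting makes geneA look twice as expressed as geneB simply because its second mates happened to map.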

However, after analyzing paired-end, long, polyA+ RNA-Seq datasets from around 10 tissues (mapped with TopHat and Bowtie), I noticed that the same genes that have an FPKM between 0 and 1 have an RPKM of ~200. I think this difference could cause serious problems in defining accurate expression units and in determining the number of expressed, up-regulated, or down-regulated genes.

I would appreciate any answer or comment on using RPKM over FPKM, or vice versa. Gracias! :)

rpkm fpkm rna-seq • 66k views
modified 9 months ago by Renesh1.2k • written 5.2 years ago by biorepine1.4k

Just to make sure: if I have paired-end reads and one read maps while the other does not, I count that as one fragment? And if both reads map, I also count that as one fragment? (Otherwise I do not understand how we could double-count some fragments when counting raw reads.) Thank you very much for the explanation.

written 5.2 years ago by Biomonika (Noolean)3.0k
5.2 years ago by
seidel6.5k
United States
seidel6.5k wrote:

I think FPKM is the conceptually cleaner way to go, and thus is the preferred term. The rationale is that one is inferring the expression level of a gene (the concentration of a transcript) based on observations of fragments from that transcript. Whether the presence of a fragment is quantified from 1 read or 2 reads is simply a technical concern, outside of the unit definition. Granted, you indicated a result where software reports different values on a data set for the two different units, but I would argue that's because of messy implementation. A single read is evidence of a fragment, and 2 paired-end reads are also evidence of a fragment. Evidence of a fragment is used to count transcripts. Since both infer fragment counts, I think FPKM is the more general and appropriate term. (That's my opinion, though I'm not sure it helps with your particular quandary.)

written 5.2 years ago by seidel6.5k

If this is the case, I think the recent ENCODE paper (http://www.nature.com/nature/journal/v489/n7414/abs/nature11233.html) used RPKM instead of FPKM for their paired-end RNA-Seq data, and I guess the main reason they found that most of the transcriptome is expressed is that they used RPKMs instead of FPKMs. Uffff! What the hell!

written 5.2 years ago by biorepine1.4k
5.2 years ago by
matted6.7k
Boston, United States
matted6.7k wrote:

I think there's some confusion in the question and comments here. FPKM are the "fancy" units that Cufflinks uses specifically to report its probabilistic estimates of isoform abundances. They don't map directly from individual reads, though of course they are estimated from the read data. The F instead of R unifies the terminology so that it also covers data from paired (and higher-order) reads.

For more on this topic see "Meaning Of Fpkm Value Used By Cufflinks" and here.

So to me, "should I use FPKM" is more accurately "should I use Cufflinks." RPKM would typically be used by a more "direct" analysis that maps reads to specific single exons and yields an exon-level analysis, rather than a more complicated isoform-level analysis with advanced statistical techniques.

With that said, differences between FPKM and RPKM are most likely due to the complicated procedure that Cufflinks follows to estimate isoform abundance, rather than any paired vs. single counting issue.
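A rough bound is consistent with this. As a sketch only, assume both units are computed by simple counting over the same alignments with the same gene length (which is not what Cufflinks does). The symbols are introduced here for illustration: r_g and f_g are the mapped reads and fragments for gene g (with f_g > 0), R and F are the library totals, and L_g is the gene length in bases. Each counted fragment contributes one or two mapped reads.

```latex
\[
\mathrm{RPKM}_g = \frac{10^9\, r_g}{L_g\, R}, \qquad
\mathrm{FPKM}_g = \frac{10^9\, f_g}{L_g\, F}, \qquad
\frac{\mathrm{RPKM}_g}{\mathrm{FPKM}_g} = \frac{r_g}{f_g}\cdot\frac{F}{R}
\]
\[
f_g \le r_g \le 2 f_g, \quad F \le R \le 2F
\;\Longrightarrow\;
\frac{1}{2} \le \frac{\mathrm{RPKM}_g}{\mathrm{FPKM}_g} \le 2
\]
```

Under these assumptions the two units can differ by at most a factor of two, so a gene with FPKM below 1 and RPKM near 200 cannot be explained by read-versus-fragment counting alone.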

Furthermore, I don't think the FPKM vs. RPKM question has any direct bearing on the ENCODE results, as suggested in a comment above.

written 5.2 years ago by matted6.7k

You're mixing things up here. The Cufflinks explanation quoted in the original question explains very clearly what FPKMs are. This unit is not specific to Cufflinks, and can easily be calculated manually for genes. It is meant to correct a small glitch in the RPKM calculation when using paired-end reads. This is explained in this video of a talk by Lior Pachter (https://www.youtube.com/watch?v=5NiFibnbE8o at 34:17).

What is specific to Cufflinks is that it gives FPKM measurements at the transcript level. To do so, it uses a complex methodology to deconvolute the reads mapping to a given gene model into the expression levels of all of its transcripts. FPKM is merely the unit in which the authors chose to report their deconvoluted expression values.

I hope this clarifies things. In summary, "should I use FPKM" is not the same as "should I use Cufflinks".

 

written 3.0 years ago by julien.roux80
9 months ago by
Renesh1.2k
United States
Renesh1.2k wrote:

Please read this article:

http://bioinfogeek.over-blog.com/2017/09/gene-expression-units-explained-rpm-rpkm-fpkm-and-tpm.html

written 9 months ago by Renesh1.2k