Question: How is FPKM variability handled?
5.3 years ago by
United States
NHEJ320 wrote:

I have seen a recurring trend that FPKM values tend to be highly variable across samples. For example, the same transcript of a given gene could have an FPKM of 10, 15, or 5 across three samples analyzed by Cufflinks.

How does one computationally handle the variability in FPKM value?  How "real" are these values and how much can one trust them in a lab setting?

rna-seq fpkm • 3.9k views
modified 5.3 years ago by Devon Ryan (94k) • written 5.3 years ago by NHEJ (320)
5.3 years ago by
Vishaka Datta80 wrote:

How much can one trust them: The Cufflinks output files (genes.fpkm_tracking, isoforms.fpkm_tracking) contain a confidence interval for each FPKM estimate (under the headings FPKM_conf_lo and FPKM_conf_hi). Similarly, the cuffdiff output contains the FPKM estimates and confidence intervals used for differential expression testing. These intervals give you an idea of the uncertainty in each estimate within a sample.
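As a rough illustration of using those columns, here is a minimal Python sketch that reads a tab-separated table in the fpkm_tracking layout and reports the width of each confidence interval relative to the estimate. The gene names and numbers are made up, and the example keeps only a few of the columns a real genes.fpkm_tracking file has; the "relative width" summary is my own choice of metric, not something Cufflinks reports.

```python
import csv
import io

# Hypothetical excerpt in the genes.fpkm_tracking layout (tab-separated).
# Real files have more columns; the values here are invented for illustration.
EXAMPLE = (
    "tracking_id\tgene_short_name\tlocus\tFPKM\tFPKM_conf_lo\tFPKM_conf_hi\tFPKM_status\n"
    "geneA\tGENEA\tchr1:100-900\t10.0\t8.0\t12.0\tOK\n"
    "geneB\tGENEB\tchr2:500-4000\t10.0\t2.0\t18.0\tOK\n"
    "geneC\tGENEC\tchr3:10-700\t0.0\t0.0\t0.0\tOK\n"
)


def relative_ci_widths(handle):
    """Return {tracking_id: (conf_hi - conf_lo) / FPKM} for rows with FPKM > 0."""
    widths = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        fpkm = float(row["FPKM"])
        if fpkm <= 0:
            continue  # relative width is undefined for an estimate of zero
        lo = float(row["FPKM_conf_lo"])
        hi = float(row["FPKM_conf_hi"])
        widths[row["tracking_id"]] = (hi - lo) / fpkm
    return widths


widths = relative_ci_widths(io.StringIO(EXAMPLE))
for gene, width in sorted(widths.items()):
    print(f"{gene}: relative CI width = {width:.2f}")
```

In this toy table geneA and geneB have the same FPKM, but geneB's interval is four times wider, so its estimate deserves less trust. With a real file, replace the `io.StringIO(EXAMPLE)` handle with `open("genes.fpkm_tracking", newline="")`.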

You could also do a pairwise scatter plot of the FPKM values of each pair of samples. That is, plot the FPKM values from sample 1 along the x axis and the FPKM values from sample 2 along the y axis to see how much variability is present in your data. If the variability between replicates is low, the scatter should appear "elliptical", with the major axis along the y = x line.
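A minimal sketch of that comparison in Python, assuming you have already pulled the FPKM values for the same genes, in the same order, from two samples. The numbers below are invented, and the log2(FPKM + 1) transform and the Pearson correlation are my additions to make the plot readable and to give a single summary number; the plotting function needs matplotlib and is not called by default.

```python
import math


def log2_fpkm(values, pseudocount=1.0):
    """log2-transform FPKM values; the pseudocount keeps zeros finite."""
    return [math.log2(v + pseudocount) for v in values]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def plot_pair(x, y, path="fpkm_scatter.png"):
    """Optional: save a scatter plot with the y = x line (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x, y, s=8)
    lim = max(max(x), max(y))
    ax.plot([0, lim], [0, lim], linestyle="--")  # y = x reference line
    ax.set_xlabel("sample 1, log2(FPKM + 1)")
    ax.set_ylabel("sample 2, log2(FPKM + 1)")
    fig.savefig(path)


# Hypothetical FPKM values for the same five genes in two replicates.
sample1 = [10.0, 0.5, 120.0, 33.0, 7.0]
sample2 = [15.0, 0.4, 110.0, 40.0, 5.0]

x = log2_fpkm(sample1)
y = log2_fpkm(sample2)
r = pearson(x, y)
print(f"Pearson r on log2(FPKM + 1): {r:.3f}")
# plot_pair(x, y)  # uncomment if matplotlib is installed
```

Points hugging the dashed y = x line (and r close to 1) suggest low variability between the two samples; a wide cloud, especially at low FPKM, is the volatility the question describes.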

How "real" are these values: I didn't quite follow your question. FPKM values are estimates, and have errors associated with them due to many reasons. The confidence interval is an estimate of that error. Was that what you meant?

modified 5.3 years ago • written 5.3 years ago by Vishaka Datta (80)

+1 for great response.  Yes, when you say that FPKM values are estimates that have errors associated with them due to many reasons, could you please name some of those reasons (and cite a relevant publication)?

modified 5.3 years ago • written 5.3 years ago by NHEJ (320)

Reasons for errors, off the top of my head:

1) Sequence reads might align to multiple locations within the genome. This makes assigning a given read to a particular gene ambiguous. Cufflinks has a few ways of dealing with this (the --multi-read-correct option, for instance). See the Cufflinks manual for this.

2) The fragments in the library may be very short, or from a particular location within the gene, etc. This means that the FPKM values of the gene that gave rise to these fragments might be underestimated. Again, cufflinks gives you an option on whether you want to correct for this (Paper : )

3) The supplement of the original cufflinks paper gives you details on the various approximations, assumptions made in computing FPKM estimates. I can't list all of them here.

modified 5.3 years ago • written 5.3 years ago by Vishaka Datta (80)

See this figure from the DESeq paper and, of course, the accompanying paper:

written 5.3 years ago by Sean Davis (26k)
5.3 years ago by
Devon Ryan94k
Freiburg, Germany
Devon Ryan94k wrote:

The best practice is to avoid using FPKM for anything. If you need to use an expectation maximization algorithm in your workflow, then keep things in estimated counts. One of the (many) issues with FPKM/RPKM is that they lose precision information. Thus, an FPKM of 1 in samples 1 and 2 will likely have completely different errors associated with them and tracking/handling this is very non-trivial (I think cuffdiff converted to a more DESeq-like method internally a while back, perhaps because of this).
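To make the point about lost precision concrete, here is a small Python sketch. It assumes the usual definition FPKM = fragments × 10^9 / (length in bp × total mapped fragments) and, as a deliberate simplification, treats the fragment count as Poisson, so its relative standard deviation is 1/sqrt(count). All numbers are invented, and real data are usually more variable than Poisson.

```python
import math


def fpkm(fragments, length_bp, total_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_fragments)


def poisson_relative_sd(fragments):
    """Relative SD of a Poisson count: sqrt(n) / n = 1 / sqrt(n)."""
    return 1.0 / math.sqrt(fragments)


LENGTH_BP = 1_000  # the same hypothetical transcript in both samples

# Sample 1 is sequenced shallowly, sample 2 ten times deeper.
samples = {
    "sample1": {"fragments": 10, "total_fragments": 10_000_000},
    "sample2": {"fragments": 100, "total_fragments": 100_000_000},
}

results = {}
for name, s in samples.items():
    value = fpkm(s["fragments"], LENGTH_BP, s["total_fragments"])
    rel_sd = poisson_relative_sd(s["fragments"])
    results[name] = (value, rel_sd)
    print(f"{name}: FPKM = {value:.2f}, relative SD of the count = {rel_sd:.3f}")
```

Both samples report an FPKM of 1, but the estimate in sample 1 rests on 10 fragments and the one in sample 2 on 100, so the first is roughly three times noisier under this model. Once only the FPKM is kept, that difference is invisible, which is why staying in (estimated) counts is easier to handle statistically.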

written 5.3 years ago by Devon Ryan (94k)

Could you please point me to a relevant publication explaining why estimated counts are a superior measure to FPKM? So is DESeq superior to Cufflinks, then, in this regard?

written 5.3 years ago by NHEJ (320)

Not off-hand; search PubMed. This has been more heavily discussed in blog posts. BTW, it's unclear what cuffdiff actually uses internally for computation these days (its method seems to have changed drastically over the years).

written 5.3 years ago by Devon Ryan (94k)

If it's not formally published that estimated counts are a superior measure to FPKM, then it's just personal preference (particularly in blogs, where people mostly speak of their experience with their data that they analyzed).  If that claim is not objectively benchmarked (as seen in high-impact papers that compare methods), then it is just a claim, plain and simple.  

modified 5.3 years ago • written 5.3 years ago by NHEJ (320)

That's not how it works in bioinformatics. Many things in this field never get fully published, but reside on blogs and bioRxiv, or even only in presentations. If you want publications, you should be able to find a few on PubMed. I don't have sufficient time to perform trivial searches for you.

written 5.3 years ago by Devon Ryan (94k)

There are comparisons between FPKM and counts with respect to which measures and methods of modelling RNA-seq data tend to give better differential expression results. 

modified 5.3 years ago • written 5.3 years ago by Vishaka Datta (80)

I think Devon Ryan is absolutely right.

This issue has been discussed many times on this forum. If you put some effort into searching PubMed for comparisons of FPKM with other methods, I am sure you would find more papers that answer your question.

I doubt that anyone wants to spend their time providing all the evidence and citations behind every answer they give you.

written 5.3 years ago by Manvendra Singh (2.1k)

Indeed, and apart from the specific discussion in this thread, FPKM/RPKM should not be used for differential expression analysis - there is no cross-sample normalisation performed when deriving these expression units.

Please read this: A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis

"The Total Count and RPKM [FPKM] normalization methods, both of which are still widely in use, are ineffective and should be definitively abandoned in the context of differential analysis."

Also, by Harold Pimentel: What the FPKM? A review of RNA-Seq expression units

"The first thing one should remember is that without between sample normalization (a topic for a later post), NONE of these units are comparable across experiments. This is a result of RNA-Seq being a relative measurement, not an absolute one."
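A toy Python illustration of why the lack of between-sample normalisation matters. The gene names, lengths and counts are invented. Genes A to C have identical counts in both samples, but gene D is strongly up in sample 2, which inflates that sample's total and drags down the FPKM of everything else. For contrast, the sketch also computes median-of-ratios size factors, the idea described in the DESeq paper; this is a bare-bones version for illustration, not the package's implementation.

```python
import math
from statistics import median

# Hypothetical transcript lengths (bp) and fragment counts.
LENGTHS = {"A": 1000, "B": 2000, "C": 1500, "D": 1000}
COUNTS = {
    "s1": {"A": 100, "B": 200, "C": 150, "D": 100},
    "s2": {"A": 100, "B": 200, "C": 150, "D": 5000},
}


def fpkm(counts):
    """Per-sample FPKM: scaled only by that sample's own total."""
    total = sum(counts.values())
    return {g: c * 1e9 / (LENGTHS[g] * total) for g, c in counts.items()}


def size_factors(samples):
    """Median over genes of each sample's count relative to the geometric mean."""
    genes = [g for g in LENGTHS if all(s[g] > 0 for s in samples.values())]
    log_geo_mean = {
        g: sum(math.log(s[g]) for s in samples.values()) / len(samples)
        for g in genes
    }
    return {
        name: math.exp(median(math.log(s[g]) - log_geo_mean[g] for g in genes))
        for name, s in samples.items()
    }


fpkm_by_sample = {name: fpkm(c) for name, c in COUNTS.items()}
factors = size_factors(COUNTS)

for name in COUNTS:
    print(f"{name}: FPKM(A) = {fpkm_by_sample[name]['A']:.0f}, "
          f"size factor = {factors[name]:.2f}")
```

Gene A has 100 fragments in both samples, yet its FPKM in s2 is roughly a tenth of its FPKM in s1, purely because of gene D. The median-of-ratios factors come out equal for the two samples, because most genes are unchanged, so counts scaled by them would still show A as unchanged.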

written 11 months ago by Kevin Blighe (55k)