Question: cuffdiff and changes in FPKM values due to isoforms
4.9 years ago, jo_grodem (Norway) wrote:

Hi, I hope that someone can help with this or direct me to the right thread. I need help with how to report RNAseq data accurately.

I just started in a lab where they have used a service to perform the two rounds of RNAseq and the downstream bioinformatics. We were delivered tables with FPKM values for each gene in each treatment, and tables with differential expression analysis (of genes, not isoforms). The question here arises from the cuffdiff output.

The setup is: round 1, treatment 1 vs 2; round 2, treatments 3 vs 4, 3 vs 5, and 4 vs 5. Each was performed in triplicate.

The FPKM values differ for about 2% of the genes between the two lists. The company explained this as an effect of isoforms, but said that the FPKM values in both tables are correct.

For example, for one gene of interest:

(A) FPKM values taken from the FPKM tables: T1 (36,3145) T2 (38,0397) T3 (36,489) T4 (34,001) T5 (38,242)

(B) FPKM values taken from the pairwise comparison datasets:

T1 vs T2: T1 (0,867) T2 (1,693)

T3 vs T4, T3 vs T5, T4 vs T5: T3 (36,489) T4 (34,001) T5 (38,242)

As you can see, the values change for the first two treatments. For the final three, the values do not change in the differential expression analysis. We are concerned because treatments 1 and 3 are the same, just repeated in different rounds of RNAseq.

 

I want to report the expression of this gene in each treatment and between treatments. If I take the values from (A) or (B), the result is different.
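To make the size of the discrepancy concrete, here is a minimal sketch that computes the T2 vs T1 log2 fold change from each source. It assumes the decimal commas in the values above stand for decimal points, and the variable and function names are only illustrative; it is plain arithmetic on the reported FPKM values, not a reproduction of what cuffdiff does internally.

```python
import math

# FPKM values for the gene of interest, with decimal commas read as
# decimal points (an assumption about the number format used above).
fpkm_from_tables = {"T1": 36.3145, "T2": 38.0397}   # source (A)
fpkm_from_pairwise = {"T1": 0.867, "T2": 1.693}     # source (B)


def log2_fold_change(values, numerator, denominator):
    """Return log2(numerator / denominator) for two treatments."""
    return math.log2(values[numerator] / values[denominator])


lfc_tables = log2_fold_change(fpkm_from_tables, "T2", "T1")
lfc_pairwise = log2_fold_change(fpkm_from_pairwise, "T2", "T1")

print(f"(A) FPKM tables:      log2FC T2/T1 = {lfc_tables:.3f}")
print(f"(B) pairwise dataset: log2FC T2/T1 = {lfc_pairwise:.3f}")
```

With (A) the gene looks essentially unchanged between T1 and T2, while with (B) it looks close to doubled, so the choice of table changes the conclusion.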

When I asked the company how to handle this correctly, they said that “it seems that the values of fpkm tables fit better”. Can I really pick and choose?! Does anyone know the best way to report this accurately?

 

Thank you in advance

 

4.9 years ago, Michael Dondrup (Bergen, Norway) wrote:

I cannot directly comment on the cause for the differences but would advise caution with respect to using FPKM to report gene expression values. It is not state-of-the-art.

We have had several reports of problems with using FPKM on BioStars, both from users and in multiple publications. Since you do not seem to be interested in isoform expression, the use of FPKM is not justified, in my opinion.

I would recommend re-analyzing for gene-wise DE analysis using raw counts from BAM files or raw data under a negative binomial model (edgeR or DESeq), and reporting changes as normalized fold changes as well as raw counts or CPM/TPM.
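As a rough illustration of the reporting units mentioned above, the sketch below computes CPM and TPM from raw counts using their textbook definitions. The gene names, counts, and lengths are made up for the example, and this is not the negative binomial model itself: edgeR and DESeq apply their own library-size normalization and dispersion estimation on top of raw counts.

```python
def cpm(counts):
    """Counts per million: scale each gene's count by the library size."""
    library_size = sum(counts.values())
    return {gene: n / library_size * 1e6 for gene, n in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    reads_per_kb = {gene: n / (lengths_bp[gene] / 1000) for gene, n in counts.items()}
    scale = sum(reads_per_kb.values())
    return {gene: r / scale * 1e6 for gene, r in reads_per_kb.items()}


# Made-up raw counts and gene lengths for one sample (illustration only).
raw_counts = {"geneA": 100, "geneB": 300, "geneC": 600}
gene_lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 3000}

sample_cpm = cpm(raw_counts)
sample_tpm = tpm(raw_counts, gene_lengths_bp)

for gene in raw_counts:
    print(f"{gene}: count={raw_counts[gene]} "
          f"CPM={sample_cpm[gene]:.1f} TPM={sample_tpm[gene]:.1f}")
```

CPM only corrects for sequencing depth, so it suits comparing the same gene across samples; TPM also corrects for gene length, and its values sum to one million within each sample.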

I should add that it might be an advantage to get most or all of the computation under your own control, as it is problematic to publish data that you cannot fully understand and whose correctness you cannot guarantee.
