Hi, I hope that someone can help with this or direct me to the right thread. I need help with how to accurately report RNAseq data.
I just started in a lab that used a service to perform two rounds of RNAseq and the downstream bioinformatics. We received tables with FPKM values for each gene in each treatment, and tables with differential expression analysis (of genes, not isoforms). The question here arises from the Cuffdiff output.
The setup is: (round 1) treatment 1 vs 2, and (round 2) treatment 3 vs 4, 3 vs 5, and 4 vs 5. Each treatment was performed in triplicate.
For about 2% of the genes, the FPKM values differ between the two sets of tables (the FPKM tables and the differential expression tables). The company explained this as an effect of isoforms, but said that the FPKM values in both sets of tables are correct.
e.g. For one gene of interest:
(A) FPKM values taken from the FPKM tables: T1 (36.3145) T2 (38.0397) T3 (36.489) T4 (34.001) T5 (38.242)
(B) FPKM values taken from the pairwise comparison datasets:
T1 vs T2: T1 (0.867) T2 (1.693)
T3 vs T4, T3 vs T5, T4 vs T5: T3 (36.489) T4 (34.001) T5 (38.242)
As you can see, for the first two treatments the values differ between (A) and (B). For the final three, the values are the same in the differential expression analysis as in the FPKM tables. We are concerned because treatments 1 and 3 are the same, just repeated in different rounds of RNAseq.
I want to report the expression of this gene in each treatment and the differences between treatments. Depending on whether I take the values from (A) or (B), the result is different.
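To make the discrepancy concrete, here is a minimal Python sketch of how I compared the two sources for this one gene. It is only an illustration, not part of the company's pipeline: the function names, the dictionary layout, and the 1% tolerance are my own assumptions, and I am reading the decimal commas in the delivered tables as decimal points.

```python
import math

# FPKM values for the gene of interest, copied from the two sources.
# (A) per-treatment FPKM tables
fpkm_table = {"T1": 36.3145, "T2": 38.0397, "T3": 36.489, "T4": 34.001, "T5": 38.242}
# (B) pairwise differential expression tables
diffexp_table = {"T1": 0.867, "T2": 1.693, "T3": 36.489, "T4": 34.001, "T5": 38.242}


def flag_discrepancies(table_a, table_b, rel_tol=0.01):
    """Return the treatments whose FPKM differs between the two tables.

    rel_tol is an arbitrary illustrative threshold (1% relative difference).
    """
    return [
        treatment
        for treatment in table_a
        if treatment in table_b
        and not math.isclose(table_a[treatment], table_b[treatment], rel_tol=rel_tol)
    ]


def log2_fold_change(fpkm_reference, fpkm_test):
    """log2(test / reference); both values must be greater than zero."""
    return math.log2(fpkm_test / fpkm_reference)


if __name__ == "__main__":
    print("Treatments that disagree:", flag_discrepancies(fpkm_table, diffexp_table))
    for label, table in (("A", fpkm_table), ("B", diffexp_table)):
        lfc = log2_fold_change(table["T1"], table["T2"])
        print(f"Source {label}: log2(T2/T1) = {lfc:.3f}")
```

With the values above, only T1 and T2 are flagged, and the T2 vs T1 log2 fold change is close to 0 from (A) but close to 1 (roughly a doubling) from (B), which is why the choice of table changes the conclusion.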
When I asked the company how to correctly handle this, they said that "it seems that the values of fpkm tables fit better"... can I pick and choose?! Does anyone know the best way to accurately report this?
Thank you in advance.