Obtaining Expression Values (FPKM) From RNA-Seq Data [Cufflinks/Cuffdiff]
11.6 years ago

Hi! I have a couple of questions about obtaining expression values from an RNA-Seq dataset.

I would like to get the FPKM (which replaced RPKM) values for all the genes in an RNA-Seq expression dataset. I ran the analysis with TopHat and Cufflinks.

1) Can I just take the values from the genes.fpkm_tracking file produced by Cuffdiff? It has values for WT and for the condition being tested, and they are the same as the values in the gene_exp.diff file. Essentially, I would take the condA_FPKM value from the following dataset.

Example set:

tracking_id     class_code     nearest_ref_id     gene_id     gene_short_name     tss_id     locus     length     coverage     condA_FPKM     condA_conf_lo     condA_conf_hi     condA_status     condB_FPKM     condB_conf_lo     condB_conf_hi     condB_status
0610005C13Rik     -     -     0610005C13Rik     0610005C13Rik     TSS14039     chr7:45567794-45589710     -     -     0.22571     0     1.2222     OK     0.313291     0.749483     OK
0610007C21Rik     -     -     0610007C21Rik     0610007C21Rik     TSS22873     chr5:31036035-31054623     -     -     8.45646     4.77158     12.1413     OK     5.85864     4.54238     7.17491     OK
0610007L01Rik     -     -     0610007L01Rik     0610007L01Rik     TSS25102,TSS544     chr5:130219743-130243765     -     -     28.4043     20.4585     36.3502     OK     31.5888     28.4398     34.7379     OK
0610007N19Rik     -     -     0610007N19Rik     0610007N19Rik     TSS20841     chr15:32240567-32244662     -     -     0.453308     0.248242     0.658374     OK     0.459355     0.277206     0.641504     OK

2) Can I also take the values from the genes.fpkm_tracking file produced by Cufflinks (though it lacks gene names)? Should this value differ from the one Cuffdiff reports for the same locus?

3) What cutoff on the raw FPKM value should I use to call a gene significantly expressed, without taking the condition into account?

Also, can the FPKM value be used directly as the expression value, or are there other factors to take into account as well?

Thanks a lot for your time.

cuffdiff rna-seq expression next-gen cufflinks fpkm • 6.6k views
11.6 years ago
JC 13k


1) If you only need the expression levels, it is OK to do that.
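As a minimal sketch of pulling one condition's values out of such a file: the snippet below assumes genes.fpkm_tracking is tab-delimited with the header shown in the question, and uses the `condA_FPKM` and `condA_status` column names from that example. The function name `read_fpkm` is made up for illustration, and the two data rows are copied from the question.

```python
import csv
import io


def read_fpkm(handle, fpkm_column="condA_FPKM", status_column="condA_status"):
    """Return {gene_id: FPKM} for rows whose status column is "OK".

    Assumes a tab-delimited file with a header row, as in the example table.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    values = {}
    for row in reader:
        if row.get(status_column) != "OK":
            continue  # skip rows the tool did not flag as OK
        values[row["gene_id"]] = float(row[fpkm_column])
    return values


# Header and two complete rows copied from the example in the question.
header = [
    "tracking_id", "class_code", "nearest_ref_id", "gene_id",
    "gene_short_name", "tss_id", "locus", "length", "coverage",
    "condA_FPKM", "condA_conf_lo", "condA_conf_hi", "condA_status",
    "condB_FPKM", "condB_conf_lo", "condB_conf_hi", "condB_status",
]
rows = [
    ["0610007C21Rik", "-", "-", "0610007C21Rik", "0610007C21Rik", "TSS22873",
     "chr5:31036035-31054623", "-", "-", "8.45646", "4.77158", "12.1413", "OK",
     "5.85864", "4.54238", "7.17491", "OK"],
    ["0610007L01Rik", "-", "-", "0610007L01Rik", "0610007L01Rik",
     "TSS25102,TSS544", "chr5:130219743-130243765", "-", "-", "28.4043",
     "20.4585", "36.3502", "OK", "31.5888", "28.4398", "34.7379", "OK"],
]
example = "\n".join("\t".join(fields) for fields in [header] + rows) + "\n"

cond_a = read_fpkm(io.StringIO(example))
cond_b = read_fpkm(io.StringIO(example), "condB_FPKM", "condB_status")
print(cond_a)
print(cond_b)
```

For a real run you would pass an open file instead of the in-memory string, for example `with open("genes.fpkm_tracking") as fh: read_fpkm(fh)`.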

2) Gene names are included if you ran Cufflinks with the same annotation, and the values should be the same as those from Cuffdiff.

3) That is a little hard to say because your values are already scaled to FPKM. Some articles have proposed a minimum of > 0.001 RPKM for significant expression, but in reality this value depends on how many reads are mapped and how many of those are uniquely mapped. This is a strong bias in RPKM when you compare experiments with different coverage. For me, a single read uniquely mapped to a gene is enough as minimal evidence of expression.
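To make the coverage dependence concrete, here is a rough sketch using the plain textbook definition of FPKM (fragments per kilobase of transcript per million mapped fragments). It is not the estimator Cufflinks actually uses, which also models effective length and multi-mapped reads, and the gene length and library sizes below are invented numbers for illustration only.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical gene of 2 kb supported by a single fragment,
# in two hypothetical libraries of different depth.
shallow = fpkm(fragments=1, length_bp=2000, total_mapped_fragments=10_000_000)
deep = fpkm(fragments=1, length_bp=2000, total_mapped_fragments=50_000_000)

print(f"1 fragment, 10M mapped: {shallow:.3f} FPKM")
print(f"1 fragment, 50M mapped: {deep:.3f} FPKM")
```

The same single fragment gives a five-fold different FPKM in the two libraries, which is why a fixed FPKM cutoff does not mean the same thing across experiments with different coverage.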

Thanks, Juan, for the answers. Could you also comment on this: I have three conditions A, B and C, and the FPKM values of condition A are slightly different when I run Cuffdiff on A->B and on A->C. Does that mean the expression values are also specific to the comparison, and does that matter?

Cuffdiff normalizes the samples being compared against each other, which is why the values are different.

All right, thanks. I will test different threshold levels :)

