Question: Obtaining Expression Values (FPKM) From RNA-Seq Data [Cufflinks/Cuffdiff]
4.8 years ago, Sukhdeep Singh (9.0k) wrote:

Hola! I have a couple of questions about obtaining expression values from the RNA-Seq dataset.

I would like to get FPKM values (which have replaced RPKM) for all the genes in an RNA-Seq expression dataset. I ran the analysis with TopHat and Cufflinks.

1) Can I just take the values from the genes.fpkm_tracking file produced by cuffdiff? It contains values for both WT and the condition being tested, and they are the same as those in the gene_exp.diff file. In effect, I would take the condA_FPKM value from the following dataset.

Example set:

tracking_id     class_code     nearest_ref_id     gene_id     gene_short_name     tss_id     locus     length     coverage     condA_FPKM     condA_conf_lo     condA_conf_hi     condA_status     condB_FPKM     condB_conf_lo     condB_conf_hi     condB_status
0610005C13Rik     -     -     0610005C13Rik     0610005C13Rik     TSS14039     chr7:45567794-45589710     -     -     0.22571     0     1.2222     OK     0.313291     0.749483     OK
0610007C21Rik     -     -     0610007C21Rik     0610007C21Rik     TSS22873     chr5:31036035-31054623     -     -     8.45646     4.77158     12.1413     OK     5.85864     4.54238     7.17491     OK
0610007L01Rik     -     -     0610007L01Rik     0610007L01Rik     TSS25102,TSS544     chr5:130219743-130243765     -     -     28.4043     20.4585     36.3502     OK     31.5888     28.4398     34.7379     OK
0610007N19Rik     -     -     0610007N19Rik     0610007N19Rik     TSS20841     chr15:32240567-32244662     -     -     0.453308     0.248242     0.658374     OK     0.459355     0.277206     0.641504     OK

2) Can I also take the values from the genes.fpkm_tracking file produced by cufflinks (though it lacks gene names)? Should this value differ from the one cuffdiff reports for the same locus?

3) What should the cutoff on the raw FPKM value be to call it significant, without taking the condition into account?

Also, can the FPKM value be used directly as the expression value, or are there other factors to take into account as well?

Thanks a lot for your time.

4.7 years ago, JC (6.1k) wrote:


1) If you only need the expression levels, it is OK to do that.

2) Gene names are included if you ran Cufflinks with the same annotation, but the values should be the same as those from Cuffdiff.

3) That is a little hard to say because your values are already scaled to FPKM. Some articles have proposed a minimal significant expression of > 0.001 RPKM, but in reality this value depends on how many reads are mapped and how many of those are uniquely mapped. This is a strong bias in RPKM when you compare experiments with different coverage. For me, a single read uniquely mapped to a gene is enough to count as minimal evidence of expression.
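As a rough illustration of points 1 and 3, here is a minimal Python sketch that reads a tab-separated genes.fpkm_tracking table, keeps the condA_FPKM value for rows whose status is OK, and applies a cutoff. The column names come from the example in the question; the function names and the cutoff of 1.0 are hypothetical and chosen only for the demonstration, not a recommended threshold.

```python
import csv
import io


def read_fpkm(handle, fpkm_col="condA_FPKM", status_col="condA_status"):
    """Return {tracking_id: FPKM} for rows whose status column is OK."""
    reader = csv.DictReader(handle, delimiter="\t")
    values = {}
    for row in reader:
        if row[status_col] != "OK":
            continue  # skip rows cuffdiff did not flag as OK
        values[row["tracking_id"]] = float(row[fpkm_col])
    return values


def filter_expressed(values, min_fpkm):
    """Keep only genes at or above a user-chosen FPKM cutoff."""
    return {gene: v for gene, v in values.items() if v >= min_fpkm}


# Header and two complete rows copied from the example set in the question.
header = [
    "tracking_id", "class_code", "nearest_ref_id", "gene_id",
    "gene_short_name", "tss_id", "locus", "length", "coverage",
    "condA_FPKM", "condA_conf_lo", "condA_conf_hi", "condA_status",
    "condB_FPKM", "condB_conf_lo", "condB_conf_hi", "condB_status",
]
rows = [
    ["0610007C21Rik", "-", "-", "0610007C21Rik", "0610007C21Rik", "TSS22873",
     "chr5:31036035-31054623", "-", "-", "8.45646", "4.77158", "12.1413",
     "OK", "5.85864", "4.54238", "7.17491", "OK"],
    ["0610007N19Rik", "-", "-", "0610007N19Rik", "0610007N19Rik", "TSS20841",
     "chr15:32240567-32244662", "-", "-", "0.453308", "0.248242", "0.658374",
     "OK", "0.459355", "0.277206", "0.641504", "OK"],
]
example = "\n".join("\t".join(fields) for fields in [header] + rows) + "\n"

cond_a = read_fpkm(io.StringIO(example))
# The cutoff below is an arbitrary, hypothetical value for illustration.
expressed = filter_expressed(cond_a, min_fpkm=1.0)
print(expressed)
```

For a real file, pass an open file handle instead of the io.StringIO object, and set fpkm_col and status_col to the sample names used in your own run.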


Thanks Juan for the answers. Could you also comment on this: I have three conditions A, B, and C. Why are the FPKM values of condition A slightly different when I run cuffdiff on A->B and on A->C? That would mean the expression values are also comparison-specific, which may or may not matter!

Reply written 4.7 years ago by Sukhdeep Singh (9.0k)

cuffdiff normalizes the samples being compared together, which is why the values are different.

Reply written 4.7 years ago by JC (6.1k)
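To see why a joint normalization can shift condition A's values, recall the standard definition: FPKM is the fragment count times 10^9, divided by the transcript length in bp times the library size in fragments. The toy Python sketch below uses entirely hypothetical counts and scaling factors. It is not Cuffdiff's actual normalization procedure, which depends on the version and options used; it only shows that the same fragment count yields a different FPKM once the effective library size changes.

```python
def fpkm(fragments, length_bp, library_size):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * library_size)


# Hypothetical numbers for one gene in condition A.
fragments_a = 500
length_bp = 2000
raw_library_a = 20_000_000

# Hypothetical scaling factors applied to A's library size when it is
# normalized together with B versus together with C.
scale_with_b = 1.05
scale_with_c = 0.95

fpkm_raw = fpkm(fragments_a, length_bp, raw_library_a)
fpkm_ab = fpkm(fragments_a, length_bp, raw_library_a * scale_with_b)
fpkm_ac = fpkm(fragments_a, length_bp, raw_library_a * scale_with_c)

print(fpkm_raw, fpkm_ab, fpkm_ac)
```

The three values differ only through the library-size term, which fits the small differences seen for condition A between the A->B and A->C runs.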

All right, thanks. I will test different threshold levels :)

Reply written 4.7 years ago by Sukhdeep Singh (9.0k)