Question: Obtaining Expression Values (FPKM) from RNA-Seq Data [Cufflinks/Cuffdiff]
Sukhdeep Singh (Germany) wrote, 17 months ago:

Hola! I have a couple of questions about obtaining expression values from the RNA-Seq dataset.

I would like to get the FPKM values (FPKM has replaced RPKM) for all the genes in an RNA-Seq expression dataset. I analysed the data using TopHat and Cufflinks.

1) Can I just take the values from the genes.fpkm_tracking file produced by Cuffdiff (which has values for the WT and for the condition being tested)? It has the same values as the gene_exp.diff file. In other words, I would take the condA_FPKM value from the following dataset.

Example set:

tracking_id     class_code     nearest_ref_id     gene_id     gene_short_name     tss_id     locus     length     coverage     condA_FPKM     condA_conf_lo     condA_conf_hi     condA_status     condB_FPKM     condB_conf_lo     condB_conf_hi     condB_status
0610005C13Rik     -     -     0610005C13Rik     0610005C13Rik     TSS14039     chr7:45567794-45589710     -     -     0.22571     0     1.2222     OK     0.313291     0.749483     OK
0610007C21Rik     -     -     0610007C21Rik     0610007C21Rik     TSS22873     chr5:31036035-31054623     -     -     8.45646     4.77158     12.1413     OK     5.85864     4.54238     7.17491     OK
0610007L01Rik     -     -     0610007L01Rik     0610007L01Rik     TSS25102,TSS544     chr5:130219743-130243765     -     -     28.4043     20.4585     36.3502     OK     31.5888     28.4398     34.7379     OK
0610007N19Rik     -     -     0610007N19Rik     0610007N19Rik     TSS20841     chr15:32240567-32244662     -     -     0.453308     0.248242     0.658374     OK     0.459355     0.277206     0.641504     OK

2) Can I also take the values from the genes.fpkm_tracking file produced by Cufflinks (though it lacks gene names)? Should there be a difference between this value and the one obtained from Cuffdiff for the same locus?

3) What should the cutoff on the raw FPKM value be to call a gene significantly expressed, without taking the condition into account?

Also, can the FPKM value be used directly as the expression value, or is there any other factor to take into account as well?

Thanks a lot for your time.

JC (Seattle) wrote, 17 months ago:

Hola,

1) If you only need the expression levels, it is OK to do that.

2) Gene names are incorporated if you ran Cufflinks with the same annotation, but the values should be the same as those from Cuffdiff.

3) That's a little hard to say because you have already scaled your values to FPKM. Some articles have proposed a minimal significant expression of > 0.001 RPKM, but in reality this value depends on how many reads are mapped and how many are uniquely mapped. This is a strong bias in RPKM when you compare experiments with different coverage. For me, just one read uniquely mapped to a gene is enough evidence of minimal expression.
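
For point 1, pulling one condition's FPKM values out of the tracking file could look roughly like the sketch below. It assumes the file is tab-delimited and uses the column names from the example in the question. The helper names `read_fpkm` and `filter_expressed`, and the cutoff of 1.0 FPKM, are hypothetical and only there for illustration; they are not a recommended threshold.

```python
import csv
import io


def read_fpkm(handle, condition):
    """Return {gene_id: FPKM} for one condition, keeping rows whose status is OK."""
    fpkm_col = f"{condition}_FPKM"
    status_col = f"{condition}_status"
    values = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        if row[status_col] == "OK":
            values[row["gene_id"]] = float(row[fpkm_col])
    return values


def filter_expressed(values, min_fpkm):
    """Keep genes at or above an arbitrary, user-chosen FPKM cutoff."""
    return {gene: v for gene, v in values.items() if v >= min_fpkm}


# Two rows from the example in the question, reduced to the columns used here.
# A real run would pass open("genes.fpkm_tracking") instead of this string.
example = (
    "tracking_id\tgene_id\tcondA_FPKM\tcondA_status\tcondB_FPKM\tcondB_status\n"
    "0610007C21Rik\t0610007C21Rik\t8.45646\tOK\t5.85864\tOK\n"
    "0610007N19Rik\t0610007N19Rik\t0.453308\tOK\t0.459355\tOK\n"
)

cond_a = read_fpkm(io.StringIO(example), "condA")
expressed = filter_expressed(cond_a, min_fpkm=1.0)  # 1.0 is an assumed cutoff
print(expressed)
```

Changing `min_fpkm` is an easy way to see how many genes survive at different cutoffs before settling on one.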


Thanks, Juan, for the answers. Could you also comment on this: if I have three conditions A, B and C, why are the FPKM values of condition A slightly different when running Cuffdiff on A->B and on A->C? That would mean the expression values are also condition-specific, which may or may not matter!

written 17 months ago by Sukhdeep Singh

Cuffdiff normalizes the samples it compares, which is why the values are different.
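
A toy illustration of the effect, using the usual FPKM definition (fragments per kilobase of transcript per million mapped fragments). All numbers are made up, and the two scaling factors are hypothetical stand-ins for whatever normalization Cuffdiff applies in each comparison; this does not reproduce Cuffdiff's actual procedure.

```python
def fpkm(fragments, length_bp, library_size):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * library_size)


# Hypothetical gene in sample A.
fragments_a = 500        # fragments assigned to the gene
length_bp = 2000         # transcript length in bases
mapped_a = 20_000_000    # mapped fragments in sample A

# Hypothetical scaling of A's library size when it is normalized
# together with B versus together with C.
scale_with_b = 1.00
scale_with_c = 1.05

fpkm_a_vs_b = fpkm(fragments_a, length_bp, mapped_a * scale_with_b)
fpkm_a_vs_c = fpkm(fragments_a, length_bp, mapped_a * scale_with_c)

# Same raw count for A, slightly different FPKM in each comparison.
print(round(fpkm_a_vs_b, 3), round(fpkm_a_vs_c, 3))
```

The raw fragment count for A never changes; only the normalized denominator does, so small differences between runs are expected.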

written 17 months ago by JC

All right, thanks. I will test different threshold levels :)

written 17 months ago by Sukhdeep Singh