Question: cufflinks with reference annotation, FPKM for a gene
tonja.r wrote:

 

 

I ran Cufflinks (2.2.1) with an annotation containing only exons (I do not want to count intronic reads) and got the gene and isoform tracking output below. The FPKM for the gene does not seem to be the sum of the isoform FPKMs, so I am wondering how the gene FPKM is calculated.

I also ran Cufflinks with an annotation containing only transcripts and got completely different results.

There is little information about running Cufflinks with a reference annotation. As far as I understand, Cufflinks counts reads only in the regions specified in the annotation file. Because the total length of the exons belonging to a transcript differs from the total length of the transcript (because of the introns), the coverage and FPKM will also differ.
 

Only exon annotation:
Isoform tracks:

tracking_id    class_code    nearest_ref_id    gene_id    gene_short_name    tss_id    locus    length    coverage    FPKM    FPKM_conf_lo    FPKM_conf_hi    FPKM_status
ENSMUST00000162897.1    -    -    ENSMUSG00000051951.5    Xkr4    -    chr1:3195981-3206425    4153    23.0161    2.7933    2.46392    3.12269    OK
ENSMUST00000159265.1    -    -    ENSMUSG00000051951.5    Xkr4    -    chr1:3196603-3205713    2989    22.8029    2.76743    2.38783    3.14703    OK
ENSMUST00000070533.4    -    -    ENSMUSG00000051951.5    Xkr4    -    chr1:3204562-3661579    3634    32.277    3.91723    3.5261    4.30836    OK


Gene tracks:

tracking_id    class_code    nearest_ref_id    gene_id    gene_short_name    tss_id    locus    length    coverage    FPKM    FPKM_conf_lo    FPKM_conf_hi    FPKM_status
ENSMUSG00000051951.5    -    -    ENSMUSG00000051951.5    Xkr4    -    chr1:3195981-3661579    -    -    9.47796    8.91496    10.041    OK
                         



With only transcript annotation:
isoform track

tracking_id    class_code    nearest_ref_id    gene_id    gene_short_name    tss_id    locus    length    coverage    FPKM    FPKM_conf_lo    FPKM_conf_hi    FPKM_status
ENSMUST00000162897.1    -    -    ENSMUSG00000051951.5    Xkr4    -    chr1:3195981-3206425    10444    21.4978    2.58177    2.39326    2.77028    OK
ENSMUST00000159265.1    -    -    ENSMUSG00000051951.5    Xkr4    -    chr1:3196603-3205713    9110    5.17168e-06    6.2109e-07    0    0.00526385    OK
ENSMUST00000070533.4    -    -    ENSMUSG00000051951.5    Xkr4    -    chr1:3204562-3661579    457017    0.188298    0.0226136    0.0202322    0.024995    OK


gene track 

tracking_id    class_code    nearest_ref_id    gene_id    gene_short_name    tss_id    locus    length    coverage    FPKM    FPKM_conf_lo    FPKM_conf_hi    FPKM_status
ENSMUSG00000051951.5    -    -    ENSMUSG00000051951.5    Xkr4    -    chr1:3195981-3661579    -    -    2.60439    2.41573    2.79304    OK

 

 

Carlo Yague wrote:

Hi again, glad you could make it work!

 

"FPKM for a gene is not the sum of FPKMs of the isoforms, so I am asking myself how they calculated the FPKM for a gene."

With the "only exon" annotation, the FPKM for the gene is the sum of the isoform FPKMs: 2.7933 + 2.76743 + 3.91723 = 9.47796

 

Providing an "only transcripts" annotation seems really odd to me because, as you said, Cufflinks won't know where the introns are. This is probably why you get lower FPKM values for the second and third isoforms. Also, to answer your question, Cufflinks computes FPKM using a probabilistic method with read length correction; read their paper if you want the details. I don't know why the sum of the isoform FPKM values is not equal to the gene FPKM value, but it is possibly linked to that "probabilistic computation". Anyway, you probably shouldn't use a transcript-only annotation.
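To illustrate why the annotated length matters, here is a minimal sketch using the textbook FPKM definition (fragments per kilobase per million mapped fragments). It is not Cufflinks' actual estimator, which uses an effective length and a likelihood model to assign fragments to isoforms. The two lengths are the ones reported for ENSMUST00000070533.4 in the question; the fragment count and library size are hypothetical values chosen only for illustration.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Lengths reported for ENSMUST00000070533.4 in the two runs of the question.
exon_only_length = 3634          # "length" column, exon-only annotation
transcript_only_length = 457017  # "length" column, transcript-only annotation

# Hypothetical numbers, NOT taken from the thread.
fragments = 500
total_mapped = 20_000_000

fpkm_exons = fpkm(fragments, exon_only_length, total_mapped)
fpkm_span = fpkm(fragments, transcript_only_length, total_mapped)

# With the same fragment count, FPKM scales inversely with the length.
print(fpkm_exons, fpkm_span, fpkm_exons / fpkm_span)
```

Under this simplified definition, the same number of fragments spread over the full genomic span gives a much smaller FPKM than over the exons alone. The real Cufflinks values also change because fragments are assigned to isoforms differently in the two runs.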

 

Best,

 

Carlo

 


tonja.r wrote:

I guess we overlooked it. It does sum up. Sorry
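The sums can be checked directly against the values posted in the question. The sketch below copies the isoform and gene FPKMs from both runs and compares them within a small tolerance; the tolerance of 1e-4 is an assumption chosen to absorb the rounding in the printed output.

```python
import math

# FPKM values copied from the tracking tables in the question.
runs = {
    "exon-only annotation": {
        "isoforms": [2.7933, 2.76743, 3.91723],
        "gene": 9.47796,
    },
    "transcript-only annotation": {
        "isoforms": [2.58177, 6.2109e-07, 0.0226136],
        "gene": 2.60439,
    },
}


def sums_match(isoform_fpkms, gene_fpkm, tol=1e-4):
    """Return True if the isoform FPKMs add up to the gene FPKM within tol."""
    return math.isclose(sum(isoform_fpkms), gene_fpkm, abs_tol=tol)


for name, run in runs.items():
    total = sum(run["isoforms"])
    ok = sums_match(run["isoforms"], run["gene"])
    print(f"{name}: isoform sum = {total:.5f}, gene = {run['gene']:.5f}, match = {ok}")
```

In both runs the isoform FPKMs add up to the gene FPKM to within rounding, which agrees with the follow-up comment.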