Question: Why is FPKM equal to 0 in the transcripts.gtf generated by Cufflinks?
1
4.4 years ago by
51278852220
China
51278852220 wrote:

I have analyzed some RNA-seq data with TopHat and Cufflinks, and I have several questions about the output. I ran TopHat and Cufflinks with mostly default values, that is, I used the following commands:

$/opt2/tools/tophat-2.0.13.Linux_x86_64/tophat -p 5 -G genes.gtf -o tophat_mut ./ucsc.hg19 2-K13-mut.fastq.gz

$/opt/toolkit/cufflinks-2.2.1.Linux_x86_64/cufflinks -p 5 -u -g genes.gtf -o ./cufflinks ./tophat/accepted_hits.bam

But the output (i.e. transcripts.gtf) looked strange.

....................

chr1    Cufflinks    transcript    34611    36081    1    -    .    gene_id "FAM138A"; transcript_id "NR_026818_1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000"; full_read_support "no";

chr1    Cufflinks    exon    34611    35174    1    -    .    gene_id "FAM138A"; transcript_id "NR_026818_1"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";

chr1    Cufflinks    exon    140075    140566    1000    -    .    gene_id "CUFF.2"; transcript_id "NR_039983"; exon_number "3"; FPKM "0.0401424461"; frac "1.000000"; conf_lo "0.000000"; conf_hi "0.080304"; cov "0.037191";
chr1    Cufflinks    transcript    323892    328581    1    +    .    gene_id "CUFF.1"; transcript_id "NR_028322_1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000"; full_read_support "no";

.....................

1. Why was FPKM equal to 0? I found that many FPKM values were equal to 0, but some FPKM values were very high. Is this situation unusual? Why does it occur?

2. What does gene_id "CUFF.2" (or gene_id "CUFF.1") mean?

Thanks!

 

rna-seq • 4.0k views
ADD COMMENTlink modified 4.0 years ago by pengchy410 • written 4.4 years ago by 51278852220
1
4.4 years ago by
ashwini90
India
ashwini90 wrote:

FPKM=0 means that no reads mapped to that location, which may also mean that the particular transcript was not expressed.
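
For reference, a simplified textbook form of the FPKM definition makes this explicit. This is a sketch, not Cufflinks' exact estimator: Cufflinks additionally uses an effective transcript length and a statistical assignment of fragments among isoforms. The symbols C_t, N and L_t are introduced here for illustration.

```latex
\mathrm{FPKM}_t = \frac{10^{9} \, C_t}{N \, L_t}
```

Here C_t is the number of fragments assigned to transcript t, N is the total number of mapped fragments in the sample, and L_t is the transcript length in bases. When C_t = 0, the value is exactly 0 whatever N and L_t are.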

 

ADD COMMENTlink written 4.4 years ago by ashwini90
1
4.4 years ago by
Devon Ryan89k
Freiburg, Germany
Devon Ryan89k wrote:
  1. If no reads map to a gene, then its FPKM is 0.
  2. It has no meaning. Cufflinks just sequentially numbers features.
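
To see how common the FPKM=0 records are in a given transcripts.gtf, something like the following could be used. It is a minimal sketch that assumes the nine-column, tab-separated layout and the `FPKM "..."` attribute shown in the question; the function names `parse_attributes` and `summarize_fpkm` are made up for this example and are not part of Cufflinks.

```python
import re
from collections import Counter

# Matches `key "value";` pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(field):
    """Return the attribute column of one GTF line as a dict."""
    return dict(ATTR_RE.findall(field))


def summarize_fpkm(lines):
    """Count 'transcript' records with zero vs. non-zero FPKM."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue  # exon records repeat the transcript's FPKM
        attrs = parse_attributes(cols[8])
        if "FPKM" not in attrs:
            continue  # assumption: records without FPKM are ignored
        key = "zero" if float(attrs["FPKM"]) == 0.0 else "nonzero"
        counts[key] += 1
    return counts
```

Passing an open file handle for transcripts.gtf to `summarize_fpkm` returns a `Counter` with the two tallies. Only `transcript` rows are counted so that multi-exon transcripts are not counted several times.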
ADD COMMENTlink written 4.4 years ago by Devon Ryan89k

I have a similar question.

Cufflinks has the -F/--min-isoform-fraction option set to 10% by default, so it should suppress isoforms with FPKM=0. But when I run Cufflinks with the -g/--GTF-guide option, I usually get isoforms with FPKM=0 in the output even though other transcripts of the same gene are expressed.

ADD REPLYlink written 4.0 years ago by Dawn0
1

Cufflinks just doesn't try to tweak the assembly of those isoforms. The FPKMs should still be output.

ADD REPLYlink written 4.0 years ago by Devon Ryan89k
0
4.0 years ago by
pengchy410
China/Beijing
pengchy410 wrote:

The reason seems to be that you used a reference gene annotation file. Reference genes are output with FPKM 0 when no reads hit them. However, after checking how many reference genes appear in transcripts.gtf, I found that the number is smaller than the number of genes in the supplied "-r" reference. In my case, there are 17500 in the reference and 16900 in the transcripts.gtf file. I am still confused by this.

An explanation from anyone would be much appreciated.
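
One way to narrow down which reference records are missing is to compare identifiers between the two files. The sketch below is only an illustration under some assumptions: it compares `transcript_id` values rather than `gene_id` values (the output in the question shows some records with gene_id rewritten to "CUFF.N"), it assumes the IDs are carried over unchanged, and the helper names `transcript_ids` and `missing_from_output` are hypothetical.

```python
import re

# Captures the value of the transcript_id attribute on a GTF line.
TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def transcript_ids(lines):
    """Collect the set of transcript_id values found in GTF lines."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines
        match = TRANSCRIPT_ID_RE.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def missing_from_output(reference_lines, output_lines):
    """Return reference transcript IDs that are absent from the output."""
    missing = transcript_ids(reference_lines) - transcript_ids(output_lines)
    return sorted(missing)
```

Calling `missing_from_output` with the lines of the reference GTF and of transcripts.gtf gives a sorted list of IDs to inspect by hand, for example to check whether the missing entries share a chromosome or overlap one another.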

ADD COMMENTlink written 4.0 years ago by pengchy410

http://seqanswers.com/forums/showthread.php?t=8218

This thread on SEQanswers gives more information and examples, but the problem is still unresolved.

ADD REPLYlink written 4.0 years ago by pengchy410