Why is FPKM equal to 0 in transcripts.gtf generated by Cufflinks?
9.4 years ago
512788522 ▴ 20

I have analyzed some RNA-seq data with TopHat and Cufflinks, and I have several questions about the output. I ran TopHat and Cufflinks with default values; that is, I used the following commands:

$ /opt2/tools/tophat-2.0.13.Linux_x86_64/tophat -p 5 -G genes.gtf -o tophat_mut ./ucsc.hg19 2-K13-mut.fastq.gz
$ /opt/toolkit/cufflinks-2.2.1.Linux_x86_64/cufflinks -p 5 -u -g genes.gtf -o ./cufflinks ./tophat/accepted_hits.bam

But the output (i.e. transcripts.gtf) looked strange.

chr1    Cufflinks    transcript    34611    36081    1    -    .    gene_id "FAM138A"; transcript_id "NR_026818_1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000"; full_read_support "no";
chr1    Cufflinks    exon    34611    35174    1    -    .    gene_id "FAM138A"; transcript_id "NR_026818_1"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";

chr1    Cufflinks    exon    140075    140566    1000    -    .    gene_id "CUFF.2"; transcript_id "NR_039983"; exon_number "3"; FPKM "0.0401424461"; frac "1.000000"; conf_lo "0.000000"; conf_hi "0.080304"; cov "0.037191";
chr1    Cufflinks    transcript    323892    328581    1    +    .    gene_id "CUFF.1"; transcript_id "NR_028322_1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000"; full_read_support "no";

1. Why was FPKM equal to 0? I found that many FPKM values were equal to 0, but some FPKM values were very high. Is this unusual? Why does it occur?
2. What does gene_id "CUFF.2" (or gene_id "CUFF.1") mean?

Thanks!

RNA-Seq • 6.6k views
9.4 years ago
ashwini ▴ 100

FPKM=0 means that no reads mapped to that location, which may also mean that the particular transcript was not expressed.
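To check how widespread the zero values are, the transcript records can be tallied by FPKM. This is a minimal sketch, assuming a tab-separated GTF whose ninth column holds quoted `key "value";` attributes like the records in the question; the function name `count_fpkm` is made up for illustration and is not part of Cufflinks.

```python
import re
from collections import Counter

# Matches GTF attribute pairs such as: FPKM "0.0000000000";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def count_fpkm(lines):
    """Count transcript records with zero vs. non-zero FPKM.

    `lines` is any iterable of GTF lines (for example an open file handle).
    Only records whose feature column is "transcript" are counted, so exon
    records, which repeat the same FPKM, do not inflate the totals.
    """
    counts = Counter()
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Skip comments, blank lines, and anything that is not a transcript.
        if len(cols) < 9 or cols[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        if "FPKM" not in attrs:
            continue
        key = "zero" if float(attrs["FPKM"]) == 0.0 else "nonzero"
        counts[key] += 1
    return counts
```

For example, `with open("transcripts.gtf") as handle: print(count_fpkm(handle))` prints the two tallies for a Cufflinks output file.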

9.4 years ago
  1. If no reads map to a gene, then its FPKM is 0.
  2. The "CUFF.N" IDs have no meaning. Cufflinks just numbers features sequentially.

I have a similar question.

Cufflinks has the -F/--min-isoform-fraction option set to 10% by default, so it should suppress isoforms with FPKM=0. But when I run Cufflinks with the -g/--GTF-guide option, I usually get isoforms with FPKM=0 in the output, even though other transcripts in the same gene are expressed.


Cufflinks just doesn't try to tweak the assembly of those isoforms. The FPKMs should still be output.
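To see how often this pattern shows up in a given output file, one can list the genes that have both expressed and zero-FPKM isoforms. A rough sketch, assuming a tab-separated transcripts.gtf with quoted attributes; `mixed_expression_genes` is a hypothetical helper name, not a Cufflinks feature.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs such as: transcript_id "NR_026818_1";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def mixed_expression_genes(lines):
    """Return {gene_id: [zero-FPKM transcript_ids]} for genes that also
    contain at least one transcript with FPKM > 0."""
    zero = defaultdict(list)   # gene_id -> transcripts with FPKM == 0
    expressed = set()          # gene_ids with at least one FPKM > 0
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        if not {"gene_id", "transcript_id", "FPKM"} <= attrs.keys():
            continue
        if float(attrs["FPKM"]) > 0:
            expressed.add(attrs["gene_id"])
        else:
            zero[attrs["gene_id"]].append(attrs["transcript_id"])
    # Keep only genes where zero and non-zero isoforms coexist.
    return {gene: ids for gene, ids in zero.items() if gene in expressed}
```

Calling it on an open transcripts.gtf handle returns a dictionary keyed by gene_id, so `len(...)` of the result gives the number of affected genes.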

9.0 years ago
pengchy ▴ 450

The reason seems to be that you used a reference gene annotation file. Reference genes are output with FPKM 0 when no reads hit them. However, when I counted the reference genes that appear in transcripts.gtf, I found fewer than in the supplied "-r" reference: in my case, 17500 in the reference versus 16900 in the transcripts.gtf file. I am still confused by this.

An explanation from anyone would be much appreciated.
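One way to narrow down such a discrepancy is to list exactly which reference records are absent from the output. The records in the question show a reference-style transcript_id ("NR_028322_1") under gene_id "CUFF.1", so counting by gene_id alone may undercount. The sketch below therefore compares transcript_id values instead. It assumes both files are tab-separated GTF with quoted attributes and that reference transcript_id values are carried over unchanged; the helper names are made up for illustration.

```python
import re

# Matches GTF attribute pairs such as: transcript_id "NR_026818_1";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def transcript_ids(lines):
    """Collect the set of transcript_id values found in GTF lines."""
    ids = set()
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:  # skips comments and blank lines
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        if "transcript_id" in attrs:
            ids.add(attrs["transcript_id"])
    return ids


def missing_from_output(reference_lines, output_lines):
    """Return reference transcript_ids that do not appear in the output."""
    return sorted(transcript_ids(reference_lines) - transcript_ids(output_lines))
```

Passing open handles for the reference annotation and for transcripts.gtf yields a sorted list of the reference transcripts that Cufflinks did not report, which can then be inspected by hand.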


http://seqanswers.com/forums/showthread.php?t=8218

This thread on SEQanswers gives more information and examples, but the problem is still unresolved.
