Low FPKM Values From Cufflinks?
10.2 years ago

I have aligned my paired-end RNA-seq data to the genome (hg19, Ensembl) using TopHat2 with a GTF file and default options. When I give the same reference GTF file and accepted_hits.bam to Cufflinks (v2.1.1), the isoforms.fpkm_tracking file has FPKM values on the order of 1.66072e-316 for some of the transcripts. Is this normal? Can we treat them as low-abundance transcripts and move on with downstream analysis?

The Cufflinks command is:

cufflinks -o outdir -p 5 -G ref.gtf sample.bam
10.2 years ago

A number that small is bound to be a computational artifact. For most intents and purposes, the abundance of that transcript is zero.
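One way to see why this looks like a numerical artifact: the value from the question is smaller than the smallest normal double-precision number, so it sits in the subnormal range of floating point. The sketch below checks that and clamps such values to zero. The `clamp_fpkm` helper and its `floor` cutoff are hypothetical choices for illustration, not anything Cufflinks does or recommends.

```python
import sys

# Value reported in the question for some transcripts.
fpkm = 1.66072e-316

# Smallest positive *normal* double; positive values below it are subnormal.
smallest_normal = sys.float_info.min


def is_subnormal(x: float) -> bool:
    """Return True if x is nonzero but below the normal double range."""
    return 0.0 < abs(x) < smallest_normal


def clamp_fpkm(x: float, floor: float = 1e-6) -> float:
    """Treat FPKM values below a hypothetical floor as zero.

    The default floor is an arbitrary assumption; pick one that suits
    your own data.
    """
    return 0.0 if x < floor else x


print(is_subnormal(fpkm))  # is the reported value in the subnormal range?
print(clamp_fpkm(fpkm))    # the value after clamping
```

Any sensible cutoff gives the same answer here, since the reported value is hundreds of orders of magnitude below anything a sequencing experiment could resolve.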


I agree. In fact, low-coverage genes tend to yield unreasonably high fold-change values, so I use log2(RPKM + 0.1) for analysis (although that is not to say that values below 0.1 are technically problematic).
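As a rough illustration of the log2(RPKM + 0.1) transform mentioned above, here is a minimal sketch. The expression values are made up for the example, and the helper names are hypothetical.

```python
import math

PSEUDOCOUNT = 0.1  # the offset used in log2(RPKM + 0.1)


def log2_expr(rpkm: float, pseudocount: float = PSEUDOCOUNT) -> float:
    """Log-transform an expression value after adding a pseudocount."""
    return math.log2(rpkm + pseudocount)


def log2_fold_change(a: float, b: float, pseudocount: float = PSEUDOCOUNT) -> float:
    """Difference of pseudocount-adjusted log2 values (a versus b)."""
    return log2_expr(a, pseudocount) - log2_expr(b, pseudocount)


# Hypothetical low-coverage gene: both values are close to zero.
low_a, low_b = 0.02, 0.001
raw = math.log2(low_a / low_b)           # roughly 4.3, a 20-fold "change"
damped = log2_fold_change(low_a, low_b)  # roughly 0.25 once the offset is added

# Hypothetical well-expressed gene: the offset barely matters.
high = log2_fold_change(100.0, 50.0)     # still close to 1 (a 2-fold change)

print(raw, damped, high)
```

The offset also keeps zero or near-zero values finite, since log2(0 + 0.1) is defined while log2(0) is not.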

Here is a paper with a more detailed explanation:

http://bioinfo.aizeonpublishers.net/content/2013/6/285-292.html

The paper also includes some benchmarks with other algorithms, which is emphasized more here:

http://cdwscience.blogspot.com/2013/11/rna-seq-differential-expression.html
