Question: Low FPKM Values From Cufflinks?
geek_y (5.2 years ago) wrote:

I have aligned my paired-end RNA-seq data to the genome (hg19, Ensembl) using TopHat2 with a GTF file and default options. When I give the same reference GTF file and accepted_hits.bam to Cufflinks (v2.1.1), the isoforms.fpkm_tracking file has FPKM values on the order of 1.66072e-316 for some of the transcripts. Is this normal? Can we treat them as low-abundance transcripts and move on with downstream analysis?

The Cufflinks command is:

cufflinks -o outdir -p 5 -G ref.gtf sample.bam

Istvan Albert (5.2 years ago) wrote:

A number that small is bound to be a computational artifact; for all intents and purposes the abundance of that transcript is zero.
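
To illustrate why a value like 1.66072e-316 looks like a numerical artifact: it lies below the smallest normal double-precision number (about 2.2e-308), so it is a subnormal float. The following is a minimal Python sketch, not part of Cufflinks; the helper names `is_subnormal` and `zero_out_artifacts` are hypothetical, and treating every subnormal value as zero is an assumption made for illustration.

```python
import sys

# Smallest positive normal double; smaller nonzero values are subnormal
# and carry reduced precision.
SMALLEST_NORMAL = sys.float_info.min

def is_subnormal(value):
    """True for nonzero values below the normal double-precision range."""
    return 0.0 < abs(value) < SMALLEST_NORMAL

def zero_out_artifacts(fpkm_values):
    """Replace subnormal FPKM values with 0.0 (hypothetical cleanup step)."""
    return [0.0 if is_subnormal(v) else v for v in fpkm_values]

reported = 1.66072e-316  # value quoted in the question
print(is_subnormal(reported))
print(zero_out_artifacts([reported, 0.0, 3.5]))
```

A stricter, biologically motivated cutoff (for example, a minimum FPKM chosen for the experiment) could be substituted for the subnormal test; that choice is left to the analyst.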


I agree - in fact, low-coverage genes tend to yield unreasonably high fold-change values, so I use log2(RPKM + 0.1) for analysis (which is not to say that values below 0.1 are technically problematic).

Here is a paper with a more detailed explanation:

http://bioinfo.aizeonpublishers.net/content/2013/6/285-292.html

The paper also includes some benchmarks against other algorithms, which are emphasized more here:

http://cdwscience.blogspot.com/2013/11/rna-seq-differential-expression.html
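
The effect of the log2(RPKM + 0.1) transformation mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names and the expression values are hypothetical, and 0.1 is simply the offset quoted in this comment.

```python
import math

PSEUDOCOUNT = 0.1  # offset quoted in the comment above

def log2_expr(value, pseudocount=PSEUDOCOUNT):
    """Log2-transform an RPKM/FPKM value after adding a pseudocount."""
    return math.log2(value + pseudocount)

def log2_fold_change(a, b, pseudocount=PSEUDOCOUNT):
    """Difference of pseudocount-adjusted log2 expression values."""
    return log2_expr(a, pseudocount) - log2_expr(b, pseudocount)

# Hypothetical low-coverage transcript: a 20-fold raw ratio between two
# near-zero values.
low_a, low_b = 0.02, 0.001
raw_low = math.log2(low_a / low_b)
damped_low = log2_fold_change(low_a, low_b)

# Hypothetical well-expressed transcript: a 2-fold ratio.
high_a, high_b = 200.0, 100.0
damped_high = log2_fold_change(high_a, high_b)

print(f"low coverage:  raw={raw_low:.2f}  with offset={damped_low:.2f}")
print(f"high coverage: with offset={damped_high:.2f}")
```

With the offset, the fold change between two near-zero values shrinks toward zero, while the fold change for a well-expressed transcript is nearly unchanged.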

written 5.2 years ago by Charles Warden