Dear Biostars! I think this is one of the common problems in RNA-Seq expression analysis: which expression units to use, FPKM or RPKM. People who use Cufflinks end up with FPKM, and people who use ERANGE end up with RPKM. Cufflinks has a nice explanation of why FPKM saves us from the skewed expression values reported by other software, especially with paired-end read data:
They're almost the same thing. RPKM stands for Reads Per Kilobase of transcript per Million mapped reads. FPKM stands for Fragments Per Kilobase of transcript per Million mapped reads. In RNA-Seq, the relative expression of a transcript is proportional to the number of cDNA fragments that originate from it. Paired-end RNA-Seq experiments produce two reads per fragment, but that doesn't necessarily mean that both reads will be mappable; for example, the second read may be of poor quality. If we were to count reads rather than fragments, we might double-count some fragments but not others, leading to a skewed expression value. Thus, FPKM is calculated by counting fragments, not reads.
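To make the two definitions concrete, here is a minimal Python sketch of the arithmetic. The function name and all the numbers are made up for illustration (they are not from my data and this is not how Cufflinks or ERANGE compute their values internally); it only applies the "per kilobase of transcript per million mapped" definition to a read count and to a fragment count.

```python
def per_kilobase_per_million(count, transcript_length_bp, total_mapped):
    """Return count per kilobase of transcript per million mapped units.

    Gives RPKM when count/total_mapped are reads,
    and FPKM when they are fragments.
    """
    if transcript_length_bp <= 0 or total_mapped <= 0:
        raise ValueError("transcript length and library size must be positive")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return count * 1e9 / (transcript_length_bp * total_mapped)


# Hypothetical transcript: 2,000 bp, 100 fragments assigned to it.
# Assume 80 fragments have both mates mapped and 20 have only one mate mapped.
transcript_length_bp = 2_000
fragments_both_mates = 80
fragments_one_mate = 20

fragment_count = fragments_both_mates + fragments_one_mate    # 100 fragments
read_count = 2 * fragments_both_mates + fragments_one_mate    # 180 reads

# Hypothetical library totals (assumed values).
total_mapped_fragments = 1_000_000
total_mapped_reads = 1_900_000

fpkm = per_kilobase_per_million(fragment_count, transcript_length_bp,
                                total_mapped_fragments)
rpkm = per_kilobase_per_million(read_count, transcript_length_bp,
                                total_mapped_reads)

print(f"FPKM = {fpkm:.2f}")  # FPKM = 50.00
print(f"RPKM = {rpkm:.2f}")  # RPKM = 47.37
```

With these toy numbers the two units differ only slightly, because the read count and the total number of mapped reads both grow roughly twofold relative to the fragment counts.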
However, after analyzing paired-end, long, polyA+ RNA-Seq datasets from around 10 tissues (mapped with TopHat and Bowtie), I noticed that the same genes that have an FPKM between 0 and 1 have an RPKM of ~200. I think this difference could cause serious problems in defining accurate expression units and in determining the number of expressed, up-regulated, or down-regulated genes.
I would appreciate any answer or comment on using RPKM over FPKM, or vice versa. Thanks! :)