Question: Is Cufflinks2's normalization method reliable for differential expression analysis?
3.1 years ago, kevin.l.yang wrote:

I am using both DESeq2 and Cufflinks2 for differential expression analysis of some mouse RNA-seq data. I was reading this guide (http://chagall.med.cornell.edu/RNASEQcourse/Intro2RNAseq.pdf) for help. Page 49 of the guide says that RPKM/FPKM isn't suitable for differential expression analysis, since it doesn't take into account sequencing depth among different samples. However, I figured that in going from Cufflinks to Cufflinks2, the developers would have accounted for this issue. Can I trust my results from Cufflinks2? I am making a poster right now, and I don't want to include my Cufflinks2 results if they're not reliable.

rna-seq • 859 views
3.1 years ago, Kevin Blighe wrote:

You're right: I don't believe that Cufflinks is ideal for differential expression itself, because it doesn't handle the wide variation in counts that can exist. One can end up with log2 fold changes upward of 100, which is astronomical and doesn't make much sense. You are not required to use FPKM with Cufflinks, though; if you must use Cufflinks, for whatever reason, then use geometric normalisation (in Cuffdiff, --library-norm-method geometric).

DESeq2 handles these issues very well and produces sensible P values and fold changes. It also allows you to perform a regularised log transformation (rlog) or variance stabilising transformation (VST) on your data, which is useful for downstream steps such as clustering and plotting. So, many people try to extract raw counts from the TopHat/Cufflinks workflow and then do the differential expression in DESeq2, but this is not possible directly using TopHat/Cufflinks.

A much simpler way to analyse RNA-seq data is with Kallisto ( https://pachterlab.github.io/kallisto/ ), which estimates counts over a CDS FASTA that you supply. If you supply GENCODE's CDS FASTA ( https://www.gencodegenes.org/releases/current.html ), then you can get count estimates for upward of 200,000 transcript isoforms.
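As a rough illustration of what you get back, Kallisto writes an abundance.tsv per sample with the columns target_id, length, eff_length, est_counts and tpm. The Python sketch below sums est_counts to gene level and rounds them so that they could be passed to a count-based tool. The transcript IDs, the `tx2gene` mapping and the numbers are all invented, and the plain rounding is a simplification; in R, the tximport package is the usual route into DESeq2.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for one sample's abundance.tsv (values are made up).
header = ["target_id", "length", "eff_length", "est_counts", "tpm"]
rows = [
    ["txA.1", "1500", "1320.0", "120.4", "35.1"],
    ["txA.2", "900", "720.0", "30.6", "16.4"],
    ["txB.1", "2100", "1920.0", "49.0", "9.8"],
]
toy_tsv = "\n".join("\t".join(r) for r in [header] + rows)

# Hypothetical transcript-to-gene mapping; in practice this would
# come from the annotation that matches your FASTA.
tx2gene = {"txA.1": "geneA", "txA.2": "geneA", "txB.1": "geneB"}


def gene_level_counts(tsv_text, tx2gene):
    """Sum estimated transcript counts per gene and round to integers."""
    totals = defaultdict(float)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for rec in reader:
        gene = tx2gene[rec["target_id"]]
        totals[gene] += float(rec["est_counts"])
    return {gene: round(total) for gene, total in totals.items()}


gene_counts = gene_level_counts(toy_tsv, tx2gene)
```

For a real file you would replace `io.StringIO(toy_tsv)` with an open file handle on the sample's abundance.tsv.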
