6.2 years ago by
I use gene-level abundance (from genes.fpkm_tracking) for differential expression. At least in my experience, every time I saw potentially interesting differences in expression trends between transcripts of the same gene, the alignments didn't really seem to support the predicted differences (I think the main issue may be low coverage, but uneven coverage is probably also a confounding factor). I have found differential splicing events (predicted by MATS, for example) to be decent, but that is not really related to gene expression differences / RPKM measurements.
In other words, do you get more consistent results across biological replicates with genes.fpkm_tracking than with isoforms.fpkm_tracking (or with gene_exp.diff versus isoform_exp.diff, from cuffdiff)? Of course, there may just be true biological variability, but this is one potential technical issue that I am aware of.
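One rough way to check replicate consistency is to correlate log-transformed FPKM values between two replicates, once at the gene level and once at the isoform level. The sketch below is only an illustration: the function names, the pseudocount of 1, and the toy FPKM values are all made up, and you would need to parse the per-replicate FPKM values out of your own tracking files first.

```python
import math

def log2_fpkm(values, pseudocount=1.0):
    # The pseudocount (an arbitrary choice here) avoids log2(0) for unexpressed features.
    return [math.log2(v + pseudocount) for v in values]

def pearson(x, y):
    # Plain Pearson correlation; assumes both lists vary (non-zero variance).
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def replicate_consistency(rep1, rep2):
    # rep1 and rep2 map a feature ID (gene or isoform) to its FPKM in one replicate.
    shared = sorted(set(rep1) & set(rep2))
    x = log2_fpkm([rep1[k] for k in shared])
    y = log2_fpkm([rep2[k] for k in shared])
    return pearson(x, y)

# Toy values, invented for illustration only (not real data).
gene_rep1 = {"g1": 10.0, "g2": 50.0, "g3": 200.0}
gene_rep2 = {"g1": 12.0, "g2": 45.0, "g3": 210.0}
iso_rep1 = {"t1": 8.0, "t2": 2.0, "t3": 50.0, "t4": 0.0}
iso_rep2 = {"t1": 1.0, "t2": 11.0, "t3": 20.0, "t4": 25.0}

gene_r = replicate_consistency(gene_rep1, gene_rep2)
iso_r = replicate_consistency(iso_rep1, iso_rep2)
print("gene-level r:", round(gene_r, 3))
print("isoform-level r:", round(iso_r, 3))
```

If the gene-level correlation is clearly higher than the isoform-level one on your real data, that would be consistent with the isoform estimates being the noisier of the two.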
FYI, you need to provide DESeq with raw counts, not FPKM values.
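A quick sanity check before handing a table to DESeq is to confirm that it really contains non-negative integers, since FPKM values are normalized and usually fractional. This is just a minimal sketch: the function name and the toy tables are hypothetical, and it says nothing about how the counts were produced.

```python
def looks_like_raw_counts(table):
    # table maps a gene ID to a list of per-sample values.
    # Raw counts should be non-negative whole numbers; FPKM values usually are not.
    for values in table.values():
        for v in values:
            if v < 0 or not float(v).is_integer():
                return False
    return True

# Toy tables, invented for illustration only.
count_table = {"GENE_A": [120, 98, 143], "GENE_B": [0, 3, 1]}
fpkm_table = {"GENE_A": [15.72, 12.04, 18.31], "GENE_B": [0.0, 0.41, 0.13]}

print(looks_like_raw_counts(count_table))
print(looks_like_raw_counts(fpkm_table))
```

Passing this check does not prove the values are raw counts (rounded FPKMs would also pass), but failing it is a clear sign that the table is not suitable input.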
Although most of my experience has been with cuffdiff (not cuffdiff2), I'm pretty sure the actual results will not be the same for cuffdiff (or cuffdiff2) and DESeq (or DESeq2). In fact, the cuffdiff2 paper compares the differences in results for cuffdiff2, DESeq, and edgeR:
My expectation is that DESeq will probably give better results than cuffdiff, but I think it will be useful to compare your own results. Once you have a table of raw counts, DESeq will run much more quickly than cuffdiff. So your own personal benchmark shouldn't take that long, and I think it can help you be a lot more confident in your results.
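For the comparison itself, one simple starting point is to look at how much the two lists of significant genes overlap. The sketch below assumes you have already extracted the significant gene IDs from each tool at whatever cutoff you prefer; the function name and the gene IDs are placeholders, not real results.

```python
def compare_calls(calls_a, calls_b):
    # Summarize the agreement between two sets of significant gene IDs.
    a, b = set(calls_a), set(calls_b)
    union = a | b
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: shared calls divided by all calls; 0.0 if neither tool called anything.
        "jaccard": len(a & b) / len(union) if union else 0.0,
    }

# Placeholder gene IDs, invented for illustration only.
cuffdiff_hits = ["GENE_A", "GENE_B", "GENE_C"]
deseq_hits = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]

summary = compare_calls(cuffdiff_hits, deseq_hits)
print(summary)
```

The genes called by only one tool are probably the most informative ones to inspect by eye in a genome browser.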