Question

Why does the cuffdiff command line contain parameters suited for cufflinks?

8.6 years ago

grokaine ▴ 40

For those who don't know, cuffdiff is part of the cufflinks suite of programs for RNA-Seq. Cuffdiff is designed for differential expression analysis, yet I noticed it has several options that I have never used, because they seem to belong to cufflinks, the program that assembles transcripts from the aligned reads, which I already run earlier in my pipeline. Even the cuffdiff documentation mentions cufflinks rather than cuffdiff when describing these options.

http://cole-trapnell-lab.github.io/cufflinks/cuffdiff/

So my question is: in which cases should these parameters be passed to cuffdiff? And how does it work in practice: does cuffdiff trigger a re-assembly through cufflinks?

Example of parameters I am interested in (pasted from the above link, but there are several other similar parameters):

--compatible-hits-norm

With this option, Cufflinks counts only those fragments compatible with some reference transcript towards the number of mapped fragments used in the FPKM denominator. Using this mode is generally recommended in Cuffdiff to reduce certain types of bias caused by differential amounts of ribosomal reads which can create the impression of falsely differentially expressed genes. It is active by default.

-b/--frag-bias-correct <genome.fa>

Providing Cufflinks with the multifasta file your reads were mapped to via this option instructs it to run our bias detection and correction algorithm which can significantly improve accuracy of transcript abundance estimates. See How Cufflinks Works for more details.

-u/--multi-read-correct

Tells Cufflinks to do an initial estimation procedure to more accurately weight reads mapping to multiple locations in the genome. See How Cufflinks Works for more details.

RNA-Seq • 2.8k views

score 2 · Accepted Answer · 2015-09-16

8.6 years ago

Devon Ryan 104k

Cuffdiff needs to recalculate everything itself: once you run cuffmerge on the cufflinks-generated GTF files, the metrics that cufflinks computed against the original per-sample assemblies no longer apply to the merged one.
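
To make that concrete, here is a minimal sketch of what the cuffdiff step might look like when it is driven from Python. It only builds the argument list and prints it; nothing is executed. The file names (merged_asm/merged.gtf, genome.fa, the BAM files) and the helper build_cuffdiff_cmd are hypothetical placeholders, and the only cuffdiff options used are -o, -b and -u.

```python
import shlex


def build_cuffdiff_cmd(merged_gtf, cond_a_bams, cond_b_bams,
                       genome_fa=None, multi_read_correct=False,
                       out_dir="diff_out"):
    """Build (but do not run) a cuffdiff argument list.

    Hypothetical helper for illustration: cuffdiff takes the merged GTF,
    then one comma-separated list of alignment files per condition.
    """
    cmd = ["cuffdiff", "-o", out_dir]
    if genome_fa is not None:
        # -b/--frag-bias-correct: bias correction during abundance estimation
        cmd += ["-b", genome_fa]
    if multi_read_correct:
        # -u/--multi-read-correct: re-weight multi-mapped reads
        cmd.append("-u")
    cmd.append(merged_gtf)
    cmd.append(",".join(cond_a_bams))  # replicates of condition A
    cmd.append(",".join(cond_b_bams))  # replicates of condition B
    return cmd


# All paths below are made up for the example.
cmd = build_cuffdiff_cmd(
    "merged_asm/merged.gtf",
    ["a_rep1.bam", "a_rep2.bam"],
    ["b_rep1.bam", "b_rep2.bam"],
    genome_fa="genome.fa",
    multi_read_correct=True,
)
print(shlex.join(cmd))
```

Note that the input is the merged GTF plus the original alignments, so the estimation options have to be given again at this stage; whatever was passed to cufflinks earlier applied to a different set of transcripts.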

Edit: sorry, I wasn't very clear and had to make some edits, so please read my reply again. Which input metrics are you talking about? Any metrics are presumably computed by cuffdiff from the assembled GTF files and the alignments (SAM/BAM) produced by tophat, so there are no "input" metrics, only internal metrics that cuffdiff computes. An option like -u, on the other hand, would seem to correct the assembly itself, which makes no sense to me. Or are you implying that cuffdiff actually re-assembles everything from scratch?

Reply • 8.6 years ago by grokaine ▴ 40

Options like -u don't affect assembly; they affect estimation. Cufflinks does an initial estimation along with assembly, but cuffdiff redoes these estimations using the merged assembly. The options you mentioned all pertain to FPKM estimation and have nothing to do with assembly.
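
As a rough illustration of why an option such as --compatible-hits-norm is an estimation setting, the toy calculation below uses the textbook FPKM formula (fragments per kilobase per million fragments) with two different denominators. This is a simplification, not cuffdiff's actual statistical model, and every number in it is invented: one gene with the same fragment count in two samples, where sample B simply contains more ribosomal reads that match no reference transcript.

```python
def fpkm(gene_fragments, gene_length_bp, denominator_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million fragments."""
    return 1e9 * gene_fragments / (denominator_fragments * gene_length_bp)


# Invented numbers: both samples have 9M fragments compatible with the
# annotation, but B carries 5M rRNA fragments versus 1M in A.
samples = {
    "A": {"total": 10_000_000, "compatible": 9_000_000},
    "B": {"total": 14_000_000, "compatible": 9_000_000},
}
GENE_FRAGMENTS = 1_000   # same count in both samples
GENE_LENGTH_BP = 2_000

# Denominator = all mapped fragments
total_norm = {
    name: fpkm(GENE_FRAGMENTS, GENE_LENGTH_BP, s["total"])
    for name, s in samples.items()
}
# Denominator = only fragments compatible with a reference transcript
compat_norm = {
    name: fpkm(GENE_FRAGMENTS, GENE_LENGTH_BP, s["compatible"])
    for name, s in samples.items()
}

fold_total = total_norm["B"] / total_norm["A"]
fold_compat = compat_norm["B"] / compat_norm["A"]

print(f"total-hits normalization:      B/A = {fold_total:.3f}")
print(f"compatible-hits normalization: B/A = {fold_compat:.3f}")
```

With all mapped fragments in the denominator, the gene looks down-regulated in B even though its fragment count is unchanged; with compatible fragments only, the ratio is 1. The transcripts themselves are the same in both cases, only the abundance arithmetic differs.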

Reply • 8.6 years ago by Devon Ryan 104k