Question

Large Data Set Analysis With Cuffdiff

0

10.3 years ago

newDNASeqer ▴ 760

Hi,

I have 28 samples to analyze with cuffdiff. Since cuffdiff performs permutations across all the samples, I am afraid that 28 samples will take a long time to finish. I was wondering whether I can split the 28 samples into smaller data sets, run cuffdiff on each one, and then merge the results. I am not sure how to do this. Do I need to include a common sample in each subset of the cuffdiff analysis?

cuffdiff rnaseq • 2.6k views

updated 2.1 years ago by Ram 43k • written 10.3 years ago by newDNASeqer ▴ 760

Ram · Answer 1 · 2014-01-07

1

10.3 years ago

Johan ▴ 890

Does each sample represent a unique condition? If the samples are really representatives of different groups, you should use the --labels flag and compare the conditions rather than comparing every sample against every other sample.

Your command line should end up looking something like this:

cuffdiff --labels CondA,CondB [all other options] <transcripts.gtf> sample1_from_condA.bam,sample2_from_condA.bam sample3_from_condB.bam,sample4_from_condB.bam
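With many samples, typing the grouped BAM lists by hand is error-prone. Here is a minimal sketch that assembles a command of the form above from a sample sheet. The function name `build_cuffdiff_cmd`, the two-column tab-separated sheet layout (condition, then BAM path), and the file names are all assumptions made for illustration; the function only prints the command so that you can inspect it and add your other options before running it.

```shell
# Sketch: build a cuffdiff command that groups replicate BAMs by condition.
# Assumed (hypothetical) input: a tab-separated sheet, one line per sample:
#   <condition><TAB><path/to/sample.bam>
# Lines starting with '#' and lines with fewer than two fields are skipped.
# Paths containing spaces are not handled by this sketch.
build_cuffdiff_cmd() {
    gtf="$1"
    sheet="$2"
    awk -F '\t' -v gtf="$gtf" '
        /^#/ || NF < 2 { next }
        # First time a condition is seen: remember its order and first BAM.
        !($1 in bams) { order[++n] = $1; bams[$1] = $2; next }
        # Further replicates: join with commas and no spaces.
        { bams[$1] = bams[$1] "," $2 }
        END {
            labels = order[1]
            for (i = 2; i <= n; i++) labels = labels "," order[i]
            cmd = "cuffdiff --labels " labels " " gtf
            # One space-separated group of BAMs per condition, in label order.
            for (i = 1; i <= n; i++) cmd = cmd " " bams[order[i]]
            print cmd
        }' "$sheet"
}

# Example call (file names are placeholders):
#   build_cuffdiff_cmd transcripts.gtf samples.tsv
```

The labels and the BAM groups are emitted in the same order (first appearance in the sheet), so each label lines up with its own group of replicates.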

Grouping the samples this way should reduce the number of comparisons that have to be performed, which should also reduce the run time. Because the number of comparisons grows quadratically with the number of conditions being compared, it pays to design your study sensibly in this respect.
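To put a number on that quadratic growth, here is a small sketch, assuming every pair of conditions is tested against each other, so that n conditions give n(n - 1)/2 comparisons. The function name `pairwise_comparisons` is made up for illustration.

```shell
# Number of pairwise comparisons among n conditions: n * (n - 1) / 2
# (assumes every condition is compared against every other condition).
pairwise_comparisons() {
    n="$1"
    echo $(( n * (n - 1) / 2 ))
}

pairwise_comparisons 28   # 28 samples, each as its own condition: prints 378
pairwise_comparisons 4    # the same samples grouped into 4 conditions: prints 6
```

The 4-condition grouping is only an example; the real number depends on how many distinct conditions the 28 samples actually represent.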

0

Hi Johan,

I am doing a similar kind of analysis. Does the transcripts.gtf file consist of all the transcript GTFs from both conditions (which can be produced with cuffmerge)?

Or can I use my annotation.gtf file there instead and compare the samples from the two conditions?

updated 2.1 years ago by Ram 43k • written 9.4 years ago by Ron ★ 1.2k