Question

Setup for multiple strains, multiple conditions, and replicates

8.5 years ago

montezuma ▴ 10

Hello all,

I have RNA-Seq data from an experiment with multiple strains and conditions, and I want to confirm the correct setup for differential expression (DEG) analysis using the Tuxedo pipeline.

The experiment compares two C. elegans strains, wild-type and mutant, each either water-treated or drug-treated, with three replicates per group, so 12 samples in total (2 x 2 x 3). What I have done so far:

I ran TopHat for each sample, and then Cufflinks separately for each sample.
To compare drug treatment to water treatment, I ran Cuffmerge on all Cufflinks outputs from the wild-type strain and then ran Cuffdiff on that merged output. I repeated the same process for the mutant strain.
This gives DEGs for drug versus water treatment in each strain separately, but I also want a comparison between strains. What is the correct way to compare the effect of drug treatment between the wild-type and mutant strains?

Thanks for the help, much appreciated!

cufflinks tuxedo rna-seq • 2.2k views


Ram · Answer 1 · 2016-01-26

I have a similar setup. People handle these designs differently depending on the tool and workflow they use, so there is no single straight answer. After some testing and searching, I can share what I do.

I use the Tuxedo suite for most of the analysis.

Run TopHat individually on all samples and biological replicates to map the reads (you can merge technical replicates before this step).
Run Cuffquant, which generates .cxb abundance profiles that can be passed to Cuffdiff later.
I skip Cuffmerge completely, because I am only interested in DEGs, not novel isoform discovery.
Run Cuffdiff, combining all the replicates and conditions in one call. Replicates of the same condition are separated by a comma (,), whereas different conditions are separated by a space and listed in order, as you would for a time series.
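To make the comma/space convention concrete, here is a minimal Python sketch that assembles a Cuffdiff command for the 2 x 2 design in the question. The directory names, condition labels, and `genes.gtf` are hypothetical placeholders, and the helper `build_cuffdiff_args` is just an illustration, not part of the Tuxedo suite. It only builds the argument list; it does not run anything.

```python
from typing import Dict, List


def build_cuffdiff_args(gtf: str,
                        conditions: Dict[str, List[str]],
                        out_dir: str = "cuffdiff_out") -> List[str]:
    """Build a cuffdiff argument list.

    Replicates of one condition are joined with commas into a single
    argument; each condition is its own argument (i.e. separated by
    spaces on the command line). Labels are passed via -L in the same order.
    """
    labels = ",".join(conditions)  # dict keys, in insertion order
    args = ["cuffdiff", "-o", out_dir, "-L", labels, gtf]
    for replicates in conditions.values():
        args.append(",".join(replicates))
    return args


# Hypothetical layout: one cuffquant output directory per sample,
# each containing the abundances.cxb file that cuffquant writes.
conditions = {
    group: [f"cuffquant/{group}_rep{i}/abundances.cxb" for i in (1, 2, 3)]
    for group in ("WT_water", "WT_drug", "mut_water", "mut_drug")
}

cmd = build_cuffdiff_args("genes.gtf", conditions)
print(" ".join(cmd))
```

As far as I know, when more than two conditions are given like this, Cuffdiff reports the pairwise comparisons between them, so check which sample_1/sample_2 pair each row of the output belongs to.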

For the last step, you can run several scripts, each one answering a different question:

WT1rep1,WT1rep2,WT2rep1,WT2rep2 Cond1rep1,Cond1rep2,Cond2rep1,Cond2rep2 -> will give you the DEGs between the WT and mutant strains

WT1rep1,WT1rep2 WT2rep1,WT2rep2 -> will give you the differences between WT strains

WT1rep1,WT1rep2 Cond1rep1,Cond1rep2 and WT2rep1,WT2rep2 Cond2rep1,Cond2rep2 -> the DEGs from these two runs should overlap well, which tells you that your strains and replicates are reproducible and the experiment worked.
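One way to check that overlap is sketched below in Python. It assumes each run's gene_exp.diff is tab-delimited with `gene_id` and `significant` columns (the latter holding `yes`/`no`), which is the layout I am used to from Cuffdiff. The helper names and the toy rows are made up for illustration, and real files contain more columns than shown here.

```python
import csv
import io
from typing import Dict, Set, TextIO


def significant_genes(diff_file: TextIO) -> Set[str]:
    """Return the gene_ids flagged significant in a gene_exp.diff-style table."""
    reader = csv.DictReader(diff_file, delimiter="\t")
    return {row["gene_id"] for row in reader if row["significant"] == "yes"}


def overlap_summary(first: Set[str], second: Set[str]) -> Dict[str, float]:
    """Count shared and run-specific DEGs and compute the Jaccard index."""
    shared = first & second
    union = first | second
    return {
        "shared": len(shared),
        "only_first": len(first - second),
        "only_second": len(second - first),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy stand-ins for two gene_exp.diff files (made-up values).
run1 = io.StringIO(
    "gene_id\tq_value\tsignificant\n"
    "g1\t0.001\tyes\n"
    "g2\t0.010\tyes\n"
    "g3\t0.600\tno\n"
)
run2 = io.StringIO(
    "gene_id\tq_value\tsignificant\n"
    "g1\t0.002\tyes\n"
    "g2\t0.300\tno\n"
    "g4\t0.004\tyes\n"
)

summary = overlap_summary(significant_genes(run1), significant_genes(run2))
print(summary)
```

With real data you would replace the `io.StringIO` objects with `open("run1/gene_exp.diff")` and so on. What counts as a "good" overlap is a judgment call, not something this sketch decides for you.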

This might also be helpful:

RNA seq time course data experiment design

HTH