DESeq2 Design Clarification
6.4 years ago

Hi,

I've done a model design and I hope someone can help out with my understanding of it!

I have an experimental setup that looks something like this:

3 Time points (0hrs, 6hrs, 12hrs)

3 Different Conditions (Treatments A, B and C)

That makes 9 combinations of time point and treatment, each in triplicate, for 27 samples in total.

My design formula is: ~ Treatment + TimePoint + Treatment:TimePoint

My current understanding is this will give me small pValues of Treatment-specific effects over time?

I wanted to refine this further and look at how a specific treatment differs between two time points, so I used the following lines:

foo <- list(c("TreatmentA.TimePoint12hrs"), c("TreatmentA.TimePoint0hrs"))

resMFType <- results(dds, contrast=foo)

Is this correct?

Thanks,


hi Andrew,

We try to discourage cross-posting to multiple online forums simultaneously as it makes it hard to follow the trail of responses for other users with the same question, and it duplicates the effort for the answerers.

Here's a link to the Bioc post: https://support.bioconductor.org/p/63201/#63206


No problems, sorry!

6.4 years ago

"My current understanding is this will give me small pValues of Treatment-specific effects over time?"

I wouldn't say that; rather, you're fitting a model that can measure treatment-specific effects over time (i.e., time:treatment interactions as well as time-specific and treatment-specific effects). Whether the resulting p-values are small depends on whether there are any significant effects in the dataset at hand (yes, I'm being rather nit-picky here :) ).

BTW, you can shorten your design to ~Treatment*TimePoint.
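To check that the two ways of writing the design really are the same, here is a minimal base R sketch. The `samples` table is a hypothetical stand-in for the 27-sample layout described in the question (it is not the poster's actual data), and it only compares the model matrices that R builds from each formula.

```r
# Hypothetical sample table: 3 treatments x 3 time points x 3 replicates = 27 rows.
samples <- expand.grid(
  Replicate = 1:3,
  TimePoint = c("0hrs", "6hrs", "12hrs"),
  Treatment = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
# Set factor levels explicitly so that "0hrs" and "A" are the reference levels.
samples$TimePoint <- factor(samples$TimePoint, levels = c("0hrs", "6hrs", "12hrs"))
samples$Treatment <- factor(samples$Treatment, levels = c("A", "B", "C"))

# Build the model matrix from the long and the short form of the design.
long_form  <- model.matrix(~ Treatment + TimePoint + Treatment:TimePoint, data = samples)
short_form <- model.matrix(~ Treatment * TimePoint, data = samples)

# Same columns, same entries: the two formulas describe the same model.
print(identical(colnames(long_form), colnames(short_form)))
print(all(long_form == short_form))
print(dim(short_form))
```

The `*` operator expands to both main effects plus their interaction, so nothing changes in the fit; it is only less typing.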

The contrast you mentioned looks correct for looking at changes between 12hrs and 0hrs within Treatment A.
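If it helps to see how a within-treatment time comparison is assembled from the coefficients of an interaction model, here is a rough sketch using ordinary `lm()` on simulated values rather than DESeq2 itself. It assumes R's default treatment coding with "A" and "0hrs" as reference levels; the data, the `cell_mean()` helper and the coefficient names are all illustrative. The names DESeq2 reports can differ depending on version and settings, so check `resultsNames(dds)` before building a real contrast.

```r
set.seed(1)

# Hypothetical layout: 3 treatments x 3 time points x 3 replicates.
samples <- expand.grid(
  Replicate = 1:3,
  TimePoint = c("0hrs", "6hrs", "12hrs"),
  Treatment = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
samples$TimePoint <- factor(samples$TimePoint, levels = c("0hrs", "6hrs", "12hrs"))
samples$Treatment <- factor(samples$Treatment, levels = c("A", "B", "C"))
samples$y <- rnorm(nrow(samples), mean = 10)  # simulated response, one "gene"

fit <- lm(y ~ Treatment * TimePoint, data = samples)
b <- coef(fit)

# Helper (illustrative): observed mean of one treatment/time cell.
cell_mean <- function(trt, tp) {
  mean(samples$y[samples$Treatment == trt & samples$TimePoint == tp])
}

# 12hrs vs 0hrs in the reference treatment (A) is the time main effect alone.
effect_A <- b[["TimePoint12hrs"]]
# In treatment B, the matching interaction term has to be added on top.
effect_B <- b[["TimePoint12hrs"]] + b[["TreatmentB:TimePoint12hrs"]]

print(all.equal(effect_A, cell_mean("A", "12hrs") - cell_mean("A", "0hrs")))
print(all.equal(effect_B, cell_mean("B", "12hrs") - cell_mean("B", "0hrs")))
```

Under this coding, the interaction coefficient by itself is the difference between two time effects, not the time effect within a treatment, which is why it is worth confirming which terms a contrast actually combines.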

Anyway, you seem to have the correct design and know how to form the contrasts, so you should be good to go!


Nit-picking, maybe, but I get your point, thanks! Do you know of any resources for learning more about model designs? The documentation, particularly for DESeq2, seems sparse on the specifics.

Another minor question (or maybe not so minor): based on the model and contrast I described above, would subsetting the input data so that I had just the data for TreatmentA@12hrs and TreatmentA@0hrs, and then using a simple ~ TimePoint design, yield the same results? I.e., is that a like-for-like comparison?


Point taken. We are working on building up more examples of real data analysis that run live code, such as the new DESeq2 workflow linked from the first page of our vignette. For now, for exploring different designs, see the examples section of ?results, which has a number of example designs and shows how to use contrast to pull out different tables.


I personally find the book "Statistical Models" by Freedman to be pretty good (I have it on my desk in fact). He has a section on matrix algebra that should be helpful for those who never had to take linear algebra (i.e., pretty much all biologists, with the exception of more biophysics- or computation-focused people like me).

Your minor question turns out to be a very good one! The answer is no, subsetting won't typically produce the same results...and the reason is important. Tools like DESeq2 work by looking at all of the samples at once, which allows them to get a nice measurement of things like dispersion (the more samples you have, the better your measurement). By subsetting early on, you limit the amount of information available to make these estimates. If you had many more samples per treatment:time pairing, then subsetting probably wouldn't affect much, but with only an N of 3 I expect you'll get better results by fitting the whole model and using contrasts.
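A loose analogy in base R may make the point concrete. This sketch uses ordinary `lm()` on simulated values, not DESeq2's dispersion estimation, and the `samples` table is hypothetical. It compares the full 27-sample interaction model with a 6-sample subset fitted with ~ TimePoint: the estimated 12hrs vs 0hrs difference for treatment A is the same in both, but far fewer residual degrees of freedom are left for estimating variability in the subset.

```r
set.seed(1)

# Hypothetical layout: 3 treatments x 3 time points x 3 replicates.
samples <- expand.grid(
  Replicate = 1:3,
  TimePoint = c("0hrs", "6hrs", "12hrs"),
  Treatment = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
samples$TimePoint <- factor(samples$TimePoint, levels = c("0hrs", "6hrs", "12hrs"))
samples$Treatment <- factor(samples$Treatment, levels = c("A", "B", "C"))
samples$y <- rnorm(nrow(samples), mean = 10)  # simulated response, one "gene"

# Full model on all 27 samples.
full_fit <- lm(y ~ Treatment * TimePoint, data = samples)

# Subset to treatment A at 0hrs and 12hrs only (6 samples), then a simple design.
sub <- droplevels(subset(samples, Treatment == "A" & TimePoint %in% c("0hrs", "12hrs")))
sub_fit <- lm(y ~ TimePoint, data = sub)

# The point estimate of the 12hrs vs 0hrs difference in A agrees between the fits...
print(coef(full_fit)[["TimePoint12hrs"]])
print(coef(sub_fit)[["TimePoint12hrs"]])

# ...but the variance estimate behind it rests on very different amounts of data.
print(df.residual(full_fit))  # 27 samples - 9 coefficients
print(df.residual(sub_fit))   # 6 samples - 2 coefficients
```

DESeq2 works with a negative binomial model and also shares information across genes, so the numbers will not carry over directly, but the direction of the effect is the same: fewer samples in the fit means a noisier estimate of variability.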