DESeq2 with crossover design
3.3 years ago
Ludo25 • 0

Hi Everyone,

I'm using DESeq2 to find genes that are differentially expressed between treatments at the end of a 2x2 crossover study. I went through the DESeq2 vignette and the forums but could not find an example that uses this specific design.

My colData looks like this:

Subject     Treatment         Time        Sequence
S1             T1              Pre          T1-T2
S1             T1              Post         T1-T2     
S1             T2              Pre          T1-T2     
S1             T2              Post         T1-T2
S2             T2              Pre          T2-T1     
S2             T2              Post         T2-T1  
S2             T1              Pre          T2-T1    
S2             T1              Post         T2-T1

As I said, I'd like to test for genes that differ significantly between the two treatments (T1 vs T2) at the end of the study (Post). Does the following design seem appropriate for this purpose?

~ Subject + Treatment + Time + Time*Treatment

And if so, which contrast from resultsNames should I use?

Thank you so much for your reply!

LD

Time*Treatment is equivalent to Time + Treatment + Time:Treatment, so your formula as written can be simplified to ~ Subject + Treatment*Time.

Let's say that your base factor levels are T1 and Pre using this formula. The interaction term will then model (T2_Post - T1_Post) - (T2_Pre - T1_Pre), which means it will return genes whose T2 vs T1 fold change differs between Pre and Post (while controlling for subject and the main effects). Is this the comparison you wanted?
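To make the two points above concrete, here is a minimal base R sketch (no DESeq2 needed). The sample table is a toy copy of the one in the question, and the object names (design_long, design_short, interaction_rows) are made up for illustration. It checks that both ways of writing the formula give the same design matrix, and shows which samples the interaction column picks out when T1 and Pre are the base levels.

```r
# Toy sample table mirroring the one in the question (2 subjects, 8 samples).
coldata <- data.frame(
  Subject   = factor(rep(c("S1", "S2"), each = 4)),
  Treatment = factor(c("T1", "T1", "T2", "T2", "T2", "T2", "T1", "T1")),
  Time      = factor(rep(c("Pre", "Post"), times = 4))
)

# Set the base (reference) levels explicitly: T1 and Pre.
coldata$Treatment <- relevel(coldata$Treatment, ref = "T1")
coldata$Time      <- relevel(coldata$Time, ref = "Pre")

# The formula as posted and the shorter equivalent.
design_long  <- model.matrix(~ Subject + Treatment + Time + Time*Treatment,
                             data = coldata)
design_short <- model.matrix(~ Subject + Treatment * Time, data = coldata)

# Both formulas expand to the same columns with the same values.
same_columns <- identical(colnames(design_long), colnames(design_short))
same_values  <- all(design_long == design_short)

# The interaction column is 1 only for samples that are both T2 and Post.
interaction_rows <- unname(which(design_short[, "TreatmentT2:TimePost"] == 1))

print(colnames(design_short))
print(coldata[interaction_rows, ])
```

In DESeq2 the same formula would go into the design argument, and the interaction coefficient should then show up in resultsNames(dds), most likely as TreatmentT2.TimePost; it is worth checking the exact name on your own object.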

Thanks a lot for your quick reply rpolicastro!

Thank you for the clarification on the formula. Sorry, I may not have made it clear enough, but I don't want to identify genes that change over time. I want genes that have a significant T1 vs T2 fold change at the final time point (Post). So, what contrast or name should I use in the results() function to get these genes (as you mentioned, I'd use T1 and Pre as base factor levels)?

3.3 years ago

Make a new factor that combines treatment and time, with levels such as T1_Pre, T1_Post, etc. As an example, let's call that new factor 'Combined'.

coldata$Combined <- factor(paste(coldata$Treatment, coldata$Time, sep="_"))

> coldata
  Subject Treatment Time Sequence Combined
1      S1        T1  Pre    T1-T2   T1_Pre
2      S1        T1 Post    T1-T2  T1_Post
3      S1        T2  Pre    T1-T2   T2_Pre
4      S1        T2 Post    T1-T2  T2_Post
5      S2        T2  Pre    T2-T1   T2_Pre
6      S2        T2 Post    T2-T1  T2_Post
7      S2        T1  Pre    T2-T1   T1_Pre
8      S2        T1 Post    T2-T1  T1_Post

Your formula for regression would be ~ Subject + Combined, and your contrast for use in the results function would be c("Combined", "T2_Post", "T1_Post").
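A rough sketch of how this could look end to end, in case it helps. Everything except the formula and the contrast is an assumption made for illustration: the sample table is a toy copy of the one above, contrast_vec is just a name I picked, and the counts are simulated with DESeq2's makeExampleDESeqDataSet() rather than real data. The DESeq2 part only runs if the package is installed.

```r
# Toy sample table mirroring the one in the question.
coldata <- data.frame(
  Subject   = factor(rep(c("S1", "S2"), each = 4)),
  Treatment = factor(c("T1", "T1", "T2", "T2", "T2", "T2", "T1", "T1")),
  Time      = factor(rep(c("Pre", "Post"), times = 4))
)

# Combined treatment/time factor, with T1_Pre as the reference level.
coldata$Combined <- factor(paste(coldata$Treatment, coldata$Time, sep = "_"))
coldata$Combined <- relevel(coldata$Combined, ref = "T1_Pre")

# Design matrix for ~ Subject + Combined.
design <- model.matrix(~ Subject + Combined, data = coldata)

# The contrast c("Combined", "T2_Post", "T1_Post") corresponds to
# (T2_Post coefficient) minus (T1_Post coefficient) in this design.
contrast_vec <- setNames(numeric(ncol(design)), colnames(design))
contrast_vec["CombinedT2_Post"] <- 1
contrast_vec["CombinedT1_Post"] <- -1
print(contrast_vec)

# Optional DESeq2 run on simulated counts (placeholder data, not real results).
if (requireNamespace("DESeq2", quietly = TRUE)) {
  set.seed(1)
  sim <- DESeq2::makeExampleDESeqDataSet(n = 200, m = 8)
  cts <- DESeq2::counts(sim)
  colnames(cts) <- rownames(coldata)

  dds <- DESeq2::DESeqDataSetFromMatrix(countData = cts,
                                        colData   = coldata,
                                        design    = ~ Subject + Combined)
  dds <- suppressMessages(DESeq2::DESeq(dds, quiet = TRUE))

  # log2 fold change of T2_Post over T1_Post, controlling for Subject.
  res <- DESeq2::results(dds, contrast = c("Combined", "T2_Post", "T1_Post"))
  print(DESeq2::resultsNames(dds))
}
```

With the three-element contrast, the reported log2 fold change is T2_Post over T1_Post, so positive values mean higher expression under T2 at the final time point.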

Thanks again for this explanation! It's very clear now.

By the way, the Subject variable should be treated as a random effect, but as far as I know DESeq2 cannot handle random effects. Do you see any alternative or better way to include this variable?

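As far as I know DESeq2 does not support random effects (someone correct me if I'm wrong), so within DESeq2 you'll have to keep Subject as a fixed effect. Outside DESeq2, limma's duplicateCorrelation() and the dream() function in variancePartition can treat subject as a random effect.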

Thanks a lot for your reply!

By the way, when using the formula with combined factors (Treatment and Time) and then the contrast you mentioned above, I understand that you get the difference between treatments at the final time point (as desired), but do you control for baseline (Pre) as well?

Can you explain a little more what you mean by controlling for the baseline level?

Of course rpolicastro! By "controlling for baseline" I meant taking into account difference at time=0 (or "Pre" here) when calculating difference between treatments at the final time point ("Post"). Is that clearer?

This question sounds closer to the original method of using an interaction term discussed earlier. If you use the interaction term, you will get back genes whose T2 vs T1 log2 FC differs between Pre and Post, i.e. the treatment difference at Post after subtracting the treatment difference at Pre.
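To show how the two quantities relate, here is a small base R sketch on made-up log-expression values for a single gene (the numbers and the names y, post_diff and adjusted_diff are invented for illustration, and lm() stands in for the per-gene GLM that DESeq2 fits). In the interaction model, the Post-only treatment difference is the Treatment main effect plus the interaction, while the baseline-adjusted difference is the interaction on its own.

```r
# Toy sample table mirroring the one in the question, base levels T1 and Pre.
coldata <- data.frame(
  Subject   = factor(rep(c("S1", "S2"), each = 4)),
  Treatment = factor(c("T1", "T1", "T2", "T2", "T2", "T2", "T1", "T1"),
                     levels = c("T1", "T2")),
  Time      = factor(rep(c("Pre", "Post"), times = 4),
                     levels = c("Pre", "Post"))
)

# Made-up log2 expression for one gene, in the same row order as coldata.
y <- c(5.0, 5.5, 5.5, 7.0, 4.8, 6.9, 4.9, 5.2)

# Ordinary linear model as a stand-in for the per-gene fit.
fit <- lm(y ~ Subject + Treatment * Time, data = coldata)
b   <- coef(fit)

# T2 vs T1 at Post: main effect of Treatment plus the interaction.
post_diff <- unname(b["TreatmentT2"] + b["TreatmentT2:TimePost"])

# Baseline-adjusted difference: the interaction term alone,
# i.e. (T2_Post - T1_Post) - (T2_Pre - T1_Pre).
adjusted_diff <- unname(b["TreatmentT2:TimePost"])

# Cell means for comparison (rows = Treatment, columns = Time).
cell_means <- tapply(y, list(coldata$Treatment, coldata$Time), mean)

print(cell_means)
print(c(post_diff = post_diff, adjusted_diff = adjusted_diff))
```

If I have the DESeq2 naming right, the equivalent calls for a dds fitted with ~ Subject + Treatment*Time would be results(dds, name = "TreatmentT2.TimePost") for the baseline-adjusted difference and results(dds, contrast = list(c("Treatment_T2_vs_T1", "TreatmentT2.TimePost"))) for the Post-only difference, but please confirm the coefficient names with resultsNames(dds) first.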
