Multiple exposure timelines (time-points) in one DESeq2 object or multiple DESeq objects
4 weeks ago
salman_96 ▴ 60

Hi, I am working on a study comparing control and drug-exposed samples. An important part of the study design is that there are several time points (controls at 1 day vs. drug exposure at 1 day, controls at 4 days vs. drug exposure at 4 days, etc.). The metadata looks like this:

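# Time point (in days) for each of the 18 samples
time_days <- factor(c(10,10,10,10,10,10,4,4,4,4,4,4,1,1,1,1,1,1))
# NOTE: `dose` is not defined in this snippet; it is assumed to be a
# length-18 vector created earlier in the script.
coldata <- data.frame(dose,
                      condition = factor(c(
                        rep("control", 3),
                        rep("low", 3),
                        rep("control", 3),
                        rep("low", 3),
                        rep("control", 3),
                        rep("low", 3))),
                      time_days)
# Combined condition/time label, e.g. "control_10" or "low_1"
coldata$Groups_of_Interest <- paste(coldata$condition, coldata$time_days, sep = "_")
coldata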

I would like advice on whether to keep all samples together for the comparison or to split them by time of exposure. In other words, should there be a single DESeq2 object for all samples, or a separate DESeq2 object for each time point? I understand that, in general, splitting the DESeq2 object may compromise the normalization step.

Are there any suggestions I could look into?

Best regards

time points exposure drug DEGs multiple DESEQ2 • 332 views
4 weeks ago
Rafael Soler ▴ 460

The dispersion estimates will change depending on whether you analyze the data together or separately, so the best strategy depends on how variable the samples are within each group. If some groups are more variable than others, analyzing everything together lets the high-variability groups inflate the dispersion estimates used for the low-variability groups, so the best option is to analyze them separately. If all the groups have a similar spread, it is best to keep the data together. You can use a PCA plot (plotPCA) to see whether the spread between samples is small or large.
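As a rough illustration of the PCA check, here is a minimal sketch. It assumes simulated negative-binomial counts in place of the real count matrix (the names `counts`, `gene_mu`, and `n_genes` are made up for the example) and only runs the DESeq2 part if the package is installed.

```r
# Sketch only: simulated counts stand in for the real count matrix.
set.seed(1)
coldata <- data.frame(
  condition = factor(rep(rep(c("control", "low"), each = 3), times = 3)),
  time_days = factor(rep(c(10, 4, 1), each = 6))
)
n_genes <- 1000
gene_mu <- 2^runif(n_genes, min = 4, max = 10)  # hypothetical per-gene means
counts <- matrix(rnbinom(n_genes * nrow(coldata), mu = gene_mu, size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("sample", seq_len(nrow(coldata)))))
storage.mode(counts) <- "integer"
rownames(coldata) <- colnames(counts)

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                                design = ~ time_days + condition)
  # Variance-stabilize before PCA; blind = TRUE ignores the design
  vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
  # Colour samples by condition and time to compare within-group spread
  print(plotPCA(vsd, intgroup = c("condition", "time_days")))
}
```

If replicates of one time point scatter much more widely than the others in this plot, that would be a hint that the groups do not share a similar spread.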

See also these posts as reference:

Dispersion estimation using DESeq2

DESeq2 with multiple variable give me different results

Best,

Rafa


+1, fyi you can simply post a plain biostars link to an existing thread/answer, it will then display the title of the thread, there is no need to embed links manually


Oh, I see. Thank you :)

4 weeks ago

This depends on what question you are trying to answer.

If your question is "Which genes vary between treated and untreated at each time point?", then follow the advice set out above by @rafaelsoler9.
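For the per-time-point question, the two options could look roughly like the sketch below. It again assumes simulated counts (`counts`, `gene_mu`, `n_genes`, and `keep` are hypothetical names) and uses the combined `Groups_of_Interest` factor from the question; the DESeq2 calls only run if the package is installed.

```r
# Sketch only: simulated counts stand in for the real count matrix.
set.seed(1)
coldata <- data.frame(
  condition = factor(rep(rep(c("control", "low"), each = 3), times = 3)),
  time_days = factor(rep(c(10, 4, 1), each = 6))
)
coldata$Groups_of_Interest <- factor(
  paste(coldata$condition, coldata$time_days, sep = "_"))
n_genes <- 1000
gene_mu <- 2^runif(n_genes, min = 4, max = 10)  # hypothetical per-gene means
counts <- matrix(rnbinom(n_genes * nrow(coldata), mu = gene_mu, size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("sample", seq_len(nrow(coldata)))))
storage.mode(counts) <- "integer"
rownames(coldata) <- colnames(counts)
keep <- coldata$time_days == "1"  # samples from the day-1 time point

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  # Option 1: one object, one factor level per condition/time combination
  dds <- DESeqDataSetFromMatrix(counts, coldata, design = ~ Groups_of_Interest)
  dds <- DESeq(dds)
  res_day1 <- results(dds, contrast = c("Groups_of_Interest", "low_1", "control_1"))
  res_day4 <- results(dds, contrast = c("Groups_of_Interest", "low_4", "control_4"))

  # Option 2: a separate object built from the day-1 samples only
  dds_day1 <- DESeqDataSetFromMatrix(counts[, keep], droplevels(coldata[keep, ]),
                                     design = ~ condition)
  dds_day1 <- DESeq(dds_day1)
  res_day1_split <- results(dds_day1, contrast = c("condition", "low", "control"))
}
```

In option 1 the size factors and dispersions are estimated from all 18 samples; in option 2 they come from the 6 day-1 samples only, which is where the two approaches can give different results.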

However, if your question is "Can I find genes where the expression time course is altered by treatment?", then you want to analyze them together with an interaction design.

You would use a likelihood ratio test (LRT) to compare the full model:

~ time_days + condition + condition:time_days

to the reduced model:

~ time_days + condition

This will find genes whose time course differs between treatments.
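A minimal sketch of that LRT setup follows, assuming simulated counts in place of the real data (`counts`, `gene_mu`, `n_genes`, `full`, and `reduced` are names made up for the example). The design matrices are built with base R to show which terms the test removes; the DESeq2 part only runs if the package is installed.

```r
# Sketch only: simulated counts stand in for the real count matrix.
set.seed(1)
coldata <- data.frame(
  condition = factor(rep(rep(c("control", "low"), each = 3), times = 3)),
  time_days = factor(rep(c(10, 4, 1), each = 6))
)
n_genes <- 1000
gene_mu <- 2^runif(n_genes, min = 4, max = 10)  # hypothetical per-gene means
counts <- matrix(rnbinom(n_genes * nrow(coldata), mu = gene_mu, size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("sample", seq_len(nrow(coldata)))))
storage.mode(counts) <- "integer"
rownames(coldata) <- colnames(counts)

full    <- ~ time_days + condition + condition:time_days
reduced <- ~ time_days + condition

# The columns present only in the full model are the interaction terms under test
full_mm    <- model.matrix(full, coldata)
reduced_mm <- model.matrix(reduced, coldata)
print(setdiff(colnames(full_mm), colnames(reduced_mm)))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  dds <- DESeqDataSetFromMatrix(counts, coldata, design = full)
  dds <- DESeq(dds, test = "LRT", reduced = reduced)
  res <- results(dds)  # p-values are from the LRT of full vs. reduced
  print(head(res[order(res$padj), ]))
}
```

Note that with `test = "LRT"` the p-values refer to the whole set of interaction terms, while the log2 fold change column in `res` still describes a single coefficient.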
