Question: Cluster strategy for a time-series analysis in DESeq2
13 months ago, Cecelia wrote:


I would like to do a time-series analysis in DESeq2. I have the following experimental design:

Treatment: Infected and Non-infected (Treat vs. Control)
Time: 1, 4, 12 hours

For each Treatment-Time combination, we have an unbalanced number of biological replicates (from 3 to 8).

I have already done the transcript quantification with Salmon, and now I would like some advice on the DESeq2 analysis. I followed the tutorial from here to build a complete design:

ddsTCpus14 <- DESeqDataSetFromTximport(txifilesevipus14, colData=colDatapus,
                              design = ~ Treatment + Time +  Treatment:Time)

and with a reduced design:

ddsTCpus14 <- DESeq(ddsTCpus14, test="LRT", reduced = ~ Time + Treatment)

The resultsNames are:

> resultsNames(ddsTCpus14)
[1] "Intercept"                  "Treatment_Treat_vs_Control"
[3] "Time_12h_vs_1h"             "Time_4h_vs_1h"             
[5] "TreatmentTreat.Time12h"     "TreatmentTreat.Time4h"

So my questions are:

Question 1

I am thinking of making lists of DE genes for my comparisons, first comparing Treat vs. Control at each time point:

#1hour treat vs control

pusres1 <- results(ddsTCpus14, name="Treatment_Treat_vs_Control", test="Wald", alpha=0.05)

#4hour treat vs control

pusres4 <- results(ddsTCpus14, contrast=list(c("Treatment_Treat_vs_Control","TreatmentTreat.Time4h" )), test="Wald", alpha=0.05)

#12hour treat vs control

pusres12 <- results(ddsTCpus14, contrast=list(c("Treatment_Treat_vs_Control","TreatmentTreat.Time12h" )), test="Wald", alpha=0.05)

Then comparing the treat:time interaction term between different time points:

#1h to 4h

pusresinter14 <- results(ddsTCpus14, name="TreatmentTreat.Time4h", test="Wald", alpha=0.05)

#1h to 12h
pusresinter112 <- results(ddsTCpus14, name="TreatmentTreat.Time12h", test="Wald", alpha=0.05)

My question is: Is there a way to test the interaction term between 4h and 12h? I read through this post but still could not figure it out.

Question 2

Assume I have the DEG lists for all the comparisons. Does it make sense to first filter each DEG list by adjusted p-value and fold change, then combine all the lists into one big DEG list, and then do a clustering (using hclust or another clustering approach) based on the big list? All the downstream GO enrichment tests would be based on the clusters.

Question 3

I am not sure whether the unbalanced numbers of replicates would influence the clustering. Are there any suggestions?

Some of these ideas probably make no sense to you specialists, but any suggestion would be very much appreciated!

Thanks in advance! Cecelia

written 13 months ago by Cecelia

For question 2: there is no problem in doing that - that is a standard procedure.
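
As a rough illustration of that workflow, here is a base-R sketch. The three small tables are made-up stand-ins for DESeq2 results (only the `log2FoldChange` and `padj` columns), the cutoffs `alpha = 0.05` and `lfc = 1` are arbitrary, and `pick_deg` is a hypothetical helper, not a DESeq2 function. Clustering on the log2 fold-change profiles is just one possible choice; transformed expression values for the same gene set could be clustered in the same way.

```r
# Made-up stand-ins for three DESeq2 results tables (illustration only).
genes <- paste0("gene", 1:8)
res_list <- list(
  t1h = data.frame(
    log2FoldChange = c(2.0, 3.0, 0.5, -1.5, -1.2, 0.1, 2.2, 1.8),
    padj           = c(0.01, 0.20, 0.03, 0.50, 0.04, 0.90, 0.60, NA),
    row.names = genes),
  t4h = data.frame(
    log2FoldChange = c(1.5, 2.5, 1.1, -2.0, -0.3, 0.2, 0.4, 1.6),
    padj           = c(0.02, 0.01, 0.30, 0.04, 0.70, 0.80, 0.01, 0.02),
    row.names = genes),
  t12h = data.frame(
    log2FoldChange = c(1.1, 2.8, 1.9, -2.1, -0.2, 0.3, -1.3, 1.0),
    padj           = c(0.04, 0.03, 0.01, 0.20, 0.60, 0.70, 0.04, 0.50),
    row.names = genes)
)

# Hypothetical helper: genes passing both cutoffs (NA padj is treated as a fail).
pick_deg <- function(res, alpha = 0.05, lfc = 1) {
  keep <- !is.na(res$padj) & res$padj < alpha & abs(res$log2FoldChange) >= lfc
  rownames(res)[keep]
}

# Union of the per-comparison DEG lists.
deg_union <- unique(unlist(lapply(res_list, pick_deg)))

# One row per gene in the union, one column per comparison.
lfc_mat <- sapply(res_list, function(res) res[deg_union, "log2FoldChange"])
rownames(lfc_mat) <- deg_union

# Hierarchical clustering on Euclidean distance, cut into two clusters.
hc <- hclust(dist(lfc_mat), method = "average")
clusters <- cutree(hc, k = 2)
print(split(names(clusters), clusters))
```

Each element of the printed list is a gene set that could be passed to a GO enrichment tool separately.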

For question 3: imbalanced replicate numbers will affect the statistical inferences from your sample data, which will of course indirectly influence the clustering. For one, in an unbalanced dataset, the same p- or adjusted p-value used as cut-off for statistical significance will have a different 'meaning' between one balanced comparison and another imbalanced comparison.
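
To make the point about replicate numbers concrete: under a simplified equal-variance model (an assumption for illustration only; DESeq2 itself fits a negative binomial GLM), the standard error of a difference between two group means scales with sqrt(1/n1 + 1/n2). The sketch below evaluates that factor for the replicate range mentioned in the question; `se_factor` is a hypothetical helper name.

```r
# Relative standard error of a two-group difference in means,
# assuming the same per-sample variance in both groups.
se_factor <- function(n1, n2) sqrt(1 / n1 + 1 / n2)

# Replicate counts spanning the 3-to-8 range from the question.
designs <- data.frame(n_treat = c(3, 3, 8), n_control = c(3, 8, 8))
designs$se_factor <- se_factor(designs$n_treat, designs$n_control)
print(round(designs, 3))
```

A comparison with 8 vs 8 replicates has a smaller standard error than one with 3 vs 3, so the same true fold change is more likely to pass a fixed adjusted p-value cutoff at the better-replicated time points, and the combined DEG list will tend to be weighted towards those comparisons.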

Now, question 1: I cannot be entirely sure. I think that contrasting the two interaction terms against each other, something like:

results(ddsTCpus14, contrast=list("TreatmentTreat.Time12h", "TreatmentTreat.Time4h"), test="Wald", alpha=0.05)

should work.
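
To check what a contrast on the interaction terms actually measures, it can help to write the design out in base R. The sketch below uses made-up log2 cell means for a single hypothetical gene (not data from this thread) and the same `~ Treatment + Time + Treatment:Time` design, with Control and 1h as reference levels. Note that `model.matrix` names the columns `TreatmentTreat:Time4h` rather than `TreatmentTreat.Time4h`, but the coefficients play the same roles as the entries in `resultsNames`.

```r
# Made-up log2 mean expression for one hypothetical gene in each cell.
cells <- expand.grid(
  Treatment = factor(c("Control", "Treat"), levels = c("Control", "Treat")),
  Time      = factor(c("1h", "12h", "4h"), levels = c("1h", "12h", "4h"))
)
cells$mu <- c(5.0, 5.5,   # 1h:  Control, Treat
              5.2, 7.0,   # 12h: Control, Treat
              5.1, 6.1)   # 4h:  Control, Treat

# Same design formula as in the question; six cells, six coefficients.
X <- model.matrix(~ Treatment + Time + Treatment:Time, data = cells)
beta <- setNames(as.numeric(solve(X, cells$mu)), colnames(X))

# Treat vs Control at each time point = main effect (+ interaction term).
effect_1h  <- beta[["TreatmentTreat"]]
effect_4h  <- beta[["TreatmentTreat"]] + beta[["TreatmentTreat:Time4h"]]
effect_12h <- beta[["TreatmentTreat"]] + beta[["TreatmentTreat:Time12h"]]

# Difference of the two interaction terms = change in treatment effect, 12h vs 4h.
inter_12_vs_4 <- beta[["TreatmentTreat:Time12h"]] - beta[["TreatmentTreat:Time4h"]]

print(round(c(effect_1h = effect_1h, effect_4h = effect_4h,
              effect_12h = effect_12h, inter_12_vs_4 = inter_12_vs_4), 3))
```

In this toy example the difference of the interaction coefficients equals `effect_12h - effect_4h`, i.e. how much the treatment effect changes between 4h and 12h. In DESeq2 terms, that corresponds to putting `TreatmentTreat.Time12h` in the numerator and `TreatmentTreat.Time4h` in the denominator of a `contrast` list.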

written 13 months ago by Kevin Blighe

Thanks a lot for your reply. Really helpful!

written 13 months ago by Cecelia