Question: Cluster strategy for a time-series analysis in DESeq2
2.7 years ago, Cecelia wrote:


I would like to do a time-series analysis in DESeq2. I have the following experimental design:

Treatment: Infected and Non-infected (Treat vs. Control)
Time: 1, 4, 12 hours

For each Treatment-Time point combination, we have imbalanced numbers of biological replicates (from 3 to 8).

I have already quantified the transcripts using Salmon, and now I would like some advice on the DESeq2 analysis. I followed the tutorial from here: to build a full design:

ddsTCpus14 <- DESeqDataSetFromTximport(txifilesevipus14, colData=colDatapus,
                              design = ~ Treatment + Time +  Treatment:Time)

and with a reduced design:

ddsTCpus14 <- DESeq(ddsTCpus14, test="LRT", reduced = ~ Time + Treatment)

And the resultsNames are:

> resultsNames(ddsTCpus14)
[1] "Intercept"                  "Treatment_Treat_vs_Control"
[3] "Time_12h_vs_1h"             "Time_4h_vs_1h"             
[5] "TreatmentTreat.Time12h"     "TreatmentTreat.Time4h"

So my questions are:

Question 1

I am thinking of making lists of DE genes for my comparisons, first comparing Treat vs. Control at each time point:

#1hour treat vs control

pusres1 <- results(ddsTCpus14, name="Treatment_Treat_vs_Control", test="Wald", alpha=0.05)

#4hour treat vs control

pusres4 <- results(ddsTCpus14, contrast=list(c("Treatment_Treat_vs_Control","TreatmentTreat.Time4h" )), test="Wald", alpha=0.05)

#12hour treat vs control

pusres12 <- results(ddsTCpus14, contrast=list(c("Treatment_Treat_vs_Control","TreatmentTreat.Time12h" )), test="Wald", alpha=0.05)

Then comparing the treat:time interaction term between different time points:

#1h to 4h

pusresinter14 <- results(ddsTCpus14, name="TreatmentTreat.Time4h", test="Wald", alpha=0.05)

#1h to 12h
pusresinter112 <- results(ddsTCpus14, name="TreatmentTreat.Time12h", test="Wald", alpha=0.05)

My question is: Is there a way to test the interaction term between 4h and 12h? I read through this post but still could not figure it out.

Question 2

Assuming I have the DEG lists for all the comparisons, does it make sense to first filter each DEG list by adjusted p-value and fold change and then combine all the lists into one big DEG list? I would then do a clustering (using hclust or another clustering approach) based on the big list. All the downstream GO enrichment tests would be based on the clusters.

Question 3

I am not sure whether the imbalanced numbers of replicates would influence the clustering. Are there any suggestions?

Some of these ideas probably make no sense to you specialists, but any suggestion would be very much appreciated!

Thanks in advance! Cecelia

Tags: timecourse, rna-seq, deseq2, hclust
2.7 years ago, Kevin Blighe (Republic of Ireland) wrote:

For question 2: there is no problem in doing that - that is a standard procedure.
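As a rough illustration of the filter, combine, and cluster procedure from question 2, here is a minimal base R sketch. Everything in it is a stand-in: `expr` is simulated noise playing the role of a normalised (for example variance-stabilised) expression matrix, `deg_lists` holds made-up gene names in place of real filtered DEG lists, and the correlation distance, average linkage, and `k = 4` are arbitrary choices rather than recommendations.

```r
set.seed(1)

# Hypothetical stand-in for a normalised expression matrix
# (rows = genes, columns = samples); real data would come from the analysis.
expr <- matrix(rnorm(60 * 12), nrow = 60,
               dimnames = list(paste0("gene", 1:60), paste0("sample", 1:12)))

# Hypothetical per-comparison DEG lists, assumed to be already filtered
# by adjusted p-value and fold change.
deg_lists <- list(
  treat_vs_ctrl_1h  = paste0("gene", 1:20),
  treat_vs_ctrl_4h  = paste0("gene", 11:35),
  treat_vs_ctrl_12h = paste0("gene", 30:50)
)

# Combine all lists into one non-redundant "big list".
deg_union <- Reduce(union, deg_lists)

# Row-wise z-scores so that clustering reflects the expression pattern
# across samples rather than overall abundance.
z <- t(scale(t(expr[deg_union, ])))

# Correlation-based distance between genes, then hierarchical clustering.
d  <- as.dist(1 - cor(t(z)))
hc <- hclust(d, method = "average")

# Cut the tree into an arbitrary number of clusters (k = 4 is an assumption).
clusters <- cutree(hc, k = 4)
table(clusters)
```

The named vector `clusters` maps each gene to a cluster, and each cluster's gene set could then be passed to an enrichment tool separately.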

For question 3: imbalanced replicate numbers will affect the statistical inferences from your sample data, which will of course indirectly influence the clustering. For one, in an unbalanced dataset, the same p- or adjusted p-value used as cut-off for statistical significance will have a different 'meaning' between one balanced comparison and another imbalanced comparison.
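One way to read the point about imbalance is through a deliberately simplified two-group calculation. This is not how DESeq2 computes its statistics (it fits a negative binomial GLM); the sketch assumes a plain difference of means with a known common standard deviation, and the helper `se_diff` and the effect size of 1 are hypothetical. It only shows that the same underlying effect gives a different standard error, and hence a different p-value, depending on the group sizes.

```r
# Standard error of a difference in means for group sizes n1 and n2,
# assuming a common standard deviation sigma (simplifying assumption).
se_diff <- function(n1, n2, sigma = 1) sigma * sqrt(1 / n1 + 1 / n2)

# Three replicate layouts within the 3-to-8 range mentioned in the question.
layouts <- list(c(3, 3), c(3, 8), c(8, 8))
se <- sapply(layouts, function(n) se_diff(n[1], n[2]))
names(se) <- c("3 vs 3", "3 vs 8", "8 vs 8")

# Same hypothetical true effect (1 unit) in every comparison.
z_stat <- 1 / se
p_val  <- 2 * pnorm(-abs(z_stat))

round(rbind(se = se, z = z_stat, p = p_val), 3)
```

Under these assumptions the comparison with the fewest replicates has the largest standard error, so a fixed p-value cut-off selects genes with different effect sizes in different comparisons.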

Now, question 1: I cannot be entirely sure
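For what it is worth, the 4h versus 12h interaction in question 1 can be reasoned about as the difference between the two interaction coefficients. In DESeq2 this would typically be requested by passing them as numerator and denominator of a list contrast, for example `results(ddsTCpus14, contrast=list("TreatmentTreat.Time12h", "TreatmentTreat.Time4h"), test="Wald")`, a pattern covered in the interactions section of the DESeq2 vignette. The base R sketch below checks the underlying algebra on a toy design matrix; the coefficient values in `beta` are invented, and base R names the interaction columns with `:` rather than `.`.

```r
# Toy design mirroring the question: 2 treatments x 3 time points,
# with Control and 1h as reference levels.
design_df <- expand.grid(
  Treatment = factor(c("Control", "Treat"), levels = c("Control", "Treat")),
  Time      = factor(c("1h", "4h", "12h"), levels = c("1h", "4h", "12h"))
)
X <- model.matrix(~ Treatment + Time + Treatment:Time, data = design_df)

# Invented coefficients (log2 scale), one per column of X.
beta <- setNames(c(5, 1, 0.5, -0.3, 2, 0.7), colnames(X))

# Fitted value for each Treatment x Time cell.
mu <- drop(X %*% beta)

# Treat vs. Control effect at each time point, computed from the cell values.
is_treat <- design_df$Treatment == "Treat"
effect <- mu[is_treat] - mu[!is_treat]
names(effect) <- as.character(design_df$Time[is_treat])

# The change in treatment effect from 4h to 12h ...
from_cells <- effect[["12h"]] - effect[["4h"]]
# ... equals the difference of the two interaction coefficients.
from_coefs <- beta[["TreatmentTreat:Time12h"]] - beta[["TreatmentTreat:Time4h"]]

all.equal(from_cells, from_coefs)
```

The same construction also shows why the per-time-point comparisons in the question add the main treatment coefficient to the matching interaction coefficient: `effect[["4h"]]` equals `beta[["TreatmentTreat"]] + beta[["TreatmentTreat:Time4h"]]`.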


Thanks a lot for your reply. Really helpful!

Reply written 2.7 years ago by Cecelia