Hello,
I would like to do a time-series analysis in DESeq2. I have the following experimental design:
- Treatment: Infected and Non-infected (Treat vs. Control)
- Time: 1, 4, 12 hours
- For each Treatment-Time point combination, we have an unbalanced number of biological replicates (from 3 to 8).
I already did the transcript quantification using Salmon. And now I would like to have some advice on DESeq2 analysis. I followed the tutorial from here: http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments to build a complete design:
ddsTCpus14 <- DESeqDataSetFromTximport(txifilesevipus14, colData=colDatapus,
design = ~ Treatment + Time + Treatment:Time)
and with a reduced design:
ddsTCpus14 <- DESeq(ddsTCpus14, test="LRT", reduced = ~ Time + Treatment)
And the results names are:
resultsNames(ddsTCpus14)
[1] "Intercept" "Treatment_Treat_vs_Control"
[3] "Time_12h_vs_1h" "Time_4h_vs_1h"
[5] "TreatmentTreat.Time12h" "TreatmentTreat.Time4h"
So my questions are:
Question 1
I am thinking of making the lists of DE genes for my comparisons. First comparing between Treat vs. Control in each time point:
#1hour treat vs control
pusres1 <- results(ddsTCpus14, name="Treatment_Treat_vs_Control", test="Wald", alpha=0.05)
#4hour treat vs control
pusres4 <- results(ddsTCpus14, contrast=list(c("Treatment_Treat_vs_Control","TreatmentTreat.Time4h" )), test="Wald", alpha=0.05)
#12hour treat vs control
pusres12 <- results(ddsTCpus14, contrast=list(c("Treatment_Treat_vs_Control","TreatmentTreat.Time12h" )), test="Wald", alpha=0.05)
Then comparing the treat:time interaction term between different time points:
#1h to 4h
pusresinter14 <- results(ddsTCpus14, name="TreatmentTreat.Time4h", test="Wald", alpha=0.05)
#1h to 12h
pusresinter112 <- results(ddsTCpus14, name="TreatmentTreat.Time12h", test="Wald", alpha=0.05)
My question is: Is there a way to test the interaction term between 4h and 12h? I read through this post but still could not figure it out.
https://support.bioconductor.org/p/65676/
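My current understanding, which may be wrong, is that the 4h-to-12h interaction would be the difference between the two interaction coefficients. Below is a small base R sketch with made-up numbers (not my data) that checks this arithmetic on a toy linear model. The DESeq2 call in the comment at the end is what I assume the equivalent would be, using the list form of `contrast` (numerator, denominator); the object name `pusresinter412` is just a placeholder.

```r
# Toy illustration with invented log2 values, one value per
# Treatment x Time cell, so the model fits the cell means exactly.
toy <- data.frame(
  Treatment = factor(rep(c("Control", "Treat"), each = 3),
                     levels = c("Control", "Treat")),
  Time = factor(rep(c("1h", "4h", "12h"), times = 2),
                levels = c("1h", "4h", "12h")),
  y = c(5, 6, 7,      # Control at 1h, 4h, 12h
        5.5, 8, 7.5)  # Treat   at 1h, 4h, 12h
)

fit <- lm(y ~ Treatment + Time + Treatment:Time, data = toy)
b <- coef(fit)

# Treat-vs-Control effect at each time point, taken directly from the cells.
effect_4h  <- toy$y[5] - toy$y[2]
effect_12h <- toy$y[6] - toy$y[3]

# Difference of the two interaction coefficients.
interaction_diff <- unname(b["TreatmentTreat:Time12h"] -
                           b["TreatmentTreat:Time4h"])

# Both quantities describe how the treatment effect changes from 4h to 12h.
print(c(from_cells = effect_12h - effect_4h, from_coefs = interaction_diff))

# Assumed DESeq2 equivalent (not run here; coefficient names taken from the
# resultsNames() output above):
# pusresinter412 <- results(ddsTCpus14,
#                           contrast = list("TreatmentTreat.Time12h",
#                                           "TreatmentTreat.Time4h"),
#                           test = "Wald", alpha = 0.05)
```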
Question 2
Assume I have the DEG lists for all the comparisons. Does it make sense to first filter each DEG list by adjusted P-value and fold change, and then combine all the lists into one big DEG list? I would then do a clustering (using hclust or another clustering approach) based on the big list. All the downstream GO enrichment tests would be based on the clusters.
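To make the idea concrete, here is a rough sketch of what I have in mind, written in base R on random toy data. The gene lists, the matrix `expr`, and the choice of 3 clusters are all invented for illustration; with real data I assume the matrix would hold transformed counts (for example from a variance-stabilizing transformation) rather than random numbers.

```r
set.seed(1)

# Toy stand-ins: a genes x samples matrix of transformed expression values
# and three already-filtered DEG lists (hypothetical names).
expr <- matrix(rnorm(20 * 6), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("sample", 1:6)))
deg_1h  <- c("gene1", "gene2", "gene3")
deg_4h  <- c("gene2", "gene4", "gene5")
deg_12h <- c("gene3", "gene5", "gene6", "gene7")

# Union of all lists, so that each gene appears only once.
deg_all <- Reduce(union, list(deg_1h, deg_4h, deg_12h))

# Scale each gene (row) to z-scores so the clustering reflects the pattern
# across samples rather than the absolute expression level.
z <- t(scale(t(expr[deg_all, ])))

# Hierarchical clustering of genes on 1 - Pearson correlation.
hc <- hclust(as.dist(1 - cor(t(z))), method = "average")

# Cut the tree into an arbitrary number of clusters (3 here).
clusters <- cutree(hc, k = 3)
print(split(names(clusters), clusters))
```

Each element of the printed list would then be one gene set for the GO enrichment test.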
Question 3
I am not sure whether the unbalanced numbers of replicates would influence the clustering. Are there any suggestions?
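One workaround I have considered, without knowing whether it is appropriate, is to average the replicates within each Treatment-Time group before clustering, so that groups with more replicates do not contribute more columns. A minimal base R sketch on toy data follows; the group labels, sizes, and matrix are invented.

```r
set.seed(2)

# Toy groups with unequal replicate numbers (3, 8 and 5 samples).
group <- factor(c(rep("Control_1h", 3), rep("Treat_1h", 8),
                  rep("Control_4h", 5)))

# Toy genes x samples matrix of transformed expression values.
expr <- matrix(rnorm(10 * length(group)), nrow = 10,
               dimnames = list(paste0("gene", 1:10),
                               paste0("s", seq_along(group))))

# Average the columns belonging to each group, giving one column per group
# regardless of how many replicates that group had.
group_means <- sapply(split(seq_along(group), group),
                      function(idx) rowMeans(expr[, idx, drop = FALSE]))

print(dim(group_means))
```

The clustering would then be run on `group_means` instead of on the per-sample matrix.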
Some of the ideas probably make no sense for you specialists, but any suggestion would be very much appreciated!
Thanks in advance!
Cecelia
Thanks a lot for your reply. Really helpful!