I am analyzing several clinical experiments in DESeq2.
Half of the samples are controls (ctrl), nine samples are treated with the combined drug (combined), and two are treated with a single drug (single).
Sampleinfo
     X treatment     drug
1  10A      ctrl     none
2  10B   therapy combined
3  12A      ctrl     none
4  12B   therapy combined
5  13A      ctrl     none
6  13B   therapy combined
7  16A      ctrl     none
8  16B   therapy combined
9  19A      ctrl     none
10 19B   therapy combined
11 24A      ctrl     none
12 24B   therapy   single
13  2A      ctrl     none
14  2B   therapy   single
15 34A      ctrl     none
16 34B   therapy combined
17  6A      ctrl     none
18  6B   therapy combined
19  7A      ctrl     none
20  7B   therapy combined
21  9A      ctrl     none
22  9B   therapy combined
all(Sampleinfo$Sample == colnames(count_data))
[1] TRUE
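One thing worth double-checking about the check above: the printed table shows the sample IDs in a column named X, not Sample. If the data frame really has no Sample column (an assumption based only on the printout), then Sampleinfo$Sample is NULL and all() returns TRUE without comparing anything. A minimal sketch in base R, using a small made-up table (info and count_cols are hypothetical stand-ins, not the real data):

```r
# Hypothetical two-sample table with the IDs in column X, as in the printout.
info <- data.frame(X = c("10A", "10B"),
                   treatment = c("ctrl", "therapy"),
                   stringsAsFactors = FALSE)

# Hypothetical count-matrix column names, deliberately in the wrong order.
count_cols <- c("10B", "10A")

# info$Sample does not exist, so it is NULL; NULL == x is logical(0),
# and all(logical(0)) is TRUE. The check passes without testing anything.
vacuous_check <- all(info$Sample == count_cols)

# A stricter check on the column that actually holds the IDs.
strict_check <- identical(as.character(info$X), count_cols)
```

Here vacuous_check is TRUE even though the order is wrong, while strict_check is FALSE. Wrapping the strict comparison in stopifnot() would stop the script on a mismatch.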
I want to analyze differential expression for:
A) the difference between untreated (ctrl) and combined
B) the difference between untreated (ctrl) and single
C) the difference between combined and single treatment, with ctrl as the untreated baseline
I am mostly interested in question C, as A and B have already been described.
I have found some examples of complex designs, but it is unclear to me how to set up my sample info table so that I can fit them. Any input appreciated.
Maybe it has to do with factor levels??
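One possible way to lay out the sample table for these three comparisons is to collapse treatment and drug into a single three-level factor with ctrl as the reference. The sketch below rebuilds the table from the question in base R and adds such a column. The column name condition and the helper vectors ids and single_ids are assumptions made for illustration, not something DESeq2 requires.

```r
# Sample numbers in the order they appear in the table; each has an A and a B sample.
ids <- c(10, 12, 13, 16, 19, 24, 2, 34, 6, 7, 9)
single_ids <- c(24, 2)  # samples whose B sample received the single drug

Sampleinfo <- data.frame(
  X = paste0(rep(ids, each = 2), c("A", "B")),
  treatment = rep(c("ctrl", "therapy"), times = length(ids)),
  stringsAsFactors = FALSE
)
Sampleinfo$drug <- ifelse(
  Sampleinfo$treatment == "ctrl", "none",
  ifelse(rep(ids, each = 2) %in% single_ids, "single", "combined")
)

# Hypothetical column: one factor with explicit level order, ctrl first.
Sampleinfo$condition <- factor(
  ifelse(Sampleinfo$drug == "none", "ctrl", Sampleinfo$drug),
  levels = c("ctrl", "combined", "single")
)

# Group sizes and the coefficients a design of ~ condition would estimate.
group_counts <- table(Sampleinfo$condition)
design_cols <- colnames(model.matrix(~ condition, data = Sampleinfo))
```

With ctrl as the first level, the two non-intercept columns correspond to comparisons A and B, and comparison C is the difference between them. If the A and B samples come from the same patient (an assumption, the question does not say so), a patient factor could be added to the design as well.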
Note on factor levels
By default, R will choose a reference level for factors based on alphabetical order. Then, if you never tell the DESeq2 functions which level you want to compare against (e.g. which level represents the control group), the comparisons will be based on the alphabetical order of the levels. There are two solutions: you can either explicitly tell results which comparison to make using the contrast argument (this will be shown later), or you can explicitly set the factor levels. In order to see the change of reference levels reflected in the results names, you need to either run DESeq or nbinomWaldTest/nbinomLRT after the re-leveling operation.
Setting the factor levels can be done in two ways, either using factor:
dds$condition <- factor(dds$condition, levels = c("untreated","treated"))
…or using relevel, just specifying the reference level:
dds$condition <- relevel(dds$condition, ref = "untreated")
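To see why this matters for labels like the ones in the question, here is a small base R sketch. The vector condition is a made-up example, not the real data. With the labels ctrl, combined and single, alphabetical order would make combined the reference level, not ctrl.

```r
# Hypothetical labels mirroring the three groups in the question.
condition <- factor(c("ctrl", "combined", "single", "ctrl"))

# Default level order is alphabetical, so "combined" comes first.
default_levels <- levels(condition)

# relevel() moves the chosen reference to the front and keeps the rest in order.
releveled <- relevel(condition, ref = "ctrl")
releveled_levels <- levels(releveled)
```

Here default_levels starts with "combined", while releveled_levels starts with "ctrl".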
If you need to subset the columns of a DESeqDataSet, i.e., when removing certain samples from the analysis, it is possible that all the samples for one or more levels of a variable in the design formula would be removed. In this case, the droplevels function can be used to remove those levels which do not have samples in the current DESeqDataSet:
dds$condition <- droplevels(dds$condition)
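As a rough sketch of the contrast route mentioned in the note, the example below fits a three-level factor and pulls out the three comparisons from the question with the contrast argument of results. It assumes DESeq2 is installed and uses simulated counts from makeExampleDESeqDataSet, not the real data. The group labels, group sizes and the object names res_A, res_B and res_C are assumptions chosen to mirror the question.

```r
library(DESeq2)

set.seed(1)
# Simulated counts only: 500 genes, 22 samples to mirror the sample table.
dds <- makeExampleDESeqDataSet(n = 500, m = 22)

# Hypothetical grouping: 11 ctrl, 9 combined, 2 single, with ctrl as reference.
dds$condition <- factor(
  c(rep("ctrl", 11), rep("combined", 9), rep("single", 2)),
  levels = c("ctrl", "combined", "single")
)
design(dds) <- ~ condition

dds <- DESeq(dds, quiet = TRUE)
coef_names <- resultsNames(dds)

# contrast = c(factor name, numerator level, denominator level)
res_A <- results(dds, contrast = c("condition", "combined", "ctrl"))
res_B <- results(dds, contrast = c("condition", "single", "ctrl"))
res_C <- results(dds, contrast = c("condition", "single", "combined"))
```

In this setup res_C gives log2 fold changes of single over combined, so no re-leveling is needed for comparison C. With only two single-drug samples, that comparison is likely to have limited power.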