I have the following data: a). 3 strains (wild type & two mutants); b). 2 conditions: untreated & treated; c). 3 time points (min): 0, 15, 90. Each time point , except for 0 time point (base line control), has untreated & treated samples. There two replicates for each condition+time point.
strain treatment time replicate id
S1.1 SC5314-wt untr 0 r1 SC5314-wt_untr_0_r1
S2.6 SC5314-wt untr 0 r2 SC5314-wt_untr_0_r2
S1.2 SC5314-wt untr 15 r1 SC5314-wt_untr_15_r1
S2.7 SC5314-wt untr 15 r2 SC5314-wt_untr_15_r2
S1.3 SC5314-wt trt 15 r1 SC5314-wt_trt_15_r1
S2.8 SC5314-wt trt 15 r2 SC5314-wt_trt_15_r2
S1.4 SC5314-wt untr 90 r1 SC5314-wt_untr_90_r1
S2.9 SC5314-wt untr 90 r2 SC5314-wt_untr_90_r2
S1.5 SC5314-wt trt 90 r1 SC5314-wt_trt_90_r1
S2.10 SC5314-wt trt 90 r2 SC5314-wt_trt_90_r2
IR1.11 irs4-mut untr 0 r1 irs4-mut_untr_0_r1
IR2.16 irs4-mut untr 0 r2 irs4-mut_untr_0_r2
IR1.12 irs4-mut untr 15 r1 irs4-mut_untr_15_r1
IR2.17 irs4-mut untr 15 r2 irs4-mut_untr_15_r2
IR1.13 irs4-mut trt 15 r1 irs4-mut_trt_15_r1
IR2.18 irs4-mut trt 15 r2 irs4-mut_trt_15_r2
IR1.14 irs4-mut untr 90 r1 irs4-mut_untr_90_r1
IR2.19 irs4-mut untr 90 r2 irs4-mut_untr_90_r2
IR1.15 irs4-mut trt 90 r1 irs4-mut_trt_90_r1
IR2.20 irs4-mut trt 90 r2 irs4-mut_trt_90_r2
IN1.21 inp51-mut untr 0 r1 inp51-mut_untr_0_r1
IN2.26 inp51-mut untr 0 r2 inp51-mut_untr_0_r2
IN1.22 inp51-mut untr 15 r1 inp51-mut_untr_15_r1
IN2.27 inp51-mut untr 15 r2 inp51-mut_untr_15_r2
IN1.23 inp51-mut trt 15 r1 inp51-mut_trt_15_r1
IN2.28 inp51-mut trt 15 r2 inp51-mut_trt_15_r2
IN1.24 inp51-mut untr 90 r1 inp51-mut_untr_90_r1
IN2.29 inp51-mut untr 90 r2 inp51-mut_untr_90_r2
IN1.25 inp51-mut trt 90 r1 inp51-mut_trt_90_r1
IN2.30 inp51-mut trt 90 r2 inp51-mut_trt_90_r2
Did the following: with DESeq2_1.6.3
> sampledata$time <- factor(sampledata$time,levels = c("0", "15", "90"))
> sampledata$sample <- factor (paste0(sampledata$strain,"_", sampledata$treatment))
> sampledata$sample
[1] SC5314-wt_untr SC5314-wt_untr SC5314-wt_untr SC5314-wt_untr SC5314-wt_trt SC5314-wt_trt SC5314-wt_untr SC5314-wt_untr SC5314-wt_trt SC5314-wt_trt irs4-mut_untr irs4-mut_untr irs4-mut_untr
[14] irs4-mut_untr irs4-mut_trt irs4-mut_trt irs4-mut_untr irs4-mut_untr irs4-mut_trt irs4-mut_trt inp51-mut_untr inp51-mut_untr inp51-mut_untr inp51-mut_untr inp51-mut_trt inp51-mut_trt
[27] inp51-mut_untr inp51-mut_untr inp51-mut_trt inp51-mut_trt
Levels: inp51-mut_trt inp51-mut_untr irs4-mut_trt irs4-mut_untr SC5314-wt_trt SC5314-wt_untr
> sampledata$time
[1] 0 0 15 15 15 15 90 90 90 90 0 0 15 15 15 15 90 90 90 90 0 0 15 15 15 15 90 90 90 90
Levels: 0 15 90
#failed with the design:
> ddsCATS <- DESeqDataSetFromMatrix(countData = countsdata, colData = sampledata,design = ~ sample + time + sample:time)
Error in DESeqDataSet(se, design = design, ignoreRank) :
the model matrix is not full rank, so the model cannot be fit as specified.
one or more variables or interaction terms in the design formula
are linear combinations of the others and must be removed
# this worked
> ddsCATS <- DESeqDataSetFromMatrix(countData = countsdata, colData = sampledata,design = ~ sample + time)
#define factor level references
> ddsCATS$sample <- relevel (ddsCATS$sample, "SC5314-wt_untr")
> ddsCATS$time <- relevel (ddsCATS$time, "0")
#LRT test
> ddsCATS <- DESeq(ddsCATS, test="LRT", reduced = ~ time)
> resddsCATS <- results(ddsCATS)
> resultsNames(ddsCATS)
[1] "Intercept" "sampleinp51.mut_untr" "sampleirs4.mut_trt" "sampleirs4.mut_untr" "sampleSC5314.wt_trt" "sampleSC5314.wt_untr" "time_15_vs_0" "time_90_vs_0"
> resddsCATS <- results(ddsCATS)
> resddsCATS
log2 fold change (MLE): time 90 vs 0
LRT p-value: '~ sample + time' vs '~ time'
DataFrame with 8051 rows and 6 columns
baseMean log2FoldChange lfcSE stat pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
C2_00990W_A 0.02556151 0.88717584 4.894719 0.3568103 0.9964362 NA
C2_00990W_B 0.02556151 0.88717584 4.894719 0.3568103 0.9964362 NA
C2_02640C_B 0.02976336 0.02591547 4.919825 0.4767291 0.9929506 NA
C3_04310C_A 0.03574418 -0.61652583 4.918296 0.5500559 0.9901756 NA
C3_06300W_B 0.03094594 0.13771721 4.919847 0.4435255 0.9940455 NA
... ... ... ... ... ... ...
C2_01000W_B 23247.92 -0.4226896 0.1487500 2.368988 0.7960833 0.9354503
C2_03270W_A 32929.48 -0.4830989 0.2028736 2.764650 0.7362141 0.9142012
C2_03270W_B 37671.48 -0.4933234 0.1991737 5.960478 0.3100836 0.6393959
C5_05050W_A 60685.44 -0.4559260 0.1654281 3.772222 0.5826539 0.8407806
C5_05050W_B 80734.06 -0.4962704 0.1552556 4.574513 0.4699761 0.7663622
Questions: 1). I would like to compare the effect of each condition at each time point both within and between three strains. what should be the full and reduced design terms for LRT test? 2). how do I setup contrasts to compare conditions at different time points both within and across the strains? 3). Lastly, what happened to sampleinp51.mut_trt as resultsNames(ddsCATS) does not list it?
Thanks.