Hi community,
Our RNA-seq experimental design is as follows:
  conditions batch rep samples
1 ctl        1     1   control_24_rep1
2 ctl        2     2   control_24_rep5
3 ctl        2     3   control_24_rep6
4 trt        1     1   treat_24_rep1
5 trt        2     2   treat_24_rep5
6 trt        2     3   treat_24_rep6
The MDplot picks up the batch effect: https://image.ibb.co/dzhGna/MDplot.png
We are interested in comparing treatment vs. control,
so I first set up the model design like this:
model.matrix( ~ batch + conditions)
This model accounts for only 40% of the variation (https://ibb.co/mtGsSa) and detects about 11 significant genes:
# contrast = c(factor, numerator, denominator): this reports ctl relative to trt
res <- results(dds, contrast = c('conditions', 'ctl', 'trt'))
summary(res)
out of 3595 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 8, 0.22%
LFC < 0 (down) : 3, 0.083%
outliers [1] : 0, 0%
low counts [2] : 1255, 35%
Also, after reading some posts, it seems batch is confounded with rep, so I tried the following:
model.matrix(~ rep + conditions)
res <- results(dds, contrast = c('conditions', 'ctl', 'trt'))
summary(res)
out of 3595 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 35, 0.97%
LFC < 0 (down) : 10, 0.28%
outliers [1] : 0, 0%
low counts [2] : 1533, 43%
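The confounding of batch with rep can be checked directly from the sample table. The sketch below uses only base R and rebuilds the sample sheet by hand from the table at the top of this post; `design_rank` is a hypothetical helper name introduced here for illustration, not a DESeq2 function. It suggests that batch is nested within rep (rep 1 is batch 1, reps 2 and 3 are batch 2), so a design containing both terms would not be full rank.

```r
# Sample sheet copied by hand from the table at the top of the post
coldata <- data.frame(
  conditions = factor(c("ctl", "ctl", "ctl", "trt", "trt", "trt")),
  batch      = factor(c(1, 2, 2, 1, 2, 2)),
  rep        = factor(c(1, 2, 3, 1, 2, 3))
)

# Cross-tabulate rep against batch: each rep falls in exactly one batch
print(table(rep = coldata$rep, batch = coldata$batch))

# Hypothetical helper: number of columns and rank of a design matrix
design_rank <- function(formula, data) {
  mm <- model.matrix(formula, data)
  c(columns = ncol(mm), rank = qr(mm)$rank)
}

print(design_rank(~ batch + conditions, coldata))        # 3 columns, rank 3
print(design_rank(~ rep + conditions, coldata))          # 4 columns, rank 4
print(design_rank(~ batch + rep + conditions, coldata))  # 5 columns, rank 4
```

Because rep already distinguishes the batches, ~ rep + conditions absorbs the batch effect as well. The cost is that, with six samples and four coefficients, only two residual degrees of freedom remain, compared with three for ~ batch + conditions.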
My question: is design = ~ rep + conditions correct in this setting? Should I also include interaction terms? Or is there a better way to design the model?
Looking forward to your comments and help.
Thank you,
Xiaoping
Is rep the designation for the biological replicates, and are they supposed to be paired between treatments? Your second design pairs samples together. If that's appropriate, then that's why you have more power. If it's not appropriate, then you should accept the lower number of DE genes, since that's more likely to be correct.

Thanks, Ryan, for the reply. Yes, they are biological replicates. I am not following your second question about rep being paired between treatments.
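
To make the pairing question concrete: "paired" here would mean that rep 1 of control and rep 1 of treatment share a source (for example the same culture, animal, or processing day), and likewise for reps 2 and 3. A design of ~ rep + conditions then estimates the treatment effect from the within-rep differences. The sketch below is an illustration only. The expression values are made up for a single hypothetical gene, and an ordinary linear model (`lm`) stands in for the negative binomial GLM that DESeq2 actually fits. `trt_row` is a hypothetical helper name.

```r
# Toy log-scale expression values for ONE hypothetical gene (invented numbers).
# Each rep has its own baseline; treatment adds roughly 1 on top of it.
toy <- data.frame(
  conditions = factor(rep(c("ctl", "trt"), each = 3)),
  rep        = factor(rep(1:3, times = 2)),
  y          = c(5.0, 8.1, 8.9, 6.1, 9.0, 10.0)
)

# Unpaired: rep-to-rep baseline differences are treated as noise
fit_unpaired <- lm(y ~ conditions, data = toy)

# Paired: the rep term absorbs the baseline differences
fit_paired <- lm(y ~ rep + conditions, data = toy)

# Hypothetical helper: estimate and standard error of the treatment coefficient
trt_row <- function(fit) {
  coef(summary(fit))["conditionstrt", c("Estimate", "Std. Error")]
}

print(rbind(unpaired = trt_row(fit_unpaired), paired = trt_row(fit_paired)))
```

In this balanced toy example both fits give the same treatment estimate, but the paired fit has a much smaller standard error, which is where the extra power comes from. If the rep labels are arbitrary (rep 1 of control has no particular relationship to rep 1 of treatment), the rep term is modelling a structure that does not exist, and the additional significant genes should not be trusted.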