Hi community,

Our RNA-seq experimental design is as follows:

```
conditions batch rep samples
1 ctl 1 1 control_24_rep1
2 ctl 2 2 control_24_rep5
3 ctl 2 3 control_24_rep6
4 trt 1 1 treat_24_rep1
5 trt 2 2 treat_24_rep5
6 trt 2 3 treat_24_rep6
```

MDplot picks up the batch effect: https://image.ibb.co/dzhGna/MDplot.png

We are interested in comparing treatment vs. control.

So for the model design I tried this:

```
model.matrix( ~ batch + conditions)
```

This model accounts for only 40% of the variation (https://ibb.co/mtGsSa) and detects about 11 significant genes:

```
res <- results(dds, contrast = c('conditions', 'ctl', 'trt'))
summary(res)
out of 3595 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 8, 0.22%
LFC < 0 (down) : 3, 0.083%
outliers [1] : 0, 0%
low counts [2] : 1255, 35%
```
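For reference, here is a minimal sketch of how the `~ batch + conditions` design matrix could be built from the sample table above, using base R only. The object names `coldata` and `design_batch` are my own placeholders, not something from the original analysis.

```r
# Sample table copied from the post; object names are placeholders.
coldata <- data.frame(
  conditions = factor(c("ctl", "ctl", "ctl", "trt", "trt", "trt")),
  batch      = factor(c(1, 2, 2, 1, 2, 2)),
  rep        = factor(c(1, 2, 3, 1, 2, 3)),
  row.names  = c("control_24_rep1", "control_24_rep5", "control_24_rep6",
                 "treat_24_rep1", "treat_24_rep5", "treat_24_rep6")
)

# Additive design: one batch coefficient plus one treatment coefficient.
design_batch <- model.matrix(~ batch + conditions, data = coldata)
print(design_batch)

# The matrix is full rank if its rank equals its number of columns.
rank_batch     <- qr(design_batch)$rank
resid_df_batch <- nrow(design_batch) - rank_batch
cat("columns:", ncol(design_batch),
    "rank:", rank_batch,
    "residual df:", resid_df_batch, "\n")
```

With six samples and three coefficients, this design leaves three residual degrees of freedom for estimating variability.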

Also, after reading some posts, it seems that batch is confounded with rep, so I did the following:

```
model.matrix(~ rep + conditions)
res <- results(dds, contrast = c('conditions', 'ctl', 'trt'))
summary(res)
out of 3595 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 35, 0.97%
LFC < 0 (down) : 10, 0.28%
outliers [1] : 0, 0%
low counts [2] : 1533, 43%
```
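As a rough check of the "batch is confounded with rep" point, the sketch below (base R only, placeholder object names) shows that each `rep` level falls in exactly one batch, so a design containing both `batch` and `rep` is not full rank, while `~ rep + conditions` on its own is.

```r
# Sample table copied from the post; object names are placeholders.
coldata <- data.frame(
  conditions = factor(c("ctl", "ctl", "ctl", "trt", "trt", "trt")),
  batch      = factor(c(1, 2, 2, 1, 2, 2)),
  rep        = factor(c(1, 2, 3, 1, 2, 3))
)

# Each rep level appears in exactly one batch, i.e. batch is nested in rep.
nesting <- table(rep = coldata$rep, batch = coldata$batch)
print(nesting)

design_rep  <- model.matrix(~ rep + conditions, data = coldata)
design_both <- model.matrix(~ batch + rep + conditions, data = coldata)

rank_rep  <- qr(design_rep)$rank
rank_both <- qr(design_both)$rank

# ~ rep + conditions: rank equals the number of columns (full rank).
cat("rep design:  columns", ncol(design_rep), "rank", rank_rep, "\n")
# ~ batch + rep + conditions: the batch column is the sum of two rep columns,
# so the rank is lower than the number of columns.
cat("both design: columns", ncol(design_both), "rank", rank_both, "\n")
```

In other words, once `rep` is in the model, `batch` adds no further information for this layout, and `~ rep + conditions` uses one more coefficient than `~ batch + conditions` (two residual degrees of freedom instead of three).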

My question: is design = ~ rep + conditions correct in this setting? Should I also include interaction terms? Or is there a better way to design the model?

Looking forward to your comments and help.

Thank you,

Xiaoping

Is `rep` the designation for the biological replicates, and are they supposed to be paired between treatments? Your second design is pairing samples together. If that's appropriate, then that's why you have more power. If that's not appropriate, then you should accept the lower number of DE genes, since that's more likely to be correct.

Thanks Ryan for the reply. Yes, they are biological replicates. I am not following your second question about rep being paired between treatments.
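To illustrate what "paired" means here, the sketch below fits an ordinary linear model to made-up log2 expression values for one hypothetical gene (the numbers are invented for illustration and are not from the data in this thread). With `~ rep + conditions`, the treatment coefficient is the average of the within-`rep` differences (trt minus ctl), so control and treated samples sharing a `rep` label are treated as a matched pair. This only makes sense if, say, rep1 control and rep1 treated really belong together (same culture, same animal, same processing day).

```r
# Layout copied from the post; expression values below are MADE UP.
coldata <- data.frame(
  conditions = factor(c("ctl", "ctl", "ctl", "trt", "trt", "trt")),
  rep        = factor(c(1, 2, 3, 1, 2, 3))
)
y <- c(5.0, 7.0, 9.0, 5.5, 7.6, 9.4)  # hypothetical log2 expression, one gene

fit_unpaired <- lm(y ~ conditions, data = coldata)
fit_paired   <- lm(y ~ rep + conditions, data = coldata)

# Within-pair differences (trt minus ctl) for rep 1, 2, 3.
paired_diff <- y[coldata$conditions == "trt"] - y[coldata$conditions == "ctl"]

est_paired  <- unname(coef(fit_paired)["conditionstrt"])
se_unpaired <- summary(fit_unpaired)$coefficients["conditionstrt", "Std. Error"]
se_paired   <- summary(fit_paired)$coefficients["conditionstrt", "Std. Error"]

cat("mean paired difference:", mean(paired_diff), "\n")
cat("paired-model estimate: ", est_paired, "\n")
cat("SE unpaired:", se_unpaired, " SE paired:", se_paired, "\n")

# Adding the interaction uses one coefficient per sample,
# leaving no residual degrees of freedom.
design_int   <- model.matrix(~ rep * conditions, data = coldata)
resid_df_int <- nrow(design_int) - qr(design_int)$rank
cat("interaction design residual df:", resid_df_int, "\n")
```

In this toy example the differences between reps are large but the treatment shift within each pair is consistent, which is why the paired model has a much smaller standard error. The last lines also suggest why a `rep:conditions` interaction is not workable with this layout: it would need as many coefficients as there are samples.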