
I have set up a model that compares different conditions while adjusting for a potential batch effect (see the PCA plot here). To confirm the batch effect, I tried the SVA package.

I proceed as follows, and the resulting plot looks like this (see here). In the model, emt is the batch variable.

```
library(DESeq2)
library(sva)

dds <- DESeqDataSetFromMatrix(countData = countdata, colData = sampleTable, design = ~ emt + condition)
# Drop genes with almost no counts
dds <- dds[rowSums(counts(dds)) > 1, ]
dds <- DESeq(dds)
sizeFactors(dds)
# Normalized counts, keeping genes with a mean above 1
dat <- counts(dds, normalized = TRUE)
idx <- rowMeans(dat) > 1
dat <- dat[idx, ]
# Full model (batch + condition) and null model (batch only)
mod <- model.matrix(~ emt + condition, colData(dds))
mod0 <- model.matrix(~ emt, colData(dds))
# Estimate how many surrogate variables there are
n.sv <- num.sv(dat, mod, method = "leek")
print(n.sv)
# Fit 2 surrogate variables for the plot
svseq <- svaseq(dat, mod, mod0, n.sv = 2)
```
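For reference, here is a minimal sketch of how a plot like this could be drawn from the surrogate variables. It is not necessarily the exact code behind the linked figure. To keep it runnable on its own, `meta` and `sv` are toy stand-ins (the `batch_2015` label and the simulated values are assumptions), and `plot_svs` is a hypothetical helper; in the real analysis they would be replaced by `as.data.frame(colData(dds))` and `svseq$sv`.

```r
# Toy stand-ins so the sketch runs on its own (assumed values, not real data).
# In the real analysis: meta <- as.data.frame(colData(dds)); sv <- svseq$sv
set.seed(1)
meta <- data.frame(
  emt       = factor(rep(c("batch_2015", "other"), each = 6)),
  condition = factor(rep(c("Unt", "T1", "T6"), times = 4))
)
sv <- cbind(
  as.numeric(meta$emt) - 1.5 + rnorm(12, sd = 0.1),      # toy SV1, tracks batch
  as.numeric(meta$condition) - 2 + rnorm(12, sd = 0.1)   # toy SV2, tracks condition
)

# Hypothetical helper: one panel per (surrogate variable, covariate) pair
plot_svs <- function(sv, meta, covariates) {
  old <- par(mfrow = c(ncol(sv), length(covariates)), mar = c(3, 4, 3, 1))
  on.exit(par(old))
  for (i in seq_len(ncol(sv))) {
    for (v in covariates) {
      stripchart(sv[, i] ~ meta[[v]], vertical = TRUE, method = "jitter",
                 main = paste0("SV", i, " by ", v), ylab = paste0("SV", i))
      abline(h = 0, lty = 2)
    }
  }
  invisible(NULL)
}

plot_svs(sv, meta, c("emt", "condition"))
```

Each row of panels shows one surrogate variable, split once by batch and once by condition, so it is easy to see which known variable each SV follows.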

I'm not sure I understand what I see on the plot. To me, SV1 seems related to the batch effect (*_2015 vs. others), and SV2 seems related to the condition, my variable of interest. (For info: Mant and T6 are cells a long time after treatment, Unt and T0 are control-treated cells, and T1 are cells early after treatment, so yes, condition is time-related.)

Am I right?
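One way to put a number on this visual impression is to regress each surrogate variable on each known covariate and compare the R² values. The sketch below is only an illustration under assumptions: `meta` and `sv` are simulated stand-ins (including the `batch_2015` label), to be replaced by `as.data.frame(colData(dds))` and `svseq$sv` in the real analysis.

```r
# Toy stand-ins so the sketch runs on its own (assumed values, not real data).
set.seed(1)
meta <- data.frame(
  emt       = factor(rep(c("batch_2015", "other"), each = 6)),
  condition = factor(rep(c("Unt", "T1", "T6"), times = 4))
)
sv <- cbind(
  as.numeric(meta$emt) - 1.5 + rnorm(12, sd = 0.1),      # toy SV1, tracks batch
  as.numeric(meta$condition) - 2 + rnorm(12, sd = 0.1)   # toy SV2, tracks condition
)

# R^2 of a one-way linear model: each SV (rows) against each covariate (columns)
sv_r2 <- sapply(c("emt", "condition"), function(v) {
  apply(sv, 2, function(s) summary(lm(s ~ meta[[v]]))$r.squared)
})
rownames(sv_r2) <- paste0("SV", seq_len(ncol(sv)))
print(round(sv_r2, 2))
```

An R² close to 1 in the emt column would support reading that SV as batch, and likewise for the condition column.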

But I expected it to show other sources of variation, beyond the ones I set in my model. What do you think?

UPDATE: If I set ~1 in the mod0 model, I get this plot (here). SV1 doesn't change, but SV2 clearly separates the conditions in a way that makes a lot of sense.

```
mod <- model.matrix(~ emt + condition, colData(dds))
mod0 <- model.matrix(~ 1, colData(dds))
```
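To make the difference between the two null models concrete, the sketch below lists which columns of the full model are missing from each mod0. As I understand the sva documentation, mod0 should contain only the adjustment variables, so the terms that appear in mod but not in mod0 are the ones treated as variables of interest. The `meta` data frame is a toy stand-in for `colData(dds)` (the `batch_2015` label is an assumption), and `mod0_emt` / `mod0_int` are names made up for this illustration.

```r
# Toy stand-in for colData(dds) (assumed layout, not real data)
meta <- data.frame(
  emt       = factor(rep(c("batch_2015", "other"), each = 6)),
  condition = factor(rep(c("Unt", "T1", "T6"), times = 4))
)

mod      <- model.matrix(~ emt + condition, meta)  # full model
mod0_emt <- model.matrix(~ emt, meta)              # null model: batch only
mod0_int <- model.matrix(~ 1, meta)                # null model: intercept only

# Columns of the full model that are absent from each null model
interest_emt <- setdiff(colnames(mod), colnames(mod0_emt))
interest_int <- setdiff(colnames(mod), colnames(mod0_int))
print(interest_emt)  # condition columns only
print(interest_int)  # emt and condition columns
```

With ~ emt as the null model only condition counts as a variable of interest, whereas with ~ 1 both emt and condition do.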