Hi, I'm currently struggling with an RNA-Seq experiment, especially with batch effects that potentially affect my analysis. I want to use svaseq() from the sva package, as recommended here (chapter "Removing hidden batch effects"), to find and account for hidden surrogate variables.

Because I do not see clear batch-effect clustering in my raw data, I decided to estimate the number of surrogate variables with num.sv() and then pass that result to svaseq(). Interestingly, num.sv() gives me **10 surrogate variables**, which confused me a bit. Additionally, svaseq() runs into an error when using **10 surrogate variables**. Because of that, I wanted to set the n.sv argument by hand to the number of batch effects I assume, which is **3**.
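To make the setup concrete, here is a minimal sketch of the two calls on simulated counts rather than my real data. The matrix `dat`, the `condition` factor and the hidden `batch` vector are made up for illustration; only `num.sv()` and `svaseq()` come from sva. One assumption on my side: since svaseq() is documented to take log(dat + constant) internally, the sketch passes log-transformed counts to num.sv() as well. I am not sure this is the intended usage.

```r
library(sva)

set.seed(1)
n_genes   <- 1000
condition <- factor(rep(c("ctrl", "trt"), each = 6))
batch     <- rep(c(0, 1), times = 6)  # hidden batch, never shown to the models

# Simulated negative binomial counts (hypothetical data);
# the first 200 genes are shifted upwards in batch 1.
mu <- matrix(100, nrow = n_genes, ncol = 12)
mu[1:200, batch == 1] <- 300
dat <- matrix(rnbinom(n_genes * 12, mu = mu, size = 10), nrow = n_genes)
dat <- dat[rowMeans(dat) > 1, ]  # drop very low-count genes

# Full model (with the variable of interest) and null model (intercept only)
mod  <- model.matrix(~ condition)
mod0 <- model.matrix(~ 1, data = data.frame(condition))

# Estimate the number of surrogate variables
# (log-transform is my assumption, see above)
n_est <- num.sv(log(dat + 1), mod, method = "be")

# Alternatively, fix the number by hand, here 3 as in my case
svseq <- svaseq(dat, mod, mod0, n.sv = 3)
dim(svseq$sv)  # one row per sample, one column per surrogate variable
```

In this toy data there is only one real hidden factor, but svaseq() still returns as many columns as I request in n.sv, which is why I am unsure how to choose the value.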

My question now is: what should I pass as the n.sv argument of svaseq()? Is it simply **3**? The above-mentioned manual says:

As we described above, we are trying to recover any hidden batch effects, supposing that we do not know the cell line information... Finally we specify that we want to estimate 2 surrogate variables.

Here is how they call svaseq():

```
svseq <- svaseq(dat, mod, mod0, n.sv=2)
```

So they want to recover the cell line as a possible surrogate variable, but then set n.sv to two surrogate variables. Why? Are they assuming another batch effect besides the cell line? In the end, they add these two variables to the DESeq2 design, which seem to represent the cell line effect. Maybe I missed something, but it is not described very clearly.
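As far as I understand it, the step of adding the surrogate variables to the DESeq2 design looks roughly like the sketch below. The count matrix, sample names and the `condition` factor are simulated placeholders, not the data from the manual, and `n.sv = 2` is copied from their call without my knowing why two is the right number.

```r
library(DESeq2)
library(sva)

set.seed(1)
condition <- factor(rep(c("ctrl", "trt"), each = 6))
batch     <- rep(c(0, 1), times = 6)  # hidden batch, not part of colData

# Simulated counts (hypothetical): 200 genes are shifted in batch 1
mu <- matrix(100, nrow = 1000, ncol = 12)
mu[1:200, batch == 1] <- 300
counts_sim <- matrix(rnbinom(1000 * 12, mu = mu, size = 10), nrow = 1000)
colnames(counts_sim) <- paste0("s", 1:12)
rownames(counts_sim) <- paste0("g", 1:1000)
col_data <- data.frame(condition, row.names = colnames(counts_sim))

dds <- DESeqDataSetFromMatrix(counts_sim, col_data, design = ~ condition)
dds <- estimateSizeFactors(dds)

# svaseq() on normalized counts, after removing very low-count genes
dat  <- counts(dds, normalized = TRUE)
dat  <- dat[rowMeans(dat) > 1, ]
mod  <- model.matrix(~ condition, data = col_data)
mod0 <- model.matrix(~ 1, data = col_data)
svseq <- svaseq(dat, mod, mod0, n.sv = 2)

# Store the surrogate variables as sample covariates and put them
# in the design before the variable of interest
ddssva <- dds
ddssva$SV1 <- svseq$sv[, 1]
ddssva$SV2 <- svseq$sv[, 2]
design(ddssva) <- ~ SV1 + SV2 + condition
```

The next step would be to run DESeq() on `ddssva`. What I do not understand is whether SV1 and SV2 are meant to capture one batch effect jointly or two separate ones.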

Thanks in advance for your help.