Question: including batch in design~

Morris_Chair

Hello, I want to add batch to the design of my DESeqDataSet, but I get an error:

```
dds=DESeqDataSetFromTximport(txi,colData = samples,design=~ batch + condition)
```

**Error in DESeqDataSet(se, design = design, ignoreRank) :
all variables in design formula must be columns in colData**

I don't know what I should add to the colData (my table) to make the batch term work.

If I remove batch from the command, it's all good:

```
dds=DESeqDataSetFromTximport(txi,colData = samples,design=~ condition)
```

**using counts and average transcript lengths from tximport**

Any help will be appreciated

thanks

`batch` should be a column in the `samples` data.frame you provided.

Hi Asaf,

What should the column named batch contain?

Thank you

The name of the batch (a sequencing lane ID, for instance) in which each library was prepared and sequenced.
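As a rough illustration of the point above, here is a minimal base R sketch that checks whether every variable in the design formula is a column of the sample table, then adds a `batch` column. The table contents, the `sample_id` values, and the lane labels are all made up for the example; they are not the poster's data.

```r
# Hypothetical sample table; names and values are assumptions for illustration
samples <- data.frame(
  sample_id = paste0("s", 1:8),
  condition = factor(rep(c("control", "treated"), each = 4)),
  stringsAsFactors = FALSE
)

design <- ~ batch + condition

# Design variables that are not (yet) columns of the sample table
missing_before <- setdiff(all.vars(design), colnames(samples))
print(missing_before)

# Add batch: one label per library, shared by libraries processed together
# (here, hypothetical lanes that each contain both conditions)
samples$batch <- factor(rep(c("lane1", "lane2"), times = 4))

missing_after <- setdiff(all.vars(design), colnames(samples))
print(missing_after)
```

Once `missing_after` is empty, the "all variables in design formula must be columns in colData" check should no longer be the obstacle.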

Hi Asaf, I followed your suggestion but another error came up.

here is my sample data.frame

Here is my design

from the vignette

thank you

A batch is a group of libraries that might share a confounding effect, such as the technician who prepared them or the sequencing run. Of course each library is different, but that's exactly what you are testing. Unless you have prior knowledge about groups of libraries that might share a confounding effect, you don't need to correct for batch.

I want to compare how much the PCA plots or heatmaps change when I subtract the batch effect. Do you have any idea why it is not working in my code? I read in the vignette that there are two possible causes for an error message like mine, but to my understanding neither of them fits the situation above.

thank you

What batches do you have in your data? In the table you didn't introduce a batch effect: a batch should group several libraries, not be library-specific.
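To see why a library-specific batch cannot work, one can compare the model matrices in base R. This is only a sketch with invented labels (`b1` ... `b8`, `control`/`treated`), and `is_full_rank` is a hypothetical helper written for the example, not a DESeq2 function.

```r
condition <- factor(rep(c("control", "treated"), each = 4))

# One batch label per library: batch is indistinguishable from the library itself
batch_per_library <- factor(paste0("b", 1:8))
mm_bad <- model.matrix(~ batch_per_library + condition)

# Two batches, each containing libraries from both conditions
batch_grouped <- factor(rep(c("b1", "b2"), times = 4))
mm_ok <- model.matrix(~ batch_grouped + condition)

# Hypothetical helper: a design is estimable only if the matrix has full column rank
is_full_rank <- function(mm) qr(mm)$rank == ncol(mm)

print(dim(mm_bad))           # 8 rows but 9 columns, so it cannot be full rank
print(is_full_rank(mm_bad))  # FALSE
print(is_full_rank(mm_ok))   # TRUE
```

With one batch level per library there are more coefficients than samples, so the batch and condition effects cannot be separated.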

Hi Asaf, after a few days of trying I have to ask again because I still can't figure it out. Here is a summary of the situation; it's a bit different from the one above, but I hope we can solve it.

Here is my coldata

I give a name to each file: `names(files) <- paste0(colData$sample_id, 1:8)`

and here is the dds code with the design formula,

I have two errors:

**the design formula contains a numeric variable with integer values, specifying a model with increasing fold change for higher values. did you mean for this to be a factor? if so, first convert this variable to a factor using the factor() function**

**Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.**

Can you help me fix it? What is the way to introduce the batch in the formula?

I can fix those errors by using letters instead of numbers, like I, but then I have another problem

There is something wrong with this batch argument, because if I take it out it works OK.

thank you
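For anyone hitting the same two messages, a small base R sketch may help separate them. The first message concerns a batch stored as integers; the second (not full rank) has several possible causes, and one common cause is a batch that coincides exactly with condition. Whether that applies to the table in this thread cannot be told from the post, so the data below are invented.

```r
# Hypothetical sample table with batch stored as integer codes
samples <- data.frame(
  condition = factor(rep(c("control", "treated"), each = 4)),
  batch = rep(1:2, times = 4)
)

# Integer batch is treated as a continuous covariate: a single slope column
cols_numeric <- colnames(model.matrix(~ batch + condition, data = samples))
print(cols_numeric)

# After factor(), batch gets one indicator column per non-reference level
samples$batch <- factor(samples$batch)
cols_factor <- colnames(model.matrix(~ batch + condition, data = samples))
print(cols_factor)

# Assumed failure case: batch coincides exactly with condition (confounded)
samples$batch_confounded <- factor(rep(c("A", "B"), each = 4))
mm_conf <- model.matrix(~ batch_confounded + condition, data = samples)
rank_conf <- qr(mm_conf)$rank
print(c(rank = rank_conf, columns = ncol(mm_conf)))  # rank is below the column count
```

A cross-tabulation such as `table(samples$batch, samples$condition)` is a quick way to spot this kind of confounding before building the dataset.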
