4.7 years ago
Morris_Chair
Hello, I want to add a batch term to the design of my DESeqDataSet, but I get an error:
dds=DESeqDataSetFromTximport(txi,colData = samples,design=~ batch + condition)
Error in DESeqDataSet(se, design = design, ignoreRank) : all variables in design formula must be columns in colData
I don't know what I should add to the colData (my sample table) to make the batch term work.
If I remove batch from the design, it all works:
dds=DESeqDataSetFromTximport(txi,colData = samples,design=~ condition)
using counts and average transcript lengths from tximport
Any help will be appreciated
thanks
batch should be a column in the samples data.frame you provided.
Hi Asaf,
What should the column named batch contain?
Thank you
The name of the batch (for instance, the sequencing lane ID) in which each library was prepared and sequenced.
Hi Asaf, I followed your suggestion but another error came up.
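As a minimal sketch of what such a table could look like (the sample names, conditions and lane labels here are made up for illustration and are not the poster's data), each batch label is shared by several libraries and both columns are factors:

```r
# Hypothetical sample table: 8 libraries, 2 conditions, 2 sequencing lanes.
samples <- data.frame(
  condition = factor(rep(c("control", "treated"), each = 4)),
  batch     = factor(rep(c("lane1", "lane2"), times = 4)),
  row.names = paste0("sample", 1:8)
)

# Every variable named in the design formula must be a column of this table.
design <- ~ batch + condition
stopifnot(all(all.vars(design) %in% colnames(samples)))

# Each batch should contain libraries from both conditions.
batch_by_condition <- table(samples$batch, samples$condition)
print(batch_by_condition)
```

A table shaped like this could then be passed as colData together with design = ~ batch + condition.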
here is my sample data.frame
Here is my design
from the vignette
thank you
A batch is a group of libraries that might share a confounding effect, such as the technician or the sequencing run. Of course each library is different, but that is exactly what you are testing. Unless you have prior knowledge about groups of libraries that might share a confounding effect, you don't need to batch correct.
I want to compare how much the PCA plots or the heatmap change when I subtract the batch effect. Do you have any idea why it is not working in my code? I read in the vignette that there are two possible causes for an error message like mine, but as far as I understand neither of them fits the situation above.
thank you
What batches do you have in your data? The table you posted doesn't contain a batch effect: a batch should group several libraries, not be library-specific.
Hi Asaf, after a few days of trying I have to ask again, because I still can't figure it out. Here is a summary of the situation; it's a bit different from the one above, but I hope we can solve it.
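One way to see the problem, sketched in base R with made-up labels: when every library has its own batch label, the batch columns already determine the condition, so the design matrix is not full rank, whereas a batch shared by several libraries leaves it full rank. The helper is_full_rank is hypothetical and not part of DESeq2.

```r
condition <- factor(rep(c("control", "treated"), each = 4))

# Library-specific "batch": one distinct label per library (hypothetical).
batch_per_library <- factor(paste0("b", 1:8))
# Grouping batch: each label is shared by four libraries (hypothetical).
batch_grouped <- factor(rep(c("lane1", "lane2"), times = 4))

# Hypothetical helper: TRUE if the design matrix has full column rank.
is_full_rank <- function(design, data) {
  mm <- model.matrix(design, data)
  qr(mm)$rank == ncol(mm)
}

per_library_ok <- is_full_rank(
  ~ batch + condition,
  data.frame(batch = batch_per_library, condition = condition)
)
grouped_ok <- is_full_rank(
  ~ batch + condition,
  data.frame(batch = batch_grouped, condition = condition)
)

print(per_library_ok)  # FALSE: 9 coefficients for only 8 libraries
print(grouped_ok)      # TRUE: 3 coefficients, batch not confounded with condition
```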
Here is my coldata
I name each sample file: names(files) <- paste0(colData$sample_id, 1:8)
And here is the dds code with the design formula:
I have two errors:
the design formula contains a numeric variable with integer values, specifying a model with increasing fold change for higher values. did you mean for this to be a factor? if so, first convert this variable to a factor using the factor() function
Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.
Can you help me fix it? What is the right way to introduce the batch in the formula?
I can fix those errors by using letters instead of numbers, like I, but then I have another problem.
Something is wrong with this batch term, because if I take it out everything works fine.
thank you
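The first message quoted above suggests that batch was stored as numbers. A minimal base R sketch of the conversion it asks for, using a made-up table rather than the poster's colData, might look like this; note that converting to a factor does not by itself cure the full-rank error if batch is still confounded with condition.

```r
# Hypothetical colData in which batch was entered as integers.
colData <- data.frame(
  condition = rep(c("control", "treated"), each = 4),
  batch     = rep(c(1, 2), times = 4)
)
was_numeric <- is.numeric(colData$batch)

# Convert both design variables to factors before building the dataset.
colData$batch     <- factor(colData$batch)
colData$condition <- factor(colData$condition)

# Check that ~ batch + condition gives a full-rank design matrix.
mm <- model.matrix(~ batch + condition, colData)
full_rank <- qr(mm)$rank == ncol(mm)
print(full_rank)
```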