Does the order of the batch variable matter in a limma design matrix?
6 weeks ago
CTLong ▴ 110

Hi all,

I've noticed that the variables in a design formula with a batch effect are usually written in a different order for DESeq2 and limma. I'm wondering whether I have misinterpreted something, or whether this is a real discrepancy between the two packages.

Here's how I would model the batch effect with DESeq2:

dds <- DESeqDataSetFromMatrix(countData = df,
                              colData = coldata,  # sample table with batch and condition columns (name assumed)
                              design = ~ batch + condition)


For limma, I would do the following:

design <- model.matrix(~0 + condition + batch)


I'm slightly less confident about the design matrix that I specified for limma. Is this the correct order for the variables, and should I put the 0 before all my variables so that I can specify contrasts when performing pairwise comparisons?

6 weeks ago
Gordon Smyth ★ 7.0k

limma works with the variables in either order, and with or without the 0+. So there is absolutely no reason why you can't use ~ batch + condition. But we often suggest that you use ~0 + condition + batch so that you can construct contrasts between the conditions in an intuitive way. If you remove the intercept by using 0+, then the order of the factors does matter, because you can only remove the intercept once: only the first factor in the formula gets a column for every one of its levels.

Which way you set up the design matrix is just a matter of convenience. It is entirely up to you.

For more discussion, see A guide to creating design matrices for gene expression experiments.
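
To make the point about factor order concrete, here is a minimal sketch using only base R. The sample annotation (three conditions A, B, C and two batches b1, b2) is hypothetical and is not taken from the original question. It shows which columns each formula produces, and that the two designs span the same column space, which is why the fitted model is the same either way.

```r
# Hypothetical sample annotation: 3 conditions x 2 batches, one sample each
condition <- factor(rep(c("A", "B", "C"), each = 2))
batch <- factor(rep(c("b1", "b2"), times = 3))

# Condition listed first: every condition level gets its own column
design_cond_first <- model.matrix(~0 + condition + batch)

# Batch listed first: every batch level gets its own column,
# and the first condition level (A) is dropped
design_batch_first <- model.matrix(~0 + batch + condition)

colnames(design_cond_first)
# "conditionA" "conditionB" "conditionC" "batchb2"
colnames(design_batch_first)
# "batchb1" "batchb2" "conditionB" "conditionC"

# Both designs have rank 4, and stacking them side by side adds no new
# directions, so they describe the same model
rank_combined <- qr(cbind(design_cond_first, design_batch_first))$rank
rank_combined
# 4
```

With the condition-first design, a comparison such as B vs A is simply conditionB - conditionA, which is the convenience referred to above.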


Hi Gordon, thanks for the advice. I did notice that setting the design to ~0 + batch + condition allows me to include all the contrasts I want for my conditions, whereas ~0 + condition + batch would cause one of my conditions to be excluded from the design matrix, so I can't make a contrast with that condition. Good to know that the order does not impact the actual DE analysis.


I think you mean the other way around.

But excluding one of the conditions from the design matrix does not prevent you from making contrasts with that condition. limma can easily test all the pairwise contrasts regardless of which design matrix you use. If the first condition is excluded from the design matrix, you just need a slightly deeper understanding of what the coefficients mean in order to form the pairwise comparisons. As I keep saying, it is entirely a matter of convenience.
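
As an illustration of what "a slightly deeper understanding of the coefficients" amounts to, here is a small base-R sketch on simulated data for a single gene. The data, factor levels and contrast names are all made up for the example. It fits both designs with ordinary least squares and checks that the three pairwise comparisons come out identical. In a real limma analysis, contrast matrices like these would be passed to contrasts.fit() after lmFit().

```r
set.seed(1)
# Hypothetical layout: 3 conditions x 2 batches, 2 replicates per cell
condition <- factor(rep(c("A", "B", "C"), each = 4))
batch <- factor(rep(c("b1", "b2"), times = 6))
y <- rnorm(12)  # simulated expression values for one gene

# Design 1: every condition has its own column
d1 <- model.matrix(~0 + condition + batch)
# Design 2: condition A is excluded (absorbed into the batch columns)
d2 <- model.matrix(~0 + batch + condition)

b1 <- lm.fit(d1, y)$coefficients
b2 <- lm.fit(d2, y)$coefficients

# Design 1: pairwise comparisons are differences of condition columns
c1 <- cbind(BvsA = c(-1, 1, 0, 0),
            CvsA = c(-1, 0, 1, 0),
            CvsB = c(0, -1, 1, 0))
rownames(c1) <- colnames(d1)

# Design 2: conditionB and conditionC already mean "B - A" and "C - A",
# so only C vs B needs a difference of two coefficients
c2 <- cbind(BvsA = c(0, 0, 1, 0),
            CvsA = c(0, 0, 0, 1),
            CvsB = c(0, 0, -1, 1))
rownames(c2) <- colnames(d2)

est1 <- drop(t(c1) %*% b1)
est2 <- drop(t(c2) %*% b2)
all.equal(est1, est2)
# TRUE
```

So the excluded condition is still available: it simply plays the role of the reference level that the remaining condition coefficients are measured against.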