I am analyzing microarray data that has a batch effect. To correct for it, I am including the batch variable in the formula for the linear model:
formula <- paste("~0 ", "Group ", "Batch", sep = "+ ")
The group variable includes two categories: M_GO and NM_GO, whereas the batch variable includes three categories: Batch1, Batch2 and Batch3.
After that, I created the design matrix, whose column names are the following:
M_GO NM_GO Batch1 Batch2
Both group categories appear because there is no intercept. For the batch variable, however, only two of the three categories are present. Samples with Batch1=1 & Batch2=0 belong to Batch1, samples with Batch1=0 & Batch2=1 belong to Batch2, and samples with Batch1=0 & Batch2=0 belong to Batch3.
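A minimal sketch of this coding in base R, using six made-up samples. The sample labels are hypothetical, and listing Batch3 first in the factor levels is an assumption made so that it becomes the reference level and is dropped from the design matrix. The prefix-stripping step is likewise only a guess at how the short column names were obtained:

```r
# Hypothetical sample annotation: 2 groups x 3 batches, one sample each
Group <- factor(c("M_GO", "NM_GO", "M_GO", "NM_GO", "M_GO", "NM_GO"),
                levels = c("M_GO", "NM_GO"))

# Assumption: Batch3 is listed first so that it is the reference level
Batch <- factor(c("Batch1", "Batch1", "Batch2", "Batch2", "Batch3", "Batch3"),
                levels = c("Batch3", "Batch1", "Batch2"))

# No intercept: both Group levels get a column, Batch gets two dummy columns
design <- model.matrix(~ 0 + Group + Batch)

# Strip the variable-name prefixes that model.matrix() adds to each column
colnames(design) <- sub("^(Group|Batch)", "", colnames(design))

print(design)
```

In the printed matrix, the two Batch3 samples (rows 5 and 6) have 0 in both the Batch1 and Batch2 columns, which is how the third category is represented without a column of its own.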
However, my question is about the contrasts matrix. As Batch3 does not appear in the design matrix, the only comparison that can be performed for batch is "Batcheffect = Batch1 - Batch2".
contrasts <- c("GO_MvsNM = M_GO - NM_GO", "Batcheffect = Batch1 - Batch2")
Does this contrasts matrix take all three batch categories into account? Am I correcting for the batch effect properly?
Thank you very much in advance!!