I'm trying to remove batch effect from my data using the
sva package. The process described here is like this:
library(bladderbatch) data(bladderdata) pheno = pData(bladderEset) edata = exprs(bladderEset) # The null model contains only the adjustment variables. Since we are not adjusting # for any other variables in this analysis, only an intercept is included in # the model. mod0 = model.matrix(~1,data=pheno) # The full model includes both the adjustment variables # and the variable of interest mod <- model.matrix(~1+ cancer, data=pheno) # Identify the number of latent factors that need to be estimated. n.sv = num.sv(edata,mod,method="leek") # estimate the surrogate variables svobj = sva(edata,mod,mod0,n.sv=n.sv) # include the surrogate variables in both the null and full models modSv = cbind(mod,svobj$sv) mod0Sv = cbind(mod0,svobj$sv) # Adjusting for surrogate variables using the limma package fit = lmFit(edata,modSv)
Now I want to have a one vs all comparison. So, according to this post by Michael Love my model should be like this:
limma_mod <- model.matrix(~0 + cancer, data=pheno)
I'm confused whether I should use
model.matrix(~1+ cancer, data=pheno) as my full model and
model.matrix(~1, data=pheno) as the null model, then, for
model.matrix(~0+ cancer, data=pheno) with the surrogate variables, or use
model.matrix(~0+ cancer, data=pheno) as my full model and
model.matrix(~1, data=pheno) as the null model?
mod0 <- model.matrix(~0, data=pheno) and
mod <- model.matrix(~0+ cancer, data=pheno) results in this error:
Error in solve.default(t(mod0) %*% mod0) : 'a' is 0-diml
I don't really see how your post title relates to the body of your post, but as described here if you have known batch effects you want to use either ComBat (maybe to get CPM values for heatmaps) or simply account for them in your limma design matrix (for DE analysis), whereas SVA is useful if you have unknown sources of variation. Do you still wish to use SVA? If so I can dig through some old code and probably help out for real.
Sorry about the title. I am really confused on this topic. Working with combat is really easier but I actually don't know all sources of variation in my data. I would be grateful if you coul help me.