Unfortunately removeBatchEffects uses contr.sum in accounting for batch. That means that it effectually corrects everything to the average of all the batches.
Here is the removeBatchEffects code:
function (x, batch = NULL, batch2 = NULL, covariates = NULL,
design = matrix(1, ncol(x), 1), ...)
{
if (is.null(batch) && is.null(batch2) && is.null(covariates))
return(as.matrix(x))
if (!is.null(batch)) {
batch <- as.factor(batch)
contrasts(batch) <- contr.sum(levels(batch)) # <--------- this line tells it to average batches
batch <- model.matrix(~batch)[, -1, drop = FALSE]
}
if (!is.null(batch2)) {
batch2 <- as.factor(batch2)
contrasts(batch2) <- contr.sum(levels(batch2))
batch2 <- model.matrix(~batch2)[, -1, drop = FALSE]
}
if (!is.null(covariates))
covariates <- as.matrix(covariates)
X.batch <- cbind(batch, batch2, covariates)
fit <- lmFit(x, cbind(design, X.batch), ...)
beta <- fit$coefficients[, -(1:ncol(design)), drop = FALSE]
beta[is.na(beta)] <- 0
as.matrix(x) - beta %*% t(X.batch)
}
In order to correct the average of one batch to the average of the other, we need to use the default contrast scheme instead. This is as simple as removing the marked line in the above:
new_batch_effects_removeal <- function(m, batch, design, matrix(1, ncol(x), 1), ...) {
if (!is.null(batch)) {
batch <- as.factor(batch)
batch <- model.matrix(~batch)[, -1, drop = FALSE]
}
fit <- lmFit(x, cbind(design, batch), ...)
beta <- fit$coefficients[, -(1:ncol(design)), drop = FALSE]
beta[is.na(beta)] <- 0
as.matrix(x) - beta %*% t(batch)
}
In this simplified version I've removed refrences to batch2 and covariates. It will take the alphanumerically first batch (i.e. A before B, or 0 before 1, 1 before 2) as the reference and correct other batches to that.
I guess that you could work through the code of
removeBatchEffectsto find a way to do this