3.1 years ago by
Republic of Ireland
In order to better understand the batch correction methods, I highly recommend reading the following manuscript two or three times:
The conclusion that you should get from reading this is that correcting for batch directly with programs like ComBat is best avoided. If at all possible, include batch as a covariate in all of your statistical models / tests. Programs like DESeq2 allow you to include batch (like any other factor) as a covariate during normalisation.
Edit 1: 17th June 2018 ...and again in the early hours of October 8, 2019
It should be pointed out that there is a difference between modelling a batch effect and directly modifying data in order to counteract a batch effect. Programs like ComBat aim to directly modify your data in an attempt to eliminate batch effects (it literally 'subtracts' out the modeled effect, which can result in the infamous negative values after employing ComBat). After employing ComBat, statistical tests are conducted on the modified data, with batch not appearing in the design formula.
However, by including 'batch' as a covariate in a design formula for the purpose of differential expression analysis, one is simply modeling the effect size of the batch, without actually modifying your data. The modeled effect size is then taken into account when statistical inferences are made.
If you feel that you still need to correct for batch in your workdlow, then follow the guidance, here:
Why after VST are there still batches in the PCA plot?.
Edit 2: 17th June 2018.
Between ComBat and the
removeBatchEffect() function from limma, I would use the limma function.
Edit 3: 15th October 2019.
A further answer given here: Batch correction by using ComBat function in sva package