13 months ago by
Republic of Ireland
Good afternoon, friend,
In order to better understand the batch correction methods, I highly recommend reading the following manuscript two or three times:
The conclusion that you should draw from reading this is that correcting for batch directly with programs like ComBat is best avoided. If at all possible, include batch as a covariate in all of your statistical models / tests. Programs like DESeq2 allow you to include batch (like any other factor) as a covariate in the design formula used for model fitting.
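To make the 'batch as a covariate' idea concrete, here is a minimal sketch in R. It uses the simulated dataset that ships with DESeq2, and the batch labels (b1, b2, b3) are hypothetical ones that I attach purely for illustration, so there is no real batch effect in these data. Treat it as an outline under those assumptions, not as a complete workflow.

```r
library(DESeq2)

set.seed(1)
# Simulated data from DESeq2: 1000 genes, 12 samples, condition A (first 6) vs B (last 6)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical batch labels, balanced so that each batch holds 2 A and 2 B samples
dds$batch <- factor(rep(c("b1", "b2", "b3"), times = 4))

# Keep a copy of the counts to compare against after model fitting
raw_counts <- counts(dds)

# Batch enters the model as a covariate; the variable of interest goes last
design(dds) <- ~ batch + condition
dds <- DESeq(dds)

# Test condition B vs A, with batch accounted for in the model
res <- results(dds, contrast = c("condition", "B", "A"))
head(res[order(res$pvalue), ])

# Compare the stored counts with the copy taken before fitting
identical(raw_counts, counts(dds))
```

The point of the design choice is that batch is estimated alongside condition in the same model, so the counts you supplied are never rewritten.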
Edit 1: 17th June 2018.
It should be pointed out that there is a difference between modelling a batch effect and directly modifying counts in order to counteract a batch effect. Programs like ComBat aim to directly modify your counts in an attempt to eliminate batch effects (it literally 'subtracts' out the modelled effect, which can result in the infamous negative values after employing ComBat). After employing ComBat, statistical tests are conducted on the modified counts, with batch not appearing in the design formula.
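As a toy illustration of how direct adjustment can produce negative values, the sketch below subtracts each batch's mean from made-up expression values for a single gene. This is deliberately much simpler than what ComBat does (ComBat estimates location and scale parameters with empirical Bayes shrinkage), so take it only as an illustration of the 'subtracting out' idea, with all numbers invented.

```r
# Made-up normalised expression for one gene, 4 samples per batch
batch <- factor(rep(c("b1", "b2"), each = 4))
expr  <- c(0, 1, 2, 1,    # batch b1
           5, 0, 6, 5)    # batch b2 is shifted upwards, but one sample is 0

# Mean of the batch that each sample belongs to
batch_mean <- ave(expr, batch)

# Remove the batch mean and add back the overall mean
adjusted <- expr - batch_mean + mean(expr)

print(adjusted)
# The zero in batch b2 becomes -1.5 after adjustment
any(adjusted < 0)
```

A value of zero in a batch with a high average is pushed below zero, which is the same kind of outcome that people run into with lowly expressed genes after direct correction.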
However, by including 'batch' as a covariate in the design formula for differential expression analysis, one is simply modelling the effect size of the batch (without actually adjusting the raw or normalised counts), which is then used to adjust the statistical inferences when calculating p-values for the condition of interest. This does not modify the underlying data. In DESeq2, the rlog() and vst() transformations can additionally take batch (and anything else in your design formula) into account by setting blind=FALSE, and this is the procedure recommended by Michael Love when the aim is to use the transformed counts for downstream analyses. Note that, per the DESeq2 documentation, blind=FALSE uses the design only when estimating dispersions; it does not remove batch variation from the transformed values themselves.
Edit 2: 17th June 2018.
Between ComBat and the removeBatchEffect() function from limma, I would use the limma function.
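For what it is worth, a rough sketch of how the limma function is typically applied to DESeq2-transformed data is given below. It again relies on DESeq2's simulated dataset with hypothetical batch labels that I add myself, so there is no genuine batch effect to remove here; the aim is only to show the calls. As the limma documentation notes, the adjusted matrix is meant for visualisation and similar downstream uses (PCA, clustering, heatmaps), not as input to differential expression testing, where batch belongs in the design formula.

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated data from DESeq2: 1000 genes, 12 samples, condition A vs B
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical batch labels, balanced across the two conditions
dds$batch <- factor(rep(c("b1", "b2", "b3"), times = 4))
design(dds) <- ~ batch + condition

# Regularised log transformation; blind = FALSE lets the dispersion
# estimation make use of the design formula
rld <- rlog(dds, blind = FALSE)
mat <- assay(rld)

# Design matrix for the variable we want to preserve (condition)
design_keep <- model.matrix(~ condition, data = as.data.frame(colData(rld)))

# Remove the batch term from the transformed values, protecting condition
mat_adj <- limma::removeBatchEffect(mat, batch = rld$batch, design = design_keep)

# Put the adjusted values back and inspect them, e.g. with a PCA plot
assay(rld) <- mat_adj
plotPCA(rld, intgroup = "condition")
```

Passing the condition design matrix is what stops the function from removing variation that is attributable to the condition of interest along with the batch term.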