13 days ago
Eliveri
Hi all,
I'm working with Illumina EPIC methylation array data and I'm using ComBat (from the sva package) for batch correction.
I’d like to correct for multiple sources of batch effects:
- Chip/Sentrix_ID
- Sentrix_Row (position on chip)
- Sample Plate
As far as I understand, the standard ComBat() function accepts only one batch variable. So I’m wondering:
- What is the best practice to correct for multiple batch variables? Do I run ComBat() sequentially (e.g., first by plate, then by chip, then by row)? Or is there a more appropriate way to handle this?
I’ve seen some approaches using model matrices (mod) to adjust for biological covariates, but I’m specifically unclear on how to incorporate multiple batch factors.
Thanks in advance for your help!
It is generally better to include your batch effects in your model for differential analysis. Which tool will you use? Usually, the design matrix will look something like this:
design <- model.matrix(~ Group_of_interest + batch1 + batch2)
and so on for any further batch variables. Personally, I prefer to use limma to identify differentially methylated positions, followed by a differentially methylated region analysis with DMRcate.
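One thing worth checking before fitting: if each chip sits on exactly one plate, the plate term is fully determined by the chip term and the design matrix becomes rank deficient. Below is a minimal sketch in base R with a made-up sample sheet (4 chips of 8 rows, 2 chips per plate). The layout, the column name Group and the object names are assumptions for illustration, not your actual data.

```r
# Hypothetical sample sheet: 4 chips x 8 rows, chips 1-2 on plate1, chips 3-4 on plate2.
chip <- rep(1:4, each = 8)
row  <- rep(1:8, times = 4)

pheno <- data.frame(
  # Group is spread across chips and rows so it is not confounded with either one.
  Group        = factor(ifelse((chip + row) %% 2 == 0, "case", "control"),
                        levels = c("control", "case")),
  Sentrix_ID   = factor(paste0("chip", chip)),
  Sentrix_Row  = factor(paste0("R0", row)),
  Sample_Plate = factor(ifelse(chip <= 2, "plate1", "plate2"))
)

# All three batch variables: plate is nested in chip, so one column is redundant.
design_full <- model.matrix(~ Group + Sample_Plate + Sentrix_ID + Sentrix_Row,
                            data = pheno)
rank_full <- qr(design_full)$rank

# Dropping the plate term (chip already captures it) gives a full-rank design.
design_reduced <- model.matrix(~ Group + Sentrix_ID + Sentrix_Row, data = pheno)
rank_reduced <- qr(design_reduced)$rank

cat("full design:   ", ncol(design_full), "columns, rank", rank_full, "\n")
cat("reduced design:", ncol(design_reduced), "columns, rank", rank_reduced, "\n")

# Not run here: with limma installed and a matrix of M-values (mvals, hypothetical
# name, CpGs in rows and samples in columns, same order as pheno), the fit would be:
# fit <- limma::eBayes(limma::lmFit(mvals, design_reduced))
```

If the rank is lower than the number of columns, at least one batch term is redundant or confounded with another variable and has to be dropped or merged before the model can be estimated. Whether chip and plate are nested in your data depends on how the samples were laid out, so check this on your own sample sheet.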