sva ComBat removes wanted biological variation
3.4 years ago
exin ▴ 60

My sc-RNAseq on sponge larvae cells was done in 2 batches (libraries prepped separately, separate sequencing runs). The samples consist of 7 cell types with 8 replicates each: 5 reps in batch 1 and 3 reps in batch 2, except for one cell type, which went to batch 2.

I used ComBat from the sva package to remove this known batch effect. However, it seems to have also removed some of the wanted biological differences (differences between cell types):

Before combat:

After combat:

I’m unable to back this up by quoting the actual loss of DE genes after ComBat (I can’t do DESeq2 DE analysis on ComBat-ed counts as they're normalised…?)

What do I do in this case? Is there a way to tweak the power of ComBat? This doesn't seem to be mentioned in the vignette.

Admittedly, there is a lot of noise in my data, and if I run sva I'll likely uncover other variables, but I want to get rid of this known batch effect first.

rna-seq r combat sva PCA • 1.3k views
3.4 years ago

As far as is feasible, you should simply include batch as a covariate in your design formula. When the regression model is then fit for each gene, the effect of the batch covariate is estimated and taken into account when calculating the coefficient for your variable of interest (i.e., the level of differential expression of each gene is 'adjusted' for the effect of batch). As the batch effect is known (and hopefully consistent in how it affects your samples), doing this is 'better' for DEA (differential expression analysis) than directly adjusting your counts / expression data for batch.
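
As a rough illustration of the design-formula approach, here is a minimal DESeq2 sketch. It assumes a layout like the one in the question (six cell types split 5 + 3 across the two batches, a seventh only in batch 2). The counts are simulated, and every object, column, and level name (`coldata`, `cell_type`, `batch`, `type1`, `b1`, ...) is a hypothetical placeholder, not something from the original data. In practice you would supply the raw, un-corrected count matrix instead of the simulated one.

```r
library(DESeq2)

set.seed(1)

# Hypothetical sample sheet: 7 cell types x 8 replicates.
# type1..type6: 5 reps in batch b1, 3 reps in batch b2; type7: all 8 in b2.
cell_type <- rep(paste0("type", 1:7), each = 8)
batch <- c(rep(rep(c("b1", "b2"), times = c(5, 3)), times = 6),
           rep("b2", 8))
coldata <- data.frame(
  cell_type = factor(cell_type),
  batch     = factor(batch),
  row.names = paste0("sample", seq_along(cell_type))
)

# Simulated raw counts (200 genes x 56 samples), purely for illustration.
n_genes <- 200
gene_means <- 2^runif(n_genes, min = 4, max = 10)
counts <- matrix(
  rnbinom(n_genes * nrow(coldata), mu = gene_means, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), rownames(coldata))
)
storage.mode(counts) <- "integer"

# Check that batch and cell type are not fully confounded: the design
# matrix must be full rank for the model to be estimable.
design_matrix <- model.matrix(~ batch + cell_type, data = coldata)
stopifnot(qr(design_matrix)$rank == ncol(design_matrix))

# Batch enters as a covariate; the variable of interest goes last.
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData   = coldata,
                              design    = ~ batch + cell_type)
dds <- DESeq(dds)

# Cell-type comparison, adjusted for batch.
res <- results(dds, contrast = c("cell_type", "type2", "type1"))
head(res)
```

With this layout the design is still full rank even though one cell type sits entirely in batch 2, because the batch coefficient can be estimated from the other six cell types. Comparisons involving that seventh cell type then rely on the batch effect being the same for it as for the others.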

I would only use ComBat as an absolute final resort.


Ok I'll do that. Thank you Kevin!!