Dear all,
I have RNA-seq data collected from 4 datasets. Three of them contain the same type of sample (control, green); the fourth contains only cancer samples (red). All samples are combined in a single matrix, and the goal is to perform DE analysis between control and cancer patients.
Therefore, I want to use ComBat to adjust for batch effects. However, I have two batch variables: one is Dataset, and the other is Hospital (for dataset 1, samples were collected from two different hospitals, adding another layer of variability).
How should I proceed? I've searched several forums for an answer, but nothing has helped.
Thanks in advance!
Even if the batches were meaningfully distributed across conditions, OP wouldn't need ComBat-seq for batch correction, especially since the goal is DE: in that case batch should go into the model as a covariate. But like you say, sample type and dataset are confounded here (all the cancer samples come from dataset 4), so there is no way to attribute any observed change to just biology or just batch.
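To make the confounding point concrete, here is a minimal Python sketch (NumPy only; it is not ComBat or any DE package). It builds a treatment-coded design matrix with condition and dataset as covariates and checks whether it has full column rank. The per-dataset sample counts and the helper names (`dummy_columns`, `design_matrix`, `is_estimable`) are made up for illustration; only the layout, three control-only datasets plus one cancer-only dataset, comes from the question.

```python
import numpy as np


def dummy_columns(labels):
    """Treatment-coded indicator columns; the first sorted level is the reference."""
    levels = sorted(set(labels))
    return np.array([[float(lab == lev) for lev in levels[1:]] for lab in labels])


def design_matrix(condition, batch):
    """Intercept + condition indicators + batch indicators (batch as a covariate)."""
    intercept = np.ones((len(condition), 1))
    return np.hstack([intercept, dummy_columns(condition), dummy_columns(batch)])


def is_estimable(condition, batch):
    """True if the design has full column rank, i.e. the condition and batch
    effects can be separated by a linear model."""
    X = design_matrix(condition, batch)
    return bool(np.linalg.matrix_rank(X) == X.shape[1])


# Layout from the question (sample counts are assumed): datasets 1-3 are all
# control, dataset 4 is all cancer.
confounded_condition = ["control"] * 6 + ["cancer"] * 2
confounded_batch = ["d1", "d1", "d2", "d2", "d3", "d3", "d4", "d4"]

# Hypothetical contrast case: every dataset contains both conditions.
balanced_condition = ["control", "cancer"] * 4
balanced_batch = ["d1", "d1", "d2", "d2", "d3", "d3", "d4", "d4"]

print(is_estimable(confounded_condition, confounded_batch))  # False
print(is_estimable(balanced_condition, balanced_batch))      # True
```

In the confounded layout the condition column is a linear combination of the intercept and the dataset columns, so the model cannot tell a cancer effect from a dataset 4 effect. In the hypothetical balanced layout the same model is full rank, which is the situation where adding batch as a covariate works.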