I am writing to see if anyone has experience combining single-cell RNA-seq data from different conditions and biological/technical replicates in one experiment. I have a dataset with two conditions (WT/Treatment), and each condition has two replicates (done in different isogenic mice). I would like to correct for batch effects within each condition independently, then combine the data from the two conditions and do a joint analysis to see the differences in clusters/cell types between the two conditions. I have been using Seurat, where I tried the following strategy:
For example, COND1 had Exp1.1 and Exp1.2, and COND2 had Exp2.1 and Exp2.2.
The process I followed is:
- Merge COND1/Exp1.1 and COND1/Exp1.2.
- After the usual pre-processing of the merged COND1 object, correct for batch in ScaleData using the experiment ID.
- Do the same for COND2.
- Then merge the two objects (COND1 and COND2) for a combined analysis.
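To make the intent of these steps concrete, here is a toy sketch in plain Python rather than Seurat. It is only an illustration under stated assumptions: the values for a single gene are made up, `correct_within_condition` is a hypothetical helper, and "batch correction" is reduced to shifting each experiment onto its condition's mean, which is not what ScaleData actually computes.

```python
from statistics import mean

# Toy expression values for ONE gene; all numbers are made up for illustration.
# Keys are (condition, experiment) pairs matching the naming above.
cells = {
    ("COND1", "Exp1.1"): [1.0, 2.0, 3.0],
    ("COND1", "Exp1.2"): [3.0, 4.0, 5.0],
    ("COND2", "Exp2.1"): [6.0, 7.0, 8.0],
    ("COND2", "Exp2.2"): [8.0, 9.0, 10.0],
}

def correct_within_condition(cells, condition):
    """Hypothetical helper: shift each experiment of one condition onto that condition's mean."""
    batches = {key: values for key, values in cells.items() if key[0] == condition}
    condition_mean = mean(x for values in batches.values() for x in values)
    corrected = {}
    for key, values in batches.items():
        batch_mean = mean(values)
        # Remove the experiment-specific offset, keep the condition-level mean.
        corrected[key] = [x - batch_mean + condition_mean for x in values]
    return corrected

# Correct each condition on its own, then combine the results for joint analysis.
combined = {}
for condition in ("COND1", "COND2"):
    combined.update(correct_within_condition(cells, condition))

for key, values in combined.items():
    print(key, values)
```

In this toy case the two experiments within each condition end up aligned, while the offset between COND1 and COND2 is still present in the combined data.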
The problem is that after merging COND1 and COND2 in the last step, I have to normalize and run ScaleData again, which would lose the batch-corrected expression values. If I instead merge all the conditions and experiments at the beginning, I don't think I can correct for batch across all four datasets, since each experiment belongs to only one condition and the correction would neutralize the difference between the conditions.
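The second concern can be shown with a small toy example in plain Python. This is a sketch under assumptions, not Seurat code: the numbers are invented, `condition_means` and `remove_batch_mean` are hypothetical helpers, and batch correction is simplified to subtracting each experiment's mean.

```python
from statistics import mean

# Made-up values for one gene whose expression is higher in COND2.
batches = {
    "Exp1.1": [1.0, 2.0, 3.0],
    "Exp1.2": [3.0, 4.0, 5.0],
    "Exp2.1": [6.0, 7.0, 8.0],
    "Exp2.2": [8.0, 9.0, 10.0],
}
condition_of = {
    "Exp1.1": "COND1", "Exp1.2": "COND1",
    "Exp2.1": "COND2", "Exp2.2": "COND2",
}

def condition_means(data):
    """Hypothetical helper: pool the experiments of each condition and average them."""
    pooled = {}
    for exp, values in data.items():
        pooled.setdefault(condition_of[exp], []).extend(values)
    return {cond: mean(values) for cond, values in pooled.items()}

def remove_batch_mean(values):
    """Hypothetical helper: subtract the experiment's own mean from each value."""
    batch_mean = mean(values)
    return [x - batch_mean for x in values]

# Treat all four experiments as batches in a single correction step.
globally_corrected = {exp: remove_batch_mean(values) for exp, values in batches.items()}

before = condition_means(batches)
after = condition_means(globally_corrected)
print("before:", before)
print("after: ", after)
```

Because every experiment belongs to exactly one condition, removing the per-experiment mean also removes the offset between conditions in this toy data: the two condition means differ before the correction and coincide after it.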
Any thoughts/suggestions would be greatly appreciated. If someone can point me to any code that does this, even better!