Dear all,
I have six samples from a 10x experiment: three under control conditions and three subjected to a treatment. I am performing QC preprocessing on each sample separately. Now, I want to integrate the samples using Harmony before proceeding with downstream analyses. My question is: Is it better to run the RunHarmony() function on all preprocessed and merged control samples, then do the same for the preprocessed treated samples, and finally merge the two resulting integrated objects? Or should I first merge all six preprocessed samples into a single Seurat object and then run RunHarmony once on the combined dataset? Any suggestions on which way is better? Those are very large files...
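Merge all six samples into a single Seurat object first, then run RunHarmony() once. Harmony can integrate over multiple covariates, so you can provide both the control/treatment label and the batch information in that single call. Running Harmony separately on the control and treated samples would produce two embeddings that do not share a common space, so they cannot be meaningfully merged afterwards.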
https://portals.broadinstitute.org/harmony/articles/quickstart.html#harmony-with-two-or-more-covariates
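A rough sketch of the merge-then-run-once approach, in case it helps. It assumes the Seurat and harmony R packages are installed and that you already have six QC-filtered Seurat objects. The object names (`ctrl1` ... `trt3`), the metadata column names (`sample`, `condition`), and the choice of 30 PCs are placeholders of mine, not anything from the original post, and details may differ between package versions.

```r
library(Seurat)
library(harmony)

# Hypothetical inputs: six QC-filtered Seurat objects, one per sample.
# ctrl1 ... trt3 are placeholder names; substitute your own objects.
samples <- list(ctrl1 = ctrl1, ctrl2 = ctrl2, ctrl3 = ctrl3,
                trt1 = trt1, trt2 = trt2, trt3 = trt3)

# Record sample and condition in each object's metadata before merging.
for (name in names(samples)) {
  samples[[name]]$sample <- name
  samples[[name]]$condition <- ifelse(grepl("^ctrl", name), "control", "treated")
}

# Merge all six objects into one; add.cell.ids keeps cell barcodes unique.
combined <- merge(samples[[1]], y = samples[-1], add.cell.ids = names(samples))

# Standard preprocessing up to PCA, which Harmony uses as its input.
combined <- NormalizeData(combined)
combined <- FindVariableFeatures(combined)
combined <- ScaleData(combined)
combined <- RunPCA(combined, npcs = 30)

# One Harmony run over both covariates (assumed column names, see above).
combined <- RunHarmony(combined, group.by.vars = c("sample", "condition"))

# Downstream steps read the corrected "harmony" reduction instead of "pca".
combined <- RunUMAP(combined, reduction = "harmony", dims = 1:30)
combined <- FindNeighbors(combined, reduction = "harmony", dims = 1:30)
combined <- FindClusters(combined)
```

Whether to include `condition` in `group.by.vars` is a judgment call: correcting for it may also reduce treatment-driven differences in the embedding, so some people pass only the sample/batch column. Either way, the point is that Harmony is run once on the merged object.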