Hi,
Please check my previous answer to a similar question on this forum: "When should I NOT apply batch correction for my single-cell RNAseq data?"
I also recommend the following bioRxiv paper, which compares methods for integrating scRNA-seq cancer samples: "A comparison of data integration methods for single-cell RNA sequencing of cancer samples".
Regarding your specific question, the answer depends on your aims. In my view, integration is about identifying the cell populations shared across datasets, whether or not a batch effect is present.
In your example, the two samples represent different biological conditions - WT versus drug-resistant cancer cell lines - and not different (technical) batches. If you expect to find shared cell populations across the two biological conditions, then I would perform integration; otherwise, I would not.
There are generally three main approaches, which can be combined, to check whether the data requires integration:
- Dimensional reduction techniques
- Automatic cell annotation (this might not apply in your case)
- Independent sample analysis: clustering, cluster markers, annotation
- (cluster comparison between samples)
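As a rough illustration of the cluster comparison between samples, one option is to score the overlap between the marker gene lists of each pair of clusters with a Jaccard index. This is only a minimal sketch: the cluster names, the gene names, and the helper functions `jaccard` and `compare_markers` are all hypothetical, and in practice the marker lists would come from your own per-sample marker analysis.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard index of two gene lists: shared genes divided by all genes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_markers(markers_a, markers_b):
    """Return {(cluster_a, cluster_b): jaccard} for every pair of clusters."""
    return {
        (ca, cb): jaccard(markers_a[ca], markers_b[cb])
        for ca, cb in product(markers_a, markers_b)
    }


# Toy, made-up marker lists (hypothetical, not from any real dataset).
wt_markers = {
    "WT_0": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "WT_1": ["GENE_E", "GENE_F", "GENE_G"],
}
resistant_markers = {
    "RES_0": ["GENE_A", "GENE_B", "GENE_C", "GENE_H"],
    "RES_1": ["GENE_X", "GENE_Y", "GENE_Z"],
}

scores = compare_markers(wt_markers, resistant_markers)
for (ca, cb), score in sorted(scores.items()):
    print(f"{ca} vs {cb}: {score:.2f}")
```

Pairs with a high score are candidates for shared populations; pairs near zero are more likely condition-specific. Any threshold you pick is a judgment call and depends on how many markers you keep per cluster.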
You can check the following course materials to see in practice how this can be done for a few examples (I should disclose that I am the author of these materials): The Hitchhiker’s Guide to scRNA-seq course.
For example, check the following vignette: Cross-tissue integration task.
To compare the results (UMAPs) with and without integration, you can check whether the clusters that appear shared also share a good number of marker genes, enough to say confidently that they are shared between biological conditions. Ideally, the clusters identified in each individual sample should map one-to-one onto the integrated clusters. This can be assessed by generating a confusion matrix comparing the clustering results obtained with and without integration.
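A minimal sketch of such a confusion matrix, assuming you have exported two cluster labels per cell (one from the independent per-sample analysis, one from the integrated analysis) in the same cell order. The labels below are toy values, and `confusion_matrix` and `row_purity` are hypothetical helper names, not functions from any single-cell package.

```python
from collections import Counter


def confusion_matrix(labels_a, labels_b):
    """Count cells for each (label_a, label_b) combination."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both label lists must cover the same cells, in the same order")
    counts = Counter(zip(labels_a, labels_b))
    rows = sorted(set(labels_a))
    cols = sorted(set(labels_b))
    matrix = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, matrix


def row_purity(matrix):
    """Fraction of each row's cells that land in its best-matching column."""
    return [max(row) / sum(row) for row in matrix]


# Toy labels for 9 cells (made up for illustration).
per_sample = ["WT_0", "WT_0", "WT_0", "WT_1", "WT_1", "WT_1", "RES_0", "RES_0", "RES_1"]
integrated = ["0", "0", "0", "1", "1", "0", "0", "0", "2"]

rows, cols, matrix = confusion_matrix(per_sample, integrated)
purity = row_purity(matrix)

print("per-sample \\ integrated:", cols)
for name, row, p in zip(rows, matrix, purity):
    print(f"{name}: {row} purity={p:.2f}")
```

A per-sample cluster whose cells fall almost entirely into one integrated cluster (purity close to 1) was preserved by the integration. Two per-sample clusters from different conditions landing in the same integrated cluster, as WT_0 and RES_0 do in this toy example, is the pattern to cross-check against shared marker genes.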
I hope this helps.
Best,
António
Actually, I tried two methods—one with batch effect correction and one without—and the results were completely different. In the UMAP without batch effect correction, WT and RT are completely separated, whereas after batch correction, they partially overlap.