I have single-cell datasets comprising 5 samples from two different tissues (2 blood samples and 3 urothelial samples). After quality control, I have 41070 cells in total.
I tried integrating these datasets with CCA, MNN, Harmony, and BBKNN in turn, and then performed dimensionality reduction (PCA) and clustering, followed by cell type annotation. When I checked the results, 500-700 blood cells (about 5-6%) were mistakenly annotated as epithelium, regardless of which of these integration methods I used. I checked these cells by clustering each sample on its own and confirmed that there is no epithelium in the blood samples.
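To make the check above concrete, here is a minimal sketch of how the fraction of blood cells carrying an epithelial label could be tallied after integration. The records, the label names, and the helper `label_fractions` are all hypothetical and only illustrate the idea; real values would come from the cell metadata of the integrated object.

```python
from collections import Counter

# Toy per-cell metadata: (tissue of origin, annotated cell type).
# These records are made up for illustration only.
cells = [
    ("blood", "T cell"),
    ("blood", "monocyte"),
    ("blood", "epithelium"),  # suspicious: epithelial label on a blood cell
    ("blood", "B cell"),
    ("urothelial", "epithelium"),
    ("urothelial", "epithelium"),
    ("urothelial", "fibroblast"),
]


def label_fractions(records, tissue):
    """Return {cell_type: fraction} among the cells from one tissue."""
    counts = Counter(label for t, label in records if t == tissue)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


blood_fractions = label_fractions(cells, "blood")
# Fraction of blood cells that ended up annotated as epithelium.
print(blood_fractions.get("epithelium", 0.0))
```

Running the same tally once per integration method would show whether the mislabelled fraction really stays constant across methods.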
Is it a good idea to integrate datasets from different tissues? If so, how can I do it better to get a cleaner integrated dataset?