I have scRNA-seq data from 8 different donors. I followed the Seurat scRNA-seq integration vignette (https://satijalab.org/seurat/articles/integration_introduction.html) to integrate the data. The vignette uses the standard normalization method, NormalizeData().
I've read that it might be better to normalize with the scran package, using quickCluster(), computeSumFactors() and scuttle::logNormCounts(), in order to account for cell-type-specific biases.
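For reference, the scran/scuttle workflow I mean looks roughly like the sketch below. It uses simulated counts from scuttle::mockSCE() as a stand-in for a single donor (an assumption, only to keep the example self-contained), and min.size = 50 is just an illustrative value, not a recommendation.

```r
library(scran)    # quickCluster(), computeSumFactors()
library(scuttle)  # logNormCounts(), mockSCE()

set.seed(100)

# Simulated stand-in for one donor (assumption: the real input would be a
# SingleCellExperiment holding the raw counts of that donor).
sce <- mockSCE(ncells = 1000, ngenes = 2000)

# Pre-cluster the cells so that size factors are estimated among similar cells.
# min.size is the minimum cluster size discussed in this question.
clusters <- quickCluster(sce, min.size = 50)

# Pooling-based (deconvolution) size factors, computed within the pre-clusters.
sce <- computeSumFactors(sce, clusters = clusters)

# Log-transformed normalized expression values based on those size factors.
sce <- logNormCounts(sce)
```

With real data, this would presumably be run separately for each donor before integration, which is where the concern about differing cell numbers comes in.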
My question is whether it is okay to use the scran/scuttle normalization method before data integration. Or does the cell-type-specific normalization introduce even more biases, considering that the 8 samples have different cell numbers, ranging from 1000 to 8000? Different total cell numbers could result in quite different clusterings, for example through the definition of the minimum cluster size. A minimum cluster size of 50 could allow rare cell populations to form their own cluster for cluster-specific normalization in the 8000-cell sample, but not in the 1000-cell sample.
And what if I have quite small batches of only 50-100 cells? I guess the clustering approach using scran/scuttle wouldn't make sense there?
Is it better to use NormalizeData() in general if you want to integrate scRNAseq data?