Hi all,
I have a Seurat object made up of several samples from different conditions. Some samples had many more cells loaded than others, and even after resequencing, these samples still have fewer reads per cell than the rest. We want to ensure that any clustering differences reflect biological differences and are not simply an artifact of lower reads per cell.
Is there a strategy to randomly select or remove reads so that all samples have about the same read depth, and then check that the clustering still shows the biologically relevant changes? We expect these conditions to produce biological differences; however, some conditions had many more cells loaded and therefore fewer reads per cell. We want to run our actual analysis on the UMAP with all read counts, but we want a QC step to confirm that the clustering differences seen in the UMAP visualization are not driven by read count per cell. Thank you for any and all help with this approach.
Attached are UMAPs of the four conditions, feature plots for QC metrics, and summaries of reads from the HTML file. This dataset has been pre-processed with the Seurat pipeline, filtered for QC, and run through Harmony to remove batch effects for cell line.
I see discussions about randomly downsampling and removing cells, but I don't believe this would address my question.
Thank you for any and all help!
Hi Bastien! Thank you for your help. These cells have already been normalized. I agree that cluster 6 is potentially a doublet cluster, and we are running doublet removal on the dataset.
However, cluster 0 is considerably larger in one particular condition, the one that had many more cells loaded and thus fewer reads per cell. We want to ensure that cluster 0 is not an artifact of this difference in read count and is indeed biologically distinct.
Is there a way to subset reads (or randomly delete reads), then regenerate the UMAP and check whether these differences in clustering still exist?
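For anyone wanting to try this kind of check directly on a count matrix, here is a minimal sketch of the idea, not a tested pipeline. It assumes you have exported a cells x genes matrix of integer UMI counts (so it downsamples UMIs as a proxy, not raw sequencing reads), and the function names and the `target` depth are hypothetical choices for illustration. Each cell with more than `target` counts is subsampled without replacement down to exactly `target`; shallower cells are left unchanged.

```python
import numpy as np

def downsample_cell(counts, target, rng):
    """Subsample one cell's integer counts, without replacement, to `target` total."""
    counts = np.asarray(counts, dtype=np.int64)
    if int(counts.sum()) <= target:
        # Cell is already at or below the target depth: leave it unchanged.
        return counts.copy()
    # Draw `target` molecules at random from the pool of observed molecules.
    return rng.multivariate_hypergeometric(counts, target)

def downsample_matrix(matrix, target, seed=0):
    """Apply downsample_cell to every row (cell) of a cells x genes matrix."""
    rng = np.random.default_rng(seed)
    matrix = np.asarray(matrix, dtype=np.int64)
    return np.stack([downsample_cell(row, target, rng) for row in matrix])

# Toy data (made up): three cells with total depths of 100, 10 and 1000 counts.
toy = np.array([
    [50, 30, 20, 0],
    [5, 3, 2, 0],
    [400, 0, 100, 500],
])
down = downsample_matrix(toy, target=10)
print(down.sum(axis=1))  # every cell now has at most 10 counts in total
```

The downsampled matrix could then go through the same normalization, clustering and UMAP steps as the full data, to see whether the cluster of interest is still enriched in the same condition.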
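NormalizeData already mitigates differences in read counts per cell: with its default LogNormalize method, it scales each cell's counts by that cell's total counts before log-transforming.
You can also try cellranger aggr, which can normalize sequencing depth across samples by subsampling reads: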
https://www.10xgenomics.com/support/software/cell-ranger/latest/analysis/running-pipelines/cr-3p-aggr#depth_normalization
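On the NormalizeData point, a small sketch may help show what library-size normalization does and does not correct. This is a re-implementation for illustration, assuming the documented LogNormalize behaviour (divide by the cell total, multiply by a scale factor of 10,000, then take log1p); it is not Seurat's own code, and the toy cells are made up.

```python
import numpy as np

def log_normalize(matrix, scale_factor=10_000):
    """Scale each cell (row) by its total counts, then apply log(1 + x).

    Assumes every cell has at least one count (no all-zero rows).
    """
    matrix = np.asarray(matrix, dtype=float)
    totals = matrix.sum(axis=1, keepdims=True)
    return np.log1p(matrix / totals * scale_factor)

# Two toy cells with identical composition but a 10x difference in depth.
deep = np.array([[500, 300, 200]])
shallow = np.array([[50, 30, 20]])

# After normalization the two cells have the same expression profile.
same = np.allclose(log_normalize(deep), log_normalize(shallow))
print(same)
```

Normalization removes the overall scaling difference, but it cannot restore genes that drop to zero counts in shallow cells, which is why a downsampling check can still be informative.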
Thank you for the insight, Bastien! I think cellranger aggr achieves this.