Hi everyone,
Apologies if this has been posted before but I can't find an answer and am struggling to understand the literature on what to do.
A bit of background: I have 4 samples that I have sent for 10x single cell sequencing. I have run the raw data through the Cell Ranger pipeline to generate the raw count matrices. I am now stuck on the normalization process.
I imported the 4 samples into Seurat as 4 separate Seurat objects, using the same filters for each (min.cells = 3 and min.features = 200), and then combined them into one Seurat object with the merge function. After QC checks I filtered the cells (nFeature_RNA > 200 & nFeature_RNA < 6500 & percent.MT < 20) and ran the NormalizeData function with normalization.method = "LogNormalize".
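For reference, here is my understanding of what "LogNormalize" does to each cell, written as a minimal plain-Python sketch rather than Seurat code: each count is divided by the cell's total counts, multiplied by a scale factor (10,000 is Seurat's documented default), and passed through log1p. The function name `log_normalize` and the toy counts are made up for illustration.

```python
import math

def log_normalize(counts_per_cell, scale_factor=10_000):
    """Library-size normalize each cell, then apply log1p.

    counts_per_cell: list of cells, each a list of raw counts (one per gene).
    Hypothetical helper for illustration; not part of Seurat.
    """
    normalized = []
    for cell in counts_per_cell:
        total = sum(cell)
        if total == 0:
            # A cell with no counts stays all zeros (avoids division by zero).
            normalized.append([0.0 for _ in cell])
            continue
        normalized.append(
            [math.log1p(count / total * scale_factor) for count in cell]
        )
    return normalized

# Two toy cells with identical composition but 10x different depth.
cells = [
    [10, 0, 90],
    [100, 0, 900],
]
norm = log_normalize(cells)
```

Because the two toy cells have the same proportions, they end up with the same normalized values. This step only removes differences in sequencing depth between cells; it does not correct for batch differences between samples.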
To assess the normalization, I checked the expression of GAPDH, which looked good. However, the overall distributions of expression values did not look good. I also tried the other normalization methods, such as CLR with margin = 1 or margin = 2, and the results still look badly normalized.
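In case it helps to see what the margin argument changes, below is a rough plain-Python sketch of a log1p-based centered log-ratio, which is how I understand Seurat's CLR to work; treat the exact formula as an assumption on my part. The names `clr` and `clr_matrix` are hypothetical. The matrix is assumed to have features as rows and cells as columns, so margin 1 normalizes each feature across cells and margin 2 normalizes each cell across features.

```python
import math

def clr(values):
    """log1p-based centered log-ratio of a single vector.

    The 'geometric mean' is exp(sum(log1p(v) for v > 0) / n), where n
    counts all entries, including zeros.
    """
    if not values:
        return []
    log_sum = sum(math.log1p(v) for v in values if v > 0)
    geo_mean = math.exp(log_sum / len(values))
    return [math.log1p(v / geo_mean) for v in values]

def clr_matrix(matrix, margin):
    """Apply clr() along one direction of a features x cells matrix.

    margin=1: across each feature (row); margin=2: across each cell (column).
    """
    if margin == 1:
        return [clr(list(row)) for row in matrix]
    if margin == 2:
        # Normalize each column, then transpose back to features x cells.
        columns = [clr(list(col)) for col in zip(*matrix)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("margin must be 1 or 2")

# Toy matrix: 2 features (rows) x 2 cells (columns).
toy = [
    [1, 3],
    [0, 8],
]
by_feature = clr_matrix(toy, margin=1)
by_cell = clr_matrix(toy, margin=2)
```

The two margins generally give different values for the same entry, so a distribution plot can look quite different depending on which one is used.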
I have checked many articles/vignettes and am thinking perhaps I should be processing and normalizing these samples separately and then combining them into one object afterwards? Would I use the "LogNormalize" method on each sample and then integrate them through the 'IntegrateData' function, or should I use SCTransform and then combine afterwards?
My overall goal is to explore each sample and the cells within it, and also to compare cell clusters between samples, which is why I would like to combine them into one object.
Thank you very much in advance!!
If by this you mean that the clustering looks really odd before integration, that's normal. If anything, I would just follow whatever the documentation says to do for the tool you choose (Seurat, Harmony, Scanorama, etc.).