16 months ago
sp
Hello,
I am new to scRNA-seq analysis. Everything here is done in R, and all functions were run with their default settings.
I am trying to use fastMNN only up to the data integration step, for a comparative study with other tools. I am following a very standard workflow that I put together from the example code in the documentation:
- Read in the matrix as a `sce`.
- Normalised it with `logNormCounts()`.
- Performed feature selection using `modelGeneVar()`.
- Selected the top n hvgs with `getTopHVGs()`.
- Performed PCA and UMAP on the sce, using `runPCA()` and `runUMAP()`. This info I believe is stored in "PCA" and "UMAP" of the sce.
- Visualised the UMAP using `plotReducedDim()`, which I believe to be the same as the likes of `plotUMAP()`, except `dimred` is a requirement (which I set to `"UMAP"`).
- Performed data integration using `fastMNN()` after subsetting using the chosen hvgs.
- Again ran PCA and UMAP on sce_integrated, except now `dimred` HAS to equal `"corrected"`. I don't understand this.
- Plotted the UMAP for sce_integrated for comparison with before integration, and again used `plotReducedDim()`. I was not sure if `dimred` should equal `"UMAP"` or `"corrected"`, since I believe the embeddings are stored in `"corrected"`, so shouldn't `"corrected"` be used for visualisation as well? However, when I plot `dimred="UMAP"`, the UMAP is different from the UMAP earlier, which means the embeddings get overwritten?
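For reference, the steps above look roughly like the sketch below. It is not my exact code: the data come from `scuttle::mockSCE()` and the `batch` column is a made-up label, both added only so that the example runs on its own. Everything else uses the defaults of the functions named in the list.

```r
library(scuttle)    # mockSCE(), logNormCounts()
library(scran)      # modelGeneVar(), getTopHVGs()
library(scater)     # runPCA(), runUMAP(), plotReducedDim()
library(batchelor)  # fastMNN()

set.seed(100)

# Stand-in data (assumption): a simulated SCE with a hypothetical batch label.
sce <- mockSCE(ncells = 200, ngenes = 2000)
sce$batch <- rep(c("A", "B"), each = 100)

# Normalisation, variance modelling and HVG selection.
sce  <- logNormCounts(sce)
dec  <- modelGeneVar(sce)
hvgs <- getTopHVGs(dec, n = 500)

# Before integration: PCA on the log-counts, then UMAP on the PCA.
sce <- runPCA(sce, ncomponents = 50, subset_row = hvgs)
sce <- runUMAP(sce, dimred = "PCA")
plotReducedDim(sce, dimred = "UMAP", colour_by = "batch")

# Integration: fastMNN() returns a NEW object, restricted to the chosen HVGs.
sce_integrated <- fastMNN(sce, batch = sce$batch, subset.row = hvgs)

# The returned object holds a "corrected" reduced dimension and a
# "reconstructed" assay, but no "logcounts" assay, which is presumably
# why a default runPCA() call does not work on it.
reducedDimNames(sce_integrated)
assayNames(sce_integrated)

# After integration: UMAP computed from the corrected low-dimensional values.
sce_integrated <- runUMAP(sce_integrated, dimred = "corrected")
plotReducedDim(sce_integrated, dimred = "UMAP", colour_by = "batch")
```

Note that in this sketch the "before" UMAP lives in `sce` and the "after" UMAP lives in `sce_integrated`, so the two plots come from two different objects.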
Summary of doubts:
- I don't understand why PCA and UMAP need to be run twice, before and after integration.
- Why is `dimred="corrected"` needed for runPCA after data integration? (Earlier, `sce <- runPCA(sce, ncomponents = 50)` worked.)
- For plotting UMAPs, should `dimred="corrected"` be used after data integration?
- Do UMAP embeddings get overwritten in sce if UMAP is run again after data integration?
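On the overwriting point, here is a small sketch of what I mean, again on simulated data from `scuttle::mockSCE()` (an assumption for illustration only). As far as I can tell, `runUMAP()` writes its result under the name given by its `name` argument, which defaults to `"UMAP"`, so a second run on the same object with the default name would replace the first result. The name `"UMAP_alt"` and the changed `n_neighbors` value are hypothetical choices of mine.

```r
library(scuttle)  # mockSCE(), logNormCounts()
library(scater)   # runPCA(), runUMAP()

set.seed(1)

# Stand-in data (assumption): a small simulated SCE.
sce <- logNormCounts(mockSCE(ncells = 100, ngenes = 500))
sce <- runPCA(sce, ncomponents = 10)

# First run: stored under the default name "UMAP".
sce <- runUMAP(sce, dimred = "PCA")
before <- reducedDim(sce, "UMAP")

# Second run: stored under a different (hypothetical) name, so the
# first embedding is left in place.
sce <- runUMAP(sce, dimred = "PCA", n_neighbors = 30, name = "UMAP_alt")

reducedDimNames(sce)                        # lists every stored embedding
identical(before, reducedDim(sce, "UMAP"))  # checks the first one is untouched
```

Either embedding can then be plotted by passing its name to `plotReducedDim()` via `dimred`.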
Thanks for all your help~ Sorry about the long post, I wanted to provide as much context as possible.
Thank you so much! This clarified everything for me and I was wondering if there was a way to not overwrite the embeddings so your answer was super helpful!!