How to choose the number of PCs for a large Seurat Object
16 months ago

I am analyzing a single-cell dataset with ~160k cells. The samples were integrated using reference-based SCTransform. For the downstream analysis, do I have to choose a higher number of PCs and dims?

Can anyone help me understand how to select npcs for RunPCA and dims for RunUMAP and FindNeighbors? Is that something we need to change according to the dataset size?

Thanks

Parvathi.

16 months ago

Generate a plot with Seurat::ElbowPlot, which ranks the PCs by their standard deviation, i.e. by how much variance each one explains. Then pick the PC at the "elbow", the point where additional PCs explain little more variance than the ones before them.

The Seurat guided clustering vignette has an example, and other vignettes probably do as well.
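
Here is a rough sketch of how the elbow approach could look in code. `choose_n_pcs` and `run_downstream` are hypothetical helpers I made up for illustration, not part of Seurat, and the `min_drop` threshold is an arbitrary assumption you would need to tune. The idea is to compute more PCs than you expect to need, look at the elbow plot, and then pass the same `dims` to both FindNeighbors and RunUMAP. Treat the numeric cutoff as a sanity check on the plot, not a replacement for it.

```r
# Hypothetical helper (not a Seurat function): given per-PC standard
# deviations, return the PC just after the last "large" drop in the
# percent of variance explained. Returns NA if no drop exceeds min_drop.
choose_n_pcs <- function(stdev, min_drop = 0.1) {
  pct <- stdev^2 / sum(stdev^2) * 100  # percent variance among the computed PCs
  drops <- -diff(pct)                  # drops[i] = pct[i] - pct[i + 1]
  big <- which(drops > min_drop)
  if (length(big) == 0) return(NA_integer_)
  max(big) + 1L
}

# Sketch of the downstream steps, assuming `seu` is an integrated Seurat
# object that is ready for PCA. max_pcs = 50 is only a starting guess.
run_downstream <- function(seu, max_pcs = 50) {
  seu <- Seurat::RunPCA(seu, npcs = max_pcs, verbose = FALSE)
  print(Seurat::ElbowPlot(seu, ndims = max_pcs))  # always inspect the plot
  n <- choose_n_pcs(Seurat::Stdev(seu, reduction = "pca"))
  if (is.na(n)) stop("No clear elbow found; choose dims from the plot.")
  seu <- Seurat::FindNeighbors(seu, dims = 1:n)
  seu <- Seurat::RunUMAP(seu, dims = 1:n)
  seu
}
```

If the curve is still dropping at the last PC you computed, rerun RunPCA with a larger npcs before settling on a cutoff.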

