How to run Seurat::RunUMAP() with reduction="pca" computed on a subset of features?
11 weeks ago
mk ▴ 230

How can I run UMAP on a Seurat object and specify the features (genes) to use for the initial PCA reduction?

I'm looking for something like what the following [hypothetical] syntax would achieve:

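library(Seurat)
data("pbmc_small")
pbmc_small
# Run UMAP on the first 5 PCs, with the PCA restricted to a feature subset
# (hypothetical syntax: `reduction.features` is NOT a real RunUMAP argument)
pbmc_small <- RunUMAP(object = pbmc_small, dims = 1:5, reduction = "pca", reduction.features = c("CD79A", "MS4A1", "TCL1A", "HLA-DQA1", "HLA-DQB1", ...))

Where the [non-existent] option "reduction.features" specifies the features to use for the PCA behind RunUMAP(..., reduction = "pca").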


RunUMAP has an argument features where you can specify the features to run PCA on.


I think this actually bypasses dimensionality reduction and runs UMAP directly on the features, which is what I'm trying to avoid. From the RunUMAP documentation for features:

If set, run UMAP on this subset of features (instead of running on a set of reduced dimensions). Not set (NULL) by default; dims must be NULL to run on features


If you are planning to select a relatively small number of features (say, fewer than 50), you could skip the PCA and let UMAP work with them directly.
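A minimal sketch of that direct approach, assuming the bundled pbmc_small example object and the gene list from the question. The reduction name "umap_features" and key "UMAPfeat_" are illustrative choices, not Seurat defaults, and dims is left at its default of NULL as the documentation requires when features is set:

```r
library(Seurat)

data("pbmc_small")

# Genes from the question; keep only those actually present in the object.
feats <- intersect(
  c("CD79A", "MS4A1", "TCL1A", "HLA-DQA1", "HLA-DQB1"),
  rownames(pbmc_small)
)

# Run UMAP directly on the selected features (no PCA step).
# `dims` is not supplied, so it stays NULL.
pbmc_small <- RunUMAP(
  object = pbmc_small,
  features = feats,
  reduction.name = "umap_features",  # illustrative name
  reduction.key = "UMAPfeat_",       # illustrative key
  verbose = FALSE
)

DimPlot(pbmc_small, reduction = "umap_features")
```

Storing the result under its own reduction name leaves any existing "umap" reduction in the object untouched.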

Separately, PCA already down-weights uninformative features, because the leading principal components are the directions that maximize the variance. Going with PCA would be less biased than hand-picking the features.


Yeah, for these 'special' UMAP embeddings the feature subset is more like 1-2k genes, so I think it's still best to run dimensionality reduction prior to embedding.

Agreed that dimensionality reduction usually obviates marker selection for embedding. The issue is that I know beforehand that some markers in the original object actually code for certain confounding factors that are of biological interest for only a subset of the visualizations I need to generate.

In the end I just created a whole new Seurat object containing only the features I want for that particular plot. Not ideal, IMHO, since I'm going to end up with a bunch of Seurat objects all containing the same, or nearly the same, cells.
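One possible way to avoid the extra objects, sketched here under assumptions rather than offered as a confirmed answer from this thread: RunPCA() accepts a features argument and a reduction.name, and RunUMAP() can then be pointed at that named reduction, so several PCA/UMAP pairs can live in a single object. The feature set below (the first 50 genes of pbmc_small) is only a stand-in, and the names "pca_subset" and "umap_subset" are illustrative.

```r
library(Seurat)

data("pbmc_small")

# Stand-in feature subset for illustration; substitute your own gene list.
feats <- head(rownames(pbmc_small), 50)

# RunPCA only uses features present in the scaled data, so scale them first.
# Note: this overwrites the existing scaled data for the assay.
pbmc_small <- ScaleData(pbmc_small, features = feats, verbose = FALSE)

# PCA restricted to the chosen features, stored under its own name so the
# original "pca" reduction is kept.
pbmc_small <- RunPCA(
  object = pbmc_small,
  features = feats,
  npcs = 10,
  reduction.name = "pca_subset",   # illustrative name
  reduction.key = "PCsub_",        # illustrative key
  verbose = FALSE
)

# UMAP on the first 5 PCs of the subset-specific PCA.
pbmc_small <- RunUMAP(
  object = pbmc_small,
  reduction = "pca_subset",
  dims = 1:5,
  reduction.name = "umap_subset",  # illustrative name
  reduction.key = "UMAPsub_",      # illustrative key
  verbose = FALSE
)

DimPlot(pbmc_small, reduction = "umap_subset")
```

Repeating the RunPCA/RunUMAP pair with a different feature list and different reduction names would give one embedding per visualization, all on the same cells.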