Question: Using Seurat FindClusters on the tSNE embeddings
9 weeks ago, themangoscholar wrote:

My objective is to find clusters using the Leiden algorithm on the 2D tSNE embeddings of the PBMC RNA-seq data. I am doing the following:

seed <- 22
sce1 <- RunPCA(object = sce, features = sce@assays$RNA@var.features, seed.use = seed)
sce1 <- RunTSNE(object = sce1, features = sce@assays$RNA@var.features, seed.use = seed)
nn1 <- FindNeighbors(sce1, reduction = "tsne", dims = 1:2, k.param = 50, compute.SNN = TRUE, nn.method = "rann", annoy.metric = "euclidean", graph.name = "CCA_snn")
clust_obj <- FindClusters(nn1, resolution = 0.5, algorithm = 4, method = "igraph", graph.name = "CCA_snn", group.singletons = TRUE)

Note: sce is a Seurat object containing the PBMC dataset.

What I am unable to understand is whether FindClusters is working on the reduced dimensions (i.e. the 2D cell embeddings) or on the whole dataset, since the size of clust_obj is the same as sce. Also, the number of clusters is far higher than what scanpy gives using the 2D tSNE projection of the same data. From what I understood, the Seurat documentation shows clustering on the whole assay and then provides a 2D PCA/tSNE/UMAP projection. So I am not sure whether the clustering step here is working on the whole data or on the 2D projection.

Please help me understand whether I am doing this correctly. If I have made any mistakes, kindly help me correct them.

seurat rna-seq tsne R • 211 views
Modified 9 weeks ago by igor (12k) • written 9 weeks ago by themangoscholar

A reduced dimension still covers the whole dataset, in the sense that every cell has coordinates in it. Typically, though, these reduced dimensions are computed from a selection of genes (the highly variable ones), and the reduced dimension (usually PCA) is then used for graph-based clustering.

Written 9 weeks ago by ATpoint (46k)

So what I have done will find clusters on the reduced dims themselves, and not use the whole assay?

Written 9 weeks ago by themangoscholar

Yes, and I strongly suggest you follow the Seurat clustering and/or integration vignette exactly.

Modified 9 weeks ago • written 9 weeks ago by ATpoint (46k)
9 weeks ago, igor (12k, United States) wrote:

> the size of clust_obj is same as sce

The input and output of all those functions is a Seurat object. Most of its size is due to the expression data, which does not change.
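To make this concrete, here is a minimal sketch of how one might inspect what the graph step consumes and what it adds to the object. It assumes the small demo object `pbmc_small` that ships with Seurat as a stand-in for `sce`; the graph names `tsne_nn` and `tsne_snn` and the value of `k.param` are illustrative choices, not values from the question. Two graph names are supplied because, depending on the Seurat version, a single name may store only the NN graph rather than the SNN graph (worth checking in `?FindNeighbors`).

```r
library(Seurat)

# pbmc_small is a tiny demo object bundled with Seurat; it stands in for `sce`.
data("pbmc_small")
obj <- pbmc_small

# Build NN and SNN graphs from the two tSNE coordinates only.
# Graph names are hypothetical; two names are given so both graphs are stored.
obj2 <- FindNeighbors(obj, reduction = "tsne", dims = 1:2, k.param = 20,
                      graph.name = c("tsne_nn", "tsne_snn"), verbose = FALSE)

# The expression data is untouched: same genes x cells before and after.
dim(obj)
dim(obj2)

# What the graph step actually used as input: a cells x 2 matrix.
dim(Embeddings(obj2, reduction = "tsne"))

# What it added: cells x cells graphs stored next to the assay.
dim(obj2[["tsne_snn"]])

# Rough object sizes before and after adding the graphs.
format(object.size(obj), units = "Kb")
format(object.size(obj2), units = "Kb")
```

The dimensions of the object are those of the expression matrix, so they say nothing about which reduction the clustering was run on.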

> the number of clusters are way more than scanpy provides

Is it using the same resolution?

> seurat documentations shows clustering on the whole assay

FindNeighbors uses PCA by default (and the PCA is computed from the variable genes by default). FindClusters then uses the resulting SNN graph.
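As a rough illustration of the difference between the default route and the one in the question, the sketch below clusters the same cells twice: once on a graph built from the leading principal components, and once on a graph built from the 2D tSNE coordinates. It again assumes the bundled `pbmc_small` demo object; `dims = 1:10`, the graph names, and the use of the default Louvain algorithm (instead of `algorithm = 4`, which needs the Python `leidenalg` module) are assumptions made to keep the example self-contained.

```r
library(Seurat)

data("pbmc_small")

# Default route: neighbour graph on the leading PCs, then clustering on the SNN graph.
pca_obj <- FindNeighbors(pbmc_small, reduction = "pca", dims = 1:10, verbose = FALSE)
pca_obj <- FindClusters(pca_obj, resolution = 0.5, verbose = FALSE)
pca_clusters <- Idents(pca_obj)

# Route from the question: neighbour graph on the two tSNE coordinates.
# Graph names are hypothetical; the SNN graph is passed on explicitly.
tsne_obj <- FindNeighbors(pbmc_small, reduction = "tsne", dims = 1:2,
                          graph.name = c("tsne_nn", "tsne_snn"), verbose = FALSE)
tsne_obj <- FindClusters(tsne_obj, graph.name = "tsne_snn",
                         resolution = 0.5, verbose = FALSE)
tsne_clusters <- Idents(tsne_obj)

# Cross-tabulate the two labelings to see how much they agree.
table(pca = pca_clusters, tsne = tsne_clusters)
```

Because the two graphs are built from different inputs, the same resolution value need not yield the same number of clusters.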

You should check the documentation for all the functions you are using, which provides all of this information.

Modified 9 weeks ago • written 9 weeks ago by igor (12k)

Yes, it's using the same resolution as scanpy.

Written 8 weeks ago by themangoscholar