Pull number of cells in cluster from seurat object
4
2
4.6 years ago
cook.675 ▴ 220

I've been working through some of the vignettes on the Satija Lab site. Specifically the control vs. treated data set found here:

https://satijalab.org/seurat/v3.1/immune_alignment.html

I don't know how to pull the number of cells in any given cluster, and I'm not sure where this data is stored. To use the vignette example, the last line to compute the clusters is

immune.combined <- FindClusters(immune.combined, resolution = 0.5)

so it must be stored in immune.combined somewhere? How is it accessed?

Thanks!

RNA-Seq seurat • 27k views
ADD COMMENT
0

Thanks, this has been really helpful; things are starting to come into focus.

How would I break this down and create a table showing the number of cells per cluster by stimulation type?

I tried some of the following:

table(Immune.combined@meta.data$stim.seurat_clusters)
table(immune.combined@meta.data$stim$seurat_clusters)

Neither worked, obviously.

Even getting the counts individually, so I could put them in a table myself, would help.

I want to ask: "How many cells in each cluster are from vehicle and how many are from the treatment group?"

ADD REPLY
0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

ADD REPLY
9
4.6 years ago

I usually assign the content of the metadata to a separate object, which tends to make subsequent manipulations easier.

I am a fan of the data.table package, but I'm sure there are solutions using the tidyverse/dplyr, too.

library(data.table)
library(magrittr)

## extract meta data
md <- immune.combined@meta.data %>% as.data.table
# the resulting md object has one "row" per cell

## count the number of cells per unique combinations of "Sample" and "seurat_clusters"
md[, .N, by = c("Sample", "seurat_clusters")]
    Sample seurat_clusters    N
 1:     KO               3  492
 2:     KO               1  786
 3:     KO               0 2031
 4:     KO               4  291
 5:     KO               7   95
 6:     KO               2  445
 7:     KO               5  140
 8:     KO               8   79
 9:     KO               9   50
10:     KO               6  130
11:     WT               1  996
12:     WT               0 3281
13:     WT               6  192
14:     WT               5  196
15:     WT               2  806
16:     WT               3  572
17:     WT               4  301
18:     WT               8   60
19:     WT               9   36
20:     WT               7   69

## with additional casting after the counting
md[, .N, by = c("Sample", "seurat_clusters")] %>% dcast(., Sample ~ seurat_clusters, value.var = "N")
   Sample    0   1   2   3   4   5   6  7  8  9
1:     KO 2031 786 445 492 291 140 130 95 79 50
2:     WT 3281 996 806 572 301 196 192 69 60 36
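
For readers who would rather not load extra packages, the same cross-tabulation can be sketched in base R with table(). The data frame below is an invented stand-in for immune.combined@meta.data: the column names Sample and seurat_clusters mirror the output above, but the values are made up purely for illustration.

```r
# Invented stand-in for immune.combined@meta.data: one row per cell.
# Only the column names mirror the example above; the values are made up.
md <- data.frame(
  Sample = c("KO", "KO", "KO", "WT", "WT", "WT", "WT"),
  seurat_clusters = factor(c(0, 0, 1, 0, 1, 1, 2)),
  row.names = paste0("cell", 1:7)
)

# Two-way contingency table: rows are samples, columns are clusters
counts <- table(md$Sample, md$seurat_clusters)
print(counts)

# Same table with row and column totals added
print(addmargins(counts))
```

On a real object, something like table(immune.combined$stim, immune.combined$seurat_clusters) should give the same kind of table, assuming the condition column is called stim as in the vignette.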
ADD COMMENT
0

Thanks so much for this, I really appreciate it!

ADD REPLY
0

Is there also a way to know which cell has gone into which cluster, apart from just the number? Is that data stored anywhere in Seurat?

ADD REPLY
0

Not sure I get what you mean. The cluster number specifies the cluster membership.
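
To make this concrete: each row of the metadata corresponds to one cell, and the row names are the cell barcodes, so the cluster column already records which cell went where. A minimal sketch in base R, using an invented data frame (made-up barcodes and cluster labels) in place of immune.combined@meta.data:

```r
# Invented stand-in for immune.combined@meta.data.
# Row names play the role of cell barcodes; all values are made up.
md <- data.frame(
  seurat_clusters = factor(c(0, 1, 0, 2, 1)),
  row.names = c("AAAC", "AAAG", "AACT", "AAGT", "ACGT")
)

# Named vector mapping each barcode to its cluster
membership <- setNames(as.character(md$seurat_clusters), rownames(md))
print(membership)

# List with one character vector of barcodes per cluster
cells_by_cluster <- split(rownames(md), md$seurat_clusters)
print(cells_by_cluster[["0"]])
```

On a real Seurat object, Idents(immune.combined) should return a comparable named factor, and WhichCells(immune.combined, idents = "0") the barcodes of one cluster, if I recall the v3 interface correctly.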

ADD REPLY
0

Awesome and simple.

ADD REPLY
0

This is amazing, thanks!

ADD REPLY
1
4.6 years ago

It is stored in the metadata. You can check with head(immune.combined@meta.data)

If you want to plot,

ggplot(immune.combined@meta.data, aes(V8, fill=V5))+geom_bar(stat="count")

Replace V8 with whichever column holds the cluster assignments (seurat_clusters in the examples in this thread). fill=V5 is optional; use it only if you want to subdivide each cluster's bar by another metadata column.

ADD COMMENT
1
4.6 years ago
Haci ▴ 680

The cluster information is stored in the @meta.data slot, in a column named something like res.0.5, since you used a resolution of 0.5 in your FindClusters() call. If you re-run FindClusters() with another resolution parameter, an additional column will be added.

To get the numbers in each cluster you can do something like table(seurat_object@meta.data$cluster_column).
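
As a small illustration of the table() approach, here is a sketch on an invented vector of cluster labels (standing in for the cluster column of the metadata; the values are made up). It also shows how to turn the counts into fractions and pull out a single cluster.

```r
# Invented cluster assignments standing in for a metadata column
clusters <- factor(c(0, 0, 0, 1, 1, 2, 2, 2, 2, 3))

# Number of cells per cluster
n_cells <- table(clusters)
print(n_cells)

# Fraction of all cells that fall in each cluster
frac <- prop.table(n_cells)
print(round(frac, 2))

# Count for a single cluster of interest, looked up by its label
n_in_cluster_2 <- n_cells[["2"]]
print(n_in_cluster_2)
```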

The above is true for Seurat v2. I would be surprised if the general idea changed in v3, although the column name may differ (the other answers in this thread use seurat_clusters).

ADD COMMENT
0
2.2 years ago

Late to the party, but I'd just do:

sum(immune.combined$seurat_clusters == "whateverClusterYouAreInterestedIn")

It doesn't require much work. By default Seurat (at least in v3; I assume it's still the same in v4) stores the cluster ID numbers in $seurat_clusters

ADD COMMENT
0

Thank you so much for this! Having the same problem as OP, and this was simply phrased.

ADD REPLY
