Annotation of Single Cell Clusters
13 months ago
Shelle ▴ 20

I am new to single-cell analysis and would appreciate some guidance.

I want to cluster a single-cell dataset, for example PBMCs. How can I annotate the resulting clusters without prior biological knowledge? I came across scanpy, but it did not help here because it seems to require knowing the marker genes for each cell type beforehand.

Any help would be appreciated.

RNA-Seq Single cell Cluster tool • 1.1k views
5 days ago

You can also use the SingleR R package. Here is an example:

# pbmc is my Seurat object
# wpath is my working directory

library(Seurat)
library(SingleR)
library(celldex)

## Convert the Seurat object to a SingleCellExperiment object
Idents(pbmc) <- pbmc$seurat_clusters
sce.pbmc <- as.SingleCellExperiment(pbmc)

wpath="D:/Poland/PHD/spatial/Second_set/Genevariable3000_analysis/avg/correlation/MT15/Final_7000"
## Load references from celldex (after loading I save them to an RDS file, so next time they load faster;
## on later runs, skip the download and saveRDS() lines and use only readRDS())
blue <- celldex::BlueprintEncodeData()
saveRDS(blue, paste(wpath,"singleR.BlueprintEncodeData.rds",sep="/"))
blue <- readRDS(paste(wpath,"singleR.BlueprintEncodeData.rds",sep="/"))

hpca <- celldex::HumanPrimaryCellAtlasData()
saveRDS(hpca, paste(wpath,"singleR.HumanPrimaryCellAtlasData.rds",sep="/"))
hpca <- readRDS(paste(wpath,"singleR.HumanPrimaryCellAtlasData.rds",sep="/"))

monaco <- celldex::MonacoImmuneData()
saveRDS(monaco, paste(wpath,"singleR.MonacoImmuneData.rds",sep="/"))
monaco <- readRDS(paste(wpath,"singleR.MonacoImmuneData.rds",sep="/"))

dice <- celldex::DatabaseImmuneCellExpressionData()
saveRDS(dice, paste(wpath,"singleR.DatabaseImmuneCellExpressionData.rds",sep="/"))
dice <- readRDS(paste(wpath,"singleR.DatabaseImmuneCellExpressionData.rds",sep="/"))

MTG <- readRDS(paste(wpath,"middle_temporal_gyrus_human_MTG_coarse-grained_annotation.rds",sep="/"))
MTG_fine <- readRDS(paste(wpath,"middle_temporal_gyrus_human_MTG_fine-grained_annotation.rds",sep="/"))

cerebral_cortex_Po <- readRDS(paste(wpath,"cerebral_cortex_human_pollen.rds",sep="/"))
cerebral_cortex_Li <- readRDS(paste(wpath,"colorectal_tumor_hu",sep="/"))



## Check what kind of annotations are available
table(blue$label.main)
table(blue$label.fine)
table(hpca$label.main)
table(hpca$label.fine)
table(MTG$feature_matrix_learned)
table(MTG_fine$cell_type_information)


refs <- list(BP=blue, HPCA=hpca)
fine.labels <- list(blue$label.fine, hpca$label.fine)
main.labels <- list(blue$label.main, hpca$label.main)

## Annotate CELLS (main label)
pred.list <- SingleR(test=sce.pbmc, ref=refs, assay.type.test=1, labels=main.labels)
pred.blue <- SingleR(test=sce.pbmc, ref=blue, assay.type.test=1, labels=blue$label.main)
pred.hesc <- SingleR(test=sce.pbmc, ref = hpca, assay.type.test=1, labels=hpca$label.main)
pred.monaco <- SingleR(test=sce.pbmc, ref=monaco, assay.type.test=1, labels=monaco$label.main)
pred.dice <- SingleR(test=sce.pbmc, ref=dice, assay.type.test=1, labels=dice$label.main)

## Annotate CELLS (fine-grained label)
pred.list.fine <- SingleR(test=sce.pbmc, ref=refs, assay.type.test=1, labels=fine.labels)
pred.blue2 <- SingleR(test=sce.pbmc, ref=blue, assay.type.test=1, labels=blue$label.fine)
pred.hesc2 <- SingleR(test=sce.pbmc, ref=hpca, assay.type.test=1, labels=hpca$label.fine)
pred.monaco2 <- SingleR(test=sce.pbmc, ref=monaco, assay.type.test=1, labels=monaco$label.fine)
pred.dice2 <- SingleR(test=sce.pbmc, ref=dice, assay.type.test=1, labels=dice$label.fine)
pred.MTG <- SingleR(test=sce.pbmc, ref=MTG, assay.type.test=1, labels=MTG$cell_type_information)

## Add annotations to the Seurat object
## (the two commented-out lines keep one column per reference instead of the combined result)
#celllabels1 <- cbind(celllabels=pred.blue$labels, celllabels2=pred.hesc$labels, celllabels3=pred.monaco$labels, celllabels4=pred.dice$labels)
#celllabels2 <- cbind(celllabelsb=pred.blue2$labels, celllabels2b=pred.hesc2$labels, celllabels3b=pred.monaco2$labels, celllabels4b=pred.dice2$labels)
celllabels1 <- cbind(celllabels=pred.list$labels)        # main labels
celllabels2 <- cbind(celllabelsb=pred.list.fine$labels)  # fine-grained labels

rownames(celllabels1) <- rownames(celllabels2) <- colnames(pbmc)
pbmc <- AddMetaData(pbmc, as.data.frame(cbind(celllabels1,celllabels2)))

## Cross-tabulate cell labels against clusters
table(pbmc$celllabels, pbmc$seurat_clusters)
table(pbmc$celllabelsb, pbmc$seurat_clusters)

## Evaluate annotation based on scores
plotScoreHeatmap(pred.list.fine, clusters=pbmc$seurat_clusters)

par(mfrow=c(1,1), mar=c(2.5,7,1,.5))
graphics::boxplot(pred.monaco$scores, xlab="prediction.score", ylab="", main="Prediction scores for cell labels", boxwex=.8,
                  cex=.5, cex.main=.8, cex.lab=.8, cex.axis=.6, las=1, horizontal=TRUE, varwidth=TRUE, col=rainbow(36))
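
Since the question mentions staying in Python: the label-versus-cluster table above can be collapsed to one label per cluster by majority vote. Below is a minimal sketch, assuming the per-cell labels and cluster IDs have been exported as two plain lists of equal length. The function name `majority_label_per_cluster` and the toy data are made up for illustration and are not part of SingleR or Seurat.

```python
from collections import Counter, defaultdict


def majority_label_per_cluster(clusters, labels):
    """Return {cluster: (most common label, fraction of cells carrying it)}."""
    if len(clusters) != len(labels):
        raise ValueError("clusters and labels must have one entry per cell")
    counts = defaultdict(Counter)
    for cluster, label in zip(clusters, labels):
        counts[cluster][label] += 1
    result = {}
    for cluster, counter in counts.items():
        # Ties are broken by whichever label was seen first in the cluster.
        label, n = counter.most_common(1)[0]
        result[cluster] = (label, n / sum(counter.values()))
    return result


# Toy input (hypothetical): one entry per cell.
clusters = ["0", "0", "0", "1", "1", "2", "2", "2", "2"]
labels = ["T cells", "T cells", "NK cells", "B cells", "B cells",
          "Monocytes", "Monocytes", "Monocytes", "T cells"]

summary = majority_label_per_cluster(clusters, labels)
for cluster in sorted(summary):
    label, fraction = summary[cluster]
    print(f"cluster {cluster}: {label} ({fraction:.0%} of cells)")
```

A low winning fraction suggests that a cluster mixes several cell types or that the reference does not cover it well, so it is worth checking those clusters by hand.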
13 months ago

You need to know cell-type marker genes to annotate cell types. As far as I know, no tool can do this without being given marker genes for each cell type.


Is there a list of universal marker genes that I can use? For example, with 7 clusters in one dataset, how can I tell which cluster corresponds to which cell type?


You have to provide markers based on the biology. Biology and analysis go hand in hand. If you have PBMCs, you can get markers from consortium datasets like Haemopedia or ImmGen. Check the SingleR package, which includes these datasets for hematopoietic cells and can assign identities to your clusters.
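
To make the marker-based idea concrete, here is a minimal Python sketch that assigns each cluster the cell type whose marker set overlaps most with the cluster's top genes. It is an illustration under stated assumptions, not the method of any particular tool: the function `annotate_clusters`, the `min_overlap` threshold, and the toy cluster gene lists are hypothetical, and the marker sets are a small illustrative selection of commonly used PBMC markers rather than a curated reference.

```python
def annotate_clusters(cluster_top_genes, marker_sets, min_overlap=2):
    """Label each cluster with the cell type whose markers it overlaps most.

    cluster_top_genes: {cluster: list of top differentially expressed genes}
    marker_sets:       {cell type: non-empty list of marker genes}
    Clusters with fewer than `min_overlap` matching markers stay "unassigned".
    """
    annotations = {}
    for cluster, genes in cluster_top_genes.items():
        top = set(genes)
        # Fraction of each cell type's markers found among the top genes.
        scores = {cell_type: len(top & set(markers)) / len(markers)
                  for cell_type, markers in marker_sets.items()}
        best = max(scores, key=scores.get)
        hits = len(top & set(marker_sets[best]))
        annotations[cluster] = best if hits >= min_overlap else "unassigned"
    return annotations


# Illustrative marker sets (assumption: a small, non-exhaustive selection).
marker_sets = {
    "T cells": ["CD3D", "CD3E", "IL7R"],
    "B cells": ["MS4A1", "CD79A", "CD79B"],
    "CD14+ monocytes": ["CD14", "LYZ", "S100A8"],
    "NK cells": ["NKG7", "GNLY", "KLRD1"],
}

# Toy top genes per cluster (hypothetical values).
cluster_top_genes = {
    "0": ["IL7R", "CD3D", "CD3E", "LTB"],
    "1": ["MS4A1", "CD79A", "CD74"],
    "2": ["LYZ", "S100A8", "CD14", "FCN1"],
    "3": ["MALAT1", "ACTB"],
}

annotations = annotate_clusters(cluster_top_genes, marker_sets)
for cluster in sorted(annotations):
    print(cluster, annotations[cluster])
```

In practice the top genes per cluster would come from a differential expression step, and the marker sets from a resource such as the ones named above.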


Thanks for the information. Is there any other tool besides SingleR that does not depend on Seurat? Both of these tools are R-based, and I would like to stay close to Python because of my downstream analysis.

8 months ago

Take a look at the CellKb database. Through the web interface, you can enter a ranked gene list and get matching cell types previously published in the literature. The query gene list is matched to other cell types using a variation of the rank-biased overlap algorithm. CellKb contains several marker gene sets for immune cell types, including those from SingleR, MSigDB and Human Protein Atlas.

Disclaimer: CellKb is a commercial product and I am involved in its development. Academic users can sign up for a free license and search limited single-cell datasets and all bulk datasets, including SingleR, MSigDB and Human Protein Atlas.
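
For readers unfamiliar with rank-biased overlap, the sketch below shows the standard truncated form of the measure in Python. It is not CellKb's implementation, whose variation is not described in this thread. The function name `rbo_truncated`, the persistence value `p=0.9`, and the toy gene lists are assumptions for illustration, and each ranking is assumed to contain no duplicate genes.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9, depth=None):
    """Rank-biased overlap truncated at `depth`, without extrapolation.

    Computes (1 - p) * sum over d = 1..depth of
    p**(d - 1) * |A[:d] & B[:d]| / d,
    so agreement near the top of the two rankings counts the most.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    max_depth = min(len(ranking_a), len(ranking_b))
    depth = max_depth if depth is None else min(depth, max_depth)
    score = 0.0
    for d in range(1, depth + 1):
        overlap = len(set(ranking_a[:d]) & set(ranking_b[:d]))
        score += p ** (d - 1) * overlap / d
    return (1 - p) * score


# Toy query and reference rankings (hypothetical values).
query = ["CD3D", "CD3E", "IL7R", "CCR7"]
references = {
    "T cells": ["CD3E", "CD3D", "IL7R", "LTB"],
    "B cells": ["MS4A1", "CD79A", "CD79B", "CD74"],
}

scores = {cell_type: rbo_truncated(query, genes)
          for cell_type, genes in references.items()}
best_match = max(scores, key=scores.get)
print(best_match, round(scores[best_match], 3))
```

Because the sum is truncated, two identical lists of length k score 1 - p**k rather than 1, so scores are only comparable between lists evaluated at the same depth.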

5 days ago

Please have a look at scCATCH (https://github.com/ZJUFanLab/scCATCH) and SCSA (https://github.com/bioinfo-ibms-pumc/SCSA).
