Problem annotating single-cell RNA-seq data with SingleR
4 weeks ago

The public breast cancer dataset I am using already has broad annotations in its metadata (epithelial, B cell, T cell, etc.), but I want to annotate subtypes: luminal versus basal epithelial cells, the different T cell and B cell subtypes, and so on.

When I annotated the breast cancer scRNA-seq data with SingleR (using the Human Primary Cell Atlas as the reference), some clusters were labelled as hepatocytes, chondrocytes, and similar cell types. We would not expect to find these cell types in breast cancer tissue.

1. What did I do wrong here?

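library(Seurat)
library(celldex)
library(SingleR)
# Broad, multi-tissue reference
hpca.se <- celldex::HumanPrimaryCellAtlasData()
# Log-normalized expression matrix (in Seurat v5, use layer = "data" instead of slot = "data")
merged_seurat_objects_filtered_harmony_integrated_SingleR <- GetAssayData(merged_seurat_objects_filtered_harmony_integrated, slot = "data")
# Per-cell annotation with the broad (main) reference labels
hesc <- SingleR(test = merged_seurat_objects_filtered_harmony_integrated_SingleR, ref = hpca.se, labels = hpca.se$label.main)
# Store the predicted label of each cell in the Seurat metadata
merged_seurat_objects_filtered_harmony_integrated$celltype <- hesc$labels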

2. Can I use marker genes from the differentially expressed genes (DEGs) of each cluster to annotate the cell subtypes?

3. What if no marker gene is present in the DEGs? How will I annotate?

4. Aren't the DEGs affected by cancer state?

5. Is there a tutorial or paper that covers these questions about scRNA-seq annotation?

singleR scRNA-seq • 287 views
29 days ago

A few things.

1. What did I do wrong here?

SingleR will annotate each cell with whatever it's most similar to in the reference dataset, unless it's pretty uniformly not scoring well for anything in the reference dataset. If there are cells scoring better for the hepatocyte signatures in the reference than other cell types, they're going to be annotated as such whether they're actually those cells or not. If you know there are cell types in the reference not present in your query, then just remove those cell types from the reference before running SingleR. The Human Primary Cell Atlas dataset is very broad, so you may be able to find a more specific/appropriate reference dataset for your tissue.
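As a rough sketch of the "remove those cell types from the reference" suggestion: the helpers below (`keep_ref_samples` and `annotate_without` are hypothetical names, not part of SingleR) drop reference samples by their main label and then run SingleR on what is left. The excluded labels are the two mentioned in the question; check `table(hpca.se$label.main)` for the exact spellings in your copy of the reference. Passing `label_col = "label.fine"` is one way to get more granular labels, assuming the fine labels of this reference are appropriate for your tissue.

```r
# Hypothetical helper: TRUE for reference samples whose label is NOT excluded.
keep_ref_samples <- function(ref_labels, exclude) {
  !(ref_labels %in% exclude)
}

# Hypothetical wrapper: filter the reference by main label, then annotate.
# `ref` is assumed to be a SummarizedExperiment such as hpca.se, with
# `label.main` and `label.fine` columns in its colData.
annotate_without <- function(test_expr, ref, exclude,
                             label_col = "label.main") {
  main_labels <- SummarizedExperiment::colData(ref)[["label.main"]]
  ref_kept <- ref[, keep_ref_samples(main_labels, exclude)]
  SingleR::SingleR(
    test = test_expr,
    ref = ref_kept,
    labels = SummarizedExperiment::colData(ref_kept)[[label_col]]
  )
}

# Usage with the objects from the question (not run here):
# pred <- annotate_without(
#   merged_seurat_objects_filtered_harmony_integrated_SingleR,
#   hpca.se,
#   exclude = c("Hepatocytes", "Chondrocytes")
# )
# pruned.labels is NA where SingleR considers the assignment low quality
# table(pred$pruned.labels, useNA = "ifany")
```

Looking at `pruned.labels` rather than `labels` is a cheap way to spot cells that did not score convincingly for anything in the reference.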

2. Can I use marker genes from the differentially expressed genes (DEGs) of each cluster to annotate the cell subtypes?

Sure. It can be a pain depending on how granular you want to get and how well-distinguished the cell types in your tissue are.
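One way to make the marker-based route less painful is to tabulate how many known markers of each candidate subtype show up in each cluster's DEG list. The sketch below assumes a DEG table with `cluster` and `gene` columns, as returned by `Seurat::FindAllMarkers()`. The function name `marker_overlap` is hypothetical, and the marker sets are illustrative only: curate your own from the breast literature before relying on them.

```r
# Illustrative marker sets only (assumption); replace with curated markers.
known_markers <- list(
  luminal_epithelial = c("KRT8", "KRT18", "EPCAM"),
  basal_epithelial   = c("KRT5", "KRT14", "KRT17")
)

# Hypothetical helper: for each cluster, count how many known markers of each
# candidate cell type appear among that cluster's DEGs.
# Returns a clusters x cell types matrix of counts.
marker_overlap <- function(deg, markers) {
  genes_by_cluster <- split(as.character(deg$gene), deg$cluster)
  counts <- matrix(
    0L,
    nrow = length(genes_by_cluster),
    ncol = length(markers),
    dimnames = list(names(genes_by_cluster), names(markers))
  )
  for (cell_type in names(markers)) {
    for (cluster in names(genes_by_cluster)) {
      counts[cluster, cell_type] <-
        sum(markers[[cell_type]] %in% genes_by_cluster[[cluster]])
    }
  }
  counts
}

# Usage with a Seurat object (not run here):
# deg <- Seurat::FindAllMarkers(
#   merged_seurat_objects_filtered_harmony_integrated, only.pos = TRUE
# )
# marker_overlap(deg, known_markers)
```

A cluster with several hits for one subtype and none for the others is a reasonable candidate for that label; a cluster with no hits anywhere is a sign that the marker list or the clustering resolution needs another look.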

3. What if no marker gene is present in the DEGs? How will I annotate?

This is largely a question of granularity. If you can't distinguish between clusters, then cluster more coarsely. Clustering is a forced process that doesn't necessarily align with concrete or biologically distinct cell states. You can always subset and re-cluster a given cell type of interest to try to home in on more subtle subtypes if you want. Start broad and go from there.
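The subset-and-re-cluster step could look roughly like the sketch below. `recluster_subset` is a hypothetical helper, and the `dims` and `resolution` defaults are placeholders rather than recommendations. It assumes a normalized Seurat object with a metadata column holding the broad labels (for example the `celltype` column created in the question). It uses plain PCA; since the object in the question was integrated with Harmony, you would probably want to re-run the batch correction on the subset as well.

```r
# Hypothetical helper: pull out one broad cell type and cluster it again.
# Assumes `obj` is a normalized Seurat object and `label_col` is the name of
# a metadata column containing broad cell type labels.
recluster_subset <- function(obj, label_col, keep_labels,
                             dims = 1:20, resolution = 0.5) {
  cells <- colnames(obj)[obj@meta.data[[label_col]] %in% keep_labels]
  sub <- subset(obj, cells = cells)
  # Recompute variable genes and PCA on the subset so that the new clusters
  # reflect variation within this cell type, not between cell types.
  sub <- Seurat::FindVariableFeatures(sub)
  sub <- Seurat::ScaleData(sub)
  sub <- Seurat::RunPCA(sub, npcs = max(dims))
  # For multi-sample data, re-run batch correction here (e.g. Harmony) and
  # pass its reduction to FindNeighbors() instead of the default PCA.
  sub <- Seurat::FindNeighbors(sub, dims = dims)
  Seurat::FindClusters(sub, resolution = resolution)
}

# Usage (not run here):
# epi <- recluster_subset(
#   merged_seurat_objects_filtered_harmony_integrated,
#   label_col = "celltype",
#   keep_labels = "Epithelial_cells"
# )
```

Raising `resolution` gives more, smaller clusters; if the resulting clusters have no distinguishing DEGs, that is the cue to lower it again.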

4. Aren't the DEGs affected by cancer state?

Yes. Cancer cells are not typical cells and have their own transcriptional programs. Often, you will see people annotate cancer sub-populations as "celltype-like" in an effort to tie them to a proposed or known cell of origin. See figure 2 of this paper for an example.
