SingleR for integrated scRNA-seq analysis
19 months ago
mvis1231 ▴ 90

Hi, I have data from a specific cell type from mice fed a certain diet. I integrated four datasets, measured at four different time points, for an integrated single-cell RNA-seq analysis. I have been following the Seurat vignette: https://satijalab.org/seurat/v3.1/immune_alignment.html.

I am using SingleR to identify the cell type of each cluster, and I am wondering whether I need to set DefaultAssay to "RNA" or "integrated". I tried both, and they gave slightly different cell type assignments.

Should I keep DefaultAssay as "RNA" or "integrated"?

Any thoughts and advice are greatly appreciated. Thank you.

scRNA-seq singler integrated Assay • 2.0k views
19 months ago
igor 12k

The assay should be RNA, since SingleR expects expression values.

Just to add: that means log2 expression values.

Good point. I was just concerned with "RNA" vs "integrated".

Sorry for the silly question, but what do you mean by log2 expression values? Should I take log2 of the data slot under the RNA assay, or of the counts slot under the RNA assay?

But we create the integrated assay when we are trying to align cell states shared across datasets. If we use the 'RNA' assay, what is the point of integration?

Integration aims to create a common clustering landscape in which all cells are embedded. This makes it easy to compare cells that, without integration, would cluster by batch effects, by biological differences such as cell cycle, or by other sample-specific differences. To my knowledge, the integrated values must not be used with differential analysis methods, since the process creates dependencies and notably changes the magnitude and direction of expression differences. For the same reason, they probably should not be used with classifiers such as SingleR, which operates on Spearman correlation and would therefore suffer from those changes in the magnitude and direction of the values.
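
To make the Spearman point concrete, here is a minimal, language-agnostic sketch in plain Python. It is not SingleR code. The six expression values, the reference profile, and the gene-specific `adjustment` vector are all made up for illustration, and the adjustment only stands in for the kind of per-gene change an integration step might introduce.

```python
import math

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical log-normalized expression of six genes in one cell,
# and a hypothetical reference profile for the same genes.
cell = [0.0, 1.2, 3.5, 0.4, 2.2, 5.1]
reference = [0.1, 0.9, 4.0, 1.5, 2.0, 3.0]

baseline = spearman(cell, reference)

# Undoing the log is a monotonic transform, so every gene keeps its rank.
unlogged = [math.expm1(v) for v in cell]
after_monotonic = spearman(unlogged, reference)

# A made-up gene-specific shift reorders the genes within the cell.
adjustment = [2.0, -1.0, -3.0, 1.5, 0.0, -4.0]
shifted = [v + a for v, a in zip(cell, adjustment)]
after_shift = spearman(shifted, reference)

print(baseline, after_monotonic, after_shift)
```

The first two correlations are identical because only the ranks matter, while the third differs because the per-gene shifts change the ordering of genes within the cell.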

To think of it another way: If you are trying to identify your cells based on independent datasets, it probably makes more sense to use the stable normalized values. The integrated values will change depending on the datasets you are integrating, which means your cell types will also change and that does not seem reasonable.

Under the RNA assay there are two slots, data and counts. Which one should be used as SingleR input? I guess the counts slot holds the raw counts and the data slot holds the log-normalized data, right?

SingleR expects the normalized counts, so you want the data slot.
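
For readers unsure how the two slots relate, the sketch below mimics in plain Python what the data slot would hold, assuming it was filled by Seurat's default `LogNormalize` method with a scale factor of 10,000 (the thread does not say which normalization was used). The function name `log_normalize` and the raw counts are hypothetical. It also shows that these values are natural-log based, and that converting them to log2 is just a division by a constant.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Divide each count by the cell total, multiply by scale_factor, apply log1p."""
    total = sum(counts)
    return [math.log1p(c / total * scale_factor) for c in counts]

# Hypothetical raw UMI counts for five genes in a single cell (the "counts" slot).
raw_counts = [0, 3, 10, 1, 6]

# What the "data" slot would contain under the stated assumption.
normalized = log_normalize(raw_counts)

# Natural-log values become log2 values when divided by ln(2).
as_log2 = [v / math.log(2) for v in normalized]

for count, ln_value, log2_value in zip(raw_counts, normalized, as_log2):
    print(count, round(ln_value, 3), round(log2_value, 3))
```

Because the change of base is a constant rescaling, it leaves the ordering of genes within a cell untouched.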