Question

Seurat3: RNA vs SCT assays for DotPlot

4.7 years ago

akh22 ▴ 110

Hi,

I am a bit confused about the use of the RNA vs SCT assays for DGE analysis, and I am wondering whether anybody who uses Seurat could shed some light on it. I've been performing the Seurat3 integration method with SCTransform by simply following their vignette. According to some discussions and the vignette, the Seurat team indicated that the RNA assay (rather than the integrated or SCT assays) should be used for the DotPlot and FindMarkers functions when comparing and exploring gene expression differences across cell types. But the RNA assay has raw count data, while the SCT assay has scaled and normalized data. It seems to me that the numbers in the SCT assay are more appropriate for comparing gene expression among cell types. Am I missing something?

Thanks.

scRNAseq Seurat3 • 10k views

updated 4.7 years ago by jared.andrews07 ★ 16k • written 4.7 years ago by akh22 ▴ 110

score 5 · Answer 1 · 2019-08-26

You can also normalize and scale the data in the RNA assay, so it is not limited to raw counts. There are numerous resources on this, but Aaron Lun explains quite well why the original log-normalized values should be used for DE and for visualizing expression here:

For gene-based procedures like differential expression (DE) analyses or gene network construction, it is desirable to use the original log-expression values or counts. The corrected values are only used to obtain cell-level results such as clusters or trajectories. Batch effects are handled explicitly using blocking terms or via a meta-analysis across batches. We do not use the corrected values directly in gene-based analyses, for various reasons:

It is usually inappropriate to perform DE analyses on batch-corrected values, due to the failure to model the uncertainty of the correction. This usually results in loss of type I error control, i.e., more false positives than expected.

The correction does not preserve the mean-variance relationship. Applications of common DE methods like edgeR or limma are unlikely to be valid.

Batch correction may (correctly) remove biological differences between batches in the course of mapping all cells onto a common coordinate system. Returning to the uncorrected expression values provides an opportunity for detecting such differences if they are of interest. Conversely, if the batch correction made a mistake, the use of the uncorrected expression values provides an important sanity check.
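As a rough illustration of the "meta-analysis across batches" idea in the quoted passage, the sketch below compares two cell types within each batch using uncorrected log-expression values, then averages the within-batch differences. It is only a toy: the numbers, the batch and cell-type labels, and both function names are made up for this example, and a real analysis would use a proper DE method with a blocking term rather than a plain average.

```python
from statistics import mean

# Hypothetical uncorrected log-expression values for ONE gene,
# keyed by batch and then by cell type (made-up numbers).
log_expr = {
    "batch1": {"T": [2.0, 2.4, 2.2], "B": [0.5, 0.7, 0.6]},
    "batch2": {"T": [3.0, 3.2, 3.1], "B": [1.6, 1.4, 1.5]},
}

def per_batch_differences(data, group_a, group_b):
    """Difference in mean log-expression (a - b), computed inside each batch."""
    return {
        batch: mean(groups[group_a]) - mean(groups[group_b])
        for batch, groups in data.items()
    }

def combined_difference(data, group_a, group_b):
    """Unweighted average of the within-batch differences (a toy meta-analysis)."""
    return mean(list(per_batch_differences(data, group_a, group_b).values()))

diffs = per_batch_differences(log_expr, "T", "B")
overall = combined_difference(log_expr, "T", "B")
print(diffs)
print(overall)
```

In this toy data every value in batch2 is shifted upward relative to batch1, yet the T vs B difference is the same inside each batch. Because the comparison is made within batches, the shift cancels without altering the expression values themselves.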

In addition, the normalized values in the SCT and integrated assays don't necessarily correspond to per-gene expression values anyway; the scale.data slot of each contains residuals instead.
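To make the distinction between log-normalized expression and residuals concrete, here is a minimal Python sketch. The first function mirrors the arithmetic of Seurat's default LogNormalize method (counts divided by the cell's total, multiplied by a scale factor of 10,000, then log1p). The second is a deliberately simplified Pearson residual that assumes a Poisson variance; SCTransform actually fits a regularized negative binomial regression, so treat this only as an illustration of why residuals are not expression values. All counts, expected values, and function names are hypothetical.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Per-cell log-normalization: log1p(count / total_counts * scale_factor)."""
    total = sum(counts)
    return [math.log1p(c / total * scale_factor) for c in counts]

def poisson_pearson_residual(count, expected):
    """Simplified Pearson residual: (observed - expected) / sqrt(expected).

    Assumes Poisson variance (variance == expected); this is NOT the
    regularized negative binomial model that SCTransform fits.
    """
    return (count - expected) / math.sqrt(expected)

# Hypothetical raw counts for four genes in one cell.
cell_counts = [0, 3, 10, 87]
log_norm = log_normalize(cell_counts)

# Hypothetical model-expected counts for the same four genes.
expected_counts = [0.5, 3.0, 6.0, 90.0]
residuals = [
    poisson_pearson_residual(c, e)
    for c, e in zip(cell_counts, expected_counts)
]

print(log_norm)
print(residuals)
```

The log-normalized values are never negative and increase with the count, so they can be read as expression levels. The residuals are centered around zero and can be negative even for a highly expressed gene, which is why they are awkward to interpret in a DotPlot or as fold changes.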