Why do highly expressed genes vary more than lowly expressed genes?
21 months ago
nhaus ▴ 310

Hello,

I am currently reading about single-cell sequencing and am having a hard time understanding something that I think is very basic.

Everywhere I look, I read that scRNA-seq count tables are heteroskedastic (highly expressed genes have higher variance), which makes analyzing them somewhat challenging. The goal is therefore to transform the counts so that the variance is "stable", i.e. does not depend on the mean. A common plot to visualize this is shown below: [image]

My first question is: why do highly expressed genes actually vary more than lowly expressed genes?

Another thing that confuses me is that if you plot the gene expression of two cells (or patients) against each other, you will notice that the lowly expressed genes vary more, and that the higher the expression of a gene, the lower its variance gets. To me, this seems to directly contradict the heteroskedasticity statement above. I have included a picture of this as well: [image]

I would really appreciate it if somebody could clear up this confusion for me.

Cheers!

single cell statistics • 707 views
21 months ago
Lior Pachter ▴ 700

The relationship between the variance and the mean is due to both biological stochasticity and (technical) sampling artifacts. We discuss these issues in a recent preprint (see Figure 2): https://www.biorxiv.org/content/10.1101/2022.06.11.495771v1 and comment on the implications for variance stabilization.
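
To make the two effects concrete, here is a minimal simulation sketch. It is not taken from the preprint; it assumes a simple gamma-Poisson (negative binomial) model in which a gamma-distributed rate stands in for biological variability and Poisson sampling stands in for the technical part, so that the variance is roughly `mean + phi * mean**2`. The gene means, the overdispersion `phi`, and the number of cells are made-up values chosen only for illustration. Under these assumptions the absolute variance grows with the mean, while the relative spread (the coefficient of variation, which is roughly what you see as scatter on a log-log plot of two cells) shrinks, which may be why the two plots in the question appear to disagree.

```python
import numpy as np

rng = np.random.default_rng(0)

n_cells = 20_000                           # hypothetical number of cells
phi = 0.1                                  # assumed overdispersion ("biological" variability)
means = np.array([0.1, 1.0, 10.0, 100.0])  # hypothetical per-gene mean expression

# "Biological" layer: each cell gets its own underlying rate per gene.
# Gamma(shape=1/phi, scale=mean*phi) has expectation `mean` and variance phi*mean**2.
rates = rng.gamma(shape=1.0 / phi, scale=means * phi, size=(n_cells, means.size))

# "Technical" layer: observed counts are Poisson samples of those rates.
counts = rng.poisson(rates)

mean = counts.mean(axis=0)
var = counts.var(axis=0)
cv = np.sqrt(var) / mean                   # relative spread (coefficient of variation)

for m, v, c in zip(mean, var, cv):
    print(f"mean={m:8.2f}  variance={v:10.2f}  CV={c:5.2f}")
```

In this toy model the variance column increases from gene to gene while the CV column decreases toward `sqrt(phi)`, so "highly expressed genes vary more" (absolute scale) and "lowly expressed genes vary more" (relative or log scale) can both hold at once.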

Seurat also offers a plot of residual variance against the geometric mean.
