Gene co-expression in single-cell RNA-seq

2.7 years ago

randalljellis ▴ 90

I am using single-cell RNA-seq data from the Allen Institute, and I want to look at gene co-expression in different cell populations. They provide raw UMI counts, so I'm wondering what normalization method to use (e.g., CPM, TPM, VST) to look at these correlations. Any rationale/justification is appreciated.

single-cell scRNA-seq RNA-seq correlation co-expression • 1.2k views

updated 2.7 years ago by rpolicastro 13k • written 2.7 years ago by randalljellis ▴ 90

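Using sctransform is not a bad idea; it normalizes for sequencing depth and applies a variance-stabilizing transformation (VST).

You can also simply take the log of the CPMs (with a pseudocount, e.g. log(CPM + 1)), but there are some problems with that approach. The sctransform paper reports that a single per-cell scaling factor does not fully remove the dependence on sequencing depth, particularly for highly expressed genes.

I wouldn't use TPMs: UMI counts generally shouldn't exhibit gene-length bias (i.e. longer genes = more counts), so the length correction in TPM isn't needed.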

2.7 years ago by dsull ★ 5.8k
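
For readers who want to see what the log-CPM option looks like in practice, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not code from the thread: the gene names and counts are made up, the helper names `log_cpm` and `pearson` are hypothetical, and a pseudocount of 1 (via `log1p`) is assumed. The `scale` argument is 1,000,000 for CPM; a smaller value such as 10,000 is a common alternative for UMI data.

```python
import math

def log_cpm(counts, scale=1_000_000):
    """Per-cell log-normalization: log1p(count / cell_total * scale).

    counts: dict mapping gene name -> list of UMI counts, one entry per cell.
    """
    genes = list(counts)
    n_cells = len(counts[genes[0]])
    # Total UMIs per cell, used as the sequencing-depth estimate.
    totals = [sum(counts[g][c] for g in genes) for c in range(n_cells)]
    return {
        g: [math.log1p(counts[g][c] / totals[c] * scale) for c in range(n_cells)]
        for g in genes
    }

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up UMI counts for four cells (hypothetical genes).
raw = {
    "GeneA": [4, 0, 10, 2],
    "GeneB": [2, 1, 6, 1],
    "GeneC": [94, 49, 184, 47],
}

norm = log_cpm(raw)
r_ab = pearson(norm["GeneA"], norm["GeneB"])
```

No length term appears anywhere in the calculation, which is the practical difference from TPM.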

The authors of Seurat now recommend against using SCTransform-normalized counts for anything other than integration and dimension reduction. For other analyses they recommend NormalizeData, which by default log-normalizes the counts.

2.7 years ago by rpolicastro 13k

Thank you. I have another question. If I want to compare correlations between populations (e.g., compare the correlation of Gene1 and Gene2 in Population 1 with the correlation of Gene1 and Gene2 in Population 2), should I normalize each population separately or together?

2.7 years ago by randalljellis ▴ 90

If they're cell populations from the same sequenced sample, I'd normalize them together (see https://github.com/ChristophH/sctransform/issues/55).

2.7 years ago by dsull ★ 5.8k
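
The "normalize together, then compare per population" workflow can be sketched as follows. This is only an illustration: the population labels, gene names, and counts are invented, the helper names are hypothetical, and simple log-CPM stands in for whichever normalization is actually chosen.

```python
import math
from collections import defaultdict

def log_cpm_cell(cell, scale=1_000_000):
    """Log-normalize one cell: log1p(count / cell_total * scale)."""
    total = sum(cell.values())
    return {gene: math.log1p(n / total * scale) for gene, n in cell.items()}

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented toy data: (population label, {gene: UMI count}) for each cell.
cells = [
    ("P1", {"Gene1": 5, "Gene2": 4, "Other": 91}),
    ("P1", {"Gene1": 1, "Gene2": 2, "Other": 97}),
    ("P1", {"Gene1": 9, "Gene2": 7, "Other": 184}),
    ("P2", {"Gene1": 6, "Gene2": 1, "Other": 93}),
    ("P2", {"Gene1": 2, "Gene2": 5, "Other": 143}),
    ("P2", {"Gene1": 8, "Gene2": 0, "Other": 192}),
]

# Step 1: normalize all cells in one pass, ignoring population labels.
normalized = [(pop, log_cpm_cell(cell)) for pop, cell in cells]

# Step 2: group the normalized values by population.
by_pop = defaultdict(lambda: ([], []))
for pop, expr in normalized:
    by_pop[pop][0].append(expr["Gene1"])
    by_pop[pop][1].append(expr["Gene2"])

# Step 3: compute the Gene1/Gene2 correlation within each population.
corr = {pop: pearson(g1, g2) for pop, (g1, g2) in by_pop.items()}
```

With a purely per-cell method like log-CPM, each cell's normalized values depend only on that cell, so normalizing the populations together or separately gives the same result. The choice matters for methods such as sctransform that fit model parameters across cells.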
