Question

How to calculate cell type frequency between two groups in single cell data

1

23 days ago

Sara ▴ 30

Hi all!

I am working with a 10x single-cell RNA-seq dataset that I processed with Seurat. I want to show the cell type frequencies in two conditions (patients vs. controls). How can I calculate the frequency of each cell type and compare it between my patients and controls? My metadata has a sample_id column and a condition column.

I used the following code, and I think it gives me the number of cells of each type per group:

table(harmonized_seurat@meta.data$cell_type, harmonized_seurat@meta.data$condition)

# (just an example of what my data looks like):
                     patient     Control
Astrocytes              157       111
Endothelial cells       12        16
Excitatory neurons      24        41
Inhibitory neurons      75        90
Microglia               40        15
Neurons                 45        39

Is it the right way to get the number of cells per different conditions?

I think this code gives me the number of cells per group. How do I calculate the fraction of each cell type within each group? And which statistical test should I use to check whether the difference in each fraction is significant between the two groups (e.g. whether the fraction of astrocytes differs significantly between patients and controls)?

Many thanks in advance!

Seurat single-cell sc-RNA cell-type • 447 views

ADD COMMENT • link updated 15 days ago by Francesco ▴ 20 • written 23 days ago by Sara ▴ 30

0

For me, the scProportionTest library does the trick.

ADD REPLY • link 15 days ago by Francesco ▴ 20

score 0 · Answer 1 · 2024-05-24

0

23 days ago

Bastien Hervé 5.4k

I assume sample_id identifies your replicates within the patient and control groups.

You can do it manually by normalizing the number of cells in each sample to the same total, or you can compute the proportion of each cell type within each sample:

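# margin = 2 normalizes within each sample (column), so each sample's proportions sum to 1
prop.table(table(Cluster=harmonized_seurat$cell_type, Batch=harmonized_seurat$sample_id), margin = 2)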

You can then run a t-test for each cell type, comparing the per-sample proportions between patients and controls, to highlight significant differences in cell proportion.

You can also have a look at published methods like scCODA.
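
The per-sample proportion and t-test approach described above can be sketched roughly as follows. This is only an illustration in base R: the data frame `meta` is a simulated, hypothetical stand-in for `harmonized_seurat@meta.data`, and the sample names, cell counts, and the Benjamini-Hochberg correction are assumptions added for the example, not something taken from the real dataset.

```r
# Toy metadata standing in for harmonized_seurat@meta.data (hypothetical values).
set.seed(1)
samples <- data.frame(
  sample_id = paste0("S", 1:6),
  condition = rep(c("patient", "control"), each = 3)
)
cell_types <- c("Astrocytes", "Microglia", "Neurons")
meta <- do.call(rbind, lapply(seq_len(nrow(samples)), function(i) {
  # Patients are simulated with relatively more microglia than controls.
  p <- if (samples$condition[i] == "patient") c(0.5, 0.3, 0.2) else c(0.5, 0.1, 0.4)
  data.frame(
    sample_id = samples$sample_id[i],
    condition = samples$condition[i],
    cell_type = sample(cell_types, 300, replace = TRUE, prob = p)
  )
}))

# Per-sample proportions: margin = 2 makes every sample (column) sum to 1.
counts <- table(cell_type = meta$cell_type, sample_id = meta$sample_id)
props <- prop.table(counts, margin = 2)

# Long format, with the condition attached to every sample.
prop_df <- as.data.frame(props, responseName = "proportion")
prop_df$condition <- samples$condition[match(prop_df$sample_id, samples$sample_id)]

# One Welch t-test per cell type, then adjust for multiple testing.
p_values <- sapply(split(prop_df, prop_df$cell_type), function(d) {
  t.test(proportion ~ condition, data = d)$p.value
})
p_adjusted <- p.adjust(p_values, method = "BH")
print(round(cbind(p_values, p_adjusted), 4))
```

With real data, `meta` would be replaced by the Seurat metadata, and each sample would contribute one proportion per cell type, so the test is run on samples rather than on individual cells. With only a few samples per group, a t-test has limited power, which is one reason to also consider dedicated methods such as scCODA.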

ADD COMMENT • link 23 days ago by Bastien Hervé 5.4k

0

Thank you for your comment, and sorry if this question is very basic. How can I normalize the number of cells?

If I understand correctly, the idea is, for example, to take the total number of neurons in the controls and divide it by the total number of cells in the controls, and then do the same for the patients. Please correct me if I am wrong, as I am not yet sure how to do this normalization.

Many thanks!

ADD REPLY • link 23 days ago by Sara ▴ 30

0

That is correct
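
As a small worked example of that calculation, the counts from the example table in the question can be divided by the total number of cells in each condition. This is a sketch in base R; the matrix below is typed in by hand from the posted table rather than taken from the Seurat object.

```r
# Counts copied from the example table in the question.
counts <- matrix(
  c(157, 111,
     12,  16,
     24,  41,
     75,  90,
     40,  15,
     45,  39),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    cell_type = c("Astrocytes", "Endothelial cells", "Excitatory neurons",
                  "Inhibitory neurons", "Microglia", "Neurons"),
    condition = c("patient", "Control")
  )
)

# Divide each count by the total number of cells in its condition (column).
fractions <- prop.table(counts, margin = 2)
print(round(fractions, 3))
```

For instance, astrocytes make up 157 / 353 of the patient cells and 111 / 312 of the control cells. These pooled fractions are useful for a summary plot, but they hide the variation between samples, so a statistical comparison would still need the proportions computed per sample_id.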

ADD REPLY • link 22 days ago by Bastien Hervé 5.4k