Question

How to minimize the effect of a larger number of cells in the control group (single-cell RNA-seq)


4 months ago

Sandra ▴ 10

Hello,

I'm facing a challenge with unequal cell counts between the control and disease groups in a single-cell RNA sequencing experiment. Specifically, the control group has more cells than the disease group.

Here are the cell counts for each subgroup:

HC1: 2059
HC2: 468
HC3: 3333
Disease1: 428
Disease2: 1610
Disease3: 1189
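
To put a number on the imbalance, here is a minimal sketch that tallies the counts above per group. The variable names and the helper `group_of` are only illustrative, and it assumes the group label is the sample name without its trailing digits.

```python
# Cell counts per sample, copied from the list above.
cell_counts = {
    "HC1": 2059, "HC2": 468, "HC3": 3333,
    "Disease1": 428, "Disease2": 1610, "Disease3": 1189,
}

def group_of(sample):
    # Assumption: the group label is the sample name minus its trailing digits.
    return sample.rstrip("0123456789")

# Sum the cells of all samples belonging to the same group.
group_totals = {}
for sample, n_cells in cell_counts.items():
    group = group_of(sample)
    group_totals[group] = group_totals.get(group, 0) + n_cells

ratio = group_totals["HC"] / group_totals["Disease"]
print(group_totals)
print(f"HC:Disease ratio = {ratio:.2f}")
```

This gives 5860 HC cells against 3227 disease cells, roughly a 1.8-fold difference, with large variation between individual samples as well.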

My concern is that having more cells in the control group will bias the clustering: the clusters would be driven mainly by HC subpopulations, producing a higher number of clusters and splitting the disease cells across more of them. That would make differential expression (DE) analysis difficult, because there would be fewer cells per cluster.

To address this imbalance, I'm considering subsampling the control group to 1500 cells. However, I'm concerned about potential biases in clustering and downstream analyses.
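
As an illustration only, the subsampling could be sketched as below. It assumes that "1500 cells" means 1500 HC cells in total, and that the cells are drawn from each HC sample in proportion to its size so that no donor is dropped. Both are assumptions, not a settled design. The function names are hypothetical, and integer indices stand in for real cell barcodes.

```python
import random

hc_counts = {"HC1": 2059, "HC2": 468, "HC3": 3333}
target_total = 1500  # assumption: total HC cells to keep across all three samples

def proportional_allocation(counts, total):
    """Split `total` across samples in proportion to their sizes,
    using largest-remainder rounding so the parts sum exactly to `total`."""
    grand_total = sum(counts.values())
    exact = {s: total * n / grand_total for s, n in counts.items()}
    alloc = {s: int(x) for s, x in exact.items()}
    leftover = total - sum(alloc.values())
    # Hand the remaining cells to the samples with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

def subsample(counts, alloc, seed=0):
    """Draw cell indices without replacement for each sample (seeded for reproducibility)."""
    rng = random.Random(seed)
    return {s: sorted(rng.sample(range(counts[s]), alloc[s])) for s in counts}

allocation = proportional_allocation(hc_counts, target_total)
kept = subsample(hc_counts, allocation, seed=0)
print(allocation)
print({s: len(idx) for s, idx in kept.items()})
```

Changing `seed` yields a different random subsample of the same size, which would allow the downstream clustering and DE steps to be repeated on several subsamples and compared.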

What methods or considerations should be employed to evaluate the impact of subsampling on downstream analyses, particularly in terms of differential expression analysis with fewer cells per cluster?

What precautions should I take to ensure that subsampling the control group maintains an accurate representation of biological variability within healthy samples?

Thank you in advance!

scRNA-Seq • 643 views
