question on how differential expression is affected if there is differential abundance between celltypes
9 weeks ago


I was curious how unequal group sizes (e.g. healthy, 3 subjects; disease, 6 subjects) would influence differential expression and differential abundance of cell types.

E.g. in this case, given that the disease group has twice as many patients as the healthy group:

  1. How would this affect the differential expression of cell type A in healthy vs disease?
  2. Does differential abundance account for this inequality in sample size? And should we account for it in the DE analysis from question 1?
differential-abundance scRNA-seq Differential-expression
9 weeks ago

Short answer - no, there is no problem, and you don't need to do anything.

Longer answer - we use replicates in RNA-seq to measure two things: the mean read count of each condition and the variation in read counts.

The more replicates you have, the more accurately you will know both the mean read counts and the variation. Thus you will have better estimates of the mean and dispersion in disease than in healthy. However, while the estimates for healthy are less accurate, they are still unbiased. You would have more power to reject the null hypothesis if you had 6 of each, but 3 of one and 6 of the other still gives you more power than having 3 of each.
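
If you want to convince yourself, here is a rough simulation sketch of the argument above. It is a toy, not what DESeq2 or edgeR actually do: it assumes normally distributed log-expression values for a single gene, a plain pooled-variance t-test, and a made-up effect size of 2 with standard deviation 1. The function names (`pooled_t`, `simulate`) and all parameter values are just illustrative choices. It compares the 3 vs 3, 3 vs 6 and 6 vs 6 designs, reporting the average estimated difference (to check for bias) and the fraction of simulations that reject the null (power).

```python
import random
import statistics

# Two-sided 5% critical values of Student's t for the pooled degrees of
# freedom n1 + n2 - 2 used below (df = 4, 7, 10).
T_CRIT = {4: 2.776, 7: 2.365, 10: 2.228}


def pooled_t(a, b):
    """Equal-variance two-sample t statistic for mean(b) - mean(a)."""
    n1, n2 = len(a), len(b)
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    se = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return (statistics.fmean(b) - statistics.fmean(a)) / se


def simulate(n_healthy, n_disease, effect=2.0, sd=1.0, n_sim=5000, seed=1):
    """Return (mean estimated difference, power) over n_sim simulated genes.

    Assumption: log-expression is normal with the same sd in both groups,
    and the true disease - healthy difference equals `effect`.
    """
    rng = random.Random(seed)
    crit = T_CRIT[n_healthy + n_disease - 2]
    diffs = []
    rejections = 0
    for _ in range(n_sim):
        healthy = [rng.gauss(0.0, sd) for _ in range(n_healthy)]
        disease = [rng.gauss(effect, sd) for _ in range(n_disease)]
        diffs.append(statistics.fmean(disease) - statistics.fmean(healthy))
        if abs(pooled_t(healthy, disease)) > crit:
            rejections += 1
    return statistics.fmean(diffs), rejections / n_sim


designs = [(3, 3), (3, 6), (6, 6)]
results = {design: simulate(*design) for design in designs}

for (n_h, n_d), (mean_diff, power) in results.items():
    print(f"{n_h} healthy vs {n_d} disease: "
          f"mean estimated difference = {mean_diff:.3f}, power = {power:.3f}")
```

Under these assumptions the average estimated difference should sit close to the true value of 2 in all three designs (no bias from the imbalance), while the rejection rate should increase from 3 vs 3 to 3 vs 6 to 6 vs 6. The same ordering follows from the standard error of the difference, which scales with sqrt(1/n1 + 1/n2).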


