Hi Biostars community,
I have a question regarding the appropriate approach for differential expression testing of scRNA-seq data between two cell groups with unbalanced cell numbers.
Among all the cells sequenced from a single patient, a rare, novel subpopulation of ~200 cells was discovered. I want to identify genes that are differentially expressed between this small subpopulation and the remaining ~4000 cells. I tried the Wilcoxon, LR, and MAST tests built into Seurat and got very similar results, but I'm unsure whether these methods are the most suitable for my scenario.
Does anyone have any recommendations for the most suitable methods? Much appreciated!
In my head 200 is not "little", but obviously the imbalance is big. In such situations I usually subsample several times and then check whether the results are largely consistent, so that I personally could stand up for them if challenged during a revision. This will of course never be reported in a paper (the nitty-gritty details never are, shamefully).
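
To make the subsampling idea concrete, here is a minimal sketch in plain Python (not Seurat code). It rests on a few assumptions of mine: the larger group is repeatedly downsampled to the size of the rare group, each gene is tested with a Wilcoxon rank-sum test using the normal approximation, and "consistent" means the gene comes out significant in most repeats. The function names, the dict-of-lists input layout, and the uncorrected `alpha` cutoff are all hypothetical choices for illustration; in a real analysis you would also want multiple-testing correction.

```python
import math
import random


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        t = j - i
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 1e-12:
        return 1.0  # all values tied: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2.0))


def subsample_consistency(rare, rest, n_repeats=20, alpha=0.05, seed=0):
    """Fraction of repeats in which each gene is called significant.

    rare, rest: dict mapping gene name -> list of per-cell expression values
    (hypothetical layout; cells are in the same order for every gene).
    """
    rng = random.Random(seed)
    n_rare = len(next(iter(rare.values())))
    n_rest = len(next(iter(rest.values())))
    hits = {gene: 0 for gene in rare}
    for _ in range(n_repeats):
        # Downsample the large group to the size of the rare group.
        idx = rng.sample(range(n_rest), n_rare)
        for gene in rare:
            sub = [rest[gene][i] for i in idx]
            if rank_sum_pvalue(rare[gene], sub) < alpha:
                hits[gene] += 1
    return {gene: hits[gene] / n_repeats for gene in rare}
```

Genes with a fraction close to 1 are the ones that hold up regardless of which cells from the large group happen to be drawn; genes that flip between repeats deserve a closer look before you put them in a figure.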