Technical single-cell question
8 weeks ago
I've been searching for a publication that uses scRNA-seq to compare the percentage of cells expressing a gene across groups, as I am hoping to replicate the analysis. That is, what percentage of cells express gene X in the control group versus what percentage express gene X in the treatment group?

All the scRNA-seq publications I have read first cluster the cells obtained from all groups, and then only compare the percentage of cells expressing a gene between clusters, not between groups. I understand that sparsity is a challenge in single-cell data, but if gene expression can be compared between clusters, why not between groups?

Hello, what exactly is the question now? You can just count how many cells have a count > 0 for that gene and divide by the total number of cells in that group. I do not see why that would not be proper. Sparsity is inherent in single-cell data at the moment, so it would be wise to take the percentages you get with a grain of salt. You can look at essential genes, such as GAPDH, Polr2b, Psma1, i.e. genes that the cell requires for survival. These three genes are commonly used as killing controls in e.g. CRISPR or shRNA screens, because targeting them kills the cell. Zeros in these genes are then most likely technical, I would say, so they give you a feel for the dropout rate of strongly expressed genes, and based on this you can interpret the percentages you get for your genes of interest.
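As a rough illustration of the counting described above, here is a minimal sketch in plain Python. The function name `percent_expressing`, the group labels, and all the counts are made up for the example; in practice the per-cell counts would come from your raw count matrix and the labels from your sample metadata. The same function is applied to a housekeeping gene to get a crude dropout estimate, under the assumption that every cell truly expresses that gene.

```python
from collections import defaultdict


def percent_expressing(counts, groups):
    """Return {group: percent of cells with count > 0} for one gene.

    counts: per-cell raw counts for the gene of interest
    groups: per-cell group labels, in the same order as counts
    """
    if len(counts) != len(groups):
        raise ValueError("counts and groups must have the same length")
    total = defaultdict(int)     # cells seen per group
    positive = defaultdict(int)  # cells with at least one count per group
    for count, group in zip(counts, groups):
        total[group] += 1
        if count > 0:
            positive[group] += 1
    return {g: 100.0 * positive[g] / total[g] for g in total}


# Toy data, invented for illustration: raw counts for 10 cells.
groups = ["control"] * 5 + ["treatment"] * 5
gene_x = [0, 3, 0, 0, 1, 2, 0, 5, 1, 4]
gapdh = [7, 9, 0, 12, 8, 10, 6, 0, 11, 9]

# Percentage of cells expressing gene X in each group.
pct_gene_x = percent_expressing(gene_x, groups)

# Crude dropout estimate: share of cells with zero counts for a gene
# that is assumed to be expressed in every cell.
pct_gapdh = percent_expressing(gapdh, groups)
dropout_gapdh = {g: 100.0 - p for g, p in pct_gapdh.items()}

print(pct_gene_x)
print(dropout_gapdh)
```

If the dropout estimate for the housekeeping gene differs a lot between groups, that is a hint that the groups differ technically (e.g. in sequencing depth), and the percentages for the gene of interest should be read with that in mind.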


