I would like to ask for advice on using limma for differential expression analysis of scRNAseq data. I usually use Seurat, but I would really like to know whether limma is also a suitable tool for scRNAseq data. I've searched for papers but couldn't find a consistent answer to this question.
Thank you very much for your help!
(I haven't read the paper in full) If I read this figure correctly, it's curious that methods specific to scRNA-seq are no better, and possibly worse, than bulk RNA-seq methods. Notably, the good old t-test and Wilcoxon test are not bad at all!
The Wilcoxon test is the default DE method in many tools because it apparently performs well if sample sizes (i.e. cells per cluster) are large enough, and because it scales well. Say you have 20 clusters and want to find markers, so you have to perform DE for every cluster against every other cluster. That is a lot of computation time if you use tools like edgeR, which have to estimate dispersions and fit models for dozens, hundreds, or thousands of cells.
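To make the scaling argument concrete, here is a minimal sketch of a per-gene rank-sum (Wilcoxon / Mann-Whitney) test between two clusters, using only the Python standard library. The function name rank_sum_test and the toy values are made up for illustration, and it uses the normal approximation with a tie correction and no continuity correction, so it is not necessarily what Seurat or any other tool does internally. In practice you would use a library implementation such as scipy.stats.mannwhitneyu.

```python
import math
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic for x, approximate p-value).
    Hypothetical helper for illustration only.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0  # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a difference
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p


# Toy expression values of one gene in two clusters (made-up numbers).
cluster_a = [0, 0, 1, 2, 0, 1]
cluster_b = [3, 5, 4, 6, 2, 7]
u_stat, p_value = rank_sum_test(cluster_a, cluster_b)
print(f"U = {u_stat}, p = {p_value:.4f}")

# With 20 clusters, an all-against-all comparison needs this many
# cluster pairs, each tested for every gene.
n_cluster_pairs = math.comb(20, 2)
print(n_cluster_pairs)
```

The test only needs a sort per gene and per comparison, with no dispersion estimation or model fitting, which is why it stays cheap even when repeated over many cluster pairs.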
Maybe you don't really need to reinvent the wheel. Tests like the t-test or the Wilcoxon test have been used for all sorts of data over many years. The advantage of a generic test is that it is generic.
I guess for single-sample data it is fine given that you have hundreds of cells per cluster and simply want to get the top marker genes. I personally feel safer though (if I have at least n=2 replicates per condition) to use edgeR, maybe after pseudobulk aggregation.
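For anyone unsure what pseudobulk aggregation means here, a rough sketch in plain Python: raw counts are summed over all cells that share the same sample (replicate) and cluster, so each sample/cluster pair yields one bulk-like count profile that can then go into edgeR or limma. The function name pseudobulk, the input layout, and the toy numbers are assumptions for illustration, not any package's API.

```python
from collections import defaultdict


def pseudobulk(counts, sample_ids, cluster_ids):
    """Sum per-cell raw counts within each (sample, cluster) group.

    counts      : list of per-cell count vectors (cells x genes)
    sample_ids  : sample/replicate label of each cell
    cluster_ids : cluster label of each cell
    Returns a dict {(sample, cluster): summed count vector}.
    Hypothetical helper for illustration only.
    """
    n_genes = len(counts[0])
    sums = defaultdict(lambda: [0] * n_genes)
    for cell, sample, cluster in zip(counts, sample_ids, cluster_ids):
        profile = sums[(sample, cluster)]
        for gene_index, count in enumerate(cell):
            profile[gene_index] += count
    return dict(sums)


# Toy data: 5 cells x 3 genes, two replicates, two clusters (made up).
counts = [
    [1, 0, 3],
    [2, 1, 0],
    [0, 0, 5],
    [4, 2, 1],
    [1, 1, 1],
]
sample_ids = ["rep1", "rep1", "rep1", "rep2", "rep2"]
cluster_ids = ["c0", "c0", "c1", "c0", "c0"]

bulk = pseudobulk(counts, sample_ids, cluster_ids)
for key, profile in sorted(bulk.items()):
    print(key, profile)
```

After aggregation the unit of replication is the sample rather than the cell, which is the reason for wanting at least n=2 replicates per condition.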
You could also treat the replicates as a covariate.
Thank you! I use limma for other types of data and was curious whether it can be used for scRNAseq data as well. I currently use Seurat, which as far as I know uses the Wilcoxon test, but I was wondering about more complex designs with several factors.
Seurat has multiple tests. Check the documentation for
The latent.vars parameter is used for more complex designs.
Thank you very much! This figure is very informative and I will read the paper for sure.