I have scRNA-seq data from two groups, and I have finished cell type annotation. Now I would like to find the differentially expressed genes (DEGs) between the two condition groups (normal vs. treatment) within one cell type cluster. I used the Seurat function
FindMarkers as follows, hoping to find the DEGs across conditions (Normal vs. Treatment):
Alveolar.macrophages.response <- FindMarkers(normal.vs.Veh, ident.1 = "Alveolar macrophages_Normal", ident.2 = "Alveolar macrophages_Veh", verbose = FALSE)
However, I have some concerns about the results returned by
FindMarkers. I used
head(Alveolar.macrophages.response[with(Alveolar.macrophages.response, order(avg_log2FC, decreasing = TRUE)), ], 5)
to display the results in decreasing order of
avg_log2FC. The values are in general very small: even the largest
avg_log2FC is only
0.9793757, which looks very weird to me compared with what I have seen in many other tutorials. Is this result reasonable?
                     p_val avg_log2FC pct.1 pct.2     p_val_adj
Tm4sf19      3.630080e-153  0.9793757 0.203 0.079 7.161784e-149
Slfn4         8.873470e-65  0.6886944 0.398 0.290  1.750647e-60
LOC100360087 9.405795e-295  0.6689990 0.935 0.839 1.855669e-290
Fth1         4.259850e-230  0.5818955 0.963 0.896 8.404258e-226
Fkbp5        7.286612e-134  0.5672482 0.406 0.249 1.437576e-129
Hmox1         1.944299e-91  0.5299588 0.291 0.174  3.835907e-87
Zdhhc14       9.329413e-77  0.5016266 0.567 0.455  1.840600e-72
Do you have biological replicates? If so, you could aggregate counts into pseudobulks and use standard tools such as DESeq2. In my experience, DEG analysis at the single-cell level is a mess: the prevalence of zeros makes the fold-change estimates biased or useless, especially for lowly expressed genes, where zeros are even more prevalent.
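A minimal sketch of what aggregating to pseudobulks could look like, in base R. Everything here is an assumption for illustration: the toy count matrix stands in for the raw counts of the alveolar macrophage cells, and sample_id, condition_of_sample, and run_deseq2 are hypothetical names, not anything from your object. The idea is simply to sum the raw counts over all cells belonging to the same biological replicate, so that each replicate becomes one column.

```r
# Toy stand-in for the raw counts of ONE cell type (genes x cells).
# In a real analysis this would be the raw count matrix of that cell type.
set.seed(1)
n_genes <- 50
n_cells <- 120
counts <- matrix(
  rpois(n_genes * n_cells, lambda = 2),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)

# Hypothetical metadata: 6 replicates (3 per condition), 20 cells each.
sample_id <- rep(paste0("S", 1:6), each = 20)
condition_of_sample <- c(S1 = "Normal", S2 = "Normal", S3 = "Normal",
                         S4 = "Veh",    S5 = "Veh",    S6 = "Veh")

# Sum counts over the cells of each replicate -> genes x samples matrix.
pseudobulk <- t(rowsum(t(counts), group = sample_id))

# One row of metadata per pseudobulk sample, in the same column order.
sample_info <- data.frame(
  condition = factor(condition_of_sample[colnames(pseudobulk)],
                     levels = c("Normal", "Veh")),
  row.names = colnames(pseudobulk)
)

# Hypothetical helper: standard DESeq2 workflow on the pseudobulk matrix.
# Defined only, not run here; it requires the DESeq2 package to be installed.
run_deseq2 <- function(pseudobulk, sample_info) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = pseudobulk,
                                        colData   = sample_info,
                                        design    = ~ condition)
  dds <- DESeq2::DESeq(dds)
  # log2 fold change reported as Veh relative to Normal
  DESeq2::results(dds, contrast = c("condition", "Veh", "Normal"))
}
```

With real data you would then call run_deseq2(pseudobulk, sample_info) and look at the log2FoldChange and padj columns. Note that the replicates, not the cells, are the units of observation here, so the p-values will typically be far less extreme than the ones FindMarkers reports.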
Yes, I do have biological replicates in each group. To aggregate counts, do you mean to use Seurat