Question: (Closed) GSEA for scRNA-seq
biostarukha • 10 wrote:
I want to run GSEA on my scRNA-seq data using cluster markers, so that I can compare enrichment scores for gene sets and pathways across clusters. I used Seurat's FindAllMarkers to find DEGs for each cluster (cluster of interest vs. all remaining cells). For GSEA, I want to use only DEGs with adj p-value < 0.05. As GSEA takes expression data as input, I need to find the average expression of the significant DEGs per cluster and make a txt file with the expression values for each gene across the clusters.
But because I am dealing with DEGs, the gene list will be different for each cluster. How should I impute data for DEGs that are significant for cluster A but not significantly differentially expressed in cluster B (though still expressed to some degree)?
"For GSEA, I want to use only DEGs with adj p-value < 0.05"
You can't do that. For GSEA, you have to use all genes (supplied with their expression data), not just the DEGs.
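To illustrate why the full gene list matters, here is a minimal sketch of a GSEA-style running-sum enrichment score. It is a simplified illustration, not the GSEA software itself: the function name `enrichment_score`, the toy genes, and the ranking statistic are all made up for the example. The "miss" penalty depends on every gene that is not in the set, so dropping non-significant genes changes the score.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted running-sum statistic over a full ranked gene list.

    ranked_genes: all genes, sorted by the ranking statistic (descending).
    scores: ranking statistic for each gene, in the same order.
    gene_set: set of gene names belonging to the pathway.
    p: exponent applied to the absolute score of each hit (assumed 1.0 here).
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one gene outside the set")

    running, best = 0.0, 0.0
    for weight, hit in zip(hit_weights, in_set):
        # Step up at genes in the set, step down at every other gene.
        running += weight / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical toy data: every gene is ranked, not just the significant ones.
genes = ["A", "B", "C", "D", "E", "F"]
stats = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

top_score = enrichment_score(genes, stats, {"A", "B"})     # set at the top of the ranking
bottom_score = enrichment_score(genes, stats, {"E", "F"})  # set at the bottom of the ranking
print(top_score, bottom_score)
```

A set concentrated at the top of the ranking gets a positive score and one concentrated at the bottom gets a negative score. Both depend on where the set sits relative to all the other genes.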
You'd have to use pathway overrepresentation analysis (e.g. as typically done in gene ontology enrichment) if you want to only use the DEGs.
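For comparison, overrepresentation analysis on a DEG list usually comes down to a one-sided hypergeometric test. The sketch below uses only the Python standard library. The function name `overrepresentation_pvalue` and the toy gene names are hypothetical, and the background "universe" is assumed to be all genes that were tested.

```python
from math import comb


def overrepresentation_pvalue(degs, pathway, universe):
    """One-sided hypergeometric p-value, P(overlap >= observed).

    degs: set of significant DEGs for one cluster.
    pathway: set of genes annotated to the pathway.
    universe: set of all genes tested (assumed background).
    """
    degs = degs & universe
    pathway = pathway & universe
    n_universe, n_set, n_degs = len(universe), len(pathway), len(degs)
    observed = len(degs & pathway)

    total = comb(n_universe, n_degs)
    upper = min(n_set, n_degs)
    # Sum the probability of the observed overlap and every larger overlap.
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_degs - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical toy example: 10 genes tested, 5 in the pathway, 5 DEGs.
universe = {f"g{i}" for i in range(10)}
pathway = {"g0", "g1", "g2", "g3", "g4"}

p_full_overlap = overrepresentation_pvalue(pathway, pathway, universe)
p_no_overlap = overrepresentation_pvalue({"g5", "g6", "g7", "g8", "g9"}, pathway, universe)
print(p_full_overlap, p_no_overlap)
```

With several pathways and several clusters, the resulting p-values would still need a multiple-testing correction.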
Also, I just realized you posted a near-identical question here: How to use DEGs file for GSEA?
The response there already answers your question.
Thank you very much for your comment. So, if I want to use GSEA and compare enrichment across clusters, I should basically input the average expression of all genes per cluster, right?
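If per-cluster averages of all genes are what is wanted, the computation itself could look like the minimal sketch below. The function name `mean_expression_per_cluster`, the input layout, and the toy values are assumptions for illustration. In practice the matrix would come from the Seurat object rather than a hand-written dictionary.

```python
from collections import defaultdict


def mean_expression_per_cluster(expr, clusters):
    """Average each gene's expression within each cluster.

    expr: {gene: [expression value per cell]} covering ALL genes.
    clusters: [cluster label per cell], in the same cell order.
    Returns {gene: {cluster: mean expression}}.
    """
    cells_by_cluster = defaultdict(list)
    for idx, label in enumerate(clusters):
        cells_by_cluster[label].append(idx)

    return {
        gene: {
            label: sum(values[i] for i in idxs) / len(idxs)
            for label, idxs in cells_by_cluster.items()
        }
        for gene, values in expr.items()
    }


# Hypothetical toy data: two genes, three cells, two clusters.
expr = {"G1": [1.0, 3.0, 10.0], "G2": [0.0, 0.0, 4.0]}
clusters = ["A", "A", "B"]

means = mean_expression_per_cluster(expr, clusters)
print(means)
```

Every gene gets a value in every cluster this way, so nothing has to be imputed for genes that are significant in one cluster but not in another.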
I have seen in scRNA-seq analysis papers that the authors filtered out insignificant DEGs. For example, this Nature publication:
That only specifies that a subset of genes were removed, rather than all genes that were not significantly differentially expressed.
You posted a similar question yesterday. Try to keep discussion in one post.