Question: Scatter plot for GSEA results analysis
4.7 years ago by
United Kingdom
Hi All,

I really like fig 5A in this paper:

It neatly displays the results of the GSEA analysis, with an indication of gene set size and P-value. Can anyone help me figure out how to make a graph like this, given the gene set, enrichment score, P-value, and gene set size?



4.1 years ago by
Hi, I ran into your question while looking for something else. It gave me a great idea for showing my own analysis. I've prepared a fig-5A-like plot and thought I'd share it. Here is the resulting picture: GSEA Scatter plot

And here is the code (in R):

# Load the plotting library
library(ggplot2)

# Preparing data (random values, for illustration only)
data <- data.frame(Gene_set=c("Gene_set1", "Gene_set2", "Gene_set3", "Gene_set4", "Gene_set5"),
                   NES=runif(5, -3, 3),
                   FDR_q.val=runif(5, 0, 1),
                   No_of_significant_genes=runif(5, 1, 100))

# Plotting: NES on the x axis, one row per gene set;
# point colour encodes the FDR q-value, point size the number of significant genes
p <- ggplot(data, aes(NES, Gene_set))
p + geom_point(aes(colour=FDR_q.val, size=No_of_significant_genes)) +
    scale_color_gradientn(colours=rainbow(4), limits=c(0, 1)) +
    geom_vline(xintercept=0, size=0.5, colour="gray50") +
    theme(panel.background=element_rect(fill="gray95", colour="gray95"),
          panel.grid.major=element_line(size=0.25, linetype='solid', colour="gray90"),
          panel.grid.minor=element_line(size=0.25, linetype='solid', colour="gray90"),
          axis.title.y=element_blank()) +
    expand_limits(x=c(-3,3)) +
    scale_x_continuous(breaks=c(-3,-2,-1,0,1,2,3))

Thank you so much for sharing the code! It's awesome!

I don't understand what the "number of significant genes" corresponds to in the GSEA results. Is it the number of genes marked as "core enrichment"?
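
The answer above doesn't say where No_of_significant_genes comes from, so the following is only one possible reading. If it does mean the genes marked as core enrichment, the count could be tallied per gene set along these lines. This is a minimal base R sketch, assuming a hypothetical per-gene table called `genes` with made-up columns `Gene_set`, `gene` and `core_enrichment` ("Yes"/"No"); the column names in a real GSEA export will likely differ.

```r
# Hypothetical per-gene table: one row per gene in each gene set.
# Table name, column names and values are made up for illustration.
genes <- data.frame(
  Gene_set = c("Gene_set1", "Gene_set1", "Gene_set1", "Gene_set2", "Gene_set2"),
  gene = c("A", "B", "C", "D", "E"),
  core_enrichment = c("Yes", "No", "Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Count the genes flagged as core enrichment within each gene set.
core_counts <- aggregate(
  core_enrichment ~ Gene_set,
  data = genes,
  FUN = function(x) sum(x == "Yes")
)

# Rename the count column to match the name used in the plotting code.
names(core_counts)[2] <- "No_of_significant_genes"

print(core_counts)
```

Counts obtained this way could stand in for the random No_of_significant_genes values in the plotting data frame, provided the Gene_set names match.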

4.7 years ago by
You can get such nice plots using clusterProfiler, as seen here:

