Question: Scatter plot for GSEA results analysis
3
FreeMindAlex110 wrote:

Hi All,

I really like fig 5A in this paper:

http://www.nature.com/nbt/journal/v31/n1/full/nbt.2450.html

It neatly displays the results of a GSEA analysis, indicating both gene set size and P-value. Can anyone help me figure out how to make a graph like this, given the gene set, enrichment score, P-value, and gene set size?

Thanks

Abdul

5

Hi, I ran into your question while looking for something else, and it gave me a great idea for presenting my own analysis. I've prepared a plot similar to Fig. 5A and thought I'd share it. Here is the resulting picture:

http://imgur.com/XC9dWXJ

And here is the code (in R):

```r
# Preparing example data (random values stand in for real GSEA output)
data <- data.frame(Gene_set=c("Gene_set1", "Gene_set2", "Gene_set3", "Gene_set4", "Gene_set5"),
                   NES=runif(5, -3, 3),
                   FDR_q.val=runif(5, 0, 1),
                   No_of_significant_genes=runif(5, 1, 100))

# Plotting: NES on the x axis, one row per gene set,
# colour = FDR q-value, point size = number of significant genes
# (in ggplot2 >= 3.4.0, use linewidth= instead of size= for the lines below)
library(ggplot2)
p <- ggplot(data, aes(NES, Gene_set))
p + geom_point(aes(colour=FDR_q.val, size=No_of_significant_genes)) +
  geom_vline(xintercept=0, size=0.5, colour="gray50") +
  theme(panel.background=element_rect(fill="gray95", colour="gray95"),
        panel.grid.major=element_line(size=0.25, linetype='solid', colour="gray90"),
        panel.grid.minor=element_line(size=0.25, linetype='solid', colour="gray90"),
        axis.title.y=element_blank()) +
  expand_limits(x=c(-3, 3)) +
  scale_x_continuous(breaks=-3:3) +
  # List gene sets from top to bottom in data-frame row order
  scale_y_discrete(limits=rev(data$Gene_set))
```
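If you start from real values rather than random ones, a minimal sketch of the data preparation step might look like the following. The `results` table, its column names (`name`, `nes`, `fdr`, `size`) and all of its values are made up for illustration, and mapping gene set size onto `No_of_significant_genes` is an assumption based on the inputs listed in the question. Because the plotting code takes the y-axis order from the row order, sorting the rows by NES puts the most positively enriched sets at the top.

```r
# Hypothetical GSEA-style results: names and values are invented for illustration
results <- data.frame(
  name = c("SET_A", "SET_B", "SET_C", "SET_D"),
  nes  = c(1.8, -2.1, 0.4, -0.9),
  fdr  = c(0.01, 0.002, 0.60, 0.25),
  size = c(45, 120, 15, 80),
  stringsAsFactors = FALSE
)

# Rename to the columns used by the plotting code
# (assumption: gene set size is what should drive the point size)
data <- data.frame(
  Gene_set = results$name,
  NES = results$nes,
  FDR_q.val = results$fdr,
  No_of_significant_genes = results$size,
  stringsAsFactors = FALSE
)

# Sort rows by NES, highest first, so the y axis reads from
# most positively to most negatively enriched
data <- data[order(data$NES, decreasing = TRUE), ]
rownames(data) <- NULL
```

The resulting `data` can then be passed to the plotting code in place of the randomly generated data frame.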

Thank you so much for sharing the code! It's awesome!

I don't understand what the "number of significant genes" corresponds to in the GSEA results. Is it the number of genes marked as "core enrichment"?