How to represent multiple p-values?
9.5 years ago

Hi,

I ran several simulations using a gene list of interest and different publicly available cancer gene lists to see whether there is an enrichment.

So I have a matrix of p-values like this:

         param1    param2    param3
list1    pval_a    pval_b    pval_c
list2    pval_d    pval_e    pval_f
list3    pval_g    pval_h    pval_i

I know it's difficult to compare p-values because each cancer gene list has a different size, so the statistical power will be different. But do you have any advice on how to represent these results in an easily readable plot?

Thanks

9.5 years ago

Yes, ditch the p-values and show the effect size instead (or at least show both). As you say, it is meaningless to compare p-values across comparisons, since the underlying data for each test are different. Also remember that a p-value is NOT proportional to effect size.

You can easily make a plot with lists over Y-axis and parameters over X-axis in ggplot2, and colour/size it according to effect size/p-value, respectively.


Thanks. How do you introduce the effect size in the plot?

library(ggplot2)
# Toy data: 3 lists x 3 parameters, with random p-values and effect sizes
df <- data.frame(samples = rep(c('L1', 'L2', 'L3'), each = 3),
                 params = rep(c('param1', 'param2', 'param3'), times = 3),
                 pval = runif(9), effect_size = rnorm(9, mean = 10))
# Normalize within sample to make comparable across samples
df$effect_norm <- ave(df$effect_size, df$samples, FUN = function(x) x / max(x))
# Plot stuff: point size shows normalized effect size, colour shows -log10(p-value)
p <- ggplot(df, aes(x = samples, y = params, size = effect_norm, colour = -log10(pval))) +
  geom_point()
print(p)

Not quite sure what you mean, but I attached some code showing the general idea of the plot.


Thanks. Pretty interesting. In my case the list sizes are fixed (so L1 will have the same value for each of the params used). So the point size will be the same for a given list, independent of the param.


By "effect size" David meant the magnitude of the enrichment itself (for example an odds ratio or a fold enrichment), not the length of your lists.
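
To make the distinction concrete: for an overlap between a gene list of interest and a cancer gene list, one common choice of effect size is the odds ratio from a 2x2 table. This is a minimal sketch in base R, assuming a background of 20000 genes and invented counts (none of these numbers come from the thread):

```r
# Hypothetical counts, for illustration only
n_background <- 20000  # all genes considered
n_interest   <- 200    # genes in the list of interest
n_cancer     <- 500    # genes in the cancer gene list
n_overlap    <- 15     # genes present in both lists

# 2x2 table: rows = in cancer list (yes/no), columns = in list of interest (yes/no)
tab <- matrix(c(n_overlap,
                n_interest - n_overlap,
                n_cancer - n_overlap,
                n_background - n_interest - n_cancer + n_overlap),
              nrow = 2,
              dimnames = list(cancer = c("yes", "no"),
                              interest = c("yes", "no")))

ft <- fisher.test(tab)
effect_size <- unname(ft$estimate)  # odds ratio (conditional maximum likelihood estimate)
conf_int    <- ft$conf.int          # 95% confidence interval for the odds ratio
p_value     <- ft$p.value
```

The odds ratio and its confidence interval describe how strong the enrichment is, whereas the p-value also depends on how large the lists are.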

Also, why don't you try some kind of meta-analysis, to get a single p-value at the end?
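
One simple way to combine p-values is Fisher's method. The sketch below uses made-up p-values and a hypothetical helper name, `fisher_combine`. Note that the method assumes the p-values come from independent tests, which may not hold if the same gene list of interest is used in every comparison.

```r
# Fisher's method: under the null, -2 * sum(log(p)) follows a chi-square
# distribution with 2k degrees of freedom for k independent p-values
fisher_combine <- function(p) {
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

pvals <- c(0.04, 0.20, 0.01)  # made-up p-values, e.g. one row of the matrix
combined <- fisher_combine(pvals)
```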

Or, if you simply want to plot the p-values, I would suggest log-transforming them and producing a dotplot.
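
A rough sketch of such a dotplot, using base R's `dotchart` to avoid extra dependencies. The p-value matrix is invented and only mimics the layout from the question:

```r
# Made-up p-value matrix with the same layout as in the question
pmat <- matrix(c(0.001, 0.03, 0.20,
                 0.04,  0.50, 0.008,
                 0.60,  0.02, 0.10),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("list1", "list2", "list3"),
                               c("param1", "param2", "param3")))

logp <- -log10(pmat)  # larger value = smaller p-value

# For a matrix, dotchart draws one group per column (param) and one dot per row (list)
dotchart(logp, xlab = "-log10(p-value)")
abline(v = -log10(0.05), lty = 2)  # dashed line at the conventional 0.05 threshold
```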

9.5 years ago
David W 4.9k

Adding to David Westergaard's excellent answer, the typical way to visualise these sorts of results in epidemiology and other disciplines that do a lot of among-study comparisons is the forest plot. Basically, you'd plot the effect size and 95% CI for each parameter estimate. The effect size, and the method of getting the CI from the p-value, will depend on exactly what the studies are measuring. But you'd end up with something like this (using the data frame from David Westergaard's answer):

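# Fake CI half-widths, for illustration only; use real confidence intervals in practice
df$ci  <- abs(rnorm(9, 0, df$effect_size))/2
forest_p <- ggplot(df, aes(samples,  effect_size))
forest_p + geom_point(size=3) +
           geom_pointrange(aes(ymin=effect_size - ci, ymax=effect_size + ci)) + 
           geom_hline(yintercept=0) +  # reference line at zero effect
           facet_wrap(~params) +
           coord_flip()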
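
If only an estimate and a two-sided p-value are available, one known approximation back-calculates the standard error from the p-value, assuming the test statistic is approximately normal. The sketch below uses a hypothetical helper, `ci_from_p`, and made-up inputs; for ratio measures such as an odds ratio it should be applied on the log scale. It breaks down when the estimate is exactly 0 or the p-value is 1.

```r
# Approximate CI from an estimate and its two-sided p-value (normal approximation)
ci_from_p <- function(est, p, level = 0.95) {
  z    <- qnorm(1 - p / 2)             # z-score implied by the p-value
  se   <- abs(est) / z                 # implied standard error
  crit <- qnorm(1 - (1 - level) / 2)   # about 1.96 for a 95% interval
  c(lower = est - crit * se, upper = est + crit * se)
}

# Made-up example: odds ratio of 3.2 with p = 0.01, handled on the log scale
log_or <- log(3.2)
ci_or  <- exp(ci_from_p(log_or, 0.01))  # back-transformed CI for the odds ratio
```

The resulting `lower` and `upper` values could then replace the simulated `ci` column as `ymin` and `ymax` in the forest plot.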