Question: Calculate Pairwise Wilcox.Test For Several Categories And Plot Significance Into A Boxplot With Ggplot2
5.0 years ago by Biojl (1.6k) wrote:

I have an R dataset that looks pretty much like this one from diamonds:

diamonds2 = subset(diamonds, cut!='Good' & cut!='Very Good', -c(table, x, y, z, clarity, depth, price))

I want to make a boxplot like this one:

ggplot(diamonds2, aes(x=color, y=carat, col=cut))+geom_boxplot()

Here comes the hard part. My idea is to run a pairwise wilcox.test on the y variable (carat) between the groups (cut) within each of the x categories (color).

pairwise.wilcox.test(diamonds2$carat, interaction(diamonds2$cut, diamonds2$color), p.adjust.method = "bonferroni")

It's not very elegant because it creates a matrix with extra comparisons (between groups from different colors), but that's the best I've got so far. I would like to prune it down to the within-color comparisons.
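One possible way to prune the matrix, as a rough sketch rather than a tested solution: run pairwise.wilcox.test separately inside each color, so that only the comparisons between cuts of the same color are ever computed, and apply the Bonferroni correction afterwards over the comparisons that were kept. The names `sig`, `group1`, `group2`, `p` and `p.adj` are made up for this illustration.

```r
library(ggplot2)  # only needed here for the diamonds data set

diamonds2 <- subset(diamonds, cut != "Good" & cut != "Very Good",
                    select = c(carat, cut, color))

# One pairwise test per color: only within-color comparisons are computed.
sig <- do.call(rbind, lapply(levels(diamonds2$color), function(col) {
  d <- diamonds2[diamonds2$color == col, ]
  res <- pairwise.wilcox.test(d$carat, d$cut, p.adjust.method = "none")
  # Flatten the lower-triangular p-value matrix into a long table.
  tab <- as.data.frame(as.table(res$p.value), stringsAsFactors = FALSE)
  names(tab) <- c("group1", "group2", "p")
  tab <- tab[!is.na(tab$p), ]  # drop the empty cells of the matrix
  cbind(color = col, tab)
}))

# Bonferroni correction over the retained comparisons only.
sig$p.adj <- p.adjust(sig$p, method = "bonferroni")
```

With 7 colors and 3 remaining cuts this gives one row per color and pair of cuts, instead of every combination of the interaction() levels. Whether to correct across all colors at once or within each color is a judgment call this sketch does not settle.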

Additionally, I would like to plot the results as asterisks whose color combines the colors of the two distributions being compared. For the first group of boxplots (D), I would like to plot 3 asterisks: a purple one (red and blue are significantly different), a yellow one and a cyan one.

As for plotting the colored asterisks, I've been playing a bit with geom_text from ggplot2, but I can't figure out how to plot below the X axis or how to plot text in different colors.

UPDATE: The real data is very similar to the example I posted. It consists of frequencies for all amino acids in 3 different sets of genes. I can plot asterisks/stars with geom_text at a particular position, but I can't automate it so that significance is plotted from the table I generated, and I also can't plot on the X axis, just above the letter of the amino acid.

I drew the significance stars for the first columns with Gimp; this is how it should look: [image: test plot]
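A minimal sketch of how the stars might be drawn automatically from a table of p-values, again on the diamonds example and not on the real data. It makes several assumptions: the tests are run with plain wilcox.test on each pair of cuts, the stars sit just above the X axis inside the panel (not below it), and the colors in `pair_cols` as well as the horizontal offsets in `pair_offset` are arbitrary choices for illustration. The star colors are passed to annotate() as fixed values, so they do not interfere with the color scale used for the boxplots.

```r
library(ggplot2)

diamonds2 <- subset(diamonds, cut != "Good" & cut != "Very Good",
                    select = c(carat, cut, color))
diamonds2$cut <- droplevels(diamonds2$cut)

# All pairs of cuts; each pair gets its own (arbitrary) color and x offset.
pairs <- combn(levels(diamonds2$cut), 2, simplify = FALSE)
pair_names <- vapply(pairs, paste, character(1), collapse = " vs ")
pair_cols <- setNames(c("purple", "gold", "cyan3"), pair_names)
pair_offset <- setNames(c(-0.2, 0, 0.2), pair_names)

# One Wilcoxon test per color and pair of cuts.
stars <- do.call(rbind, lapply(levels(diamonds2$color), function(col) {
  d <- diamonds2[diamonds2$color == col, ]
  do.call(rbind, lapply(seq_along(pairs), function(i) {
    pr <- pairs[[i]]
    pv <- wilcox.test(d$carat[d$cut == pr[1]],
                      d$carat[d$cut == pr[2]])$p.value
    data.frame(color = col, pair = pair_names[i], p = pv,
               stringsAsFactors = FALSE)
  }))
}))
stars$p.adj <- p.adjust(stars$p, method = "bonferroni")
stars <- stars[stars$p.adj < 0.05, ]  # keep significant comparisons only

# Numeric x position: index of the color on the axis plus the pair offset.
stars$x <- match(stars$color, levels(diamonds2$color)) +
  unname(pair_offset[stars$pair])

p <- ggplot(diamonds2, aes(x = color, y = carat, colour = cut)) +
  geom_boxplot()
if (nrow(stars) > 0) {
  p <- p + annotate("text", x = stars$x, y = 0, label = "*",
                    colour = unname(pair_cols[stars$pair]), size = 8)
}
```

Printing `p` draws the plot. Placing the stars truly below the axis would need clipping to be switched off and extra plot margin, which is left out here.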

Tags: R, statistics, plot • 4.3k views
modified 5.0 years ago • written 5.0 years ago by Biojl (1.6k)

This question is a little daunting to answer as you have a lot of components to your questions. What have you tried already?

Please consider editing your question above to reflect your bioinformatics data set (and not the diamonds example data) and a graphical display of what you want your figure to look like.

written 5.0 years ago by Josh Herr (5.6k)

Yes, please indicate your specific bioinformatics research problem. Right now this is a generic R/ggplot2 usage question.

written 5.0 years ago by Neilfws (48k)

I uploaded a test plot, but for some reason it is not appearing.

modified 3.7 years ago • written 5.0 years ago by Biojl (1.6k)