Question

Post hoc Tukey test and multiple comparison letters on bar charts

5.9 years ago

WUSCHEL ▴ 750

This is a sample of my data frame, taken from a large data matrix. Each column name encodes several pieces of information (genotype, tissue, time point, replicate) separated by underscores.

I want to run a Tukey test and plot a bar chart for each gene (Response vs. Time, filled by genotype) annotated with multiple-comparison letters.

Can someone help me add significance letters using the multcomp and multcompView packages?

structure(list(Gene = c("AGI4120.1_UBQ", "AGI570.1_Acin"), WT_Tissue_0T_1 = c(0.886461437, 1.093164915), WT_Tissue_0T_2 = c(1.075140682, 1.229862834), WT_Tissue_0T_3 = c(0.632903012, 1.094003128), WT_Tissue_1T_1 = c(0.883151274, 1.26322126), WT_Tissue_1T_2 = c(1.005627276, 0.962729188), WT_Tissue_1T_3 = c(0.87123469, 0.968078993), WT_Tissue_3T_1 = c(0.723601456, 0.633890322), WT_Tissue_3T_2 = c(0.392585237, 0.534819363), WT_Tissue_3T_3 = c(0.640185369, 1.021934772), WT_Tissue_5T_1 = c(0.720291294, 0.589244505), WT_Tissue_5T_2 = c(0.362131744, 0.475251717), WT_Tissue_5T_3 = c(0.549486925, 0.618177919), mut1_Tissue_0T_1 = c(1.464415756, 1.130533457), mut1_Tissue_0T_2 = c(1.01489573, 1.114915728), mut1_Tissue_0T_3 = c(1.171797418, 1.399956009), mut1_Tissue_1T_1 = c(0.927507448, 1.231911575), mut1_Tissue_1T_2 = c(1.089705396, 1.256782289 ), mut1_Tissue_1T_3 = c(0.993048659, 0.999044465), mut1_Tissue_3T_1 = c(1.000993049, 1.103486794), mut1_Tissue_3T_2 = c(1.062562066, 0.883617224 ), mut1_Tissue_3T_3 = c(1.037404833, 0.851875438), mut1_Tissue_5T_1 = c(0.730883813, 0.437440083), mut1_Tissue_5T_2 = c(0.480635551, 0.298762126 ), mut1_Tissue_5T_3 = c(0.85468388, 0.614923997)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list( cols = list(Gene = structure(list(), class = c("collector_character", "collector")), WT_Tissue_0T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_0T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_0T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_2 = structure(list(), class = c("collector_double", 
"collector")), WT_Tissue_3T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_3 = structure(list(), class = c("collector_double", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))

My code:

library(tidyverse)
# Column names have four underscore-separated parts, so name all four
df1 <- df %>%
  gather(var, response, WT_Tissue_0T_1:mut1_Tissue_5T_3) %>%
  separate(var, c("Genotype", "Tissue", "Time", "Rep"), sep = "_") %>%
  arrange(desc(Gene))

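# One row per Gene x Genotype x Tissue x Time group: mean, replicate count, standard error
df2 <- df1 %>%
  group_by(Gene, Genotype, Tissue, Time) %>%
  summarise(Response = mean(response), n = n(), se = sd(response) / sqrt(n), .groups = "drop")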

I want to perform a post hoc Tukey test, and I used:

library(car)
library(lsmeans)
library(multcompView)
# df1 holds the raw replicate values in the lowercase column `response`
fit1 <- aov(response ~ Genotype * Time, data = df1)
summary(fit1)
lsmeans(fit1, pairwise ~ Genotype | Time)

How can I add significance letters to the bar chart using the multcomp and multcompView packages?

This is my code for the bar charts:

library(ggplot2)
df2$Genotype <- factor(df2$Genotype, levels = c("WT", "mut1"))
colours <- c('#336600', '#ffcc00')
ggplot(df2, aes(x = Time, y = Response, fill = Genotype)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  scale_fill_manual(values = colours) +
  # error bars use the summary columns in df2 and are dodged to match the bars
  geom_errorbar(aes(ymin = Response - se, ymax = Response + se), position = position_dodge(0.9), width = 0.2) +
  facet_wrap(~Gene) +
  labs(x = 'Time', y = 'Response')

Finally, I want to show significance letters on this graph at each time point, based on the output of lsmeans(fit1, pairwise ~ Genotype | Time). Expected graph:

I would appreciate your kind help, if possible.

R gene omics • 5.7k views

ADD COMMENT • link updated 5.9 years ago by GenoMax 141k • written 5.9 years ago by WUSCHEL ▴ 750

What are the error messages? Code produces messages on errors so that you can get information about the problem. Without those messages, it's unlikely that anyone will be able to help. It may also be helpful to show an example of the data.

ADD REPLY • link 5.9 years ago by Jean-Karim Heriche 27k

Hi Heriche, I've modified the question now.

Thank you for the support. I've been fixing most of the errors since this morning. Would you be able to help me add significance letters to the graph? I cannot figure it out. Any help would be greatly appreciated.

ADD REPLY • link 5.9 years ago by WUSCHEL ▴ 750

I am not sure what you mean by significance letters. Statistical significance is sometimes indicated by stars or by writing the p-value of the test above the bars. If this is what you're after, have a look at the ggpubr package.

ADD REPLY • link 5.9 years ago by Jean-Karim Heriche 27k

Not necessarily. The multcompView package has an option to give letters instead of stars. Unfortunately, I'm not able to work out the syntax for combining the Tukey output with the plot code.
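For illustration, here is a minimal sketch of how letters can be derived from Tukey output with multcompView. It uses base R's TukeyHSD() rather than lsmeans, and it compares all Genotype x Time groups of a single gene at once, which is an assumption on my side and not the same as the per-time-point contrasts from lsmeans. The replicate values are those of the first gene (AGI4120.1_UBQ) in the dput() output; the names d, group and letters_by_group are made up for this example.

```r
library(multcompView)

# Replicate values for AGI4120.1_UBQ, copied from the dput() output in the question
d <- data.frame(
  Genotype = rep(c("WT", "mut1"), each = 12),
  Time     = rep(rep(c("0T", "1T", "3T", "5T"), each = 3), times = 2),
  response = c(0.886461437, 1.075140682, 0.632903012,
               0.883151274, 1.005627276, 0.871234690,
               0.723601456, 0.392585237, 0.640185369,
               0.720291294, 0.362131744, 0.549486925,
               1.464415756, 1.014895730, 1.171797418,
               0.927507448, 1.089705396, 0.993048659,
               1.000993049, 1.062562066, 1.037404833,
               0.730883813, 0.480635551, 0.854683880)
)

# One combined factor so that Tukey compares every Genotype x Time group
d$group <- interaction(d$Genotype, d$Time, sep = ".")

fit <- aov(response ~ group, data = d)
tuk <- TukeyHSD(fit, "group")

# Named vector of adjusted p-values; names have the form "levelA-levelB"
pvals <- tuk$group[, "p adj"]

# Groups sharing a letter are not significantly different at the 0.05 level
letters_by_group <- multcompLetters(pvals)$Letters
letters_by_group
```

The result is a named character vector with one entry per group, which can be turned into a data frame and joined to the table of means used for plotting. Note that multcompLetters() splits comparison names at the hyphen, so factor levels should not contain hyphens themselves.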

ADD REPLY • link 5.9 years ago by WUSCHEL ▴ 750

The ggpubr package has a stat_compare_means function, which may add significance markers (brackets, p-values or asterisks / letters) to the plot. What I didn't like is that it only works when it performs the test itself; it doesn't let the user provide the significance table.

There is also a ggsignif package, but I never used it.

ADD REPLY • link 5.9 years ago by h.mon 35k

ggpubr uses ggsignif which I also haven't used. A maybe less fancy way of putting statistical significance above bars could be to create a data frame with the positions of the labels and use it like this:

plot <- plot + geom_text(data = df.with.pos, label = "*")
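A slightly fuller sketch of that idea, assuming a label table with one row per bar: the column names y and label are hypothetical, the "*" labels are placeholders rather than test results, and the values are the WT replicates of the first gene from the question.

```r
library(ggplot2)

# WT replicates of AGI4120.1_UBQ, copied from the question
raw <- data.frame(
  Time = rep(c("0T", "1T", "3T", "5T"), each = 3),
  response = c(0.886461437, 1.075140682, 0.632903012,
               0.883151274, 1.005627276, 0.871234690,
               0.723601456, 0.392585237, 0.640185369,
               0.720291294, 0.362131744, 0.549486925)
)

# Mean response per time point (one bar each)
means <- aggregate(response ~ Time, data = raw, FUN = mean)

# Label table: same x variable as the plot, y placed slightly above each bar.
# The "*" labels are placeholders, not the outcome of any test.
df.with.pos <- data.frame(Time = means$Time,
                          y = means$response + 0.05,
                          label = "*")

plot <- ggplot(means, aes(x = Time, y = response)) +
  geom_col(fill = "#336600")

# geom_text gets its own data and aesthetics, so nothing is inherited by accident
plot <- plot + geom_text(data = df.with.pos,
                         aes(x = Time, y = y, label = label),
                         inherit.aes = FALSE)
```

For dodged bars, the text layer would additionally need the fill grouping and position = position_dodge(0.9) so that each label sits over its own bar.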

ADD REPLY • link 5.9 years ago by Jean-Karim Heriche 27k

You mean something like this?

Yes, these are just done with geom_text() and geom_segment():

geom_segment(aes(x=1, y=83, xend=2, yend=83), size=0.7, data=ScatterMatrix) +
geom_segment(aes(x=1, y=83, xend=1, yend=79.0), size=0.7, data=ScatterMatrix) +
geom_segment(aes(x=2, y=83, xend=2, yend=79.0), size=0.7, data=ScatterMatrix) +
geom_text(x=1.5, y=87, size=3.0, family="mono", label=pvalues1[i]) +

Then, you can also add labels, like this:

library(gridExtra)
library(cowplot)
annot <- ggdraw() +
  draw_label(x=0.25, y=0.6, expression(NB~-~all~metabolites~Kruskal~Wallis~italic(p)~paste("<0.05")), fontface="bold", size=9) +
  draw_label(x=0.25, y=0.5, expression(paste("*,")~italic(p)~paste("< 0.05")), fontface="bold", size=9) +
  draw_label(x=0.25, y=0.45, expression(paste("**,")~italic(p)~paste("< 0.01")), fontface="bold", size=9) +
  draw_label(x=0.25, y=0.40, expression(paste("***,")~italic(p)~paste("< 0.001")), fontface="bold", size=9) +
  draw_label(x=0.25, y=0.35, "NS, not significant", fontface="plain", size=9) +
  draw_label(x=0.25, y=0.30, expression(Quoted~italic(p)~values~from~Kruskal~Wallis~test), fontface="bold", size=9)

plot_grid(annot, labels=c(""), label_size=36, ncol=2, nrow=1)

ADD REPLY • link 5.6 years ago by Kevin Blighe 87k

Thanks Kevin,

I am looking for a bar plot / box plot with letters.

Importantly, generating these letters from the Tukey test and adding them to the plots is the main issue for me. I want to know how to handle this kind of data frame for this task.
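To make the data-handling part concrete, below is a rough sketch of one possible way to go from the wide table to per-gene letters. It is deliberately simplified: only the WT columns at 0T, 3T and 5T are included (values copied from the dput() output), the model is a one-way ANOVA on Time fitted separately for each gene, and the names wide, long and letters_per_gene are invented for the example.

```r
library(tidyr)
library(multcompView)

# Subset of the wide table from the question: WT only, three time points
wide <- data.frame(
  Gene = c("AGI4120.1_UBQ", "AGI570.1_Acin"),
  WT_Tissue_0T_1 = c(0.886461437, 1.093164915),
  WT_Tissue_0T_2 = c(1.075140682, 1.229862834),
  WT_Tissue_0T_3 = c(0.632903012, 1.094003128),
  WT_Tissue_3T_1 = c(0.723601456, 0.633890322),
  WT_Tissue_3T_2 = c(0.392585237, 0.534819363),
  WT_Tissue_3T_3 = c(0.640185369, 1.021934772),
  WT_Tissue_5T_1 = c(0.720291294, 0.589244505),
  WT_Tissue_5T_2 = c(0.362131744, 0.475251717),
  WT_Tissue_5T_3 = c(0.549486925, 0.618177919)
)

# Split each column name into its four underscore-separated parts
long <- pivot_longer(wide, -Gene,
                     names_to = c("Genotype", "Tissue", "Time", "Rep"),
                     names_sep = "_",
                     values_to = "response")

# Fit one model per gene and convert the Tukey p-values into letters
letters_per_gene <- lapply(split(long, long$Gene), function(g) {
  g$Time <- factor(g$Time)  # TukeyHSD needs a factor
  fit <- aov(response ~ Time, data = g)
  multcompLetters(TukeyHSD(fit, "Time")$Time[, "p adj"])$Letters
})
letters_per_gene
```

Each list element is a named vector of letters for one gene, which can be bound into a data frame with Gene and Time columns and passed to geom_text() inside the faceted plot.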

ADD REPLY • link 5.9 years ago by WUSCHEL ▴ 750