Post hoc Tukey test and multiple comparisons
7.4 years ago
WUSCHEL ▴ 860

This is a sample of my large data matrix. Each column name encodes several pieces of information (genotype, tissue, time point, and replicate) separated by underscores.

I want to run a Tukey test and plot a bar chart for each gene (Response vs. Time, filled by genotype) annotated with multiple-comparison letters.

Can someone help me add significance letters using the multcomp and multcompView packages?

structure(list(Gene = c("AGI4120.1_UBQ", "AGI570.1_Acin"), WT_Tissue_0T_1 = c(0.886461437, 1.093164915), WT_Tissue_0T_2 = c(1.075140682, 1.229862834), WT_Tissue_0T_3 = c(0.632903012, 1.094003128), WT_Tissue_1T_1 = c(0.883151274, 1.26322126), WT_Tissue_1T_2 = c(1.005627276, 0.962729188), WT_Tissue_1T_3 = c(0.87123469, 0.968078993), WT_Tissue_3T_1 = c(0.723601456, 0.633890322), WT_Tissue_3T_2 = c(0.392585237, 0.534819363), WT_Tissue_3T_3 = c(0.640185369, 1.021934772), WT_Tissue_5T_1 = c(0.720291294, 0.589244505), WT_Tissue_5T_2 = c(0.362131744, 0.475251717), WT_Tissue_5T_3 = c(0.549486925, 0.618177919), mut1_Tissue_0T_1 = c(1.464415756, 1.130533457), mut1_Tissue_0T_2 = c(1.01489573, 1.114915728), mut1_Tissue_0T_3 = c(1.171797418, 1.399956009), mut1_Tissue_1T_1 = c(0.927507448, 1.231911575), mut1_Tissue_1T_2 = c(1.089705396, 1.256782289 ), mut1_Tissue_1T_3 = c(0.993048659, 0.999044465), mut1_Tissue_3T_1 = c(1.000993049, 1.103486794), mut1_Tissue_3T_2 = c(1.062562066, 0.883617224 ), mut1_Tissue_3T_3 = c(1.037404833, 0.851875438), mut1_Tissue_5T_1 = c(0.730883813, 0.437440083), mut1_Tissue_5T_2 = c(0.480635551, 0.298762126 ), mut1_Tissue_5T_3 = c(0.85468388, 0.614923997)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list( cols = list(Gene = structure(list(), class = c("collector_character", "collector")), WT_Tissue_0T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_0T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_0T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_2 = structure(list(), class = c("collector_double", 
"collector")), WT_Tissue_3T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_3 = structure(list(), class = c("collector_double", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))

My code:

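library(tidyr)
library(dplyr)
# Column names have four "_"-separated parts: Genotype, Tissue, Time, Replicate
df1 <- df %>% gather(var, response, WT_Tissue_0T_1:mut1_Tissue_5T_3) %>% separate(var, c("Genotype", "Tissue", "Time", "Replicate"), sep = "_") %>% arrange(desc(Gene))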

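# One row per Gene x Genotype x Tissue x Time: mean, replicate count, standard error
df2 <- df1 %>% group_by(Gene, Genotype, Tissue, Time) %>% summarise(Response = mean(response), n = n(), se = sd(response) / sqrt(n), .groups = "drop")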

I want to perform a post hoc Tukey test, and I used:

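library(car)
library(lsmeans)
library(multcompView)
# df1 stores the replicate values in `response` (lower case);
# filter df1 to a single Gene first if the model should be fitted per gene
fit1 <- aov(response ~ Genotype*Time, df1)
summary(fit1)
lsmeans(fit1, pairwise ~ Genotype | Time)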

How can I add significance letters to the bar chart using the multcomp and multcompView packages?
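One possible route to letters for the within-time-point comparisons is a compact letter display. The sketch below is only an illustration under a few assumptions: it uses the emmeans package (the successor of lsmeans) together with multcomp, and it needs multcompView installed for cld() to work. The data frame `d` holds the AGI4120.1_UBQ values copied from the dput above; the names `d`, `fit`, `emm` and `cld_df` are made up for this example.

```r
library(emmeans)   # successor of lsmeans
library(multcomp)  # provides the cld() generic

# Replicate-level values for AGI4120.1_UBQ, copied from the question's dput
d <- data.frame(
  Genotype = rep(c("WT", "mut1"), each = 12),
  Time     = rep(rep(c("0T", "1T", "3T", "5T"), each = 3), times = 2),
  response = c(0.886461437, 1.075140682, 0.632903012,
               0.883151274, 1.005627276, 0.87123469,
               0.723601456, 0.392585237, 0.640185369,
               0.720291294, 0.362131744, 0.549486925,
               1.464415756, 1.01489573,  1.171797418,
               0.927507448, 1.089705396, 0.993048659,
               1.000993049, 1.062562066, 1.037404833,
               0.730883813, 0.480635551, 0.85468388)
)

fit <- aov(response ~ Genotype * Time, data = d)

# Estimated marginal means of Genotype within each Time
emm <- emmeans(fit, ~ Genotype | Time)

# Compact letter display: within a time point, genotypes sharing a letter
# are not significantly different at alpha
cld_df <- as.data.frame(cld(emm, Letters = letters, alpha = 0.05))
cld_df$.group <- trimws(cld_df$.group)  # letters are padded with spaces

cld_df[, c("Genotype", "Time", "emmean", "SE", ".group")]
```

The `.group` column can then be joined to the plotting table by Genotype and Time. Letters are assigned separately inside each time point, so an "a" at 0T is unrelated to an "a" at 5T.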

This is my code for the bar charts:

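library(ggplot2)
# The column is `Genotype` (capital G); fix the order so WT is drawn first
df2$Genotype <- factor(df2$Genotype, levels = c("WT","mut1"))
colours <- c('#336600','#ffcc00')
ggplot(df2, aes(x=Time, y=Response, fill=Genotype)) +
  geom_bar(stat='identity', position='dodge') +
  scale_fill_manual(values=colours) +
  # error bars use the Response and se columns of df2 and are dodged like the bars
  geom_errorbar(aes(ymin=Response-se, ymax=Response+se), position=position_dodge(0.9), width=0.2) +
  facet_wrap(~Gene) +
  labs(x='Time', y='Response')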

Finally, I want to show significance letters on this graph at each time point, as obtained from lsmeans(fit1, pairwise ~ Genotype | Time). Expected graph:
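As a rough sketch of how letters could end up above dodged bars, the example below builds a small summary table (mean, standard error, letter) and draws the letters with geom_text(). It is one possible approach, not necessarily the one behind the expected graph: here the letters come from TukeyHSD() across all eight Genotype x Time groups, converted with multcompView::multcompLetters(). The values are those of AGI570.1_Acin from the dput above; `d`, `summ`, `pd` and `p` are illustrative names.

```r
library(ggplot2)
library(multcompView)

# Replicate-level values for AGI570.1_Acin, copied from the question's dput
d <- data.frame(
  Genotype = factor(rep(c("WT", "mut1"), each = 12), levels = c("WT", "mut1")),
  Time     = rep(rep(c("0T", "1T", "3T", "5T"), each = 3), times = 2),
  response = c(1.093164915, 1.229862834, 1.094003128,
               1.26322126,  0.962729188, 0.968078993,
               0.633890322, 0.534819363, 1.021934772,
               0.589244505, 0.475251717, 0.618177919,
               1.130533457, 1.114915728, 1.399956009,
               1.231911575, 1.256782289, 0.999044465,
               1.103486794, 0.883617224, 0.851875438,
               0.437440083, 0.298762126, 0.614923997)
)
# One level per Genotype x Time cell; "." is used as separator because
# multcompLetters() splits the comparison names on "-"
d$group <- interaction(d$Genotype, d$Time, sep = ".")

# Summary table: one row per bar
summ <- unique(d[, c("Genotype", "Time", "group")])
g <- as.character(summ$group)
summ$mean <- as.numeric(tapply(d$response, d$group, mean)[g])
summ$se   <- as.numeric(tapply(d$response, d$group,
                               function(x) sd(x) / sqrt(length(x)))[g])

# Tukey HSD over all groups, then convert adjusted p-values to letters
tuk <- TukeyHSD(aov(response ~ group, data = d), "group")
lt  <- multcompLetters(tuk$group[, "p adj"], threshold = 0.05)$Letters
summ$letter <- unname(lt[g])

pd <- position_dodge(width = 0.9)
p <- ggplot(summ, aes(x = Time, y = mean, fill = Genotype, group = Genotype)) +
  geom_col(position = pd) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                position = pd, width = 0.2) +
  # letters sit just above the upper end of each error bar
  geom_text(aes(y = mean + se, label = letter), position = pd, vjust = -0.5) +
  scale_fill_manual(values = c(WT = "#336600", mut1 = "#ffcc00")) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  labs(x = "Time", y = "Response", title = "AGI570.1_Acin")
p
```

Setting `group = Genotype` explicitly keeps the text dodged in the same order as the bars; otherwise the discrete `label` aesthetic can change the default grouping. For several genes, the same summary table with a Gene column can be faceted with facet_wrap(~Gene).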

Picture1

I would appreciate your kind help, if possible.

R gene omics • 6.7k views

What are the error messages? When code fails, it prints messages that describe the problem. Without those messages, it's unlikely anyone will be able to help. It may also be helpful to show an example of the data.


Hi Heriche, I've modified the question now.

Thank you for the support. I've been fixing most of the errors since this morning. Would you be able to help me add significance letters to the graph? I cannot figure it out. Any help would be greatly appreciated.


I am not sure what you mean by significance letters. Statistical significance is sometimes indicated by stars or by writing the p-value of the test above the bars. If this is what you're after, have a look at the ggpubr package.


Not necessarily. The multcompView package has an option to give letters instead of stars. Unfortunately, I haven't been able to write the syntax that combines the Tukey output with the plot code.


The ggpubr package has a stat_compare_means function, which may add significance markers (brackets, p-values or asterisks / letters) to the plot. What I didn't like is that it only works when it performs the test itself; it doesn't let the user provide the significance table.

There is also a ggsignif package, but I never used it.


ggpubr uses ggsignif, which I also haven't used. A perhaps less fancy way of putting statistical significance above bars could be to create a data frame with the positions of the labels and use it like this:

plot <- plot + geom_text(data = df.with.pos, label = "*")

You mean something like this?


Yes, these are just done with geom_text() and geom_segment():

geom_segment(aes(x=1, y=83, xend=2, yend=83), size=0.7, data=ScatterMatrix) +
geom_segment(aes(x=1, y=83, xend=1, yend=79.0), size=0.7, data=ScatterMatrix) +
geom_segment(aes(x=2, y=83, xend=2, yend=79.0), size=0.7, data=ScatterMatrix) +
geom_text(x=1.5, y=87, size=3.0, family="mono", label=pvalues1[i]) +

Then, you can also add labels, like this:

library(gridExtra)
library(cowplot)
annot <- ggdraw() +
  draw_label(x=0.25, y=0.6, expression(NB~-~all~metabolites~Kruskal~Wallis~italic(p)~paste("<0.05")), fontface="bold", size=9) +
  draw_label(x=0.25, y=0.5, expression(paste("*,")~italic(p)~paste("< 0.05")), fontface="bold", size=9) +
  draw_label(x=0.25, y=0.45, expression(paste("**,")~italic(p)~paste("< 0.01")), fontface="bold", size=9) +
  draw_label(x=0.25, y=0.40, expression(paste("***,")~italic(p)~paste("< 0.001")), fontface="bold", size=9) +
  draw_label(x=0.25, y=0.35, "NS, not significant", fontface="plain", size=9) +
  draw_label(x=0.25, y=0.30, expression(Quoted~italic(p)~values~from~Kruskal~Wallis~test), fontface="bold", size=9)

plot_grid(annot, labels=c(""), label_size=36, ncol=2, nrow=1)


Thanks Kevin,

I am looking for a bar plot or box plot with letters.

The main issue for me is generating these letters from the Tukey test and adding them to the plots. I'd like to know how to handle this kind of data frame for that.
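For a table with this column naming scheme, one way to handle it could be to reshape to long format and then compute the letters gene by gene. The sketch below is a hedged illustration only: it uses a subset of the question's table (0T and 5T columns, values copied from the dput), tidyr::pivot_longer() to split the column names, and a hypothetical helper `tukey_letters()` that is not part of any package.

```r
library(tidyr)
library(multcompView)

# Subset of the question's table (0T and 5T only), same column naming scheme
wide <- data.frame(
  Gene = c("AGI4120.1_UBQ", "AGI570.1_Acin"),
  WT_Tissue_0T_1   = c(0.886461437, 1.093164915),
  WT_Tissue_0T_2   = c(1.075140682, 1.229862834),
  WT_Tissue_0T_3   = c(0.632903012, 1.094003128),
  WT_Tissue_5T_1   = c(0.720291294, 0.589244505),
  WT_Tissue_5T_2   = c(0.362131744, 0.475251717),
  WT_Tissue_5T_3   = c(0.549486925, 0.618177919),
  mut1_Tissue_0T_1 = c(1.464415756, 1.130533457),
  mut1_Tissue_0T_2 = c(1.01489573,  1.114915728),
  mut1_Tissue_0T_3 = c(1.171797418, 1.399956009),
  mut1_Tissue_5T_1 = c(0.730883813, 0.437440083),
  mut1_Tissue_5T_2 = c(0.480635551, 0.298762126),
  mut1_Tissue_5T_3 = c(0.85468388,  0.614923997)
)

# Split each column name on "_" into its four parts
long <- pivot_longer(wide, cols = -Gene,
                     names_to = c("Genotype", "Tissue", "Time", "Replicate"),
                     names_sep = "_", values_to = "response")

# Hypothetical helper: Tukey letters for the Genotype x Time groups of one gene
tukey_letters <- function(d, alpha = 0.05) {
  d$group <- interaction(d$Genotype, d$Time, sep = ".")
  tuk <- TukeyHSD(aov(response ~ group, data = d), "group")
  lt  <- multcompLetters(tuk$group[, "p adj"], threshold = alpha)$Letters
  data.frame(Gene = d$Gene[1], group = names(lt), letter = unname(lt))
}

# Fit one model per gene and stack the resulting letter tables
letters_by_gene <- do.call(rbind, lapply(split(long, long$Gene), tukey_letters))
rownames(letters_by_gene) <- NULL
letters_by_gene
```

The resulting table has one row per gene and group, so it can be merged with the per-group means and standard errors before plotting with facet_wrap(~Gene).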
