boxplot using ggplot2 in R
5 months ago
raavi21198 ▴ 10

Hello members!!

I have raw data consisting of 45 samples and their intensities. This is microarray expression data. I have converted it into a data frame. However, I am confused about how to plot a boxplot of all 45 samples and also group them as "normal" and "tumor". Please help me out with this. The code I used is as follows:

read_data <- ReadAffy() ##read the raw .CEL files

ph$sample
ph@data
ph@data[,1]=c("NB","ND","TB","NB","ND","TB","TC","NB","ND","TB","TC","NB","ND","TB","TC","NB","ND","TB","TC","NB","ND","TB","NB","ND","TB","TC","NB","ND","TB","TC","NB","ND","TB","TC","NB","ND","TB","TC","NB","ND","TB","TC","NB","ND","TB")
sampleNames=vector()
logs=vector()
for (i in 1:45)
{
  sampleNames=c(sampleNames,rep(ph@data[i,1],dim(pmexp)[1]))
  logs=c(logs,log2(pmexp[,i]))
}
logdata <- data.frame(logint=logs,sampleName=sampleNames)

The structure of this data frame is as follows:

> str(logdata)
'data.frame': 11155455 obs. of 2 variables:
 $ logint    : num  8.79 9.74 11.09 12.38 12.36 ...
 $ sampleName: chr  "NB" "NB" "NB" "NB" ...
1  8.791163         NB
2  9.736402         NB
3 11.091435         NB
4 12.376125         NB
5 12.363587         NB
6 11.574594         NB
> p+geom_boxplot()


Can someone please guide me on how to create a boxplot of these 45 samples using ggplot2 in R, grouped as normal and tumor samples? The above code gives me a boxplot of only four samples; I need to plot them all together.

Thank you

R boxplot ggplot2 • 440 views

In the absence of data, I suggest the following:

1. Convert the data frame from wide format to long format (dplyr/tidyr).
2. Attach grouping information for each sample (dplyr).
3. Draw the box plot (ggplot2).
4. Facet by group (ggplot2).

Instead of a boxplot, consider using a violin plot with jitter.
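
The four steps above can be sketched roughly as follows. This is only an illustration on simulated data: the object names `wide`, `groups` and `long`, the sample names `S1` to `S6`, and the normal/tumor assignment are all made up for the example and do not come from the original post.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)

set.seed(1)
# Hypothetical wide table: 100 probes (rows) x 6 samples (columns)
wide <- as.data.frame(matrix(rnorm(600, mean = 8), nrow = 100,
                             dimnames = list(NULL, paste0("S", 1:6))))

# Hypothetical annotation table: one row per sample with its group
groups <- tibble(sample = paste0("S", 1:6),
                 group  = rep(c("normal", "tumor"), each = 3))

long <- wide %>%
  # step 1: wide -> long, one row per probe/sample pair
  pivot_longer(everything(), names_to = "sample", values_to = "logint") %>%
  # step 2: attach the group of each sample
  inner_join(groups, by = "sample")

p <- ggplot(long, aes(sample, logint, fill = group)) +
  geom_boxplot() +                       # step 3: one box per sample
  facet_wrap(~group, scales = "free_x")  # step 4: one panel per group
# For the violin variant, geom_boxplot() could be swapped for
# geom_violin() + geom_jitter(width = 0.1, size = 0.3, alpha = 0.3)
print(p)
```

With millions of probe intensities, the jitter layer may be slow to draw, so a subsample of the rows might be preferable for that variant.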


Thank you for your response. I have edited the post to include the data. Could you now let me know where I am going wrong?
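
One likely reason for the four boxes: in the posted code, `sampleName` holds only the four labels NB, ND, TB and TC, so ggplot2 draws one box per label rather than one per array. A possible fix is to keep a unique ID per array and store the group in a separate column. The sketch below uses a small simulated matrix in place of `pmexp`; the names `pmexp_toy`, `labels`, `sampleID` and `group` are made up for the illustration, and reading NB/ND as normal and TB/TC as tumor is an assumption, not something stated in the post.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for pmexp: 50 probes (rows) x 8 arrays (columns)
pmexp_toy <- matrix(2^rnorm(400, mean = 10), nrow = 50)
# First 8 labels from the question, one per array
labels <- c("NB", "ND", "TB", "NB", "ND", "TB", "TC", "NB")

n_probes <- nrow(pmexp_toy)
logdata_toy <- data.frame(
  # as.vector() stacks the columns in order: array 1, array 2, ...
  logint     = as.vector(log2(pmexp_toy)),
  sampleName = rep(labels, each = n_probes),
  # unique ID per array, so every array gets its own box
  sampleID   = rep(sprintf("%02d_%s", seq_along(labels), labels), each = n_probes)
)
# Assumption: labels starting with "N" are normal, the rest are tumor
logdata_toy$group <- ifelse(substr(logdata_toy$sampleName, 1, 1) == "N",
                            "normal", "tumor")

p <- ggplot(logdata_toy, aes(sampleID, logint, fill = group)) +
  geom_boxplot() +
  facet_wrap(~group, scales = "free_x")
print(p)
```

Building the columns with `rep()` and `as.vector()` also avoids growing two vectors inside a loop, which can get slow with roughly 11 million rows.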

5 months ago

Here is an example I built from https://bioconductor.org/packages/devel/workflows/vignettes/arrays/inst/doc/arrays.html

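library(affy)   # Affymetrix pre-processing
library(limma)  # two-color pre-processing; differential expression
# phenoData was not defined in the original snippet; this line follows the linked workflow
phenoData <- read.AnnotatedDataFrame(system.file("extdata", "pdata.txt", package="arrays"))
celfiles <- system.file("extdata", package="arrays")
eset <- justRMA(phenoData=phenoData,celfile.path=celfiles)  # RMA normalization
df=as.data.frame(exprs(eset))  # expression values: probes x arrays
pdata=pData(eset)              # per-array annotation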

library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)

df %>%
  pivot_longer(everything(), names_to = "cels", values_to = "vals") %>%
  inner_join(., rownames_to_column(pdata), by = c("cels" = "rowname")) %>%
  ggplot(., aes(cels, vals, fill = Sensitivity)) +
  geom_boxplot() +
  facet_wrap(~IVT, scales = "free") +
  xlab("") +
  ylab("") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90),
        axis.text = element_text(size = 18),
        strip.text = element_text(size = 18),
        legend.text = element_text(size = 18),
        legend.title = element_text(size = 18)
  )



Code suggestions:

a) Use theme_set() to both define a theme and set a base size for all relevant parts (axes, strips, labels) in a single command; that saves the multiple size arguments in theme().

b) Rotate the x-axis labels with guides() rather than with angle in theme(), since guide_axis() keeps the labels properly aligned horizontally and vertically even at angles such as 45°.

c) Put the legend on top so its large size does not shrink the plot itself. Again, the sizes of all fonts and labels are auto-adjusted to look decent based on the base_size given in the theme_set() command at the top.

theme_set(theme_bw(base_size = 15))
df %>%
  pivot_longer(everything(), names_to = "cels", values_to = "vals") %>%
  inner_join(., rownames_to_column(pdata), by = c("cels" = "rowname")) %>%
  ggplot(., aes(cels, vals, fill = Sensitivity)) +
  geom_boxplot() +
  facet_wrap(~IVT, scales = "free") +
  xlab("") +
  ylab("") +
  guides(x = guide_axis(angle = 45)) +
  theme(legend.position = "top")


By the way, the code example you use requires the arrays package to be installed in order to access its extdata; it can be installed with BiocManager::install("arrays").


Thank you so much for providing this example.