Comparative boxplots with R
4
8.3 years ago
l0ka ▴ 10
sample    NAF           TAF
00450     0.5211098     0.5310629
00450     0.5193542     0.5286942
00450     0.5262824     0.5199457
00450     0.5230269     0.5252758
00450     0.5169092     0.5223112
00160     0.5221299     0.5324644
00160     0.5319794     0.5531024
00160     0.5233437     0.5358770
00160     0.5242215     0.5224607
00160     0.5152723     0.5229491
00810     0.5127049     0.5222062
00810     0.5263669     0.5320754
00810     0.5157763     0.5267149
00810     0.5433680     0.5671679
00810     0.5242678     0.5248383


Hi, I have a big data frame like the one above. I need to draw boxplots comparing NAF with TAF for each sample. There are around 100 different samples, so I will probably have to split the data. How should I do this?

With this code:

boxplot(NAF ~ sample, TAF ~ sample, data=data, las=2, varwidth=T)


It plots only NAF, not TAF...

R • 5.3k views
0

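I think I did this once with lattice graphics. I can't remember the full code, but something along these lines should help:

# melt the data into long format: columns sample, variable (NAF/TAF), value
library(reshape2)
data_long <- melt(data, id.vars="sample")

library(lattice)
# one panel per sample, with the NAF and TAF boxes side by side
bwplot(value ~ variable | as.factor(sample), data=data_long)


Then just play around with the parameters of bwplot.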

3
8.3 years ago
Martombo ★ 3.0k

My suggestion would be to use the packages reshape2 and ggplot2. With those you can easily reassemble the data and plot it in a nicer way than the R default.

You just need the following commands:

library(ggplot2)
library(reshape2)

reshaped_data=melt(data,id="sample")

p=ggplot(reshaped_data,aes(x=as.factor(sample),y=value,fill=variable))

p+geom_boxplot()
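
For anyone who wants to try this without the original file, here is a minimal self-contained sketch of the same melt-then-plot steps. It only uses a few rows copied from the question. Reading `sample` as character via `colClasses` and rotating the x-axis labels are my own assumptions (they keep the leading zeros and help with ~100 samples), not part of the answer above.

```r
library(ggplot2)
library(reshape2)

# A few rows copied from the question. colClasses keeps IDs such as
# "00450" as text, so the leading zeros are not dropped (assumption:
# the IDs should be treated as labels, not numbers).
data <- read.table(text = "
sample NAF TAF
00450 0.5211098 0.5310629
00450 0.5193542 0.5286942
00160 0.5221299 0.5324644
00160 0.5319794 0.5531024
00810 0.5127049 0.5222062
00810 0.5263669 0.5320754
", header = TRUE, colClasses = c("character", "numeric", "numeric"))

# Long format: one row per measurement, with columns
# sample, variable (NAF or TAF) and value.
reshaped_data <- melt(data, id.vars = "sample")
head(reshaped_data, 3)

# One pair of boxes (NAF, TAF) per sample; labels rotated for readability.
p <- ggplot(reshaped_data, aes(x = sample, y = value, fill = variable)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
print(p)
```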

2
8.3 years ago

There are many ways to do this.

For example, with ggplot2:

library(ggplot2)
ggplot(d) +
  geom_boxplot(aes(x='NAF', y=NAF)) +
  geom_boxplot(aes(x='TAF', y=TAF)) +
  facet_wrap(~sample, ncol=2) +
  theme_bw() +
  scale_x_discrete('x axis label') +
  scale_y_continuous('value')


An alternative is to reshape your dataset in a 'long' format:

library(reshape2)
d.long = melt(d, id.vars='sample')
ggplot(d.long, aes(x=variable, y=value)) + geom_boxplot()
# you can even use lattice
library(lattice)
bwplot(value~variable|sample, d.long, layout=c(2,2))


You can use the ncol and nrow parameters of facet_wrap (or the layout argument if you use lattice) to adjust the number of panels displayed in each row/column. If you have a lot of samples, the best thing to do is to add another column that classifies the samples into smaller groups, and then plot one page per group.
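
The grouping idea in the last sentence could be sketched roughly as follows. This is only an illustration under stated assumptions: the `page` column, the `samples_per_page` value and the output file name `boxplots_by_sample.pdf` are made up for the example, and the toy `d.long` is built from a few rows of the question with base R's `stack()` rather than `melt()`, so that only lattice is needed.

```r
library(lattice)

# Toy input: a few rows from the question (IDs kept as character).
d <- data.frame(
  sample = rep(c("00450", "00160", "00810"), each = 2),
  NAF = c(0.5211098, 0.5193542, 0.5221299, 0.5319794, 0.5127049, 0.5263669),
  TAF = c(0.5310629, 0.5286942, 0.5324644, 0.5531024, 0.5222062, 0.5320754)
)

# Long format with base R: stack() returns the columns `values` and `ind`,
# with all NAF rows first and all TAF rows second.
stacked <- stack(d[, c("NAF", "TAF")])
d.long <- data.frame(sample   = rep(d$sample, times = 2),
                     variable = stacked$ind,
                     value    = stacked$values)

# Hypothetical page size: how many samples to draw on each page.
samples_per_page <- 2
ids <- unique(d.long$sample)
page_of <- ceiling(seq_along(ids) / samples_per_page)
d.long$page <- page_of[match(d.long$sample, ids)]

# One PDF page per group of samples; lattice plots must be print()ed
# explicitly inside a loop.
pdf("boxplots_by_sample.pdf")
for (p in sort(unique(d.long$page))) {
  chunk <- d.long[d.long$page == p, ]
  chunk$sample <- factor(chunk$sample)  # keep only the samples on this page
  print(bwplot(value ~ variable | sample, data = chunk))
}
dev.off()
```

With around 100 samples, a larger `samples_per_page` (say 6 or 8) together with a matching `layout` would probably be more practical.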

0

heh I was 2 mins late...

0

You posted while I was writing :)

My comment is useless now :)

0

Sorry, next time you will be faster than me ;-)

0

Thanks, it works very well!

1
8.3 years ago
Siva ★ 1.8k

You can try BoxPlotR which generates side-by-side box plots. This tool can be run online or locally.

1
8.3 years ago
EagleEye 7.4k

I hope this will work:

input <- read.table('mydata.txt', sep='\t', header=TRUE)
# stack the NAF and TAF columns into one vector, with a matching label for each value
values <- c(input[,2], input[,3])
vars <- rep(c("NAF","TAF"), times = c(length(input[,2]), length(input[,3])))
mydata_frame <- data.frame(values = values, vars = vars)
# note: this pools all samples into one NAF box and one TAF box
boxplot(values ~ vars, data = mydata_frame)
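
If per-sample boxes are wanted with base graphics only, one possible extension (a sketch, not the code from this answer) is to keep the sample column and use a two-factor formula. `boxplot()` accepts a single formula, so `values ~ vars + sample` produces one box per NAF/TAF and sample combination. The toy `input` below is a few rows from the question, and the grey fill colours are arbitrary choices.

```r
# Toy input: a few rows from the question (IDs kept as character).
input <- data.frame(
  sample = rep(c("00450", "00160", "00810"), each = 2),
  NAF = c(0.5211098, 0.5193542, 0.5221299, 0.5319794, 0.5127049, 0.5263669),
  TAF = c(0.5310629, 0.5286942, 0.5324644, 0.5531024, 0.5222062, 0.5320754)
)

# Long format that keeps the sample label next to each value.
mydata_frame <- data.frame(
  sample = rep(input$sample, times = 2),
  vars   = rep(c("NAF", "TAF"), each = nrow(input)),
  values = c(input$NAF, input$TAF)
)

# One box per (vars, sample) combination; the NAF and TAF boxes of the
# same sample are drawn next to each other.
bp <- boxplot(values ~ vars + sample, data = mydata_frame,
              las = 2, col = c("grey80", "grey40"),
              xlab = "", ylab = "value")
legend("topleft", legend = c("NAF", "TAF"), fill = c("grey80", "grey40"))
```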