Box plot using ggplot2
7.8 years ago
1769mkc ★ 1.2k

I can make a box plot with the base R boxplot function, but I want to do the same with ggplot2, and it is not as simple as with the base function.

This is my data frame

 Gene     `7256_Mono` `7653_Mono` `6792_Mono` `6792_HSC` `7653_HSC` `7256_HSC`
 DNMT3A   9.8143600   5.5681400  3.32429000  7.7649600  7.8304600 21.9411000 
 DNMT3B   0.1053070   0.0826214  0.20910300  2.3988700  3.3696100  3.9184700

This is a subset of my data set. How do I coerce this data frame into the format ggplot2 expects? I have read tutorials, but they are quite confusing. As I understand it, ggplot2 needs factors for grouping, so how do I make factors out of this data set so that I can use it with ggplot2 and make a box plot? Any help or suggestions would be highly appreciated.

R • 3.8k views

Take a look at tidy data, the tidyverse, and perhaps this older answer.


Oh! I just noticed that you asked the question in the answer I mentioned! OK, I believe the solution here will be similar. If you run into trouble, try giving us the actual code you used (applied to the example data you provided).


Okay, I will use the same approach. I suppose I can reshape the data into the form ggplot2 needs. I will give it a try first and see whether I get the same result.

7.8 years ago
willgilks ▴ 360

Hi krushnach80, I'm not sure how you want to group the variables, but below is my solution.

> head(dat)
    Gene X.7256_Mono. X.7653_Mono. X.6792_Mono. X.6792_HSC. X.7653_HSC. X.7256_HSC.
1 DNMT3A     9.814360    5.5681400     3.324290     7.76496     7.83046    21.94110
2 DNMT3B     0.105307    0.0826214     0.209103     2.39887     3.36961     3.91847

## Libraries. All three are also attached by library(tidyverse).
library(tidyr)
library(dplyr)
library(ggplot2)

## Format the data as 'long' using the %>% pipe (re-exported by dplyr) and tidyr gather. Every column except Gene is gathered; Gene is kept as an identifier.
mdat = dat %>% gather(key = sample, value = val, -Gene)
head(mdat)
    Gene       sample       val
1 DNMT3A X.7256_Mono. 9.8143600
2 DNMT3B X.7256_Mono. 0.1053070
3 DNMT3A X.7653_Mono. 5.5681400
4 DNMT3B X.7653_Mono. 0.0826214
5 DNMT3A X.6792_Mono. 3.3242900
6 DNMT3B X.6792_Mono. 0.2091030

## Make boxplot with scatter overlay because boxplots can be misleading.
## Grouping factor is gene. Also try beeswarm-type plots.
ggplot(mdat, aes(Gene, val)) +
  geom_boxplot(colour = "grey", fill = "light grey") +
  theme_bw() +
  geom_jitter(width = 0.2, alpha = .9, shape = 2)
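
If the aim is instead to compare the two sample groups visible in the column names (Mono vs HSC), one possible variation is sketched below. It is only a sketch under assumptions: it rebuilds the two example rows from the question by hand, uses tidyr's pivot_longer() (the newer replacement for gather(), tidyr 1.0.0 or later) and assumes that each column name is an ID and a cell type joined by an underscore. The names sample_id, cell_type and long_dat are made up for illustration and are not from the original post.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Example rows copied from the question; check.names = FALSE keeps the
# sample names as written instead of converting them to X.7256_Mono. etc.
dat <- data.frame(
  Gene = c("DNMT3A", "DNMT3B"),
  `7256_Mono` = c(9.8143600, 0.1053070),
  `7653_Mono` = c(5.5681400, 0.0826214),
  `6792_Mono` = c(3.32429000, 0.20910300),
  `6792_HSC`  = c(7.7649600, 2.3988700),
  `7653_HSC`  = c(7.8304600, 3.3696100),
  `7256_HSC`  = c(21.9411000, 3.9184700),
  check.names = FALSE
)

# Reshape to long format and split each column name at the underscore:
# "7256_Mono" becomes sample_id "7256" and cell_type "Mono" (assumed layout).
long_dat <- dat %>%
  pivot_longer(
    cols = -Gene,
    names_to = c("sample_id", "cell_type"),
    names_sep = "_",
    values_to = "val"
  ) %>%
  # Explicit factor so the order of the groups on the x axis is controlled.
  mutate(cell_type = factor(cell_type, levels = c("HSC", "Mono")))

# One panel per gene, one box per cell type, raw points overlaid.
p <- ggplot(long_dat, aes(x = cell_type, y = val)) +
  geom_boxplot(colour = "grey", fill = "light grey") +
  geom_jitter(width = 0.2, alpha = 0.9, shape = 2) +
  facet_wrap(~Gene, scales = "free_y") +
  theme_bw()
print(p)
```

With only three values per group, the overlaid points are probably more informative than the boxes themselves. Changing the levels argument of factor() changes the left-to-right order of the boxes.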