R: Plotting several variables using ggplot geom_bar
3.8 years ago

Hi there,

I'm currently struggling with ggplot2's geom_bar.

To describe the data quickly: we've computed the proportion of several viruses in different kinds of samples (column sample). Three different protocols were used (column proto), each in triplicate (column dupli).

I would like to plot the proportion of each virus in each assay, depending on the protocol used, and separate the results for each kind of sample.

Here is an extract from my data:

   sample proto dupli virus               prop
1  HSV    E     S3    Mastadenovirus      0.00000770
2  HSV    E     S3    Orthopneumovirus    0
3  HSV    E     S3    Simplexvirus        0.996
4  HSV    E     S3    Alphainfluenzavirus 0
5  VRS    E     S3    Enterovirus         0
6  HSV    E     S3    Dependoparvovirus   0
7  HSV    E     S3    Levivirus           0.0000847
8  HSV    E     S3    others              0.00373
9  HSV    E     S10   Mastadenovirus      0.0000136
10 HSV    E     S10   Orthopneumovirus    0
11 MOCK   E     S3    Levivirus           0.0000847

I got the start of an answer on Stack Overflow, which produces an incomplete plot like this:

ggplot(subtable2, aes(fill=virus,x=proto, y=prop))+ 
geom_bar(position="stack", stat="identity")+ 
facet_grid(.~sample)

current result

How can I improve my code to show the proportions for each member of "dupli", and get something like this?

final plot

I tried to create a new variable with:

subtable2$sampledupli<-paste0(subtable2$sample,"_",subtable2$dupli)

But I'm not sure this is the right way. Do you have any pointers on how to do this, or should I rework my data and visualisation?

Thanks, have a nice weekend

R ggplot2 • 2.7k views

Hello! I hope you're keeping well.

What happens when you use sampledupli as your faceting variable?

I'm trying to generate a toy dataset that I can work with to help you with this problem, but I'm having trouble figuring out what prop is referring to exactly. Am I correct in assuming that it is the proportion of a given virus for that particular sample, protocol, and duplicate?

Thanks!


Thanks for the answer!

prop is the viral proportion within each replicate. If you sum prop for each one:

    aggregate(subtable2$prop, list(subtable2$sampledupli), function(x) sum(x))
    Group.1 x
    1   HSV_S10 1
    2   HSV_S17 1
    3   HSV_S24 1
    4    HSV_S3 1
    5   HSV_S31 1
    6   HSV_S38 1
    7   HSV_S44 1
    8   HSV_S49 1
    9   HSV_S54 1
   10  MOCK_S1 1
   11 MOCK_S15 1
   12 MOCK_S22 1
   13 MOCK_S29 1
   14 MOCK_S36 1
   15 MOCK_S43 1
   16 MOCK_S48 1
   17 MOCK_S53 1
   18  MOCK_S8 1
   19  VRS_S12 1
   20  VRS_S19 1
   21  VRS_S26 1
   22  VRS_S33 1
   23  VRS_S40 1
   24  VRS_S45 1
   25   VRS_S5 1
   26  VRS_S50 1
   27  VRS_S55 1
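
Since the printed sums are rounded, the same check can be done with a tolerance. This is only a sketch: the toy data frame below is a made-up stand-in for subtable2, and the names toy, totals and all_sum_to_one are not from the thread.

```r
# Toy stand-in for subtable2; the values are made up for illustration.
toy <- data.frame(
  sampledupli = rep(c("HSV_S3", "HSV_S10"), each = 3),
  prop        = c(0.996, 0.00373, 0.00027, 0.5, 0.25, 0.25)
)

# Sum of prop per replicate, as a named numeric vector.
totals <- tapply(toy$prop, toy$sampledupli, sum)

# TRUE when every replicate sums to 1 within a small tolerance.
all_sum_to_one <- all(abs(totals - 1) < 1e-8)
```

If all_sum_to_one is FALSE on the real table, some replicate is missing rows or has extra ones, which would also distort a stacked bar plot.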

plot with sample dupli

3.8 years ago
aaragak1 ▴ 40

I've tried something like this - let me know if this is what you were going for

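library(tidyverse)
set.seed(1)
# Toy data: 900 randomly drawn virus labels, then one row per
# sample/proto/dupli/virus combination holding its count
reprex <-
    tibble(sample = rep(c("HSV", "MOCK", "VRS"), each = 300),
           proto = rep(c("E", "M", "Q"), each = 100, times = 3),
           dupli = rep(1:3, times = 300),
           virus = sample(LETTERS[1:7], 900, replace = TRUE)) %>%
    # count() collapses duplicate rows; group_by() + mutate(n()) would keep
    # them, so each count would be stacked n times in the bar
    count(sample, proto, dupli, virus, name = "num_virus")

# position = "fill" rescales every bar to proportions summing to 1
ggplot(reprex, aes(fill = virus, y = num_virus, x = dupli)) +
    geom_bar(position = "fill", stat = "identity") +
    facet_grid(cols = vars(sample, proto))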

plot


Well, this is the kind of plot I'm trying to produce. Thanks! However, I'm getting this: my results

I'm guessing that this awful output is caused by my dupli variable. Do you know how I can recode it to get 1/2/3 for each sample, like yours, instead of SX? The number itself doesn't matter (it was just an experimental trial number).


Off the top of my head, you might be able to use dplyr's mutate + case_when, or perhaps as.factor? I'm not entirely sure what the current levels of the column are right now.
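
One way this could look with dplyr, as a rough sketch rather than a tested solution for the real table: it uses match() against unique() instead of case_when, so the S-labels don't have to be listed by hand. The toy tibble, the assignment of S-numbers to protocols, and the new column name replicate are all assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the asker's table; which S-number belongs to which
# protocol is made up here.
toy <- tibble(
  sample = rep("HSV", 12),
  proto  = rep(c("E", "M"), each = 6),
  dupli  = rep(c("S3", "S10", "S17", "S24", "S31", "S38"), each = 2),
  virus  = rep(c("Simplexvirus", "others"), times = 6)
)

recoded <- toy %>%
  group_by(sample, proto) %>%
  # match() against unique() numbers the labels 1, 2, 3 in order of
  # appearance within each sample/protocol group
  mutate(replicate = match(dupli, unique(dupli))) %>%
  ungroup()
```

Here recoded$replicate is 1, 1, 2, 2, 3, 3 within each protocol, and it can be used as x in place of dupli.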


I'm not an expert in the tidyverse, so I just tried a dirty method: I added a c(1,2,3......1,2,3) vector to my table and, thanks to your code, it worked perfectly! final plot
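
For anyone who wants to avoid typing the vector by hand, here is a base R sketch of the same idea that does not need the rows to be sorted in a repeating 1, 2, 3 pattern. The toy data frame reuses the question's column names with made-up values, and replicate is a hypothetical column name.

```r
library(ggplot2)

# Toy stand-in for subtable2: same column names as the question, made-up values.
toy <- data.frame(
  sample = "HSV",
  proto  = rep(c("E", "M"), each = 6),
  dupli  = rep(c("S3", "S10", "S17", "S24", "S31", "S38"), each = 2),
  virus  = rep(c("Simplexvirus", "others"), times = 6),
  prop   = rep(c(0.9, 0.1), times = 6)
)

# Number the run labels 1..3 within each sample/protocol group.
# ave() hands FUN the row indices of one group at a time.
toy$replicate <- ave(
  seq_along(toy$dupli), toy$sample, toy$proto,
  FUN = function(i) match(toy$dupli[i], unique(toy$dupli[i]))
)

# prop already sums to 1 per replicate, so plain stacking is enough.
p <- ggplot(toy, aes(x = factor(replicate), y = prop, fill = virus)) +
  geom_col() +                          # same as geom_bar(stat = "identity")
  facet_grid(cols = vars(sample, proto)) +
  labs(x = "replicate")
```

Printing p draws one panel per sample/protocol pair with three stacked bars each.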

Thank you all for your answers!


I think not all of your samples have prop values in every group. Try making the scales independent with the scales option of facet_grid (for example scales = "free_x").
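
A small sketch of that suggestion, assuming the problem is that each sample has its own S-labels on the x-axis, so every panel reserves room for labels it never uses. The data below are made up for illustration.

```r
library(ggplot2)

# Made-up data in which the two samples use different run labels.
toy <- data.frame(
  sample = rep(c("HSV", "MOCK"), each = 4),
  dupli  = rep(c("S3", "S10", "S1", "S8"), each = 2),
  virus  = rep(c("Simplexvirus", "others"), times = 4),
  prop   = rep(c(0.9, 0.1), times = 4)
)

p <- ggplot(toy, aes(x = dupli, y = prop, fill = virus)) +
  geom_col() +
  # scales = "free_x" lets each panel show only the x values it has data for;
  # space = "free_x" sizes panels by their number of x values, so bars
  # keep the same width everywhere.
  facet_grid(cols = vars(sample), scales = "free_x", space = "free_x")
```

Without the scales argument, both panels would show all four labels with two empty slots each.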
