R ggplot stacked histogram - how to plot the small proportion on top (or color last) so the color is seen?
20 months ago
Bianca ▴ 20

I am plotting a stacked histogram, but I cannot see the color of the small value.

My dataframe is:

plod <- data.frame(
        sample = c("sample1", "sample2", "sample3", "sample4", "sample5", "sample6"),
        wins = c(30000, 50000, 700000, 900000, 1020000, 1200000),
        ties = c(3, 5, 6, 9, 12, 20),
        location = c("CA", "CA", "CA", "NJ", "NY", "NY"))

I want to plot the samples on x and the total (wins plus ties) on y. Each bar should have a blue bottom segment for the number of wins and a yellow top segment for the number of ties.

I am using the code below and it seems to work well: it plots the bars. However, the yellow portion (ties) is so small that I cannot see it in the plot. I only see blue. Is there a way to plot the ties (yellow) on top of the blue so I can see them even when they are small?

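library(dplyr)
library(tidyr)
library(ggplot2)

# drop the location column (4th) and reshape wins/ties into long format
df <- plod[,-4] %>%
        pivot_longer(-sample, names_to = "games")

ggplot(df, aes(x = sample, y = value, fill = games)) +
        geom_col() +
        labs(x = "samples", y = "total games") +
        scale_fill_manual(values = c(wins = "blue", ties = "yellow")) +
        # upper limit left as NA: max(df$value) is lower than the tallest stacked bar (wins + ties)
        scale_y_continuous(expand = c(0, 0), limits = c(0, NA)) +
        theme_bw()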

ggplot2 R dplyr • 942 views
20 months ago

Technically, the number of ties is dwarfed by the number of wins in your data, so the invisible yellow segment is an accurate representation of the underlying counts. You should therefore start by asking yourself what message the plot is supposed to convey.

If your audience has a scientific background, transforming the scale (e.g. to log) might work to show both fractions properly. For an audience that is not used to interpreting log-transformed scales, however, you would produce a misleading visual representation of the data, so it might be better to stick to the identity scale and add a geom_label_repel() (from the ggrepel package) to highlight that the ties are plotted, just too small to be visible.
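Both options could be sketched roughly as follows. This is only an illustration built on the data from the question: the names `games_df`, `long_df`, `fill_cols`, `p_log` and `p_label` are made up here, and the ggrepel package is assumed to be installed. The log-scale variant dodges the bars instead of stacking them, which is a deliberate swap, since stacked heights do not add up meaningfully on a log axis.

```r
library(ggplot2)
library(tidyr)
library(ggrepel)  # assumed to be installed; provides geom_label_repel()

# Data as given in the question (object names are illustrative)
games_df <- data.frame(
  sample = c("sample1", "sample2", "sample3", "sample4", "sample5", "sample6"),
  wins   = c(30000, 50000, 700000, 900000, 1020000, 1200000),
  ties   = c(3, 5, 6, 9, 12, 20)
)
long_df <- pivot_longer(games_df, -sample, names_to = "games")
fill_cols <- c(wins = "blue", ties = "yellow")

# Option 1: log10 y axis with dodged (side-by-side) bars.
# On a log axis the bars start at 1 (10^0), so bar length is no longer
# proportional to the count.
p_log <- ggplot(long_df, aes(x = sample, y = value, fill = games)) +
  geom_col(position = "dodge") +
  scale_y_log10() +
  scale_fill_manual(values = fill_cols) +
  labs(x = "samples", y = "games (log10 scale)") +
  theme_bw()

# Option 2: identity scale, stacked bars, plus one label per sample
# stating the tie count, anchored at the top of the bar.
games_df$total <- games_df$wins + games_df$ties
p_label <- ggplot(long_df, aes(x = sample, y = value)) +
  geom_col(aes(fill = games)) +
  geom_label_repel(
    data = games_df,
    aes(x = sample, y = total, label = paste("ties:", ties))
  ) +
  scale_fill_manual(values = fill_cols) +
  labs(x = "samples", y = "total games") +
  theme_bw()
```

Printing `p_log` or `p_label` draws the respective plot. Which one is appropriate depends on the audience, as discussed above.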

Also, do think beyond mere visual aids. If your focus is on showing differences between locations, directly plotting the ratio of ties/wins for each location instead of the raw counts conveys your message better.
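A minimal sketch of the ratio approach, again using the data from the question. It assumes that samples are pooled per location (sum of ties divided by sum of wins) and expresses the result as ties per million wins; both choices, and the names `games_df`, `ratio_df` and `p_ratio`, are assumptions made for illustration.

```r
library(dplyr)
library(ggplot2)

# Data as given in the question (object names are illustrative)
games_df <- data.frame(
  sample   = c("sample1", "sample2", "sample3", "sample4", "sample5", "sample6"),
  wins     = c(30000, 50000, 700000, 900000, 1020000, 1200000),
  ties     = c(3, 5, 6, 9, 12, 20),
  location = c("CA", "CA", "CA", "NJ", "NY", "NY")
)

# Pool wins and ties per location, then compute ties per million wins
ratio_df <- games_df %>%
  group_by(location) %>%
  summarise(wins = sum(wins), ties = sum(ties), .groups = "drop") %>%
  mutate(ties_per_million_wins = 1e6 * ties / wins)

# One bar per location showing the ratio instead of the raw counts
p_ratio <- ggplot(ratio_df, aes(x = location, y = ties_per_million_wins)) +
  geom_col(fill = "yellow", colour = "black") +
  labs(x = "location", y = "ties per million wins") +
  theme_bw()
```

If per-sample variation matters, the ratio could instead be computed per sample and shown as points grouped by location.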


Hi Matthias Zepper! That was great advice, thanks!
