Doubts with Stacked barplot using R ggplot2
8 weeks ago

Hello! I am trying to plot some data in R but I am having some problems. I have a data frame with two columns (gene and type of mutation) that looks like this:

genes   variant
MLH1    Intronic
ATR     5' UTR
TP53    missense
KRAS    Silent
POLE    Intronic
MLH1    missense
BRAF    Intronic
ATRX    Silent

So I would like to graph the 25 most mutated genes in a stacked barplot, but the closest I get is this:

I am using the ggplot2 library and my code is this:

ggplot(Genes_variantclassification, aes(x = genes, y = 1)) +
  geom_col(aes(fill = variant), width = 0.7) +
  theme(axis.text.x = element_text(angle = 90, size = rel(0.2)))

I know this is probably a very basic question, but I am a real novice with R, so I would really appreciate your guidance.

8 weeks ago

This will keep the top 25 genes by number of mutations. You may need to alter the code a bit to fit your data, since I had to guess what it looked like based on your description.

library("tidyverse")

top25 <- Genes_variantclassification %>%
  nest(grp = variant) %>%
  mutate(n_mutations = map_dbl(grp, nrow)) %>%
  slice_max(n_mutations, n = 25) %>%
  unnest(grp)
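For comparison, here is a minimal sketch of the same filtering step done with count() and a join instead of nest()/unnest(). The toy data frame, its values, and the choice of n = 2 are made up for illustration. Note that slice_max() keeps ties by default, so it can return more than n genes when counts are tied; with_ties = FALSE is one way to force exactly n.

```r
library(dplyr)

# Toy data with the same two columns described in the question (values are invented).
toy <- data.frame(
  genes   = c("MLH1", "ATR", "TP53", "KRAS", "POLE",
              "MLH1", "BRAF", "ATRX", "TP53", "MLH1"),
  variant = c("Intronic", "5' UTR", "missense", "Silent", "Intronic",
              "missense", "Intronic", "Silent", "Silent", "Intronic"),
  stringsAsFactors = FALSE
)

# Count rows (mutations) per gene and keep the n most mutated genes.
# with_ties = FALSE returns exactly n genes even when counts are tied.
top_genes <- toy %>%
  count(genes, name = "n_mutations") %>%
  slice_max(n_mutations, n = 2, with_ties = FALSE)

# Keep only the original rows that belong to those genes,
# carrying n_mutations along for later reordering.
top_toy <- inner_join(toy, top_genes, by = "genes")
```

With real data, replace toy with Genes_variantclassification and n = 2 with n = 25.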


You can add this line if you want the genes to be in descending order on the plot by mutation number.

top25 <- mutate(top25, genes=fct_reorder(genes, n_mutations, .fun=unique, .desc=TRUE))
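Once the data are filtered, the stacked bar plot itself could be drawn roughly as below. This is a sketch, not a tested recipe for your data: top25_demo is a hypothetical stand-in for top25, and the axis labels are my own choice. Since there is one row per mutation, geom_bar() can count the rows itself, so the dummy y = 1 is not needed.

```r
library(ggplot2)

# Hypothetical stand-in for `top25`: one row per mutation.
# The factor levels fix the order of the genes on the x axis.
top25_demo <- data.frame(
  genes   = factor(c("MLH1", "MLH1", "MLH1", "TP53", "TP53", "KRAS"),
                   levels = c("MLH1", "TP53", "KRAS")),
  variant = c("Intronic", "missense", "Intronic", "missense", "Silent", "Silent")
)

# geom_bar() counts rows per gene and stacks the counts by `fill`.
p <- ggplot(top25_demo, aes(x = genes, fill = variant)) +
  geom_bar(width = 0.7) +
  labs(x = "Gene", y = "Number of mutations", fill = "Variant type") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

print(p)
```

Leaving out size = rel(0.2) keeps the gene labels readable; with only 25 genes on the axis, the rotated labels should fit at the default size.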

Many, many thanks! It worked perfectly!