Question: How to reproduce a stacked bar chart in R
8 weeks ago by
ISPA
deathmagnetic20 wrote:

Hi,

I have used the analytical tool CIBERSORTx to impute gene expression profiles and estimate the abundances of member cell types in a mixed cell population, using RNA-seq and TCGA data. The output table has TCGA barcodes in rows and cell types in columns. I would like to generate a stacked bar chart in R, like this one:

[image: example stacked bar chart]

Any idea?

Thanks!

rna-seq R • 166 views
modified 8 weeks ago by bioinformatics2020 • written 8 weeks ago by deathmagnetic20

Can you repost the image? It's not showing up.

written 8 weeks ago by bioinformatics2020

Reposted the image, sorry!

written 8 weeks ago by deathmagnetic20
8 weeks ago by
bioinformatics2020 wrote:

Here, data is a data.frame with the TCGA barcodes in a column called "barcodes", the relative percentages in a column called "percent", and the cell types in a column called "cell_types".

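# Install the tidyverse (which includes ggplot2) if it is missing
if (!require("tidyverse")) install.packages("tidyverse")
library(ggplot2)
# One bar per barcode, stacked by cell type; stat = "identity" plots the values as given
ggplot(data, mapping = aes(x = barcodes, y = percent, fill = cell_types)) +
    geom_bar(position = "stack", stat = "identity")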

EDIT: In case you want to change the colors of the cell types, you can add scale_fill_manual, with labels being the names of the cell types and values being the color hex codes you want (the "#000000" entries below are placeholders; replace each with a distinct color):

ggplot(data, mapping = aes(x = barcodes, y = percent, fill = cell_types)) +
    geom_bar(position = "stack", stat = "identity") +
    scale_fill_manual(labels = c("cell_type1", "cell_type2", "cell_type3"),
                      values = c("#000000", "#000000", "#000000"))
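
One optional refinement, as a minimal sketch: passing a named vector to values ties each color to a specific cell type instead of relying on the order of the levels. The toy data frame, sample names, and hex codes below are made up for illustration and are not from the original data.

```r
library(ggplot2)

# Toy data in the long format described above (values are invented)
data <- data.frame(
  barcodes   = rep(c("sample_1", "sample_2"), each = 3),
  cell_types = rep(c("cell_type1", "cell_type2", "cell_type3"), times = 2),
  percent    = c(0.2, 0.5, 0.3, 0.6, 0.1, 0.3)
)

# Named vector: each name must match a value in the cell_types column
cell_colors <- c(cell_type1 = "#1b9e77",
                 cell_type2 = "#d95f02",
                 cell_type3 = "#7570b3")

# geom_col() is equivalent to geom_bar(stat = "identity")
p <- ggplot(data, aes(x = barcodes, y = percent, fill = cell_types)) +
  geom_col(position = "stack") +
  scale_fill_manual(values = cell_colors)
p
```

With named values, adding or reordering cell types does not silently shift the colors onto the wrong categories.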
modified 8 weeks ago • written 8 weeks ago by bioinformatics2020

Thank you!

My problem is that my table is in a format like this one, so I don't know how to reshape it to get the columns you described:

[image: screenshot of the table]

written 8 weeks ago by deathmagnetic20
library(ggplot2)
library(tidyr)  # pivot_longer()
library(dplyr)  # %>% and rename()
set.seed(1)
test_df <- data.frame(mixture = c("TCGA.P7.ASNX", "TCGA.P7.A5NY", "TCGA.P8.A5KD",
                       "TCGA.P7.ASNN", "TCGA.P7.A5NB", "TCGA.P8.A5KE"),
           cell_type_a = runif(6),
           cell_type_b = runif(6),
           cell_type_c = runif(6))

#       mixture cell_type_a cell_type_b cell_type_c
#1 TCGA.P7.ASNX   0.2655087  0.94467527   0.6870228
#2 TCGA.P7.A5NY   0.3721239  0.66079779   0.3841037
#3 TCGA.P8.A5KD   0.5728534  0.62911404   0.7698414
#4 TCGA.P7.ASNN   0.9082078  0.06178627   0.4976992
#5 TCGA.P7.A5NB   0.2016819  0.20597457   0.7176185
#6 TCGA.P8.A5KE   0.8983897  0.17655675   0.9919061


# Reshape from wide (one column per cell type) to long (one row per barcode and cell type)
test_df <- test_df %>%
  pivot_longer(!mixture, names_to = "cell_types", values_to = "percent") %>%
  rename(barcodes = mixture)

 #  barcodes     cell_types  percent
 #  <chr>        <chr>         <dbl>
 #1 TCGA.P7.ASNX cell_type_a  0.266 
 #2 TCGA.P7.ASNX cell_type_b  0.945 
 #3 TCGA.P7.ASNX cell_type_c  0.687 
 #4 TCGA.P7.A5NY cell_type_a  0.372 
 #5 TCGA.P7.A5NY cell_type_b  0.661 
 #6 TCGA.P7.A5NY cell_type_c  0.384 
 #7 TCGA.P8.A5KD cell_type_a  0.573 
 #8 TCGA.P8.A5KD cell_type_b  0.629 
 #9 TCGA.P8.A5KD cell_type_c  0.770 
#10 TCGA.P7.ASNN cell_type_a  0.908 
#11 TCGA.P7.ASNN cell_type_b  0.0618
#12 TCGA.P7.ASNN cell_type_c  0.498 
#13 TCGA.P7.A5NB cell_type_a  0.202 
#14 TCGA.P7.A5NB cell_type_b  0.206 
#15 TCGA.P7.A5NB cell_type_c  0.718 
#16 TCGA.P8.A5KE cell_type_a  0.898 
#17 TCGA.P8.A5KE cell_type_b  0.177 
#18 TCGA.P8.A5KE cell_type_c  0.992 

# scale_fill_manual() is optional; the "#000000" values are placeholders, so use a distinct color per cell type
ggplot(data = test_df, mapping = aes(x = barcodes, y = percent, fill = cell_types)) +
  geom_bar(position = "stack", stat = "identity") +
  scale_fill_manual(labels = c("cell_type_a", "cell_type_b", "cell_type_c"),
                    values = c("#000000", "#000000", "#000000"))
modified 8 weeks ago • written 8 weeks ago by bioinformatics2020
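
The reshape step in the reply above treats every column other than mixture as a cell type. If the real table also carries per-sample metric columns, they would need to be dropped first. The sketch below assumes such columns are named "P-value", "Correlation", and "RMSE" and that the sample column is named "Mixture"; these names and all values are assumptions for illustration, so check them against the actual file.

```r
library(tidyr)
library(dplyr)

# Toy stand-in for a result table (assumed layout, invented values)
cibersort_df <- data.frame(
  Mixture     = c("TCGA.P7.ASNX", "TCGA.P7.A5NY"),
  cell_type_a = c(0.2, 0.5),
  cell_type_b = c(0.8, 0.5),
  `P-value`   = c(0.01, 0.02),
  Correlation = c(0.9, 0.8),
  RMSE        = c(0.5, 0.6),
  check.names = FALSE  # keep "P-value" as is instead of converting it to "P.value"
)

# Columns that are not cell-type fractions (assumed names)
metric_cols <- c("P-value", "Correlation", "RMSE")

long_df <- cibersort_df %>%
  select(-any_of(metric_cols)) %>%  # any_of() ignores names that are absent
  pivot_longer(!Mixture, names_to = "cell_types", values_to = "percent") %>%
  rename(barcodes = Mixture)

long_df
```

The resulting long_df has the barcodes, cell_types, and percent columns that the ggplot call expects.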

I will try it, thanks!

That will be challenging, because I have to include 178 rows (barcodes) and 28 columns (cell types). Is there a simple way to code it?

modified 8 weeks ago • written 8 weeks ago by deathmagnetic20

What's challenging about it?

written 8 weeks ago by bioinformatics2020

It is, at my low coding level, sorry!

I meant that I don't know how to include all the barcodes and cell types other than manually.

written 8 weeks ago by deathmagnetic20

Ah, you don't have to! The portion of the code where I type the cell types and colors (scale_fill_manual) is optional. You can simply use the code above to transform your data.frame into the correct format, and then use ggplot without scale_fill_manual:

ggplot(data = test_df, mapping = aes(x = barcodes, y = percent, fill = cell_types )) + 
  geom_bar(position = "stack", stat = "identity")

It'll automatically add colors.
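
As a rough sketch of how this scales, the example below simulates a table with 178 samples and 28 cell types (random values, hypothetical sample and column names) and runs the same reshape and plot. The theme() line that rotates and shrinks the x-axis labels is an optional addition, not part of the code above.

```r
library(ggplot2)
library(tidyr)

set.seed(1)
n_samples    <- 178
n_cell_types <- 28

# Simulated wide table: one row per barcode, one column per cell type
wide_df <- data.frame(
  mixture = paste0("sample_", seq_len(n_samples)),
  matrix(runif(n_samples * n_cell_types), nrow = n_samples,
         dimnames = list(NULL, paste0("cell_type_", seq_len(n_cell_types))))
)

# The same pivot_longer() call works no matter how many columns there are
long_df <- pivot_longer(wide_df, !mixture,
                        names_to = "cell_types", values_to = "percent")

p <- ggplot(long_df, aes(x = mixture, y = percent, fill = cell_types)) +
  geom_bar(position = "stack", stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 4))
p
```

Nothing has to be typed per barcode or per cell type; the column names in the table drive both the reshape and the legend.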

written 8 weeks ago by bioinformatics2020