Using ggplot2 to make barplots of RNASeq data - maintaining sample metadata when pivoting from wide to long format
12 months ago
Dylan C-C • 0

I am trying to use ggplot2 to replicate the following plot of my RNASeq data, which was made by the program Biolayout. Biolayout is a network analysis tool that clusters together genes with similar expression patterns across samples. The plot shows the average TPM of all the genes listed on the right-hand side for each of my samples, and the colours above the sample names mark different genotype/tissue groupings. I want to recreate this in ggplot2 so that I have more control over the look of the plot and the grouping of the samples.

[Image: Biolayout plot]

My problem is that I am having difficulty pivoting the data from the wide format to the long format that ggplot2 needs while keeping the sample metadata required for grouping and colouring the plot. The following image shows what the data looks like. The first 4 rows are metadata about the samples (tissue type, genotype) that I want to use in ggplot2 for grouping. However, when you use pivot_longer to melt the data into something usable by ggplot2, you need just a plain table of gene names and counts. How can I keep this metadata available later, when building the plot, so that I can order and colour the samples? Is it possible to make a separate metadata data frame that holds the extra information keyed by sample name, and then pull from it when setting the aesthetics of the plot?

[Image: example of the wide-format data table]

rnaseq pivot_longer ggplot2 • 1.3k views

Your intuition was correct. You want to make two separate data frames and join them on sample name. This code is untested but will probably (hopefully) work.

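library("tidyverse")

# Rows 1-4 hold the sample metadata. Reshape to one row per sample,
# with one column per metadata field (named by the unique_gene_id entry).
df_meta <- df |>
  slice(1:4) |>
  select(!c(gene_name, description)) |>
  rename(meta_type = unique_gene_id) |>
  pivot_longer(!meta_type, names_to = "sample_name", values_to = "meta_val") |>
  pivot_wider(names_from = meta_type, values_from = meta_val)

# The remaining rows hold the expression values. The metadata rows usually
# force these columns to character, so convert back to numeric after pivoting.
df_counts <- df |>
  slice(5:n()) |>
  pivot_longer(!c(unique_gene_id, gene_name, description),
               names_to = "sample_name", values_to = "count") |>
  mutate(count = as.numeric(count))

# One metadata row per sample, so each count row is matched exactly once.
df_merged <- inner_join(df_counts, df_meta, by = "sample_name")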

If you want to reproduce the figure exactly, including the colored grouping bars, it would actually be a lot more straightforward in ComplexHeatmap.
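
For the ggplot2 route, the mean-TPM barplot from the question could be sketched roughly as follows. This is only a sketch on a made-up toy table: the column names unique_gene_id, gene_name and description follow the code above, while the sample names, the metadata fields (tissue, genotype), the values and the helper n_meta are all hypothetical. It widens the metadata to one row per sample before joining, so each sample's mean is matched to its metadata exactly once.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy table in the layout described in the question: the first
# rows are sample metadata, the rest are TPM values (everything is character).
df <- tibble(
  unique_gene_id = c("tissue", "genotype", "g1", "g2", "g3"),
  gene_name      = c(NA, NA, "GeneA", "GeneB", "GeneC"),
  description    = c(NA, NA, "a", "b", "c"),
  S1 = c("leaf", "WT",  "10", "20", "30"),
  S2 = c("leaf", "mut", "12", "18", "33"),
  S3 = c("root", "WT",  "2",  "4",  "6"),
  S4 = c("root", "mut", "1",  "5",  "9")
)

n_meta <- 2  # number of metadata rows in this toy table (4 in the question)

# One row per sample, one column per metadata field.
df_meta <- df |>
  slice(1:n_meta) |>
  select(!c(gene_name, description)) |>
  pivot_longer(!unique_gene_id, names_to = "sample_name", values_to = "meta_val") |>
  pivot_wider(names_from = unique_gene_id, values_from = meta_val)

# One row per gene and sample, with TPM converted back to numeric.
df_counts <- df |>
  slice((n_meta + 1):n()) |>
  pivot_longer(!c(unique_gene_id, gene_name, description),
               names_to = "sample_name", values_to = "tpm") |>
  mutate(tpm = as.numeric(tpm))

# Average TPM per sample, then attach metadata and fix the sample order
# so that samples from the same group sit next to each other on the x axis.
df_plot <- df_counts |>
  group_by(sample_name) |>
  summarise(mean_tpm = mean(tpm), .groups = "drop") |>
  inner_join(df_meta, by = "sample_name") |>
  arrange(tissue, genotype) |>
  mutate(sample_name = factor(sample_name, levels = sample_name))

p <- ggplot(df_plot, aes(x = sample_name, y = mean_tpm, fill = tissue)) +
  geom_col() +
  labs(x = "Sample", y = "Mean TPM", fill = "Tissue") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
p
```

Swapping fill = tissue for fill = genotype, or adding facet_grid(~ tissue, scales = "free_x", space = "free_x"), changes how the groups are shown without touching the reshaping steps.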


Paste the file in CSV format and an R wizard might come along and help :)
