Question: Creating Custom Plot Using ggplot2
5.8 years ago by
Bergen, Norway
jobinv1.1k wrote:

I am not sure whether this is an appropriate question for this board, but if anyone can help, I would be immensely grateful.

I am trying to create a figure for a publication, showing an overview of the bioinformatics steps taken in our study. The idea for this plot came from an answer that I saw here at BioStar: A: Best way to visualize Next Generation Sequencing tumor evolution in a graphic?. That plot was generated using ggplot2, I believe.

A bit of background about the information I want to have in this plot: at each step of the bioinformatics pipeline, we have a number of upregulated genes and downregulated genes. From one step to the next, the numbers usually become smaller, but not always. I want to show this with "step number" on the x-axis (step 1, step 2, etc.). I want to divide the y-axis into two: number of upregulated genes on the top half, and number of downregulated genes on the bottom. The way I envision this plot is with two lines (one for upregulated genes, one for downregulated genes) that smoothly meander from value to value as one moves from left to right, looking much like the plot that I referred to earlier.

I am quite proficient in R but have no experience with ggplot2 as yet, although I do have a copy of the ggplot2 book that I have just barely started reading through. So far, however, I have not found out how to execute this plan. Can anyone advise me about how to get started with making this? Preferably not with MS Paint :-)

ADD COMMENTlink modified 5.7 years ago by David W4.7k • written 5.8 years ago by jobinv1.1k

The plot you are referring to was certainly not made using ggplot2, and if it was, it shouldn't have been. The Grammar of Graphics plotting system is all about expressively combining data and models at plotting time to display data and statistical transformations quickly, together on the same graph. The figure you link to was most likely created in vector graphics software such as Illustrator or Inkscape (though most biologists prefer to misuse PowerPoint for this). The scatter plots at the right might have been made in ggplot2, but appear to be styled more like base graphics in R.

ADD REPLYlink modified 5.7 years ago • written 5.7 years ago by Matt Shirley9.0k

I see. I will check out these vector graphics programs and try to make sense of them then. Thanks to both of you for your tips!

ADD REPLYlink written 5.7 years ago by jobinv1.1k

What makes you think it was made using ggplot2?

ADD REPLYlink written 5.8 years ago by Ben2.0k

I might be wrong about that (couldn't find a reference in the article itself about how it was produced). A colleague told me that he suspected that it was, based on other plots he had seen in publications, even though he didn't know how to use it himself either...

Do you think it was not ggplot2? If so, any idea how I could make such a plot instead?

ADD REPLYlink written 5.7 years ago by jobinv1.1k

Personally I think it was most likely made with a vector graphics program (maybe Inkscape or Illustrator), as it's mostly qualitative (except for the percentages which could be drawn via measuring) and there's lots of embedded images. edit: Matt below posted the same general idea as I was writing this

ADD REPLYlink modified 5.7 years ago • written 5.7 years ago by Ben2.0k
5.7 years ago by
David W4.7k
New Zealand
David W4.7k wrote:

The commenters are right that the graph was probably made "by hand" in Illustrator or similar. But it's easy enough to use ggplot2 to produce something similar.

Start by faking up some data to have the number of differentially regulated genes trickle away:

df0 <- data.frame(step=1:20, 
                  n_up = rpois(20, 1/(1:20)  * 1000), 
                  n_down = -rpois(20, 1/(1:20) * 1000)) #note the negative sign

Then use geom_ribbon to visualize the "funneling" effect of excluded genes at each (or at least most) of the steps:

library(ggplot2)
p <- ggplot(df0, aes(x=step))
p + geom_ribbon(aes(ymin=n_down, ymax=n_up))  # ymin is the lower (negative) edge
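To get closer to the layout described in the question (upregulated genes above the axis, downregulated genes below), one option is to draw two ribbons that meet at zero. This is only a sketch on the same kind of fake data: the seed, the fill labels, the axis titles, and the name `p2` are illustrative assumptions, not part of the original answer.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the fake data is reproducible
steps <- 1:20
df0 <- data.frame(step   = steps,
                  n_up   = rpois(20, 1000 / steps),
                  n_down = -rpois(20, 1000 / steps))  # negative, drawn below zero

p2 <- ggplot(df0, aes(x = step)) +
  # one ribbon per direction, both anchored at y = 0
  geom_ribbon(aes(ymin = 0, ymax = n_up, fill = "upregulated")) +
  geom_ribbon(aes(ymin = n_down, ymax = 0, fill = "downregulated")) +
  geom_hline(yintercept = 0) +
  # show absolute counts on the axis instead of negative numbers
  scale_y_continuous(labels = abs) +
  labs(x = "Step", y = "Number of genes", fill = NULL)

p2
```

Passing `abs` as the `labels` function keeps the negative values in the data, which is what places the downregulated ribbon below the axis, while the tick labels read as plain counts.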

From there you can tweak to your heart's content, or save it as a vector graphic and work on it in an illustration program.
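For the vector route, a minimal sketch using `ggsave` could look like the following. The file name `gene_funnel.pdf` and the 7 x 4 inch size are hypothetical choices. PDF is used here because it opens in Illustrator and Inkscape; writing SVG from `ggsave` should also be possible but, as far as I know, needs the separate svglite package.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the fake data is reproducible
df0 <- data.frame(step   = 1:20,
                  n_up   = rpois(20, 1000 / (1:20)),
                  n_down = -rpois(20, 1000 / (1:20)))

p <- ggplot(df0, aes(x = step)) +
  geom_ribbon(aes(ymin = n_down, ymax = n_up))

out_file <- "gene_funnel.pdf"  # hypothetical file name
# ggsave picks the graphics device from the file extension; size is in inches
ggsave(out_file, plot = p, width = 7, height = 4)
```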

ADD COMMENTlink modified 5.7 years ago • written 5.7 years ago by David W4.7k

Oh, this is a great starting point, can definitely work from this! Thanks!

ADD REPLYlink written 5.7 years ago by jobinv1.1k

