Question

Volcano plot for multiple clusters

3

4.4 years ago

michelle.piquet ▴ 60

Hello, I am trying to make a volcano plot for different clusters. I have 2 conditions, untreated vs. treated. I have a differential expression excel file that cellranger generated for me but within the file it has multiple clusters each which have a fold change and p value. How do I create a volcano plot that contains all the clusters rather than one? Would I have to do a volcano plot for each cluster and then combine them all somehow?

I use this code to generate the plot for just one of the clusters...

macrophage_list <- read.table("differential_expression_macrophage.csv", header = T, sep = ",")

EnhancedVolcano(macrophage_list, lab = as.character(macrophage_list$FeatureName), x = 'untreated.Log2.Fold.Change', y = 'untreated.P.Value', xlim = c(-8,8), title = 'Macrophage', pCutoff = 10e-5, FCcutoff = 1.5, pointSize = 3.0, labSize = 3.0)

Any help and suggestions are greatly appreciated.

RNA-Seq R volcanoplot • 4.4k views

ADD COMMENT • link updated 4.4 years ago by TriS ★ 4.7k • written 4.4 years ago by michelle.piquet ▴ 60

Kevin Blighe · Answer 1 · 2019-12-05

5

4.4 years ago

Kevin Blighe 87k

Hey,

I would generate a separate plot for each cluster, but keep all of the volcanoes within the same plot space. You can do it this way:

v1 <- EnhancedVolcano(...)
v2 <- EnhancedVolcano(...)
v3 <- EnhancedVolcano(...)
v4 <- EnhancedVolcano(...)

library(gridExtra)
library(grid)
grid.arrange(v1, v2, v3, v4, ncol = 2, nrow = 2,
  top = textGrob('Macrophages', just = c('center'), gp = gpar(fontsize = 32)))
grid.rect(gp=gpar(fill=NA))

You could also bind the results tables together and plot all p-values and fold-changes in the same plot, but using, for example, a different shape for each respective comparison. Apart from requiring a bit more coding, this would also make the plot space very crowded, I think.

Kevin

ADD COMMENT • link 4.4 years ago by Kevin Blighe 87k
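
The second option in the answer above (binding the results tables and giving each comparison its own shape) could look roughly like the sketch below. It is only an illustration: the two toy tables, their column names (`log2FC`, `pvalue`, `comparison`) and the values are made up, and the cutoffs simply reuse the ones from the question. The plot is drawn with ggplot2 rather than EnhancedVolcano, and is skipped if ggplot2 is not installed.

```r
# Toy stand-ins for two per-cluster results tables (hypothetical values).
cluster1 <- data.frame(
  FeatureName = c("GeneA", "GeneB", "GeneC"),
  log2FC      = c(2.5, -0.3, -3.1),
  pvalue      = c(1e-8, 0.4, 1e-6)
)
cluster2 <- data.frame(
  FeatureName = c("GeneA", "GeneB", "GeneC"),
  log2FC      = c(0.2, 4.0, -1.9),
  pvalue      = c(0.7, 1e-10, 1e-3)
)

# Tag each table with its origin, then stack them row-wise.
cluster1$comparison <- "Cluster 1"
cluster2$comparison <- "Cluster 2"
combined <- rbind(cluster1, cluster2)
combined$comparison <- factor(combined$comparison)

# One shape (and colour) per comparison; skipped if ggplot2 is missing.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(combined, aes(x = log2FC, y = -log10(pvalue),
                            shape = comparison, colour = comparison)) +
    geom_point(size = 3) +
    geom_hline(yintercept = -log10(10e-5), linetype = "dashed") +
    geom_vline(xintercept = c(-1.5, 1.5), linetype = "dashed")
  print(p)
}
```

Note that ggplot2's default shape palette only covers six discrete values, so with many clusters colour alone (or facets) is probably the more practical way to tell the groups apart.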

0

Hi Kevin,

So create a volcano plot for each of my 20 clusters and then combine them with the code you gave me?

Thanks

ADD REPLY • link 4.4 years ago by michelle.piquet ▴ 60

0

If you want.

ADD REPLY • link 4.4 years ago by Kevin Blighe 87k

0

I loaded only 2 of the clusters as a test, ran the following code, and got the following error.

grid.arrange(v1_macrophage, v2_macrophage, ncol = 2, nrow = 2, top = textGrob('Macrophages', just = c('center'), gp = gpar(fontsize = 32)))
Error in `$<-.data.frame`(`*tmp*`, "wrapvp", value = list(x = 0.5, y = 0.5,  : 
  replacement has 17 rows, data has 31328

ADD REPLY • link updated 4.4 years ago by Kevin Blighe 87k • written 4.4 years ago by michelle.piquet ▴ 60

0

Hey, how did you create v1_macrophage and v2_macrophage?

Could you just try:

grid.arrange(v1_macrophage, v2_macrophage, ncol = 2,
  top = textGrob('Macrophages', just = c('center'), gp = gpar(fontsize = 32)))

ADD REPLY • link 4.4 years ago by Kevin Blighe 87k

0

v1_macrophage <- read.table("cluster1_marcophage.csv", header = T, sep = ",")
v2_macrophage <- read.table("cluster2_marcophage.csv", header = T, sep = ",")

I got the same error with the command you suggested:

Error in `$<-.data.frame`(`*tmp*`, "wrapvp", value = list(x = 0.5, y = 0.5,  : 
  replacement has 17 rows, data has 31328

ADD REPLY • link 4.4 years ago by michelle.piquet ▴ 60

0

Oh, they should be separate EnhancedVolcano objects, like this:

v1_macrophage <- EnhancedVolcano(...)
v2_macrophage <- EnhancedVolcano(...)

grid.arrange(v1_macrophage, v2_macrophage, ncol = 2,
  top = textGrob('Macrophages', just = c('center'), gp = gpar(fontsize = 32)))

ADD REPLY • link 4.4 years ago by Kevin Blighe 87k

0

Hi Kevin,

I created 2 separate EnhancedVolcano objects for 2 of the clusters. Everything worked fine, but I was wondering how I would combine the data into one plot rather than having 20 separate volcano plots?

I appreciate your help.

ADD REPLY • link 4.4 years ago by michelle.piquet ▴ 60

0

Then, you will have to try the other option, i.e., merge (rbind()) the results tables and just use that.

ADD REPLY • link 4.4 years ago by Kevin Blighe 87k

0

How do I go about merging the results tables, given that all the p values and log fold change values are already in my file? They're just labeled by the cluster they belong to.

Thanks again for your help.

ADD REPLY • link 4.4 years ago by michelle.piquet ▴ 60

0

Oh, they are already in the same file? Can you paste an example of the data?

ADD REPLY • link 4.4 years ago by Kevin Blighe 87k

0

These are the headers.

Feature ID  Feature Name    Cluster 1 Mean Counts   Cluster 1 Log2 fold change  Cluster 1 Adjusted p value  Cluster 2 Mean Counts   Cluster 2 Log2 fold change  Cluster 2 Adjusted p value  Cluster 3 Mean Counts   Cluster 3 Log2 fold change  Cluster 3 Adjusted p value  Cluster 4 Mean Counts   Cluster 4 Log2 fold change  Cluster 4 Adjusted p value  Cluster 5 Mean Counts   Cluster 5 Log2 fold change  Cluster 5 Adjusted p value  Cluster 6 Mean Counts   Cluster 6 Log2 fold change  Cluster 6 Adjusted p value  Cluster 7 Mean Counts   Cluster 7 Log2 fold change  Cluster 7 Adjusted p value  Cluster 8 Mean Counts   Cluster 8 Log2 fold change  Cluster 8 Adjusted p value  Cluster 9 Mean Counts   Cluster 9 Log2 fold change  Cluster 9 Adjusted p value  Cluster 10 Mean Counts  Cluster 10 Log2 fold change Cluster 10 Adjusted p value Cluster 11 Mean Counts  Cluster 11 Log2 fold change Cluster 11 Adjusted p value Cluster 12 Mean Counts  Cluster 12 Log2 fold change Cluster 12 Adjusted p value Cluster 13 Mean Counts  Cluster 13 Log2 fold change Cluster 13 Adjusted p value Cluster 14 Mean Counts  Cluster 14 Log2 fold change Cluster 14 Adjusted p value Cluster 15 Mean Counts  Cluster 15 Log2 fold change Cluster 15 Adjusted p value Cluster 16 Mean Counts  Cluster 16 Log2 fold change Cluster 16 Adjusted p value Cluster 17 Mean Counts  Cluster 17 Log2 fold change Cluster 17 Adjusted p value Cluster 18 Mean Counts  Cluster 18 Log2 fold change Cluster 18 Adjusted p value Cluster 19 Mean Counts  Cluster 19 Log2 fold change Cluster 19 Adjusted p value Cluster 20 Mean Counts  Cluster 20 Log2 fold change Cluster 20 Adjusted p value

ADD REPLY • link updated 4.4 years ago by Kevin Blighe 87k • written 4.4 years ago by michelle.piquet ▴ 60
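
With headers like the ones posted above, there is nothing to merge from separate files; the wide table can instead be reshaped so that every row is one gene in one cluster. The following is a minimal base-R sketch of that idea on a made-up two-cluster table. The values, the object names (`wide`, `long`) and the output column names (`log2FC`, `padj`) are assumptions for illustration; only the header pattern is taken from the post.

```r
# Toy table mimicking the posted header layout (values are made up).
# check.names = FALSE keeps the spaces in the column names; reading the
# real file with read.csv(..., check.names = FALSE) would do the same.
wide <- data.frame(
  "Feature ID"                 = c("ENSG01", "ENSG02", "ENSG03"),
  "Feature Name"               = c("GeneA", "GeneB", "GeneC"),
  "Cluster 1 Mean Counts"      = c(10, 5, 1),
  "Cluster 1 Log2 fold change" = c(2.5, -0.3, -3.1),
  "Cluster 1 Adjusted p value" = c(1e-8, 0.4, 1e-6),
  "Cluster 2 Mean Counts"      = c(3, 8, 2),
  "Cluster 2 Log2 fold change" = c(0.2, 4.0, -1.9),
  "Cluster 2 Adjusted p value" = c(0.7, 1e-10, 1e-3),
  check.names = FALSE
)

# Work out which cluster numbers are present in the header.
fc_cols  <- grep("^Cluster [0-9]+ Log2 fold change$", names(wide), value = TRUE)
clusters <- as.integer(sub("^Cluster ([0-9]+) .*$", "\\1", fc_cols))

# Build one small table per cluster, then stack them into long format.
long <- do.call(rbind, lapply(clusters, function(k) {
  data.frame(
    FeatureName = wide[["Feature Name"]],
    Cluster     = k,
    log2FC      = wide[[paste("Cluster", k, "Log2 fold change")]],
    padj        = wide[[paste("Cluster", k, "Adjusted p value")]]
  )
}))
long$Cluster <- factor(long$Cluster)

# All clusters in one volcano, coloured by cluster (needs ggplot2).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  print(ggplot(long, aes(x = log2FC, y = -log10(padj), colour = Cluster)) +
          geom_point(size = 3))
}
```

If the file is read with `read.table(..., header = T)` as in the question, R replaces the spaces in the headers with dots (for example `Cluster.1.Log2.fold.change`), so the patterns above would need adjusting accordingly.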

score 2 · Answer 2 · 2019-12-05

2

4.4 years ago

TriS ★ 4.7k

or if you want to put them all together you can just color each group differently.

library(ggplot2)
ggplot(macrophage_list, aes(x = untreated.Log2.Fold.Change, y = -log10(untreated.P.Value), colour = myClusters)) +
  geom_point(size = 3) +
  geom_hline(yintercept = -log10(10e-5)) +
  geom_vline(xintercept = 0) +
  xlim(-8, 8)

where myClusters is the factor that indicates the clusters you are interested in

ADD COMMENT • link 4.4 years ago by TriS ★ 4.7k

0

Hi TriS

So in the "myCluster" section put all the clusters that I want?

I have 20 clusters within the differential expression file, so would I set it to Cluster 1 Log2 fold change, Cluster 1 Adjusted p value, etc.?

The file is labeled with the following headers... Feature Name Cluster 1 Mean Counts Cluster 1 Log2 fold change Cluster 1 Adjusted p value (for all 20).

Thanks

ADD REPLY • link 4.4 years ago by michelle.piquet ▴ 60

0

If those are the column names, then you must have some sort of genes as row names. You need to do a little bit of coding to define which genes belong to which cluster:

1- based on the cluster p.value and Log2FC, define what belongs to what cluster
2- repeat that for all 20 clusters
3- color each gene based on cluster #

I don't know how you calculated the clusters, but generally one gene should belong to only one cluster.

The end result should give you something like:

GENE   CLUSTER
x   1
y   1
x   2
z   4
h   3

and so on and so forth...

ADD REPLY • link 4.4 years ago by TriS ★ 4.7k
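
One possible way to carry out the three steps described above is sketched below in base R, starting from a long table with one row per gene and cluster. Everything specific here is an assumption rather than something stated in the thread: the toy values, the column names `log2FC` and `padj`, the cutoffs (adjusted p < 0.05 and |log2FC| > 1.5), and the rule that a gene passing in several clusters is assigned to the one with the smallest adjusted p value.

```r
# Toy long-format table: one row per gene per cluster (made-up values).
long <- data.frame(
  GENE    = rep(c("x", "y", "z"), times = 2),
  CLUSTER = rep(c(1, 2), each = 3),
  log2FC  = c(2.5, -0.3, -3.1, 0.2, 4.0, -1.9),
  padj    = c(1e-8, 0.4, 1e-6, 0.7, 1e-10, 1e-3)
)

# Steps 1-2: keep the gene/cluster pairs that pass both (assumed) cutoffs.
passing <- long[long$padj < 0.05 & abs(long$log2FC) > 1.5, ]

# If a gene passes in several clusters, keep the cluster with the smallest
# adjusted p value (an assumption; other tie-breaking rules are possible).
passing  <- passing[order(passing$GENE, passing$padj), ]
assigned <- passing[!duplicated(passing$GENE), c("GENE", "CLUSTER")]
rownames(assigned) <- NULL

# Step 3: 'assigned' can now be merged back and used as the colour factor.
print(assigned)
```

Genes that pass in no cluster drop out of `assigned`; they could be kept in the plot as an extra "not significant" level if a background cloud of points is wanted.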