Tool:EnhancedVolcano: Publication-ready volcano plots with enhanced colouring and labeling
0
43
Entering edit mode
4.3 years ago

# NB - please post questions on Bioconductor Support: https://support.bioconductor.org/

[ direct link to create a post: https://support.bioconductor.org/p/new/post/?tag_val=EnhancedVolcano ]

# GitHub

https://github.com/kevinblighe/EnhancedVolcano

# Vignette

https://bioconductor.org/packages/release/bioc/vignettes/EnhancedVolcano/inst/doc/EnhancedVolcano.html

volcano plot ggplot2 ggrepel bioconductor Tool • 22k views
2
Entering edit mode

Great tool, thanks for putting it together!

My DEseqDataSet is actually a set of peaks instead of transcripts that all have a unique identifier going out to 6 figures, I was wondering if there was a way to use selectLab() to create custom labels such as TF names for factors I know to bind within those peaks or even small representations of their PWMs - though the former would be sufficient for the moment - or does that have to come from an additional metadata column in DEseqDataSet? Thanks!

2
Entering edit mode

Hey, thanks for the comments. My colleague Myles back in London deserves the credit for the initial idea of putting this together. selectLab() will just match up to whatever you have passed to the required lab parameter. So, it can be anything really, but there is no functionality to automatically pull in TF names. That is a good idea, though, and it would show in a nice way which TFs were up- or down-regulated.

I mean, this package is only released for a few weeks at this stage, and I'm fairly open as to where it could be developed further. I was hoping to build something as comprehensive as the ComplexHeatmap package.

1
Entering edit mode

Hi Kevin,

Thanks again for the amazing tool!

Can I use EnhancedVolcano with plotly? I assume it's built using ggplot2, and I've tried to do:

ggplotly(plot) where plot is a volcano plot made using EnhancedVolcano.

I get the following error message though:

Error in unique.default(x) :
unique() applies only to vectors
1: In if (nchar(axisTitleText) > 0) { :
the condition has length > 1 and only the first element will be used
2: In if (nchar(axisTitleText) > 0) { :
the condition has length > 1 and only the first element will be used


Being able to use these volcano plots with plotly would be super useful! Especially when there are too many DEGs and it really makes labeling messy.

Thanks!

0
Entering edit mode

EnhancedVolcano does indeed return a ggplot2 object, on which extra features can be added. It also utilises ggrepel - perhaps that is the missing link? I have not tried with plotly but will make an attempt later to see how to coerce the volcano object to work with plotly.

I believe there are a few tutorials around where plotly is used to generate a volcano, though. I think that Stephen Turner had one, but cannot find it right now.

1
Entering edit mode

Aha! Thank you so much, it's working now :)

1
Entering edit mode

Please use Add Comment or Add Reply as appropriate instead of Add Answer.

0
Entering edit mode

May I ask how you managed to transform the EnhancedVolcano object with ggplotly into a plotly object? I got stuck on the very same error and the hint ggrepel does not ring a bell....

0
Entering edit mode
0
Entering edit mode

1
Entering edit mode

Good work dude!

0
Entering edit mode

Hi Kevin,

Great tool! I've been pretty much using it for all my volcano plots. I wanted to know if there was a way to use EnhancedVolcanos with results from sleuth?

cheers!

0
Entering edit mode

You're welcome. Yes, most likely there is a way. What are the columns in the output of Sleuth?

0
Entering edit mode

This is the output for sleuth, where b is equivalent to Log2FC

ens_gene    ext_gene    target_id   pval    qval      b se_b    mean_obs    var_obs tech_var    sigma_sq    smooth_sigma_sq final_sigma_sq

0
Entering edit mode

Hey, well, you have at least 2 required columns:

• ens_gene (label)
• pval (y)

What is missing is the fold-change for x! I have never used Sleuth, however, I searched the Web forums and am surprised to see that the Sleuth developers do not output a fold-change for Sleuth's results. You may consider using a different differential expression analysis program.

0
Entering edit mode

That's unfortunate. Thanks for looking though!

0
Entering edit mode

You could likely still use the b value, in which case you should change the x-axis label too. You may want to read through this thread on Google Groups: https://groups.google.com/forum/#!topic/kallisto-sleuth-users/kWodd7CQejE (Harold Pimentel is Sleuth developer)

0
Entering edit mode

Hi Kevin, Thanks for this great package. A basic question: you specify in the vignette that the default p-value cut-off is 0.05, but from the default plot, it looks to me as if it were 0.005. Is this a misunderstanding on my part? Best wishes, Patrick

0
Entering edit mode

Hey Patrick, thanks for noticing that. The default is actually:

pCutoff = 10e-6


I just need to update the text in the vignette!

1
Entering edit mode

Thanks, Kevin, I've just seen that, having read through the whole vignette: all makes sense!

0
Entering edit mode

I am updating that part of the vignette right now, so, the change will come through on the Vignette on GitHub in the next few minutes: https://github.com/kevinblighe/EnhancedVolcano

It will be until Bioc 3.10 before it is changed on the main Bioconductor branch.

0
Entering edit mode

Awesome package, Kevin! Is it possible to remove the log2FC cutoff and lines?

2
Entering edit mode

Hey, thank you!

To remove all cut-off lines, you just need to do:

EnhancedVolcano(..., gridlines.major = FALSE, gridlines.minor = FALSE, ...)


If you want to disable the actual cut-off itself (and, thus, the colouring of the points based on the cut-off), then I may recommend the use of colCustom ? - there is an example in the vignette:

Another possibility is to set FCcutoff to something crazy like 1 000 000 such that nothing passes it.

0
Entering edit mode

Is there any chance to change tick intervals for both horizontal and vertical axes? And thanks a lot this is awsome!

0
Entering edit mode

Hey, EnhancedVolcano is [thankfully] fully compatible with ggplot2 functionality. So, to modify, e.g., the y-axis, you can do:

p <- EnhancedVolcano(res1,
lab = rownames(res1),
x = 'log2FoldChange',
y = 'pvalue',
xlim = c(-5, 8))

p + scale_y_continuous(
name = bquote(~-Log[10]~italic(P)),
limits=c(0, 100),
breaks=c(0, 33, 40, 49, 77, 88))


Hopefully this helps!

0
Entering edit mode

Hi Kevin, EnhancedVolcano package was really helpful. I have two doubts:

1. when I used the below option legend=c('NS','Log (base 2) fold-change','P value', 'P value & Log (base 2) fold-change') it has printed Log (base 2) but not the Log2 (subscript). May I know how to do it.

2. when I have modified pointSize = c(ifelse(res$log2FoldChange>2, 8, 1)) as pointSize = c(ifelse(res$log2FoldChange>2 & res$padj>0.05, 8, 1)) it worked. But when I have used pointSize = c(ifelse(res$log2FoldChange>2 & res$log2FoldChange<-2 & res$padj>0.05, 8, 1)) it did not work. May I know how to do it.

1
Entering edit mode

when I used the below option legend=c('NS','Log (base 2) fold-change','P value', 'P value & Log (base 2) fold-change') it has printed Log (base 2) but not the Log2 (subscript). May I know how to do it.

I implemented that a few days ago, so, it is only currently in the development version, which you can install via:

devtools::install_github('kevinblighe/EnhancedVolcano')


Note: you may have to install the devtools package There are now 2 parameters for the legend:

legend = c("NS","Log2 FC","P","P & Log2 FC")
legendLabels = c('NS', expression(Log[2]~FC),
"p-value", expression(p-value~and~log[2]~FC))


Expressions (equations, super- and sub-script, etc) can only be used in legendLabels. This may seem silly to have 2 parameters for legend, but it relates to how ggplot2 (which is the underlying 'engine' behind EnhancedVolcano) utilises legends.

## ---------------------------

when I have modified pointSize = c(ifelse(res$log2FoldChange>2, 8, 1)) as pointSize = c(ifelse(res$log2FoldChange>2 & res$padj>0.05, 8, 1)) it worked. But when I have used pointSize = c(ifelse(res$log2FoldChange>2 & res$log2FoldChange<-2 & res$padj>0.05, 8, 1)) it did not work. May I know how to do it.

I think that you may want to do:

pointSize = c(ifelse(abs(res$log2FoldChange)>2 & res$padj>0.05, > 8, 1))

1
Entering edit mode

Hi Kevin,

Thank you for the response.

Both of your suggestions worked :)

pointSize = c(ifelse(abs(res$log2FoldChange)>2 & res$padj>0.05, > 8, 1))


I guess, there should not be '>' before 8.

Thank you!

0
Entering edit mode

Great. Oh, yes, not sure why I put the > there.

1
Entering edit mode

Hi Kevin,

I have two questions:

1. When I plot with results object (after shrinkage), the number of genes (visible on the plot) passing the cutoff criteria are relatively less when compared to the plot with results object (without shrinkage). The graph looks nice with shrinkage and much more dispersed without shrinkage. May I know how to overcome this issue.

2. When we use FCcuoff = 1.2, it represents log2FC (x-axis). But I want to set the cutoff for upregulation as log2FC = 0.585 (FC is >1.5) and downregulation as log2FC = -1 (FC < 0.5). Is it possible to give different log2FC cutoff values for up and downregulation?

Thanks in advance and looking forward to hearing from you.

0
Entering edit mode

Hey Bhanu,

The first question is more for the DESeq2 developer. However, the idea of lfcshrink() is, generally, to produce more realistic fold changes

For the second part, you would have to avail of the colCustom parameter and assign the colours before running the EnhancedVolcano() function. You could then also draw your own custom cutoff fold change lines with the following:

vline
vlineType
vlineCol
vlineWidth


To get rid of the main cut-off lines, just set cutoffLineType = 'blank'

There are many examples in the vignette: https://github.com/kevinblighe/EnhancedVolcano

2
Entering edit mode

Hi Kevin,

Thank you for the response. Now I could use the colCustom parameter and do the required by following the examples in the Github. Thank you again!

0
Entering edit mode

Hi Kevin,

Thank you so much for the EnhancedVolcano package - it's great! I have a quick question, I need to generate a high-res image of my volcano plot for publication and was wondering how to do that from EnhancedVolcano. I'm an R newbie and would very much appreciate any tips! I apologize if this is a bit off-topic.

Thank you! Sandra

1
Entering edit mode

Hey Sandra, I would generate the figure as a pdf with the pdf() function. PDFs are vector-based, not pixelated, so, they look 'perfect; when zoomed in and can also be easily manipulated by a journal graphics team. Also make use of the width and height arguments that are passed to the pdf() function.

pdf(out.pdf, width = 6, height = 9)
EnhancedVolcano(...)
dev.off()


Alternatively, I show how one can generate multiple volcanos side-by-side in the vignette via grid and gridExtra, e.g., HERE.

0
Entering edit mode

Hi Kevin, Thank you so much for the quick reply. The pdf looks beautiful! I have another question, I'm trying to add connectors (to fit more gene name labels)

EnhancedVolcanobaseline.no.NA,
lab=baseline.no.NA[, 1],
x = 'log2FoldChange',
xlim = c(-11, 11),
xlab = bquote(~Log[2]~ 'fold change'),
FCcutoff = 1.5,
transcriptPointSize = 2.0,
transcriptLabSize = 3.0,
legendPosition = 'top',
legendLabSize = 12,
legendIconSize = 3.0,
gridlines.major = FALSE,
gridlines.minor = FALSE,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = 'grey30')


but it gives me an error message:

Error in EnhancedVolcanobaseline.no.NA, lab = baseline.no.NA[, 1], x = "log2FoldChange",  :
unused argument (drawConnectors = TRUE)


Not sure what I'm doing wrong. Works great otherwise. Thanks again!

0
Entering edit mode

Great! Oh, which version are you using? - I changed the name of that argument. In your version, it may be DrawConnectors.

You can install the most up to date version of the package with:

devtools::install_github('kevinblighe/EnhancedVolcano')

0
Entering edit mode

aha! thank you so much, it's working now :)

0
Entering edit mode

Hi Kevin,

I frequently face the following error:

Error in grid.Call(C_convert, x, as.integer(whatfrom), as.integer(whatto),  :
Viewport has zero dimension(s)


Can you please suggest on how to rectify it?

0
Entering edit mode

Hey, which code are you running, exactly?

0
Entering edit mode
# set the base colour as 'black'
keyvals <- rep('#000000', nrow(res_A_W_lfc))

# set the base name/label as 'NS'
names(keyvals) <- rep('NS', nrow(res_A_W_lfc))

# modify keyvals for variables with fold change > 1.5
keyvals[which(res_A_W_lfc$log2FoldChange > 0.585 & res_A_W_lfc$padj < 0.05)] <- '#E50000'
names(keyvals)[which(res_A_W_lfc$log2FoldChange > 0.585 & res_A_W_lfc$padj < 0.05)] <- 'Up'

# modify keyvals for variables with fold change < -0.66
keyvals[which(res_A_W_lfc$log2FoldChange < -0.585 & res_A_W_lfc$padj < 0.05)] <- '#3C84FF'
names(keyvals)[which(res_A_W_lfc$log2FoldChange < -0.585 & res_A_W_lfc$padj < 0.05)] <- 'Down'

# Selected genes to represent
s_genes <- c('abo', 'Ace')

# Volcano plots
EnhancedVolcano(res_A_W_lfc,
x = 'log2FoldChange', y = 'padj',
lab = rownames(res_A_W_lfc), selectLab = s_genes,
xlim = c(-4, 4), ylim = c(0, 25),
xlab = bquote(~Log[2]~ 'Fold Change'),
ylab = bquote(~-Log[10]~italic(P)),
axisLabSize = 75,
title = 'Adult_A_W', subtitle = ' ',
titleLabSize = 75,
caption = bquote(~Log[2]~ 'FC cutoff, 0.585; padj cutoff, 0.05'), captionLabSize = 40,
pCutoff = 0.05, pLabellingCutoff = pCutoff,
FCcutoff = 0.585,
cutoffLineType = 'longdash', cutoffLineCol = 'black',
cutoffLineWidth = 1,
pointSize = c(ifelse(abs(res_A_W_lfc$log2FoldChange) > 0.585 & res_A_W_lfc$padj < 0.05, 8, 4)),
labSize = 30,
labCol = c('black', 'black'),
labFace = 'plain',
colCustom = keyvals,
colAlpha = 0.6,
legendLabels = c('NS', expression(Log[2]~FC),
legendPosition = 'right',
legendLabSize = 50, legendIconSize = 30,
gridlines.major = F, gridlines.minor = F,
border = 'full', borderWidth = 5, borderColour = 'black',
drawConnectors = T, widthConnectors = 4,
colConnectors = 'yellow')

2
Entering edit mode

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (text becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

0
Entering edit mode

Thank you Ram :). I will take it as a learning experience.

0
Entering edit mode

Whoa! I think that I saw the problem instantly when I re-ran using my own dataset. Here is what I got:

Your label sizes are way too large. Here it is with smaller label sizes:

EnhancedVolcano(res_A_W_lfc,
x = 'log2FoldChange', y = 'padj',
lab = rownames(res_A_W_lfc), selectLab = s_genes,
xlim = c(-4, 4), #ylim = c(0, 25),
xlab = bquote(~Log[2]~ 'Fold Change'),
ylab = bquote(~-Log[10]~italic(P)),
axisLabSize = 12,
title = 'Adult_A_W', subtitle = ' ',
titleLabSize = 12,
caption = bquote(~Log[2]~ 'FC cutoff, 0.585; padj cutoff, 0.05'), captionLabSize = 10,
pCutoff = 0.05, pLabellingCutoff = pCutoff,
FCcutoff = 0.585,
cutoffLineType = 'longdash', cutoffLineCol = 'black',
cutoffLineWidth = 1,
pointSize = c(ifelse(abs(res_A_W_lfc$log2FoldChange) > 0.585 & res_A_W_lfc$padj < 0.05, 5, 1)),
labSize = 6,
labCol = c('black', 'black'),
labFace = 'plain',
colCustom = keyvals,
colAlpha = 0.6,
legendLabels = c('NS', expression(Log[2]~FC),
legendPosition = 'right',
legendLabSize = 12, legendIconSize = 12,
gridlines.major = F, gridlines.minor = F,
border = 'full', borderWidth = 2, borderColour = 'black',
drawConnectors = T, widthConnectors = 2,
colConnectors = 'gold')


1
Entering edit mode

Hi Kevin, Thank you! It is working now. Regards, Bhanu

0
Entering edit mode

Hi,

Does anyone know how to colour points that have FC>2.5 AND are above the p-value cut-off in 1 colour, and at the same time use a different colour for genes with FC<2.5 AND are above the p-value cut-off? I can see how to do the former from the instructions above, and colour by FC, but not both FC and p-value. I tried using keyvals for this but keep getting error messages.

Also, my labelling only uses row numbers rather than the rowname. I have gene symbols rather than ENS numbers - could this be the problem?

1
Entering edit mode

Hello, the way to do this is via keyvals. Can you paste a sample of your data, and also the error messages that you have been receiving?

The 'ENS' IDs are Ensembl gene IDs. The IDs that you use can be anything, but I have not tested EnhancedVolcano in situations where there are no gene names at all.

0
Entering edit mode

Hi, thanks for such a pretty and clear volcano plot! I am using it in my project write up.

I am using labelled volcano as my major one and I want smaller volcanos just to see trends in different conditions without labels, but I want to make the plot look consistent.

I was wondering if I can remove the labels of genes? I tried to remove 'lab=rownames(mydata)' in R but it keeps giving me errors. I tried putting it as <na> but it also did not work.

I'm sorry if it is a really basic question, I am a complete beginner in R coding:'(

Thank you.

0
Entering edit mode

Hey, sure thing and no problem, you just need to use:

lab = NA

0
Entering edit mode

Hi, I would like to have all point log2FC >= 1 and p value 0.05 in one color and all log2FC <= -1 and p value 0.05 in another one. I tried to run the code above but it gives me this error. I am able to run a classic enhanced plot with the green/blue/red color. if I remove the labels part, it gives me this error:

Error: Aesthetics must be either length 1 or the same as the data (1418): colour

if I include the "labels" part :

Error in EnhancedVolcano(res, x = "log2FC", y = "p_value", lab = NA, xlim = c(-12, : unused argument (legend = c("NS", "Log2FC", "pvalue", "pvalue & Log2FC"))

I wanted to get the same volcano plot as previously shown but I cannot get it.

# set the base colour as "black"
keyvals <- rep("#000000", nrow(res))

# set the base name/label as "NS"
names(keyvals) <- rep("NS", nrow(res))

# modify keyvals for variables with fold change >= 1
keyvals[which(res$log2FC >= 1 & res$p_value < 0.05)] <- "#E50000"
names(keyvals)[which(res$log2FC >= 1 & res$p_value < 0.05)] <- "Up"

# modify keyvals for variables with fold change <= -1
keyvals[which(res$log2FC <= -1 & res$p_value <= 0.05)] <- "#3C84FF"
names(keyvals)[which(res$log2FC <= -1 & res$p_value <= 0.05)] <- "Down"

EnhancedVolcano(res,
x = "log2FC", y = "p_value",
lab = NA,
xlim = c(-12, 12), ylim = c(-0, 5),
xlab = bquote(~Log[2]~ "Fold Change"),
ylab = bquote(~-Log[10]~italic(P)),
axisLabSize = 12,
title = "Test", subtitle = "",
titleLabSize = 15
pCutoff = 0.05,
FCcutoff = 1,
cutoffLineType = "longdash", cutoffLineCol = "black",
cutoffLineWidth = 1,
pointSize = c(ifelse(abs(res$log2FC) >= 1 & res$p_value < 0.05, 5, 1)),
labSize = 6,
labCol = c("black", "black"),
labFace = "plain",
colCustom = keyvals,
colAlpha = 0.6,
legend=c("NS", "Log2FC", "pvalue", "pvalue & Log2FC"),
legendLabels = c('NS', expression(Log[2]~FC),
"p_value", expression(p_value~and~Log[2]~FC)),
legendPosition = "right",
legendLabSize = 12, legendIconSize = 12,
gridlines.major = F, gridlines.minor = F,
border = "full", borderWidth = 2, borderColour = "black")

0
Entering edit mode

Hey, sorry, I am only seeing this now. What is the output of:

table(keyvals)

nrow(res)


?

By the way, you should not have to set the value of legend, depending on which version you are using. Which version is it, do you know?

0
Entering edit mode

Great tool! I am trying "Custom shape & colour over-ride" , . i have 4 cell types . can you tell me how to label genes names of different shapes ?

0
Entering edit mode

Hey, did you check the vignette for how you could do this? - https://github.com/kevinblighe/EnhancedVolcano

0
Entering edit mode
EnhancedVolcano(res5,
lab = res5$symbol, x = 'log2FoldChange', y = 'padj', selectLab = rownames(res5)[which(names(keyvals.shape) %in% c('adhesion', 'pluripotent', 'differentiation', 'others'))], xlim = c(-7.0,7.0), ylim = c(0, -log10(10e-250)), #xlab = bquote(~Log[2]~ 'fold change'), #title = 'Custom shape & colour over-ride', pCutoff = 0.05, FCcutoff = 0.5, pointSize = keyvals.size, labSize = 4.0, shapeCustom = keyvals.shape, colCustom = keyvals.shape, colAlpha = 1, legendPosition = 'right', legendLabSize = 15, legendIconSize = 5.0, drawConnectors = TRUE, widthConnectors = 0.5, colConnectors = 'grey50', gridlines.major = FALSE, gridlines.minor = FALSE, border = 'full', borderWidth = 1.0, borderColour = 'grey')  i had given selectLab , but it's not labelling the gene names. and also when i am switching the x axis with y axis it is giving warning - Warning messages: 1: In FUN(X[[i]], ...) : NaNs produced 2: Removed 1 rows containing missing values (geom_vline).  ADD REPLY 1 Entering edit mode Sure, but, what is the output of: rownames(res5)[which(names(keyvals.shape) %in% c('adhesion', 'pluripotent', 'differentiation', 'others'))],  Also, whatever is the output should be a subset of res5$symbol.

Regarding the warning message, it may imply that one of your fold change values is infinite or NA, and indicates that one gene / variable was removed.

0
Entering edit mode

Hi Kevin

Awesome tool! Thanks so much.

I have run the sample data (airway) and could reproduce the results. However, I am having trouble applying it to my data. I have Lipid (instead of gene) in the first column and I am trying to plot the other two columns log2FC vs. pvalue_neglog. I have replaced lab = rownames(res1) from the example data to res$Lipid (these return the lipid names or labels for the data points on the volcano plot. I am getting a warning message and an empty plot. For your reference, I am also pasting the values in the three required columns (lipid names, log2FC and pvalue_neglog). I would really appreciate if you could please let me know what could be going wrong here. Being a biologist, I'm fairly new to R! main_df <- read_csv("CA1_results.csv") res <- select(main_df, Lipid, ratio, p.value) res <- res %>% mutate(log2FC = log(ratio,2)) res <- res %>% mutate(pvalue_neglog = log10(p.value)) res$Lipid <- as.character(res$Lipid) EnhancedVolcano(res, lab=res$Lipid,
x = 'log2FC',
y = 'pvalue_neglog',
xlim = c(-8, 8),
title = 'N061011 versus N61311',
pCutoff = 10e-16,
FCcutoff = 1.5,
pointSize = 3.0,
labSize = 3.0)

**Warning messages:
1: In EnhancedVolcano(res, lab = res$Lipid, x = "log2FC", y = "pvalue_neglog", : NaNs produced 2: In max(-log10(toptable[[y]]), na.rm = TRUE) : no non-missing arguments to max; returning -Inf 3: In FUN(X[[i]], ...) : NaNs produced 4: Removed 1 rows containing missing values (geom_hline). 5: Position guide is perpendicular to the intended axis. Did you mean to specify a different guide position?** res$Lipid (Column with 60 labels)
[1] "CerP361"
[2] "CerP485"
[3] "CerPd381"
[4] "CL704"
[5] "CL724"
[6] "CL7610"
[7] "CL7812"
.
.
.
.

[60]

res$log2FC (x-axis) [1] -0.66058738 -0.83693303 -0.55855262 -0.10573464 [5] 0.37999245 -0.02444948 0.05840667 -0.26578594 [9] -0.64982482 -0.56821750 -0.65075623 0.09954326 [13] -0.74315124 -0.59875533 -0.66586575 -0.39043677 [17] -0.63314674 -0.63222629 -0.12685260 -1.30731657 [21] -0.86504856 -0.67044005 -0.25101607 -0.48051092 [25] -0.38830116 0.26791015 -0.83080707 -0.61917330 [29] -0.61897793 -0.45797393 -0.05618424 -0.49515706 [33] -0.76122163 -1.47032466 -0.80997882 -0.30809060 [37] -0.64093523 -0.87602607 -0.73308587 -1.28756608 [41] -0.58542035 -0.90416813 -1.23939549 -0.59378280 [45] -0.89083212 -0.84249253 -0.57355367 -1.15531133 [49] -0.75234802 -0.86414964 -1.02439860 -0.56914321 [53] -1.08891281 -0.67731372 -0.84341796 -0.61205092 [57] -0.67346121 -0.65733060 -1.05860064 -0.79851071 res$pvalue_neglog (y-axis)
[1] -1.52683584 -0.83272759 -1.30194900 -0.24911099
[5] -0.77815093 -0.04840497 -0.09584658 -0.72688240
[9] -1.40470978 -1.37855432 -1.13650444 -0.13497691
[13] -1.60255243 -1.50533793 -1.40224827 -1.03024879
[17] -1.31852825 -1.24099275 -0.16144983 -1.78094064
[21] -1.48942148 -1.24172425 -0.42912043 -0.98273558
[25] -0.45855962 -0.92362174 -1.41511331 -1.09787246
[29] -1.16725712 -1.46421147 -0.05612927 -1.18343895
[33] -1.43273176 -2.55024106 -1.61400023 -0.28980805
[37] -1.33481613 -1.60298327 -1.39676689 -2.16814148
[41] -1.07488917 -3.37675071 -2.89213840 -1.01557358
[45] -1.70021556 -1.58951410 -0.91409452 -1.93518506
[49] -1.18468409 -1.34470449 -1.52330306 -1.74500292
[53] -3.08301995 -0.98856395 -1.20625811 -0.87011188
[57] -1.03800629 -0.94564760 -1.74273752 -1.12590468


Cheers Farheen

1
Entering edit mode

Thanks! - the problem is that your y-axis values are already negative log [base10]. EnhancedVolcano does this conversion internally for you.

You can get around this by converting your y-values back to p-values:

res$pvalue <- 1 / (10^(res$pvalue_neglog))


Then:

EnhancedVolcano(res,
lab=res\$Lipid,
x = 'log2FC',
y = 'pvalue',
xlim = c(-8, 8),
title = 'N061011 versus N61311',
pCutoff = 10e-16,
FCcutoff = 1.5,
pointSize = 3.0,
labSize = 3.0)

0
Entering edit mode

Thanks for your reply kevin. Yes that seemed to have fixed the issue. Also, if I may ask, could you please let me know how we set a cutoff for logFC and p value? The default ones do not seem to be applying directly to my data.

0
Entering edit mode

Non c'è problema. These cut-offs are set via pCutoff and FCcutoff. There are no standards in terms of values to choose for these.

0
Entering edit mode

Hi Kevin

I have been able to plot the data as a volcano plot. However, I have some data points which share the exact same adjusted p value and a similar FC. And so the labels are probably getting overlapped and one label is obscuring the one closest to it. Please let me know if I can change the coordinates of the annotations/labels or is there a way to spread them out so all the labels are visible.

Cheers Farheen

0
Entering edit mode

Kevin Blighe in your vignette and here, you typically show examples with quite large FC cutoffs, and the default FC cutoff value in EnhancedVolcano is normally way more than one should consider as a starting point for a gene being biologically DE. What is usually indicated as an FC cutoff for a likely biologically DE gene is something like FC > 1.1 or 1.2 (i.e. 10% or 20% change).

Is your choice of very large FC cutoffs in your vignette examples and function default setting just arbitrary?

0
Entering edit mode

The default value is 1, which, if log [base 2] fold changes are supplied, represents a cut-off of absolute log [base 2] fold change of 1. The cut-off value here is again tied to whichever fold changes are supplied, be they linear, log [base 2], or something else.

For the values specified in the vignette, I undoubtedly chose these based on the dataset whose data I am analysing in the vignette, i.e., the data contained in the airway package.

0
Entering edit mode

Ok, I see, you aren't assuming people are supplying log FC and could be linear (even though DESeq2, edgeR, and limma each give results in log2FC)

1
Entering edit mode

True, but I note that people are using EnhancedVolcano for a wide diversity of studies, including metabolomics, microbiome, proteomics, etc., and that fold changes are not always on the log [base 2] scale in these.

0
Entering edit mode

Also, why do the vignette examples and function default pCutoffassume using raw p-values? Shouldn't adjusted p-values be used instead, and default pCutoff be 0.05 or similar?

1
Entering edit mode

No, the typical way to generate volcano plots is based on nominal / un-adjusted / 'raw' p-values. However, the pCutoff parameter will represent whatever p-values are supplied, be they adjusted or nominal / un-adjusted / 'raw'.

0
Entering edit mode

But then the appropriate cutoff always changes depending on the dataset because you haven't done multiple testing correction. And you have to look in your results each time for the appropriate raw p-value cutoff that represents significant DE (by basically comparing it to the padj column to find the right cutoff)

What's the reasoning behind volcano plots based on raw p-values? Seems easier, more consistent for people to understand why the cutoff was chosen (<0.05 for example), and appropriate to use adjusted p-values.

1
Entering edit mode

Yes, the cut-off needs to vary for each comparison that is being graphically represented by the volcano.

Regarding raw p-values, it relates to how, after p-value adjustment, many p-values may end up with constant values, such as having, for example, many genes set to 0.03, which makes the volcano look unusual.

0
Entering edit mode

Hi Kevin Blighe - is there a way to color a specific set points a different color irrespective of where they are on the graph while keeping the other points with their default colors?

I first tried testing colCustom if I could recreate the default green, red, blue, grey default colors using nested ifelse statements with the log2FC and p-value thresholds, but it's weird there are some angular point coloring artifacts near the threshold dotted lines, it doesn't look like the same plot without colCustom. If that worked I was going to use colCustom to do the same thing except in the ifelse statements to use the default colors except for the points of interest.