Question: enhancedvolcano plot rowname question
nsmaan10 wrote:

Hello all, I am using one of your sample scripts to test my data in volcano plots. My file has 3 columns: Gene, log2FoldChange, and pvalue. By default, I keep getting the row names as labels. What I want to plot as labels is the "Gene" name, not row names like 1, 2, 3, etc.

Could you please help?

This is the sample script:

res <- read.table("results.txt", header=TRUE)

head(res)

#rownames(res) <- sub("Gene", "", rownames(res))

EnhancedVolcano(res,
    lab = rownames(res),
    x = "log2FoldChange",
    y = "pvalue",
    ylab = bquote(~-Log[10]~italic(Pvalue)),
    pCutoff = 10e-5,
    FCcutoff = 1.5,
    #xlim=c(-5.5, 5.5),
    #ylim=c(0, -log10(10e-12)),
    transcriptLabSize = 3.5,
    title = "Drug+Toxin VS Ctrl results",
    legendPosition = "right",
    legendLabSize = 14,
    col = c("grey30", "forestgreen", "royalblue", "red2"),
    colAlpha=0.9,
    #DrawConnectors = TRUE,
    widthConnectors=0.2)
ADD COMMENTlink modified 6 months ago • written 6 months ago by nsmaan10

Kevin, thank you very much for your prompt response. Please also let me know how to apply different shapes, sizes, and colors to the labeled genes on the left and right (e.g., to have the genes on the right be red, star-shaped, and larger than the default size).

P.S. I am installing your newer ver. for EV plots

ADD REPLYlink written 6 months ago by nsmaan10

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

This comment belongs under @Kevin's answer.

ADD REPLYlink written 6 months ago by genomax70k
Kevin Blighe46k wrote:

Hey, you then just need to specify:

lab = res$Gene
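
To see why this works, here is a minimal sketch on a toy data frame. The three rows are copied from the head(res) output posted further down this thread, and the data frame is built inline purely for illustration. A data frame read without explicit row names gets the default row names 1, 2, 3, ..., so rownames(res) yields those numbers, while res$Gene holds the gene symbols:

```r
# Toy stand-in for read.table("results.txt", header=TRUE)
res <- data.frame(
  Gene = c("Ptprs", "Cd163", "Fcna"),
  log2FoldChange = c(-1.044483, -4.219374, -3.046358),
  pvalue = c(3.88e-14, 4.16e-13, 5.03e-13),
  stringsAsFactors = FALSE
)

row_labels  <- rownames(res)  # default row names, as character strings
gene_labels <- res$Gene       # the gene symbols you actually want as labels

print(row_labels)
print(gene_labels)
```

Passing gene_labels (that is, res$Gene) to lab is what replaces the numeric labels.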

Which version are you using, by the way? In the latest [devel] release, DrawConnectors has become drawConnectors. A lot of other new features have been added, too: https://github.com/kevinblighe/EnhancedVolcano

You can install the latest with:

devtools::install_github('kevinblighe/EnhancedVolcano')

Kevin

ADD COMMENTlink written 6 months ago by Kevin Blighe46k

Kevin, other experts: I saw this volcano plot in a 2011 paper. They showed an FDR and FC cutoff-based plot in which a selected number of top genes (e.g., 30) were labeled. They also represented the absolute fold change of all genes by circle size and by different color codes. Could anyone please provide sample scripts to do something similar, if available? Thank you,

Image Link: https://www.researchgate.net/figure/A-volcano-plot-representation-of-the-differentially-expressed-genes-in-a-pair-wise_fig2_51862161

ADD REPLYlink written 6 months ago by nsmaan10

That plot could be mostly reproduced using EnhancedVolcano. The functionality for point size scaling based on statistical significance is not yet available, but it will be [available] in later versions. The functionality for label boxes with lines drawn to the points is already available (look in the vignette). The functionality for drawing ellipses around points is not yet available. I instead chose to let users identify groups of transcripts / genes by:

  • colour
  • shape
  • shade
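
For the "label only the top N genes" part of that figure, one possible approach is to pick the gene names yourself and hand them to selectLab. The following is a rough sketch in base R, assuming a data frame with Gene, pvalue and padj columns like the one shown elsewhere in this thread. The toy rows are copied from that head(res) output, and top_n is a hypothetical name introduced here:

```r
# Toy data: the six rows from the head(res) output in this thread
res <- data.frame(
  Gene   = c("Ptprs", "Cd163", "Fcna", "Ces2j", "Vsig4", "Ces2a"),
  pvalue = c(3.88e-14, 4.16e-13, 5.03e-13, 2.54e-13, 4.33e-13, 7.79e-13),
  padj   = c(4.73e-10, 1.02e-09, 1.02e-09, 1.02e-09, 1.02e-09, 1.31e-09),
  stringsAsFactors = FALSE
)

top_n <- 3  # hypothetical; the paper in question labeled about 30

# Rank by adjusted p-value, breaking ties with the raw p-value
ord <- order(res$padj, res$pvalue)
top_genes <- res$Gene[ord][seq_len(min(top_n, nrow(res)))]

print(top_genes)
# The vector could then be supplied as: selectLab = top_genes
```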
ADD REPLYlink written 6 months ago by Kevin Blighe46k

Functionality to change the shape of the points was only recently added. See the vignette at these parts:

It will take you a bit of work to do this if you are a beginner in R.

ADD REPLYlink written 6 months ago by Kevin Blighe46k

Yes, unfortunately I am a beginner. However, to learn, I am going through the examples one by one in "Publication-ready volcano plots with enhanced colouring and labeling". 1) The first few work fine, but when I add shape, it says unused argument (shape =8). I have version EnhancedVolcano_1.0.1. 2) Also, how can we get number counts of data points that we have in each quadrangle of the plot?

Thank you, and i appreciate the time users take to answer the comments...

ADD REPLYlink written 6 months ago by nsmaan10

If you install via devtools::install_github('kevinblighe/EnhancedVolcano'), the version should be 1.1.3. Can you confirm?
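
A quick way to check which version is actually installed is sketched below. It uses only base R functions (requireNamespace and packageVersion); ev_installed is just an illustrative variable name:

```r
# Check whether the package is available, then report its version
ev_installed <- requireNamespace("EnhancedVolcano", quietly = TRUE)

if (ev_installed) {
  print(packageVersion("EnhancedVolcano"))
} else {
  message("EnhancedVolcano is not installed in this library")
}
```

If the printed version is still the old one, restarting R after installation is usually worth trying before anything else.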

For the issues with shape, you will have to post all commands that you're using, and also a sample of your input data.

Also, how can we get number counts of data points that we have in each quadrangle of the plot?

I would honestly just do that manually. For example, this will count the genes with pvalue < 0.01 and log2FoldChange > 2:

nrow(subset(res, pvalue<0.01 & log2FoldChange > 2))
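
To get all regions of the plot in one go, the same idea can be extended with a small classification step. This is only a sketch: the cutoffs p_cut and fc_cut and the region names are assumptions chosen for illustration, and the toy rows are taken from the head(res) output in this thread:

```r
# Toy data: the six rows from the head(res) output in this thread
res <- data.frame(
  Gene = c("Ptprs", "Cd163", "Fcna", "Ces2j", "Vsig4", "Ces2a"),
  log2FoldChange = c(-1.044483, -4.219374, -3.046358,
                     1.825855, -5.002890, 1.739246),
  pvalue = c(3.88e-14, 4.16e-13, 5.03e-13, 2.54e-13, 4.33e-13, 7.79e-13),
  stringsAsFactors = FALSE
)

p_cut  <- 0.01  # assumed p-value cutoff
fc_cut <- 2     # assumed log2 fold change cutoff

sig <- res$pvalue < p_cut
region <- ifelse(sig & res$log2FoldChange >  fc_cut, "up",
          ifelse(sig & res$log2FoldChange < -fc_cut, "down", "other"))

# Fix the levels so that empty regions still show up with a count of 0
counts <- table(factor(region, levels = c("down", "other", "up")))
print(counts)
```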
ADD REPLYlink modified 6 months ago • written 6 months ago by Kevin Blighe46k

Thanks, K. Yes, with install_github('kevinblighe/EnhancedVolcano'), shape is working, I think. I am close to getting what I want using the script below:

1) However, I am still getting numbers instead of 'Gene' name in my labels,

2) Also, I am doing something wrong in the cutoff for labels, for me this line works

keyvals[which(res2$log2FoldChange > 2.0)] <- 'green'

but when i change it to

keyvals[which(res2, padj<0.05 & log2FoldChange > 2.0)] <- 'green'

it is all black.

head(res)
   Gene log2FoldChange   pvalue     padj
1 Ptprs      -1.044483 3.88e-14 4.73e-10
2 Cd163      -4.219374 4.16e-13 1.02e-09
3  Fcna      -3.046358 5.03e-13 1.02e-09
4 Ces2j       1.825855 2.54e-13 1.02e-09
5 Vsig4      -5.002890 4.33e-13 1.02e-09
6 Ces2a       1.739246 7.79e-13 1.31e-09

library(dplyr)
library(ggplot2)
library(ggrepel)
library(EnhancedVolcano)
res2 <- read.table("results.txt", header=TRUE)
keyvals <- rep('black', nrow(res2))
names(keyvals) <- rep('Mid', nrow(res2))
keyvals[which(res2$log2FoldChange > 2.0)] <- 'green'
names(keyvals)[which(res2$log2FoldChange > 2.0)] <- 'high'
keyvals[which(res2$log2FoldChange < -2.0)] <- 'royalblue'
names(keyvals)[which(res2$log2FoldChange < -2.0)] <- 'low'
unique(names(keyvals))
unique(keyvals)
keyvals[1:20]
EnhancedVolcano(res2,
    lab = res2$Gene,
    x = 'log2FoldChange',
    y = 'padj',
    selectLab =res2$Gene[which(names(keyvals) %in% c('high', 'low'))],
    xlim = c(-8,8),
    xlab = bquote(~Log[2]~ 'fold change'),
    ylab = bquote(~Log[10]~ 'padj'),
    title = 'Custom colour over-ride',
    pCutoff = 10e-6,
    FCcutoff = 2.0,
    transcriptPointSize = 1,
    transcriptLabSize = 4.5,
    shape = c(6, 4, 2, 11),
    colCustom = keyvals,
    colAlpha = 1,
    legendPosition = 'top',
    legendLabSize = 15,
    legendIconSize = 5.0,
    drawConnectors = FALSE,
    widthConnectors = 0.5,
    colConnectors = 'grey50',
    gridlines.major = TRUE,
    gridlines.minor = FALSE,
    border = 'partial',
    borderWidth = 1.5,
    borderColour = 'black')
ADD REPLYlink modified 6 months ago by Kevin Blighe46k • written 6 months ago by nsmaan10

1) However, I am still getting numbers instead of 'Gene' name in my labels,

That is likely because your Gene variable is encoded as a factor. Try this before running anything else:

res$Gene <- as.character(res$Gene)

For the other part, you need to do:

keyvals[which(res2$padj<0.05 & res2$log2FoldChange > 2.0)] <- 'green'
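
Both points can be illustrated on a toy data frame. Note that the values below are made up for illustration and are not from the real results file. In R versions before 4.0.0, read.table converts strings to factors by default, which is one way Gene can end up as a factor. Also, which() expects a single logical vector, so column names are not looked up inside res2 unless you write res2$ or wrap the expression in with():

```r
# Toy data with Gene deliberately stored as a factor
res2 <- data.frame(
  Gene = factor(c("Ptprs", "Cd163", "Ces2j")),
  log2FoldChange = c(-1.0, -4.2, 2.5),
  padj = c(4.73e-10, 1.02e-09, 0.20)
)

print(is.factor(res2$Gene))           # TRUE before the conversion
res2$Gene <- as.character(res2$Gene)  # now plain gene symbols

keyvals <- rep("black", nrow(res2))

# Each condition must refer to the columns explicitly
up   <- res2$padj < 0.05 & res2$log2FoldChange >  2.0
down <- res2$padj < 0.05 & res2$log2FoldChange < -2.0
keyvals[which(up)]   <- "green"
keyvals[which(down)] <- "royalblue"

# Equivalent form that avoids repeating res2$
up_alt <- with(res2, padj < 0.05 & log2FoldChange > 2.0)

print(keyvals)
```

Here the third gene has a large fold change but fails the padj cutoff, so it stays black.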
ADD REPLYlink modified 6 months ago • written 6 months ago by Kevin Blighe46k

Yes, Gene was a factor. It is working now. However, the script still shows an error reading object 'res2log2FoldChange'.

I modified it as:

res2 <- read.table("results.txt", header=TRUE)
res2$Gene <- as.character(res2$Gene)
keyvals <- rep('black', nrow(res2))
names(keyvals) <- rep('Mid', nrow(res2))
keyvals[which(res2$padj<0.05 & res2log2FoldChange > 2.0)] <- 'green'
names(keyvals)[which(res2$log2FoldChange > 2.0)] <- 'high'
keyvals[which(res2$padj<0.05 & res2log2FoldChange < -2.0)] <- 'royalblue'
names(keyvals)[which(res2$log2FoldChange < -2.0)] <- 'low'
unique(names(keyvals))
unique(keyvals)
keyvals[1:20]
EnhancedVolcano(res2,
    lab = res2$Gene,
    x = 'log2FoldChange',
    y = 'padj',
    selectLab =res2$Gene[which(names(keyvals) %in% c('high', 'low'))],
..........
ADD REPLYlink modified 6 months ago by Kevin Blighe46k • written 6 months ago by nsmaan10

Check your code again. You are missing a dollar, $, in one of your lines:

keyvals[which(res2$padj<0.05 & res2log2FoldChange > 2.0)] <- 'green'
ADD REPLYlink written 6 months ago by Kevin Blighe46k

Embarrassed.

The script is working well now. Also, out of curiosity, is it possible to make just the labeled genes look different (e.g., filled markers or larger markers)?

Thanks for all your help!

ADD REPLYlink written 6 months ago by nsmaan10

Some shapes are by default filled or unfilled. If you look in the vignette, you can see how some are unfilled, for example in the figure ex4-2.

You can have certain shapes for your genes of interest via the shapeCustom parameter. It functions in the same way as colCustom. An example here: https://github.com/kevinblighe/EnhancedVolcano#over-ride-colour-andor-shape-scheme-with-custom-key-value-pairs

You can check all possible shapes here: http://sape.inf.usi.ch/quick-reference/ggplot2/shape
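
As a rough sketch of what such a shape vector might look like, the block below builds one the same way keyvals was built for colCustom earlier in this thread. The toy values are made up, keyvals.shape is just an illustrative name, and the shape codes (1 = open circle, 8 = star, 17 = filled triangle) are standard ggplot2 shapes chosen arbitrarily:

```r
# Toy data (illustrative values only)
res2 <- data.frame(
  Gene = c("Ptprs", "Cd163", "Ces2j"),
  log2FoldChange = c(-1.0, -4.2, 2.5),
  padj = c(4.73e-10, 1.02e-09, 1.02e-09),
  stringsAsFactors = FALSE
)

# Default shape for every point, named for the legend
keyvals.shape <- rep(1, nrow(res2))
names(keyvals.shape) <- rep("Mid", nrow(res2))

high <- which(res2$log2FoldChange >  2.0)
low  <- which(res2$log2FoldChange < -2.0)

keyvals.shape[high] <- 8          # star for genes on the right
names(keyvals.shape)[high] <- "high"
keyvals.shape[low] <- 17          # filled triangle for genes on the left
names(keyvals.shape)[low] <- "low"

print(keyvals.shape)
# The vector would then be passed as: shapeCustom = keyvals.shape
```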

Regarding size, currently, you can only change the global size for all shapes via transcriptPointSize. In the next version, I will add functionality to have different sizes. Note, however, that by default some shapes are different sizes. For example 20 and 21 are both circles but 21 is larger.

ADD REPLYlink written 6 months ago by Kevin Blighe46k

This is very cool. I am playing around with these options now to get the best view. Thanks. I noticed one more thing that I am not sure how to correct: if I toggle drawConnectors between TRUE and FALSE (everything else the same), I get a lot more labeled genes with TRUE. My aim is to have connector lines only for the genes that were labeled when I used the FALSE option above.

ADD REPLYlink written 6 months ago by nsmaan10

Yes, more labels will fit in the plot space with drawConnectors = TRUE. If you are only interested in labeling certain genes, then just pass these genes as a vector to selectLab. You can also now draw a box around each label with boxedlabels = TRUE.

ADD REPLYlink written 6 months ago by Kevin Blighe46k