Question

Volcano plot help code

4

Entering edit mode

6.0 years ago

anasofiamoreira94 ▴ 80

Good Morning, I'm sorry to disturb, but as anyone done these volcano plot? I'm only able to do the traditional one, i'm kind knew too these field. Thanks Ana Sofia Moreira

R rna-seq • 19k views

ADD COMMENT • link updated 5.8 years ago by jordi.planells ▴ 480 • written 6.0 years ago by anasofiamoreira94 ▴ 80

1

Entering edit mode

You can check ggplot2 documentation, ggplot2 tutorial, ggrepel. These are enough to plot what you have asked for.

ADD REPLY • link 6.0 years ago by arta ▴ 670

1

Entering edit mode

Another link: Creating Interactive Volcano Plots with R and Plot.ly

ADD REPLY • link 6.0 years ago by WouterDeCoster 47k

1

Entering edit mode

One more: Repel overlapping text labels in ggplot2

ADD REPLY • link 6.0 years ago by venu 7.1k

0

Entering edit mode

Could you edit your question by adding some test data and also what you already tried.

ADD REPLY • link 6.0 years ago by Nicolas Rosewick 10k

0

Entering edit mode

Hi Kevin, I am trying to use your code with my data but I am not being able to produce any plot (eventhough I don't get any error messages back). So I wonder wether you could help me to localize my problem. The code I am using is the following

results <- read.table("DEseq2 output.txt", sep = "\t", header = T, as.is = T)
toptable <- as.data.frame(results)

pdf("Volcanoplot_DoubleKD_gene_type.pdf",width = 10.15, height = 8.10)
library(ggplot2)


EnhancedVolcanoDESeq2 <- function(toptable, AdjustedCutoff=0.05, LabellingCutoff=0.05, FCCutoff=2.0, main="DoubleKD VolcanoPlot")
{
  toptable$Significance <- "NS"
  toptable$Significance[(abs(toptable$log2FoldChange) > FCCutoff)] <- "FC"
  toptable$Significance[(toptable$padj<AdjustedCutoff)] <- "FDR"
  toptable$Significance[(toptable$padj<AdjustedCutoff) & (abs(toptable$log2FoldChange)>FCCutoff)] <- "FC_FDR"
  table(toptable$Significance)
  toptable$Significance <- factor(toptable$Significance, levels=c("NS", "FC", "FDR", "FC_FDR"))

  plot <- ggplot(toptable, aes(x=log2FoldChange, y=-log10(padj))) +

    #Add points:
    #   Colour based on factors set a few lines up
    #   'alpha' provides gradual shading of colour
    #   Set size of points
    geom_point(aes(color=factor(Significance)), alpha=1/2, size=0.8) +

    #Choose which colours to use; otherwise, ggplot2 choose automatically (order depends on how factors are ordered in toptable$Significance)
    scale_color_manual(values=c(NS="grey30", FC="forestgreen", FDR="royalblue", FC_FDR="red2"), labels=c(NS="NS", FC=paste("LogFC>|", FCCutoff, "|", sep=""), FDR=paste("FDR Q<", AdjustedCutoff, sep=""), FC_FDR=paste("FDR Q<", AdjustedCutoff, " & LogFC>|", FCCutoff, "|", sep=""))) +

    #Set the size of the plotting window
    theme_bw(base_size=24) +

    #Modify various aspects of the plot text and legend
    theme(legend.background=element_rect(),
          plot.title=element_text(angle=0, size=12, face="bold", vjust=1),

          panel.grid.major=element_blank(), #Remove gridlines
          panel.grid.minor=element_blank(), #Remove gridlines

          axis.text.x=element_text(angle=0, size=12, vjust=1),
          axis.text.y=element_text(angle=0, size=12, vjust=1),
          axis.title=element_text(size=12),

          #Legend
          legend.position="top",            #Moves the legend to the top of the plot
          legend.key=element_blank(),       #removes the border
          legend.key.size=unit(0.5, "cm"),  #Sets overall area/size of the legend
          legend.text=element_text(size=8), #Text size
          title=element_text(size=8),       #Title text size
          legend.title=element_blank()) +       #Remove the title

    #Change the size of the icons/symbols in the legend
    guides(colour = guide_legend(override.aes=list(size=2.5))) +

    #Set x- and y-axes labels
    xlab(bquote(~Log[2]~ "fold change")) +
    ylab(bquote(~-Log[10]~adjusted~italic(P))) +

    #Set the axis limits
    #xlim(-6.5, 6.5) +
    #ylim(0, 100) +

    #Set title
    ggtitle(main) +

    #Tidy the text labels for a subset of genes
    geom_text(data=subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff),
              aes(label=rownames(subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff))),
              size=2.25,
              #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
              #segment.size=0.01,
              check_overlap=TRUE,
              vjust=1.0) +

    #Add a vertical line for fold change cut-offs
    geom_vline(xintercept=c(-FCCutoff, FCCutoff), linetype="longdash", colour="black", size=0.4) +

    #Add a horizontal line for P-value cut-off
    geom_hline(yintercept=-log10(AdjustedCutoff), linetype="longdash", colour="black", size=0.4)

  return(plot)
}

The DEseq2 output has the regular structure, being: Contig baseMean log2FoldChange lfcSE stat pvalue padj NReads FBgn0000003 135701.1304 3.841593895 0.218300714 17.59771566 2.56E-69 3.72E-66 513393 etc.

Thanks in advance,

Jordi Planells

ADD REPLY • link 5.8 years ago by jordi.planells ▴ 480

1

Entering edit mode

Hi Jordi,

First ensure that all devices are switched off by running dev.off() until there are no more devices to close.

Then, you should try to run the function like this:

library(ggplot2)
EnhancedVolcanoDESeq2 <- function(toptable, AdjustedCutoff=0.05, LabellingCutoff=0.05, FCCutoff=2.0, main="DoubleKD VolcanoPlot")
{
  toptable$Significance <- "NS"
  toptable$Significance[(abs(toptable$log2FoldChange) > FCCutoff)] <- "FC"
  toptable$Significance[(toptable$padj<AdjustedCutoff)] <- "FDR"
  toptable$Significance[(toptable$padj<AdjustedCutoff) & (abs(toptable$log2FoldChange)>FCCutoff)] <- "FC_FDR"
  table(toptable$Significance)
  toptable$Significance <- factor(toptable$Significance, levels=c("NS", "FC", "FDR", "FC_FDR"))

  plot <- ggplot(toptable, aes(x=log2FoldChange, y=-log10(padj))) +

    #Add points:
    #   Colour based on factors set a few lines up
    #   'alpha' provides gradual shading of colour
    #   Set size of points
    geom_point(aes(color=factor(Significance)), alpha=1/2, size=0.8) +

    #Choose which colours to use; otherwise, ggplot2 choose automatically (order depends on how factors are ordered in toptable$Significance)
    scale_color_manual(values=c(NS="grey30", FC="forestgreen", FDR="royalblue", FC_FDR="red2"), labels=c(NS="NS", FC=paste("LogFC>|", FCCutoff, "|", sep=""), FDR=paste("FDR Q<", AdjustedCutoff, sep=""), FC_FDR=paste("FDR Q<", AdjustedCutoff, " & LogFC>|", FCCutoff, "|", sep=""))) +

    #Set the size of the plotting window
    theme_bw(base_size=24) +

    #Modify various aspects of the plot text and legend
    theme(legend.background=element_rect(),
          plot.title=element_text(angle=0, size=12, face="bold", vjust=1),

          panel.grid.major=element_blank(), #Remove gridlines
          panel.grid.minor=element_blank(), #Remove gridlines

          axis.text.x=element_text(angle=0, size=12, vjust=1),
          axis.text.y=element_text(angle=0, size=12, vjust=1),
          axis.title=element_text(size=12),

          #Legend
          legend.position="top",            #Moves the legend to the top of the plot
          legend.key=element_blank(),       #removes the border
          legend.key.size=unit(0.5, "cm"),  #Sets overall area/size of the legend
          legend.text=element_text(size=8), #Text size
          title=element_text(size=8),       #Title text size
          legend.title=element_blank()) +       #Remove the title

    #Change the size of the icons/symbols in the legend
    guides(colour = guide_legend(override.aes=list(size=2.5))) +

    #Set x- and y-axes labels
    xlab(bquote(~Log[2]~ "fold change")) +
    ylab(bquote(~-Log[10]~adjusted~italic(P))) +

    #Set the axis limits
    #xlim(-6.5, 6.5) +
    #ylim(0, 100) +

    #Set title
    ggtitle(main) +

    #Tidy the text labels for a subset of genes
    geom_text(data=subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff),
              aes(label=rownames(subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff))),
              size=2.25,
              #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
              #segment.size=0.01,
              check_overlap=TRUE,
              vjust=1.0) +

    #Add a vertical line for fold change cut-offs
    geom_vline(xintercept=c(-FCCutoff, FCCutoff), linetype="longdash", colour="black", size=0.4) +

    #Add a horizontal line for P-value cut-off
    geom_hline(yintercept=-log10(AdjustedCutoff), linetype="longdash", colour="black", size=0.4)

  return(plot)
}

results <- read.table("DEseq2 output.txt", sep = "\t", header = T, as.is = T)
toptable <- as.data.frame(results)

pdf("Volcanoplot_DoubleKD_gene_type.pdf",width = 10.15, height = 8.10)
    EnhancedVolcanoDESeq2(toptable, AdjustedCutoff=0.05, LabellingCutoff=0.05, FCCutoff=2.0, main="DoubleKD VolcanoPlot")
dev.off()

Does that work?

ADD REPLY • link 5.8 years ago by Kevin Blighe 87k

0

Entering edit mode

Sorry for late reply I've been out for some days. I am afraid it doesn't work, I have used dev.off() untill I got this message back: Error in dev.off() : cannot shut down device 1 (the null device). Still I don't get any output. I am not able to figure out why

ADD REPLY • link 5.8 years ago by jordi.planells ▴ 480

0

Entering edit mode

Can you show the exact sequence of commands that you're using? - sorry

ADD REPLY • link 5.8 years ago by Kevin Blighe 87k

0

Entering edit mode

Of course!

toptable <- read.table("topTable_DESeq2.tabular", sep = "\t", header = T, as.is = T)
attach(toptable)

library(ggplot2)
EnhancedVolcanoDESeq2 <- function(toptable, AdjustedCutoff=0.05, LabellingCutoff=0.05, FCCutoff=2.0, main="VolcanoPlot")

and the plot commands exactly the same code you posted in github

{
  toptable$Significance <- "NS"
  toptable$Significance[(abs(toptable$log2FoldChange) > FCCutoff)] <- "FC"
  toptable$Significance[(toptable$padj<AdjustedCutoff)] <- "FDR"
  toptable$Significance[(toptable$padj<AdjustedCutoff) & (abs(toptable$log2FoldChange)>FCCutoff)] <- "FC_FDR"
  table(toptable$Significance)
  toptable$Significance <- factor(toptable$Significance, levels=c("NS", "FC", "FDR", "FC_FDR"))

  plot <- ggplot(toptable, aes(x=log2FoldChange, y=-log10(padj))) +

    #Add points:
    #   Colour based on factors set a few lines up
    #   'alpha' provides gradual shading of colour
    #   Set size of points
    geom_point(aes(color=factor(Significance)), alpha=1/2, size=0.8) +

    #Choose which colours to use; otherwise, ggplot2 choose automatically (order depends on how factors are ordered in toptable$Significance)
    scale_color_manual(values=c(NS="grey30", FC="forestgreen", FDR="royalblue", FC_FDR="red2"), labels=c(NS="NS", FC=paste("LogFC>|", FCCutoff, "|", sep=""), FDR=paste("FDR Q<", AdjustedCutoff, sep=""), FC_FDR=paste("FDR Q<", AdjustedCutoff, " & LogFC>|", FCCutoff, "|", sep=""))) +

    #Set the size of the plotting window
    theme_bw(base_size=24) +

    #Modify various aspects of the plot text and legend
    theme(legend.background=element_rect(),
          plot.title=element_text(angle=0, size=12, face="bold", vjust=1),

          panel.grid.major=element_blank(), #Remove gridlines
          panel.grid.minor=element_blank(), #Remove gridlines

          axis.text.x=element_text(angle=0, size=12, vjust=1),
          axis.text.y=element_text(angle=0, size=12, vjust=1),
          axis.title=element_text(size=12),

          #Legend
          legend.position="top",            #Moves the legend to the top of the plot
          legend.key=element_blank(),       #removes the border
          legend.key.size=unit(0.5, "cm"),  #Sets overall area/size of the legend
          legend.text=element_text(size=8), #Text size
          title=element_text(size=8),       #Title text size
          legend.title=element_blank()) +       #Remove the title

    #Change the size of the icons/symbols in the legend
    guides(colour = guide_legend(override.aes=list(size=2.5))) +

    #Set x- and y-axes labels
    xlab(bquote(~Log[2]~ "fold change")) +
    ylab(bquote(~-Log[10]~adjusted~italic(P))) +

    #Set the axis limits
    #xlim(-6.5, 6.5) +
    #ylim(0, 100) +

    #Set title
    ggtitle(main) +

    #Tidy the text labels for a subset of genes
    geom_text(data=subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff),
              aes(label=rownames(subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff))),
              size=2.25,
              #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
              #segment.size=0.01,
              check_overlap=TRUE,
              vjust=1.0) +

    #Add a vertical line for fold change cut-offs
    geom_vline(xintercept=c(-FCCutoff, FCCutoff), linetype="longdash", colour="black", size=0.4) +

    #Add a horizontal line for P-value cut-off
    geom_hline(yintercept=-log10(AdjustedCutoff), linetype="longdash", colour="black", size=0.4)

  return(plot)
}

Thank you so much again

ADD REPLY • link 5.8 years ago by jordi.planells ▴ 480

Kevin Blighe · Accepted Answer · 2018-04-18

13

Entering edit mode

6.0 years ago

Kevin Blighe 87k

Edit 28th March, 2019:

this is now a Bioconductor package: EnhancedVolcano: Publication-ready volcano plots with enhanced colouring and labeling

--------------------

That looks like an up-side down volcano plot, which is unusual. Why not just do a regular one?

You can use my EnhancedVolcano function on my GitHub page (in part developed with a Junior Bioinformatician): https://github.com/kevinblighe (https://github.com/kevinblighe/EnhancedVolcano)

Apart from anything else, it makes the use of ggrepel in order to fit as many gene labels as possible.

ADD COMMENT • link 5.1 years ago by Kevin Blighe 87k

2

Entering edit mode

PS - if you prefer to have the lines connecting the gene names to the points, then un-comment the following lines in my function:

#segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
#segment.size=0.01

ADD REPLY • link 6.0 years ago by Kevin Blighe 87k

0

Entering edit mode

Hi @ Kevin Blighe,

Is there any way to solve "package ‘EnhancedVolcano’ is not available (for R version 3.3.3 RC)" error in Rstudio?

ADD REPLY • link 5.9 years ago by Farbod ★ 3.4k

2

Entering edit mode

Ah, apologies, it is not on Bioconductor. You can just copy the R script via git clone or just download it, and then load it into your session with:

source("EnhancedVolcano.R")

I have not yet standardised the code such that it would be ready for Bioconductor. Hope that's okay.

ADD REPLY • link 5.9 years ago by Kevin Blighe 87k

1

Entering edit mode

Error in abs(toptable$log2FoldChange) : non-numeric argument to mathematical function i m trying to use this but i get this error ..sorry my bad i had some NA in data frame , now removed but i still dont get the plot...finally managed to get the plot "RStudioGD" something with the graphics output after clearing dev.off() many time somehow it worked

ADD REPLY • link 5.8 years ago by 1769mkc ★ 1.2k

0

Entering edit mode

hi @Kevin is it possible to do or label only few of the selected genes like , i see that most of them don;t show every thing but selected ones, i tried making a vector to label but i couldn't do it..it would be really helpful if you can show

ADD REPLY • link 5.8 years ago by 1769mkc ★ 1.2k

1

Entering edit mode

Yes, the function works as follows:

you can label all genes with or without overlaps
you can label all genes without overlaps

The key is the check_overlap=TRUE/FALSE parameter that is passed to geom_text() within the function

If you just want to supply a specific list of gene names, like myCustomLabel=c("GeneA", "", "GeneC", "", "", "GeneF", ...), then you should be able to pass this to the function and use it again in the geom_text() function:

geom_text(data=subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff),
            aes(label=myCustomLabel),
            size=2.25,
            #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
            #segment.size=0.01,
            check_overlap=TRUE,
            vjust=1.0) +

ADD REPLY • link 5.8 years ago by Kevin Blighe 87k

0

Entering edit mode

How do solve the non-numeric argument? I've removed NA and convert he E-number but still having the same error. Can you share with me the solution? Thanks

ADD REPLY • link 5.8 years ago by kamalghazali ▴ 20

0

Entering edit mode

Can you paste the commands that you use before you attempt to generate the volcano plot?

Also, could you run the str() command on you data and paste the output?

Thanks!

ADD REPLY • link 5.8 years ago by Kevin Blighe 87k

0

Entering edit mode

Hi Kevin,

Apology for late response. I manage to solve the problem. My file is in csv format so there is an error when import the file using read.table command. So I just reformat the file into tab deliminated format, reran back the "EnhancedVolcano.R" script and manage to generate cool volcano plot. However the gene IDs were not based on ENSEMBLE ID. DId I do something wrong ?

Volcanoplot_DoubleKD_gene_type.pdf

ADD REPLY • link 5.8 years ago by kamalghazali ▴ 20

1

Entering edit mode

Hey, that's great! The gene names will be whatever you have used, so, they could be ENSEMBL IDs, RefSeq IDs, etc.

Note that I have updated this function and it is currently being submitted to Bioconductor: https://github.com/kevinblighe/EnhancedVolcano

It is now much more flexible. You can install the first version like this:

require(devtools)

install_github("kevinblighe/EnhancedVolcano")

require(EnhancedVolcano)

?EnhancedVolcano

ADD REPLY • link 5.8 years ago by Kevin Blighe 87k

0

Entering edit mode

Hi Kevin

I've tested the https://github.com/kevinblighe/EnhancedVolcano. however the volcano plot generated without any point. I'm using the same dataset for previous run. Here are the command I used

results2<-read.table("res.infected_norm-vs-control-noNA.txt", header=TRUE)
head(results2)
EnhancedVolcano(results2,
lab = rownames(results2),
x = "log2FoldChange",
y = "pvalue",
pCutoff = 0.05,
FCcutoff = 2,
transcriptPointSize = 1.5,
transcriptLabSize = 3.0,
title= "infected vs uninfected")

EnhancedVolcano2

ADD REPLY • link 5.8 years ago by kamalghazali ▴ 20

0

Entering edit mode

Weird... what is the output of

head(results2)
str(results2)

ADD REPLY • link 5.8 years ago by Kevin Blighe 87k

0

Entering edit mode

here you are

> head(results2)
                    baseMean log2FoldChange    lfcSE      stat   pvalue
ENSMUSG00000033860 1168.9376      -27.24585 4.784735 -5.694328 1.24e-08
ENSMUSG00000057400  993.8006      -26.97533 4.784741 -5.637783 1.72e-08
ENSMUSG00000033831 1201.9378      -26.87136 4.784734 -5.616061 1.95e-08
ENSMUSG00000078520  551.5458      -26.32777 4.784771 -5.502410 3.75e-08
ENSMUSG00000059481  686.4623      -26.24442 4.784757 -5.485006 4.13e-08
ENSMUSG00000113555  807.0622      -25.97442 4.457547 -5.827064 5.64e-09
                       padj
ENSMUSG00000033860 2.19e-07
ENSMUSG00000057400 2.96e-07
ENSMUSG00000033831 3.32e-07
ENSMUSG00000078520 6.09e-07
ENSMUSG00000059481 6.65e-07
ENSMUSG00000113555 1.06e-07


> str(results2)
'data.frame':   33044 obs. of  6 variables:
 $ baseMean      : num  1169 994 1202 552 686 ...
 $ log2FoldChange: num  -27.2 -27 -26.9 -26.3 -26.2 ...
 $ lfcSE         : num  4.78 4.78 4.78 4.78 4.78 ...
 $ stat          : num  -5.69 -5.64 -5.62 -5.5 -5.49 ...
 $ pvalue        : num  1.24e-08 1.72e-08 1.95e-08 3.75e-08 4.13e-08 5.64e-09 6.14e-08 8.28e-08 8.95e-08 1.03e-07 ...
 $ padj          : num  2.19e-07 2.96e-07 3.32e-07 6.09e-07 6.65e-07 1.06e-07 9.57e-07 1.26e-06 1.35e-06 1.54e-06 ...

ADD REPLY • link 5.8 years ago by kamalghazali ▴ 20

0

Entering edit mode

Thanks! Nothing seems out of the ordinary. I was able to plot the data-points you pasted.

Have you got the ggrepel package installed? - it's required, but wasn't required in the 'pre-development' version that I posted on Biostars.

I can see that I need to improve the code to produce helpful error messages. It's still passing through the Bioconductor submission process.

If worse comes to worse, just continue to use the version that actually worked for you. Thanks!

ADD REPLY • link 5.8 years ago by Kevin Blighe 87k

1

Entering edit mode

yup, I've ggrepel installed.

the volcano plot generated using the script is good enough. Thank you very much, appreciate your help!!

ADD REPLY • link 5.8 years ago by kamalghazali ▴ 20

1

Entering edit mode

I think that i'v solved this issue in the most recent version (0.99.4) on GitHub. I encountered the problem myself recently. Thanks for the input!

ADD REPLY • link 5.7 years ago by Kevin Blighe 87k

1

Entering edit mode

I've rerun my dataset using the recent version. Yup now I'm able to generate the plot. Thanks Kevin

ADD REPLY • link 5.7 years ago by kamalghazali ▴ 20

0

Entering edit mode

Hi Kevin, This seems like some great package, however I am running into the same "empty plot" error as above. I am pasting the str(data) results of 'data'(i.e. my input file). Could you please let me know,is the file format is wrong? My code for the plot is same as above:

EnhancedVolcano(data,
                lab = rownames(data),
                x = "log2FoldChange",
                y = "qvalue",
                pCutoff = 0.05,
                FCcutoff = 2,
                transcriptPointSize = 1.5,
                transcriptLabSize = 3.0,
                title= "Treated vs untreated")

Edit: I don't know how to upload image, so copy pasting the str(data) results.( Sorry)

'data.frame':   58676 obs. of  14 variables:
 $ GeneId                            : Factor w/ 58676 levels "TBIG000001","TBIG000002",..: 641 4468 9003 9153 9274 9365 10275 11524 14429 15694 ...
 $ GeneAcc                           : Factor w/ 58676 levels "ENSG00000000003",..: 16655 5426 4353 36091 13318 6834 10698 622 13091 4184 ...
 $ NcbiGene                          : Factor w/ 26853 levels "1","10","100",..: 17517 22283 20484 17872 9266 25141 14238 7687 10038 17173 ...
 $ Type                              : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
 $ GeneName                          : Factor w/ 57131 levels "1-Dec","1-Mar",..: 41663 21692 22354 54930 13393 20863 30970 23741 27077 41135 ...
 $ Desc                              : Factor w/ 45340 levels "","-","1-acylglycerol-3-phosphate O-acyltransferase 1 [Source:HGNC Symbol;Acc:HGNC:324]",..: 26809 5467 2846 41883 914 2306 13008 4060 9260 26723 ...
 $ EXP.Control.FPKM                  : Factor w/ 5482 levels "-","0","0.01",..: 3796 3686 5247 599 3716 2286 1010 4941 5169 452 ...
 $ EXP.Case.FPKM                     : Factor w/ 5667 levels "-","0","0.01",..: 1691 4570 1429 553 4117 3494 3340 265 2342 1686 ...
 $ DEG.Control.vs.Case.QV.0.05.SELECT: Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
 $ DEG.Control.vs.Case.PV            : Factor w/ 6633 levels "-","0.0001","0.0002",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ qvalue                            : num  0.0287 0.0154 0.0154 0.0154 0.0154 0.0154 0.0154 0.0154 0.0154 0.0154 ...
 $ log2FoldChange                    : num  -4.54 -2.79 -2.47 3.1 -3.04 -2.63 -2.03 -2.95 -1.83 -2.6 ...
 $ DEG.Control.vs.Case.UP_DOWN       : Factor w/ 3 levels "-","DOWN","UP": 2 2 2 3 2 2 2 2 2 2 ...
 $ GeneOntology                      : Factor w/ 16446 levels "-","GO:0000002,GO:0000122,GO:0000165,GO:0000790,GO:0000977,GO:0000978,GO:0000981,GO:0001077,GO:0001085,GO:0001205,G"| __truncated__,..: 11435 8571 8697 7835 10594 955 8730 11181 8371 8192 ...

ADD REPLY • link updated 4.5 years ago by Kevin Blighe 87k • written 4.5 years ago by Bioinformatician_in_trouble ▴ 30

0

Entering edit mode

Hey, are the rownames even set for your data object? Maybe try

lab = as.character(data$GeneId)

ADD REPLY • link 4.5 years ago by Kevin Blighe 87k

0

Entering edit mode

Yes, the row names are set. same thing even after trying lab= as.character(data$GeneId). I observed that lot of my rows are NA's for foldchange and qvalue. do you think this is affecting the results?

ADD REPLY • link 4.5 years ago by Bioinformatician_in_trouble ▴ 30

0

Entering edit mode

hi @Kevin Blighe, anything on this please?

ADD REPLY • link 4.5 years ago by Bioinformatician_in_trouble ▴ 30

0

Entering edit mode

Hey, when you say "a lot" [of your rows are NA], you equally imply that not everything is NA (?). Perhaps, therefore, the problem lies elsewhere outside of EnhancedVolcano().

What does the following return:

table( is.na(as.character(data$GeneId)) )

table( is.na(data$GeneId) )

ADD REPLY • link 4.5 years ago by Kevin Blighe 87k

0

Entering edit mode

Thank you for your reply,Kevin.

table( is.na(as.character(data$GeneId)) ) returns < table of extent 0 >
table( is.na(data$GeneId) ) returns < table of extent 0 >

I wonder what could the problem outside enhanced volcano be. My input is a csv file with the columns; GeneName, log2FoldChange, qvalue,Control Vs Case UP_Down

column "Control Vs Case UP_Down" just tells me whether a gene is up regulated or downregulated.

ADD REPLY • link updated 4.5 years ago by Kevin Blighe 87k • written 4.5 years ago by Bioinformatician_in_trouble ▴ 30

0

Entering edit mode

Interesting... what about:

table( as.character(data$GeneId) == 'NA' )

how did you produce the data object? It is strange that virtually all columns are factors.

ADD REPLY • link 4.5 years ago by Kevin Blighe 87k

0

Entering edit mode

table( as.character(data$GeneId) == 'NA' ) also gives me < table of extent 0 > I was given the csv file with above listed columns, so am not very sure which procedure was used to arrive at the results in csv file. Can we not convert factors into a type that is required by EnhancedVolcano function(i.e. if factors are creating a problem)?

ADD REPLY • link 4.5 years ago by Bioinformatician_in_trouble ▴ 30

0

Entering edit mode

Yes, but as.character() should convert the factors into characters. You have never shown your resulting plot; so, I can neither confirm that anything is even erroneous here. How do you read the data into R?

ADD REPLY • link 4.5 years ago by Kevin Blighe 87k

0

Entering edit mode

I read using read.csv. I am not sure how to upload the image. could you please tell me? or i can send it to you else where if that is okay with you.

ADD REPLY • link 4.5 years ago by Bioinformatician_in_trouble ▴ 30

0

Entering edit mode

kevin, this tool is very good in providing beautiful figure. but your script use airway dataset, right? but i got confused with the series of replies. can you redirect me to the newer ver so i can follow it using my data?

ADD REPLY • link 5.0 years ago by bioinfool ▴ 20

0

Entering edit mode

Hey, assuming that you have any differential expression analysis results table with p-value and fold changes, then you can use the package EnhancedVolcano. What have you produced so far? Can you post an example?

ADD REPLY • link 5.0 years ago by Kevin Blighe 87k

0

Entering edit mode

thanks for the quick reply. i am fairly new to this so i had difficulty understanding how things work here. anyway, yes, i already have my DE data, which i already filtered. it consist of my_transcript_name, log2FC, & FDR value. since my data is already filtered, i just wanted to plot it using volcano plot. i started with the basic command from your script because i just want to draw out of something with my data. however, i couldn't arrive with a figure.

results <- read.table("my_data", header=TRUE, sep = '\t')

toptable <- as.data.frame(results)

install.packages("ggplot2")

library(ggplot2)

plot <- ggplot (toptable, aes(x=log2FC,y=FDR))

ADD REPLY • link 5.0 years ago by bioinfool ▴ 20

0

Entering edit mode

Oh, this code is now a comprehensive package on Bioconductor: https://github.com/kevinblighe/EnhancedVolcano

In EnhancedVolcano, for the most basic plot, you would do:

library(EnhancedVolcano)
EnhancedVolcano(toptable,
    lab = rownames(toptable),
    x = 'log2FC',
    y = 'FDR')

You may have to install the package first.

ADD REPLY • link 5.0 years ago by Kevin Blighe 87k

0

Entering edit mode

tried installing the package but it says

Warning in install.packages: package 'EnhancedVolcano' is not available (for R ver 3.5.3)

what to do?

ADD REPLY • link 5.0 years ago by bioinfool ▴ 20

0

Entering edit mode

Try to install not via install.packages()

if (!requireNamespace('BiocManager', quietly = TRUE))
   install.packages('BiocManager')
   BiocManager::install('EnhancedVolcano')

In the unlikely event that that does not work, this will work:

devtools::install_github('kevinblighe/EnhancedVolcano')

(you may need to install devtools first)

ADD REPLY • link 5.0 years ago by Kevin Blighe 87k