Question: Volcano plot help code
4
gravatar for anasofiamoreira94
19 months ago by
anasofiamoreira9450 wrote:

Good Morning, I'm sorry to disturb, but as anyone done these volcano plot? I'm only able to do the traditional one, i'm kind knew too these field. Thanks Ana Sofia Moreira

Volcano Diff

rna-seq R • 7.5k views
ADD COMMENTlink modified 17 months ago by jordi.planells90 • written 19 months ago by anasofiamoreira9450
1

You can check ggplot2 documentation, ggplot2 tutorial, ggrepel. These are enough to plot what you have asked for.

ADD REPLYlink written 19 months ago by arta540
1

Another link: Creating Interactive Volcano Plots with R and Plot.ly

ADD REPLYlink written 19 months ago by WouterDeCoster42k
1

One more: Repel overlapping text labels in ggplot2

ADD REPLYlink written 19 months ago by venu6.3k

Could you edit your question by adding some test data and also what you already tried.

ADD REPLYlink written 19 months ago by Nicolas Rosewick8.4k

Hi Kevin, I am trying to use your code with my data but I am not being able to produce any plot (eventhough I don't get any error messages back). So I wonder wether you could help me to localize my problem. The code I am using is the following

results <- read.table("DEseq2 output.txt", sep = "\t", header = T, as.is = T)
toptable <- as.data.frame(results)

pdf("Volcanoplot_DoubleKD_gene_type.pdf",width = 10.15, height = 8.10)
library(ggplot2)


EnhancedVolcanoDESeq2 <- function(toptable, AdjustedCutoff=0.05, LabellingCutoff=0.05, FCCutoff=2.0, main="DoubleKD VolcanoPlot")
{
  toptable$Significance <- "NS"
  toptable$Significance[(abs(toptable$log2FoldChange) > FCCutoff)] <- "FC"
  toptable$Significance[(toptable$padj<AdjustedCutoff)] <- "FDR"
  toptable$Significance[(toptable$padj<AdjustedCutoff) & (abs(toptable$log2FoldChange)>FCCutoff)] <- "FC_FDR"
  table(toptable$Significance)
  toptable$Significance <- factor(toptable$Significance, levels=c("NS", "FC", "FDR", "FC_FDR"))

  plot <- ggplot(toptable, aes(x=log2FoldChange, y=-log10(padj))) +

    #Add points:
    #   Colour based on factors set a few lines up
    #   'alpha' provides gradual shading of colour
    #   Set size of points
    geom_point(aes(color=factor(Significance)), alpha=1/2, size=0.8) +

    #Choose which colours to use; otherwise, ggplot2 choose automatically (order depends on how factors are ordered in toptable$Significance)
    scale_color_manual(values=c(NS="grey30", FC="forestgreen", FDR="royalblue", FC_FDR="red2"), labels=c(NS="NS", FC=paste("LogFC>|", FCCutoff, "|", sep=""), FDR=paste("FDR Q<", AdjustedCutoff, sep=""), FC_FDR=paste("FDR Q<", AdjustedCutoff, " & LogFC>|", FCCutoff, "|", sep=""))) +

    #Set the size of the plotting window
    theme_bw(base_size=24) +

    #Modify various aspects of the plot text and legend
    theme(legend.background=element_rect(),
          plot.title=element_text(angle=0, size=12, face="bold", vjust=1),

          panel.grid.major=element_blank(), #Remove gridlines
          panel.grid.minor=element_blank(), #Remove gridlines

          axis.text.x=element_text(angle=0, size=12, vjust=1),
          axis.text.y=element_text(angle=0, size=12, vjust=1),
          axis.title=element_text(size=12),

          #Legend
          legend.position="top",            #Moves the legend to the top of the plot
          legend.key=element_blank(),       #removes the border
          legend.key.size=unit(0.5, "cm"),  #Sets overall area/size of the legend
          legend.text=element_text(size=8), #Text size
          title=element_text(size=8),       #Title text size
          legend.title=element_blank()) +       #Remove the title

    #Change the size of the icons/symbols in the legend
    guides(colour = guide_legend(override.aes=list(size=2.5))) +

    #Set x- and y-axes labels
    xlab(bquote(~Log[2]~ "fold change")) +
    ylab(bquote(~-Log[10]~adjusted~italic(P))) +

    #Set the axis limits
    #xlim(-6.5, 6.5) +
    #ylim(0, 100) +

    #Set title
    ggtitle(main) +

    #Tidy the text labels for a subset of genes
    geom_text(data=subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff),
              aes(label=rownames(subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff))),
              size=2.25,
              #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
              #segment.size=0.01,
              check_overlap=TRUE,
              vjust=1.0) +

    #Add a vertical line for fold change cut-offs
    geom_vline(xintercept=c(-FCCutoff, FCCutoff), linetype="longdash", colour="black", size=0.4) +

    #Add a horizontal line for P-value cut-off
    geom_hline(yintercept=-log10(AdjustedCutoff), linetype="longdash", colour="black", size=0.4)

  return(plot)
}

The DEseq2 output has the regular structure, being: Contig baseMean log2FoldChange lfcSE stat pvalue padj NReads FBgn0000003 135701.1304 3.841593895 0.218300714 17.59771566 2.56E-69 3.72E-66 513393 etc.

Thanks in advance,

Jordi Planells

ADD REPLYlink written 17 months ago by jordi.planells90
1

Hi Jordi,

First ensure that all devices are switched off by running dev.off() until there are no more devices to close.

Then, you should try to run the function like this:

library(ggplot2)
EnhancedVolcanoDESeq2 <- function(toptable, AdjustedCutoff=0.05, LabellingCutoff=0.05, FCCutoff=2.0, main="DoubleKD VolcanoPlot")
{
  toptable$Significance <- "NS"
  toptable$Significance[(abs(toptable$log2FoldChange) > FCCutoff)] <- "FC"
  toptable$Significance[(toptable$padj<AdjustedCutoff)] <- "FDR"
  toptable$Significance[(toptable$padj<AdjustedCutoff) & (abs(toptable$log2FoldChange)>FCCutoff)] <- "FC_FDR"
  table(toptable$Significance)
  toptable$Significance <- factor(toptable$Significance, levels=c("NS", "FC", "FDR", "FC_FDR"))

  plot <- ggplot(toptable, aes(x=log2FoldChange, y=-log10(padj))) +

    #Add points:
    #   Colour based on factors set a few lines up
    #   'alpha' provides gradual shading of colour
    #   Set size of points
    geom_point(aes(color=factor(Significance)), alpha=1/2, size=0.8) +

    #Choose which colours to use; otherwise, ggplot2 choose automatically (order depends on how factors are ordered in toptable$Significance)
    scale_color_manual(values=c(NS="grey30", FC="forestgreen", FDR="royalblue", FC_FDR="red2"), labels=c(NS="NS", FC=paste("LogFC>|", FCCutoff, "|", sep=""), FDR=paste("FDR Q<", AdjustedCutoff, sep=""), FC_FDR=paste("FDR Q<", AdjustedCutoff, " & LogFC>|", FCCutoff, "|", sep=""))) +

    #Set the size of the plotting window
    theme_bw(base_size=24) +

    #Modify various aspects of the plot text and legend
    theme(legend.background=element_rect(),
          plot.title=element_text(angle=0, size=12, face="bold", vjust=1),

          panel.grid.major=element_blank(), #Remove gridlines
          panel.grid.minor=element_blank(), #Remove gridlines

          axis.text.x=element_text(angle=0, size=12, vjust=1),
          axis.text.y=element_text(angle=0, size=12, vjust=1),
          axis.title=element_text(size=12),

          #Legend
          legend.position="top",            #Moves the legend to the top of the plot
          legend.key=element_blank(),       #removes the border
          legend.key.size=unit(0.5, "cm"),  #Sets overall area/size of the legend
          legend.text=element_text(size=8), #Text size
          title=element_text(size=8),       #Title text size
          legend.title=element_blank()) +       #Remove the title

    #Change the size of the icons/symbols in the legend
    guides(colour = guide_legend(override.aes=list(size=2.5))) +

    #Set x- and y-axes labels
    xlab(bquote(~Log[2]~ "fold change")) +
    ylab(bquote(~-Log[10]~adjusted~italic(P))) +

    #Set the axis limits
    #xlim(-6.5, 6.5) +
    #ylim(0, 100) +

    #Set title
    ggtitle(main) +

    #Tidy the text labels for a subset of genes
    geom_text(data=subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff),
              aes(label=rownames(subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff))),
              size=2.25,
              #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
              #segment.size=0.01,
              check_overlap=TRUE,
              vjust=1.0) +

    #Add a vertical line for fold change cut-offs
    geom_vline(xintercept=c(-FCCutoff, FCCutoff), linetype="longdash", colour="black", size=0.4) +

    #Add a horizontal line for P-value cut-off
    geom_hline(yintercept=-log10(AdjustedCutoff), linetype="longdash", colour="black", size=0.4)

  return(plot)
}

results <- read.table("DEseq2 output.txt", sep = "\t", header = T, as.is = T)
toptable <- as.data.frame(results)

pdf("Volcanoplot_DoubleKD_gene_type.pdf",width = 10.15, height = 8.10)
    EnhancedVolcanoDESeq2(toptable, AdjustedCutoff=0.05, LabellingCutoff=0.05, FCCutoff=2.0, main="DoubleKD VolcanoPlot")
dev.off()

Does that work?

ADD REPLYlink written 17 months ago by Kevin Blighe51k

Sorry for late reply I've been out for some days. I am afraid it doesn't work, I have used dev.off() untill I got this message back: Error in dev.off() : cannot shut down device 1 (the null device). Still I don't get any output. I am not able to figure out why

ADD REPLYlink written 16 months ago by jordi.planells90

Can you show the exact sequence of commands that you're using? - sorry

ADD REPLYlink written 16 months ago by Kevin Blighe51k

Of course!

toptable <- read.table("topTable_DESeq2.tabular", sep = "\t", header = T, as.is = T)
attach(toptable)

library(ggplot2)
EnhancedVolcanoDESeq2 <- function(toptable, AdjustedCutoff=0.05, LabellingCutoff=0.05, FCCutoff=2.0, main="VolcanoPlot")

and the plot commands exactly the same code you posted in github

{
  toptable$Significance <- "NS"
  toptable$Significance[(abs(toptable$log2FoldChange) > FCCutoff)] <- "FC"
  toptable$Significance[(toptable$padj<AdjustedCutoff)] <- "FDR"
  toptable$Significance[(toptable$padj<AdjustedCutoff) & (abs(toptable$log2FoldChange)>FCCutoff)] <- "FC_FDR"
  table(toptable$Significance)
  toptable$Significance <- factor(toptable$Significance, levels=c("NS", "FC", "FDR", "FC_FDR"))

  plot <- ggplot(toptable, aes(x=log2FoldChange, y=-log10(padj))) +

    #Add points:
    #   Colour based on factors set a few lines up
    #   'alpha' provides gradual shading of colour
    #   Set size of points
    geom_point(aes(color=factor(Significance)), alpha=1/2, size=0.8) +

    #Choose which colours to use; otherwise, ggplot2 choose automatically (order depends on how factors are ordered in toptable$Significance)
    scale_color_manual(values=c(NS="grey30", FC="forestgreen", FDR="royalblue", FC_FDR="red2"), labels=c(NS="NS", FC=paste("LogFC>|", FCCutoff, "|", sep=""), FDR=paste("FDR Q<", AdjustedCutoff, sep=""), FC_FDR=paste("FDR Q<", AdjustedCutoff, " & LogFC>|", FCCutoff, "|", sep=""))) +

    #Set the size of the plotting window
    theme_bw(base_size=24) +

    #Modify various aspects of the plot text and legend
    theme(legend.background=element_rect(),
          plot.title=element_text(angle=0, size=12, face="bold", vjust=1),

          panel.grid.major=element_blank(), #Remove gridlines
          panel.grid.minor=element_blank(), #Remove gridlines

          axis.text.x=element_text(angle=0, size=12, vjust=1),
          axis.text.y=element_text(angle=0, size=12, vjust=1),
          axis.title=element_text(size=12),

          #Legend
          legend.position="top",            #Moves the legend to the top of the plot
          legend.key=element_blank(),       #removes the border
          legend.key.size=unit(0.5, "cm"),  #Sets overall area/size of the legend
          legend.text=element_text(size=8), #Text size
          title=element_text(size=8),       #Title text size
          legend.title=element_blank()) +       #Remove the title

    #Change the size of the icons/symbols in the legend
    guides(colour = guide_legend(override.aes=list(size=2.5))) +

    #Set x- and y-axes labels
    xlab(bquote(~Log[2]~ "fold change")) +
    ylab(bquote(~-Log[10]~adjusted~italic(P))) +

    #Set the axis limits
    #xlim(-6.5, 6.5) +
    #ylim(0, 100) +

    #Set title
    ggtitle(main) +

    #Tidy the text labels for a subset of genes
    geom_text(data=subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff),
              aes(label=rownames(subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff))),
              size=2.25,
              #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
              #segment.size=0.01,
              check_overlap=TRUE,
              vjust=1.0) +

    #Add a vertical line for fold change cut-offs
    geom_vline(xintercept=c(-FCCutoff, FCCutoff), linetype="longdash", colour="black", size=0.4) +

    #Add a horizontal line for P-value cut-off
    geom_hline(yintercept=-log10(AdjustedCutoff), linetype="longdash", colour="black", size=0.4)

  return(plot)
}

Thank you so much again

ADD REPLYlink written 16 months ago by jordi.planells90
14
gravatar for Kevin Blighe
19 months ago by
Kevin Blighe51k
Kevin Blighe51k wrote:

Edit 28th March, 2019:

this is now a Bioconductor package: EnhancedVolcano: Publication-ready volcano plots with enhanced colouring and labeling

--------------------

That looks like an up-side down volcano plot, which is unusual. Why not just do a regular one?

You can use my EnhancedVolcano function on my GitHub page (in part developed with a Junior Bioinformatician): https://github.com/kevinblighe (https://github.com/kevinblighe/EnhancedVolcano)

Apart from anything else, it makes the use of ggrepel in order to fit as many gene labels as possible.

Volcano

ADD COMMENTlink modified 7 months ago • written 19 months ago by Kevin Blighe51k
1

PS - if you prefer to have the lines connecting the gene names to the points, then un-comment the following lines in my function:

#segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
#segment.size=0.01
ADD REPLYlink written 19 months ago by Kevin Blighe51k

Hi @ Kevin Blighe,

Is there any way to solve "package ‘EnhancedVolcano’ is not available (for R version 3.3.3 RC)" error in Rstudio?

ADD REPLYlink written 18 months ago by Farbod3.3k
2

Ah, apologies, it is not on Bioconductor. You can just copy the R script via git clone or just download it, and then load it into your session with:

source("EnhancedVolcano.R")

I have not yet standardised the code such that it would be ready for Bioconductor. Hope that's okay.

ADD REPLYlink written 18 months ago by Kevin Blighe51k
1

Error in abs(toptable$log2FoldChange) : non-numeric argument to mathematical function i m trying to use this but i get this error ..sorry my bad i had some NA in data frame , now removed but i still dont get the plot...finally managed to get the plot "RStudioGD" something with the graphics output after clearing dev.off() many time somehow it worked

ADD REPLYlink modified 17 months ago • written 17 months ago by krushnach80630

hi @Kevin is it possible to do or label only few of the selected genes like , i see that most of them don;t show every thing but selected ones, i tried making a vector to label but i couldn't do it..it would be really helpful if you can show

ADD REPLYlink written 16 months ago by krushnach80630
1

Yes, the function works as follows:

  • you can label all genes with or without overlaps
  • you can label all genes without overlaps

The key is the check_overlap=TRUE/FALSE parameter that is passed to geom_text() within the function

If you just want to supply a specific list of gene names, like myCustomLabel=c("GeneA", "", "GeneC", "", "", "GeneF", ...), then you should be able to pass this to the function and use it again in the geom_text() function:

geom_text(data=subset(toptable, padj<LabellingCutoff & abs(log2FoldChange)>FCCutoff),
            aes(label=myCustomLabel),
            size=2.25,
            #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
            #segment.size=0.01,
            check_overlap=TRUE,
            vjust=1.0) +
ADD REPLYlink modified 16 months ago • written 16 months ago by Kevin Blighe51k

How do solve the non-numeric argument? I've removed NA and convert he E-number but still having the same error. Can you share with me the solution? Thanks

ADD REPLYlink written 16 months ago by kamalghazali20

Can you paste the commands that you use before you attempt to generate the volcano plot?

Also, could you run the str() command on you data and paste the output?

Thanks!

ADD REPLYlink written 16 months ago by Kevin Blighe51k

Hi Kevin,

Apology for late response. I manage to solve the problem. My file is in csv format so there is an error when import the file using read.table command. So I just reformat the file into tab deliminated format, reran back the "EnhancedVolcano.R" script and manage to generate cool volcano plot. However the gene IDs were not based on ENSEMBLE ID. DId I do something wrong ?

Volcanoplot_DoubleKD_gene_type.pdf

ADD REPLYlink modified 16 months ago • written 16 months ago by kamalghazali20
1

Hey, that's great! The gene names will be whatever you have used, so, they could be ENSEMBL IDs, RefSeq IDs, etc.

Note that I have updated this function and it is currently being submitted to Bioconductor: https://github.com/kevinblighe/EnhancedVolcano

It is now much more flexible. You can install the first version like this:

require(devtools)

install_github("kevinblighe/EnhancedVolcano")

require(EnhancedVolcano)

?EnhancedVolcano
ADD REPLYlink written 16 months ago by Kevin Blighe51k

Hi Kevin

I've tested the https://github.com/kevinblighe/EnhancedVolcano. however the volcano plot generated without any point. I'm using the same dataset for previous run. Here are the command I used

results2<-read.table("res.infected_norm-vs-control-noNA.txt", header=TRUE)
head(results2)
EnhancedVolcano(results2,
lab = rownames(results2),
x = "log2FoldChange",
y = "pvalue",
pCutoff = 0.05,
FCcutoff = 2,
transcriptPointSize = 1.5,
transcriptLabSize = 3.0,
title= "infected vs uninfected")

EnhancedVolcano2

ADD REPLYlink modified 16 months ago • written 16 months ago by kamalghazali20

Weird... what is the output of

head(results2)
str(results2)
ADD REPLYlink written 16 months ago by Kevin Blighe51k

here you are

> head(results2)
                    baseMean log2FoldChange    lfcSE      stat   pvalue
ENSMUSG00000033860 1168.9376      -27.24585 4.784735 -5.694328 1.24e-08
ENSMUSG00000057400  993.8006      -26.97533 4.784741 -5.637783 1.72e-08
ENSMUSG00000033831 1201.9378      -26.87136 4.784734 -5.616061 1.95e-08
ENSMUSG00000078520  551.5458      -26.32777 4.784771 -5.502410 3.75e-08
ENSMUSG00000059481  686.4623      -26.24442 4.784757 -5.485006 4.13e-08
ENSMUSG00000113555  807.0622      -25.97442 4.457547 -5.827064 5.64e-09
                       padj
ENSMUSG00000033860 2.19e-07
ENSMUSG00000057400 2.96e-07
ENSMUSG00000033831 3.32e-07
ENSMUSG00000078520 6.09e-07
ENSMUSG00000059481 6.65e-07
ENSMUSG00000113555 1.06e-07


> str(results2)
'data.frame':   33044 obs. of  6 variables:
 $ baseMean      : num  1169 994 1202 552 686 ...
 $ log2FoldChange: num  -27.2 -27 -26.9 -26.3 -26.2 ...
 $ lfcSE         : num  4.78 4.78 4.78 4.78 4.78 ...
 $ stat          : num  -5.69 -5.64 -5.62 -5.5 -5.49 ...
 $ pvalue        : num  1.24e-08 1.72e-08 1.95e-08 3.75e-08 4.13e-08 5.64e-09 6.14e-08 8.28e-08 8.95e-08 1.03e-07 ...
 $ padj          : num  2.19e-07 2.96e-07 3.32e-07 6.09e-07 6.65e-07 1.06e-07 9.57e-07 1.26e-06 1.35e-06 1.54e-06 ...
ADD REPLYlink modified 16 months ago • written 16 months ago by kamalghazali20

Thanks! Nothing seems out of the ordinary. I was able to plot the data-points you pasted.

Have you got the ggrepel package installed? - it's required, but wasn't required in the 'pre-development' version that I posted on Biostars.

I can see that I need to improve the code to produce helpful error messages. It's still passing through the Bioconductor submission process.

If worse comes to worse, just continue to use the version that actually worked for you. Thanks!

ADD REPLYlink modified 16 months ago • written 16 months ago by Kevin Blighe51k
1

yup, I've ggrepel installed.

the volcano plot generated using the script is good enough. Thank you very much, appreciate your help!!

ADD REPLYlink modified 16 months ago • written 16 months ago by kamalghazali20
1

I think that i'v solved this issue in the most recent version (0.99.4) on GitHub. I encountered the problem myself recently. Thanks for the input!

ADD REPLYlink written 16 months ago by Kevin Blighe51k
1

I've rerun my dataset using the recent version. Yup now I'm able to generate the plot. Thanks Kevin

ADD REPLYlink written 15 months ago by kamalghazali20

Hi Kevin, This seems like some great package, however I am running into the same "empty plot" error as above. I am pasting the str(data) results of 'data'(i.e. my input file). Could you please let me know,is the file format is wrong? My code for the plot is same as above:

EnhancedVolcano(data,
                lab = rownames(data),
                x = "log2FoldChange",
                y = "qvalue",
                pCutoff = 0.05,
                FCcutoff = 2,
                transcriptPointSize = 1.5,
                transcriptLabSize = 3.0,
                title= "Treated vs untreated")

Edit: I don't know how to upload image, so copy pasting the str(data) results.( Sorry)

'data.frame':   58676 obs. of  14 variables:
 $ GeneId                            : Factor w/ 58676 levels "TBIG000001","TBIG000002",..: 641 4468 9003 9153 9274 9365 10275 11524 14429 15694 ...
 $ GeneAcc                           : Factor w/ 58676 levels "ENSG00000000003",..: 16655 5426 4353 36091 13318 6834 10698 622 13091 4184 ...
 $ NcbiGene                          : Factor w/ 26853 levels "1","10","100",..: 17517 22283 20484 17872 9266 25141 14238 7687 10038 17173 ...
 $ Type                              : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
 $ GeneName                          : Factor w/ 57131 levels "1-Dec","1-Mar",..: 41663 21692 22354 54930 13393 20863 30970 23741 27077 41135 ...
 $ Desc                              : Factor w/ 45340 levels "","-","1-acylglycerol-3-phosphate O-acyltransferase 1 [Source:HGNC Symbol;Acc:HGNC:324]",..: 26809 5467 2846 41883 914 2306 13008 4060 9260 26723 ...
 $ EXP.Control.FPKM                  : Factor w/ 5482 levels "-","0","0.01",..: 3796 3686 5247 599 3716 2286 1010 4941 5169 452 ...
 $ EXP.Case.FPKM                     : Factor w/ 5667 levels "-","0","0.01",..: 1691 4570 1429 553 4117 3494 3340 265 2342 1686 ...
 $ DEG.Control.vs.Case.QV.0.05.SELECT: Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
 $ DEG.Control.vs.Case.PV            : Factor w/ 6633 levels "-","0.0001","0.0002",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ qvalue                            : num  0.0287 0.0154 0.0154 0.0154 0.0154 0.0154 0.0154 0.0154 0.0154 0.0154 ...
 $ log2FoldChange                    : num  -4.54 -2.79 -2.47 3.1 -3.04 -2.63 -2.03 -2.95 -1.83 -2.6 ...
 $ DEG.Control.vs.Case.UP_DOWN       : Factor w/ 3 levels "-","DOWN","UP": 2 2 2 3 2 2 2 2 2 2 ...
 $ GeneOntology                      : Factor w/ 16446 levels "-","GO:0000002,GO:0000122,GO:0000165,GO:0000790,GO:0000977,GO:0000978,GO:0000981,GO:0001077,GO:0001085,GO:0001205,G"| __truncated__,..: 11435 8571 8697 7835 10594 955 8730 11181 8371 8192 ...
ADD REPLYlink modified 6 weeks ago by Kevin Blighe51k • written 6 weeks ago by Bioinformatician_in_trouble10

Hey, are the rownames even set for your data object? Maybe try

lab = as.character(data$GeneId)
ADD REPLYlink written 6 weeks ago by Kevin Blighe51k

Yes, the row names are set. same thing even after trying lab= as.character(data$GeneId). I observed that lot of my rows are NA's for foldchange and qvalue. do you think this is affecting the results?

ADD REPLYlink modified 6 weeks ago • written 6 weeks ago by Bioinformatician_in_trouble10

hi @Kevin Blighe, anything on this please?

ADD REPLYlink written 29 days ago by Bioinformatician_in_trouble10

Hey, when you say "a lot" [of your rows are NA], you equally imply that not everything is NA (?). Perhaps, therefore, the problem lies elsewhere outside of EnhancedVolcano().

What does the following return:

table( is.na(as.character(data$GeneId)) )

table( is.na(data$GeneId) )
ADD REPLYlink modified 29 days ago • written 29 days ago by Kevin Blighe51k

Thank you for your reply,Kevin.

table( is.na(as.character(data$GeneId)) ) returns < table of extent 0 >
table( is.na(data$GeneId) ) returns < table of extent 0 >

I wonder what could the problem outside enhanced volcano be. My input is a csv file with the columns; GeneName, log2FoldChange, qvalue,Control Vs Case UP_Down

column "Control Vs Case UP_Down" just tells me whether a gene is up regulated or downregulated.

ADD REPLYlink modified 28 days ago by Kevin Blighe51k • written 28 days ago by Bioinformatician_in_trouble10

Interesting... what about:

table( as.character(data$GeneId) == 'NA' )

how did you produce the data object? It is strange that virtually all columns are factors.

ADD REPLYlink written 28 days ago by Kevin Blighe51k

table( as.character(data$GeneId) == 'NA' ) also gives me < table of extent 0 > I was given the csv file with above listed columns, so am not very sure which procedure was used to arrive at the results in csv file. Can we not convert factors into a type that is required by EnhancedVolcano function(i.e. if factors are creating a problem)?

ADD REPLYlink written 27 days ago by Bioinformatician_in_trouble10

Yes, but as.character() should convert the factors into characters. You have never shown your resulting plot; so, I can neither confirm that anything is even erroneous here. How do you read the data into R?

ADD REPLYlink written 27 days ago by Kevin Blighe51k

I read using read.csv. I am not sure how to upload the image. could you please tell me? or i can send it to you else where if that is okay with you.

ADD REPLYlink written 23 days ago by Bioinformatician_in_trouble10

kevin, this tool is very good in providing beautiful figure. but your script use airway dataset, right? but i got confused with the series of replies. can you redirect me to the newer ver so i can follow it using my data?

ADD REPLYlink modified 6 months ago • written 6 months ago by bioinfool20

Hey, assuming that you have any differential expression analysis results table with p-value and fold changes, then you can use the package EnhancedVolcano. What have you produced so far? Can you post an example?

ADD REPLYlink written 6 months ago by Kevin Blighe51k

thanks for the quick reply. i am fairly new to this so i had difficulty understanding how things work here. anyway, yes, i already have my DE data, which i already filtered. it consist of my_transcript_name, log2FC, & FDR value. since my data is already filtered, i just wanted to plot it using volcano plot. i started with the basic command from your script because i just want to draw out of something with my data. however, i couldn't arrive with a figure.

results <- read.table("my_data", header=TRUE, sep = '\t')

toptable <- as.data.frame(results)

install.packages("ggplot2")

library(ggplot2)

plot <- ggplot (toptable, aes(x=log2FC,y=FDR))

ADD REPLYlink modified 6 months ago • written 6 months ago by bioinfool20

Oh, this code is now a comprehensive package on Bioconductor: https://github.com/kevinblighe/EnhancedVolcano

In EnhancedVolcano, for the most basic plot, you would do:

library(EnhancedVolcano)
EnhancedVolcano(toptable,
    lab = rownames(toptable),
    x = 'log2FC',
    y = 'FDR')

You may have to install the package first.

ADD REPLYlink modified 6 months ago • written 6 months ago by Kevin Blighe51k

tried installing the package but it says

Warning in install.packages: package 'EnhancedVolcano' is not available (for R ver 3.5.3)

what to do?

ADD REPLYlink written 6 months ago by bioinfool20

Try to install not via install.packages()

if (!requireNamespace('BiocManager', quietly = TRUE))
   install.packages('BiocManager')
   BiocManager::install('EnhancedVolcano')

In the unlikely event that that does not work, this will work:

devtools::install_github('kevinblighe/EnhancedVolcano')

(you may need to install devtools first)

ADD REPLYlink written 6 months ago by Kevin Blighe51k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1338 users visited in the last hour