Question: Volcano Plot from DEseq2
1
gravatar for krushnach80
13 months ago by
krushnach80440
krushnach80440 wrote:

Im using this code to make based on log2foldchange and padj value ,im getting the plot but i want those value for my reference how do i extract the same .

alpha <- 0.05 # Threshold on the adjusted p-value
cols <- densCols(res$log2FoldChange, -log10(res$pvalue))
plot(res$log2FoldChange, -log10(res$padj), col=cols, panel.first=grid(),
     main="Volcano plot", xlab="Effect size: log2(fold-change)", ylab="-log10(adjusted p-value)",
     pch=20, cex=0.6)
abline(v=0)
abline(v=c(-1,1), col="brown")
abline(h=-log10(alpha), col="brown")

gn.selected <- abs(res$log2FoldChange) > 2.5 & res$padj < alpha 
text(res$log2FoldChange[gn.selected],
     -log10(res$padj)[gn.selected],
     lab=rownames(res)[gn.selected ], cex=0.4)

when i view gn.selected i get only logical value that is true or false

Any help or suggestion would be highly appreciated

Update I'm doing this

> DF <- DF[DF$log2FoldChange > 1.5 & DF$padj < 0.05,]

is that suffice and am i doing it correctly ?

R • 6.1k views
ADD COMMENTlink modified 8 days ago by africainterpol1010 • written 13 months ago by krushnach80440
11
gravatar for Kevin Blighe
13 months ago by
Kevin Blighe33k
Republic of Ireland
Kevin Blighe33k wrote:

Edit (October 24, 2018):

This is now a Bioconductor package: EnhancedVolcano: Publication-ready volcano plots with enhanced colouring and labeling

---------------------------------------------

Your code appears to run fine on my DESeq2 results objects: yours


I normally do these simple volcano plots a different way:

par(mar=c(5,5,5,5), cex=1.0, cex.main=1.4, cex.axis=1.4, cex.lab=1.4)

topT <- as.data.frame(resultsObject)

#Adjusted P values (FDR Q values)
with(topT, plot(log2FoldChange, -log10(padj), pch=20, main="Volcano plot", cex=1.0, xlab=bquote(~Log[2]~fold~change), ylab=bquote(~-log[10]~Q~value)))

with(subset(topT, padj<0.05 & abs(log2FoldChange)>2), points(log2FoldChange, -log10(padj), pch=20, col="red", cex=0.5))

#with(subset(topT, padj<0.05 & abs(log2FoldChange)>2), text(log2FoldChange, -log10(padj), labels=subset(rownames(topT), topT$padj<0.05 & abs(topT$log2FoldChange)>2), cex=0.8, pos=3))

#Add lines for absolute FC>2 and P-value cut-off at FDR Q<0.05
abline(v=0, col="black", lty=3, lwd=1.0)
abline(v=-2, col="black", lty=4, lwd=2.0)
abline(v=2, col="black", lty=4, lwd=2.0)
abline(h=-log10(max(topT$pvalue[topT$padj<0.05], na.rm=TRUE)), col="black", lty=4, lwd=2.0)

volcano


There is an even better solution that I and colleagues developed using ggplot2, which allows you to easily fit labels into your plot using ggrepel().

ADD COMMENTlink modified 7 weeks ago • written 13 months ago by Kevin Blighe33k
1

@Kevin wow the author himself ...yes i took the code from this site code

ADD REPLYlink written 13 months ago by krushnach80440
1

Glad to have helped. Note the particular use of the bquote() function in order to get super- and sub-script. These are small modifications but can make a plot feel more professional. bquote() also works with ggplot2 (here I've used it with base functions).

ADD REPLYlink written 13 months ago by Kevin Blighe33k

@Kevin i have some issues Im not able to get the following with the code which you have given as im trying to include

so this is my condition

Up-regulated: =/> 0.58 (Green color)
Down-regulated: =/> -0.59 (Red color)
Significant: =/>1.3 (-Log10 p-Value)

its like all the data points are getting overlapped with same color no distinction

ADD REPLYlink written 10 months ago by krushnach80440
1

Can you paste your code so that I can put this in context?

ADD REPLYlink written 10 months ago by Kevin Blighe33k

okay here is the code im using

par(mar=c(5,5,5,5), cex=1.0, cex.main=1.4, cex.axis=1.4, cex.lab=1.4)
res <- read.csv("volcano.txt", header=TRUE,sep = '\t')

topT <- as.data.frame(res)
head(topT)

with(topT, plot(lfc, -log10(padj), pch=20, main="Volcano plot", cex=1.0, xlab=bquote(~Log[2]~fold~change), ylab=bquote(~-log[10]~P~value)))

with(subset(topT, padj<0.05 & abs(lfc)>=0.58), points(lfc, -log10(padj), pch=20, col="red", cex=0.5))

with(subset(topT, padj<0.05 & abs(lfc)>=-0.59), points(lfc, -log10(padj), pch=20, col="green", cex=0.5))
ADD REPLYlink written 10 months ago by krushnach80440

Yes, the last two lines contradict each other, specifically padj<0.05 & abs(lfc)>=0.58) and padj<0.05 & abs(lfc)>=-0.59).

I think that you want to colour positive fold-change genes as red, and negative as green? Try this:

par(mar=c(5,5,5,5), cex=1.0, cex.main=1.4, cex.axis=1.4, cex.lab=1.4)
res <- read.csv("volcano.txt", header=TRUE,sep = '\t')
topT <- as.data.frame(res)
head(topT)
with(topT, plot(lfc, -log10(padj), pch=20, main="Volcano plot", cex=1.0, xlab=bquote(~Log[2]~fold~change), ylab=bquote(~-log[10]~P~value)))
with(subset(topT, padj<0.05 & lfc>=0.58), points(lfc, -log10(padj), pch=20, col="red", cex=0.5))
with(subset(topT, padj<0.05 & lfc<=-0.59), points(lfc, -log10(padj), pch=20, col="green", cex=0.5))
ADD REPLYlink modified 10 months ago • written 10 months ago by Kevin Blighe33k

"I think that you want to colour positive fold-change genes as red, and negative as green? Try this:" yes thats what i need to show

ADD REPLYlink written 10 months ago by krushnach80440
1

Yep, take a look at the code that I pasted (above). You just needed to remove the abs() function call, and switch the 'greater than' sign to a 'less than' sign on the last line

ADD REPLYlink modified 10 months ago • written 10 months ago by Kevin Blighe33k

Hi Kevin,

I am wondering if you have rscript showing the upreguted and down regulated? Can you share them?

ADD REPLYlink written 6 months ago by bioinfool0

Yes, please take a look here: https://github.com/kevinblighe/EnhancedVolcano

In which program did you conduct differential expression?

ADD REPLYlink written 6 months ago by Kevin Blighe33k

Thanks for the quick reply.

I used edgeR in getting my DE. Once I got them, I pre-filtered my DE list in excel. I only extract FDR<1e-5 which is my significant. Then from those list, my up-reg is log2FC>0 while my down is log2FC<0. I wanted to plot my FDR against log2FC.

Can you embed the script here? Thank you very much!

ADD REPLYlink written 6 months ago by bioinfool0
1

Here is the code for data in an EdgeR results table:

EnhancedVolcanoEdgeR <- function(toptable, NominalCutoff, AdjustedCutoff, LabellingCutoff, FCCutoff, main)
{
    toptable$Significance <- "NS"
    toptable$Significance[(abs(toptable$logFC) > FCCutoff)] <- "FC"
    toptable$Significance[(toptable$FDR<AdjustedCutoff)] <- "FDR"
    toptable$Significance[(toptable$FDR<AdjustedCutoff) & (abs(toptable$logFC)>FCCutoff)] <- "FC_FDR"
    table(toptable$Significance)
    toptable$Significance <- factor(toptable$Significance, levels=c("NS", "FC", "FDR", "FC_FDR"))

    plot <- ggplot(toptable, aes(x=logFC, y=-log10(FDR))) +

        #Add points:
        #   Colour based on factors set a few lines up
        #   'alpha' provides gradual shading of colour
        #   Set size of points
        geom_point(aes(color=factor(Significance)), alpha=1/2, size=0.8) +

        #Choose which colours to use; otherwise, ggplot2 choose automatically (order depends on how factors are ordered in toptable$Significance)
        scale_color_manual(values=c(NS="grey30", FC="forestgreen", FDR="royalblue", FC_FDR="red2"), labels=c(NS="NS", FC=paste("LogFC>|", FCCutoff, "|", sep=""), FDR=paste("FDR Q<", AdjustedCutoff, sep=""), FC_FDR=paste("FDR Q<", AdjustedCutoff, " & LogFC>|", FCCutoff, "|", sep=""))) +

        #Set the size of the plotting window
        theme_bw(base_size=24) +

        #Modify various aspects of the plot text and legend
        theme(legend.background=element_rect(),
            plot.title=element_text(angle=0, size=12, face="bold", vjust=1),

            panel.grid.major=element_blank(),   #Remove gridlines
            panel.grid.minor=element_blank(),   #Remove gridlines

            axis.text.x=element_text(angle=0, size=12, vjust=1),
            axis.text.y=element_text(angle=0, size=12, vjust=1),
            axis.title=element_text(size=12),

            #Legend
            legend.position="top",          #Moves the legend to the top of the plot
            legend.key=element_blank(),     #removes the border
            legend.key.size=unit(0.5, "cm"),    #Sets overall area/size of the legend
            legend.text=element_text(size=8),   #Text size
            title=element_text(size=8),     #Title text size
            legend.title=element_blank()) +     #Remove the title

        #Change the size of the icons/symbols in the legend
        guides(colour = guide_legend(override.aes=list(size=2.5))) +

        #Set x- and y-axes labels
        xlab(bquote(~Log[2]~ "fold change")) +
        ylab(bquote(~-Log[10]~adjusted~italic(P))) +

        #Set the axis limits
        #xlim(-6.5, 6.5) +
        #ylim(0, 100) +

        #Set title
        ggtitle(main) +

        #Tidy the text labels for a subset of genes
        geom_text(data=subset(toptable, FDR<LabellingCutoff & abs(logFC)>FCCutoff),
            aes(label=rownames(subset(toptable, FDR<LabellingCutoff & abs(logFC)>FCCutoff))),
            size=2.25,
            #segment.color="black", #This and the next parameter spread out the labels and join them to their points by a line
            #segment.size=0.01,
            check_overlap=TRUE,
            vjust=1.0) +

        #Add a vertical line for fold change cut-offs
        geom_vline(xintercept=c(-FCCutoff, FCCutoff), linetype="longdash", colour="black", size=0.4) +

        #Add a horizontal line for P-value cut-off
        geom_hline(yintercept=-log10(AdjustedCutoff), linetype="longdash", colour="black", size=0.4)

    return(plot)
}

Requires

  • ggplot2

Execution

  • EnhancedVolcanoDESeq2(topTable, NominalCutoff, AdjustedCutoff, LabellingCutoff, FCCutoff, main)
  • EnhancedVolcanoEdgeR(topTable, NominalCutoff, AdjustedCutoff, LabellingCutoff, FCCutoff, main)

Example

  • EnhancedVolcanoDESeq2(topTable=results, AdjustedCutoff=0.05, LabellingCutoff0.05, FCCutoff=2.0, main="DESeq2 results")
  • EnhancedVolcanoEdgeR(topTable=results, AdjustedCutoff=0.05, LabellingCutoff0.05, FCCutoff=2.0, main="EdgeR results")

Parameters

  • toptable, data-frame of test statistics. Requires at least the following
  • gene names as rownames
  • column named 'log2FoldChange' for DESeq2 or 'logFC' for EdgeR
  • column named 'padj' for DESeq2 or 'FDR' for EdgeR
  • NominalCutoff, nominal p-value cut-off for statistical significance (obsolete)
  • AdjustedCutoff, adjusted p-value cut-off for statistical significance
  • LabellingCutoff, adjusted p-value cut-off for statistical significance for labels
  • FCCutoff, absolute log (base 2) fold change cut-off for statistical significance
  • main, title
ADD REPLYlink written 6 months ago by Kevin Blighe33k
6
gravatar for Renesh
9 days ago by
Renesh1.5k
United States
Renesh1.5k wrote:

Easy Volcano plot for gene expression data in Python https://reneshbedre.github.io/blog/volcano.html

ADD COMMENTlink written 9 days ago by Renesh1.5k

looks pretty cool will give it a try

ADD REPLYlink written 8 days ago by krushnach80440
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 786 users visited in the last hour