Volcano plot: Using Multiple results
4.7 years ago

Hello, I am trying to draw a volcano plot that combines the results of multiple microarray analyses in a single plot, with each result set annotated in a different color. I know how to generate a volcano plot for a single result, but I can't work out how to apply the same logic to multiple results. I searched online but didn't find a satisfactory answer. The following is an example dataset:

Gene       Fold       pvalue    Fold       pvalue
LUZP1      -7.57373   0.0085     3.66623   0.0075
FAM71F1    -6.31787   0.00865   -6.23586   0.00013
MBD4       -5.81446   0.0087     2.68795   0.00264
HOXD13      4.85073   0.0234    -4.2358    0.0035
DNAJC2     -2.75493   0.0314     1.23585   0.045
CHD4       -2.49614   0.057      6.26588   0.0225

My question is: how can I make such a volcano plot? Any suggestions or guidance would be deeply appreciated.

RNA-Seq Microarray Volcanoplot • 3.2k views
4.7 years ago
shawn.w.foley ★ 1.3k

An alternative solution would be to use ggplot2. If you have results from multiple samples in df1, df2, and df3, then:

library(ggplot2)

# Label each result table and flag significant genes
df1$sample <- 'sample1'
df1$significant <- abs(df1$Fold) > 1 & df1$FDR < 0.05
df2$sample <- 'sample2'
df2$significant <- abs(df2$Fold) > 1 & df2$FDR < 0.05
df3$sample <- 'sample3'
df3$significant <- abs(df3$Fold) > 1 & df3$FDR < 0.05

# Stack the tables (they must share the same column names) and plot
df.combine <- rbind(df1,df2,df3)
ggplot(df.combine,aes(x=Fold,y=-log10(FDR),col=significant,shape=sample)) + geom_point()

This will result in a volcano plot of fold change versus -log10(FDR), with each point colored by whether it's a significant difference (defined here as abs(Fold) > 1 and FDR < 0.05) and with a shape corresponding to the sample.
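For the example table in the question, the same idea could look roughly like the sketch below. It is only an illustration under a few assumptions: the two Fold/pvalue column pairs are treated as two result sets with the placeholder names study1 and study2, the raw p-value stands in for FDR because the example has no FDR column, and color is mapped to the result set (as asked in the question) while shape marks significance.

```r
library(ggplot2)

# Values copied from the example table in the question.
# "study1"/"study2" are placeholder names for the two Fold/pvalue pairs.
genes <- c("LUZP1", "FAM71F1", "MBD4", "HOXD13", "DNAJC2", "CHD4")
df1 <- data.frame(Gene = genes,
                  Fold = c(-7.57373, -6.31787, -5.81446, 4.85073, -2.75493, -2.49614),
                  pvalue = c(0.0085, 0.00865, 0.0087, 0.0234, 0.0314, 0.057),
                  sample = "study1")
df2 <- data.frame(Gene = genes,
                  Fold = c(3.66623, -6.23586, 2.68795, -4.2358, 1.23585, 6.26588),
                  pvalue = c(0.0075, 0.00013, 0.00264, 0.0035, 0.045, 0.0225),
                  sample = "study2")

# Stack both result sets into one long table
df.combine <- rbind(df1, df2)

# Same cutoffs as in the answer, applied to the raw p-value
# (assumption: the example data has no FDR column)
df.combine$significant <- abs(df.combine$Fold) > 1 & df.combine$pvalue < 0.05

# Color by result set, shape by significance
p <- ggplot(df.combine,
            aes(x = Fold, y = -log10(pvalue), col = sample, shape = significant)) +
  geom_point(size = 3)
print(p)
```

Swapping the col and shape mappings gives the layout described in the answer above.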

That being said, I think ATpoint makes a good suggestion with multiple separate plots on the same page, since I'd worry about overplotting.

Yes, this looks like a good method. Thank you, @shawn.w.foley.

4.7 years ago
ATpoint 81k

Simply make the volcano plot with the first dataset using the standard plot() command, then use points() with a different color to add the data points from the other studies to the existing plot. Afterwards, use legend() to add a legend matching colors to study names.

plot(study1$logFC, -log10(study1$pvalue), col="black")
points(study2$logFC, -log10(study2$pvalue), col="red")
(and so on...)

This can probably be done with a simple for loop. Make sure you check the data range for the x- and y-axes beforehand so that no added points fall outside the limits set by the first plot. Still, it will probably get a bit messy with so many data points. Independent plots on the same page might be better, e.g. par(mfrow=c(2,2)) to get four plots on one page.
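A minimal sketch of the loop version, assuming the result tables are collected in a named list (here called studies, a made-up name) and each has logFC and pvalue columns as in the snippet above. The values are the ones from the example table in the question. The axis limits are computed over all studies first, so no point falls outside the plotting region.

```r
# Hypothetical list of result tables; values taken from the question's example
studies <- list(
  study1 = data.frame(
    logFC  = c(-7.57373, -6.31787, -5.81446, 4.85073, -2.75493, -2.49614),
    pvalue = c(0.0085, 0.00865, 0.0087, 0.0234, 0.0314, 0.057)),
  study2 = data.frame(
    logFC  = c(3.66623, -6.23586, 2.68795, -4.2358, 1.23585, 6.26588),
    pvalue = c(0.0075, 0.00013, 0.00264, 0.0035, 0.045, 0.0225))
)
cols <- c("black", "red")  # one color per study

# Shared axis limits across all studies
xlim <- range(unlist(lapply(studies, function(s) s$logFC)))
ylim <- range(unlist(lapply(studies, function(s) -log10(s$pvalue))))

# Empty canvas with the full data range, then one points() call per study
plot(0, 0, type = "n", xlim = xlim, ylim = ylim,
     xlab = "logFC", ylab = "-log10(pvalue)")
for (i in seq_along(studies)) {
  points(studies[[i]]$logFC, -log10(studies[[i]]$pvalue), col = cols[i], pch = 19)
}

# Legend matching colors to study names
legend("topright", legend = names(studies), col = cols, pch = 19)
```

For separate panels instead of an overlay, one option is to call par(mfrow=c(1,2)) first and replace the empty canvas plus points() with one plot() call per study inside the loop, keeping the same xlim and ylim so the panels stay comparable.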

Thanks a lot, @ATpoint. It worked, and since I am working with a very small number of genes, the plot is not that messy. I appreciate your suggestions.
