volcano plot in R
2
0
Entering edit mode
15 months ago
parinv ▴ 30

I am trying to plot a volcano plot after DE analysis. The tutorial I'm using gives following code:

volcanoplot(fit2,highlight=10,names = genesymbol)

using this I am getting random genes from the list. I want to visualize the top 10 genes of my gene list.

• Can anyone explain how the plot choose the genes to highlight?
R • 635 views
2
Entering edit mode

post the the link to tutorial and paste your code here for future reference parinv

In the mean time, please follow solution furnished in: https://support.bioconductor.org/p/113985/ by James W. MacDonald or Gordon smyth in the same post.

0
Entering edit mode
2
Entering edit mode
15 months ago
ATpoint 52k

A volcano plot is simply the logFC on the x-axis and the -log10(p-value) (or any other significance metric) on the y-axis. It is on you if you include all genes or just the signficant ones. There is a Bioc package EnhancedVolcano from our user Kevin Blighe which wraps this into ggplot2 style and is very customizable. There you have full control over what is plotted or not. I doubt though that a top-10 plot will look pretty or be informative as the entire idea of Volcanos is to show how the significant data points separate from the non-significant cloud and if there is a trend towards up- or downregulated genes and the magnitude of the changes. I would rather plot all genes and then color maybe all significant ones in colorA and the top10 in colorB or alternatively indicate the gene name for the top genes in order to highlight them. In EnhancedVolcano I think this can be done via the selectLab option. Section 4.8 is probably what you want: https://bioconductor.org/packages/release/bioc/vignettes/EnhancedVolcano/inst/doc/EnhancedVolcano.html

2
Entering edit mode
15 months ago

Hi parinv

If you look at your tutorial, fit2 is a dataframe and looks like this:

##                Symbol Entrez_Gene_ID Chromosome       Cytoband     logFC
## ILMN_1704294     CDH3           1001         16       16q22.1c  3.269052
## ILMN_1813704 KIAA1199          57214         15       15q25.1b  4.749574
## ILMN_1724686    CLDN1           9076          3          3q28c  3.627208
## ILMN_1811598    ADH1B            125          4          4q23b -2.380786
## ILMN_1763749   GUCA2A           2980          1        1p34.2b -5.039637
## ILMN_1652431      CA1            759          8        8q21.2b -5.998422
## ILMN_1659990     HIG2          29923          7        7q32.1a  1.259224
## ILMN_1702858   ADHFE1         137872          8        8q13.1b -1.511602
## ILMN_1663866    TGFBI           7045          5 5q31.1f-q31.2a  2.038634
## ILMN_1657435     MT1M           4499         16         16q13b -3.354608

The below command is basically adding the headers to the columns in fit2 dataframe

anno <- anno[,c("Symbol","Entrez_Gene_ID","Chromosome","Cytoband")]
fit2$genes <- anno then the command to highlight top 10 genes is volcanoplot(fit2,highlight=10,names = fit2$genes$"Symbol") In order for this to work your dataframe should exactly be looking like this. Can you please post the output of • class(fit2) • head(fit2) • str(fit2) • colnames(fit2) ADD COMMENT 0 Entering edit mode In the tutorial I am not getting results from annotations code: anno <- fData(gse.expFilt) head(anno)[,1:5] It is giving error: Error in [.data.frame(head(anno), , 1:5) : undefined columns selected the output for following: class(fit2) [1] "MArrayLM" attr(,"package") [1] "limma" head(fit2) An object of class "MArrayLM"$coefficients Contrasts scz_v_control 1 0.044867450 3 0.128492334 5 -0.002271199 10 -0.026641735 11 -0.030649494 16 -0.006881393

$rank [1] 2$assign [1] 1 1

$qr$qr control scz 1 -4.3588989 0.000000 2 0.2294157 -3.872983 3 0.2294157 0.000000 4 0.2294157 0.000000 5 0.2294157 0.000000 29 more rows ...

$qraux [1] 1.229416 1.000000$pivot [1] 1 2

$tol [1] 1e-07$rank [1] 2

$df.residual [1] 32 32 32 32 32 32$sigma 1 3 5 10 11 16 0.03840097 0.17270467 0.05922681 0.05670645 0.03905891 0.05043108

$cov.coefficients Contrasts Contrasts scz_v_control scz_v_control 0.1192982$stdev.unscaled Contrasts scz_v_control 1 0.3453958 3 0.3453958 5 0.3453958 10 0.3453958 11 0.3453958 16 0.3453958

$Amean 1 3 5 10 11 16 3.194185 2.201624 2.435854 2.448577 2.674816 1.942989$method [1] "ls"

$t Contrasts scz_v_control 1 3.1507050 3 2.3497003 5 -0.1131795 10 -1.3776756 11 -2.1261766 16 -0.3921919$p.value Contrasts scz_v_control 1 0.003132177 3 0.023968712 5 0.910471426 10 0.176192902 11 0.039901841 16 0.697061003

str(fit2)

Formal class 'MArrayLM' [package "limma"] with 1 slot ..@ .Data:List of 23 .. ..$: num [1:10481, 1] 0.04487 0.12849 -0.00227 -0.02664 -0.03065 ... .. .. ..- attr(, "dimnames")=List of 2 .. .. .. ..$ : chr [1:10481] "1" "3" "5" "10" ... .. .. .. ..$Contrasts: chr "scz_v_control" .. ..$ : int 2 .. ..$: int [1:2] 1 1 .. ..$ :List of 5 .. .. ..$qr : num [1:34, 1:2] -4.359 0.229 0.229 0.229 0.229 ... .. .. .. ..- attr(, "dimnames")=List of 2 .. .. .. .. ..$ : chr [1:34] "1" "2" "3" "4" ... .. .. .. .. ..$: chr [1:2] "control" "scz" .. .. .. ..- attr(, "assign")= int [1:2] 1 1 .. .. .. ..- attr(, "contrasts")=List of 1 .. .. ..$ qraux: num [1:2] 1.23 1 .. .. ..$pivot: int [1:2] 1 2 .. .. ..$ tol : num 1e-07 .. .. ..$rank : int 2 .. .. ..- attr(, "class")= chr "qr" .. ..$ : int [1:10481] 32 32 32 32 32 32 32 32 32 32 ... .. ..$: Named num [1:10481] 0.0384 0.1727 0.0592 0.0567 0.0391 ... .. .. ..- attr(, "names")= chr [1:10481] "1" "3" "5" "10" ... .. ..$ : num [1, 1] 0.119 .. .. ..- attr(, "dimnames")=List of 2 .. ..$: num [1:10481, 1] 0.345 0.345 0.345 0.345 0.345 ... .. .. ..- attr(, "dimnames")=List of 2 .. .. .. ..$ : chr [1:10481] "1" "3" "5" "10" ... .. .. .. ..$Contrasts: chr "scz_v_control" .. ..$ : Named num [1:10481] 3.19 2.2 2.44 2.45 2.67 ... .. .. ..- attr(, "names")= chr [1:10481] "1" "3" "5" "10" ... .. ..$: chr "ls" .. ..$ : num [1:34, 1:2] 1 1 1 1 1 1 1 1 1 1 ... .. .. ..- attr(, "dimnames")=List of 2 .. .. .. ..$: chr [1:34] "1" "2" "3" "4" ... .. .. .. ..$ : chr [1:2] "control" "scz" .. .. ..- attr(, "assign")= int [1:2] 1 1 .. .. ..- attr(, "contrasts")=List of 1 .. ..$: num [1:2, 1] -1 1 .. .. ..- attr(, "dimnames")=List of 2 .. ..$ : num 0.00276 .. ..$: num 3.63 .. ..$ : num 0.01 .. ..$: Named num [1:10481] 0.0017 0.02507 0.00338 0.00313 0.00174 ... .. .. ..- attr(, "names")= chr [1:10481] "1" "3" "5" "10" ... .. ..$ : num [1:10481, 1] 3.151 2.35 -0.113 -1.378 -2.126 ... .. .. ..- attr(, "dimnames")=List of 2 .. .. .. ..$: chr [1:10481] "1" "3" "5" "10" ... .. ..$ : num [1:10481] 38.8 38.8 38.8 38.8 38.8 ... .. ..$: num [1:10481, 1] 0.00313 0.02397 0.91047 0.17619 0.0399 ... .. .. ..- attr(, "dimnames")=List of 2 .. .. .. ..$ : chr [1:10481] "1" "3" "5" "10" ... .. ..$: num [1:10481, 1] -1.95 -3.76 -6.31 -5.4 -4.2 ... .. .. ..- attr(*, "dimnames")=List of 2 .. .. .. ..$ : chr [1:10481] "1" "3" "5" "10" ... .. ..$: num [1:10481] 9.9269 5.5211 0.0128 1.898 4.5206 ... .. ..$ : num [1:10481] 0.00313 0.02397 0.91047 0.17619 0.0399 ...

colnames(fit2)

[1] "scz_v_control"