How to tune and use the MetaVolcanoR package
20 months ago
JACKY ▴ 140

I'm having some trouble with the MetaVolcanoR package. I asked about this at the Bioconductor support forum but there were no responses. I'll try my luck here since I need all the help I can get.

I have differential expression results from 17 datasets, each produced with the limma package. For each dataset, the genes are in a data frame (called deg), along with the log fold change and the p-value. I stored those data frames (deg1, deg2, deg3, and so on) in a list called totalDEG, following this guide.

Note - the data frames contain all the genes, including the unperturbed genes. I will specify that I want only the significant ones in the votecount_mv function.

The plot shows many more genes than there really are for some datasets. For example, according to the plot, Dataset6 has more than 50,000 genes, while in fact it has only about 17,000. The same goes for Dataset8. Why is this happening?

Also, what is the metathr parameter? I read about it and still don't understand what it does. I set it to 0.01 for now.

[Image: bar plot of the number of genes per dataset (not preserved)]

Code:

totalDEG = list(Dataset1 = deg1, Dataset2 = deg2,
                Dataset3 = deg3, Dataset4 = deg4, Dataset5 = deg5, Dataset6 = deg6, 
                Dataset7 = deg7, Dataset8 = deg8 ,Dataset9 = deg9, Dataset10 = deg10,
                Dataset11 = deg11, Dataset12 = deg12, Dataset13 = deg13 , Dataset15 = deg15,
                Dataset16 = deg16, Dataset20 = deg20, Dataset21 = deg21)

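# map() comes from purrr, rownames_to_column() from tibble, %>% from dplyr (magrittr)
library(purrr)
library(tibble)
library(dplyr)
library(MetaVolcanoR)

# Copy the row names (gene symbols) into a "symbol" column and keep them as row names
totalDEG = map(totalDEG, ~ .x %>% rownames_to_column("symbol") %>% `row.names<-`(.$symbol))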

meta_degs_vote <- votecount_mv(diffexp=totalDEG,
                               pcriteria='P.Value', 
                               foldchangecol='logFC',
                               genenamecol='symbol',
                               geneidcol=NULL,
                               pvalue = 0.05,
                               foldchange = 0.5, 
                               metathr=0.01,
                               collaps=FALSE,
                               jobname="MetaVolcano", 
                               outputfolder=".",
                               draw='HTML')

head(meta_degs_vote@metaresult, 50)
meta_degs_vote@degfreq
limma Bioconductor r MetaVolcanoR • 681 views
10 months ago
Basti ★ 2.0k

Hello, I was facing the same issue. I dug into the functions and found an error in the package's draw_degbar function:

draw_degbar <- function(degbar_data) {
    ggplot(degbar_data, aes(dataset)) +
        geom_bar(aes(fill = Regulation)) +
        theme_classic() +
        theme(panel.border= element_blank()) +
        theme(axis.text.x = element_text(angle=90, vjust = 0.5)) +
        theme(axis.line.x = element_line(color="black", size = 0.6, 
                     lineend = "square"),
              axis.line.y = element_line(color="black", size = 0.6, 
                     lineend = "square")) +
        guides(colour = guide_colorbar()) +
        labs(x = "Datasets",
             y = "Number of genes") +
        scale_fill_manual(values=c("#E41A1C", "grey", "#377EB8" )) +
        scale_x_discrete(labels=substr(unique(degbar_data[['dataset']]), 0, 30))
}

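The scale_x_discrete call is the problem. The bars are drawn in alphabetical order of the dataset names, but scale_x_discrete relabels the x-axis ticks with unique(degbar_data[['dataset']]), which returns the names in their order of appearance in the data. The two orders differ, so the labels end up under the wrong bars.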
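The mismatch can be reproduced in base R without the package. This is only a sketch: it assumes that the rows of degbar_data follow the order of the list in the question, and it uses a plain sort to stand in for the alphabetical order in which the bars are drawn.

```r
# Dataset names in the order of the list from the question
# (assumed to be the row order of degbar_data as well).
datasets <- paste0("Dataset", c(1:13, 15, 16, 20, 21))

# Order in which the bars are drawn for a character column: alphabetical.
# method = "radix" makes the sort independent of the session locale.
bar_order <- sort(datasets, method = "radix")

# Order of the labels handed to scale_x_discrete(labels = ...).
label_order <- substr(unique(datasets), 0, 30)

# Each row pairs the bar that is actually drawn with the label printed under it.
mismatch <- data.frame(bar = bar_order, label = label_order)
print(mismatch)
```

Under these assumptions, the sixth bar holds the counts of Dataset15 but is labelled Dataset6, which would explain why some datasets appear to have far more genes than they really do.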

My solution is to remove the scale_x_discrete call. You can also set the bar order you want by calling degbar_data$dataset = factor(degbar_data$dataset, levels = ...) before plotting.
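Both options can be sketched as follows. The function name draw_degbar_fixed and the toy degbar_data (including its Regulation values) are hypothetical stand-ins, not part of MetaVolcanoR. Instead of dropping scale_x_discrete entirely, this version passes a labelling function, so each label is derived from its own bar and the 30-character truncation is kept.

```r
# Toy stand-in for degbar_data (hypothetical values; the real object is
# built inside MetaVolcanoR).
degbar_data <- data.frame(
  dataset    = rep(c("Dataset1", "Dataset2", "Dataset10"), times = c(3, 2, 4)),
  Regulation = c("Up", "Unperturbed", "Down", "Up", "Down",
                 "Unperturbed", "Unperturbed", "Up", "Down"),
  stringsAsFactors = FALSE
)

# Fix the bar order explicitly: levels follow the order of first appearance.
degbar_data$dataset <- factor(degbar_data$dataset,
                              levels = unique(degbar_data$dataset))

# Patched plotting function (hypothetical name).
draw_degbar_fixed <- function(degbar_data) {
  ggplot2::ggplot(degbar_data, ggplot2::aes(dataset)) +
    ggplot2::geom_bar(ggplot2::aes(fill = Regulation)) +
    ggplot2::theme_classic() +
    ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5)) +
    ggplot2::labs(x = "Datasets", y = "Number of genes") +
    ggplot2::scale_fill_manual(values = c("#E41A1C", "grey", "#377EB8")) +
    # A labelling function receives the breaks in drawing order,
    # so labels cannot drift away from their bars.
    ggplot2::scale_x_discrete(labels = function(x) substr(x, 1, 30))
}

# Build the plot only if ggplot2 is installed; print(p) would render it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- draw_degbar_fixed(degbar_data)
}
```

With the factor levels set this way, the bars appear as Dataset1, Dataset2, Dataset10 rather than in alphabetical order.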

I will open an issue on their GitHub page.
