Question

topGO GO enrichment analysis for common DEGs from multiple pairwise comparisons

3.6 years ago

anikng • 0

I am trying to run a GO enrichment analysis with topGO on a rice dataset, with the help of Reference 1 and Reference 2. According to the manual and these references, it is a good idea to provide p-values for ranking the genes.

However, I am working with a list of DEGs that were identified in common across different conditions (not from one specific comparison). In that case, how can I rank the genes by p-value or FDR? I am not sure how to input the DEGs and build a ranking vector.
The Fisher test and the KS test produce different GO enrichment results. Which one should I select?

I use the following code:

mart <- biomaRt::useMart(biomart = "plants_mart",
                         dataset = "osativa_eg_gene",
                         host = 'plants.ensembl.org')



get_go <- biomaRt::getBM(attributes = c( "ensembl_gene_id",
                                     "go_id"), mart = mart)
get_go <- get_go[get_go$go_id != '',]


geneID2GO <- by(get_go$go_id,
            get_go$ensembl_gene_id,
            function(x) as.character(x))


all.genes <- sort(unique(as.character(get_go$ensembl_gene_id)))



#Input list

?????

go.obj <- new("topGOdata", ontology='BP'
          , allGenes = int.genes
          , annot = annFUN.gene2GO
          , gene2GO = geneID2GO
          ,  nodeSize = 10
             )

#Fisher test
 results <- runTest(go.obj, algorithm = "elim", statistic = "fisher")
 results.tab <- GenTable(object = go.obj, elimFisher = results)


#Kolmogorov-Smirnov (K-S) test 
results.ks <- runTest(go.obj, algorithm="classic", statistic="ks")
goEnrichment <- GenTable(go.obj, KS=results.ks, orderBy="KS", topNodes=20)
goEnrichment <- goEnrichment[goEnrichment$KS<0.05,]
goEnrichment <- goEnrichment[,c("GO.ID","Term","KS")]
goEnrichment$Term <- gsub(" [a-z]*\\.\\.\\.$", "", goEnrichment$Term)
goEnrichment$Term <- gsub("\\.\\.\\.$", "", goEnrichment$Term)
goEnrichment$Term <- paste(goEnrichment$GO.ID, goEnrichment$Term, sep=", ")
goEnrichment$Term <- factor(goEnrichment$Term, levels=rev(goEnrichment$Term))
goEnrichment$KS <- as.numeric(goEnrichment$KS)

topGO GO enrichment Ensembl Biomart • 1.8k views

updated 3.6 years ago by antonioggsousa • written 3.6 years ago by anikng

score 3 · Accepted Answer · 2020-08-25

Hi,

Regarding your questions:

If you have what we can call a consensus DEG list, i.e. only the genes that were identified as differentially expressed in every one of the DEG lists you compared, you cannot do a rank-based GO enrichment, because the same gene can (and will) have a different rank in each of the compared DEG lists. You can, however, run the GO enrichment without a rank-based method by supplying a list of interesting genes, which is your case. See section 4.4, "Predefined list of interesting genes", of the topGO vignette. There, the gene list is a factor with two possible levels, 0 and 1, where 1 marks the interesting genes and 0 the non-interesting genes.
For the choice between the Fisher and KS tests, see this Biostars post: topGO which statistical test (fisher or KS) to use ?
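
As a minimal sketch of the predefined-list approach mentioned above, the 0/1 factor could be built roughly like this. The gene IDs and the `deg.lists` object are made up for illustration; in your case `all.genes` would be the vector you already derive from biomaRt and each element of `deg.lists` would be the DEG IDs from one pairwise comparison.

```r
# Toy stand-ins (hypothetical IDs). Replace with your real gene universe
# and the DEG vectors from each pairwise comparison.
all.genes <- c("Os01g0100100", "Os01g0100200", "Os01g0100300",
               "Os01g0100400", "Os01g0100500")
deg.lists <- list(
  cmp1 = c("Os01g0100100", "Os01g0100200", "Os01g0100400"),
  cmp2 = c("Os01g0100200", "Os01g0100400", "Os01g0100500"),
  cmp3 = c("Os01g0100200", "Os01g0100300", "Os01g0100400")
)

# Consensus DEGs: genes that appear in every comparison.
common.degs <- Reduce(intersect, deg.lists)

# Named factor over the whole gene universe:
# 1 = interesting (consensus DEG), 0 = not interesting.
int.genes <- factor(as.integer(all.genes %in% common.degs), levels = c(0, 1))
names(int.genes) <- all.genes

# Quick check of how many genes fall in each level.
table(int.genes)

# int.genes can then be passed as `allGenes = int.genes` in the
# new("topGOdata", ...) call from the question.
```

Note that with a 0/1 factor like this there are no gene scores, so, as far as I understand the vignette, only count-based statistics such as Fisher's exact test apply; the KS test needs a numeric score per gene.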

I hope this helps,

António