Question: automated gene ontology enrichment for simple gene list (not microarray data)
3.8 years ago, ruth.stoney wrote:


I need to find an automated way to do GO enrichment for 3000 sets of genes. I've been working in R, but the problem I'm having is that the majority of the tools (topGO, goseq) accept microarray data and do not work for simple gene lists.

DAVIDWebService seems like a perfect solution; however, I can't find a function that performs the actual enrichment analysis. It just seems to analyse/visualise existing enrichment files.

I am comfortable writing R and python (and could possibly get a Matlab licence) and would be willing to branch out if other tools are simple to use. 

Thanks for any advice!



goseq works with lists of genes, not with microarrays!

written 3.8 years ago by Benn

3.8 years ago, Giovanni M Dall'Olio (London, UK) wrote:

Have a look at the clusterProfiler package in Bioconductor. It accepts a list of Entrez gene IDs as input, and it can calculate both a simple enrichment (over-representation) test and a GSEA against Gene Ontology and other databases.

> m = enrichGO(as.character(c(1,2,3,4,5)) )
> summary(m)
                   ID                                  Description GeneRatio   BgRatio
GO:0019966 GO:0019966                        interleukin-1 binding       1/2   6/18679
GO:0019958 GO:0019958                      C-X-C chemokine binding       1/2   7/18679
GO:0019956 GO:0019956                            chemokine binding       1/2  15/18679
GO:0048306 GO:0048306            calcium-dependent protein binding       1/2  60/18679
GO:0019955 GO:0019955                             cytokine binding       1/2  83/18679
GO:0004867 GO:0004867 serine-type endopeptidase inhibitor activity       1/2  94/18679
GO:0002020 GO:0002020                             protease binding       1/2 103/18679
GO:0019838 GO:0019838                        growth factor binding       1/2 116/18679
GO:0004866 GO:0004866             endopeptidase inhibitor activity       1/2 168/18679
GO:0061135 GO:0061135             endopeptidase regulator activity       1/2 173/18679
GO:0030414 GO:0030414                 peptidase inhibitor activity       1/2 177/18679
GO:0061134 GO:0061134                 peptidase regulator activity       1/2 212/18679
                 pvalue    p.adjust      qvalue geneID Count
GO:0019966 0.0006423467 0.007493844 0.002760890      2     1
GO:0019958 0.0007493844 0.007493844 0.002760890      2     1
GO:0019956 0.0016054798 0.010703199 0.003943284      2     1
GO:0048306 0.0064141802 0.030955323 0.011404593      2     1
GO:0019955 0.0088674776 0.030955323 0.011404593      2     1
GO:0004867 0.0100397218 0.030955323 0.011404593      2     1
GO:0002020 0.0109983147 0.030955323 0.011404593      2     1
GO:0019838 0.0123821292 0.030955323 0.011404593      2     1
GO:0004866 0.0179076991 0.034295408 0.012635150      2     1
GO:0061135 0.0184381870 0.034295408 0.012635150      2     1
GO:0030414 0.0188624742 0.034295408 0.012635150      2     1
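
For anyone wondering what the pvalue column represents: the numbers above look consistent with a one-sided hypergeometric (over-representation) test built from the GeneRatio and BgRatio columns. The sketch below is not clusterProfiler code; it is a minimal standard-library Python check under that assumption, and the function name `hypergeom_upper_tail` is made up for illustration.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from a background of N genes,
    K of which carry the annotation (hypothetical helper, not a library API)."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / comb(N, n)

# Values read off the table above:
#   GeneRatio 1/2   -> k = 1 annotated gene among n = 2 mapped input genes
#   BgRatio 6/18679 -> K = 6 annotated genes among N = 18679 background genes
p_il1 = hypergeom_upper_tail(k=1, N=18679, K=6, n=2)   # GO:0019966
p_cxc = hypergeom_upper_tail(k=1, N=18679, K=7, n=2)   # GO:0019958

print(p_il1)
print(p_cxc)
```

Both values agree with the pvalue column for GO:0019966 and GO:0019958 to the precision printed in the table (0.0006423467 and 0.0007493844).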



Thank you, this works perfectly and is so simple!

written 3.8 years ago by ruth.stoney

3.8 years ago, Kamil wrote:

You might start by considering a function in the limma package called goana. See the examples in the documentation. The function can perform an enrichment test even if you only provide a vector of Entrez Gene IDs, without any other inputs.

See all the other packages available for Gene Set Enrichment at Bioconductor.
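
Since the question mentions being comfortable with Python and needing to process about 3000 gene sets, here is a rough sketch of what a batch over-representation loop might look like. It is not goana or any Bioconductor function. The gene IDs, term names, and the helpers `upper_tail`, `bh_adjust`, and `enrich` are all hypothetical, and the annotation dictionary is toy data standing in for a real GO mapping.

```python
from math import comb

def upper_tail(k, N, K, n):
    """P(X >= k) for n genes drawn from N background genes, K of them annotated."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / comb(N, n)

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def enrich(gene_list, term_to_genes, background):
    """Test one gene list against every term; returns (term, overlap, p, adjusted p)."""
    genes = set(gene_list) & background
    rows = []
    for term, members in term_to_genes.items():
        members = members & background
        k = len(genes & members)
        if k == 0:
            continue  # simplification of this sketch: skip terms with no overlap
        rows.append((term, k, upper_tail(k, len(background), len(members), len(genes))))
    adj = bh_adjust([p for _, _, p in rows])
    return sorted(((t, k, p, q) for (t, k, p), q in zip(rows, adj)),
                  key=lambda r: r[2])

# Toy data (made up): 20 background genes, two terms, two gene sets.
background = {f"g{i}" for i in range(1, 21)}
term_to_genes = {
    "TERM_A": {"g1", "g2", "g3"},
    "TERM_B": {f"g{i}" for i in range(10, 16)},
}
gene_sets = {"set1": ["g1", "g2", "g3"], "set2": ["g10", "g18"]}

# The same call is repeated for every gene set, which is all the automation needs.
results = {name: enrich(genes, term_to_genes, background)
           for name, genes in gene_sets.items()}
for name, rows in results.items():
    for term, k, p, q in rows:
        print(name, term, k, p, q)
```

For real data, the term-to-gene mapping and the background would have to come from an actual annotation source, and an established package is probably the safer choice; the loop only illustrates that a plain gene list is sufficient input for this kind of test.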
