Question: automated gene ontology enrichment for simple gene list (not microarray data)
21 months ago, ruth.stoney wrote:


I need to find an automated way to do GO enrichment for 3000 sets of genes. I've been working in R, but the problem I'm having is that the majority of the tools (topGO, goseq) accept microarray data and do not work for simple gene lists.

DAVIDWebService seems like a perfect solution; however, I can't find a function that performs the actual enrichment analysis. It only seems to analyse and visualise existing enrichment files.

I am comfortable writing R and Python (and could possibly get a MATLAB licence) and would be willing to branch out if other tools are simple to use.

Thanks for any advice!



goseq works with lists of genes, not with microarrays!

Reply written 21 months ago by b.nota
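
To illustrate the point about goseq taking plain gene lists, here is a minimal sketch. It assumes Entrez IDs on the hg19 genome (which goseq labels "knownGene"); the gene IDs, the universe, and the helper name `run_goseq` are made up for illustration. The only real work is turning the list into a named 0/1 vector over the background universe.

```r
# Hypothetical inputs: Entrez IDs of interest and a background universe.
# A real universe would hold thousands of genes; this one is a toy.
genes_of_interest <- c("2", "3", "5")
universe <- as.character(1:10)

# goseq expects a named 0/1 vector covering every gene in the universe:
# 1 = gene is in the list, 0 = gene is only in the background.
gene_vector <- as.integer(universe %in% genes_of_interest)
names(gene_vector) <- universe

# Hypothetical wrapper (not part of goseq). Assumes the goseq package and its
# gene length data for the chosen genome are installed.
run_goseq <- function(gene_vector, genome = "hg19", id = "knownGene") {
  # nullp() fits the gene-length weighting; goseq() runs the category test.
  pwf <- goseq::nullp(gene_vector, genome, id, plot.fit = FALSE)
  goseq::goseq(pwf, genome, id, test.cats = "GO:BP")
}
```

The wrapper is deliberately not called here, since the toy universe is far too small for the length-bias fit. With a realistic universe, `run_goseq(gene_vector)` would be the call to try.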
21 months ago, Giovanni M Dall'Olio (London, UK) wrote:

Have a look at the clusterProfiler package in Bioconductor. It accepts a list of Entrez gene IDs as input and can compute both a simple over-representation enrichment and a GSEA against Gene Ontology and other databases.

> m = enrichGO(as.character(c(1,2,3,4,5)) )
> summary(m)
                   ID                                  Description GeneRatio   BgRatio
GO:0019966 GO:0019966                        interleukin-1 binding       1/2   6/18679
GO:0019958 GO:0019958                      C-X-C chemokine binding       1/2   7/18679
GO:0019956 GO:0019956                            chemokine binding       1/2  15/18679
GO:0048306 GO:0048306            calcium-dependent protein binding       1/2  60/18679
GO:0019955 GO:0019955                             cytokine binding       1/2  83/18679
GO:0004867 GO:0004867 serine-type endopeptidase inhibitor activity       1/2  94/18679
GO:0002020 GO:0002020                             protease binding       1/2 103/18679
GO:0019838 GO:0019838                        growth factor binding       1/2 116/18679
GO:0004866 GO:0004866             endopeptidase inhibitor activity       1/2 168/18679
GO:0061135 GO:0061135             endopeptidase regulator activity       1/2 173/18679
GO:0030414 GO:0030414                 peptidase inhibitor activity       1/2 177/18679
GO:0061134 GO:0061134                 peptidase regulator activity       1/2 212/18679
                 pvalue    p.adjust      qvalue geneID Count
GO:0019966 0.0006423467 0.007493844 0.002760890      2     1
GO:0019958 0.0007493844 0.007493844 0.002760890      2     1
GO:0019956 0.0016054798 0.010703199 0.003943284      2     1
GO:0048306 0.0064141802 0.030955323 0.011404593      2     1
GO:0019955 0.0088674776 0.030955323 0.011404593      2     1
GO:0004867 0.0100397218 0.030955323 0.011404593      2     1
GO:0002020 0.0109983147 0.030955323 0.011404593      2     1
GO:0019838 0.0123821292 0.030955323 0.011404593      2     1
GO:0004866 0.0179076991 0.034295408 0.012635150      2     1
GO:0061135 0.0184381870 0.034295408 0.012635150      2     1
GO:0030414 0.0188624742 0.034295408 0.012635150      2     1
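
For anyone wondering where the pvalue column comes from: it looks like a one-sided hypergeometric test on the GeneRatio and BgRatio counts. The sketch below uses base R only and plugs in the numbers from the first two rows of the table. The helper name `enrich_p` is made up, and reading the GeneRatio denominator as "query genes that carry an annotation" is my interpretation of the table rather than something stated in it.

```r
# Counts read off the table above.
n_drawn <- 2      # query genes with an annotation (GeneRatio denominator)
n_total <- 18679  # annotated background genes (BgRatio denominator)

# Hypothetical helper: P(X >= hits) under the hypergeometric distribution.
# term_size is the number of background genes annotated to the GO term.
enrich_p <- function(hits, term_size, n_drawn, n_total) {
  phyper(hits - 1, term_size, n_total - term_size, n_drawn, lower.tail = FALSE)
}

p_il1 <- enrich_p(1, 6, n_drawn, n_total)  # GO:0019966, BgRatio 6/18679
p_cxc <- enrich_p(1, 7, n_drawn, n_total)  # GO:0019958, BgRatio 7/18679

print(signif(c(p_il1, p_cxc), 7))
```

The two values should agree with the pvalue entries for GO:0019966 and GO:0019958 to the precision printed in the table.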



Thank you, this works perfectly and is so simple!

Reply written 21 months ago by ruth.stoney
21 months ago, Kamil wrote:

You might start by considering a function in the limma package called goana. See the examples in the documentation. The function can perform an enrichment test even if you only provide a vector of Entrez Gene IDs, without any other inputs.

See all the other packages available for Gene Set Enrichment at Bioconductor.
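
A rough sketch of how goana could be run over many gene sets, since the question mentions 3000 of them. The gene sets and the helper name `run_goana_batch` are invented for illustration, and the code assumes human Entrez IDs with the limma, org.Hs.eg.db and GO.db packages installed.

```r
# Hypothetical gene sets: a named list of Entrez Gene ID vectors.
gene_sets <- list(
  set_a = c("1", "2", "3", "4", "5"),
  set_b = c("7157", "672", "675")
)

# Hypothetical helper: run goana on each set and keep the top BP terms.
# limma::topGO is spelled out in full to avoid confusion with the topGO package.
run_goana_batch <- function(gene_sets, species = "Hs", top_n = 10) {
  lapply(gene_sets, function(ids) {
    go <- limma::goana(ids, species = species)
    limma::topGO(go, ontology = "BP", number = top_n)
  })
}

# Only run when the required Bioconductor packages are available.
if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE) &&
    requireNamespace("GO.db", quietly = TRUE)) {
  results <- run_goana_batch(gene_sets)
  print(head(results$set_a))
}
```

Each element of `results` is a data frame of GO terms for one gene set, so the same call should scale to a few thousand sets, just more slowly. goana can also take the whole list in a single call if one combined table is preferred.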
