Question: Retrieving GO terms from Ensembl Bacteria
13 months ago by
Diego40 wrote:

What I want is to retrieve the GO terms associated with the genes of one organism directly from the Ensembl Bacteria database.

I know this was possible at some point after Ensembl Genomes abandoned the BioMart suite (according to this post), but I couldn't find anything in the documentation. In particular, I searched in

but I couldn't find anything about GO terms.

I know this was possible back then with BioMart:

  1. Obtaining a list of genes of a certain organism (e.g. E. coli, taxon ID 562)
  2. Selecting the mart ensemblbacteria
  3. Using the dataset of the organism you needed
  4. Filtering the dataset by your gene IDs and retrieving the 'go_id' attribute from the filtered rows.

but now that's no longer an option. I really don't know how to do this, or even whether it's possible at all.

I know there are tools (like GeneSCF) and services (like QuickGO) that can do similar things, but that's exactly why I'm trying to do this: I want to benchmark the results I obtain from Ensembl and compare (and complement) them with the results from those other sources.

13 months ago by
UK, Hinxton, EMBL-EBI
Denise - Open Targets4.2k wrote:

Try the Ensembl Genomes REST API with this xref endpoint, for example for the CAD01290 gene.
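
A minimal sketch of what such a call could look like in Python. The link to the exact endpoint is not preserved here, so this assumes the `xrefs/id/:id` endpoint, served from `https://rest.ensembl.org` (the base URL is an assumption and may differ for your setup). It also assumes that GO cross-references come back with `dbname` equal to `"GO"` and the accession in `primary_id`. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the REST service; change it if your server differs.
BASE_URL = "https://rest.ensembl.org"


def xref_url(gene_id, external_db="GO", all_levels=True):
    """Build an xrefs/id URL restricted to one external database.

    all_levels=1 asks for xrefs attached to the gene's transcripts and
    translations as well, which is where GO terms are often stored.
    """
    query = urllib.parse.urlencode(
        {"external_db": external_db, "all_levels": int(all_levels)}
    )
    return f"{BASE_URL}/xrefs/id/{urllib.parse.quote(gene_id)}?{query}"


def go_terms(xrefs):
    """Return the sorted, unique GO accessions in a decoded xrefs response."""
    return sorted({x["primary_id"] for x in xrefs if x.get("dbname") == "GO"})


def fetch_go_terms(gene_id):
    """Query the REST service for one gene and return its GO accessions."""
    request = urllib.request.Request(
        xref_url(gene_id), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return go_terms(json.load(response))


if __name__ == "__main__":
    # Needs network access; CAD01290 is the gene mentioned above.
    print(fetch_go_terms("CAD01290"))
```

Keeping the URL building and the response filtering separate from the network call makes it easy to check the parsing against a saved response before querying the server.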


Hi! Thanks for your answer, this will certainly be useful! Is there a way to ask for all genes of a given species? Is it possible to restrict the results with the 'compara' parameter?


Reply written 13 months ago by Diego

The REST API is good for extracts of the Ensembl Bacteria database. You can get the GO info via the REST API with Perl, Python, Ruby, Java, curl or wget for hundreds of genes, or perhaps even for the entire genome of your favourite bacterium (around 4,000 genes?). For all genes in any given species you can also access the data via the Ensembl Perl API, if you know Perl. The Compara analysis has not been done for all bacterial species in Ensembl, only for a subset of 202 genomes.
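
For a few hundred or a few thousand genes, the per-gene requests can be wrapped in a simple loop. The sketch below only shows that loop. `fetch` is a hypothetical callable that you supply and that returns the GO accessions for one gene ID (for example a wrapper around an xref request). The `delay` between requests is an arbitrary placeholder, so check the service's current rate limits before running it over a whole genome.

```python
import time


def collect_go(gene_ids, fetch, delay=0.1):
    """Call fetch(gene_id) once per gene and gather the results.

    Returns (annotations, failed): a dict mapping gene ID to a sorted list
    of unique GO accessions, and a list of (gene ID, error message) pairs
    for the genes whose request raised an exception.
    """
    annotations, failed = {}, []
    for gene_id in gene_ids:
        try:
            annotations[gene_id] = sorted(set(fetch(gene_id)))
        except Exception as err:
            # Record the failure and carry on with the remaining genes.
            failed.append((gene_id, str(err)))
        time.sleep(delay)  # pause between requests to go easy on the server
    return annotations, failed


def to_tsv(annotations):
    """Format the annotations as one 'gene<TAB>GO term' pair per line."""
    lines = []
    for gene_id, terms in annotations.items():
        for term in terms:
            lines.append(f"{gene_id}\t{term}")
    return "\n".join(lines)
```

The resulting two-column table is a convenient shape for comparing against the gene-to-GO output of other tools.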

Reply written 13 months ago by Denise - Open Targets