How to get GO term IDs from a file of Ensembl gene IDs with a Python script
5.7 years ago
jopealfe ▴ 20

I have a file that contains some Ensembl gene IDs. My goal is to get all the GO terms associated with those Ensembl gene IDs using a Python script.

Does anyone know of a Python package that would let me do this? I'm aware of packages such as mygene, godb, etc. However, as far as I can tell, none of them has an option to get the GO terms from Ensembl gene IDs. I've also done this in R, and it worked just fine. However, I really want to do it with a Python script, and I'm running into a lot of difficulties.


Copied and modified from the Ensembl REST documentation page (for Python 3):

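import sys

import requests

server = "https://rest.ensembl.org"
# Cross-references for one Ensembl ID, restricted to the GO database.
ext = "/xrefs/id/ENST00000288602?external_db=GO;all_levels=1"

# Ask the server for a JSON response.
r = requests.get(server + ext, headers={"Content-Type": "application/json"})

if not r.ok:
    r.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    sys.exit()

decoded = r.json()
print(repr(decoded))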

Thanks! That's what I was looking for!

Do you know if there is a limit on the number of gene IDs that can be queried? I've adapted the code into a loop to run it over several Ensembl IDs. What I've noticed so far is that when the number of input IDs is large, I get the following error message:

HTTPError: 400 Client Error: Bad Request for url: https://rest.ensembl.org/xrefs/id/ENSG00000068793?external_db=GO;all_levels=1

That Ensembl ID is the 21st input. I guess it only accepts 19 or 20 consecutive inputs, because when I reduce the list to 19 inputs it works just fine.


For more IDs, use POST instead of GET. Examples are here: https://rest.ensembl.org/documentation/info/lookup_post


From what I briefly read at that link, that option does not return the GO IDs...

I've only just understood the cause of that error: some of the input IDs had no result. I've readapted the code to handle those IDs and it works perfectly now, thanks!
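
For anyone adapting the same loop, here is a minimal sketch of how the skipping could look, assuming (as described above) that a 400 response is how the server reports an ID it has no result for. The helper names (`read_ids`, `go_ids_from_xrefs`, `fetch_go_ids`, `annotate`), the one-ID-per-line file layout, and the `primary_id` / `dbname` field names are my assumptions rather than anything confirmed in this thread, so check them against a real response first.

```python
def read_ids(path):
    """Read one Ensembl ID per line, skipping blank lines (assumed file layout)."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def go_ids_from_xrefs(xrefs):
    """Collect the unique GO accessions from a decoded /xrefs/id response.

    Assumes each entry is a dict with 'dbname' and 'primary_id' keys.
    """
    return sorted({x["primary_id"] for x in xrefs if x.get("dbname") == "GO"})


def fetch_go_ids(session, ensembl_id, server="https://rest.ensembl.org"):
    """Return the GO IDs for one Ensembl ID, or None if the server answers 400."""
    r = session.get(
        f"{server}/xrefs/id/{ensembl_id}",
        params={"external_db": "GO", "all_levels": 1},
        headers={"Content-Type": "application/json"},
        timeout=30,
    )
    if r.status_code == 400:
        # Assumption: IDs with no result come back as 400 Bad Request,
        # so record them as None instead of aborting the whole loop.
        return None
    r.raise_for_status()  # any other HTTP error is still raised
    return go_ids_from_xrefs(r.json())


def annotate(session, ensembl_ids):
    """Map every input ID to its GO IDs (None for IDs the server rejected)."""
    return {eid: fetch_go_ids(session, eid) for eid in ensembl_ids}
```

Called as `annotate(requests.Session(), read_ids("ids.txt"))` (the file name is hypothetical), this should give a dict keyed by Ensembl ID, with `None` marking the IDs that were skipped. Passing the session in keeps one connection open across requests and makes the functions easy to try with a stand-in object.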


jopealfe, you are right. The Ontologies and Taxonomy endpoints don't support POST (as of now).

5.7 years ago

You can totally do this with MyGene:

import mygene

mg = mygene.MyGeneInfo()

mg.getgene('ENSG00000123374', fields='go')

Returns a whole bunch of GO terms that you can easily parse.
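
As a rough sketch of the parsing step, assuming the returned record has a `go` key that maps category codes (such as `BP`, `CC`, `MF`) to entries carrying `id` and `term` fields, and that a category with a single entry may come back as a bare dict rather than a list. `flatten_go` is a hypothetical helper name, and the record layout is my understanding of the MyGene response, so inspect a real result before relying on it.

```python
def flatten_go(record):
    """Turn the 'go' part of a MyGene record into sorted (category, GO ID, term) rows."""
    rows = set()
    for category, entries in record.get("go", {}).items():
        # Assumption: a single annotation may arrive as a dict, not a list.
        if isinstance(entries, dict):
            entries = [entries]
        for entry in entries:
            # The same term can be listed once per evidence code; the set
            # keeps one row per distinct (category, ID, term).
            rows.add((category, entry["id"], entry.get("term", "")))
    return sorted(rows)
```

With the client from the snippet above, `flatten_go(mg.getgene('ENSG00000123374', fields='go'))` should give one row per distinct term. For a whole list of IDs read from a file, `mg.getgenes(ids, fields='go')` should return one record per ID, each of which can be passed through the same function; a record without a `go` key simply yields an empty list.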
