Problem With Programmatic Access To Uniprot (Solved)
9.2 years ago
Olivier ▴ 440

Dear all

Is there a 'universal' way to programmatically (via Biopython) access protein data/features without going to each of the different DBs (UniProt, Swiss-Prot, PROSITE, and Pfam)? Via ExPASy? Or InterPro?

urllib.urlretrieve('http://www.uniprot.org/uniprot/A2Z669.txt',filename='xxx.txt') has worked once or twice before but no longer does. I can still open the file in my browser, though.

I tried UniProt's template for Python programmatic access for a ROSALIND exercise (http://rosalind.info/problems/mprt/), but it doesn't work: IDLE freezes.

code used, from uniprot:

import urllib, urllib2

url = 'http://www.uniprot.org/mapping/'

params = {
    'from':'ACC',
    'to':'P_REFSEQ_AC',
    'format':'tab',
    'query':'A2Z669 B5ZC00 P07204_TRBM_HUMAN P20840_SAG1_YEAST'
}

data = urllib.urlencode(params)
request = urllib2.Request(url, data)
contact = "my.email.address"
# Substitute the contact address into the User-Agent header
request.add_header('User-Agent', 'Python %s' % contact)
response = urllib2.urlopen(request)
page = response.read(200000)

Thanks

uniprot python • 8.2k views

Make sure to understand that urlretrieve will only work if the URL exists and can be accessed, say, from a browser. It does nothing more than download the file, and it has no bioinformatics awareness.

For the second example you should edit your question and include more of the code. It is impossible to troubleshoot as posted: we don't know all the ROSALIND examples by heart.
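To tell "the URL does not exist" apart from "my machine cannot reach the server", a small check along these lines may help. It is only a sketch, written for Python 3 (urllib.request) rather than the Python 2 modules used in the question; the helper name try_fetch and the 10 second timeout are my own choices, not anything from UniProt.

```python
import urllib.error
import urllib.request


def try_fetch(url, timeout=10):
    """Return (text, None) on success or (None, reason) on failure.

    try_fetch is a hypothetical helper; the default timeout is arbitrary.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8"), None
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404).
        return None, "HTTP %d %s" % (err.code, err.reason)
    except urllib.error.URLError as err:
        # No usable answer: DNS failure, refused connection, timeout, proxy...
        return None, "connection problem: %s" % err.reason
    except OSError as err:
        # A timeout while reading the body can surface as a plain OSError.
        return None, "connection problem: %s" % err
```

Calling try_fetch('http://www.uniprot.org/uniprot/A2Z669.txt') would then return either the record text or a short reason instead of hanging. An "HTTP ..." reason points at the URL, a "connection problem" at the local network.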

9.2 years ago

It works here. I'm not sure what the output should be, but it finishes just fine.

import urllib, urllib2

url = 'http://www.uniprot.org/mapping/'

params = {
    'from':'ACC',
    'to':'P_REFSEQ_AC',
    'format':'tab',
    'query':'A2Z669 B5ZC00 P07204_TRBM_HUMAN P20840_SAG1_YEAST'
}

data = urllib.urlencode(params)
request = urllib2.Request(url, data)
contact = "my.email.address"
# Substitute the contact address into the User-Agent header
request.add_header('User-Agent', 'Python %s' % contact)
response = urllib2.urlopen(request)
page = response.read()

print page

produces

From    To
B5ZC00    YP_002284940.1
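For anyone on Python 3, the same request and the tab-separated reply above could be handled roughly as follows. This is a sketch under assumptions: the function names are made up for illustration, the parameters are copied from the code in this thread, and the endpoint may no longer be served in this form, so the request is only built here, not sent.

```python
import urllib.parse
import urllib.request


def build_mapping_request(url, accessions, contact):
    """Build (but do not send) the POST request used in this thread.

    Hypothetical helper; parameter values are those from the posted code.
    """
    params = {
        "from": "ACC",
        "to": "P_REFSEQ_AC",
        "format": "tab",
        "query": " ".join(accessions),
    }
    # In Python 3 the POST body must be bytes, not str.
    data = urllib.parse.urlencode(params).encode("ascii")
    request = urllib.request.Request(url, data)
    request.add_header("User-Agent", "Python %s" % contact)
    return request


def parse_mapping_table(text):
    """Turn a 'From<TAB>To' reply into {source_id: [target_id, ...]}."""
    mapping = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2:
            mapping.setdefault(fields[0], []).append(fields[1])
    return mapping
```

The request would be sent with urllib.request.urlopen(request, timeout=...), and the decoded reply passed to parse_mapping_table. A list is kept per ID on the assumption that one accession can map to several targets.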

IDLE (Python 2.7) hangs for about 30 seconds and then I get the traceback below. That's discouraging:

Traceback (most recent call last):
  File "/home/olivier/ROSALIND_scripts/uniprotTest.py", line 15, in <module>
    response = urllib2.urlopen(request)  
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)  
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)  
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)  
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)  
  File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)  
  File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
    raise URLError(err)  
URLError: <urlopen error [Errno 110] Connection timed out>

Thanks nevertheless. My aim is to get the records for the UniProt IDs; after that I'll be able to manage. I guess I have to try harder.


I think I have a modem problem, Istvan. I had the same problem in the past (I don't know why) with Entrez from Biopython. I added this line at the top, before using "urllib.urlretrieve()": os.environ['http_proxy'] = 'http://193.62.193.81:80'.
Hope it helps someone. By the way, I changed the title of the question. I don't really know whether to post this as an answer.
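An alternative to setting the environment variable is to pass the proxy to urllib explicitly. The sketch below assumes Python 3; make_proxy_opener is a hypothetical helper name, and routing https through the same proxy is my assumption, not something stated in this thread.

```python
import urllib.request


def make_proxy_opener(proxy_url, schemes=("http", "https")):
    """Return an opener that sends the given URL schemes through proxy_url.

    Hypothetical helper; the default schemes are an assumption.
    """
    handler = urllib.request.ProxyHandler(
        {scheme: proxy_url for scheme in schemes}
    )
    return urllib.request.build_opener(handler)
```

With opener = make_proxy_opener('http://193.62.193.81:80'), a single request goes through opener.open(url, timeout=30). Calling urllib.request.install_opener(opener) once at the top of a script makes later urlopen and urlretrieve calls use it as well.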


Your problem is very likely caused by local settings on your computer. Note that the Windows version of Python uses the proxy and other network settings specified in Internet Explorer. That is unexpected, but something to look out for if you are on Windows; it comes from the way network access is integrated into the OS.
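To see which proxy settings Python has actually picked up on a given machine, something like this can be run first. It is a Python 3 sketch; describe_proxies is a hypothetical name, while urllib.request.getproxies() is the standard-library call that reads the *_proxy environment variables and, where supported, the system settings.

```python
import urllib.request


def describe_proxies():
    """Summarise the proxies urllib will use by default on this machine."""
    proxies = urllib.request.getproxies()
    if not proxies:
        return "no proxy configured"
    return ", ".join(
        "%s -> %s" % (scheme, proxies[scheme]) for scheme in sorted(proxies)
    )
```

Printing describe_proxies() before any download shows whether an unexpected proxy (or none at all) is in effect, which narrows down a "works in the browser but not in Python" situation.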


Thanks for your help. I am using IDLE with Python 2.7 on Ubuntu 12.04. I guess I have to set these parameters each time I access these databases.

8.1 years ago

You asked for Python, but I thought it would be good to mention that there is a beautiful UniProt Java API developed by the UniProt people: http://www.ebi.ac.uk/uniprot/remotingAPI/
