Problem With Programmatic Access To Uniprot (Solved)
10.0 years ago
Olivier ▴ 440

Dear all

Is there a 'universal' way to programmatically (via Biopython) access protein data/features without going to each of the different DBs (UniProt, Swiss-Prot, PROSITE, and Pfam)? Via ExPASy? Or InterPro?

urllib.urlretrieve('http://www.uniprot.org/uniprot/A2Z669.txt',filename='xxx.txt') worked once or twice before but no longer does. I can still open the file in my browser, though.

I tried UniProt's template for programmatic access from Python, for a ROSALIND exercise (http://rosalind.info/problems/mprt/), but it doesn't work: IDLE freezes.

Code used, from UniProt:

import urllib,urllib2

url = 'http://www.uniprot.org/mapping/'

params = {
'from':'ACC',
'to':'P_REFSEQ_AC',
'format':'tab',
'query':'A2Z669 B5ZC00 P07204_TRBM_HUMAN P20840_SAG1_YEAST'
}

data = urllib.urlencode(params)
request = urllib2.Request(url, data)
response = urllib2.urlopen(request)


Thanks
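For readers on Python 3, where urllib2 was folded into urllib.request and urllib.parse, the same request could be built roughly as follows. This is only a sketch: the helper name build_mapping_request is made up for illustration, the URL and parameters are copied from the post above, and the endpoint may well have changed since. Nothing is sent until the request is passed to urlopen, ideally with a timeout so that a stalled connection does not freeze IDLE.

```python
import urllib.parse
import urllib.request


def build_mapping_request(ids, url="http://www.uniprot.org/mapping/",
                          source="ACC", target="P_REFSEQ_AC"):
    """Build (but do not send) a POST request for the mapping service.

    The helper name and the default arguments are illustrative; the URL
    and the parameter names are the ones quoted in the post.
    """
    params = {
        "from": source,
        "to": target,
        "format": "tab",
        "query": " ".join(ids),  # IDs are sent as one space-separated string
    }
    # In Python 3 the POST body must be bytes, hence the explicit encode().
    data = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(url, data)


request = build_mapping_request(["A2Z669", "B5ZC00"])
# To actually send it (needs network access):
# response = urllib.request.urlopen(request, timeout=30)
# page = response.read().decode("utf-8")
```

Passing data to Request is what turns it into a POST; without it urllib would issue a GET.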

uniprot python • 9.1k views

Make sure you understand that urlretrieve will only work if the URL exists and can be accessed, say, from a browser. It does nothing more than that, and it has no bioinformatics awareness.
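One way to tell "the URL is wrong" apart from "my machine cannot reach the server" is to fetch with an explicit timeout and look at which exception comes back. The sketch below assumes Python 3; fetch_text and its opener parameter are hypothetical names introduced here, not part of urllib.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10, opener=urllib.request.urlopen):
    """Return (body, error): the decoded page, or None plus a short message."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8"), None
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404 or 405).
        return None, "HTTP error %d: %s" % (err.code, err.reason)
    except urllib.error.URLError as err:
        # No usable answer: DNS failure, refused connection, connect timeout...
        return None, "connection problem: %s" % err.reason
    except OSError as err:
        # For example a timeout while reading an already opened response.
        return None, "network error: %s" % err


# Example call (needs network access):
# body, error = fetch_text("http://www.uniprot.org/uniprot/A2Z669.txt")
```

HTTPError is caught first because it is a subclass of URLError. The opener argument is only there so the function can be exercised without a network connection.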

For the second example, you should edit your question and include more of the code. It is impossible to troubleshoot as posted; we don't know all the ROSALIND examples by heart.


Thanks for the code. I get the following error with urlopen: 'HTTPError: HTTP Error 405: Not Allowed'. Can you help me with this?

10.0 years ago

Works here. I'm not sure what the output should be, but it finishes just fine:

import urllib,urllib2
url = 'http://www.uniprot.org/mapping/'

params = {
'from':'ACC',
'to':'P_REFSEQ_AC',
'format':'tab',
'query':'A2Z669 B5ZC00 P07204_TRBM_HUMAN P20840_SAG1_YEAST'
}

data = urllib.urlencode(params)
request = urllib2.Request(url, data)
response = urllib2.urlopen(request)
page = response.read()  # read the response body; 'page' was otherwise undefined

print page


produces

From    To
B5ZC00    YP_002284940.1
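The tab-separated output above is easy to turn into a dictionary. A minimal sketch, assuming the first line is always the 'From / To' header and every other line holds exactly two whitespace-separated columns; parse_mapping is a made-up helper name:

```python
def parse_mapping(page):
    """Map each source ID to the list of target IDs found in the output."""
    mapping = {}
    lines = page.strip().splitlines()
    for line in lines[1:]:  # skip the 'From  To' header row
        fields = line.split()
        if len(fields) != 2:
            continue  # ignore blank or malformed rows
        source, target = fields
        # One source ID may map to several targets, so collect them in a list.
        mapping.setdefault(source, []).append(target)
    return mapping


page = "From\tTo\nB5ZC00\tYP_002284940.1\n"
print(parse_mapping(page))
```

IDs that returned no mapping simply do not appear as keys, which makes it easy to spot which of the submitted IDs were not recognised.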


IDLE (Python 2.7) hangs for about 30 seconds and then I get the following, which is discouraging:

Traceback (most recent call last):
  File "/home/olivier/ROSALIND_scripts/uniprotTest.py", line 15, in <module>
    response = urllib2.urlopen(request)
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 110] Connection timed out>


Thanks nevertheless. My aim is to get the records for the UniProt IDs; once I have those, I'll be able to manage. I guess I have to try harder.
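If the goal is just one flat-file record per ID, the URL pattern from the urlretrieve call in the question can be reused. The sketch below rests on two assumptions: that the server still serves records under that pattern, and that for IDs such as P07204_TRBM_HUMAN the part before the first underscore is the accession. record_url is a hypothetical helper name.

```python
def record_url(entry_id, base="http://www.uniprot.org/uniprot/"):
    """Build the flat-file URL for an ID, keeping only the accession part."""
    # Assumption: 'P07204_TRBM_HUMAN' -> accession 'P07204'.
    accession = entry_id.split("_", 1)[0]
    return base + accession + ".txt"


ids = ["A2Z669", "B5ZC00", "P07204_TRBM_HUMAN", "P20840_SAG1_YEAST"]
for entry_id in ids:
    print(entry_id, record_url(entry_id))
```

Each URL can then be downloaded one at a time, which also makes it obvious which ID fails if one of them does.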


I think I have a modem problem, Istvan. I had the same issue in the past (I don't know why) with Entrez from Biopython. I added this line at the top (after "import os"), before calling "urllib.urlretrieve()": os.environ['http_proxy'] = 'http://193.62.193.81:80'.
Hope it helps someone. By the way, I changed the title of the question. I'm not sure whether this post should count as an answer.
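In Python 3 terms, the proxy workaround could look like the sketch below. The proxy address is the one quoted in this post and will almost certainly differ on another network; detected_http_proxy is an illustrative helper, not a urllib function.

```python
import os
import urllib.request

# Address taken from the post; replace it with your own network's proxy.
PROXY = "http://193.62.193.81:80"

# Option 1: set the environment variable, which urllib consults when it
# looks up proxies for its default opener.
os.environ["http_proxy"] = PROXY

# Option 2: build an explicit opener, which does not rely on the environment.
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"http": PROXY})
)


def detected_http_proxy():
    """Return the HTTP proxy urllib currently detects, or None."""
    return urllib.request.getproxies().get("http")


print(detected_http_proxy())
# To use the explicit opener (needs network access):
# page = opener.open("http://www.uniprot.org/uniprot/A2Z669.txt", timeout=30).read()
```

Option 1 affects every urllib call in the process, whereas option 2 only affects requests made through that opener.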


Your problem is very likely caused by local settings on your computer. Note that the Windows version of Python uses the proxy and other network settings specified in Internet Explorer. This is unexpected, but something to look out for if you are on Windows; it has to do with the way network access is integrated into the OS.
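To see which proxy settings Python has picked up, whether from environment variables or from the system configuration, something like the following may help. It is a sketch for Python 3; describe_proxies is a made-up name, and the opener built with an empty proxy dictionary is one documented way to ask urllib to ignore autodetected proxies.

```python
import urllib.request


def describe_proxies():
    """Summarise the proxies urllib detects on this machine."""
    proxies = urllib.request.getproxies()
    if not proxies:
        return "no proxy configured"
    return ", ".join(
        "%s -> %s" % (scheme, address)
        for scheme, address in sorted(proxies.items())
    )


print(describe_proxies())

# An opener that bypasses any detected proxy and connects directly.
direct_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
# page = direct_opener.open("http://www.uniprot.org/uniprot/A2Z669.txt", timeout=30).read()
```

Comparing a request made through direct_opener with one made through the default urlopen is a quick way to check whether a proxy setting is the culprit.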


Thanks for your help. I'm using IDLE with Python 2.7 on Ubuntu 12.04. I guess I have to set these parameters each time I need to access these databases.

8.9 years ago

You asked for Python, but in any case I thought it would be good to mention that there's a beautiful UniProt Java API developed by the UniProt people: http://www.ebi.ac.uk/uniprot/remotingAPI/