Question: Problem With Programmatic Access To Uniprot (Solved)
Olivier wrote (6.0 years ago):

Dear all

Is there a 'universal' way to access protein data/features programmatically (via Biopython) without going to each of the different databases (UniProt, Swiss-Prot, PROSITE, and Pfam)? Via ExPASy, or InterPro?

urllib.urlretrieve('http://www.uniprot.org/uniprot/A2Z669.txt',filename='xxx.txt') worked once or twice before but no longer does. I can still open the file in my browser, though.

I tried UniProt's template for programmatic access from Python, for a ROSALIND exercise (http://rosalind.info/problems/mprt/), but it doesn't work: IDLE freezes.

Code used, from UniProt:

import urllib, urllib2

url = 'http://www.uniprot.org/mapping/'

params = {
    'from': 'ACC',
    'to': 'P_REFSEQ_AC',
    'format': 'tab',
    'query': 'A2Z669 B5ZC00 P07204_TRBM_HUMAN P20840_SAG1_YEAST'
}

data = urllib.urlencode(params)
request = urllib2.Request(url, data)
contact = "my.email.address"
# Interpolate the contact address; the literal 'Python contact' never used the variable
request.add_header('User-Agent', 'Python %s' % contact)
response = urllib2.urlopen(request)
page = response.read(200000)

Thanks

python uniprot • 3.0k views

Keep in mind that urlretrieve will only work if the URL exists and can be accessed, say, from a browser. It does nothing more than download that URL, and it has no bioinformatics awareness.
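To make that point concrete, here is a minimal sketch in Python 3 (where `urllib2` was merged into `urllib.request`) that fetches a URL and reports why it failed rather than failing silently. The helper name `fetch_text` and its `opener` argument are assumptions made for illustration; they are not part of urllib.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10, opener=urllib.request.urlopen):
    """Return the body of `url` as text, or None if it cannot be fetched.

    `fetch_text` and `opener` are hypothetical names; `opener` is injectable
    only so the function can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        print("HTTP error %d for %s" % (err.code, url))
    except urllib.error.URLError as err:
        # No usable answer: DNS failure, refused connection, connect timeout.
        print("Could not reach %s: %s" % (url, err.reason))
    except OSError as err:
        # For example a socket timeout while reading the body.
        print("I/O error for %s: %s" % (url, err))
    return None
```

Calling `fetch_text('http://www.uniprot.org/uniprot/A2Z669.txt')` with the URL from the question would then either return the record text or print the reason it could not be retrieved.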

For the second example you should edit your question and include more of the code. It is impossible to troubleshoot as posted; we don't know all the ROSALIND examples by heart.

Comment by Istvan Albert, written 6.0 years ago
Istvan Albert (University Park, USA) wrote (6.0 years ago):

Works here. I'm not sure what the output should be, but it finishes just fine:

import urllib, urllib2

url = 'http://www.uniprot.org/mapping/'

params = {
    'from': 'ACC',
    'to': 'P_REFSEQ_AC',
    'format': 'tab',
    'query': 'A2Z669 B5ZC00 P07204_TRBM_HUMAN P20840_SAG1_YEAST'
}

data = urllib.urlencode(params)
request = urllib2.Request(url, data)
contact = "my.email.address"
# Interpolate the contact address into the User-Agent header
request.add_header('User-Agent', 'Python %s' % contact)
response = urllib2.urlopen(request)
page = response.read()

print page

produces

From    To
B5ZC00    YP_002284940.1
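
The tab-separated output above is easy to turn into a lookup table. The following is a small sketch, assuming the first line is always the `From`/`To` header; `parse_mapping` is a hypothetical helper name, and in Python 3 the bytes returned by `response.read()` would need a `.decode()` before being passed in.

```python
import csv
import io


def parse_mapping(page):
    """Turn tab-separated mapping output into {source_id: [target_ids]}.

    `parse_mapping` is a hypothetical helper; it assumes the first line
    is the "From<TAB>To" header.
    """
    mapping = {}
    reader = csv.reader(io.StringIO(page), delimiter="\t")
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        mapping.setdefault(row[0], []).append(row[1])
    return mapping


page = "From\tTo\nB5ZC00\tYP_002284940.1\n"
print(parse_mapping(page))
```

A list is used for the values because one accession may in principle map to several target identifiers.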
Answer written 6.0 years ago by Istvan Albert

IDLE (Python 2.7) hangs for about 30 seconds and then I get the traceback below, which is discouraging:
Traceback (most recent call last):
  File "/home/olivier/ROSALIND_scripts/uniprotTest.py", line 15, in <module>
    response = urllib2.urlopen(request)
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 110] Connection timed out>

Thanks nevertheless. My aim is to get the records for the uniprot IDs, then I'll be able to manage. I guess I gotta try harder.
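
A connection timeout like the one in the traceback can at least be made to fail quickly and be retried. Here is a rough Python 3 sketch of the same POST request with an explicit timeout and a few retries. The names `post_with_retries`, `opener` and `sleep` are assumptions for illustration, and retrying will not help if the network path itself is misconfigured.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def post_with_retries(url, params, contact, timeout=10, attempts=3,
                      opener=urllib.request.urlopen, sleep=time.sleep):
    """POST `params` to `url`, retrying when the connection fails.

    All names here are hypothetical; `opener` and `sleep` are injectable
    only so the retry logic can be tested offline.
    """
    data = urllib.parse.urlencode(params).encode("ascii")  # body must be bytes
    request = urllib.request.Request(url, data)
    request.add_header("User-Agent", "Python %s" % contact)
    for attempt in range(1, attempts + 1):
        try:
            with opener(request, timeout=timeout) as response:
                return response.read().decode("utf-8")
        except urllib.error.HTTPError:
            raise  # the server answered; retrying is unlikely to help
        except OSError as err:  # URLError and socket timeouts are OSErrors
            if attempt == attempts:
                raise
            print("attempt %d failed (%s), retrying" % (attempt, err))
            sleep(2 ** attempt)  # wait 2 s, then 4 s, before the next try
```

With `timeout=10` the script gives up on each attempt after about ten seconds instead of leaving IDLE apparently frozen.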

Reply by Olivier, written 6.0 years ago

I think I have a modem problem, Istvan. I had the same issue in the past (I don't know why) with Entrez from Biopython. I added this line at the top, before calling urllib.urlretrieve(): os.environ['http_proxy'] = 'http://193.62.193.81:80' (this needs import os).
Hope it helps someone. By the way, I changed the title of the question. I don't really know whether to post this as an answer.
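
An alternative to setting the environment variable is to attach the proxy to a single opener, which leaves the rest of the process untouched. This is a minimal Python 3 sketch; `build_proxy_opener` is a hypothetical helper name, and the proxy address is simply the one quoted above, so it should be treated as a placeholder for whatever proxy applies on a given network.

```python
import urllib.request

# Proxy address copied from the post above; it is specific to that network
# and is used here only as a placeholder.
PROXY_URL = "http://193.62.193.81:80"


def build_proxy_opener(proxy_url):
    """Return an opener that sends http:// requests through `proxy_url`.

    `build_proxy_opener` is a hypothetical helper name.
    """
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_proxy_opener(PROXY_URL)
# opener.open(url, timeout=10) would now go through the proxy, while other
# code in the same process keeps using the default settings.
```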

Reply by Olivier, written 6.0 years ago

Your problem is very likely caused by local settings on your computer. Note that the Windows version of Python will make use of the proxy and other network settings specified in Internet Explorer. This is very unexpected, but it is something to look out for if you are on Windows; it has to do with the way network access is integrated into the OS.

Reply by Istvan Albert, written 6.0 years ago

Thanks for your help. I'm using IDLE with Python 2.7 on Ubuntu 12.04. I guess I have to set these parameters each time I have to access these databases.

Reply by Olivier, written 6.0 years ago
Jose Manuel Duarte (Zurich) wrote (4.9 years ago):

You asked for Python, but in any case I thought it would be good to mention that there's a beautiful UniProt Java API developed by the UniProt people: http://www.ebi.ac.uk/uniprot/remotingAPI/
