Question

HTTP Error 502-Biopython-Entrez Files

1

9.4 years ago

jasminebro2 • 0

Hi, I am using Biopython to pull files from NCBI using Entrez. The program works on small files, but on larger files I get an error. I would really appreciate some insight or help figuring out what went wrong.

Here is the program:

from Bio import Entrez
Entrez.email = "jbro262@lsu.edu"
# usehistory="y" is needed so that the results include WebEnv and QueryKey (used below)
search_handle = Entrez.esearch(db="nucleotide",term="Saimiri",usehistory="y")
search_results = Entrez.read(search_handle)
search_handle.close()

gi_list = search_results["IdList"]
count = int(search_results["Count"])
a = open("Numfile.txt", "a+")
a.write("The number of Saimiri files are :")
a.write(str(count))
a.write("\n")
a.close()

webenv = search_results["WebEnv"]
query_key = search_results["QueryKey"]

batch_size = 25
out_handle = open("SaimiriDNA.fasta", "w")

for start in range(0,count,batch_size):
    
    end = min(count, start+batch_size)
    print("Going to download record %i to %i" % (start+1, end))
    
    fetch_handle = Entrez.efetch(db="nucleotide", rettype="fasta", retmode="text", retstart=start, retmax=batch_size, webenv=webenv, query_key=query_key)
    data=fetch_handle.read()
    fetch_handle.close()
    out_handle.write(data)
out_handle.close()

HERE ARE THE ERRORS:

Traceback (most recent call last):
  File "Entrezfiles_Saimiri.py", line 53, in <module>
    fetch_handle = Entrez.efetch(db="nucleotide", rettype="fasta", retmode="text", retstart=start, retmax=batch_size, webenv=webenv, query_key=query_key)
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/__init__.py", line 149, in efetch
    return _open(cgi, variables, post)
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/__init__.py", line 464, in _open
    raise exception
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/__init__.py", line 462, in _open
    handle = _urlopen(cgi)
  File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 461, in open
    response = meth(req, response)
  File "/usr/lib/python3.4/urllib/request.py", line 571, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.4/urllib/request.py", line 499, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 579, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 502: Bad Gateway

Does this mean something is going wrong with my server while the files are downloading?

Help is greatly appreciated.

biopython entrez python3 ubuntu error • 8.1k views

updated 2.2 years ago by Ram 43k • written 9.4 years ago by jasminebro2 • 0

1

A 502 usually has nothing to do with the client. Could you try again in half an hour or so and see if it still occurs?

9.4 years ago by Ram 43k

0

Thank you. Okay, I will try again in a few minutes. However, it took an hour or so for the error to occur the last time. What exactly does Error 502 mean, and how does that relate to a urllib.error in Python?

9.4 years ago by jasminebro2 • 0

1

Under Python 3, you would import the HTTPError class with: from urllib.error import HTTPError

Having done that you can use it to catch the exception, see also: http://stackoverflow.com/questions/3193060/catch-specific-http-error-in-python

HTTP error code 502 (Bad Gateway) is a server-side problem (in this case, an NCBI problem): a gateway or proxy server got an invalid response from the server behind it. See http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
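
As a rough illustration of catching the exception, here is a minimal sketch. The helper name `call_and_report` is hypothetical (it is not part of Biopython or urllib); it only shows that the numeric status is available as `err.code` on the caught `HTTPError`.

```python
from urllib.error import HTTPError


def call_and_report(func, *args, **kwargs):
    """Call func(*args, **kwargs).

    Returns (result, None) on success, or (None, status_code) if the
    call raised an HTTPError. Hypothetical helper for illustration only.
    """
    try:
        return func(*args, **kwargs), None
    except HTTPError as err:
        # err.code is the numeric HTTP status (502 in the traceback above);
        # err.reason is the accompanying text ("Bad Gateway").
        print("Request failed: HTTP %i %s" % (err.code, err.reason))
        return None, err.code


# Possible use with the call from the question (assumes Bio.Entrez is
# imported and Entrez.email is set):
#     handle, status = call_and_report(
#         Entrez.efetch, db="nucleotide", rettype="fasta", retmode="text",
#         retstart=start, retmax=batch_size,
#         webenv=webenv, query_key=query_key)
```

A status in the 500 range points at the server side, so the request itself does not need to change before trying again.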

9.4 years ago by Peter 6.0k

0

Thanks! I'll use this information to help edit my code.

9.4 years ago by jasminebro2 • 0

0

Hey. The try/except around Entrez.efetch fixed my program. Works great now. Thanks!

9.4 years ago by jasminebro2 • 0

Ram · Answer 1 · 2014-11-25

1

9.4 years ago

Peter 6.0k

When making heavy use of an online service like NCBI Entrez, you should expect to get intermittent network errors like HTTP Error 502: Bad Gateway from time to time. The standard approach would be to wrap the call in a try/except block and retry it (e.g. three retries, with a pause between each).

Or just wait and retry when NCBI is less busy (i.e. avoid US working hours); that is often easier ;)
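
The try/except-and-retry approach described above could look something like the sketch below. The function name `fetch_with_retries` and its parameters are assumptions made for illustration, not a Biopython API, and the choice to retry only on 5xx status codes is one reasonable policy rather than the only one.

```python
import time
from urllib.error import HTTPError


def fetch_with_retries(fetch, retries=3, pause=5.0, sleep=time.sleep):
    """Call fetch() and return its result, retrying on HTTP 5xx errors.

    fetch   -- zero-argument callable that performs the request
    retries -- how many extra attempts to make after the first failure
    pause   -- seconds to wait between attempts
    sleep   -- sleep function, replaceable in tests
    (All names here are hypothetical, chosen for this sketch.)
    """
    attempt = 0
    while True:
        try:
            return fetch()
        except HTTPError as err:
            # Server-side errors (500-599) such as 502 may be transient.
            # Anything else, or running out of retries, is re-raised.
            if not 500 <= err.code <= 599 or attempt >= retries:
                raise
            attempt += 1
            print("HTTP %i, retry %i of %i after %.0f s"
                  % (err.code, attempt, retries, pause))
            sleep(pause)


# Possible use inside the batch loop from the question (assumes
# Bio.Entrez is imported and the search was run with usehistory="y"):
#     fetch_handle = fetch_with_retries(
#         lambda: Entrez.efetch(db="nucleotide", rettype="fasta",
#                               retmode="text", retstart=start,
#                               retmax=batch_size, webenv=webenv,
#                               query_key=query_key))
```

Because only the single failed batch is retried, a long download does not have to start over from the first record when one request hits a 502.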