Question: Biopython: Entrez.efetch causes UnboundLocalError
Asked 4.0 years ago by jens.einloft (Germany):

Hello,

I work with Biopython and am trying to find the pathways in which certain proteins are involved. To do that, I use the following code:

from Bio import Entrez
handle = Entrez.efetch(id = "1134002", db = "biosystems", retmode = "xml")
data = Entrez.read(handle)    
handle.close()   

This raises the following error:

  File "/home/jens/Desktop/pathways.py", line 19, in <module>
    data = Entrez.read(handle)    
  File "/usr/lib/python2.7/dist-packages/Bio/Entrez/__init__.py", line 372, in read
    record = handler.read(handle)
  File "/usr/lib/python2.7/dist-packages/Bio/Entrez/Parser.py", line 187, in read
    self.parser.ParseFile(handle)
  File "/usr/lib/python2.7/dist-packages/Bio/Entrez/Parser.py", line 486, in externalEntityRefHandler
    self.dtd_urls.append(url)
UnboundLocalError: local variable 'url' referenced before assignment

 

I took a look at Parser.py and found this:

def externalEntityRefHandler(self, context, base, systemId, publicId):
    """The purpose of this function is to load the DTD locally, instead
    of downloading it from the URL specified in the XML. Using the local
    DTD results in much faster parsing. If the DTD is not found locally,
    we try to download it. If new DTDs become available from NCBI,
    putting them in Bio/Entrez/DTDs will allow the parser to see them."""
    urlinfo = _urlparse(systemId)
    #Following attribute requires Python 2.5+
    #if urlinfo.scheme=='http':
    if urlinfo[0]=='http':
        # Then this is an absolute path to the DTD.
        url = systemId
    elif urlinfo[0]=='':
        # Then this is a relative path to the DTD.
        # Look at the parent URL to find the full path.
        try:
            url = self.dtd_urls[-1]
        except IndexError:
            # Assume the default URL for DTDs if the top parent
            # does not contain an absolute path
            source = "http://www.ncbi.nlm.nih.gov/dtd/"
        else:
            source = os.path.dirname(url)
        # urls always have a forward slash, don't use os.path.join
        url = source.rstrip("/") + "/" + systemId
    self.dtd_urls.append(url)

I did a bit of debugging and found the cause. In my case, urlinfo[0] is "ftp". Neither branch of the if/elif construct handles that scheme, so the local variable url is never assigned before self.dtd_urls.append(url) runs.
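
The failure mode can be reproduced outside Biopython. The sketch below is a simplified stand-in for the if/elif shown above, not the library code itself: the function name resolve_dtd_url_buggy and the ftp URL are made-up placeholders (the real systemId in the biosystems response is not shown here), and it uses Python 3's urllib.parse, whereas the traceback above comes from Python 2.7.

```python
from urllib.parse import urlparse


def resolve_dtd_url_buggy(system_id, dtd_urls):
    """Hypothetical, stripped-down copy of the branching in the handler."""
    urlinfo = urlparse(system_id)
    if urlinfo[0] == "http":
        # Absolute http URL: use it as-is.
        url = system_id
    elif urlinfo[0] == "":
        # Relative path: simplified here to always use the default base.
        url = "http://www.ncbi.nlm.nih.gov/dtd/" + system_id
    # No else branch: for any other scheme, url is never assigned,
    # so the next line raises UnboundLocalError.
    dtd_urls.append(url)
    return url


# Placeholder URL, chosen only because its scheme is "ftp".
example = "ftp://example.org/dtd/sample.dtd"
print("scheme:", urlparse(example)[0])

try:
    resolve_dtd_url_buggy(example, [])
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

Any scheme other than "http" or the empty string takes the same path, so "https" would presumably fail in the same way.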

Is this a bug in Biopython, or am I using it the wrong way?

Answer by Peter (Scotland, UK), written 4.0 years ago:

It's a bug, reported about the same time as your question here:

https://github.com/biopython/biopython/issues/527
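
For illustration only, here is one way the branching could avoid the unassigned variable: treat any system ID that carries a scheme as absolute, and resolve only scheme-less IDs against the parent DTD. This is a sketch under that assumption and not necessarily how the linked issue was resolved in Biopython. The names resolve_dtd_url and DEFAULT_DTD_BASE are hypothetical, the example URLs are placeholders, and the code targets Python 3.

```python
import posixpath
from urllib.parse import urlparse

# Default base taken from the quoted handler code.
DEFAULT_DTD_BASE = "http://www.ncbi.nlm.nih.gov/dtd/"


def resolve_dtd_url(system_id, dtd_urls):
    """Return the full DTD URL and record it in dtd_urls (hypothetical helper)."""
    scheme = urlparse(system_id)[0]
    if scheme:
        # Any scheme (http, https, ftp, ...) means the ID is already absolute.
        url = system_id
    else:
        # Relative ID: resolve against the parent DTD if there is one,
        # otherwise against the default NCBI DTD location.
        if dtd_urls:
            source = posixpath.dirname(dtd_urls[-1])
        else:
            source = DEFAULT_DTD_BASE
        # URLs always use forward slashes, so join by hand.
        url = source.rstrip("/") + "/" + system_id
    dtd_urls.append(url)
    return url


seen = []
print(resolve_dtd_url("ftp://example.org/dtd/parent.dtd", seen))
print(resolve_dtd_url("child.dtd", seen))
```

With this structure every path through the function assigns url before it is appended, whatever the scheme is.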
