9.0 years ago
jens.einloft
Hello,
I am using Biopython to find the pathways in which certain proteins are involved. To do that, I use the following code:
from Bio import Entrez
handle = Entrez.efetch(id = "1134002", db = "biosystems", retmode = "xml")
data = Entrez.read(handle)
handle.close()
This raises the following error:
File "/home/jens/Desktop/pathways.py", line 19, in <module>
data = Entrez.read(handle)
File "/usr/lib/python2.7/dist-packages/Bio/Entrez/__init__.py", line 372, in read
record = handler.read(handle)
File "/usr/lib/python2.7/dist-packages/Bio/Entrez/Parser.py", line 187, in read
self.parser.ParseFile(handle)
File "/usr/lib/python2.7/dist-packages/Bio/Entrez/Parser.py", line 486, in externalEntityRefHandler
self.dtd_urls.append(url)
UnboundLocalError: local variable 'url' referenced before assignment
I took a look at Parser.py and found this method:
def externalEntityRefHandler(self, context, base, systemId, publicId):
    """The purpose of this function is to load the DTD locally, instead
    of downloading it from the URL specified in the XML. Using the local
    DTD results in much faster parsing. If the DTD is not found locally,
    we try to download it. If new DTDs become available from NCBI,
    putting them in Bio/Entrez/DTDs will allow the parser to see them."""
    urlinfo = _urlparse(systemId)
    #Following attribute requires Python 2.5+
    #if urlinfo.scheme=='http':
    if urlinfo[0]=='http':
        # Then this is an absolute path to the DTD.
        url = systemId
    elif urlinfo[0]=='':
        # Then this is a relative path to the DTD.
        # Look at the parent URL to find the full path.
        try:
            url = self.dtd_urls[-1]
        except IndexError:
            # Assume the default URL for DTDs if the top parent
            # does not contain an absolute path
            source = "http://www.ncbi.nlm.nih.gov/dtd/"
        else:
            source = os.path.dirname(url)
        # urls always have a forward slash, don't use os.path.join
        url = source.rstrip("/") + "/" + systemId
    self.dtd_urls.append(url)
After a little debugging I found the cause. In my case, urlinfo[0] is "ftp". Neither branch of the if/elif construct handles that scheme, so the local variable url is never assigned before self.dtd_urls.append(url) is reached.
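The failure can be reproduced outside Biopython with a minimal standalone sketch. The function `resolve_dtd_url` below is a hypothetical helper that only copies the branch logic quoted above, and the FTP address is a made-up placeholder, not the DTD location NCBI actually returns. The second function, `resolve_dtd_url_patched`, shows one possible workaround (treating any non-empty scheme as an absolute URL); this is an assumption for illustration, not necessarily how Biopython itself handles it.

```python
import os
from urllib.parse import urlparse  # Python 3 import; the quoted code targets Python 2

DEFAULT_DTD_BASE = "http://www.ncbi.nlm.nih.gov/dtd/"


def resolve_dtd_url(system_id, dtd_urls):
    """Hypothetical standalone copy of the branch logic quoted above."""
    scheme = urlparse(system_id)[0]
    if scheme == "http":
        # Absolute path to the DTD.
        url = system_id
    elif scheme == "":
        # Relative path: resolve against the parent URL, if there is one.
        try:
            url = dtd_urls[-1]
        except IndexError:
            source = DEFAULT_DTD_BASE
        else:
            source = os.path.dirname(url)
        url = source.rstrip("/") + "/" + system_id
    # For any other scheme (e.g. "ftp") neither branch ran,
    # so `url` is unbound here and the next line raises UnboundLocalError.
    dtd_urls.append(url)
    return url


def resolve_dtd_url_patched(system_id, dtd_urls):
    """Possible workaround (an assumption, not the official fix):
    treat every non-empty scheme as an absolute URL."""
    scheme = urlparse(system_id)[0]
    if scheme:
        url = system_id
    else:
        source = os.path.dirname(dtd_urls[-1]) if dtd_urls else DEFAULT_DTD_BASE
        url = source.rstrip("/") + "/" + system_id
    dtd_urls.append(url)
    return url


if __name__ == "__main__":
    placeholder = "ftp://example.org/dtd/example.dtd"  # hypothetical address
    try:
        resolve_dtd_url(placeholder, [])
    except UnboundLocalError as err:
        print("original logic fails:", err)
    print("patched logic returns:", resolve_dtd_url_patched(placeholder, []))
```

Relative and http system IDs behave the same in both versions; only the unhandled schemes differ.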
Is this a bug in Biopython, or am I using it incorrectly?