Entrez Biopython error?
7.7 years ago
Brian ▴ 10

Hi everyone. I'm trying to fetch a protein sequence with Entrez from Biopython, using the code below.

from Bio import Entrez
sec="gi|20138013|sp|Q9KQE9.1|DSBE_VIBCH"   
ids="".join((sec.split("|")[1:2]))
Entrez.email ="xxxxxx@gmail.com"
handle = Entrez.efetch(db="nucleotide", id=ids, retmode="xml")   
records = Entrez.read(handle)

A few weeks ago this code worked perfectly, but now it gives me the following error:

UnboundLocalError                         Traceback (most recent call last)
<ipython-input-21-f61a624c64e8> in <module>()
      5 Entrez.email ="!!!censored!!!"
      6 handle = Entrez.efetch(db="nucleotide", id=ids, retmode="xml")
----> 7 records = Entrez.read(handle)
      8 print ">GI "+line.rstrip()+" "+records[0]["GBSeq_primary-accession"]+" "+records[0]["GBSeq_definition"]+"\n"+str(records[0]["GBSeq_sequence"]).upper()
      9 

/usr/local/lib/python2.7/site-packages/biopython-1.65-py2.7-macosx-10.10-x86_64.egg/Bio/Entrez/__init__.pyc in read(handle, validate)
    374     from .Parser import DataHandler
    375     handler = DataHandler(validate)
--> 376     record = handler.read(handle)
    377     return record
    378 

/usr/local/lib/python2.7/site-packages/biopython-1.65-py2.7-macosx-10.10-x86_64.egg/Bio/Entrez/Parser.pyc in read(self, handle)
    203             raise IOError("Can't parse a closed handle")
    204         try:
--> 205             self.parser.ParseFile(handle)
    206         except expat.ExpatError as e:
    207             if self.parser.StartElementHandler:

/usr/local/lib/python2.7/site-packages/biopython-1.65-py2.7-macosx-10.10-x86_64.egg/Bio/Entrez/Parser.pyc in externalEntityRefHandler(self, context, base, systemId, publicId)
    511             # urls always have a forward slash, don't use os.path.join
    512             url = source.rstrip("/") + "/" + systemId
--> 513         self.dtd_urls.append(url)
    514         # First, try to load the local version of the DTD file
    515         location, filename = os.path.split(systemId)

**UnboundLocalError: local variable 'url' referenced before assignment**

Has the way to call Entrez changed, or is the input data wrong?

sequence Biopython Entrez • 3.0k views

You can use the 101010 button to format your code properly, as I have already done for part of it. I also removed your email address from the traceback.

7.7 years ago

I'm having internet issues and can't even open the damn webpage I'm suggesting you read: https://ncbiinsights.ncbi.nlm.nih.gov/2016/07/15/ncbi-is-phasing-out-sequence-gis-heres-what-you-need-to-know/

So this might be related to NCBI phasing out GI numbers.
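
Separately from the GI question, the traceback itself says something about how the error arises: line 512 of Parser.py assigns `url` inside an indented branch, while line 513 uses it unconditionally. If no branch runs, Python raises exactly this UnboundLocalError. The sketch below reproduces that pattern with a hypothetical `resolve_dtd_url` function; it is not Biopython's code, and the idea that the parser met a DTD URL it did not expect (for example a different scheme) is an assumption that nothing in this thread confirms.

```python
def resolve_dtd_url(system_id, parent_url=None):
    """Hypothetical stand-in for a DTD resolver; NOT Biopython's actual code."""
    if system_id.startswith("http://"):
        # Absolute URL: use it as-is.
        url = system_id
    elif "://" not in system_id:
        # Relative path: prepend a base URL (example.org is a placeholder).
        base = parent_url or "http://example.org/dtd/"
        url = base.rstrip("/") + "/" + system_id
    # No else branch: any other input leaves `url` unassigned.
    return url


def try_resolve(system_id):
    """Return the resolved URL, or a description of the error raised."""
    try:
        return resolve_dtd_url(system_id)
    except UnboundLocalError as exc:
        return "UnboundLocalError: %s" % exc


if __name__ == "__main__":
    print(try_resolve("http://example.org/dtd/a.dtd"))   # absolute branch
    print(try_resolve("a.dtd"))                          # relative branch
    print(try_resolve("https://example.org/dtd/a.dtd"))  # neither branch
```

If something like this is what happened, the fix belongs in the parser and not in your script, so trying a newer Biopython release would be a reasonable first check.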


Here are the two critical points about GIs from that page:

  • Any code that parses GI numbers from sequence flat files (from web, FTP, E-utilities or any other NCBI source) will break. Why? Because the GI numbers will no longer be there.
  • Any code that parses GI numbers from NCBI FASTA records (again, from any NCBI source) will break. Why? Same reason. The GI numbers will no longer be in the FASTA definition lines.
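
Given those two points, a more robust approach may be to query by accession.version instead of the GI. Here is a minimal sketch: `accession_from_identifier` and `fetch_protein_record` are hypothetical helper names, the GI-less `db|accession|name` layout is an assumption about what identifiers will look like, and `db="protein"` with `rettype="gp"` is my own choice (the `sp|` tag suggests a Swiss-Prot protein record), whereas the original post queried `db="nucleotide"`.

```python
def accession_from_identifier(identifier):
    """Extract accession.version from a pipe-delimited NCBI identifier."""
    fields = identifier.strip().lstrip(">").split("|")
    # Old style: gi|<GI>|<db>|<accession.version>|<name>
    if len(fields) >= 4 and fields[0] == "gi":
        return fields[3]
    # Assumed GI-less style: <db>|<accession.version>|<name>
    if len(fields) >= 2:
        return fields[1]
    # Bare accession with no pipes at all.
    return fields[0]


def fetch_protein_record(accession, email):
    """Fetch one protein record as parsed GBSeq XML (needs network access)."""
    # Imported here so the parsing helper above works without Biopython.
    from Bio import Entrez

    Entrez.email = email
    handle = Entrez.efetch(db="protein", id=accession,
                           rettype="gp", retmode="xml")
    try:
        return Entrez.read(handle)
    finally:
        handle.close()


if __name__ == "__main__":
    acc = accession_from_identifier("gi|20138013|sp|Q9KQE9.1|DSBE_VIBCH")
    # Replace the placeholder address with your own before running.
    records = fetch_protein_record(acc, "you@example.org")
    print(records[0]["GBSeq_primary-accession"])
    print(str(records[0]["GBSeq_sequence"]).upper())
```

The parsing helper never touches the GI field, so it should keep working whether or not the `gi|...` prefix is present.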

High-five! Accessing other webpages takes about half a minute or longer but Biostars loads quickly so I got that going for me....
