Question: Entrez Biopython error?
2.5 years ago by
Brian10
Brian10 wrote:

Hi everyone. I'm trying to get a protein sequence using Entrez from Biopython with the code below.

from Bio import Entrez

sec = "gi|20138013|sp|Q9KQE9.1|DSBE_VIBCH"
# Second pipe-delimited field is the GI number ("20138013")
ids = sec.split("|")[1]
Entrez.email = "xxxxxx@gmail.com"
handle = Entrez.efetch(db="nucleotide", id=ids, retmode="xml")
records = Entrez.read(handle)

A few weeks ago this code worked perfectly, but now it gives me the following error:

UnboundLocalError                         Traceback (most recent call last)
<ipython-input-21-f61a624c64e8> in <module>()
      5 Entrez.email ="!!!censored!!!"
      6 handle = Entrez.efetch(db="nucleotide", id=ids, retmode="xml")
----> 7 records = Entrez.read(handle)
      8 print ">GI "+line.rstrip()+" "+records[0]["GBSeq_primary-accession"]+" "+records[0]["GBSeq_definition"]+"\n"+str(records[0]["GBSeq_sequence"]).upper()
      9 

/usr/local/lib/python2.7/site-packages/biopython-1.65-py2.7-macosx-10.10-x86_64.egg/Bio/Entrez/__init__.pyc in read(handle, validate)
    374     from .Parser import DataHandler
    375     handler = DataHandler(validate)
--> 376     record = handler.read(handle)
    377     return record
    378 

/usr/local/lib/python2.7/site-packages/biopython-1.65-py2.7-macosx-10.10-x86_64.egg/Bio/Entrez/Parser.pyc in read(self, handle)
    203             raise IOError("Can't parse a closed handle")
    204         try:
--> 205             self.parser.ParseFile(handle)
    206         except expat.ExpatError as e:
    207             if self.parser.StartElementHandler:

/usr/local/lib/python2.7/site-packages/biopython-1.65-py2.7-macosx-10.10-x86_64.egg/Bio/Entrez/Parser.pyc in externalEntityRefHandler(self, context, base, systemId, publicId)
    511             # urls always have a forward slash, don't use os.path.join
    512             url = source.rstrip("/") + "/" + systemId
--> 513         self.dtd_urls.append(url)
    514         # First, try to load the local version of the DTD file
    515         location, filename = os.path.split(systemId)

**UnboundLocalError: local variable 'url' referenced before assignment**

Has the way to call Entrez changed, or is the input data wrong?

Tags: entrez, biopython, sequence
modified 2.5 years ago by genomax62k • written 2.5 years ago by Brian10

You can use the 101010 button to format your code properly, as I have already done for part of it. I also removed your email address from the traceback.

written 2.5 years ago by WouterDeCoster36k
2.5 years ago by
Belgium
WouterDeCoster36k wrote:

I'm having some internet issues and can't even open the damn webpage I'm suggesting you read: https://ncbiinsights.ncbi.nlm.nih.gov/2016/07/15/ncbi-is-phasing-out-sequence-gis-heres-what-you-need-to-know/

So this might be related to NCBI phasing out GI numbers.
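
If the GI phase-out is indeed the cause, one possible workaround is to query by accession.version rather than by GI. The following is only a minimal sketch: it assumes the `protein` database is the right target for this Swiss-Prot entry (the original code queried `nucleotide`), the helper names `efetch_params` and `fetch_protein` are hypothetical, and nothing in this thread confirms that the change resolves the UnboundLocalError.

```python
def efetch_params(accession):
    """Build keyword arguments for Entrez.efetch keyed on accession.version."""
    return {
        "db": "protein",   # assumption: Q9KQE9.1 is a protein record
        "id": accession,   # accession.version instead of a GI number
        "rettype": "gp",   # GenPept; combined with retmode="xml" gives GBSeq XML
        "retmode": "xml",
    }


def fetch_protein(accession, email):
    """Fetch one record from NCBI and return it as a FASTA-style string."""
    # Imported here so efetch_params() can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email
    handle = Entrez.efetch(**efetch_params(accession))
    try:
        records = Entrez.read(handle)
    finally:
        handle.close()
    rec = records[0]
    # Key names are the ones used in the print statement shown in the traceback.
    return ">{} {}\n{}".format(
        rec["GBSeq_primary-accession"],
        rec["GBSeq_definition"],
        str(rec["GBSeq_sequence"]).upper(),
    )
```

Calling `fetch_protein("Q9KQE9.1", "you@example.com")` needs network access and a working Biopython install; the email address here is a placeholder.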

written 2.5 years ago by WouterDeCoster36k

Here are the two critical points about GIs from that page:

  • Any code that parses GI numbers from sequence flat files (from web, FTP, E-utilities or any other NCBI source) will break. Why? Because the GI numbers will no longer be there.
  • Any code that parses GI numbers from NCBI FASTA records (again, from any NCBI source) will break. Why? Same reason. The GI numbers will no longer be in the FASTA definition lines.
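
To make parsing code survive the change described in these two points, one option is to extract the accession.version and treat the GI prefix as optional. This is a rough sketch, not an official NCBI or Biopython routine: the function name `accession_from_defline` is hypothetical, and it assumes an identifier is either pipe-delimited (with or without a leading `gi|<number>|`) or starts with a bare accession.version.

```python
def accession_from_defline(defline):
    """Return accession.version from an NCBI-style FASTA identifier."""
    # Keep only the identifier: drop ">" and any trailing description.
    first_token = defline.strip().lstrip(">").split(None, 1)
    fields = first_token[0].split("|") if first_token else []

    # Old-style identifiers start with "gi|<number>|"; drop that pair.
    if fields and fields[0] == "gi":
        fields = fields[2:]

    if len(fields) == 1 and fields[0]:
        return fields[0]   # bare accession.version
    if len(fields) >= 2 and fields[1]:
        return fields[1]   # <db tag>|<accession.version>|<name>
    raise ValueError("no accession found in %r" % defline)
```

With the identifier from the question, `accession_from_defline("gi|20138013|sp|Q9KQE9.1|DSBE_VIBCH")` and `accession_from_defline("sp|Q9KQE9.1|DSBE_VIBCH")` both give `"Q9KQE9.1"`, so the same code works whether or not the GI is present.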
modified 2.5 years ago • written 2.5 years ago by genomax62k

High-five! Accessing other webpages takes about half a minute or longer but Biostars loads quickly so I got that going for me....

written 2.5 years ago by WouterDeCoster36k