Question: Problems with Biopython when running the NCBIStandalone.py program
6.4 years ago by
saima10
saima10 wrote:

Hi, I am having a problem while running the NCBIStandalone module of Biopython. I want to retrieve sequence titles from BLAST output, but it raises an error when iterating.

Following is the code to retrieve these sequence titles.

from Bio.Blast import NCBIStandalone

result_handle = open("foo.txt")
blast_parser = NCBIStandalone.BlastParser()
blast_iterator = NCBIStandalone.Iterator(result_handle, blast_parser)
print blast_iterator
for blast_record in blast_iterator:
  print blast_record
  E_VALUE_THRESH = 0.0
  for alignment in blast_record.alignments:
      for hsp in alignment.hsps:
          if hsp.expect < E_VALUE_THRESH:
              print 'sequence:', alignment.title

But it gives the following error. The iterator is created properly, but looping over it fails.

<Bio.Blast.NCBIStandalone.Iterator object at 0x1aa6450>
Traceback (most recent call last):
  File "blast2.py", line 45, in <module>
    for blast_record in blast_iterator:
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 1659, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 818, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 112, in feed
    read_and_call_until(uhandle, consumer.noevent, contains='BLAST')
  File "/usr/lib/pymodules/python2.7/Bio/ParserSupport.py", line 337, in read_and_call_until
    line = safe_readline(uhandle)
  File "/usr/lib/pymodules/python2.7/Bio/ParserSupport.py", line 413, in safe_readline
    raise ValueError("Unexpected end of stream.")
ValueError: Unexpected end of stream.

Any help would be highly appreciated. Thanks, Saima

biopython blast • 2.2k views
modified 5.6 years ago by Biostar ♦♦ 20 • written 6.4 years ago by saima10

The message "Unexpected end of stream." means the parser reached the end of the file before it expected to. Either your file is truncated, or the format has changed slightly (again). I hope you have read the tutorial, which warns that the plain text parser is fragile, so this is (almost) to be expected if you try the latest BLAST release.
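The traceback in the question shows the scanner reading lines until it finds one containing 'BLAST' and running out of input first. A quick sanity check, sketched here in Python 3 with only the standard library, is to confirm that the file is non-empty and actually contains such a line before handing it to the parser. The helper name find_blast_header and the sample contents are assumptions for illustration, not part of Biopython.

```python
import io


def find_blast_header(handle, marker="BLAST"):
    """Return the 1-based number of the first line containing marker, else None.

    Hypothetical helper for illustration; not a Biopython function.
    """
    for number, line in enumerate(handle, start=1):
        if marker in line:
            return number
    return None


# In-memory stand-ins for real files (assumed content, for illustration only).
good = io.StringIO("BLASTN 2.2.27+\n\nQuery= seq1\n")
empty = io.StringIO("")

print(find_blast_header(good))   # line number of the header line
print(find_blast_header(empty))  # None: no header line to find
```

With a real file, pass open("foo.txt") instead of the StringIO objects. A result of None suggests the file is empty, truncated, or not plain text BLAST output at all.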

written 6.4 years ago by Peter5.8k
6.4 years ago by
Damian Kao15k
USA
Damian Kao15k wrote:

The NCBIStandalone.Iterator object can't be looped over with the "for x in list" syntax. You need to use the .next() method to loop through the file. So something like this:

while True:
    record = blast_iterator.next()
    if record is None:
        # next() returns None once the file is exhausted
        break

    # do stuff with your record

Yeah, the syntax is not very pythonic and kinda ugly.

modified 6.4 years ago • written 6.4 years ago by Damian Kao15k

Thanks for your help, but record = blast_iterator.next() gives the same error.

Traceback (most recent call last):
  File "blast2.py", line 46, in <module>
    record = blast_iterator.next()
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 1659, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 818, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 112, in feed
    read_and_call_until(uhandle, consumer.noevent, contains='BLAST')
  File "/usr/lib/pymodules/python2.7/Bio/ParserSupport.py", line 337, in read_and_call_until
    line = safe_readline(uhandle)
  File "/usr/lib/pymodules/python2.7/Bio/ParserSupport.py", line 413, in safe_readline
    raise ValueError("Unexpected end of stream.")
ValueError: Unexpected end of stream.

I don't know what the reason is.

modified 6.4 years ago • written 6.4 years ago by saima10

Ugly, yes. Actually it is just old-fashioned Python code, written before iterators were made easier to use. This module will probably be formally deprecated, but see also the forthcoming Biopython SearchIO module, which will offer a more consistent API: http://biopython.org/wiki/SearchIO
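For readers curious about the old convention, here is a minimal Python 3 sketch of how a next()-returns-None style object can still be driven by a for loop, using the built-in two-argument form iter(callable, sentinel). The LegacyIterator class is a hypothetical stand-in, not Biopython code, and this does not address the ValueError in this thread, which is raised while parsing.

```python
class LegacyIterator:
    """Hypothetical old-style iterator: next() returns None when exhausted."""

    def __init__(self, items):
        self._items = list(items)
        self._position = 0

    def next(self):
        if self._position >= len(self._items):
            return None  # old convention: signal the end with None
        item = self._items[self._position]
        self._position += 1
        return item


legacy = LegacyIterator(["record1", "record2"])

# iter(callable, sentinel) calls legacy.next() repeatedly and stops
# as soon as it returns the sentinel value (None here).
records = [record for record in iter(legacy.next, None)]
print(records)
```

One caveat of this pattern: if None could ever be a legitimate item, the loop would stop early, so it only suits objects that reserve None for "no more records".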

written 6.4 years ago by Peter5.8k
6.4 years ago by
bow780
Netherlands
bow780 wrote:

Looks like a problem with the BLAST output file you're trying to parse. I tried your snippet with my own sample file and it works ok. Can you provide a sample output or attach the file? What BLAST version did you use to generate the file?

Also, your snippet sets a threshold limit of '< 0.0'. You will have to increase this limit if you want to see any sequence alignment titles, since you can't have negative E-values.
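To make the threshold point concrete, here is a small self-contained Python 3 sketch. The hit list is made-up data standing in for parsed alignments, and the 0.04 cutoff is only an assumed example value, not a recommendation.

```python
# Hypothetical (title, E-value) pairs standing in for parsed BLAST hits.
hits = [
    ("gi|1| example sequence A", 0.0),
    ("gi|2| example sequence B", 1e-30),
    ("gi|3| example sequence C", 0.5),
]


def titles_below(hits, threshold):
    """Return titles of hits whose E-value is strictly below threshold."""
    return [title for title, expect in hits if expect < threshold]


print(titles_below(hits, 0.0))   # empty: nothing is strictly below zero
print(titles_below(hits, 0.04))  # an assumed, more permissive cutoff
```

Even a hit with an E-value of exactly 0.0 is excluded by a strict "< 0.0" test, which is why the original loop could never print a title.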

written 6.4 years ago by bow780

The BLAST version is BLASTN 2.2.27+. The output file is quite large and I don't know how to upload it, sorry. (Can you please help me with that?)

written 6.4 years ago by saima10

The simplest answer is the one recommended in the Biopython tutorial: don't use the plain text BLAST output. The XML is very detailed, but for many tasks the simple BLAST tabular output is smaller and easier to work with.

However, if there is a new problem with plain text from BLASTN 2.2.27+ we can try to fix the parser.
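As a rough illustration of the tabular route mentioned above, the following Python 3 sketch reads tab-separated BLAST+ output with the standard csv module and keeps rows below an E-value cutoff. It assumes the default twelve columns of -outfmt 6; the function name, the sample rows, and the 0.04 cutoff are made up for the example. Note that the default columns carry subject IDs rather than full titles.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def subjects_below(handle, threshold):
    """Yield (query id, subject id, E-value) for rows with E-value below threshold."""
    reader = csv.reader(handle, delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines (as in -outfmt 7)
        record = dict(zip(COLUMNS, row))
        evalue = float(record["evalue"])
        if evalue < threshold:
            yield record["qseqid"], record["sseqid"], evalue


# Made-up rows for illustration; real input would be an open file handle.
sample = io.StringIO(
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-50\t350\n"
    "q1\ts2\t80.0\t50\t10\t0\t1\t50\t7\t56\t2.5\t30.1\n"
)
for query, subject, evalue in subjects_below(sample, 0.04):
    print(query, subject, evalue)
```

Because each hit is one line, a truncated file costs at most the last row instead of making the whole parse fail.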

modified 6.4 years ago • written 6.4 years ago by Peter5.8k