Question: Problem Parsing XML In Biopython
Amno wrote, 7.3 years ago:

Hi mates, I'm new to Python and I'm trying to parse the result of a local BLAST run. Here is the code:

from Bio.Blast.Applications import NcbiblastpCommandline
from Bio.Blast import NCBIStandalone
from Bio.Blast import NCBIXML

name = "virusprotein.fa"

data_base = "virusprot.fa"
out_file = "blast_test.xml"
blastp_cline = NcbiblastpCommandline(query=name, db=data_base, evalue=0.001, out=out_file)
print "hi" # code works until here to parse results in xml

result_handle = open("blast_test.xml")
blast_records = NCBIXML.parse(result_handle)
blast_record = blast_records.next()

#for alignment in blast_record.alignments:
    #for hsp in alignment.hsps:
         #if hsp.expect < evalue:
             #print 'Sequence:', alignment.title
            #print 'Length:', alignment.length
            #print 'E value:', hsp.expect
            #print hsp.query[0:50] + '...'
            #print hsp.match[0:50] + '...'
            #print hsp.sbjct[0:50] + '...'

As you can see in the commented-out code, it's a loop that parses the records and prints a summary. This actually works when I run BLAST over the Internet, but not when I run it locally. The code only works up to the line print "hi". When I try to execute the code below that line, it says:

Traceback (most recent call last):
  File "blast_all.py", line 14, in <module>
    blast_record = blast_records.next()
  File "/usr/local/lib/python2.7/dist-packages/biopython-1.58-py2.7-linux-i686.egg/Bio/Blast/NCBIXML.py", line 624, in parse
    % (XML_START, repr(text[:20])))
ValueError: Your XML file did not start with <?xml... but instead 'BLASTP 2.2.25+\n\n\nRef

This may be easy, but I have been stuck on it the whole day. Any suggestion is welcome. Thanks in advance.

Tags: python, biopython, blast, parsing
Brad Chapman (Boston, MA) wrote, 7.3 years ago:

The error message from the XML parser indicates that 'blast_test.xml' is not actually an XML file. If you look at the file, you should see BLAST's 'human readable' text output, which is the default format. To have blastp generate XML output, set '-outfmt' to 5:

blastp_cline = NcbiblastpCommandline(query=name, db=data_base, evalue=0.001,
                                     outfmt=5, out=out_file)

The Biopython Tutorial has a full example in section 7.2.3:

http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc86
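
To tie the pieces of this thread together, here is a minimal sketch of the whole run-then-parse workflow. It is written for Python 3 and calls the blastp executable through subprocess instead of the NcbiblastpCommandline wrapper, which is a substitution on my part and not what the snippets above use. The function names build_blastp_command and run_and_summarize are made up for illustration, and the sketch assumes that blastp is on your PATH and that Biopython is installed. Also note that the snippets above only construct the command object; with the Biopython wrapper the search runs when that object is called, e.g. stdout, stderr = blastp_cline().

```python
import subprocess


def build_blastp_command(query, db, out_file, evalue=0.001):
    """Return the blastp argument list (hypothetical helper).

    "-outfmt 5" asks BLAST+ for XML output, which is what NCBIXML expects.
    """
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "5",
        "-out", out_file,
    ]


def run_and_summarize(query, db, out_file, evalue=0.001):
    """Run blastp, then print a short summary of every HSP below the cutoff."""
    # Imported here so that building the command does not require Biopython.
    from Bio.Blast import NCBIXML

    # check=True raises CalledProcessError if blastp exits with an error.
    subprocess.run(build_blastp_command(query, db, out_file, evalue), check=True)

    with open(out_file) as result_handle:
        # NCBIXML.parse yields one record per query sequence.
        for blast_record in NCBIXML.parse(result_handle):
            for alignment in blast_record.alignments:
                for hsp in alignment.hsps:
                    if hsp.expect < evalue:
                        print("Sequence:", alignment.title)
                        print("Length:", alignment.length)
                        print("E value:", hsp.expect)
```

With the file names from the question, the call would be run_and_summarize("virusprotein.fa", "virusprot.fa", "blast_test.xml"). Looping over NCBIXML.parse instead of taking only the first record handles query files that contain more than one sequence.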


Thank you, man, you're great.

Reply written 7.3 years ago by Amno
Chris Maloney (Bethesda, MD) wrote, 7.3 years ago:

First rule of debugging code: "Read the error message!"
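
In that spirit, the check the parser complained about is easy to reproduce by hand. The sketch below is only an illustration in Python 3; starts_with_xml_declaration and write_temp are made-up names, and the two sample files are stand-ins written to a temporary location, not real BLAST results. It tests whether a file begins with <?xml, which is the condition named in the ValueError from the question.

```python
import os
import tempfile


def starts_with_xml_declaration(path):
    """Return True if the file begins with the bytes '<?xml'."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"<?xml"


def write_temp(text):
    """Hypothetical demo helper: write text to a temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path


# Stand-in for default text output: starts with the program banner.
plain_path = write_temp("BLASTP 2.2.25+\n\n\n")
# Stand-in for XML output: starts with an XML declaration.
xml_path = write_temp('<?xml version="1.0"?>\n<BlastOutput>\n</BlastOutput>\n')

print(starts_with_xml_declaration(plain_path))  # False
print(starts_with_xml_declaration(xml_path))    # True

os.remove(plain_path)
os.remove(xml_path)
```

Running a check like this on 'blast_test.xml' before parsing would have pointed straight at the output format as the problem.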
