Question: Problem Parsing XML in Biopython
7.3 years ago by
Amno10 wrote:

Hi mates, I'm new to Python and I'm trying to parse a result from a local BLAST run. Here is the code:

from Bio.Blast.Applications import NcbiblastpCommandline
from Bio.Blast import NCBIStandalone
from Bio.Blast import NCBIXML

name = "virusprotein.fa"

data_base = "virusprot.fa"
out_file = "blast_test.xml"
blastp_cline = NcbiblastpCommandline(query=name, db=data_base, evalue=0.001, out=out_file)
print "hi" # code works until here to parse results in xml

result_handle = open("blast_test.xml")
blast_records = NCBIXML.parse(result_handle)
blast_record =

#for alignment in blast_record.alignments:
    #for hsp in alignment.hsps:
         #if hsp.expect < evalue:
             #print 'Sequence:', alignment.title
            #print 'Length:', alignment.length
            #print 'E value:', hsp.expect
            #print hsp.query[0:50] + '...'
            #print hsp.match[0:50] + '...'
            #print hsp.sbjct[0:50] + '...'

As you can see in the commented code, it uses a module for parsing and a loop to print a summary. This works when I run BLAST over the Internet, but not when I run it locally. The code only works up to (print "hi"). When I try to execute the code below that line, it says:

Traceback (most recent call last):
  File "", line 14, in <module>
    blast_record =
  File "/usr/local/lib/python2.7/dist-packages/biopython-1.58-py2.7-linux-i686.egg/Bio/Blast/", line 624, in parse
    % (XML_START, repr(text[:20])))
ValueError: Your XML file did not start with <?xml... but instead 'BLASTP 2.2.25+\n\n\nRef

This may be easy, but I have been stuck on it the whole day. Any suggestion is welcome. Thanks in advance.

python biopython blast parsing • 4.7k views
7.3 years ago by
Brad Chapman (Boston, MA) wrote:

The error message from the XML parser indicates that 'blast_test.xml' is not actually an XML file. If you look at the file, you should see BLAST 'human readable' output. To have blastp generate XML output, set '-outfmt' to 5:

blastp_cline = NcbiblastpCommandline(query=name, db=data_base, evalue=0.001,
                                     outfmt=5, out=out_file)
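
For reference, here is a minimal sketch of what those keyword arguments roughly correspond to on the blastp command line. It only assembles the argument list; `build_blastp_args` is a hypothetical helper written for illustration, not part of Biopython, and the file names are the ones from the question.

```python
# Hypothetical helper (not a Biopython function): assemble the blastp call
# as a plain argument list so the role of -outfmt is visible.
def build_blastp_args(query, db, out, evalue=0.001, outfmt=5):
    """Return the blastp argument list; outfmt=5 requests XML output."""
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", str(outfmt),  # 5 = XML, the format NCBIXML.parse expects
        "-out", out,
    ]


args = build_blastp_args("virusprotein.fa", "virusprot.fa", "blast_test.xml")
print(" ".join(args))

# To actually run the search (assumes BLAST+ is installed and on the PATH):
# import subprocess
# subprocess.check_call(args)
```

Without `-outfmt`, blastp writes its default pairwise text report, which is what the XML parser rejected in the traceback above.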

The Biopython Tutorial has a full example in section 7.2.3:


Thank you, man, you're great.

7.3 years ago by Amno10
7.3 years ago by
Chris Maloney (Bethesda, MD) wrote:

First rule in debugging code: "Read the error message!".
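
Here the ValueError already quotes the first 20 characters of the file ('BLASTP 2.2.25+\n\n\nRef'), which is plain-text BLAST output rather than XML. The same check can be done by hand before parsing. This is only a sketch: `peek` and `looks_like_xml` are hypothetical helpers, `XML_START` is a local constant defined here, and the temporary files stand in for real BLAST output.

```python
import os
import tempfile

XML_START = b"<?xml"  # local constant for this sketch


def peek(path, n=20):
    """Return the first n bytes of a file."""
    with open(path, "rb") as handle:
        return handle.read(n)


def looks_like_xml(path):
    """True if the file begins with an XML declaration."""
    return peek(path, len(XML_START)) == XML_START


def write_temp(content):
    """Write bytes to a temporary file and return its path (demo only)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(content)
    return path


# Stand-ins for a default text report and an XML report.
text_report = write_temp(b"BLASTP 2.2.25+\n\n\nRef")
xml_report = write_temp(b'<?xml version="1.0"?>\n<BlastOutput>\n</BlastOutput>\n')

text_is_xml = looks_like_xml(text_report)
xml_is_xml = looks_like_xml(xml_report)
print(text_is_xml, xml_is_xml)

os.remove(text_report)
os.remove(xml_report)
```

If the check fails, printing `peek(path)` shows what kind of file was actually written.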

