Hi there,
I'm trying to use BioPython to parse the XML output generated by psiblast (local execution, -outfmt 5). I want to identify the iteration round of each record, which should be one of the attributes of the Bio.Blast.Record.PSIBlast class, so I can extract the information for the last iteration only. However, I'm failing to extract this 'round' information. Moreover, I think the parser is not guessing the class correctly, since Bio.Blast.Record.PSIBlast and Bio.Blast.Record.Blast are two different classes:
from Bio.Blast import NCBIXML
# read the xml
handle = open("test.xml")
# parse the file to obtain 'records'
records = NCBIXML.parse(handle)
# print the records to see what's going on
for record in records:
    print(record)
>>> <Bio.Blast.Record.Blast object at 0x7fce35a6f6d0>
>>> <Bio.Blast.Record.Blast object at 0x7fce359ba650>
>>> <Bio.Blast.Record.Blast object at 0x7fce358c1610>
I have also tried the xml2 output from psiblast (-outfmt 16) with the same result.
According to the class diagrams shown in the documentation, I should be able to retrieve the round information if the object is of the Bio.Blast.Record.PSIBlast class. On the other hand, I'm able to retrieve the 'alignments' information:
for record in records:
    print(record.rounds)
>>> AttributeError: 'Blast' object has no attribute 'rounds'
for record in records:
    print(record.alignments)
>>> <Bio.Blast.Record.Alignment object at 0x7fce35fee390>, <Bio.Blast.Record.Alignment object at 0x7 ...
I'm sure I must be missing something. 'rounds' is supposed to be one of the attributes of the Bio.Blast.Record.PSIBlast class, so I don't know why I cannot retrieve this information.
Any advice is welcome.
Can you please post a snapshot of the XML?
Thank you for your response, Pierre. Sure, you can look at or download it here.
There is no "round" in the XML, but there is Iteration_iter-num.
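In case it helps, here is a minimal sketch of reading Iteration_iter-num straight from the XML with Python's standard library and keeping only the last iteration. It does not use the Biopython parser at all. The embedded sample, the hit IDs, and the helper names last_iteration and hit_ids are made up for illustration, and the sketch assumes a single query per file (with several queries you would probably need to group the iterations by query first).

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a psiblast -outfmt 5 file.
# The hit IDs are invented; a real file has many more elements per hit.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_hits>
        <Hit><Hit_id>hypothetical_A</Hit_id></Hit>
      </Iteration_hits>
    </Iteration>
    <Iteration>
      <Iteration_iter-num>2</Iteration_iter-num>
      <Iteration_hits>
        <Hit><Hit_id>hypothetical_A</Hit_id></Hit>
        <Hit><Hit_id>hypothetical_B</Hit_id></Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def last_iteration(root):
    """Return the <Iteration> element with the highest Iteration_iter-num."""
    iterations = root.findall("./BlastOutput_iterations/Iteration")
    return max(iterations, key=lambda it: int(it.findtext("Iteration_iter-num")))


def hit_ids(iteration):
    """Collect the Hit_id text of every hit reported in one iteration."""
    return [hit.findtext("Hit_id") for hit in iteration.findall("./Iteration_hits/Hit")]


root = ET.fromstring(SAMPLE_XML)
final = last_iteration(root)
print(final.findtext("Iteration_iter-num"), hit_ids(final))
```

For a real file, replacing ET.fromstring(SAMPLE_XML) with ET.parse("test.xml").getroot() should give the same kind of root element.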
You are right, there is no 'round' in the XML, but there is no 'alignments' node either, and yet I can get it from the record. I assume the NCBIXML parser reads the XML and assigns the XML nodes to objects of the Bio.Blast.Record.Blast or Bio.Blast.Record.PSIBlast class. These classes and their attributes are depicted here.