I am trying to write a Biopython script that performs a BLAST search and returns the ID of the first hit. Although my Python skills are rather rudimentary, this seems to work. But for my specific purpose it would be better if the script returned the first hit according to BLAST score, not E-value. Is it possible to sort the BLAST output by score first? Cheers, David
In your code snippet, record.alignments is just a list, so you can sort it, for example using each alignment's highest HSP score as the sort key. Try:
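    from Bio.Blast import NCBIXML

    handle = ...
    for record in NCBIXML.parse(handle):
        if record.alignments:
            # Re-sort by best HSP score rather than e-value
            # (hsp.score is the raw score; use hsp.bits for the bit score)
            record.alignments.sort(key=lambda align: -max(hsp.score for hsp in align.hsps))
            # After sorting, the first alignment is the top-scoring hit
            print(record.alignments[0].hit_id)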
Perhaps too much magic? I'm sorting on the negative of the max score as a shorthand for using the sort method's reverse=True option, i.e.
    from Bio.Blast import NCBIXML

    handle = ...
    for record in NCBIXML.parse(handle):
        if record.alignments:
            # Re-sort by best HSP score rather than e-value
            # (hsp.score is the raw score; use hsp.bits for the bit score)
            record.alignments.sort(key=lambda align: max(hsp.score for hsp in align.hsps), reverse=True)
            # After sorting, the first alignment is the top-scoring hit
            print(record.alignments[0].hit_id)
The other tricks here are using a lambda (an anonymous, unnamed function) and a simple generator expression to find the best HSP score.
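To see that the negated key and reverse=True give the same order, here is a small self-contained sketch. It does not use real Biopython objects: Hsp and Alignment are namedtuple stand-ins and the scores are made up, so it runs without any BLAST output.

```python
from collections import namedtuple

# Stand-ins for Biopython's alignment and HSP objects (hypothetical, illustration only).
Hsp = namedtuple("Hsp", ["score"])
Alignment = namedtuple("Alignment", ["hit_id", "hsps"])


def make_alignments():
    # Made-up scores, listed in an arbitrary starting order.
    return [
        Alignment("hit_A", [Hsp(120.0), Hsp(80.0)]),
        Alignment("hit_B", [Hsp(300.0)]),
        Alignment("hit_C", [Hsp(120.0)]),
    ]


def best_score(align):
    # Generator expression: highest score among this alignment's HSPs.
    return max(hsp.score for hsp in align.hsps)


# Variant 1: ascending sort on the negated score.
by_negation = make_alignments()
by_negation.sort(key=lambda align: -best_score(align))

# Variant 2: descending sort on the score itself.
by_reverse = make_alignments()
by_reverse.sort(key=best_score, reverse=True)

ids_negation = [a.hit_id for a in by_negation]
ids_reverse = [a.hit_id for a in by_reverse]
print(ids_negation)
print(ids_reverse)
```

Both variants keep tied alignments (hit_A and hit_C here) in their original relative order, because Python's sort is stable with or without reverse=True.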
Note that this may not do what you want if there is a good hit that is split into two HSPs, since only the best single HSP of each alignment is considered.
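One possible way around the split-hit case is to rank by the sum of the HSP scores rather than the maximum. The sketch below uses namedtuple stand-ins and invented scores, not real BLAST results. Summing can over-count when HSPs overlap, so treat it as one heuristic among several rather than the correct answer.

```python
from collections import namedtuple

# Stand-ins for Biopython's alignment and HSP objects (hypothetical, illustration only).
Hsp = namedtuple("Hsp", ["score"])
Alignment = namedtuple("Alignment", ["hit_id", "hsps"])

# Invented example: one hit with a single HSP, one hit split into two HSPs.
alignments = [
    Alignment("single_hsp_hit", [Hsp(200.0)]),
    Alignment("split_hit", [Hsp(150.0), Hsp(140.0)]),
]

# Rank by the best single HSP of each alignment.
by_max = sorted(alignments, key=lambda a: max(h.score for h in a.hsps), reverse=True)

# Rank by the total score over all HSPs of each alignment.
by_sum = sorted(alignments, key=lambda a: sum(h.score for h in a.hsps), reverse=True)

print(by_max[0].hit_id)
print(by_sum[0].hit_id)
```

With these numbers the two keys disagree about the top hit, which is exactly the situation the caveat describes.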
What output are you getting? I assume it is a *.blast file, HTML, or something like that. I'm afraid you will need to parse the output into objects and then sort them by score. This is a rather IT-style approach. I'd bet there are plenty of BLAST output parsers; otherwise you will need to write your own.
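If the output happens to be tabular (for example BLAST+ run with -outfmt 6), writing your own parser takes only a few lines. The sketch below assumes the default 12-column layout, in which the bit score is the last column, and a file holding hits for a single query. The helper name hits_by_bitscore and the sample rows are made up for illustration.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6); assumed here.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def hits_by_bitscore(handle):
    """Read tab-separated hits and return them sorted by bit score, best first."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    hits = list(reader)
    # Values are read as strings, so convert to float before comparing.
    hits.sort(key=lambda row: float(row["bitscore"]), reverse=True)
    return hits


# Made-up sample rows standing in for a real output file.
sample = io.StringIO(
    "query1\tsubjA\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-50\t180.0\n"
    "query1\tsubjB\t99.0\t400\t4\t0\t1\t400\t1\t400\t1e-40\t350.0\n"
)

best = hits_by_bitscore(sample)[0]
print(best["sseqid"])
```

For a real file, pass an open file object instead of the StringIO. Output with comment lines (such as -outfmt 7) or with several queries in one file would need extra filtering and grouping that this sketch leaves out.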
Or you might find the answer on the following webpage: http://seqanswers.com/forums/showthread.php?t=6869