where is "% Identity" column in blast-xml
8.5 years ago
helenhvalask ▴ 30

I am trying to parse a blast-xml file from a blastp search using SearchIO in Biopython. However, I am not sure which attribute I should use to extract % identity.

For a blast-tab file I can use hsp.pident. Does anyone know the equivalent attribute name for blast-xml? Or should I derive it myself as hsp.ident_num/hsp.aln_span*100? Thanks

searchIO biopython blast • 3.7k views
8.4 years ago
Peter 6.0k

The BLAST XML output format does not contain the percentage identity as an explicit field, so yes, you must calculate it from the number of identities and the alignment length.

See for example my BLAST XML to tabular conversion script: https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/blastxml_to_tabular.py#L230

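(Note: if you are using Python 2, beware of integer division. Written as hsp.ident_num/hsp.aln_span*100, the division of two integers truncates first, so you get 0 for anything short of a perfect match. Multiply by 100.0 before dividing, or convert to float.)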
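For illustration, here is a minimal sketch of that calculation. The helper names `percent_identity` and `iter_percent_identity` and the file name `results.xml` are made up for this example, not part of Biopython. It assumes the HSP objects expose `ident_num` and `aln_span`, as in the question, and it multiplies by `100.0` first so the result is the same on Python 2 and Python 3.

```python
def percent_identity(ident_num, aln_span):
    """Return identities as a percentage of the alignment length."""
    if aln_span <= 0:
        raise ValueError("alignment length must be positive")
    # Multiply by a float first so Python 2 does not truncate the division.
    return 100.0 * ident_num / aln_span


def iter_percent_identity(xml_path):
    """Yield (query id, hit id, % identity) for every HSP in a BLAST XML file."""
    # Imported here so percent_identity() stays usable without Biopython.
    from Bio import SearchIO

    for qresult in SearchIO.parse(xml_path, "blast-xml"):
        for hit in qresult:
            for hsp in hit:
                yield qresult.id, hit.id, percent_identity(hsp.ident_num, hsp.aln_span)


if __name__ == "__main__":
    # 45 identities over a 50-column alignment
    print(percent_identity(45, 50))

    # Hypothetical file name; uncomment to run on your own blastp output.
    # for query_id, hit_id, pident in iter_percent_identity("results.xml"):
    #     print(query_id, hit_id, round(pident, 2))
```

Keeping the arithmetic in its own small function makes it easy to check by hand and to reuse if you later want a different denominator.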
