Biopython--Filter BLAST Results By Percent Identity
2.0 years ago
schmiggle • 0

Is there any way to filter BLAST results from the XML you get from Biopython's NCBIWWW module on the basis of percent identity? I can't find anything like that in the XML, which looks like this in what I think is the relevant section for a given result:

<Hit_hsps>
  <Hsp>
      <Hsp_num>1</Hsp_num>
      <Hsp_bit-score>2673.88</Hsp_bit-score>
      <Hsp_score>2964</Hsp_score>
      <Hsp_evalue>0</Hsp_evalue>
      <Hsp_query-from>1</Hsp_query-from>
      <Hsp_query-to>1482</Hsp_query-to>
      <Hsp_hit-from>4596</Hsp_hit-from>
      <Hsp_hit-to>3115</Hsp_hit-to>
      <Hsp_query-frame>1</Hsp_query-frame>
      <Hsp_hit-frame>-1</Hsp_hit-frame>
      <Hsp_identity>1482</Hsp_identity>
      <Hsp_positive>1482</Hsp_positive>
      <Hsp_gaps>0</Hsp_gaps>
      <Hsp_align-len>1482</Hsp_align-len>

Here's the code I used to generate that XML:

from Bio.Blast import NCBIWWW, NCBIXML
from Bio import SeqIO, Entrez

file_to_read = 'liberibacter_16s_sequences.fasta'

blast_list = []

# Submit each sequence to NCBI BLAST and keep the result handles
for record in SeqIO.parse(file_to_read, 'fasta'):
    result_handle = NCBIWWW.qblast("blastn", "nt", record.seq)
    blast_list.append(result_handle)

# Write the XML for every query into one file; the with block closes it
with open('results.xml', 'w') as save_file:
    for handle in blast_list:
        blast_results = handle.read()
        save_file.write(blast_results)

Is there a way to parse this XML to pull out what I'm looking for, and if not, is there some way to adjust the parameters of my code to pull down that information from BLAST?

biopython BLAST • 799 views

Do you require it to be in XML format? You could easily return a BLAST tab format for example and filter by column.
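For illustration, here is a rough sketch of filtering by column. It assumes the results are already in the standard 12-column tabular layout (the one local BLAST+ writes with `-outfmt 6`), where percent identity is the third column. How the tabular text is obtained is not shown, and the function name, the sample rows, and the 97.0 cutoff are made up for the example.

```python
import csv
import io

# Column order of the standard 12-column BLAST tabular format (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_tabular(handle, min_pident):
    """Yield rows (as dicts) whose percent identity is at least min_pident."""
    reader = csv.reader(handle, delimiter="\t")
    for fields in reader:
        # Skip blank lines and '#' comment lines (present in -outfmt 7).
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(COLUMNS, fields))
        if float(row["pident"]) >= min_pident:
            yield row


# Made-up example rows, for illustration only.
example = io.StringIO(
    "query1\tsubjectA\t100.000\t1482\t0\t0\t1\t1482\t4596\t3115\t0.0\t2674\n"
    "query1\tsubjectB\t91.250\t800\t70\t0\t1\t800\t10\t809\t1e-100\t1050\n"
)
hits = list(filter_tabular(example, min_pident=97.0))
print([hit["sseqid"] for hit in hits])
```

To use it on a real file, pass an open file handle instead of the `io.StringIO` object.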


The Biopython docs say everything else breaks easily, but there's no particular reason my project needs it to be in XML. I'll give that a shot.

2.0 years ago

XML can be parsed very quickly with tools such as xml2dict, which turns an XML document into a dictionary (there now seem to be several ways to do that), or with a more generic tool called untangle.
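The XML has no percent-identity field, but it can be computed as `Hsp_identity` divided by `Hsp_align-len`. As a sketch of the generic-XML route, the snippet below uses Python's standard-library `xml.etree.ElementTree` rather than either tool named above. The sample fragment is cut down and partly made up (the second hit and both `Hit_id` values are invented), and the function name and cutoff are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# A cut-down fragment shaped like BLAST XML; the second hit is made up.
SAMPLE = """
<Iteration_hits>
  <Hit>
    <Hit_id>hit_A</Hit_id>
    <Hit_hsps>
      <Hsp>
        <Hsp_identity>1482</Hsp_identity>
        <Hsp_align-len>1482</Hsp_align-len>
      </Hsp>
    </Hit_hsps>
  </Hit>
  <Hit>
    <Hit_id>hit_B</Hit_id>
    <Hit_hsps>
      <Hsp>
        <Hsp_identity>730</Hsp_identity>
        <Hsp_align-len>800</Hsp_align-len>
      </Hsp>
    </Hit_hsps>
  </Hit>
</Iteration_hits>
"""


def hsps_above(root, min_percent):
    """Return (hit id, percent identity) for each HSP at or above min_percent."""
    kept = []
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        for hsp in hit.iter("Hsp"):
            identical = int(hsp.findtext("Hsp_identity"))
            align_len = int(hsp.findtext("Hsp_align-len"))
            # Percent identity = identical positions / alignment length.
            percent = 100.0 * identical / align_len
            if percent >= min_percent:
                kept.append((hit_id, percent))
    return kept


root = ET.fromstring(SAMPLE)
print(hsps_above(root, 97.0))
```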

Note that Biopython can also parse BLAST XML directly.
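A minimal sketch of what that could look like, assuming the `Bio.Blast.NCBIXML` parser (already imported in the question's code) and its record objects, where each HSP carries `identities` and `align_length`. The function names, the file path, and the 97.0 cutoff are placeholders, and newer Biopython releases may prefer a different parsing interface.

```python
def percent_identity(hsp):
    """Percent identity of one HSP: identical positions over alignment length."""
    return 100.0 * hsp.identities / hsp.align_length


def filter_records(blast_records, min_percent):
    """Yield (alignment title, percent identity) for HSPs at or above min_percent."""
    for record in blast_records:
        for alignment in record.alignments:
            for hsp in alignment.hsps:
                percent = percent_identity(hsp)
                if percent >= min_percent:
                    yield alignment.title, percent


def filter_xml_file(path, min_percent=97.0):
    """Parse a BLAST XML file with Biopython and filter its HSPs."""
    # Imported here so the helpers above stay usable without Biopython.
    from Bio.Blast import NCBIXML

    with open(path) as handle:
        # NCBIXML.parse yields one record per query in the file.
        return list(filter_records(NCBIXML.parse(handle), min_percent))
```

Calling `filter_xml_file('results.xml', 97.0)` would then return the titles and percent identities of the HSPs that pass the cutoff.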
