Get match names from BLAST using Biopython
17 months ago

Hi all, I am trying to build an overview of the organisms that turn up most often in a BLAST query, using Biopython. How do I get the names of the organisms found in the search?

from Bio.Blast import NCBIXML
result_handle = open("../data/blastOutput/test_2.xml")
blast_records = list(NCBIXML.parse(result_handle))
for item in blast_records:
    print("\n", item.match,"\n")
17 months ago

Just for fun, a one-liner using XPath expressions and wget:

xmllint --xpath '//Hit_accession/text()' input.blastn.xml  |\
sort | uniq |\
awk  'BEGIN {printf("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=nucleotide&id=");} {printf("%s%s",(NR==1?"":","),$1);}' |\
xargs wget -q -O - |\
xmllint --xpath '//Item[@Name="TaxId"]/text()' - |\
sort | uniq |\
awk  'BEGIN {printf("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=taxonomy&id=");} {printf("%s%s",(NR==1?"":","),$1);}' |\
xargs wget -q -O - |\
xmllint --xpath '//Item[@Name="ScientificName"]/text()' -

Acinetobacter baumannii AC12
Acinetobacter baumannii AC30
Amycolatopsis lurida NRRL 2430
Thermosensitive cloning vector pTN1
Sphingobacterium sp. PM2-P1-29
Desulfitobacterium hafniense
Escherichia coli
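
For readers who would rather stay in Python, here is a rough sketch of the two building blocks of the pipeline above: joining the de-duplicated IDs into one esummary URL (the awk step) and pulling out `Item` elements by their `Name` attribute (the xmllint step). The helper names `esummary_url`, `item_values` and `fetch` are made up for this illustration, and the sketch assumes the esummary XML has the same `Item`/`Name` layout that the XPath expressions above rely on.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Base URL taken from the one-liner above.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def esummary_url(db, ids):
    """Build one esummary URL for a collection of IDs (sorted, de-duplicated)."""
    unique_ids = sorted(set(ids))  # same effect as `sort | uniq`
    query = urllib.parse.urlencode(
        {"db": db, "id": ",".join(unique_ids)}, safe=","
    )
    return f"{EUTILS}?{query}"


def item_values(xml_text, name):
    """Return the text of every <Item Name="name"> element in the document."""
    root = ET.fromstring(xml_text)
    return [item.text for item in root.iter("Item") if item.get("Name") == name]


def fetch(url):
    """Download a URL and return the body as text (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")
```

Chained together, `item_values(fetch(esummary_url("nucleotide", accessions)), "TaxId")` would give the taxonomy IDs, and a second round against `db="taxonomy"` with `"ScientificName"` would give the names, mirroring the two halves of the shell pipeline.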
16 months ago
Alban Nabla ▴ 30

If you'd like to stay within Biopython, you could extract this information from the accession or title of each alignment:

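from Bio.Blast import NCBIXML
import collections


def get_organism(title):
    """Given an alignment title, return the organism (genus) as a string.

    Assumes the legacy "gi|<id>|<db>|<accession>| <description>" title layout.
    """
    parts = title.split("|")
    # parts[4] is the description and starts with a space, so after
    # splitting on " " the first real word sits at index 1.
    words = parts[4].split(" ")
    return words[1]


organisms = []

with open("blastoutput.xml") as result:
    records = NCBIXML.parse(result)
    # Only the first query's record is examined here.
    item = next(records)
    for alignment in item.alignments:
        organisms.append(get_organism(alignment.title))

# Tally how often each organism appears among the hits.
print(collections.Counter(organisms))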
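
If the XML holds several queries, or the titles do not all follow the `gi|...|...|...|` layout, a slightly more forgiving variant might look like the sketch below. The function names are hypothetical, and the parsing rules are assumptions rather than anything BLAST guarantees: protein hits are assumed to end with the organism in square brackets, and every other description is assumed to start with "Genus species".

```python
import collections
import re


def organism_from_title(title):
    """Guess an organism name from a BLAST alignment title (heuristic)."""
    # Merged titles list several entries separated by " >"; keep the first.
    first_entry = title.split(" >")[0]
    # Drop a leading pipe-delimited identifier such as "gi|123|gb|ABC.1|".
    description = first_entry.rsplit("|", 1)[-1].strip()
    # Assumption: protein descriptions end with "[Organism name]".
    bracketed = re.search(r"\[([^\[\]]+)\]\s*$", description)
    if bracketed:
        return bracketed.group(1)
    # Assumption: otherwise the description starts with "Genus species".
    return " ".join(description.split()[:2])


def count_organisms(titles):
    """Count how often each guessed organism occurs in an iterable of titles."""
    return collections.Counter(organism_from_title(t) for t in titles)


def titles_from_blast_xml(path):
    """Yield the title of every alignment in every record of a BLAST XML file."""
    # Imported here so the helpers above can be used without Biopython.
    from Bio.Blast import NCBIXML

    with open(path) as handle:
        for record in NCBIXML.parse(handle):
            for alignment in record.alignments:
                yield alignment.title
```

Something like `count_organisms(titles_from_blast_xml("blastoutput.xml")).most_common(10)` would then list the ten most frequent guesses. Because this only parses free text, strain-level names and unusual descriptions will be lumped together or misread; a taxonomy lookup by accession is more reliable when that matters.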