This is probably a fairly basic question, so I apologize in advance, but I can't figure out how to write XML output using Biopython. I have a fairly large BLAST results file in XML format, and I'm trying to extract a portion of it using a list of specific queries I am interested in. I can find the queries in the larger file, but I can't write them back out as XML. Here is the script I am currently using:
#!/usr/bin/env python
import sys
from sets import Set
from Bio.Blast import NCBIXML

# Usage.
if len(sys.argv) < 2:
    print ""
    print "This program extracts blast results from an xml file given a list of query sequences"
    print "Usage: %s -list file1 -xml file2 -out file3" % sys.argv[0]
    print "-list: list of sequence names"
    print "-xml: blast results file in xml format"
    print "-out: outfile name"
    print ""
    sys.exit()

# Parse args.
for i in range(len(sys.argv)):
    if sys.argv[i] == "-list":
        infile1 = sys.argv[i+1]
    elif sys.argv[i] == "-xml":
        infile2 = sys.argv[i+1]
    elif sys.argv[i] == "-out":
        outfile = sys.argv[i+1]

fls = [infile1, infile2, outfile]
results_handle = open(fls[1], "r")
fin1 = open(fls[0], "r")
save_file = open(fls[2], "w")

# Collect the query names of interest (first word of each line, without '>').
geneContigs = Set([])
for line in fin1:
    temp = line.lstrip('>').split()
    geneContigs.add(temp[0])
fin1.close()

# Walk through the BLAST records and try to write out the matching ones.
blast_records = NCBIXML.parse(results_handle)
for blast_record in blast_records:
    if blast_record.query in geneContigs:
        # This is the line that fails: write() expects a string, not a Blast record.
        save_file.write(blast_record)
save_file.close()
results_handle.close()
When I do this, I get the following error:
TypeError: argument 1 must be string or read-only character buffer, not Blast
Does anyone have any suggestions on how to turn this Blast record into a string in XML output format?
Thanks
I don't think Biopython supports writing blast_record objects to XML. You might have to write something yourself if you really need it. The XML output description is located here: ftp://ftp.ncbi.nlm.nih.gov/blast/documents/xml/
There is a .mod file there that describes the XML. Maybe you can find a Python library that will let you map data to .mod files.
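If you do end up writing something yourself, one route that avoids rebuilding records altogether is to filter the XML file directly with the standard library and never convert it to Biopython objects. The sketch below is only that, a sketch: it assumes the classic BLAST XML layout (BlastOutput / BlastOutput_iterations / Iteration, with the query name in Iteration_query-def), it matches on the first word of the query definition, and the function names read_query_names and filter_blast_xml are made up for illustration. It also loads the whole file into memory, which may be a problem for very large result files, and ElementTree does not carry the DOCTYPE line over to the output.

```python
import xml.etree.ElementTree as ET


def read_query_names(lines):
    """Collect the first word of each non-empty line, dropping a leading '>'."""
    names = set()
    for line in lines:
        fields = line.lstrip(">").split()
        if fields:
            names.add(fields[0])
    return names


def filter_blast_xml(xml_in, xml_out, wanted):
    """Copy xml_in to xml_out, keeping only <Iteration> elements whose
    query name is in `wanted`. Returns the number of iterations kept."""
    tree = ET.parse(xml_in)
    kept = 0
    # Assumed layout: BlastOutput/BlastOutput_iterations/Iteration
    for parent in tree.getroot().iter("BlastOutput_iterations"):
        # Iterate over a copy, since elements are removed while looping.
        for iteration in list(parent):
            query_def = iteration.findtext("Iteration_query-def", default="")
            fields = query_def.split()
            if fields and fields[0] in wanted:
                kept += 1
            else:
                parent.remove(iteration)
    tree.write(xml_out, encoding="utf-8", xml_declaration=True)
    return kept
```

With placeholder file names, usage would look something like wanted = read_query_names(open("list.txt")) followed by filter_blast_xml("all_results.xml", "subset.xml", wanted). Everything outside the dropped iterations (program, database, parameters) is written back as it was parsed, so the result should still be readable by NCBIXML.parse, though I'd check that on a small file first.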
Actually, the latest version of Biopython does support writing BLAST XML. I've written a quick example in my answer :).