I'm trying to use the Biopython wrapper for blastp to download matching protein sequences for some sequences that I have stored on my computer. I would like these matching sequences in FASTA format, similar to how on the web server one can select all sequences producing significant alignments and download "FASTA (aligned sequences)". This was my attempt:
    from Bio.Blast import NCBIWWW
    from Bio.Blast import NCBIXML

    base_dir = "/Users/kjsdhjasv/Documents/cs2017/coevo/"
    inputs = ['rnapBeta', 'S3', 'S4', 'S5']

    for input in inputs:
        fasta_string = open(base_dir + 'data/e_coli_k12/' + input + '.fa').read()
        out = base_dir + 'results/' + input + '.fa'
        with NCBIWWW.qblast("blastp", "nr", fasta_string) as result_handle:
            with open(out, 'w') as out_file:
                blast_record = NCBIXML.read(result_handle)
                for alignment in blast_record.alignments:
                    for hsp in alignment.hsps:
                        out_file.write(alignment.title)
                        out_file.write(hsp.sbjct)
How can I extract FASTA-formatted sequences from the BLAST result?
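For reference, the attempt above writes the title and the sequence back to back, with no `>` marker and no line breaks, so the output is not valid FASTA. Below is a minimal sketch of the kind of output being asked for, assuming one FASTA entry per HSP is acceptable. The helper name `record_to_fasta`, the 60-column line width, the `[start-end]` suffix in the header, and the removal of `-` gap characters are all illustrative choices, not part of Biopython. The `SimpleNamespace` objects only stand in for a parsed record so the snippet runs without a network call.

```python
import textwrap
from types import SimpleNamespace


def record_to_fasta(blast_record, width=60):
    """Return FASTA text with one entry per HSP of a parsed BLAST record.

    Hypothetical helper: it only relies on the attributes already used in
    the question (alignments, hsps, title, sbjct) plus the HSP subject
    coordinates (sbjct_start, sbjct_end).
    """
    entries = []
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            # hsp.sbjct is the aligned part of the hit and may contain
            # '-' gap characters; dropping them is an assumption here.
            sequence = hsp.sbjct.replace("-", "")
            header = f">{alignment.title} [{hsp.sbjct_start}-{hsp.sbjct_end}]"
            # A FASTA entry is a '>' header line followed by wrapped sequence lines.
            entries.append(header + "\n" + textwrap.fill(sequence, width) + "\n")
    return "".join(entries)


# Stand-in for the object returned by NCBIXML.read(result_handle).
demo_record = SimpleNamespace(
    alignments=[
        SimpleNamespace(
            title="hypothetical protein X",
            hsps=[SimpleNamespace(sbjct="MK-VL", sbjct_start=1, sbjct_end=4)],
        )
    ]
)

print(record_to_fasta(demo_record), end="")
```

In the loop from the question, the two `out_file.write(...)` calls would then be replaced by a single `out_file.write(record_to_fasta(blast_record))` placed after `NCBIXML.read(result_handle)`. Note that this yields only the aligned segment of each hit, not the full-length subject sequence.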