Hi all, I have been trying to write a program that searches for specific proteins in an organism's genome using Biopython, but I have had trouble getting results. I am new to programming, and my program gives me this error:
File "/Users/brendanreilly/anaconda/lib/python3.5/site-packages/Bio/Blast/NCBIXML.py", line 572, in parse
    text = handle.read(BLOCK)
AttributeError: 'str' object has no attribute 'read'
I think that my program is having trouble parsing my BLAST output, but I am not sure. This is how I set up my BLAST search and the parsing:
    from Bio.Seq import Seq
    from Bio.Alphabet import IUPAC
    from Bio.Blast import NCBIWWW

    resultclk = NCBIWWW.qblast(program="blastp",
                               database="refseq_genomic",
                               sequence="clk, nemo",
                               entrez_query="txid743[ORGN]")
    resultclk
    #stdout, stderr = resultclk()
    save_clk = open("clk.xml", "w")
    save_clk.write(resultclk.read())
    save_clk.close()
    #resultclk.close()
    #resultclk = open("clk_1.xml")

    from Bio.Blast import NCBIXML
    blast_records = NCBIXML.parse("save_clk")
    for blast_record in blast_records:
        for alignment in blast_record.alignments:
            for hsp in alignment.hsps:
                print('****Alignment****')
                print('sequence:', alignment.title)
                print('length:', alignment.length)
                print('score:', hsp.score)
                print('gaps:', hsp.gaps)
                print('e-value:', hsp.expect)
                print(hsp.query[0:90] + '...')
                print(hsp.match[0:90] + '...')
                print(hsp.subject[0:90] + '...')
    resultclk.close()
I made the protein sequences (clk and nemo) Seq objects, so I did not include them here to save space, but they are in the code. I am not really sure what the problem is or how to fix it, but I was hoping someone might know where I went wrong.
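
One reading of the traceback: the failing line is text = handle.read(BLOCK), so NCBIXML.parse seems to expect an open file handle, while the code above passes the literal string "save_clk". The stdlib-only sketch below reproduces just that failure. Note that read_block is a hypothetical stand-in written for illustration (it is not part of Biopython), the BLOCK value is a guess, and the file content is a placeholder rather than real BLAST output.

```python
import os
import tempfile

BLOCK = 1024  # assumed chunk size; the value used inside Biopython may differ


def read_block(handle):
    """Hypothetical stand-in for the failing line: text = handle.read(BLOCK)."""
    return handle.read(BLOCK)


# Write a small placeholder file so the sketch runs without a BLAST search.
tmp_dir = tempfile.mkdtemp()
xml_path = os.path.join(tmp_dir, "clk.xml")
with open(xml_path, "w") as out_handle:
    out_handle.write("<BlastOutput></BlastOutput>")

# Passing a string (like "save_clk") fails, because str has no .read() method.
try:
    read_block("save_clk")
    error_message = None
except AttributeError as err:
    error_message = str(err)
print(error_message)

# Passing an open file handle works.
with open(xml_path) as result_handle:
    text = read_block(result_handle)
print(text)
```

If that reading is right, the change in the Biopython code would be to reopen the saved file and pass the handle, for example result_handle = open("clk.xml") followed by NCBIXML.parse(result_handle), and to keep the file open until the loop over blast_records has finished, since the records are read as the loop runs.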