Hi all, I have been trying to write a program to search for specific proteins in an organism's genome using Biopython, but I have had some trouble getting results. I am new to programming, and my program gives me this error:
  File "/Users/brendanreilly/anaconda/lib/python3.5/site-packages/Bio/Blast/NCBIXML.py", line 572, in parse
    text = handle.read(BLOCK)
AttributeError: 'str' object has no attribute 'read'
I think my program is having trouble parsing my BLAST output, but I am not sure. This is how I set up my BLAST search and the parsing:
from Bio.Seq import Seq
from Bio.Alphabet import IUPAC
from Bio.Blast import NCBIWWW
resultclk = NCBIWWW.qblast(program="blastp", database="refseq_genomic", sequence="clk, nemo" , entrez_query="txid743[ORGN]")
resultclk
#stdout, stderr = resultclk()
save_clk = open("clk.xml", "w")
save_clk.write(resultclk.read())
save_clk.close()
#resultclk.close()
#resultclk = open("clk_1.xml")
from Bio.Blast import NCBIXML
blast_records = NCBIXML.parse("save_clk")
for blast_record in blast_records:
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            print('****Alignment****')
            print('sequence:', alignment.title)
            print('length:', alignment.length)
            print('score:', hsp.score)
            print('gaps:', hsp.gaps)
            print('e-value:', hsp.expect)
            print(hsp.query[0:90] + '...')
            print(hsp.match[0:90] + '...')
            print(hsp.subject[0:90] + '...')
resultclk.close()
I made the protein sequences Seq objects (clk and nemo). I did not include them here to save space, but they are in the code. I'm not really sure what the problem is or how to fix it, but I was hoping someone might know where I went wrong.
Thank you.
There are no dumb questions :)
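The traceback suggests the parser calls `.read()` on whatever it is given, and `NCBIXML.parse("save_clk")` gives it a plain string rather than an open file. Below is a minimal sketch of the likely fix, assuming the results were already written to `clk.xml` as in the question. `read_first_block`, `print_hits`, and the `BLOCK` value are hypothetical names made up for illustration; only `NCBIXML.parse` and the record attributes come from Biopython.

```python
import io

BLOCK = 1024  # stand-in chunk size (assumption); the real value lives inside Biopython


def read_first_block(handle):
    """Mimic the failing line in the traceback: the parser calls handle.read()."""
    return handle.read(BLOCK)


def print_hits(xml_path):
    """Parse a saved BLAST XML file and print a short summary of every HSP."""
    # Imported here so the small demo below runs even without Biopython installed.
    from Bio.Blast import NCBIXML

    with open(xml_path) as handle:  # pass a file handle, not the string "save_clk"
        for blast_record in NCBIXML.parse(handle):
            for alignment in blast_record.alignments:
                for hsp in alignment.hsps:
                    print("sequence:", alignment.title)
                    print("e-value:", hsp.expect)


# A plain string has no .read(), which reproduces the error from the question.
try:
    read_first_block("save_clk")
    string_error = None
except AttributeError as exc:
    string_error = str(exc)

# A file-like object works; io.StringIO stands in for open("clk.xml") here.
first_block = read_first_block(io.StringIO("<BlastOutput></BlastOutput>"))

print(string_error)
print(first_block)
```

In the original script this would amount to calling `print_hits("clk.xml")` after `save_clk.close()`. Two other things may be worth checking, though I have not run the search myself: `sequence="clk, nemo"` sends that literal text to NCBI rather than the contents of the `clk` and `nemo` Seq objects, and `refseq_genomic` is a nucleotide database, whereas `blastp` expects a protein database.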