Question

How to parse BLAST output in Biopython

2

7.9 years ago

Breilly92 ▴ 20

Hi all, I have been trying to write a program to search for specific proteins in an organism's genome using Biopython, but I have had some trouble getting results. I am new to programming, and my program gives me this error:

File "/Users/brendanreilly/anaconda/lib/python3.5/site-packages/Bio/Blast/NCBIXML.py", line 572, in parse
    text = handle.read(BLOCK)

AttributeError: 'str' object has no attribute 'read'

I think my program is having trouble parsing my BLAST output, but I am not sure. This is how I set up my BLAST search and the parsing:

from Bio.Seq import Seq
from Bio.Alphabet import IUPAC
from Bio.Blast import NCBIWWW
resultclk = NCBIWWW.qblast(program="blastp", database="refseq_genomic", sequence="clk, nemo" , entrez_query="txid743[ORGN]")
resultclk
#stdout, stderr = resultclk()
save_clk = open("clk.xml", "w")
save_clk.write(resultclk.read())
save_clk.close()
#resultclk.close()
#resultclk = open("clk_1.xml")
from Bio.Blast import NCBIXML
blast_records = NCBIXML.parse("save_clk")

for blast_record in blast_records:
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            print('****Alignment****')
            print('sequence:', alignment.title)
            print('length:', alignment.length)
            print('score:', hsp.score)
            print('gaps:', hsp.gaps)
            print('e-value:', hsp.expect)
            print(hsp.query[0:90] +'...')
            print(hsp.match[0:90] +'...')
            print(hsp.subject[0:90] +'...')
resultclk.close()

I made the protein sequences (clk and nemo) Seq objects. I did not include them here to save space, but they are in the code. I am not really sure what the problem is or how to fix it, but I was hoping someone might know where I went wrong.

Thank you.

blast python • 6.6k views

score 3 · Answer 1 · 2016-05-25

7.9 years ago

Philipp Bayer 8.3k

 AttributeError: 'str' object has no attribute 'read'

means that this line here

 blast_records = NCBIXML.parse("save_clk")

expects a file handle, not a string - "save_clk" is a string.

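You have to open the file first. Your results were written to clk.xml (save_clk was only the name of the Python variable holding the output handle), so that is the file to open:

 blast_records = NCBIXML.parse(open("clk.xml"))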
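To see why the string fails, here is a minimal sketch that reproduces the error without Biopython or a network connection. The `parse` function below is a hypothetical stand-in for `NCBIXML.parse`, not its real implementation; it assumes only what the traceback shows, namely that the parser calls `handle.read(BLOCK)`. The `BLOCK` value and the one-line XML string are made up for illustration, and `io.StringIO` plays the role of an opened results file such as clk.xml.

```python
import io

BLOCK = 1024  # illustrative chunk size; the value is an assumption


def parse(handle):
    """Hypothetical stand-in for NCBIXML.parse: reads from a file-like object."""
    chunks = []
    while True:
        text = handle.read(BLOCK)  # this is the call that fails on a plain str
        if not text:
            break
        chunks.append(text)
    return "".join(chunks)


# Passing a string, even a valid file name, reproduces the AttributeError.
try:
    parse("clk.xml")
except AttributeError as err:
    error_message = str(err)
    print(error_message)

# Passing an open, readable handle works. StringIO stands in for open("clk.xml").
with io.StringIO("<BlastOutput></BlastOutput>") as handle:
    content = parse(handle)
print(content)
```

With Biopython itself, the same idea would be to open the saved file in a `with` block, pass the resulting handle to `NCBIXML.parse`, and loop over the records inside that block. `NCBIXML.parse` returns a generator that reads lazily, so the file has to stay open while you iterate.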

score 0 · Answer 2 · 2016-05-26

7.9 years ago

Breilly92 ▴ 20

Thanks man, that was a huge help. Sorry if it was a dumb question.

5

There are no dumb questions :)

7.9 years ago by Philipp Bayer 8.3k