Question: Parsing Blast Output Biopython Error
Ankur wrote (9.7 years ago):

Hi, I have the following code

 def runBLAST(self):
        print "Running BLAST .........."
        cmd=subprocess.Popen("blastp -db nr -query repeat.txt -out out.faa -evalue 0.001 -gapopen 11 -gapextend 1 -matrix BLOSUM62 -remote -outfmt 5",shell=True)
        blast_records = NCBIXML.parse(f1)
        save_file = open("my_fasta_seq.fasta", 'w')
        for blast_record in blast_records[:10]:
            for alignment in blast_record.alignments:
                for hsp in alignment.hsps:
                    save_file.write('>%s\n' % (alignment.hseq,))
        for record in SeqIO.parse(f2,"fasta"):

I get a TypeError on the line "for blast_record in blast_records[:10]:", saying 'generator' object is not subscriptable. I am looking to get the top 10 BLAST hits (sequences).

Michael Kuhn (EMBL Heidelberg) wrote (9.7 years ago):

This is not a Biopython-specific problem but a general Python question, answered e.g. on StackOverflow. It might be that Biopython only parses the next result on demand; in that case you might be better off with:

for i, blast_record in enumerate(blast_records):
    if i == 10:
        break  # stop after the first 10 records
    # process blast_record here
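
An equivalent way to stop after ten records is itertools.islice from the standard library, which works on any iterator. A minimal sketch: fake_records() is a hypothetical stand-in for the generator returned by NCBIXML.parse, so the example runs without a BLAST output file.

```python
from itertools import islice


def fake_records():
    """Hypothetical stand-in for NCBIXML.parse(handle): yields records lazily."""
    for n in range(1, 26):
        yield "record_%d" % n


blast_records = fake_records()

# islice takes at most 10 items from the iterator without indexing into it.
first_ten = list(islice(blast_records, 10))
print(len(first_ten))
print(first_ten[0])
```

Note that the iterator is consumed as you go: after this, blast_records continues from the eleventh record rather than starting over.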

It's also a follow-up to the previous question and perhaps should have continued there instead. It's fine to edit your questions and discuss answers in the comments, rather than starting a new question for every variation of the same problem.

written 9.7 years ago by Neilfws, modified 17 months ago by Ram

As Michael says, blast_records is a generator/iterator. You can loop over it or iterate explicitly by calling next(), but you cannot access records by index. This is a general design pattern for coping with very large files composed of multiple smaller records, also used in the Biopython SeqIO.parse function, etc.
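
To illustrate with plain Python (no Biopython needed), here is a small sketch. read_records() is a made-up generator that mimics a parser yielding one record at a time from a multi-record file; it is not how Biopython is implemented, just the same pattern.

```python
import io


def read_records(handle):
    """Made-up lazy parser: yields one FASTA-like (header, sequence) pair at a time."""
    header, seq = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif line:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


handle = io.StringIO(">a\nMKV\nLLA\n>b\nGGS\n>c\nTTP\n")
records = read_records(handle)

# Indexing or slicing a generator raises TypeError.
try:
    records[:2]
    sliced = True
except TypeError:
    sliced = False

first = next(records)   # explicit iteration gives the first record
rest = list(records)    # looping consumes whatever is left
```

Only one record is held in memory at a time, which is why the parser hands back a generator instead of a list.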

written 9.7 years ago by Peter