How To Get A BLAST Record From Stdout Of NcbiblastxCommandline
12.3 years ago
Wrf ▴ 210

I am blasting sequences from a FASTA file one at a time, since the output will be handled differently depending on the quality of the hit.

Currently I have this, which works fine:

blast_handle = Blastx_Command(stdin=seq_record.format("fasta"))

However, this involves writing the output to a temporary XML file and then reading it with NCBIXML.parse(file), which is quite slow.

NCBIXML.read() or parse() expects a file, so as far as I understand, it cannot take the standard output directly.

Does anyone know a way to take the stdout from Blastx_Command and turn it into a BlastRecord?

Thanks in advance.

python blast blast xml biopython • 3.5k views
12.3 years ago

I am not sure exactly how you are forming your BLAST commands, but you can try using popen objects. For example:

import os
result = os.popen('my blast command')

result is now a file handle that NCBIXML.parse should be able to read.

A simple example that lists the current directory on Linux:

import os
result = os.popen('ls')
for line in result:
    print(line.rstrip())  # strip the trailing newline each line already carries

Thanks DK, with some tweaking that worked. Here is the final command:

result_handle = os.popen("echo \"" + seq_record.format("fasta") + "\" | " + str(Blastx_Command))
record = NCBIXML.read(result_handle)
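
One caveat with building the shell string by hand: a FASTA header that contains quotes, `$`, or backticks would be interpreted by the shell. A minimal sketch of a safer way to build the same kind of pipeline, assuming Python 3 and a POSIX shell, uses `shlex.quote` from the standard library. The helper name `build_piped_command` and the sample header are made up for illustration, and `blastx -db nr -outfmt 5` is only a placeholder for whatever `str(Blastx_Command)` produces.

```python
import shlex


def build_piped_command(fasta_text, command):
    """Return a shell string that pipes fasta_text into command.

    shlex.quote wraps the text in single quotes so the shell passes it
    through literally; printf %s then prints it without reinterpreting it.
    """
    return "printf %s {} | {}".format(shlex.quote(fasta_text), command)


# Hypothetical record whose header would confuse a hand-quoted echo.
fasta_text = '>seq1 "putative" $HOME\nATGGCC\n'

# Placeholder for str(Blastx_Command); adjust to the real command line.
shell_command = build_piped_command(fasta_text, "blastx -db nr -outfmt 5")
print(shell_command)
```

The resulting string can be handed to os.popen in the same way as the command above.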

12.3 years ago
Peter 6.0k

No, NCBIXML.read() and parse() expect a handle (any file-like object), not necessarily a file on disk. This means that if you run BLAST via subprocess (or one of the deprecated alternatives like os.popen), you can get a handle for BLAST's stdout and parse it directly. There are examples of this in the Biopython Tutorial (for multiple sequence alignment tools at least).
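
A minimal sketch of the subprocess approach, assuming Python 3. So that it runs without BLAST installed, the child process here is a stand-in (a one-line Python script that upper-cases its stdin); in real use `command` would be the blastx command line with XML output, and the handle would go to NCBIXML.read() or NCBIXML.parse() instead of being read as text.

```python
import subprocess
import sys

# Stand-in for the real BLAST command (an assumption for this example):
# a child Python process that upper-cases whatever arrives on stdin.
command = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

fasta_text = ">seq1\natggcc\n"

proc = subprocess.Popen(
    command,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
proc.stdin.write(fasta_text)
proc.stdin.close()  # signal end of input so the child can finish

# proc.stdout is a file-like handle on the child's standard output.
output = proc.stdout.read()
proc.stdout.close()
return_code = proc.wait()
```

Writing to stdin and then reading stdout is fine for a single short record. For large inputs, `proc.communicate(fasta_text)` avoids the risk of both processes blocking on full pipes, at the cost of returning a string rather than a live handle.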

12.3 years ago
Wrf ▴ 210

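To answer my own question: someone suggested cStringIO, which I implemented as:

import cStringIO
from Bio.Blast import NCBIXML
from Bio.Blast.Applications import NcbiblastxCommandline
Blastx_Command = NcbiblastxCommandline(db="nt", query="-", outfmt=5)
# Calling the command returns (stdout, stderr); wrap stdout in a file-like handle.
blast_handle = cStringIO.StringIO(Blastx_Command(stdin=seq_record.format("fasta"))[0])

for record in NCBIXML.parse(blast_handle):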

etc...
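
Note that cStringIO exists only in Python 2; in Python 3 the equivalent is io.StringIO. A minimal sketch of the same idea under Python 3 follows. The XML string is a made-up, heavily trimmed stand-in for BLAST output (not real results), and xml.etree.ElementTree stands in for NCBIXML so the example runs without Biopython. The point is only that a parser expecting a handle will accept a string wrapped in StringIO.

```python
import io
import xml.etree.ElementTree as ET

# Made-up, trimmed stand-in for what blastx would print with outfmt=5.
xml_text = (
    "<BlastOutput>"
    "<BlastOutput_program>blastx</BlastOutput_program>"
    "<BlastOutput_iterations><Iteration>"
    "<Iteration_hits><Hit><Hit_def>example protein</Hit_def></Hit></Iteration_hits>"
    "</Iteration></BlastOutput_iterations>"
    "</BlastOutput>"
)

# Wrap the in-memory string so it behaves like an open file.
handle = io.StringIO(xml_text)

# Any parser that takes a handle can read it; no temporary file is needed.
tree = ET.parse(handle)
hit_defs = [hit.findtext("Hit_def") for hit in tree.iter("Hit")]
print(hit_defs)
```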
