Biopython Local Blast Not Producing File
9.4 years ago

Hello, I am a Computer Science & Molecular Biology student at MIT, working on a research project in the Research Laboratory of Electronics. I am trying to run BLAST locally, but I keep running into problems even though I am following the documentation exactly. Has anyone here used BLAST locally before? No one in my lab can help me with it, so I would appreciate any help.

Basically, to get BLAST output I am building and running the command line like this:

result_handler = NcbiblastnCommandline(query= "r70.fasta", db = "MG1655.fasta", out = 'resultr.xml')
os.system(str(result_handler))


But it does not write anything to the resultr.xml file. MG1655.fasta is the database that I want to search against. I also tried an MG1655.txt version, and that did not work either.

Thanks,

biopython blast • 3.9k views

Does the BLAST search run properly outside of BioPython? Do you have BLAST installed on the machine you're working with?


In addition to matt's comment, you should check the value returned by os.system, which gives you the exit status of your call (i.e. lets you know whether it worked). Using subprocess instead will give you more information. Also, if you want XML you should set outfmt to 5, and the db should be a BLAST database as created by makeblastdb (not a .fasta file, which it might be now?).
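For illustration, here is a minimal sketch of the subprocess approach, using only the standard library. The helper names run_command and blastn_args are made up for this example, and the file names are the ones from the question. It assumes Python 3.7 or later and that blastn is on your PATH when you actually run it.

```python
import subprocess
import sys


def run_command(args):
    """Run a command and return (exit_code, stdout, stderr) as text.

    Raises FileNotFoundError if the executable cannot be found at all.
    """
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr


def blastn_args(query, db, out):
    """Build the blastn argument list (hypothetical helper); -outfmt 5 requests XML."""
    return ["blastn", "-query", query, "-db", db, "-outfmt", "5", "-out", out]


if __name__ == "__main__":
    # Harmless demonstration that does not need BLAST: ask the Python
    # interpreter itself to exit with status 3 and show that we can see it.
    code, out, err = run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
    print("exit status:", code)
```

With BLAST installed you would call run_command(blastn_args("r70.fasta", "MG1655", "resultr.xml")). A non-zero exit status, together with whatever BLAST wrote to stderr, usually tells you what went wrong.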


How would I go about creating the blast db using makeblastdb? I tried looking at the documentation but it did not seem very clear.

Thanks,


Something like this:

makeblastdb -in MG1655.fasta -out MG1655 -dbtype nucl

The help from makeblastdb -h is pretty good, I think.
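If you would rather build the database from your Python script, the same call can be sketched with the standard library. The helper names makeblastdb_args and build_database are hypothetical. The flags are the ones from the command in this comment, and makeblastdb is assumed to be on your PATH.

```python
import shutil
import subprocess


def makeblastdb_args(fasta, name, dbtype="nucl"):
    """Build the makeblastdb argument list; dbtype is "nucl" or "prot"."""
    return ["makeblastdb", "-in", fasta, "-out", name, "-dbtype", dbtype]


def build_database(fasta, name, dbtype="nucl"):
    """Run makeblastdb; raises CalledProcessError on a non-zero exit status."""
    if shutil.which("makeblastdb") is None:
        raise FileNotFoundError("makeblastdb was not found on PATH")
    subprocess.run(makeblastdb_args(fasta, name, dbtype), check=True)
```

Calling build_database("MG1655.fasta", "MG1655") should leave files such as MG1655.nhr, MG1655.nin and MG1655.nsq in the current directory.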


Update: I created the database and made sure blast was correctly installed.

result_handlel = NcbiblastnCommandline(query="l70.fasta", db="MG1655.nsq", out='resultl.xml', outfmt=5)
os.system(str(result_handlel))


However, I think the database is not being read. Is there a way to read BLAST databases in Biopython? Thanks a lot for the help!


You shouldn't set the database name as MG1655.nsq. Assuming DB creation worked and you have files named MG1655.nsq, MG1655.nhr, etc., BLAST expects you to refer to the database as MG1655 (without these extensions).
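As a small illustration of that naming rule, the sketch below strips a database file extension to recover the name BLAST expects. The function blast_db_name is a made-up helper. It only knows the three core nucleotide and protein extensions and does not handle multi-volume databases.

```python
import os

# Core file extensions written by makeblastdb for nucleotide (n*) and
# protein (p*) databases. Newer BLAST versions write additional files.
BLAST_DB_EXTENSIONS = {".nhr", ".nin", ".nsq", ".phr", ".pin", ".psq"}


def blast_db_name(path):
    """Return the database name to pass to -db, without a BLAST file extension."""
    root, ext = os.path.splitext(path)
    return root if ext.lower() in BLAST_DB_EXTENSIONS else path
```

So blast_db_name("MG1655.nsq") gives "MG1655", which is the value to pass as db.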


As an alternative to using os.system, you could have asked Biopython to run BLASTN with stdout, stderr = result_handler(), which raises an error if the command fails (non-zero return code) and otherwise returns the captured output as strings (stdout and stderr).

The most likely problems are that your files are not in the current directory, that you didn't create a BLAST database, or that BLAST is not on your $PATH.
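Those three checks can be sketched as a quick pre-flight test before calling BLAST. The function diagnose is a hypothetical helper. It assumes a simple single-volume database, so it just looks for a .nsq or .psq file next to the database name.

```python
import os
import shutil


def diagnose(query, db, program="blastn"):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # 1. Is the BLAST executable on PATH?
    if shutil.which(program) is None:
        problems.append(f"{program} was not found on PATH")
    # 2. Is the query file where we think it is?
    if not os.path.isfile(query):
        problems.append(f"query file not found: {os.path.abspath(query)}")
    # 3. Was a BLAST database created under this name?
    if not any(os.path.isfile(db + ext) for ext in (".nsq", ".psq")):
        problems.append(f"no BLAST database files found for database name: {db}")
    return problems


if __name__ == "__main__":
    for problem in diagnose("r70.fasta", "MG1655"):
        print("Problem:", problem)
```

Relative paths are resolved against the current working directory, which is often not the directory the script lives in when it is run from an IDE.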


Hello, I got it to work. Thank you for all your help. I have another question related to the same project, though. When I compare my results with the online version of BLAST, the standalone BLAST seems to return only highly similar sequences (the results I get if I choose the megablast option online). How do I reduce the stringency so that it also returns somewhat similar sequences (the blastn option)?

Thank you.

9.3 years ago
Zealseeker • 0

Maybe the parameter outfmt=5 is needed; otherwise the program doesn't know what format you want for the output.

9.3 years ago
Carlos Borroto ★ 2.0k

If your MG1655.fasta input file has only one sequence, the easiest way would be to use subject instead of db. Something like this:

>>> from Bio.Blast.Applications import NcbiblastnCommandline
>>> blastn_cline = NcbiblastnCommandline(query= "r70.fasta", subject = "MG1655.fasta", outfmt = 5, out = 'result.xml')
>>> stdout, stderr = blastn_cline()


This way you don't need to worry about creating a blast database first.

Please also note that if you want the output in XML you need to include the option outfmt = 5, as others mentioned.

9.0 years ago

Check if blast[n/p] are in your path, and also output cline. I would use it like this:

Best Wishes, Umer