Command not found (Trivial possibly, but I'm stuck)
7.5 years ago

Hi, I'm trying to install BLAST and use it on my local machine. I've installed BLAST, but I'm not quite sure why it isn't working. I've even tried throwing the blast executables (blastp, etc.) and it doesn't seem to work. Here's an example output:

python myblastscript.py
Running blastp, using min e-value, retrieving hits
Extracted 1 features
BLAST-ing gene_1
ID: gene_1
Name: <unknown name>
Description: <unknown description>
Number of features: 1
Seq('MDGRRSRHTDDTDVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRV...LEI', ExtendedIUPACProtein())

sh: blastp: command not found

query cover: 0.151515151515    max_iden: 0.95
904667897    CSK81904.1       Uncharacterised protein [Shigella sonnei]      145     264
905276864    CSH89806.1       Uncharacterised protein [Shigella sonnei] >gi  145     264

Here are snippets of what's not working. Essentially, building blast_cline works, but blastp isn't being recognized as a command.

import os
from Bio.Blast.Applications import NcbiblastpCommandline

def Blast_Features_Local(seqname):
    # open contents of seq file, add features from file of orfs
    # blast each file and return a single output table
    # loop through ORFs and print results in outFile

    blast_type = "blastp"  # set type of blast: n, p or x
    blast_db = "nr"  # use nr for protein alignments
    blast_db_path = "/Users/username/Desktop/db/"
    min_evalue = 0.0001
    max_hits = 5

    # create the command line call to blast
    blast_cline = NcbiblastpCommandline(query='Temp.fasta', db=blast_db_path + blast_db,
                                        evalue=min_evalue, outfmt=5, out="temp_blast.xml",
                                        max_target_seqs=max_hits)

    # run it through the shell; blastp must be findable on $PATH for this to work
    os.system(str(blast_cline))

No idea why it's saying blastp command not found. I imported "from Bio.Blast.Applications import NcbiblastpCommandline" as well, so I know that isn't the issue. Please help, thank you!!

blast • 2.8k views

It looks like you are not providing the type of blast (blast_type) you want to run in your command line (though I can't tell for sure from the snippet you have provided). Also, add the directory where the blast executables are installed to your $PATH.


I'm not quite sure how to add the directory containing my blast executables to my $PATH from within the script. I know the directory is /Volumes/Macintosh HD2/usr/local/ncbi/blast/bin

How do I incorporate that into the script?


Here is how you can modify $PATH (and make the changes permanent, if you wish).
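
For the in-script route asked about above, one option is to prepend the BLAST bin directory to PATH for the running Python process before os.system is called. This is only a minimal sketch: it assumes Python 3.3+ (for shutil.which), it uses the install directory reported in this thread, and the helper name add_to_path is made up for illustration. The change lasts only for the current process and the commands it launches; it does not make anything permanent.

```python
import os
import shutil

def add_to_path(directory):
    """Prepend `directory` to PATH for this process and its child processes."""
    current = os.environ.get("PATH", "")
    parts = current.split(os.pathsep) if current else []
    if directory not in parts:
        os.environ["PATH"] = os.pathsep.join([directory] + parts)
    return os.environ["PATH"]

# Install location reported in this thread; adjust it for your own machine.
BLAST_BIN = "/Volumes/Macintosh HD2/usr/local/ncbi/blast/bin"

add_to_path(BLAST_BIN)

# shutil.which returns the full path to the executable, or None if the
# shell would still not be able to find it.
print("blastp found at:", shutil.which("blastp"))
```

If the last line prints None, the directory does not contain blastp (or the volume is not mounted), and os.system would still report "command not found".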


How and where did you install blast? This looks like the system is unable to find the blastp binary in your system's executable path.


I downloaded the .dmg and ran the .pkg file to install BLAST to my machine. It installed here /Volumes/Macintosh HD2/usr/local/ncbi/blast/bin


Follow genomax2's link and you'll be all set with the installation. :)


Ok great, $PATH is now allowing the blastp command to work, but now I'm having trouble referencing the database.

    blast_db = "nr"  # use nr for protein alignments and blastx, use nt for blastn
    # blast_db_path = "/usr/local/ncbi-blast-2.2.29+/db/"
    blast_db_path = "/Users/username/Desktop/db/"

    blast_cline = NcbiblastpCommandline(query='Temp.fasta', db=blast_db_path + blast_db,
                                        evalue=min_evalue, outfmt=5, out="temp_blast.xml",
                                        max_target_seqs=max_hits)

And now here's the error....

BLAST Database error: No alias or index file found for protein database [/Users/username/Desktop/db/nr/] in search path [/Users/username/Desktop/plasmidannotations::]

Traceback (most recent call last):

  File "AnnotationTools.py", line 436, in <module>
    Blast_Features_Local(sqname)
  File "AnnotationTools.py", line 239, in Blast_Features_Local
    blast_record = NCBIXML.read(result_handle)
  File "/Library/Python/2.7/site-packages/Bio/Blast/NCBIXML.py", line 530, in read
    first = next(iterator)
  File "/Library/Python/2.7/site-packages/Bio/Blast/NCBIXML.py", line 575, in parse
    raise ValueError("Your XML file was empty")
ValueError: Your XML file was empty

I tried putting the db folder containing my databases into plasmidannotations and that isn't working either... Basically I'm not sure how to accurately reference the databases to be accessed. A commented line above the path I'm trying to work with was something that someone else had used previously but I can't quite get that to work either.... Any help appreciated.
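
The ValueError in the traceback looks like a downstream symptom: blastp stopped with the database error, wrote no XML, and the parser then found an empty file. os.system returns the command's exit status, but the snippet discards it. One way to surface the real failure earlier is to check the exit status before parsing. The sketch below is an assumption-laden illustration, not part of the original script: it needs Python 3.5+ for subprocess.run (the traceback shows Python 2.7), the helper name run_checked is hypothetical, and a stand-in command is used so it runs without BLAST installed.

```python
import subprocess
import sys

def run_checked(args):
    """Run a command given as a list of arguments; raise if it exits non-zero."""
    proc = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    if proc.returncode != 0:
        # Include the command's own error text so the cause is not hidden.
        raise RuntimeError("command failed (exit %d): %s"
                           % (proc.returncode, proc.stderr.strip()))
    return proc.stdout

# Stand-in command; in the script above the list would instead hold the
# blastp executable followed by its options, one list item per argument.
print(run_checked([sys.executable, "-c", "print('ok')"]))
```

Passing the arguments as a list also sidesteps shell quoting problems with paths that contain spaces.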


If you want to blast against nr, then that base name needs to be used in the command (-db /path_to/nr). Are there several files there that start with nr*?


I downloaded all 50+ nr databases into a single folder, they start as nr.XX.tar.gz (XX = 00 through 55) and then unpacking each file I now have 56 folders called nr.XX (0-55) which each contain an allotment of files. To be frankly honest, I've no idea what any of the files do individually. I'm thoroughly confused as to how I'm to access all of these from the command line.. Do I need to take all of the inner files from each of the 50+ folders and place all of them in one folder? I really don't know what I'm doing at this point.


Do I need to take all of the inner files from each of the 50+ folders and place all of them in one folder?

Bingo! There should also be at least one more file with a .pal extension. All these files need to be in the same folder. Once you have this set up, point -db to /path_to_new_folder/nr and that should do it.
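
The consolidation described above could be scripted rather than done by hand. The following is a rough sketch only: the function name flatten_volumes is hypothetical, it assumes the unpacked folders are named nr.00, nr.01 and so on as described in this thread, and it assumes Python 3. The demo at the bottom uses a temporary directory with empty stand-in files, so nothing on a real database folder is touched.

```python
import os
import shutil
import tempfile

def flatten_volumes(src_root, dest_dir, prefix="nr."):
    """Move files out of src_root/<prefix>XX/ subfolders into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for entry in sorted(os.listdir(src_root)):
        folder = os.path.join(src_root, entry)
        # Only descend into folders such as nr.00; tarballs and other files are skipped.
        if not (entry.startswith(prefix) and os.path.isdir(folder)):
            continue
        for name in sorted(os.listdir(folder)):
            source = os.path.join(folder, name)
            target = os.path.join(dest_dir, name)
            # Never overwrite a file that is already in the destination.
            if os.path.isfile(source) and not os.path.exists(target):
                shutil.move(source, target)
                moved.append(name)
    return moved

# Demo on throwaway data: two fake volume folders, each with one empty file.
with tempfile.TemporaryDirectory() as root:
    for vol in ("nr.00", "nr.01"):
        os.makedirs(os.path.join(root, vol))
        open(os.path.join(root, vol, vol + ".phr"), "w").close()
    print(flatten_volumes(root, root))
```

On the real data, src_root would be the folder holding the nr.XX subfolders and dest_dir the single folder that -db will point into.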


Agh! Ok, so should all of the files be taken out of their folders and placed in db, or only the .pal files? I'm assuming there's a .pal file in every single one. I don't have access to the main machine I'm doing the work on, so I can't verify that and try this solution at the moment, but I want to have a good grasp of it for when I'm able to work on it.


There should be only one .pal file (which describes the pieces that form the whole nr database). Take all other files out and put them in a single folder along with the .pal file. There should be 670+ files that start with nr* (don't count the gz files).
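
A quick sanity check of the consolidated folder along the lines described above might look like this. It is a sketch under assumptions: the function name summarize_db is invented for illustration, Python 3 is assumed, and the demo builds a tiny fake folder with empty files instead of reading a real database.

```python
import os
import tempfile

def summarize_db(db_dir, base="nr"):
    """List .pal alias files and count non-.gz files starting with `base`."""
    names = [n for n in os.listdir(db_dir)
             if n.startswith(base)
             and not n.endswith(".gz")
             and os.path.isfile(os.path.join(db_dir, n))]
    pal_files = sorted(n for n in names if n.endswith(".pal"))
    return {"pal_files": pal_files, "file_count": len(names)}

# Demo on throwaway data: one alias file, one volume file, one tarball.
with tempfile.TemporaryDirectory() as db_dir:
    for name in ("nr.pal", "nr.00.phr", "nr.00.tar.gz"):
        open(os.path.join(db_dir, name), "w").close()
    print(summarize_db(db_dir))
```

For the setup in this thread, the expectation stated above is a single entry in pal_files and a file_count in the hundreds.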
