blastp search shows Blast database error: no alias or index file found for protein database
22 months ago
garfield320 ▴ 20

I'm running blast using the Ubuntu app on a Windows 10 desktop with Windows Subsystem for Linux feature turned on.

I'm trying to do a blastp against the NCBI nr database. I kept getting errors with the update_blastdb.pl script, so I just downloaded all nr files in a directory /mnt/f/blast and unzipped them like below.

wget 'ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr.*.tar.gz'
wget 'ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr.*.tar.gz.md5'
for f in nr.*.tar.gz; do tar -zxvof "$f"; done

This produced 485 files, including nr.pto, nr.ptf, nr.pos, nr.pdb, nr.pal and 8 different formats (phd, phi, phr, pin, pog, ppd, ppi, psq) for all files that start with nr.##.
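Since the .md5 files were downloaded alongside the archives, they can be used to confirm that each volume arrived intact before it is unpacked. The following is only a sketch of that idea: verify_and_extract is a hypothetical helper name, and it assumes GNU md5sum and tar are available and that each .md5 file lists its archive by bare file name.

```shell
# Sketch: verify each nr volume against its .md5 file, then extract it.
# verify_and_extract is a hypothetical helper, not part of BLAST+.
# The function body runs in a subshell so the caller's directory is unchanged.
verify_and_extract() (
    cd "$1" || exit 1
    for archive in nr.*.tar.gz; do
        [ -e "$archive" ] || continue      # the glob matched nothing
        # md5sum -c reads "<hash>  <filename>" lines and re-checks those files
        if md5sum -c --quiet "$archive.md5"; then
            tar -zxof "$archive"           # tar handles one archive per call
        else
            echo "checksum missing or mismatched, skipping: $archive" >&2
        fi
    done
)
```

With the directory from the question, this would be called as verify_and_extract /mnt/f/blast. Archives that fail the check are left untouched so they can be downloaded again.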

Then I tried running blastp like below:

blastp -query /mnt/c/Users/BL/Desktop/Blast/query_test.fasta -db /mnt/f/blast -out test.txt

This returned:

BLAST Database error: No alias or index file found for protein database [/mnt/f/blast] in search path [/mnt/f/blast::]

I looked up other threads in this forum and checked that I have a nr.pal file and that the - in the blastp command are actually minus signs.

I also checked my blastp version (2.10.1+) and checked the location of blastp using which blastp, which returned /home/BL/ncbi-blast-2.10.1+/bin/blastp. What else should I look into to resolve this error?

nr blastp blast ubuntu • 1.6k views
22 months ago
Mensur Dlakic ★ 27k

Assuming that all your nr.* files are in /mnt/f/blast directory, your command should be:

blastp -query /mnt/c/Users/BL/Desktop/Blast/query_test.fasta -db /mnt/f/blast/nr -out test.txt

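That's what the error is telling you: -db expects the path to the database including its base name (here, nr), not just the directory that holds the files. Since /mnt/f/blast is a directory, BLAST finds no alias (.pal) or index (.pin) file with that name.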
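To see the difference before launching a long search, a quick pre-flight check along these lines may help. It is a rough approximation of what the error message describes, not BLAST's actual lookup logic (which also consults the BLASTDB search path), and check_blast_db is a hypothetical helper name.

```shell
# Sketch: test whether a -db argument points at a protein database base name.
# check_blast_db is a hypothetical helper; it only looks for the alias (.pal)
# or index (.pin) file next to the given prefix.
check_blast_db() {
    prefix="$1"
    if [ -f "$prefix.pal" ] || [ -f "$prefix.pin" ]; then
        echo "ok: $prefix"
        return 0
    fi
    if [ -d "$prefix" ]; then
        echo "error: $prefix is a directory; append the database name, e.g. $prefix/nr" >&2
    else
        echo "error: no $prefix.pal or $prefix.pin found" >&2
    fi
    return 1
}
```

Here check_blast_db /mnt/f/blast would fail while check_blast_db /mnt/f/blast/nr should succeed. For a fuller check with BLAST+ itself, blastdbcmd -db /mnt/f/blast/nr -info prints a summary of the database if it can be opened.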

If I am assuming correctly that /mnt/f is an external drive, be prepared for that search to be very slow as nr is a big database.


Thank you! This was exactly the problem. (And yes, your assumption is also correct: /mnt/f is an external drive. I downloaded the nr database there precisely because it was too large to fit on my C drive. Would you say that blast users generally make space in their C drives or use some other ways to avoid using an external drive?)


Generally speaking, external drives are slower, so it won't help your case with a big database. You already made a significant effort to download nr, but if bandwidth is not an issue, I suggest you try the UniRef90 database:

https://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz

It is a database clustered at 90% identity. It should make no significant difference to the results whether you search against nr or UniRef90, but the latter is less than half the size. If you decide to use it, you will have to create the indices manually with makeblastdb once you unpack the database.
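As a rough sketch of that last step, assuming the downloaded file is named uniref90.fasta.gz and that makeblastdb from BLAST+ is on your PATH: build_protein_db is a hypothetical wrapper name, and the MAKEBLASTDB variable is only an illustrative way to point at a specific binary.

```shell
# Sketch: unpack a gzipped protein FASTA file and index it for blastp.
# build_protein_db and the MAKEBLASTDB override are hypothetical conveniences.
build_protein_db() {
    fasta_gz="$1"                  # e.g. /mnt/f/blast/uniref90.fasta.gz
    fasta="${fasta_gz%.gz}"        # strip .gz     -> .../uniref90.fasta
    name="${fasta%.fasta}"         # strip .fasta  -> .../uniref90 (database base name)
    gunzip -c "$fasta_gz" > "$fasta" || return 1
    # -dbtype prot builds a protein database; -out sets the base name used with -db
    "${MAKEBLASTDB:-makeblastdb}" -in "$fasta" -dbtype prot \
        -title "$(basename "$name")" -out "$name"
}
```

After build_protein_db /mnt/f/blast/uniref90.fasta.gz finishes, the search would use -db /mnt/f/blast/uniref90, following the same directory-plus-base-name rule as for nr. Expect the indexing of a file this large to take a while and to need free disk space for both the unpacked FASTA file and the index files.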

"Would you say that blast users generally make space in their C drives or use some other ways to avoid using an external drive?"

I would say that most BLAST users don't have C drives, because I assume that most run it under Linux. A database of that size, especially if used frequently, should ideally be on the fastest disk available.


Ok, I'll try out the UniRef90 database. Thank you for your suggestions!
