Trouble With Local Psiblast
4
1
8.2 years ago

Hello everyone,

I need to generate a PSSM (position-specific scoring matrix) for a protein sequence in FASTA format by searching it against a protein database. I have downloaded and installed the latest version of BLAST+ and added its bin sub-directory to the PATH environment variable, so I can run the programs from any directory. I then downloaded all 13 files of the nr database from the FTP site and extracted the nr.pal file from nr.00.tar.gz. So the 13 nr.##.tar.gz archives, the nr.pal file and the protein file in FASTA format are all in the same directory.

I then run this command from the command line:

psiblast -query prot1.fasta -db nr -num_iterations 3 -out_ascii_pssm ascii_pssm.txt

It fails with this error:

BLAST Database error: Could not find volume or alias file (nr.00) referenced in alias file (F:\workspace\Protein_Secondary_Structure_Prediction_Multi_Class_SVM\testProteins\nr).

blast+ pssm • 4.2k views
ADD COMMENT
1
8.2 years ago
SRKR ▴ 180

I don't see any mistake in your command.

Check the archives you downloaded (nr.##.tar.gz) against the MD5 checksums that NCBI provides on the FTP site.

Make sure you have extracted the nr database without any errors.
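
The checksum comparison can be scripted so that it also works on Windows. This is only a minimal sketch: it assumes each archive has a companion file named like nr.00.tar.gz.md5 in the same directory, and that this file starts with the hex digest in the usual md5sum layout. The function names `md5_of` and `check_archives` are made up for this example.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_archives(directory, pattern="nr.*.tar.gz"):
    """Compare each archive with its companion .md5 file; return {file name: status}."""
    results = {}
    for archive in sorted(Path(directory).glob(pattern)):
        md5_file = archive.with_name(archive.name + ".md5")
        if not md5_file.exists():
            results[archive.name] = "no .md5 file"
            continue
        # Assumption: the first whitespace-separated token is the hex digest.
        tokens = md5_file.read_text().split()
        if not tokens:
            results[archive.name] = "empty .md5 file"
            continue
        expected = tokens[0].lower()
        results[archive.name] = "ok" if md5_of(archive) == expected else "MISMATCH"
    return results


if __name__ == "__main__":
    # Report the status of every nr archive in the current directory.
    for name, status in check_archives(".").items():
        print(f"{name}: {status}")
```

Any archive reported as MISMATCH was probably truncated or corrupted during download and is worth fetching again.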

ADD COMMENT
0

You are right. The documentation mentions that all the nr.##.tar.gz packages must have the same timestamp. I downloaded the first 6 files last week and the remaining files yesterday, and I see that the entire database has been updated within this week. If I run update_blastdb, will it download only the changes, or will it download the whole database again? Since I am facing a lot of problems with nr, can I run the search against the swissprot database and still expect good results?

ADD REPLY
0

If the existing database is not the latest version, update_blastdb will download the whole database all over again. Yes, you can get good results by running BLAST against the SwissProt database, but it has fewer entries than nr. I also don't know whether you can download all the SwissProt protein sequences in FASTA format and convert them into a BLAST database with makeblastdb. If that is possible it would be a good option, since SwissProt is curated, whereas with nr we do get a lot of hypothetical and putative proteins.

ADD REPLY
0

The 'swissprot' BLAST database provided by NCBI (ftp://ftp.ncbi.nih.gov/blast/db/) requires that 'nr' is installed (it is implemented as a mask over the 'nr' database).

UniProt provides FASTA format files of all its databases (see http://www.uniprot.org/downloads), which can be formatted for use with NCBI BLAST+ using the 'makeblastdb' program. Alternative FASTA format files for the UniProt databases, which use a different header format, can also be found on the EMBL-EBI FTP site: ftp://ftp.ebi.ac.uk/pub/databases/fastafiles
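
As a rough illustration of that route, the sketch below builds the two command lines involved: one to format a downloaded UniProt FASTA file with makeblastdb, and one to run the psiblast search from the question against the result. The file name uniprot_sprot.fasta, the database name uniprot_sprot and the helper function names are placeholders invented for this example. Only the standard makeblastdb options -in, -dbtype, -out and -title are used, plus the psiblast options already shown in the question.

```python
import shutil
import subprocess


def makeblastdb_command(fasta, name, title=None):
    """Build the argument list that formats a protein FASTA file as a BLAST database."""
    command = ["makeblastdb", "-in", fasta, "-dbtype", "prot", "-out", name]
    if title:
        command += ["-title", title]
    return command


def psiblast_command(query, db, iterations=3, pssm_out="ascii_pssm.txt"):
    """Build the psiblast argument list used in the question, for any database name."""
    return [
        "psiblast", "-query", query, "-db", db,
        "-num_iterations", str(iterations),
        "-out_ascii_pssm", pssm_out,
    ]


def run(command):
    """Run a command if its executable is on PATH; raise if it is missing or fails."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH")
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file and database names; adjust them to the actual download.
    build = makeblastdb_command("uniprot_sprot.fasta", "uniprot_sprot", "Swiss-Prot")
    search = psiblast_command("prot1.fasta", "uniprot_sprot")
    print(" ".join(build))
    print(" ".join(search))
    # To execute them, call run(build) and then run(search).
```

Passing the arguments as a list rather than one string avoids quoting problems with Windows paths that contain spaces.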

ADD REPLY
1
8.2 years ago
Hamish ★ 3.2k

From the error message and the description of how the database was unpacked, my guess is that you have not extracted the database volumes from the archive files. It sounds like you have:

nr.pal
nr.00.tar.gz
nr.01.tar.gz
nr.02.tar.gz
nr.03.tar.gz
nr.04.tar.gz
nr.05.tar.gz
nr.06.tar.gz
nr.07.tar.gz
nr.08.tar.gz
nr.09.tar.gz
nr.10.tar.gz
nr.11.tar.gz
nr.12.tar.gz

in your database directory, but you should have something like:

nr.pal
nr.00.phd nr.00.phi nr.00.phr nr.00.pin nr.00.pnd nr.00.pni
nr.00.pog nr.00.ppd nr.00.ppi nr.00.psd nr.00.psi nr.00.psq
nr.01.phd nr.01.phi nr.01.phr nr.01.pin nr.01.pnd nr.01.pni
nr.01.pog nr.01.ppd nr.01.ppi nr.01.psd nr.01.psi nr.01.psq
nr.02.phd nr.02.phi nr.02.phr nr.02.pin nr.02.pnd nr.02.pni
nr.02.pog nr.02.ppd nr.02.ppi nr.02.psd nr.02.psi nr.02.psq
nr.03.phd nr.03.phi nr.03.phr nr.03.pin nr.03.pnd nr.03.pni
nr.03.pog nr.03.ppd nr.03.ppi nr.03.psd nr.03.psi nr.03.psq
nr.04.phd nr.04.phi nr.04.phr nr.04.pin nr.04.pnd nr.04.pni
nr.04.pog nr.04.ppd nr.04.ppi nr.04.psd nr.04.psi nr.04.psq
nr.05.phd nr.05.phi nr.05.phr nr.05.pin nr.05.pnd nr.05.pni
nr.05.pog nr.05.ppd nr.05.ppi nr.05.psd nr.05.psi nr.05.psq
nr.06.phd nr.06.phi nr.06.phr nr.06.pin nr.06.pnd nr.06.pni
nr.06.pog nr.06.ppd nr.06.ppi nr.06.psd nr.06.psi nr.06.psq
nr.07.phd nr.07.phi nr.07.phr nr.07.pin nr.07.pnd nr.07.pni
nr.07.pog nr.07.ppd nr.07.ppi nr.07.psd nr.07.psi nr.07.psq
nr.08.phd nr.08.phi nr.08.phr nr.08.pin nr.08.pnd nr.08.pni
nr.08.pog nr.08.ppd nr.08.ppi nr.08.psd nr.08.psi nr.08.psq
nr.09.phd nr.09.phi nr.09.phr nr.09.pin nr.09.pnd nr.09.pni
nr.09.pog nr.09.ppd nr.09.ppi nr.09.psd nr.09.psi nr.09.psq
nr.10.phd nr.10.phi nr.10.phr nr.10.pin nr.10.pnd nr.10.pni
nr.10.pog nr.10.ppd nr.10.ppi nr.10.psd nr.10.psi nr.10.psq
nr.11.phd nr.11.phi nr.11.phr nr.11.pin nr.11.pnd nr.11.pni
nr.11.pog nr.11.ppd nr.11.ppi nr.11.psd nr.11.psi nr.11.psq
nr.12.phd nr.12.phi nr.12.phr nr.12.pin nr.12.pnd nr.12.pni
nr.12.pog nr.12.ppd nr.12.ppi nr.12.psd nr.12.psi nr.12.psq

If this is the case then you need to unpack each of the archives (nr.*.tar.gz). On Linux/UNIX systems I would suggest using a simple loop with 'tar', for example:

for archive in nr.*.tar.gz; do tar xzvf "$archive"; done

However, since you appear to be on Windows, I am not sure what the best option for unpacking the archives would be.
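
One option that does not depend on the operating system is Python's standard tarfile module. The following is a minimal sketch, assuming Python 3 is installed and that the archives were downloaded from NCBI and are therefore trusted (extractall should not be used on archives from unknown sources). The function name `unpack_archives` is invented for this example.

```python
import tarfile
from pathlib import Path


def unpack_archives(directory, pattern="nr.*.tar.gz"):
    """Extract every matching archive into `directory`; return the extracted member names."""
    directory = Path(directory)
    extracted = []
    for archive in sorted(directory.glob(pattern)):
        # "r:gz" opens a gzip-compressed tar archive for reading.
        with tarfile.open(archive, "r:gz") as tar:
            extracted.extend(tar.getnames())
            tar.extractall(path=directory)
    return extracted


if __name__ == "__main__":
    # Unpack all nr archives found in the current directory.
    for name in unpack_archives("."):
        print(name)
```

Run from the database directory, this should leave the volume files (nr.00.phr, nr.00.pin, nr.00.psq and so on) next to nr.pal.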

ADD COMMENT
0
8.2 years ago
sanchezcavani ▴ 220

Have you tried blastpgp from the blastall package?

ADD COMMENT
0
8.1 years ago

Have you moved your database files to a different directory? Once, I had to edit my .pal file, so that all paths pointed to the current directory.
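
A quick way to see which entries of the alias file cannot be resolved is sketched below. It rests on a few assumptions that should be checked against the actual nr.pal: that the volumes are listed on a line starting with DBLIST, that entries may be bare or double-quoted but contain no spaces, and that relative entries are resolved against the directory holding the alias file. The function name `missing_volumes` is invented for this example.

```python
from pathlib import Path


def missing_volumes(alias_path, index_ext=".pin"):
    """Return the DBLIST entries of a BLAST alias file whose index file does not exist."""
    alias_path = Path(alias_path)
    volumes = []
    for line in alias_path.read_text().splitlines():
        if line.startswith("DBLIST"):
            # Assumption: entries are whitespace-separated and optionally double-quoted.
            volumes.extend(entry.strip('"') for entry in line.split()[1:])
    missing = []
    for volume in volumes:
        # Relative entries are looked up next to the alias file; absolute ones are kept as-is.
        candidate = alias_path.parent / (volume + index_ext)
        if not candidate.exists():
            missing.append(volume)
    return missing


if __name__ == "__main__":
    alias = Path("nr.pal")
    if alias.exists():
        for volume in missing_volumes(alias):
            print(f"missing: {volume}")
```

If every volume is reported as missing, the archives have most likely not been unpacked yet, or the paths in the alias file point to another directory.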

ADD COMMENT
