BLAST Database error: No alias or index file found for protein database [nr] in search path [D:\Test_Experiment;D:\Data\NCBI\db;]
17 months ago
Eric Wang ▴ 40

Dear All

I downloaded the nr database with the command wget 'ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz' and extracted the file "nr". When I run psiblast to obtain the PSSM, I get this error: BLAST Database error: No alias or index file found for protein database [nr] in search path [D:\Test_Experiment;D:\Data\NCBI\db;]

The command is:

psiblast -query 1_dp.fasta -db D:\Data\NCBI\NR\nr -out 1.txt -num_iterations 3 -out_ascii_pssm 1.pssm


I would appreciate any guidance on how to fix it.

Best Regards Dylan

SNP PSSM PSI-BLAST NCBI • 821 views
17 months ago
GenoMax 108k

That is just the nr sequence data in FASTA format, not a BLAST database. You need to download all of the pre-formatted nr* database files that are in this folder to use them with BLAST+. Un-tar the files into a directory after you download them.
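
As a quick sanity check, here is a minimal sketch (not part of BLAST+ itself). The error message refers to alias and index files, which for a formatted protein database normally carry the extensions .pal (alias, used for multi-volume databases such as nr) and .pin (index). The helper name has_protein_blast_db is hypothetical; it only tests whether one of those files sits next to the path you pass to -db.

```python
from pathlib import Path


def has_protein_blast_db(db_path):
    """Return True if db_path looks like a formatted protein BLAST database.

    db_path is the value you would pass to -db, without any extension,
    for example D:/Data/NCBI/NR/nr.
    """
    # .pal = alias file (multi-volume database), .pin = index file (single volume)
    return any(Path(str(db_path) + ext).is_file() for ext in (".pal", ".pin"))
```

If this returns False for your path, the directory holds only the raw FASTA file (or nothing named nr.*), which matches the error above.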

You could create the database yourself from the sequence file you downloaded, but that may not work: formatting nr needs a significant amount of memory and time (and the same may turn out to be true for the actual search).
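
If you do want to try building it yourself, the sketch below assembles a makeblastdb call. The flags -in, -dbtype prot and -out are standard makeblastdb options; the function names and the example paths are only illustrative assumptions, and nothing is executed unless you call run_makeblastdb explicitly.

```python
import shutil
import subprocess


def makeblastdb_command(fasta_path, db_name):
    """Build the argument list for formatting a protein FASTA file."""
    return [
        "makeblastdb",
        "-in", str(fasta_path),   # the extracted FASTA file, e.g. "nr"
        "-dbtype", "prot",        # nr is a protein database
        "-out", str(db_name),     # base name of the files that get written
    ]


def run_makeblastdb(fasta_path, db_name):
    """Run makeblastdb; raises if the program is missing or exits with an error."""
    if shutil.which("makeblastdb") is None:
        raise FileNotFoundError("makeblastdb is not on PATH")
    subprocess.run(makeblastdb_command(fasta_path, db_name), check=True)
```

For example, makeblastdb_command(r"D:\Data\NCBI\NR\nr", r"D:\Data\NCBI\NR\nr") gives the command that would write the database files next to the FASTA file, so that -db D:\Data\NCBI\NR\nr can find them.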


Thank you for your answer. Is there a way to calculate the PSSM on the web server? I tried "https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE=Proteins&PROGRAM=blastp&RUN_PSIBLAST=on" (program selection: Algorithm PSI-BLAST (Position-Specific Iterated BLAST)). However, I could not find the PSSM in the results. This is my first time using PSI-BLAST, so thank you for your help.


From the BLAST help page:

To save a PSSM file:

1. Run a protein BLAST search.
2. Check the PSI-BLAST box on formatting page.
3. Click the "Format" Button.
4. On the PSI-BLAST results page, click the "Run PSI-BLAST Iteration 2" button.


To use the PSSM in a new protein BLAST search against other databases:

1. Open a new protein BLAST page.
2. Select PSI-BLAST as the Algorithm under "Program Selection" (this may already be set).
3. Select the "+" next to "Algorithm parameters" at the bottom of the search page.
4. Scroll to the "PSI/PHI/DELTA BLAST" section and use the "Choose File" button to upload the PSSM that you saved in step 5 above.
5. Select a different target database.
6. Click "BLAST" button to start the search.


If the database is the same as when the PSSM was stored, you'll reproduce the iteration on which you saved the PSSM; a different database will yield a different hit list.


I have obtained a pssm.asn file. Do you know how to convert it to a matrix? For example, I uploaded a protein sequence of length 27 as the query to the PSI-BLAST server and got the pssm.asn file back. However, I don't know how to interpret the content of this file.

PssmWithParameters ::= {
  pssm {
    numRows 28,
    numColumns 27,
    query seq {
      id {
        local str "Query_66026"
      },
      descr {
        user {
          data {
            {
              label str "DefLine",
              data str ">NP_000007.1"
            }
          }
        },
        title "NP_000007.1"
      },
      inst {
        repr raw,
        mol aa,
        length 27,
        seq-data ncbieaa "MAAGFGRCCRVLRSISRFHWRSQHTKA"
      }
    },


I googled the PSSM and was told that the matrix contains n (27) rows and 20 columns. The PSSM file generated by PSI-BLAST, however, contains 28 rows and n (27) columns. This confuses me a lot.
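
A likely explanation, offered as a hedged sketch rather than a definitive answer: in this ASN.1 format numRows is the alphabet size (the 28 letters of the NCBIstdaa encoding, which also covers B, Z, X, U, O, J, the gap and the stop symbol) and numColumns is the query length. The usual n x 20 matrix is then just the 20 standard amino acid rows, transposed. The code below assumes you have already pulled the flat integer list out of the scores block of the file (not every saved file contains one) and that it is stored column by column, which is the case when byRow is FALSE or absent. The function name and the 20-letter column order are my own choices for illustration.

```python
# The 28-letter NCBIstdaa alphabet; the index in this string is the row number.
NCBISTDAA = "-ABCDEFGHIKLMNPQRSTVWXYZU*OJ"
# Column order commonly used in ASCII PSSM files (chosen here for illustration).
STANDARD_20 = "ARNDCQEGHILKMFPSTWYV"


def to_position_by_residue(scores, num_rows=28, num_columns=27, by_row=False):
    """Turn the flat score list into a num_columns x 20 matrix.

    scores      flat list of num_rows * num_columns integers
    by_row      False: values are stored one query position (column) at a time
                True:  values are stored one alphabet letter (row) at a time
    """
    if len(scores) != num_rows * num_columns:
        raise ValueError("score list does not match numRows * numColumns")
    matrix = []
    for pos in range(num_columns):            # one output row per query residue
        row = []
        for aa in STANDARD_20:                # one output column per amino acid
            r = NCBISTDAA.index(aa)
            idx = r * num_columns + pos if by_row else pos * num_rows + r
            row.append(scores[idx])
        matrix.append(row)
    return matrix
```

With numRows 28 and numColumns 27 as in the file above, the result has 27 rows (one per residue of the query) and 20 columns, which is the shape described elsewhere.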