Question

BLAST Database error: No alias or index file found for protein database [nr] in search path [D:\Test_Experiment;D:\Data\NCBI\db;]

3.9 years ago

Eric Wang ▴ 50

Dear All

I downloaded the nr database with the command wget 'ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz' and extracted the file "nr". When I run psiblast to obtain the PSSM, I get this error: BLAST Database error: No alias or index file found for protein database [nr] in search path [D:\Test_Experiment;D:\Data\NCBI\db;]

The command is:

psiblast -query 1_dp.fasta -db D:\Data\NCBI\NR\nr -out 1.txt -num_iterations 3 -out_ascii_pssm 1.pssm

I would appreciate any guidance on how to fix it.

Best regards, Dylan

SNP PSSM PSI-BLAST NCBI • 1.9k views

ADD COMMENT • link updated 3.9 years ago by GenoMax 141k • written 3.9 years ago by Eric Wang ▴ 50

score 0 · Answer 1 · 2020-05-26

3.9 years ago

GenoMax 141k

That is just the FASTA-format nr sequence file, not a BLAST database. To use nr with BLAST+, download all of the pre-formatted nr* database files that are in this folder, then un-tar them into one directory.
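
The error message means BLAST+ found neither an alias file nor an index file for the name given to -db. A minimal Python sketch of that check, assuming the usual protein database extensions (`.pal` for the alias of a multi-volume database, `.pin` for a single-volume index); the helper name `has_protein_db` is made up for illustration and is not part of BLAST+:

```python
from pathlib import Path


def has_protein_db(prefix: str) -> bool:
    """Return True if an alias (.pal) or index (.pin) file exists for prefix.

    `prefix` is the value that would be passed to -db, e.g. D:/Data/NCBI/db/nr.
    Hypothetical helper; it only mirrors the check implied by the error text.
    """
    # Append the extension to the whole prefix instead of using with_suffix,
    # so names that already contain a dot (e.g. "nr.00") are not truncated.
    return any(Path(prefix + ext).is_file() for ext in (".pal", ".pin"))
```

Calling `has_protein_db("D:/Data/NCBI/NR/nr")` on a directory that holds only the extracted FASTA file would return False, which matches the situation in the question.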

You could create the database from the sequence file you downloaded, but it may not work: formatting nr needs a significant amount of memory and time, and the actual search may turn out to need a lot of memory as well.
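
If you do want to try formatting the FASTA file yourself, the BLAST+ tool for that is makeblastdb. A small Python sketch that only builds and prints the command, assuming the extracted file is called "nr" and the database should get the same prefix; `makeblastdb_args` is a hypothetical helper name:

```python
import shlex


def makeblastdb_args(fasta: str, out_prefix: str) -> list[str]:
    """Build the argument list for formatting a protein FASTA file."""
    return [
        "makeblastdb",
        "-in", fasta,          # the extracted FASTA file, e.g. "nr"
        "-dbtype", "prot",     # nr holds protein sequences
        "-out", out_prefix,    # prefix later passed to psiblast -db
    ]


args = makeblastdb_args("nr", "nr")
print(shlex.join(args))
# To actually run it (needs BLAST+ on PATH, plus the memory and time noted above):
# import subprocess; subprocess.run(args, check=True)
```

The subprocess call is left commented out on purpose, since a run over the full nr file can take a long time.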

ADD COMMENT • link 3.9 years ago by GenoMax 141k

Thank you for your answer. Is there a way to calculate the PSSM on the web server? I tried "https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE=Proteins&PROGRAM=blastp&RUN_PSIBLAST=on" (program selection: Algorithm PSI-BLAST (Position-Specific Iterated BLAST)). However, I could not find the PSSM in the results. This is my first time using PSI-BLAST, so thank you for your help.

ADD REPLY • link 3.9 years ago by Eric Wang ▴ 50

From the BLAST help page:

To save a PSSM file:

1. Run a protein BLAST search.
2. Check the PSI-BLAST box on formatting page.
3. Click the "Format" Button.
4. On the PSI-BLAST results page, click the "Run PSI-BLAST Iteration 2" button.
5. Select the Download link at the top of the page and download the PSSM to your computer.

To use the PSSM in a new protein BLAST search against other databases:

1. Open a new protein BLAST page.
2. Select PSI-BLAST as the Algorithm under "Program Selection" (this may already be set).
3. Select the "+" next to "Algorithm parameters" at the bottom of the search page.
4. Scroll to the "PSI/PHI/DELTA BLAST" section and use the "Choose File" button to upload the PSSM that you saved in step 5 above.
5. Select a different target database.
6. Click "BLAST" button to start the search.

If the database is the same as when the PSSM was stored, you'll reproduce the iteration on which you saved the PSSM; a different database will yield a different hit list.

ADD REPLY • link 3.9 years ago by GenoMax 141k

I have obtained a pssm.asn file. Do you know how to convert it to a matrix? For example, I uploaded a 27-residue protein sequence as input to the PSI-BLAST server and got the pssm.asn file back. However, I don't understand the content of this file.

PssmWithParameters ::= { pssm {
numRows 28,
numColumns 27,
query seq {
  id {
    local str "Query_66026"
  },
  descr {
    user {
      type str "CFastaReader",
      data {
        {
          label str "DefLine",
          data str ">NP_000007.1"
        }
      }
    },
    title "NP_000007.1"
  },
  inst {
    repr raw,
    mol aa,
    length 27,
    seq-data ncbieaa "MAAGFGRCCRVLRSISRFHWRSQHTKA"
  }
},

I googled the PSSM and was told that the matrix has n (27) rows and 20 columns. The PSSM file generated by PSI-BLAST has 28 rows and n (27) columns. This confuses me a lot.

I really appreciate your answer. Thx.

ADD REPLY • link 3.9 years ago by Eric Wang ▴ 50
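
On the 28 x 27 shape: as far as I can tell, the ASN.1 PSSM keeps one row per residue code of NCBI's 28-letter ncbistdaa alphabet and one column per query position, so 28 x 27 would be the transpose of the usual n x 20 layout plus 8 non-standard codes. A rough Python sketch of turning a flat score list into the familiar layout, assuming the scores are stored column by column (which I believe is the default when byRow is not set) and that the rows follow the ncbistdaa letter order written in the code. `to_position_matrix` and the toy data are made up for illustration, and pulling the score list out of the .asn file is not shown:

```python
NCBISTDAA = "-ABCDEFGHIKLMNPQRSTVWXYZU*OJ"   # assumed 28-letter row order
STANDARD_20 = "ARNDCQEGHILKMFPSTWYV"         # column order used in ASCII PSSMs


def to_position_matrix(scores, num_rows, num_columns):
    """Reshape column-major scores into one row of 20 scores per query position."""
    if len(scores) != num_rows * num_columns:
        raise ValueError("score list does not match numRows * numColumns")
    if num_rows != len(NCBISTDAA):
        raise ValueError("expected one row per ncbistdaa letter")
    matrix = []
    for pos in range(num_columns):
        # All num_rows scores for one query position sit next to each other.
        column = scores[pos * num_rows:(pos + 1) * num_rows]
        matrix.append([column[NCBISTDAA.index(aa)] for aa in STANDARD_20])
    return matrix


# Toy data: 2 query positions, score = position * 100 + residue code.
toy = [pos * 100 + code for pos in range(2) for code in range(28)]
pssm = to_position_matrix(toy, num_rows=28, num_columns=2)
print(len(pssm), len(pssm[0]))   # 2 20
print(pssm[1][0])                # 101: code 1 ('A') at the second position
```

With numRows 28 and numColumns 27 from the file above, the same function would give 27 rows of 20 scores, provided the storage assumptions hold.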