Question: Using nr database for BLAST search
3.2 years ago, cookm346 wrote:

I have downloaded the nr database from ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz and have extracted the files using 7-zip. When I do this I only get one file "nr".

Yet, when I download only part of the database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr.00.tar.gz) I can extract the .gz file, then the .tar file, and I get a whole bunch of files (e.g., nr.00.phd, nr.00.phi, etc.) and have been able to successfully BLAST against this database.

I am not sure what to do with the single "nr" file that I get when downloading the entire database. Running a BLAST search with a command such as "blastp -query infile.fasta -db db/nr -out outfile.txt -num_alignments 1" does not work, but "blastp -query infile.fasta -db db/nr.00 -out outfile.txt -num_alignments 1" works perfectly. With the full nr database I get an error saying "No Alias or Index File found for protein database [db/nr]".

Thank you.

blastp blast nr database • 11k views
modified 3.0 years ago by Biostar • written 3.2 years ago by cookm346

nr stands for non-redundant.

It is the whole database.

modified 3.2 years ago • written 3.2 years ago by natasha.sernova

The comment after this one was helpful (it appears to have been removed) so I am going to copy it here:

"This 'nr' file is in fasta format, right? You probably just need to run 'makeblastdb' first. Something like this should work: 'makeblastdb -dbtype prot -in nr'."

written 3.2 years ago by cookm346
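For reference, the makeblastdb step quoted above could be wrapped roughly like this. It is only a sketch: the helper name build_protein_db and the db/nr path are assumptions for illustration, and it presumes the BLAST+ tools are on your PATH and that the file is the decompressed FASTA.

```shell
# Sketch: index a protein FASTA file so that blastp can search it.
# build_protein_db is a hypothetical helper name, not part of BLAST+.
build_protein_db() {
    fasta="$1"   # decompressed FASTA file, e.g. db/nr

    if [ ! -f "$fasta" ]; then
        echo "FASTA file not found: $fasta" >&2
        return 1
    fi

    # -dbtype prot : the sequences are proteins
    # -in          : input FASTA file
    # -out         : basename used for the index files
    makeblastdb -dbtype prot -in "$fasta" -out "$fasta"
}
```

Calling build_protein_db db/nr and then searching with -db db/nr should use the newly built indexes. Expect the build to take a long time on the full nr file.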

It may take a while to build the indexes for the nr database yourself. Just get the pre-made indexes from ftp://ftp.ncbi.nih.gov/blast/db/. Download all the nr*.tar.gz files and unarchive them in the same folder. The search then only needs the basename of the database, which is nr.

modified 3.2 years ago • written 3.2 years ago by genomax
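The unarchiving step could look something like the sketch below. It assumes the nr.*.tar.gz archives have already been downloaded into a single folder. The helper name extract_volumes and the folder name db are made up for illustration.

```shell
# Sketch: unpack every downloaded nr volume archive into one folder.
# extract_volumes is a hypothetical helper name.
extract_volumes() {
    dbdir="$1"   # folder holding nr.00.tar.gz, nr.01.tar.gz, ...

    for archive in "$dbdir"/nr.*.tar.gz; do
        # Skip the unexpanded glob pattern when no archive is present
        [ -e "$archive" ] || continue
        # Extract the index files next to the archives
        tar -xzf "$archive" -C "$dbdir"
    done
}
```

After extract_volumes db, a search such as blastp -query infile.fasta -db db/nr -out outfile.txt -num_alignments 1 refers to the database by its basename only, as described above.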

See this post, for example:

A: blastn execution error, the correct command line format

or this one:

nr- protein database

You can also search biostars.org for makeblastdb; there are many posts about this command.

To do so, press the LATEST button in the upper left corner and type 'makeblastdb' in the empty field in the middle labelled 'Live search: start typing...'.

modified 3.2 years ago • written 3.2 years ago by natasha.sernova