Blast Help On Nucleotide Collection Nr/Nt
12.6 years ago
Matt ▴ 70

I am running a local BLAST server. I can format and search my own databases, but I am unsure how to set up the "Nucleotide collection nr/nt" database from this NCBI BLAST URL.

Can I just download a preformatted database and use the update script? Which database is it? Is it simply the nr and nt databases combined? Isn't blastn used for the nt database and blastp for the nr database? Can I BLAST against both at the same time? If so, how?

Also, downloading nr gives me two files, nr.00.tar.gz and nr.01.tar.gz. Is this right? How can I set things up to search just "nr" rather than "nr.00 nr.01"?

I have been using the following commands:

blastp -word_size 7 -evalue 10 -query test.fasta -db "nr.00 nr.01"

and

blastn -word_size 11 -evalue 10 -query test.fasta -db nt.00

Thank you for your help!

Matt

blast ncbi fasta alignment blast • 33k views

Thanks, I was wondering too!

12.6 years ago
Neilfws 49k

I think your confusion stems from the use of the term "Nucleotide collection nr/nt", on the BLAST page to which you linked.

On that web page, "nr/nt" stands for "non-redundant nucleotide." However, as you point out, NCBI also makes separate databases available for download. Among those, "nr" is non-redundant protein and "nt" is non-redundant nucleotide.

Yes: you would run blastn against nt and blastp against nr. No: you cannot BLAST both "at the same time." You need to choose an appropriate combination of BLAST program and database. For example, you can search nucleotide queries against the protein database by using blastx, which first translates each query in all 6 reading frames.
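
The pairing of program and database can be summarised as a small lookup. This is only a sketch: the function name pick_blast_program and the "nucl"/"prot" labels are made up for illustration and are not part of BLAST. The program names themselves (blastn, blastp, blastx, tblastn) are the standard ones.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BLAST): choose a BLAST program from the
# query type and the database type, each given as "nucl" or "prot".
pick_blast_program() {
    local query_type="$1" db_type="$2"
    case "$query_type:$db_type" in
        nucl:nucl) echo blastn ;;   # nucleotide query vs. nucleotide db (e.g. nt)
        prot:prot) echo blastp ;;   # protein query vs. protein db (e.g. nr)
        nucl:prot) echo blastx ;;   # query translated in 6 frames vs. protein db
        prot:nucl) echo tblastn ;;  # protein query vs. db translated in 6 frames
        *)
            echo "unknown combination: $query_type/$db_type" >&2
            return 1
            ;;
    esac
}
```

For example, pick_blast_program nucl prot prints blastx, which matches the nucleotide-query-against-nr case described above.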

The 2 files nr.00 and nr.01 simply mean that the database has been split into two parts because it is very large. Older BLAST versions used an additional index file, which used to be called "nr.pal" and may still be called that. Provided that 00, 01 and the index file all reside in the same location, local BLAST will "stitch" the 2 parts together in the background and you just specify "nr" as the database. I have not upgraded to BLAST+ myself, so it may be that the index file is no longer required.
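
Before searching against plain "nr", it can help to confirm that all the volumes and the index (alias) file really are in one directory. The sketch below assumes the usual file naming (nr.00.psq, nr.01.psq, ... plus nr.pal for protein; nt.00.nsq, ... plus nt.nal for nucleotide). The function name check_blast_db is hypothetical, not a BLAST tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BLAST): report how many volumes of a
# database are present in a directory and whether the alias file that ties
# them together is there. File extensions are an assumption based on the
# usual naming of preformatted databases.
check_blast_db() {
    local dir="$1" name="$2" seq_ext alias_ext
    case "$3" in
        prot) seq_ext=psq; alias_ext=pal ;;
        nucl) seq_ext=nsq; alias_ext=nal ;;
        *) echo "type must be prot or nucl" >&2; return 2 ;;
    esac

    # Count the sequence files, one per volume (nr.00.psq, nr.01.psq, ...).
    local volumes=0 f
    for f in "$dir/$name".[0-9]*."$seq_ext"; do
        if [ -e "$f" ]; then
            volumes=$((volumes + 1))
        fi
    done
    echo "$name: $volumes volume(s) found"

    # The alias file is what lets "-db nr" stand for all the volumes.
    if [ -e "$dir/$name.$alias_ext" ]; then
        echo "alias file $name.$alias_ext present"
    else
        echo "alias file $name.$alias_ext missing" >&2
        return 1
    fi
}
```

If check_blast_db /path/to/db nr prot succeeds, the search from the question would be run from that directory as: blastp -word_size 7 -evalue 10 -query test.fasta -db nr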


Thanks for the help!


So I have the same issue, except that the nt database now comes in 27 parts. I downloaded all of them but cannot extract the rest because there is no disk space left. I extracted the nt.00 file first, and that contained an nt.pal file. Is that all I need?

Am I required to download ALL the nt files? I don't see how that is possible given the space requirements.


I have the same issue. nt is now in 31 parts. What should I do?


You'll need ~ 34.2 GB for the current nt database (once extracted from the .tar.gz files). If you don't have that, you can't run it locally.
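
A quick way to check this before extracting anything is to compare the free space reported by df with the size you need. This is a rough sketch: the function name has_free_space is made up, and the 34.2 GB figure is simply the one quoted above, so substitute the current size of the database.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the filesystem holding a directory has
# at least the requested number of kilobytes available.
has_free_space() {
    local dir="$1" required_kb="$2" available_kb
    # -P forces the portable one-line-per-filesystem layout, -k reports in KB;
    # column 4 of the second line is the available space.
    available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge "$required_kb" ]
}

# 34.2 GB expressed in KB (taking 1 GB = 1024 * 1024 KB). The figure is the
# one quoted in this thread, not a current measurement.
required_kb=$(awk 'BEGIN { printf "%d", 34.2 * 1024 * 1024 }')

if has_free_space . "$required_kb"; then
    echo "enough free space in the current directory"
else
    echo "not enough free space in the current directory"
fi
```

Bear in mind that the downloaded .tar.gz archives take up space as well until they are deleted.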

12.6 years ago
Digiomics ▴ 170

Actually, the "nr" database currently has 6 parts, so it should be nr.00 to nr.05. If you have trouble using the update script, you can also download preformatted BLAST databases from the NCBI FTP server.
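
As a rough sketch of the manual route, the snippet below just prints the archive URLs for a database split into a given number of parts. It assumes the two-digit numbering seen in this thread (nr.00, nr.01, ...) and the ftp://ftp.ncbi.nlm.nih.gov/blast/db location. The function name list_volume_urls is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the download URL of every volume archive of a
# preformatted database, given its name and number of volumes.
# Assumes two-digit volume numbers starting at 00.
BLAST_DB_URL="ftp://ftp.ncbi.nlm.nih.gov/blast/db"

list_volume_urls() {
    local name="$1" count="$2" i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s/%s.%02d.tar.gz\n' "$BLAST_DB_URL" "$name" "$i"
        i=$((i + 1))
    done
}

# The six parts of nr mentioned above: nr.00.tar.gz through nr.05.tar.gz.
list_volume_urls nr 6
```

Each URL can then be fetched with a downloader such as wget or curl and unpacked with tar -xzf into the same directory, so that all parts sit side by side.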


I will do that. Thank you!
