Question: blastdbcmd issue
6 months ago, Toto26 wrote:

Hi all, I'm currently trying to use the nr database from BLAST to add some taxonomic information to my BLAST output.

So I downloaded and uncompressed:

-taxcat.zip
-taxdump.tar.gz
-prot.accession2taxid.gz
-the nr database (huge file)

And I put all of them into one folder named blast_database.

Then I set my BLASTDB path to:

export BLASTDB=/pandata/me/LEPIWASP/blast_database

and when I try to generate the gi_to_des.tab database by running: blastdbcmd -entry 'all' -db nr > nr.faa

I get:

BLAST Database error: No alias or index file found for nucleotide database [nr]

Does anyone have an idea where my mistake is? The nr file is in the blast_database directory, so I do not understand the error. Here are the files inside my directory:

total 106892000
-rw-r--r-- 1   16783992 May 11 12:20 citations.dmp
-rw-r--r-- 1    3568599 May 11 12:20 delnodes.dmp
-rw-r--r-- 1     442 May 11 12:20 division.dmp
-rw-r--r-- 1    15188 May 11 12:20 gc.prt
-rw-r--r-- 1    4575 May 11 12:20 gencode.dmp
-rw-r--r-- 1    919089 May 11 12:20 merged.dmp
-rw-r--r-- 1    154534803 May 11 12:20 names.dmp
-rw-r--r-- 1    119658024 May 11 12:20 nodes.dmp
-rw-r--r-- 1    93133265049 May 10 19:38 nr
-rw-r--r-- 1      0 May 11 13:31 nr.faa
-rw-r--r-- 1    3766079372 May 11 13:31 prot.accession2taxid.gz
-rw-r--r-- 1      58 May 11 13:10 prot.accession2taxid.gz.md5
-rw-r----- 1    2652 Jun 13  2006 readme.txt
-rw-r--r-- 1   6766010 May 11 13:08 taxcat.zip
-rw-r--r-- 1    43086159 May 11 13:09 taxdump.tar.gz
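
For context on the error: a BLAST protein database is a set of index files (for example nr.00.phr, nr.00.pin and nr.00.psq, usually with an alias file nr.pal), not a single sequence file. A minimal sketch of a check for such files follows. The helper name has_blast_index is hypothetical and not part of BLAST+, and the example at the end assumes BLASTDB holds a single directory rather than a colon-separated list.

```shell
# Sketch only: has_blast_index is a hypothetical helper, not a BLAST+ command.
# It succeeds if directory $1 holds alias or index files for database $2.
has_blast_index() {
    hbi_dir=$1
    hbi_name=$2
    # Protein databases use .pal (alias) and .pin (index) files, nucleotide
    # ones use .nal and .nin. Multi-volume databases add a volume number,
    # e.g. nr.00.pin, so both "NAME.ext" and "NAME.*.ext" are tried.
    for hbi_ext in pal pin nal nin; do
        for hbi_file in "$hbi_dir/$hbi_name.$hbi_ext" "$hbi_dir/$hbi_name".*."$hbi_ext"; do
            if [ -e "$hbi_file" ]; then
                return 0
            fi
        done
    done
    return 1
}

# Example: check the directory that BLASTDB points to, if it is set.
# Assumption: BLASTDB names one directory.
if [ -n "${BLASTDB:-}" ]; then
    if has_blast_index "$BLASTDB" nr; then
        echo "BLAST index files for nr found in $BLASTDB"
    else
        echo "no BLAST index files for nr in $BLASTDB"
    fi
fi
```

With only a plain file named nr in the directory, as in the listing above, this check would report that no index files are present.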

Output requested, for grep "^>" nr | head -3:

>WP_003131952.1 30S ribosomal protein S18 [Lactococcus lactis]NP_268346.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis Il1403]Q9CDN0.1 RecName: Full=30S ribosomal protein S18Q02VU1.1 RecName: Full=30S ribosomal protein S18A2RNZ2.1 RecName: Full=30S ribosomal protein S18AAK06287.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis Il1403]ABJ73931.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris SK11]CAL99037.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris MG1363]ADA65983.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis KF147]ADJ61439.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris NZ9000]ADZ64834.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis CV56]EHE92602.1 hypothetical protein LLCRE1631_01913 [Lactococcus lactis subsp. lactis CNCM I-1631]AEU41715.1 SSU ribosomal protein S18p [Lactococcus lactis subsp. cremoris A76]BAL52156.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis IO-1]AFW92578.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris UC509.9]CDG05746.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis A12]EQC53187.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis str. TIFN4]EQC53393.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis str. TIFN2]EQC54683.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN6]EQC56744.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN5]EQC82878.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN7]EQC91162.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN1]EQC94448.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN3]AGV74185.1 ribosomal protein S18 RpsR [Lactococcus lactis subsp. cremoris KW2]AGY45032.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis KLDS 4.0325]ESK79551.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. 
diacetylactis str. LD61]KEY61992.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris GE214]AII13743.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis NCDO 2118]KGF77556.1 SSU ribosomal protein S18p SSU ribosomal protein S18p, zinc-independent [Lactococcus lactis]AIS04718.1 SSU ribosomal protein S18P [Lactococcus lactis]KGH32949.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris]KHE77803.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis 1AA59]KKW69436.1 ribosomal protein bS18, rpsR [Lactococcus lactis subsp. cremoris]KKW70341.1 ribosomal protein bS18, rpsR [Lactococcus lactis subsp. cremoris]KLK95226.1 ribosomal protein bS18, rpsR [Lactococcus lactis subsp. lactis]KRO21588.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]KST41693.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis]KST76534.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST79241.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST81638.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST85642.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST88531.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST92921.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST97154.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST98471.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST99285.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. 
lactis]KSU03686.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU05991.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU09388.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU13881.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU20925.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU23615.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU25349.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU27070.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU28321.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU32404.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KZK07251.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK08880.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis bv. diacetylactis]KZK09361.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK33282.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK44117.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK46962.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK52810.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. 
cremoris]KZK53814.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]OAJ97698.1 30S ribosomal protein S18 [Lactococcus lactis]OAZ16676.1 30S ribosomal protein S18 [Lactococcus lactis RTB018]SBW31684.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]OEU38668.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris IBB477]OJH46247.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis]ONK31551.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]ARD92294.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]ARD97280.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARD99957.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE04690.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE06709.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]ARE09571.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE12078.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE14468.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE16888.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE19344.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]ARE21948.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE24261.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]ARE27001.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]OSP86582.1 30S ribosomal protein S18 [Lactococcus lactis]ARR87601.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis]PAK66121.1 30S ribosomal protein S18 [Lactococcus lactis]PAK87984.1 30S ribosomal protein S18 [Lactococcus lactis]PAL02283.1 30S ribosomal protein S18 [Lactococcus lactis]PCS13431.1 30S ribosomal protein S18 [Lactococcus lactis subsp. hordniae]PCS17241.1 30S ribosomal protein S18 [Lactococcus lactis subsp. 
tructae]PEN18002.1 30S ribosomal protein S18 [Lactococcus lactis]PFG75654.1 30S ribosomal protein S18 [Lactococcus lactis]PFG79860.1 30S ribosomal protein S18 [Lactococcus lactis]PFG84386.1 30S ribosomal protein S18 [Lactococcus lactis]PFG87566.1 30S ribosomal protein S18 [Lactococcus lactis]PFG90835.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris]PFG90892.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]ATY88684.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]ATZ02303.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]PLW60021.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]AUS70574.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]PPA66113.1 30S ribosomal protein S18 [Lactococcus lactis]BBC75095.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris]
>XP_642131.1 hypothetical protein DDB_G0277827 [Dictyostelium discoideum AX4]P54670.1 RecName: Full=Calfumirin-1; Short=CAF-1BAA06266.1 calfumirin-1 [Dictyostelium discoideum AX2]EAL68086.1 hypothetical protein DDB_G0277827 [Dictyostelium discoideum AX4]
>XP_642837.1 hypothetical protein DDB_G0276911 [Dictyostelium discoideum AX4]EAL68957.1 hypothetical protein DDB_G0276911 [Dictyostelium discoideum AX4]
6 months ago, genomax wrote:

The nr database should include many files (nr.NN.*). Why do you have only one file called nr?

"I want to generate the gi_to_des.tab database by doing:"

What is that part of? Do you just need the FASTA sequences for the nr DB?


I downloaded the nr file from:

ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz

Is it the right one?

written 6 months ago by Toto26

That is not the BLAST index. It is the FASTA-format sequence file for nr. If that is what you need, then there is no need to use blastdbcmd.
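
One rough way to tell the two apart, sketched here under the assumption that a FASTA file always begins with a ">" header line: look at the first line of the file. The function name looks_like_fasta is made up for this example.

```shell
# Sketch only: looks_like_fasta is a hypothetical helper name.
# It succeeds if the first line of file $1 starts with ">".
looks_like_fasta() {
    lf_first=
    # read fails on an empty file and on a last line without a newline;
    # in the second case lf_first still holds the text that was read.
    IFS= read -r lf_first < "$1" || [ -n "$lf_first" ] || return 1
    case $lf_first in
        '>'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Usage:
#   looks_like_fasta nr && echo "nr is a FASTA file, not a BLAST index"
```

Only the first line is read, so this stays fast even on a file as large as nr.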

written 6 months ago by genomax

In fact, here is the tutorial for what I need to do:

Do the following:

    wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.gz.md5
    wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.gz
    md5sum -c prot.accession2taxid.gz.md5
    gunzip prot.accession2taxid.gz

    wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxcat.zip
    unzip taxcat.zip

    wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz
    tar -zxvf taxdump.tar.gz

    export BLASTDB=/path_to/blast/ncbi/

To generate the gi_to_des.tab database:

    blastdbcmd -entry 'all' -db nr > nr.faa

Or download the nr database from the following location, along with the other 64-ish archives: ftp://ftp.ncbi.nih.gov/blast/db/nr.00.tar.gz
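
The checksum step in that tutorial can be wrapped so the archive is only unpacked when the check passes. This is a sketch, not part of the tutorial: verify_md5 and fetch_accession2taxid are hypothetical names, and the download function is defined but never called here because the archive is several gigabytes.

```shell
# Sketch only: both function names are hypothetical.
# verify_md5 FILE succeeds if FILE matches the checksum stored in FILE.md5.
# The .md5 file is expected to hold "<checksum>  <filename>", which is the
# format that md5sum -c reads.
verify_md5() {
    ( cd "$(dirname "$1")" && md5sum -c "$(basename "$1").md5" >/dev/null 2>&1 )
}

# Download the protein accession-to-taxid mapping and unpack it only if the
# checksum matches. Defined but not called: the file is very large.
fetch_accession2taxid() {
    fa_base=ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid
    wget "$fa_base/prot.accession2taxid.gz.md5" &&
    wget "$fa_base/prot.accession2taxid.gz" &&
    verify_md5 prot.accession2taxid.gz &&
    gunzip prot.accession2taxid.gz
}
```

Running md5sum from inside the file's own directory keeps the relative file name in the .md5 file valid.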

So I do not need to do all these things if I downloaded the huge nr file?

written 6 months ago by Toto26

I would think so.

That said, if you want to follow the tutorial exactly, then you should download all the nr.*.tar.gz files from the db directory (ftp://ftp.ncbi.nih.gov/blast/db).
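
A small sketch of how the list of volume URLs could be generated. It assumes the two-digit volume numbering seen in the nr.00.tar.gz URL quoted in this thread, and the highest volume number has to be looked up on the server; nr_volume_urls is a hypothetical name.

```shell
# Sketch only: nr_volume_urls is a hypothetical helper name.
# Prints one URL per volume, from nr.00.tar.gz up to volume number $1.
# Assumption: volumes use a two-digit, zero-padded number.
nr_volume_urls() {
    nv_i=0
    while [ "$nv_i" -le "$1" ]; do
        printf 'ftp://ftp.ncbi.nih.gov/blast/db/nr.%02d.tar.gz\n' "$nv_i"
        nv_i=$((nv_i + 1))
    done
}

# Usage, where LAST is the highest volume number present on the server:
#   nr_volume_urls "$LAST" | xargs -n 1 wget
```

BLAST+ also ships an update_blastdb.pl script that is meant for this kind of download, which may be simpler than fetching each archive by hand.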

Can you post the output of grep "^>" nr | head -3? I want to compare what the headers look like in that fasta file with nr blast index.

written 6 months ago by genomax

Yep, sure. I added it to my first post.

written 6 months ago by Toto26

That looks identical to what I get from blastdbcmd -entry 'all' -db nr. So you should be good to go on to the next step in your workflow. You may want to rename nr to nr.faa if that file name is expected.

If whatever you are trying to do needs the nr BLAST indexes, then you would need to download them from the link in one of the comments above.
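
If the next step needs an accession-to-description table, something like the following sketch could build one straight from the FASTA headers. Two assumptions: merged entries within one header are separated by a Ctrl-A character (\001), which is how NCBI normally writes the nr FASTA file but is invisible in the pasted output, and the two-column layout is only a guess because the format of gi_to_des.tab is not given in this thread. headers_to_table is a hypothetical name.

```shell
# Sketch only: headers_to_table is a hypothetical helper name.
# Prints "accession<TAB>description" for every entry in each header line.
# Assumption: merged entries in a header are separated by Ctrl-A (\001).
headers_to_table() {
    awk '/^>/ {
        line = substr($0, 2)                  # drop the leading ">"
        n = split(line, entries, "\001")      # one element per merged entry
        for (i = 1; i <= n; i++) {
            sp = index(entries[i], " ")       # accession ends at first space
            if (sp == 0) {
                print entries[i] "\t"         # header with no description
            } else {
                print substr(entries[i], 1, sp - 1) "\t" substr(entries[i], sp + 1)
            }
        }
    }' "$1"
}

# Usage:
#   headers_to_table nr.faa > accession_to_description.tab
```

Sequence lines are skipped, so the output has one row per accession rather than one row per sequence.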

written 6 months ago by genomax