Question: nr protein database
3.3 years ago by
akhilvbioinfo130
India, chennai
akhilvbioinfo130 wrote:

Hi,

I want to download the entire nr protein database from NCBI. Is there a link available for this?

modified 20 months ago by gu.fernandezn0 • written 3.3 years ago by akhilvbioinfo130
Either download the entire FASTA file (ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz) and build your own database from it, or download the preformatted database, which is split into multiple chunks.
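
A minimal sketch of the first option, assuming `wget`, `gzip` and BLAST+ (`makeblastdb`) are on your PATH. The function names, the `DRY_RUN` switch and the output name `nr_local` are made up for illustration; only the URL comes from the post above.

```shell
#!/usr/bin/env bash
# Sketch: fetch the single-file FASTA and build a local protein BLAST database.
# Assumption: wget, gzip and makeblastdb are installed; "nr_local" is a hypothetical name.
NR_FASTA_URL='ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz'

# Set DRY_RUN=1 to print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

build_nr_from_fasta() {
    local out_name="${1:-nr_local}"
    run wget -c "$NR_FASTA_URL" || return 1   # -c resumes an interrupted download
    run gzip -t nr.gz || return 1             # check the archive before unpacking
    run gunzip -k nr.gz || return 1           # keep nr.gz, write the FASTA to ./nr
    run makeblastdb -in nr -dbtype prot -parse_seqids -out "$out_name"
}
```

Running `DRY_RUN=1 build_nr_from_fasta` first shows the four commands without touching the network. The uncompressed file is very large, so check free disk space before a real run.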

modified 3.3 years ago • written 3.3 years ago by geek_y8.8k

Hi @geek_y, the FTP directory that @glihm linked seems to be updated regularly, with new files added over time. Is the single-file version you mentioned above also updated, or is it always an old snapshot? Thanks.

written 13 months ago by Farbod3.2k
3.3 years ago by
glihm580
France
glihm580 wrote:

Hi there,

All databases are available on the NCBI FTP site (URL, in case the link does not work: ftp://ftp.ncbi.nlm.nih.gov/blast/db/).
The README there describes each of these databases.

For instance:

nr.*tar.gz                    | Non-redundant protein sequences from GenPept, 
                                Swissprot, PIR, PDF, PDB, and NCBI RefSeq

written 3.3 years ago by glihm580

Thank you for your reply.

I want to download the entire nr protein database as a single file.

written 3.3 years ago by akhilvbioinfo130

Try this:

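# Download every nr volume; quoting the URL stops the shell from expanding the wildcard, so wget does the FTP globbing
wget 'ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr.*.tar.gz'
# Extract all volumes into the current directory; -i lets tar read past the end-of-archive blocks between concatenated volumes
cat nr.*.tar.gz | tar -zxvi -f - -C .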
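
If the concatenated extraction gives trouble, a more conservative sketch is to unpack each volume separately, so a broken archive is reported by name. This assumes the volumes have already been downloaded; the function name and directory arguments are hypothetical.

```shell
# Extract every downloaded nr volume in turn, stopping at the first failure.
# Usage: extract_nr_volumes [SOURCE_DIR] [DEST_DIR]   (both default to ".")
extract_nr_volumes() {
    local src_dir="${1:-.}" dest_dir="${2:-.}" archive
    mkdir -p "$dest_dir" || return 1
    for archive in "$src_dir"/nr.*.tar.gz; do
        # If the glob matched nothing, the pattern is left unexpanded and the file test fails.
        [ -e "$archive" ] || { echo "no nr.*.tar.gz files in $src_dir" >&2; return 1; }
        tar -xzf "$archive" -C "$dest_dir" || { echo "failed to extract $archive" >&2; return 1; }
    done
}
```

BLAST+ also ships an `update_blastdb.pl` script that can fetch and unpack the volumes (for example `update_blastdb.pl --decompress nr`), which may be simpler if BLAST+ is already installed.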

modified 3.3 years ago • written 3.3 years ago by Eliad50

The files are huge, so you cannot get all the data as one file. The solution proposed by Eliad lets you download all the "nr" database subfiles with one command.

written 3.3 years ago by glihm580

Hi, may I know how to format the nr database subfiles before using them with BLAST?

written 2.4 years ago by snar860

Hi, I found a command to format the nr db: ./blast-2.2.18/bin/formatdb -i NR -p T -o T (source: http://zhanglab.ccmb.med.umich.edu/bbs/?q=node/100). It's an older post, and I haven't tested it.
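
For context, `formatdb` belongs to legacy BLAST and `makeblastdb` is its replacement in BLAST+. The sketch below, with a made-up function name, prints the formatting command for whichever tool is found on PATH. It assumes the input is a protein FASTA file such as the unpacked nr.gz.

```shell
# Print the formatting command that matches the installed BLAST toolchain.
# Flag mapping: -i -> -in, -p T -> -dbtype prot, -o T -> -parse_seqids.
nr_format_command() {
    local fasta="$1"
    [ -n "$fasta" ] || { echo 'usage: nr_format_command FASTA_FILE' >&2; return 2; }
    if command -v makeblastdb >/dev/null 2>&1; then
        printf 'makeblastdb -in %s -dbtype prot -parse_seqids\n' "$fasta"
    elif command -v formatdb >/dev/null 2>&1; then
        printf 'formatdb -i %s -p T -o T\n' "$fasta"
    else
        echo 'neither makeblastdb nor formatdb found on PATH' >&2
        return 1
    fi
}
```

Note that the nr.*.tar.gz volumes from the blast/db directory are already formatted, so they only need extracting; a formatting step applies to the FASTA download.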

modified 20 months ago • written 20 months ago by gu.fernandezn0
20 months ago by
gu.fernandezn0 wrote:

Hi, is there a way to download just a file with the taxonomy information? I mean a tab-delimited file with two columns: name_of_protein and organism_source (plant, bacteria, other). I need the organism source, but when I look at the nr db directly, each protein has a huge header, and there is no obvious pattern for extracting it.
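
One possible starting point is to pull the accession and organism name out of each FASTA header with awk. This is only a sketch: it assumes merged entries in a header are separated by Ctrl-A and that the organism is the last [bracketed] field of each entry. Organism names that themselves contain brackets will come out as "NA". The function name is hypothetical.

```shell
# Emit "accession<TAB>organism" for every entry in every FASTA header line.
# Assumption: entries are separated by Ctrl-A (\001) and end in "[Organism name]".
nr_headers_to_tsv() {
    awk '
        /^>/ {
            n = split(substr($0, 2), entries, "\001")
            for (i = 1; i <= n; i++) {
                entry = entries[i]
                acc = entry
                sub(/[ \t].*$/, "", acc)            # accession = first word of the entry
                org = "NA"                          # default when no trailing [..] is found
                if (match(entry, /\[[^][]*\]$/))    # trailing [Organism] without nested brackets
                    org = substr(entry, RSTART + 1, RLENGTH - 2)
                printf "%s\t%s\n", acc, org
            }
        }
    ' "$@"
}
```

For example, `nr_headers_to_tsv nr > nr_accession_organism.tsv` would write one line per header entry. This gives the organism name, not a broad group such as plant or bacteria. Mapping to groups would need NCBI taxonomy data, for instance the prot.accession2taxid mapping files on the NCBI taxonomy FTP site.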

written 20 months ago by gu.fernandezn0