What is the best way to download GenBank locally? What is the best method for creating databases with the data? BioPerl? Thanks!
NCBI provides a script, update_blastdb.pl, to download the databases: http://www.ncbi.nlm.nih.gov/BLAST/docs/update_blastdb.pl
See ftp://ftp.ncbi.nlm.nih.gov/blast/documents/blastdb.html for details.
As for part two of your question: usually the best kind of database is a BLAST database.
I'm getting an error when running update_blastdb.pl. I run perl update_blastdb.pl
and get an error saying that the perl docs need to be installed. I installed them, but now the same command gives me a different page listing options. So now I'm using perl update_blastdb.pl -showall blastdb
and I get the error "use of uninitialized value $retval[0] in concatenation (.) or string at updateblast.db line 110". How do I run this script to download the BLAST database?
That's strange. I just started the est_mouse command and it seems to be running fine. It also works okay on my 12.04 Ubuntu bare install (just tested), but it seems like it is failing in the virtual image I just spun up (vanilla install of 12.04). I am not entirely sure what is going on.
If all else fails, you can download it from their FTP site directly: ftp://ftp.ncbi.nlm.nih.gov/blast/db
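For the direct FTP route, here is a rough sketch of how one might fetch all volumes of a multi-volume database by hand. The function names volume_urls and fetch_volumes are made up for illustration, wget is assumed to be installed with FTP support, and the number of volumes is something you have to look up in the FTP listing yourself. The file naming (nr.00.tar.gz plus a matching .md5) follows the download log quoted in this thread.

```shell
# Base location as given in this thread.
BASE_URL="ftp://ftp.ncbi.nlm.nih.gov/blast/db"

# Print one archive URL per volume, numbered 00, 01, ...
# Usage: volume_urls <dbname> <number_of_volumes>
volume_urls() {
    db=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s/%s.%02d.tar.gz\n' "$BASE_URL" "$db" "$i"
        i=$((i + 1))
    done
}

# Download each volume and its .md5 companion (hypothetical helper).
# wget -c resumes a partially downloaded file instead of starting over.
# Usage: fetch_volumes <dbname> <number_of_volumes>
fetch_volumes() {
    for url in $(volume_urls "$1" "$2"); do
        wget -c "$url" "$url.md5" || return 1
    done
}
```

Something like fetch_volumes nr 46 would then pull every volume into the current directory, assuming the database really has that many volumes at the time you run it. If the .md5 files are in the usual md5sum format, they can be used to check the archives before extracting.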
I got the following error:
Connected to NCBI
Downloading nr (46 volumes) ...
Downloading nr.00.tar.gz...Cannot open Local file nr.00.tar.gz: Permission denied
at update_blastdb.pl line 202
Cannot open Local file nr.00.tar.gz.md5: Permission denied
at update_blastdb.pl line 203
Failed to download nr.00.tar.gz.md5!
Would you mind letting me know what the problem is?
Although it has been a few years, I ran into the same issue! I am not used to Windows and had installed BLAST under C:\Programs. What I didn't know is that on Windows you typically cannot give a specific script write access to this directory. Just run the script from a directory other than C:\Programs, for example:
mkdir C:\blastdbs
cd C:\blastdbs\
update_blastdb.pl --decompress nt
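On Linux or macOS the same "Permission denied" error usually means the current directory is not writable by your user. A small sketch of a guard one could put in front of the download; can_write_here and safe_update are hypothetical names, not part of BLAST+, and update_blastdb.pl is assumed to be on the PATH.

```shell
# Succeed only if the given directory (default: current one) exists and is writable.
can_write_here() {
    dir=${1:-.}
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Hypothetical wrapper: refuse to start a download in a read-only location.
# All arguments are passed through to update_blastdb.pl unchanged.
safe_update() {
    if can_write_here .; then
        update_blastdb.pl "$@"
    else
        echo "Current directory is not writable; cd to one you own first." >&2
        return 1
    fi
}
```

With this in place, safe_update --decompress nt fails early with a readable message instead of stopping partway through the first volume.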
It has been roughly 6.3 years since this was posted, but in case you or anyone else is still wondering: I think your issue is the use of the perl command at the beginning. When BLAST+ is installed, the location of update_blastdb.pl is likely added to the PATH variable. The perl command looks for the script at the path you give it (here, the current directory) and does not search PATH. The update_blastdb.pl script, however, can be run without the perl command, as it invokes Perl by itself. Therefore you can run the following:
update_blastdb.pl --help
or
update_blastdb.pl <dbname>
update_blastdb.pl nt
update_blastdb.pl taxdb
This will download the *.tar.gz files from the NCBI website. However, you still have to decompress them, either with the --decompress option of update_blastdb.pl or with tar:
update_blastdb.pl nt taxdb --decompress
for subdb in *.tar.gz; do tar -zxvf "$subdb"; done
With enough processors, one might consider using "&" instead, as this runs the extraction in parallel for all files (the trailing wait blocks until every extraction has finished):
for subdb in *.tar.gz; do tar -zxvf "$subdb" & done; wait
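Starting one tar per archive at once can overwhelm a machine when a database has dozens of volumes. Below is a sketch of a bounded variant; extract_all is a made-up helper name, and it relies on the -0 and -P options of xargs, which GNU and BSD xargs provide but which are not part of POSIX.

```shell
# Extract every *.tar.gz in a directory, running at most N tar processes at once.
# Usage: extract_all <directory> [max_parallel_jobs]
extract_all() {
    dir=$1
    max_jobs=${2:-4}
    (
        cd "$dir" || exit 1
        # Collect the archives; if the glob matched nothing, there is nothing to do.
        set -- *.tar.gz
        [ -e "$1" ] || exit 0
        # Feed the names NUL-separated so unusual file names survive intact;
        # xargs exits non-zero if any tar invocation fails.
        printf '%s\0' "$@" | xargs -0 -n 1 -P "$max_jobs" tar -zxf
    )
}
```

For example, extract_all . 4 unpacks everything in the current directory with four extractions running at a time. The subshell keeps the cd from affecting the calling shell.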
------EDIT------
It is worth noting that the blastn command must be run from within the directory where the database is saved. If only the path to the database is specified, blastn will not be able to output taxonomic / scientific names, presumably because it cannot locate the taxdb files.
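A sketch of what running from within the database directory could look like without having to cd there by hand. blast_with_taxonomy and abs_path are hypothetical helper names, and the query and output file names are placeholders. The staxids and sscinames columns of the tabular output format are the ones that depend on the taxdb files.

```shell
# Turn a possibly relative path into an absolute one, relative to the caller's directory.
abs_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s\n' "$PWD/$1" ;;
    esac
}

# Run blastn from inside the database directory.
# Usage: blast_with_taxonomy <db_dir> <db_name> <query.fasta> <output.tsv>
blast_with_taxonomy() {
    db_dir=$1
    db_name=$2
    # Resolve these before changing directory, otherwise relative paths would break.
    query=$(abs_path "$3")
    out=$(abs_path "$4")
    (
        cd "$db_dir" || exit 1
        blastn -db "$db_name" -query "$query" -out "$out" \
            -outfmt "6 qseqid sseqid pident evalue staxids sscinames"
    )
}
```

Called as blast_with_taxonomy C:\blastdbs nt query.fasta hits.tsv on Windows-style setups the path syntax would differ; on a Unix-like shell something like blast_with_taxonomy ~/blastdbs nt query.fasta hits.tsv is the intended use, with the database directory being wherever update_blastdb.pl was run.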
Here is a tool for Windows and Mac users that allows you to download some or all databases:
Perfect. Thanks!
As a side note, the link up there is not working, but this should take you where you need to go: http://www.ncbi.nlm.nih.gov/BLAST/docs/update_blastdb.pl (for anyone looking at this thread).
Sweet, I need this too. The tar here has the latest scripts and tools for using BLAST locally: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/