I'm trying to run Rosetta to do a fold and dock of the C-terminus of the SARS-CoV-2 spike protein.
When I run make_fragments.pl for the first time, it downloads a bunch of nr*.gz files, and then fails with the following error:
[fastacmd] ERROR: ERROR: Cannot initialize readdb for nr database
This is a new Rosetta install of the latest version.
Digging into this a bit, make_fragments.pl runs install_dependencies.pl which runs fastacmd -D 1, like this:
my $cmd = "$Bin/blast/bin/fastacmd -D 1 > $datdir/nr";
In my case the data is in /mnt/data/rosetta/tools/fragment_tools/databases so I tried changing to that directory and running /mnt/data/rosetta/tools/fragment_tools/blast/bin/fastacmd -D 1 directly; same error.
The databases directory is rather large, 217GB. It contains nr.00.phd nr.00.phi nr.00.phr nr.00.pin nr.00.pog nr.00.ppd nr.00.ppi nr.00.psq nr.00.tar.gz.md5 and so on, for nr.00 through nr.38; also nr.pal nr.pdb nr.pos nr.pot nr.ptf nr.pto. It doesn't seem like there were any issues with downloading and unpacking the nr database.
I'm not sure why fastacmd produces an error or how to fix it. What is it looking for? Why is the thing it's looking for not there?
Environment: Rosetta 2020.08.61146 (rosetta_bin_linux_3.12_bundle.tgz), Ubuntu 18.04 on AWS instance
Thanks! How recent of a change is this? My Rosetta seems to be from March 9, 2020.
It looks like the nr download is handled by update_blastdb.pl, found at c++/src/app/blast/update_blastdb.pl inside ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.10.0/ncbi-blast-2.10.0+-src.tar.gz (it was expecting ncbi-blast-2.9 but couldn't find it, so I had to bump the version), and fastacmd is from ftp://ftp.ncbi.nih.gov/blast/executables/legacy.NOTSUPPORTED/2.2.17/blast-2.2.17-x64-linux.tar.gz
This change was made on Feb 4th, 2020; see the NCBI announcement "New layout for NCBI BLAST FTP database site starting February 4, 2020".
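One quick way to tell which format a download ended up in is to look at the file extensions. The sketch below assumes that .pdb, .pos, .pot, .ptf and .pto only appear in version 5 protein databases (which matches the nr.pdb, nr.pos, nr.pot, nr.ptf and nr.pto files listed in the question); the function names are made up for illustration and are not part of Rosetta or BLAST.

```python
from pathlib import Path

# Assumption: these extensions are only produced for version 5 protein
# BLAST databases, so finding any of them means the legacy fastacmd
# (which predates v5) is being pointed at data it cannot read.
V5_ONLY_SUFFIXES = {".pdb", ".pos", ".pot", ".ptf", ".pto"}


def looks_like_v5(filenames):
    """Return True if any file name ends in a v5-only extension."""
    return any(Path(name).suffix in V5_ONLY_SUFFIXES for name in filenames)


def v5_files_in(directory, db_name="nr"):
    """List files for db_name in directory that carry a v5-only extension."""
    return sorted(
        p.name
        for p in Path(directory).glob(f"{db_name}*")
        if p.suffix in V5_ONLY_SUFFIXES
    )
```

For example, calling v5_files_in("/mnt/data/rosetta/tools/fragment_tools/databases") on the directory from the question should return a non-empty list if the download is in the new format.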
You may need to use blastdbcmd instead (sounds like that is retrieving fasta sequences).
@genomax - That was totally the right answer; I swapped v4 data in and everything else worked. Feel free to post that as an answer. Rosetta patch coming up :)
Thanks for the confirmation. I moved my comment to an answer. You can accept it (green check) to provide closure to this thread.
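For anyone who wants to try the blastdbcmd route instead of swapping in v4 data, here is a rough sketch of the equivalent of fastacmd -D 1 (dump the whole database as FASTA), which in BLAST+ is blastdbcmd -db nr -entry all. The wrapper functions and the blastdbcmd path are hypothetical, and I have not checked that Rosetta's fragment scripts accept the resulting file unchanged.

```python
import subprocess


def build_dump_command(blastdbcmd, db="nr"):
    """Build the BLAST+ command that dumps every sequence in db as FASTA."""
    # "-entry all" asks blastdbcmd for every entry in the database.
    return [blastdbcmd, "-db", db, "-entry", "all"]


def dump_database(blastdbcmd, db, out_path):
    """Run the dump and write the FASTA output to out_path."""
    with open(out_path, "w") as out:
        # check=True raises CalledProcessError if blastdbcmd exits non-zero.
        subprocess.run(build_dump_command(blastdbcmd, db), stdout=out, check=True)
```

Run it from inside the databases directory (or pass the full database path as db) so that blastdbcmd can find the nr volumes.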
Hi, could you please describe how you swapped to v4? I'm having the same problem.
Thanks,
Hello Diego,
I did something like this: In a clean rosetta install, edit main/tools/fragment_tools/install_dependencies.pl to comment out these four lines
Run these commands to download v4 (if you don't have wget, install it first)
and NOW you can run install_dependencies.pl as usual
Thank you very much.
I'm going to try this.
Best regards
I ran into this issue today and I was wondering if this was ever patched in Rosetta, or should I still be falling back to the v4 format data?
I've been using the Linux release version "2021.16+release.8ee4f02"