nr database Diamond
3.5 years ago
Dave Th ▴ 50

Hi guys,

I want to use DIAMOND for functional analysis of my metagenome. According to the instructions, I have to download the NCBI nr database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/). Unfortunately, my internet connection is not very stable, so I downloaded the multiple nr.**.tar.gz volume files instead of a single nr gz file, using this code:

wget 'ftp://ftp.ncbi.nih.gov/blast/db/nr.01.tar.gz';
cat nr.**.tar.gz | tar -zxvi -f - -C


After that, I got a lot of files in my output directory (~180 GB). How can I combine all these files into a single nr.faa, as in the DIAMOND manual?

Thank you all.

Dave

alignment protein
3.5 years ago
h.mon 34k

DIAMOND needs its own database format; it does not work with BLAST databases, which is what you are downloading. Download the nr FASTA file instead, then build the DIAMOND database from it:

wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz
diamond makedb --in nr.gz -d nr
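
Since the connection in the question is unstable, the two commands above could be wrapped in a small script that resumes interrupted downloads. This is only a sketch: the `run` helper and the `DRY_RUN` switch are hypothetical names introduced here for illustration, while `wget -c` (continue a partial file) and `--tries=0` (retry indefinitely) are standard wget options. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
# Sketch: resumable download of the nr FASTA, then build the DIAMOND database.
NR_URL="ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz"  # URL from the answer above
DB_NAME="nr"                                               # DIAMOND writes nr.dmnd
DRY_RUN="${DRY_RUN:-1}"                                    # hypothetical switch: 1 = print only

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# -c resumes a partially downloaded file; --tries=0 keeps retrying.
# makedb runs only if the download step succeeded.
run wget -c --tries=0 "$NR_URL" &&
    run diamond makedb --in "$(basename "$NR_URL")" -d "$DB_NAME"
```

If the download is interrupted, rerunning the script with `DRY_RUN=0` continues from the partial `nr.gz` instead of starting over.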


DIAMOND also needs more RAM than BLAST+. Something to keep in mind.
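
If RAM is the limiting factor, DIAMOND's `-b` (block size) and `-c` (index chunks) options are the usual knobs. According to the DIAMOND manual, memory use is roughly six times the block size in GB, so lowering `-b` trades speed for a smaller footprint. A hedged sketch follows: the query file `reads.fasta`, the output name `matches.tsv`, and the chosen values are assumptions, not tested recommendations.

```shell
# Sketch: lower-memory DIAMOND search against the nr database built above.
BLOCK_SIZE=1     # -b: billions of sequence letters per block (DIAMOND default is 2)
INDEX_CHUNKS=4   # -c: number of index chunks; more chunks means less memory

# Printed for review only; remove "echo" to actually run the search.
# blastx translates DNA reads and aligns them against the protein database.
echo diamond blastx -d nr -q reads.fasta -o matches.tsv -b "$BLOCK_SIZE" -c "$INDEX_CHUNKS"
```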


How long does it usually take to build nr with diamond using all of the taxonomy files?