Creating DIAMOND database on HPC
2.5 years ago
Dee • 0

Hello everyone,

I'm new to DIAMOND and would like to build a DIAMOND database on our HPC cluster. Does anyone know how to write a script that submits this as a SLURM job?

Thank you!

Database DIAMOND HPC • 4.2k views

I don't know about the SLURM script, but I do know that you will need a considerable amount of storage for this index/database, so double-check that you have the space before you start.


Thank you for the information!

2.5 years ago
GenoMax 142k

Creating a DIAMOND database is not difficult. It is a single, straightforward command, as noted on the wiki page.

# creating a diamond-formatted database file
./diamond makedb --in reference.fasta -d reference

You will need to wrap this into a SLURM submission script or a command line that will work on your HPC.
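
A minimal sketch of such a submission script follows, written out with a heredoc so it can be inspected before submitting. The file name `makedb.slurm`, the job name, and the CPU, memory, and time values are placeholders I picked for illustration, not recommendations. It also assumes `diamond` is on your `PATH` (rather than `./diamond`) and that `reference.fasta` sits in the submission directory.

```shell
# Write a SLURM batch script to makedb.slurm (the file name is arbitrary).
cat > makedb.slurm <<'EOF'
#!/bin/bash
# Resource requests below are placeholders; adjust them to your cluster.
#SBATCH --job-name=diamond_makedb
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=04:00:00
#SBATCH --output=diamond_makedb_%j.log

set -euo pipefail

# Site-specific: many clusters need something like `module load diamond` here.

# Same makedb command as above, using the cores SLURM allocated to the job.
diamond makedb --in reference.fasta -d reference --threads "${SLURM_CPUS_PER_TASK}"
EOF

echo "Submit with: sbatch makedb.slurm"
```

Once the resource values match what your cluster allows, `sbatch makedb.slurm` queues the job, and the log ends up in `diamond_makedb_<jobid>.log`.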

That said, if you are going to align against one of the standard NCBI databases, the good news is that DIAMOND can now use NCBI BLAST-formatted databases directly, which saves you the trouble of creating an index. You will need the latest DIAMOND version for this.


Hi GenoMax! Thank you for the information!

You mentioned that DIAMOND can now use NCBI BLAST-formatted databases without creating an index. How can this be done? My plan is to extract the bacterial and archaeal sequences from the NCBI nr database, build a DIAMOND database from that subset, and then align my contigs of interest against it.

Thank you!


The DIAMOND wiki page describes the necessary commands. Creating a database of just bacteria and archaea from the nr database is not a trivial task. You may want to search against all of nr and filter your results afterwards. Otherwise, you will have to extract the bacterial/archaeal sequences from nr in FASTA format and then build a database from them. At that point you may as well build a DIAMOND database directly.

To use the nr database as is, run:

update_blastdb.pl --decompress --blastdb_version 5 nr
./diamond prepdb -d nr
./diamond blastp -d nr -q queries.fasta -o matches.tsv
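
For the "filter your results afterwards" route, here is a rough sketch of a post-filter. It assumes a tab-separated hit table whose last column holds a single superkingdom label. The commands above do not produce that column by default, so treat the table layout as an assumption. The function name `filter_prokaryotes`, the file names, and the three example rows are all made up for illustration and are not real DIAMOND output.

```shell
# Keep only rows whose last tab-separated column is exactly Bacteria or Archaea.
# Assumption: one superkingdom label per row; multi-valued entries would need
# extra handling.
filter_prokaryotes() {
    awk -F'\t' '$NF == "Bacteria" || $NF == "Archaea"' "$1"
}

# Tiny made-up table (query, subject, percent identity, superkingdom).
printf 'contig1\tsubjectA\t98.5\tBacteria\n'   > example_matches.tsv
printf 'contig2\tsubjectB\t71.2\tEukaryota\n' >> example_matches.tsv
printf 'contig3\tsubjectC\t88.0\tArchaea\n'   >> example_matches.tsv

filter_prokaryotes example_matches.tsv > example_prokaryotes.tsv
```

The upside of filtering afterwards is that nr stays untouched and the same search output can be re-filtered for other taxa later. The cost is that the search itself still runs against the full database.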

Hi GenoMax! Thank you very much for the information!
