Using NCBI database in DIAMOND BLAST
3.7 years ago

I want to use the NCBI protein (nr) database with DIAMOND BLAST, but it always gives a database extension/format error. The commands for standalone BLAST and DIAMOND are different, and DIAMOND requires a different database format from standard BLAST, so I can't download the protein database from NCBI and use it directly. How should I proceed?

3.7 years ago
GenoMax 99k

You need to build the nr database for DIAMOND yourself from the nr FASTA file, which you can download here. Are you trying to re-use the pre-formatted BLAST nr database as is?
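
For reference, building a DIAMOND database from the FASTA file and then searching against it might look roughly like the sketch below. It only wraps the diamond makedb and diamond blastp commands in Python. The file names (nr.gz, nr, queries.faa, matches.tsv) and the helper function names are placeholders I made up for illustration, and it assumes the diamond executable is on your PATH.

```python
import subprocess


def makedb_cmd(fasta_path, db_prefix):
    """Return the command that builds a DIAMOND database from a protein FASTA file.

    DIAMOND writes the result to db_prefix + '.dmnd'.
    """
    return ["diamond", "makedb", "--in", fasta_path, "--db", db_prefix]


def blastp_cmd(db_prefix, query_path, out_path):
    """Return the command that searches protein queries against a DIAMOND database."""
    return ["diamond", "blastp", "--db", db_prefix,
            "--query", query_path, "--out", out_path]


def build_and_search(fasta_path, db_prefix, query_path, out_path):
    """Run both steps; raises CalledProcessError if either command fails."""
    subprocess.run(makedb_cmd(fasta_path, db_prefix), check=True)
    subprocess.run(blastp_cmd(db_prefix, query_path, out_path), check=True)
```

Calling build_and_search("nr.gz", "nr", "queries.faa", "matches.tsv") would run the two steps in order. The makedb step is slow for a file the size of nr, but it only has to be done once per download.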

3.7 years ago
Sej Modha 4.8k

I've had this issue in the past, and I believe it happens because some sequences in the FASTA file end up containing special characters. I usually edit those sequences (typically one or two) with sed or a similar tool, and then it works for me.
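
If you would rather script the cleanup than edit by hand, something like the following Python sketch could do the same job as sed. Which characters count as "special" is my assumption here: it keeps letters, '*' and '-' in sequence lines and drops everything else, while leaving header lines untouched. The function name is made up for illustration, so adjust the pattern to whatever actually breaks your file.

```python
import re

# Assumption: anything in a sequence line that is not a letter,
# '*' (stop) or '-' (gap) is treated as an unwanted special character.
BAD_CHARS = re.compile(r"[^A-Za-z*\-]")


def clean_fasta_lines(lines):
    """Yield FASTA lines with unwanted characters removed from sequence lines."""
    for line in lines:
        if line.startswith(">"):
            # Header lines are passed through unchanged.
            yield line
        else:
            body = line.rstrip("\r\n")
            yield BAD_CHARS.sub("", body) + "\n"
```

For a gzipped file you could open the input and output with gzip.open(path, "rt") and gzip.open(path, "wt") and pass the input handle straight to clean_fasta_lines, writing each yielded line to the output, so the whole file never has to sit in memory.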