Creation of subset of diamond database
3.2 years ago
mark.bogen • 0

Hello, I would like to run a large-scale search with DIAMOND. I have a large protein database, but I want to search only a subset of it. I cannot build a separate database, because the set of desired entries changes from run to run and database creation is computationally expensive. Is there any way to restrict my search to a subset of the database, given a list of protein names? Thanks a lot,

Mark

alignment • 1.2k views
3.1 years ago
buchfink ▴ 250

DIAMOND v2.0.8 now supports directly using BLAST database files, and also subsetting using --seqidlist. https://github.com/bbuchfink/diamond
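Since the set of desired entries changes from run to run, one way to use this is to regenerate only the ID list for each run and leave the database untouched. Below is a minimal Python sketch of that idea. The helper names (`write_seqidlist`, `diamond_subset_cmd`) and all file names are hypothetical. It assumes `--seqidlist` takes a plain-text file with one accession per line and that the database is in BLAST format, so check the DIAMOND documentation for the exact requirements, including any preparation step a BLAST database may need.

```python
from pathlib import Path


def write_seqidlist(ids, path):
    """Write one accession per line, dropping blanks and duplicates (order kept)."""
    unique = list(dict.fromkeys(i.strip() for i in ids if i.strip()))
    Path(path).write_text("\n".join(unique) + "\n")
    return unique


def diamond_subset_cmd(blast_db, query, out, seqidlist):
    """Build (but do not run) a diamond blastp call restricted to the listed IDs."""
    return [
        "diamond", "blastp",
        "--db", str(blast_db),          # BLAST database (assumed already set up)
        "--query", str(query),          # query protein FASTA
        "--out", str(out),              # output file for the hits
        "--seqidlist", str(seqidlist),  # subset of database entries to search
    ]
```

With DIAMOND installed, the returned list can be passed to `subprocess.run(cmd, check=True)`. Only the ID file changes between runs.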


buchfink: I am not sure whether DIAMOND has a tool post on Biostars. If not, please consider creating one. Improvements such as these can then be announced directly in that thread to keep them visible.

3.2 years ago

With DIAMOND? I'm afraid not (as far as I understand, that kind of behaviour is not compatible with its DB/index structure).

But why not use 'normal' BLAST then? It does offer that kind of functionality (from version 5 databases onwards).

3.2 years ago
buchfink ▴ 250

Sorry, it's not possible at the moment, but it's a feature request I would consider useful. If you can wait 2-3 weeks, I can make that feature available.

BTW, you can speed up database creation considerably by using the option --masking 0.

Benjamin
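For anyone scripting database builds, here is a small Python sketch of how the `--masking 0` tip might be wired in. The helper name `diamond_makedb_cmd` and the file names are hypothetical, and the sketch assumes the usual `diamond makedb --in ... --db ...` invocation. It only builds the argument list and does not run DIAMOND.

```python
def diamond_makedb_cmd(fasta, db, masking=False):
    """Build a diamond makedb call; masking is switched off unless requested."""
    cmd = ["diamond", "makedb", "--in", str(fasta), "--db", str(db)]
    if not masking:
        # Disable masking, as suggested above, to speed up database creation.
        cmd += ["--masking", "0"]
    return cmd
```

The result can be handed to `subprocess.run(cmd, check=True)` on a machine where DIAMOND is installed.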


Thanks a lot. I'll gladly wait, and I'll also try --masking 0.
