I want to take a very large list of proteins and check to see if they harbor any interesting domains.
Is there a tool or a DB I can download to work with?
HMMER + Pfam
The databases and software can be downloaded and run locally. I regularly run several hundred proteins through my local install, with some additional scripts to aggregate the results.
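As a rough illustration of the kind of aggregation script mentioned above (not the answerer's actual scripts), here is a minimal sketch that groups Pfam domain hits per protein. It assumes the search was run with something like `hmmpress Pfam-A.hmm` followed by `hmmscan --cut_ga --domtblout hits.domtblout Pfam-A.hmm proteins.fasta`, and that the output has the standard whitespace-separated `--domtblout` layout. The function name `parse_domtblout`, the file names, and the E-value cutoff are made up for the example.

```python
from collections import defaultdict


def parse_domtblout(lines, max_evalue=1e-5):
    """Group domain hits by protein from hmmscan --domtblout lines.

    Assumes the standard domtblout column order, where (0-based) field 0 is
    the target (Pfam family) name, field 3 is the query (protein) name,
    field 12 is the per-domain independent E-value, and fields 17 and 18
    are the alignment start and end on the protein.
    """
    hits = defaultdict(list)
    for line in lines:
        # Skip blank lines and the '#' header/footer comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 22:
            continue  # not a complete data row
        domain = fields[0]
        protein = fields[3]
        i_evalue = float(fields[12])
        ali_from, ali_to = int(fields[17]), int(fields[18])
        if i_evalue <= max_evalue:
            hits[protein].append((domain, ali_from, ali_to, i_evalue))
    # Order each protein's domains by their start position.
    for protein in hits:
        hits[protein].sort(key=lambda hit: hit[1])
    return dict(hits)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python aggregate.py hits.domtblout
    with open(sys.argv[1]) as handle:
        for protein, domains in parse_domtblout(handle).items():
            names = ", ".join(f"{d}[{s}-{e}]" for d, s, e, _ in domains)
            print(f"{protein}\t{names}")
```

Run on a real `hits.domtblout`, this prints one line per protein with its domains in order along the sequence. Adjust or drop the E-value cutoff if you already filter with `--cut_ga`.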
I am biased since I am one of the developers of MMseqs2. MMseqs2 can perform fast sequence/profile searches. It is possible to annotate 1.1 billion protein sequences with Pfam domains in 8.3 h on a 2×14-core server. Here is a guide on how to set up the search.
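In case the guide is not to hand, the search could be sketched roughly as follows. This is an assumption-laden outline, not the guide itself: it assumes `mmseqs` is on the PATH, uses the `databases` module to fetch the Pfam profiles (`Pfam-A.full`) and `easy-search` to search a FASTA file against them, and leaves all sensitivity and filtering parameters at their defaults. The helper name `mmseqs_pfam_commands` and all file and directory names are hypothetical.

```python
import shlex
import subprocess


def mmseqs_pfam_commands(fasta, out_tsv, pfam_db="pfam_db", tmp_dir="tmp"):
    """Build the MMseqs2 command lines for a Pfam profile search.

    Assumption: default parameters only; check the MMseqs2 guide for the
    settings recommended for large-scale annotation.
    """
    return [
        # Download Pfam-A and set it up as an MMseqs2 profile database.
        ["mmseqs", "databases", "Pfam-A.full", pfam_db, tmp_dir],
        # Search the proteins against the profiles; writes a tabular file.
        ["mmseqs", "easy-search", fasta, pfam_db, out_tsv, tmp_dir],
    ]


def run_pfam_search(fasta, out_tsv, dry_run=True):
    """Print the commands, and execute them unless dry_run is set."""
    for cmd in mmseqs_pfam_commands(fasta, out_tsv):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_pfam_search("proteins.fasta", "pfam_hits.m8")
```

The dry-run default lets you inspect the commands before committing to the database download.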