7.9 years ago by
First, I believe SMART and Pfam use different versions of HMMER for their searches. Pfam has switched to HMMER3, but SMART is still using HMMER2. This alone can account for some inconsistencies, because the HMMER2 ls/fs modes are not implemented in the same way in HMMER3. Off the top of my head, Pfam previously used HMMER2 in ls mode, which aligns the domain model globally, whereas HMMER3 does not implement ls as of yet and aligns the domain model locally. SMART also has its own models, separate from Pfam's, so you will see differences depending on how each database defined the domain. SMART focuses more on signalling-related domains, while Pfam also contains models for protein families, not just domains. In addition, SMART maintains a separate list of cut-off values for specific domains when they occur as repeats; hits are filtered against these cut-offs by scripts after running HMMSEARCH.
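To make the repeat cut-off idea concrete, here is a rough Python sketch of that kind of post-filtering. I do not know the exact rule SMART's scripts apply, so the rule below (use a repeat-specific cut-off when a domain is hit more than once in a protein, otherwise a default cut-off) is my assumption, and the domain names and threshold values are made up.

```python
from collections import Counter

def filter_hits(hits, default_cutoff, repeat_cutoffs):
    """Filter the (domain, score) hits of ONE protein.

    Assumed rule (not SMART's documented logic): if a domain is hit more
    than once and has an entry in repeat_cutoffs, compare each of its hits
    against that repeat cut-off; otherwise use default_cutoff.
    """
    counts = Counter(domain for domain, _ in hits)
    kept = []
    for domain, score in hits:
        if counts[domain] > 1 and domain in repeat_cutoffs:
            cutoff = repeat_cutoffs[domain]
        else:
            cutoff = default_cutoff
        if score >= cutoff:
            kept.append((domain, score))
    return kept

# Made-up example: "RepA" is a repeat domain with a relaxed cut-off.
example_hits = [("RepA", 12.0), ("RepA", 8.0), ("DomB", 15.0)]
kept = filter_hits(example_hits, default_cutoff=20.0,
                   repeat_cutoffs={"RepA": 10.0})
```

With these invented numbers, only the stronger "RepA" hit survives: the weaker repeat falls below the repeat cut-off, and the single "DomB" hit falls below the default.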
So whether you use UniProt, SMART, Pfam or all of the above really depends on your ultimate goal. If you have a large dataset, I suggest HMMER3 + Pfam so you do not have to wait eons; HMMER3 is also more sensitive. You can rebuild the SMART models with HMMER3 yourself, but expect results that differ from the SMART database.
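If you go the HMMER3 + Pfam route, the per-domain table written with the `--domtblout` option is the easiest output to post-process. Below is a minimal Python sketch of a parser for it. The `DomainHit` type and `parse_domtblout` function are names I made up, and the sample line is synthetic, not real output. Note that with hmmscan the "target" column is the model and the "query" column is your sequence; with hmmsearch it is the other way round.

```python
from typing import Iterable, List, NamedTuple

class DomainHit(NamedTuple):
    target: str      # column 1: target name
    query: str       # column 4: query name
    i_evalue: float  # column 13: independent E-value of this domain
    score: float     # column 14: bit score of this domain
    ali_from: int    # column 18: alignment start on the sequence
    ali_to: int      # column 19: alignment end on the sequence

def parse_domtblout(lines: Iterable[str]) -> List[DomainHit]:
    """Parse whitespace-delimited --domtblout lines, skipping comments."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        # 22 fixed columns, then a free-text description that may contain spaces
        f = line.split(maxsplit=22)
        hits.append(DomainHit(f[0], f[3], float(f[12]), float(f[13]),
                              int(f[17]), int(f[18])))
    return hits

# Synthetic line in domtblout layout (all values invented for illustration).
SAMPLE = """# comment lines start with a hash
DomA PF00000.1 77 prot1 - 300 1.2e-20 70.1 0.1 1 1 3.0e-24 1.5e-20 69.8 0.1 1 77 120 196 118 198 0.95 made-up domain
"""
hits = parse_domtblout(SAMPLE.splitlines())
```

From there it is straightforward to filter on `i_evalue` or `score`, or to compare the `ali_from`/`ali_to` coordinates with what SMART reports.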
Food for thought: domain boundaries are very ill-defined, so domain hits that are off by 1 or 2 residues may not be much of a problem... If somebody can comment on this, please do!
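If you want to compare hits from two sources without being tripped up by such small offsets, one option is to treat two hits as the same domain when both boundaries agree within a tolerance. A minimal sketch, where the function name and the default tolerance of 2 residues are my own choices rather than any established convention:

```python
def boundaries_agree(start_a, end_a, start_b, end_b, tolerance=2):
    """Return True if both start and end differ by at most `tolerance` residues."""
    return (abs(start_a - start_b) <= tolerance
            and abs(end_a - end_b) <= tolerance)

# Invented coordinates: the same region called by two different databases.
same = boundaries_agree(120, 196, 121, 198)       # off by 1 and 2 residues
different = boundaries_agree(120, 196, 125, 196)  # start is off by 5 residues
```

Whether 2 residues is the right tolerance depends on the domain and on how much you trust either set of boundaries, so treat it as a tunable parameter.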