Am I right in saying that I should either use HMMER 3.0 with Pfam 26 (assuming Rob has rebuilt the seed alignments for the problematic families) or use HMMER 2.3.2 with Pfam 23 (i.e. a release from before the Pfam group in Cambridge started using HMMER 3.0 to produce their profile HMMs)?
What I'm really trying to do is run hmmscan on protein sequences against Pfam to identify all the domains, and then extract the proteins that have zinc finger domains in them.
Any other comments about this would be greatly appreciated.
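For the filtering step described above (scan against Pfam, then keep the proteins with zinc finger domains), a minimal Python sketch might look like the following. It assumes the scan was run with HMMER 3's `--domtblout` option, e.g. `hmmscan --domtblout hits.domtblout Pfam-A.hmm proteins.fasta`, where the file names are placeholders. The function name, the i-Evalue cutoff of 1e-3, and the rule for deciding what counts as a zinc finger (family name starting with "zf-" or a description mentioning "zinc finger") are all assumptions made for illustration, not anything prescribed by Pfam or HMMER. The heuristic will miss families named differently, so a curated list of Pfam accessions would be more reliable.

```python
import sys


def is_zinc_finger(family_name, description):
    """Heuristic (an assumption): Pfam zinc finger families often have
    names starting with 'zf-' or descriptions mentioning 'zinc finger'."""
    desc = description.lower()
    return (family_name.lower().startswith("zf-")
            or "zinc finger" in desc
            or "zinc-finger" in desc)


def zinc_finger_proteins(domtblout_lines, max_ievalue=1e-3):
    """Map each protein ID to the set of zinc finger families found in it.

    Expects lines in hmmscan --domtblout format: 22 whitespace-separated
    columns followed by a free-text description of the target (the family).
    """
    hits = {}
    for line in domtblout_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment lines and blanks
        fields = line.split(None, 22)
        if len(fields) < 22:
            continue  # malformed line
        family = fields[0]            # target name (Pfam family)
        protein = fields[3]           # query name (protein sequence)
        i_evalue = float(fields[12])  # independent E-value of this domain
        description = fields[22].strip() if len(fields) > 22 else ""
        if i_evalue <= max_ievalue and is_zinc_finger(family, description):
            hits.setdefault(protein, set()).add(family)
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for protein, families in sorted(zinc_finger_proteins(handle).items()):
            print(protein, ",".join(sorted(families)), sep="\t")
```

Run as a script with the domtblout file as its only argument, this prints one line per protein with the zinc finger families found in it.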
I guess that, as I am specifically named in this post, I should reply. Although the curators at Pfam have added many additional Zn-finger families to Pfam, the current version of HMMER (3.0) still struggles to find all of these short, relatively divergent domains. I can confirm that Sean Eddy, the author of HMMER, is actively working on fixing the problem. The problem arises because HMMER 3.0 only produces local-local alignments (i.e. an alignment can be generated from a partial match of the model to a sub-region of the sequence). In HMMER 2.3.2, Zn-fingers and the like were detected in global-local ("glocal") mode (i.e. the full length of the model had to be matched against the sequence or a sub-region of the sequence). Forcing the match along the entire length of the model provided just enough of a boost to the score to enable clear distinction of the motif from the background noise. In HMMER 3.1, glocal alignments will return and the problem will go away. From the initial results I have seen, it will even improve the detection rate of such motifs.
In the meantime, what to do? Well, if this were me and I were specifically interested in Zn-fingers, I would probably run both Pfam 23 and Pfam 26, each with its respective HMMER version (2.3.2 and 3.0), and take the union of the hits. You may also want to run the sequences against InterPro, as the different methods/signatures may detect additional matches.
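Taking the union of the two runs could be sketched as below. This is only an illustration: it assumes you have already reduced each run to a set of protein IDs with at least one Zn-finger hit (the HMMER 2.3.2 and HMMER 3.0 output formats differ, so parsing is left out), and the function name and source labels are made up for the example. Recording which run found each protein makes it easy to see what each version contributes.

```python
def merge_hits(hits_pfam23, hits_pfam26):
    """Union two sets of protein IDs, noting which run(s) found each one.

    hits_pfam23: protein IDs with Zn-finger hits from HMMER 2.3.2 + Pfam 23
    hits_pfam26: protein IDs with Zn-finger hits from HMMER 3.0 + Pfam 26
    Returns a dict {protein_id: [source labels]} in sorted ID order.
    """
    merged = {}
    for protein in sorted(set(hits_pfam23) | set(hits_pfam26)):
        sources = []
        if protein in hits_pfam23:
            sources.append("Pfam23/HMMER2")
        if protein in hits_pfam26:
            sources.append("Pfam26/HMMER3")
        merged[protein] = sources
    return merged


if __name__ == "__main__":
    # Hypothetical protein IDs, for illustration only.
    old_run = {"protA", "protB"}
    new_run = {"protB", "protC"}
    for protein, sources in merge_hits(old_run, new_run).items():
        print(protein, "+".join(sources), sep="\t")
```

Hits from InterPro could be folded in the same way by adding a third set and label.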