I'm wondering if anyone out there has any experience aligning copies of a protein domain to its Pfam domain using the HMM. Currently I'm aligning large numbers of sequences (1 to 100k) with Clustal Omega and supplying the Pfam HMM via the --hmm-in parameter to guide the alignment. However, I'm unhappy with the resulting alignments, as they have a large number of gap positions.
There is no gap opening penalty to set explicitly in Clustal Omega, but from the Clustal Omega README it appears that the way to reduce gaps would be to increase the number of alignment iterations. Currently I've been using the --iter parameter, but I'm wondering if anyone has had positive results setting the maximum number of guide-tree iterations (--max-guidetree-iterations) separately instead?
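To make the setup concrete, here is a minimal sketch of how the two kinds of runs described above could be assembled for comparison. The file names (domain_copies.fasta, PFxxxxx.hmm, the output names) and the helper build_clustalo_cmd are hypothetical placeholders, and the iteration counts are arbitrary; only the clustalo flags themselves come from the Clustal Omega command-line help.

```python
import shlex


def build_clustalo_cmd(seqs, hmm, out, iterations=None, max_guidetree_iterations=None):
    """Return a clustalo argument list for an HMM-guided alignment.

    All paths are supplied by the caller; nothing is executed here.
    """
    cmd = ["clustalo", "-i", seqs, f"--hmm-in={hmm}", "-o", out]
    if iterations is not None:
        # Combined guide-tree/HMM iterations.
        cmd.append(f"--iter={iterations}")
    if max_guidetree_iterations is not None:
        # Separate cap on guide-tree iterations only.
        cmd.append(f"--max-guidetree-iterations={max_guidetree_iterations}")
    return cmd


# Hypothetical file names and arbitrary iteration counts, for illustration only.
combined = build_clustalo_cmd(
    "domain_copies.fasta", "PFxxxxx.hmm", "iter3.aln.fasta", iterations=3
)
capped = build_clustalo_cmd(
    "domain_copies.fasta", "PFxxxxx.hmm", "iter3_gt1.aln.fasta",
    iterations=3, max_guidetree_iterations=1,
)

# Print shell-ready command lines; pass the lists to subprocess.run to execute.
print(shlex.join(combined))
print(shlex.join(capped))
```

Writing each variant to its own output file keeps the alignments side by side, so they can be compared afterwards rather than judged one at a time.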
If I had to summarize my question I'm basically asking:
- Are there any other programs out there to do external profile alignment?
- Does anyone know the optimal way to use iterations in Clustal Omega to reduce the number of gaps in an alignment? Is it the guide tree iterations that are causing the gaps to be introduced?
- How would you evaluate how much alignments improve with iterations other than visual inspection? Does anyone know more objective criteria to benchmark with?
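As a starting point for the last question, one crude number that could be tracked across runs is how gappy the alignment is. The sketch below assumes the alignment is written as aligned FASTA; the function names, the 0.5 threshold, and the toy alignment are illustrative assumptions. Gap counts alone say nothing about whether an alignment is biologically correct, so this is only a rough proxy, not a benchmark.

```python
GAP_CHARS = "-."


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def gap_stats(alignment, threshold=0.5):
    """Return (overall gap fraction, number of gappy columns, alignment length).

    A column counts as gappy when more than `threshold` of its rows are gaps.
    """
    seqs = list(alignment.values())
    if not seqs:
        raise ValueError("empty alignment")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences are not all the same aligned length")
    # Fraction of gap characters in each column.
    col_gap = [
        sum(s[i] in GAP_CHARS for s in seqs) / len(seqs) for i in range(length)
    ]
    overall = sum(col_gap) / length if length else 0.0
    gappy = sum(f > threshold for f in col_gap)
    return overall, gappy, length


# Toy alignment for illustration only (not real data).
toy = """>a
MK-LV
>b
MKALV
>c
M--LV
"""
overall, gappy, length = gap_stats(read_fasta(toy))
print(f"gap fraction={overall:.2f}, gappy columns={gappy}, length={length}")
```

Running this on the output of each iteration setting would give comparable numbers in place of eyeballing the alignments.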
Thanks in advance, Biostars!