Consider these two sequences:
These are circadian locomoter output cycles kaput protein (CLK) orthologs from E. superba and D. melanogaster respectively.
The UniProt webpages (correctly?) indicate that both orthologs carry a complement of 1 bHLH domain and 2 PAS domains (note that the annotation sources are different: InterPro for E. superba and PROSITE for D. melanogaster).
I happened to try and annotate the domains again using hmmscan (https://www.ebi.ac.uk/Tools/hmmer/) against the Pfam database, and to my surprise it did not find the second PAS domain in the D. melanogaster sequence. (It did "correctly" identify 1x bHLH and 2x PAS domains in the E. superba sequence.)
Now this happened with the default search parameters, which include using the so-called Gathering Threshold as the cut-off that decides whether a sequence is a member of a domain family. Changing the cut-off to an E-value (< 0.01, the default) "restores" the previously missing domain. I must also note that all isoforms of this sequence seem to have the same problem.
My question is: why is this the case? The D. melanogaster CLK sequence is arguably the best studied ortholog from the CLK family. This sequence has been used as a bait many times to discover orthologs in other organisms (most of which I presume hmmscan + Pfam annotate correctly w.r.t. their domains). I don't know how the Pfam HMM profiles are constructed but I presume this specific sequence contributed to the construction process.
Why then does the hmmscan + Pfam combination using default cut-offs annotate this sequence incorrectly in comparison to its ortholog? Is there any way to fix this?
Edit: I am reading through the HMMER user guide, but it is tough going, and I'm not sure I'll find an answer in there (or, if I do find it, that I'll actually understand it).

Hi lieven.sterck, thank you for the answer. I don't mean to be rude, but I don't think your answer addresses my question directly (tangentially perhaps, yes). I do urge you to re-read the OP, or am I misunderstanding something in your answer? (I apologize for this coming off as somewhat confrontational!!)
My answer (in condensed form) is mostly this
&
and to add: I don't think hmmscan + Pfam is what most people do; moreover, I believe UniProt also uses InterPro (I can't find the reference on their website right now). As for why it works on one of the sequences but not on the other: most likely (I don't know the exact details of this analysis/domain) the hit just fits within the threshold for one and not for the other.
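To illustrate what "just fits within the threshold" can look like, here is a minimal Python sketch. It is not how HMMER is implemented: the class, the function names, the scores and the threshold value are all made up for illustration, and it models only a single per-domain bit-score cut-off, whereas Pfam's gathering thresholds come as a sequence-level and a domain-level value per family.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    """One hypothetical domain hit; the numbers used below are invented."""
    name: str
    bit_score: float
    e_value: float


def passes_gathering(hit: DomainHit, ga_domain_score: float) -> bool:
    # Gathering-threshold mode: a curated, family-specific bit-score cut-off.
    return hit.bit_score >= ga_domain_score


def passes_evalue(hit: DomainHit, max_e_value: float = 0.01) -> bool:
    # E-value mode: one generic significance cut-off for every family.
    return hit.e_value <= max_e_value


# Hypothetical values, NOT taken from the real CLK search results.
GA_PAS = 22.0
first_pas = DomainHit("PAS (first copy)", bit_score=30.1, e_value=1e-6)
second_pas = DomainHit("PAS (second copy)", bit_score=18.5, e_value=0.003)

for hit in (first_pas, second_pas):
    print(hit.name,
          "| GA:", passes_gathering(hit, GA_PAS),
          "| E-value:", passes_evalue(hit))
```

With these invented numbers the second copy is significant by E-value but scores below the family's gathering threshold, so it is reported in one mode and dropped in the other.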
Domain (HMM) profiles are built from a multiple alignment of similar sequences; a profile does not reflect any one specific sequence.
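As a toy illustration of that point (a per-column frequency table, which is far simpler than a real profile HMM, with an invented four-sequence alignment and no gap handling), the consensus of an alignment need not match any of the sequences that went into it:

```python
from collections import Counter


def column_frequencies(alignment):
    """Per-column residue frequencies for equal-length, gap-free sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()})
    return profile


# Invented toy alignment; these are not real CLK or PAS sequences.
toy_alignment = ["MKLA", "MRLV", "MKIV", "TKLV"]
profile = column_frequencies(toy_alignment)

# Most frequent residue per column.
consensus = "".join(max(col, key=col.get) for col in profile)
print(consensus, consensus in toy_alignment)
```

Each input sequence differs from the consensus at one position, so even a sequence that was part of the seed alignment can score noticeably below the best possible match to the profile.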
and no offense taken ;)
I suggest running both sequences through InterProScan; it is the most comprehensive domain search you can do.