I'm interested in retrieving deep homologs for a number of genes that belong to a protein superfamily (let's say, GPCRs). One of the strategies I tried was to run HMMER searches, using either an alignment or an HMM built from an alignment. From what I have read, many people use specific protein domains to decide which of the proteins found are true matches. In my case, my proteins don't have a specific domain that characterizes them; they share a common domain with the rest of the family (e.g. the 7TM domain). Therefore, although my search returns a good number of good matches (proteins previously identified in the database as homologs of my query genes), a number of other proteins from the family appear too, which makes it hard to determine whether the uncharacterized proteins in my results are true matches or not. I tried to improve this approach by using different domain architectures, but I'm still retrieving false matches. I also tried adjusting the E-value and bit score thresholds, and using a different kind of search (e.g. iterative search), but I haven't found a fully satisfactory way to tackle the issue.
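For reference, here is a minimal sketch of the kind of E-value and bit score filtering mentioned above. It assumes the search results were saved in HMMER's tabular per-sequence format (`hmmsearch --tblout`), where the first six whitespace-separated columns are target name, target accession, query name, query accession, full-sequence E-value and full-sequence score. The names `Hit`, `parse_tblout` and `filter_hits`, the example rows, and the cutoff values are all hypothetical placeholders, not recommended settings.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    target: str    # name of the sequence that was hit
    query: str     # name of the profile HMM used as query
    evalue: float  # full-sequence E-value
    score: float   # full-sequence bit score


def parse_tblout(lines: Iterable[str]) -> List[Hit]:
    """Parse per-sequence hits from `hmmsearch --tblout` style lines."""
    hits = []
    for line in lines:
        # Comment/header lines start with '#'; skip them and blank lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # 0: target name, 2: query name, 4: full-seq E-value, 5: full-seq score
        hits.append(Hit(fields[0], fields[2], float(fields[4]), float(fields[5])))
    return hits


def filter_hits(hits: List[Hit], max_evalue: float = 1e-10,
                min_score: float = 50.0) -> List[Hit]:
    """Keep hits that pass both cutoffs (placeholder values, to be tuned)."""
    return [h for h in hits if h.evalue <= max_evalue and h.score >= min_score]


# Made-up rows in tblout layout, for illustration only.
EXAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias
protA  -  myfam  -  1e-40  140.2  0.1  2e-40  139.0  0.1  1.0  1  0  0  1  1  1  1  hypothetical
protB  -  myfam  -  3e-05   25.3  0.0  4e-05   24.9  0.0  1.1  1  0  0  1  1  1  1  hypothetical
""".splitlines()

kept = filter_hits(parse_tblout(EXAMPLE))
print([h.target for h in kept])
```

A filter like this only separates strong hits from weak ones. On its own it does not distinguish members of my subfamily from other members of the superfamily that share the same domain, which is the core of the problem described above.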