This follows on from my previous question about discovering protein homology. After finding sequences of interest by searching against a profile of a family, I want to determine whether these sequences can be assigned to that family or not. How can one score proteins against each other so that they can be grouped in this way?
Originally, this "family" was determined via simple statistics (pairwise alignment scoring, with z-scores calculated by shuffling the sequences), although I'm not convinced this is sophisticated enough to determine membership. I'm therefore looking for a more rigorous scoring method. I am adding important secondary-structure features to my scoring function, but beyond this, I can't find much on Google about this type of scoring.
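For concreteness, here is a minimal sketch of the kind of shuffle-based z-score I mean. It is an illustration under stated assumptions, not the exact procedure that was used to define the family: the function names (`nw_score`, `shuffle_z_score`), the match/mismatch/gap values, and the number of shuffles are all hypothetical, and a real analysis would probably use a substitution matrix such as BLOSUM62 rather than flat match/mismatch scores.

```python
import math
import random
import statistics


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not a real
    substitution matrix.
    """
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def shuffle_z_score(a, b, n_shuffles=200, seed=0):
    """Z-score of the a-vs-b alignment score against a shuffled-b background.

    Shuffling b preserves its length and composition while destroying
    residue order, which gives an empirical null distribution of scores.
    """
    rng = random.Random(seed)
    observed = nw_score(a, b)
    residues = list(b)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        null_scores.append(nw_score(a, "".join(residues)))
    mu = statistics.mean(null_scores)
    sigma = statistics.stdev(null_scores)
    if sigma == 0:
        # Degenerate background (e.g. a homopolymer): no spread to normalise by
        return 0.0 if observed == mu else math.copysign(math.inf, observed - mu)
    return (observed - mu) / sigma
```

Calling `shuffle_z_score(seq_a, seq_b)` for every pair gives a matrix of z-scores that can be thresholded or clustered. My concern is that a number like this only says a pair aligns better than chance given composition, which is why I doubt it is enough on its own to decide family membership.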