I'm carrying out motif enrichment on a set of 15-mer kinase substrate peptides using motif-x, which results in something like this (see below).
Now I'm trying to select the most significant motifs. According to motif-x, all returned motifs are significantly enriched, but in this case there are 15 enriched motifs, and in others there are as many as 40, which is too many. Does anyone have suggestions on how I could select the best motifs? Here are my ideas:
1- Simply select the top k motifs (e.g. the top 3) based on the score.
2- Use a score threshold that maximises similarity between the extracted motifs and a "gold standard" kinase motif that I have. But I was unable to find a score cutoff that works across different kinases.
3- Cluster the motifs somehow to avoid redundancy, but I couldn't find any tools for clustering protein motifs. Most of the existing ones were developed for DNA motifs.
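
To make ideas 1 and 3 more concrete, here is a rough sketch of how they might be combined: rank the motifs by score, then greedily keep a motif only if it is not too similar to one already kept. It assumes the motifs are 15-character strings in dot notation (`.` for a wildcard position) and that each comes with a numeric score. The function names, the example motifs and scores, and the Jaccard cutoff of 0.5 are all made up for illustration and are not part of motif-x.

```python
from typing import FrozenSet, List, Tuple

def fixed_positions(motif: str) -> FrozenSet[Tuple[int, str]]:
    """Return the (index, residue) pairs of a motif that are not wildcards ('.')."""
    return frozenset((i, aa) for i, aa in enumerate(motif) if aa != ".")

def jaccard(a: FrozenSet, b: FrozenSet) -> float:
    """Jaccard similarity between two sets of fixed positions."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def select_nonredundant(
    motifs: List[Tuple[str, float]],
    k: int = 3,
    max_sim: float = 0.5,  # arbitrary cutoff, chosen only for illustration
) -> List[Tuple[str, float]]:
    """Greedily keep up to k top-scoring motifs whose similarity to every
    previously kept motif is at most max_sim."""
    ranked = sorted(motifs, key=lambda m: m[1], reverse=True)
    kept: List[Tuple[str, float]] = []
    for motif, score in ranked:
        feats = fixed_positions(motif)
        if all(jaccard(feats, fixed_positions(m)) <= max_sim for m, _ in kept):
            kept.append((motif, score))
        if len(kept) == k:
            break
    return kept

# Hypothetical motifs and scores, not real motif-x output.
example = [
    (".......SP......", 30.0),
    ("....P..SP......", 25.0),  # refinement of the first motif
    ("....R..S.......", 20.0),
]
selected = select_nonredundant(example, k=3, max_sim=0.5)
for motif, score in selected:
    print(motif, score)
```

Note that the central phosphosite residue counts as a fixed position here, so every pair of motifs with the same central residue shares at least one feature. Whether that position should be excluded from the similarity, and what cutoff is sensible, would need to be checked against the "gold standard" motifs.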