Hello BioStar Community,
I am working with a database that stores motif sequences for proteins. The motifs (subsequences of the primary sequences of GPCRs) were extracted by an iterative database-scanning algorithm that identifies the most conserved subsequences in a multiple sequence alignment, subject to certain criteria (such as motif length and whether motifs are allowed to overlap). Because I will be doing extensive work with these motifs, I need a way to demonstrate that they are truly non-random. I would like a method that randomly selects motifs under constraints and criteria similar to those used for the original ones. That way, given a multiple sequence alignment, I could compare the profile of the original motifs with that of the random ones to test whether or not they superimpose on each other (they should). Can anyone suggest a way of approaching this?
What do you mean by randomly selecting motifs? Do you mean randomly selecting subsequences from your genome?
@I Albert: Yes. To prove that the motifs have been selected in a non-random way, I would like some way of randomly selecting subsequences from my sequence alignments (proteomic sequences) and repeating this process many times over.
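To make the idea concrete, here is a rough Python sketch of the resampling described above, not a tested pipeline. It assumes the alignment is a list of equal-length strings and that each motif is a (start, end) column window. The score (mean fraction of the most common residue per column, with gaps counted like any other character) is only a placeholder for whatever profile measure is actually used, and all function names are hypothetical.

```python
import random
from collections import Counter


def windows_overlap(windows):
    """Return True if any two (start, end) windows share a column."""
    ordered = sorted(windows)
    return any(prev[1] > nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


def random_windows(aln_len, lengths, rng, allow_overlap=False, max_tries=1000):
    """Draw one random window per motif length, with the same overlap rule."""
    for _ in range(max_tries):
        windows = []
        for length in lengths:
            start = rng.randrange(aln_len - length + 1)
            windows.append((start, start + length))
        if allow_overlap or not windows_overlap(windows):
            return windows
    raise RuntimeError("could not place non-overlapping windows")


def column_conservation(alignment, col):
    """Fraction of sequences carrying the most common character in a column."""
    counts = Counter(seq[col] for seq in alignment)
    return counts.most_common(1)[0][1] / len(alignment)


def mean_conservation(alignment, windows):
    """Placeholder score: average column conservation over all window columns."""
    cols = [c for start, end in windows for c in range(start, end)]
    return sum(column_conservation(alignment, c) for c in cols) / len(cols)


def empirical_p_value(alignment, motif_windows, n_iter=1000, seed=0,
                      allow_overlap=False):
    """Compare the observed motif score with scores of random window sets."""
    rng = random.Random(seed)
    aln_len = len(alignment[0])
    lengths = [end - start for start, end in motif_windows]
    observed = mean_conservation(alignment, motif_windows)
    hits = 0
    for _ in range(n_iter):
        sampled = random_windows(aln_len, lengths, rng, allow_overlap)
        if mean_conservation(alignment, sampled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_iter + 1)
```

Calling `empirical_p_value(alignment, motif_windows)` returns the observed score together with the fraction of random window sets that score at least as high. A small fraction would suggest the original motifs are more conserved than randomly placed windows of the same lengths.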