For short(ish) amino acid sequences, you could write a brief R script to do this for you. For example, for the amino acid sequence "VARY":
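library(Deducer)  # perm() comes from this package
input <- "VARY"
# Split the sequence into single residues
sp.input <- strsplit(input, split='')[[1]]
# Build the matrix of permutations of the residues
perms <- perm(sp.input)
print(perms)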
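Before building the full matrix, it can help to estimate how many permutations to expect. The following is only a rough sketch in base R; the helper names n_orderings and n_distinct are made up for illustration and are not part of Deducer.

```r
# Hypothetical helper: number of orderings of n residues,
# treating every position as distinct (n!)
n_orderings <- function(input) {
  factorial(nchar(input))
}

# Hypothetical helper: number of distinct strings once repeated
# residues are accounted for: n! / prod(count_i!)
n_distinct <- function(input) {
  counts <- as.vector(table(strsplit(input, split = "")[[1]]))
  factorial(nchar(input)) / prod(factorial(counts))
}

n_orderings("VARY")   # 24
n_distinct("VARY")    # 24
n_distinct("AAVV")    # 6
```

For a 20-residue peptide, factorial(20) is already about 2.4e18.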
The permutation matrix will rapidly get quite large as the number of input characters increases, so generating all alternatives soon becomes impractical. Alternatively, you can call sample() repeatedly, however many times you need.
sample(sp.input,length(sp.input), replace=FALSE)
num.samples <- 10
for (i in seq_len(num.samples))
{
  # Shuffle the residues without replacement, then join them back into one string
  random.sample <- sample(sp.input, size=length(sp.input), replace=FALSE)
  random.sample <- paste(random.sample, collapse='')
  print(random.sample)
}
Another solution could be to sort your peptide sequence, run-length encode it, and divide each run length by the total number of residues. This gives you the probability of each individual amino acid. You could then pass the distinct residues to sample() along with this vector of probabilities (its prob argument). That way, the vector you sample from never gets beyond a length of 22, however long the peptide is. The replace parameter would have to be TRUE, and the size parameter would have to be set independently of the input length.
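A minimal sketch of that idea in base R might look like the following. The function name composition_sample and the example input are hypothetical, chosen only for illustration; rle() and the prob argument of sample() are standard.

```r
# Hypothetical helper: draw a random sequence of length `size` whose
# residues follow the composition of `input`
composition_sample <- function(input, size) {
  # Sort the residues so identical ones are adjacent
  residues <- sort(strsplit(input, split = "")[[1]])
  # Run-length encode: one entry per distinct residue
  runs <- rle(residues)
  # Relative frequency of each distinct residue
  probs <- runs$lengths / length(residues)
  # Sample with replacement, weighted by those frequencies
  drawn <- sample(runs$values, size = size, replace = TRUE, prob = probs)
  paste(drawn, collapse = "")
}

set.seed(1)  # only to make the example reproducible
composition_sample("VARYVVAA", size = 12)
```

Note that this preserves the residue composition only on average. Unlike the shuffling approach, an individual sample will not necessarily contain exactly the same residues as the input.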
How about enzyme families? Is there any method to create a random sequence from a set of aligned sequences? I will update the question now. Sorry about that.
Oh, that's a completely different problem. What is your goal? To provide a control for multiple alignment algorithms?
Not really. I am making simulated data as a proof of concept for a method. The method is a predictive model for function prediction of protein sequences. Should I open a new question?