Hi all, this is my first post here, so please bear with me if I have not followed the usual conventions.
I have an alignment of protein-coding genes, and I am trying to find out whether certain regions of the alignment evolve under dual constraints, i.e. in addition to the constraint imposed by protein coding, some of the DNA elements also have a regulatory role. Such constraints are usually observed for elements like exonic splicing enhancers.
One way to do that is to look at conservation at synonymous sites. Since synonymous sites are free from protein-coding constraints, any conservation observed at such sites would imply an additional role for those subsequences, possibly a regulatory one. However, I do not know how to do the math/stats to show that the conservation observed at synonymous sites is statistically significant, i.e. greater than what would be expected under a null model in which the only constraint acting on the multiple sequence alignment is protein coding. Any help in this regard would be very welcome.
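One crude way to make this concrete is sketched below in Python. It is only an illustration under strong assumptions, not an established method: the alignment is assumed to be codon-aligned and in frame, only fourfold-degenerate codon columns (standard genetic code) are counted as synonymous sites, and the null is simplified to "every such site is equally likely to be conserved", which ignores phylogeny, mutation bias and codon usage. The function names `fourfold_site_conservation` and `window_enrichment_pvalue` are made up for this example. The test asks whether a chosen codon window contains more fully conserved fourfold sites than expected if conserved sites were spread at random along the gene (a one-sided hypergeometric test).

```python
from math import comb

# First two bases of the codon families that are fourfold degenerate
# in the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_site_conservation(alignment):
    """Map codon index -> True/False (third base identical in all sequences).

    Only codon columns where every sequence shares the same
    fourfold-degenerate prefix are kept; all other columns are skipped.
    """
    length = len(alignment[0])
    if length % 3 != 0 or any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be equal length and a multiple of 3")

    sites = {}
    for i in range(0, length, 3):
        codons = [seq[i:i + 3].upper() for seq in alignment]
        prefixes = {codon[:2] for codon in codons}
        if len(prefixes) != 1 or next(iter(prefixes)) not in FOURFOLD_PREFIXES:
            continue  # amino acid differs or site is not fourfold degenerate
        thirds = {codon[2] for codon in codons}
        if not thirds <= set("ACGT"):
            continue  # skip gaps and ambiguous bases
        sites[i // 3] = len(thirds) == 1
    return sites


def window_enrichment_pvalue(sites, start, end):
    """One-sided hypergeometric P(X >= k) for the codon window [start, end).

    N = informative sites in the gene, K = conserved sites in the gene,
    n = informative sites in the window, k = conserved sites in the window.
    """
    N = len(sites)
    K = sum(sites.values())
    in_window = [conserved for idx, conserved in sites.items() if start <= idx < end]
    n = len(in_window)
    k = sum(in_window)
    if n == 0:
        return 1.0  # no informative sites, so no evidence either way
    tail = sum(comb(K, j) * comb(N - K, n - j) for j in range(k, min(K, n) + 1))
    return tail / comb(N, n)
```

If many windows are scanned along the gene, the resulting p-values would need a multiple-testing correction. A more realistic null would probably come from a codon substitution model fitted on the phylogeny rather than from this site-shuffling assumption.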