Question: multiple sequence alignment using HMM and simulated annealing
3.0 years ago by
Rojhin0
Rojhin0 wrote:

Can anyone help me with multiple sequence alignment (MSA) using a hidden Markov model (HMM) by giving an example or a reference other than these two:

1- Eddy, Sean R., et al., Multiple alignment using hidden Markov models; 2- Boer, Jonas, Multiple alignment using hidden Markov models, Seminar Hot Topics in Bioinformatics.

I know that there are three kinds of states (match, deletion and insertion), and I know that the emission and transition probabilities can be learned with the Viterbi algorithm. What is unclear to me is this: to do a multiple alignment I need an HMM, but to build an HMM I need aligned sequences, and the sequences I have are unaligned. I also understand that simulated annealing lets us introduce randomness into the model to find better solutions, and that this algorithm is different from the E-M algorithm.

I have another question: how many states should the HMM for this problem have at the first step? Does the number of states change during convergence, or is it fixed from the start?

If anybody can help me understand what really happens in MSA with an HMM, I would appreciate it.

I should explain that many DNA, RNA and protein sequences have been found, but there is much less information about the structure and function of each protein. So we do MSA to understand the similarities between sequences, to find out whether they are homologous (share a common ancestor), and to infer the unknown structures and functions of sequences.

modified 3.0 years ago • written 3.0 years ago by Rojhin0
3.0 years ago by
EMBL Heidelberg, Germany
Jean-Karim Heriche22k wrote:

You don't need an HMM to do a multiple sequence alignment. The typical procedure is to do the multiple sequence alignment first with some multiple sequence alignment software (e.g. Clustal, Muscle, MAFFT), then derive an HMM from the alignment. Simulated annealing is a technique for optimizing parameters. It can be used to produce multiple sequence alignments or to estimate HMM parameters. It's unclear which application you're referring to.
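To make the "derive an HMM from the alignment" step a bit more concrete, here is a minimal Python sketch of one part of it: choosing match columns and estimating match-state emission probabilities. It is an illustration only, not what any particular tool does. The function names, the 50% gap threshold, the DNA alphabet and the add-one pseudocount are all assumptions made for this example, and transition probabilities are left out.

```python
from collections import Counter

def match_columns(alignment, max_gap_fraction=0.5):
    """Return indices of alignment columns treated as match states.

    Assumption for this sketch: a column is a match column when fewer than
    max_gap_fraction of the sequences have a gap ('-') in it.
    """
    n_seqs = len(alignment)
    cols = []
    for j in range(len(alignment[0])):
        gaps = sum(1 for seq in alignment if seq[j] == "-")
        if gaps / n_seqs < max_gap_fraction:
            cols.append(j)
    return cols

def match_emissions(alignment, alphabet="ACGT", pseudocount=1.0):
    """Estimate one emission distribution per match column.

    Residue counts are smoothed with a flat pseudocount (an assumption),
    so residues never seen in a column still get nonzero probability.
    """
    emissions = []
    for j in match_columns(alignment):
        counts = Counter(seq[j] for seq in alignment if seq[j] != "-")
        total = sum(counts[a] for a in alphabet) + pseudocount * len(alphabet)
        emissions.append({a: (counts[a] + pseudocount) / total for a in alphabet})
    return emissions

# Hypothetical toy alignment, only for illustration.
toy_alignment = [
    "ACG-T",
    "ACGAT",
    "AC--T",
    "TCG-T",
]
cols = match_columns(toy_alignment)
profile = match_emissions(toy_alignment)
print(cols)
print(profile[0])
```

In this toy case the fourth column is mostly gaps, so it would be handled by an insert state rather than a match state, and the model ends up with four match states.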

Thanks for your answer. I know that I can derive an HMM from aligned sequences, but Eddy's paper says in its abstract: "A simulated annealing method is described for training HMM and producing MSAs from initially unaligned protein or DNA sequences".

This means simulated annealing is used to infer the HMM parameters. You can use an HMM to model a set of unaligned sequences and then use this HMM to produce a multiple sequence alignment. The problem with this approach, however, is that it requires a lot of sequences, which is why one usually starts from an existing alignment. You can read more on using HMMs for multiple sequence alignments here.
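As a rough illustration of the last step, going from an HMM to an alignment, the sketch below assumes that each sequence has already been assigned a state path through the profile HMM (for example by Viterbi decoding, which is not shown). It only shows how such paths can be stacked into an alignment. The function name, the path encoding ('M', 'D', 'I') and the choice of lowercase letters and '.' padding for insertions are assumptions of this example.

```python
def paths_to_alignment(seqs, paths, model_length):
    """Assemble alignment rows from per-sequence state paths.

    'M' puts the next residue in the next match column, 'D' puts a gap
    there, and 'I' places the next residue (lowercased) in the insert
    region that follows the most recent match column.
    """
    all_cols, all_inserts = [], []
    for seq, path in zip(seqs, paths):
        cols = []                            # one character per match column
        inserts = [""] * (model_length + 1)  # insert region before/after each column
        i = 0                                # position in the raw sequence
        for state in path:
            if state == "M":
                cols.append(seq[i])
                i += 1
            elif state == "D":
                cols.append("-")
            elif state == "I":
                inserts[len(cols)] += seq[i].lower()
                i += 1
            else:
                raise ValueError(f"unknown state {state!r}")
        if len(cols) != model_length or i != len(seq):
            raise ValueError("path does not fit the sequence or the model length")
        all_cols.append(cols)
        all_inserts.append(inserts)

    # Each insert region is padded to the longest insertion seen there.
    widths = [max(len(ins[k]) for ins in all_inserts) for k in range(model_length + 1)]
    rows = []
    for cols, inserts in zip(all_cols, all_inserts):
        row = inserts[0].ljust(widths[0], ".")
        for k in range(model_length):
            row += cols[k] + inserts[k + 1].ljust(widths[k + 1], ".")
        rows.append(row)
    return rows

# Hypothetical sequences and state paths, only for illustration.
seqs = ["ACGT", "ACGAT", "ACT"]
paths = ["MMMM", "MMMIM", "MMDM"]
rows = paths_to_alignment(seqs, paths, model_length=4)
print("\n".join(rows))
```

In a training loop on unaligned sequences, the paths and the model parameters would be re-estimated in turn; the sketch covers only the final read-out of the alignment.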