Question: Forward-Backward Algorithm For Secondary Structure Prediction
7.4 years ago, Stick wrote:

I want to use an HMM (trained with the forward-backward algorithm) for protein secondary structure prediction.

Basically, a three-state model is used: States = {H=alpha helix, B=beta sheet, C=coil}

and each state has an emission pmf of size 1-by-20 (one probability for each of the 20 amino acids).

After running the forward-backward (Baum-Welch) algorithm on a "training set" of sequences, the expectation maximization converges to a transition matrix (3-by-3, between the three states) and an emission pmf for each state. These estimates are locally optimal, so they can depend on the starting values.
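To make the training step concrete, here is a minimal pure-Python sketch of a scaled forward-backward pass and one Baum-Welch re-estimation step for the three-state model. It is only an illustration under assumptions: the function names (`forward_backward`, `baum_welch_step`), the initial state distribution `pi`, and the ordering of the amino-acid alphabet are not from the post, and no real parameter values are implied.

```python
import math

STATES = ["H", "B", "C"]            # helix, sheet, coil (as in the post)
AMINO = "ACDEFGHIKLMNPQRSTVWY"      # assumed ordering of the 20 amino acids


def forward_backward(seq, pi, A, E):
    """Scaled forward-backward pass for one sequence.

    pi[i]   : initial probability of state i (an assumption, not in the post)
    A[i][j] : transition probability from state i to state j
    E[i][k] : probability that state i emits amino acid AMINO[k]
    Returns (gamma, xi_sum, loglik).
    """
    n, T = len(pi), len(seq)
    obs = [AMINO.index(ch) for ch in seq]

    # Forward pass, rescaling each position so the values do not underflow.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[j] * E[j][obs[0]] for j in range(n)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * E[j][obs[t]]
                   for j in range(n)]
        c = sum(row)
        scale.append(c)
        alpha.append([x / c for x in row])

    # Backward pass, reusing the forward scaling factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * E[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]

    # gamma[t][i]: posterior probability of state i at position t.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]

    # xi_sum[i][j]: expected number of i -> j transitions in this sequence.
    xi_sum = [[0.0] * n for _ in range(n)]
    for t in range(T - 1):
        for i in range(n):
            for j in range(n):
                xi_sum[i][j] += (alpha[t][i] * A[i][j] * E[j][obs[t + 1]]
                                 * beta[t + 1][j] / scale[t + 1])

    loglik = sum(math.log(c) for c in scale)
    return gamma, xi_sum, loglik


def _normalize(row):
    total = sum(row)
    return [x / total for x in row]


def baum_welch_step(seqs, pi, A, E):
    """One EM update; also returns the log-likelihood under the OLD parameters."""
    n, m = len(pi), len(AMINO)
    pi_acc = [0.0] * n
    A_acc = [[0.0] * n for _ in range(n)]
    E_acc = [[0.0] * m for _ in range(n)]
    total = 0.0
    for seq in seqs:
        gamma, xi_sum, loglik = forward_backward(seq, pi, A, E)
        total += loglik
        for i in range(n):
            pi_acc[i] += gamma[0][i]
            for j in range(n):
                A_acc[i][j] += xi_sum[i][j]
        for t, ch in enumerate(seq):
            k = AMINO.index(ch)
            for i in range(n):
                E_acc[i][k] += gamma[t][i]
    return (_normalize(pi_acc), [_normalize(r) for r in A_acc],
            [_normalize(r) for r in E_acc], total)
```

Calling `baum_welch_step` repeatedly until the returned log-likelihood stops increasing gives the converged transition matrix and emission pmfs. The per-position scaling is one common way to avoid underflow on long sequences; working in log space would be an alternative.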

Does anyone know of a dataset (preferably very small) of sequences for which the "correct" values of the transition matrix and emission probabilities have been determined? I would like to apply the forward-backward algorithm to that dataset in Excel and check whether I get the same result, to build my confidence.

And then move on to something less primitive than Excel :o)


Sounds fun. However, aren't there some pretty good models already out there for making structural predictions? I am not trying to discourage you, just curious whether there is a new problem you are trying to tackle.

Reply written 7.4 years ago by Zev.Kronenberg

Hi Zev, thanks for the message. I would like to model the simple case first (i.e. 3 states: {H=alpha helix, B=beta sheet, C=coil}) and then allow for more states (i.e. 3n states: {H1, H2, ..., Hn, B1, B2, ..., Bn, C1, C2, ..., Cn}), similar to what was done in this interesting paper (

I am a newbie in the field, so maybe one day I will have a new problem to tackle! But for now, I am familiarizing myself with established models.

Anyway, I read the paper very carefully, but would like to play with a dataset for which the "correct" values of the transition matrix and emission probabilities are determined. Do you know of such a set, or how I could obtain one? Thanks!
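One possible way to get such a set, sketched here only as an assumption, is to simulate sequences from an HMM whose parameters you pick yourself; those parameters are then the "correct" values by construction. The numbers in `TRUE_PI`, `TRUE_A` and `TRUE_E` below are arbitrary toy values, not measured propensities, and the helper names are hypothetical.

```python
import random

STATES = ["H", "B", "C"]            # helix, sheet, coil
AMINO = "ACDEFGHIKLMNPQRSTVWY"      # assumed ordering of the 20 amino acids


def bumped(favored, weight=3.0):
    """Toy emission pmf: `favored` residues get `weight`, all others get 1."""
    raw = [weight if aa in favored else 1.0 for aa in AMINO]
    total = sum(raw)
    return [x / total for x in raw]


# Hypothetical "true" parameters chosen only for illustration.
TRUE_PI = [0.3, 0.2, 0.5]
TRUE_A = [[0.90, 0.02, 0.08],
          [0.03, 0.85, 0.12],
          [0.10, 0.10, 0.80]]
TRUE_E = [bumped("AELM"), bumped("VIYF"), bumped("GPNS")]


def sample_hmm(length, pi, A, E, rng):
    """Draw one (state path, amino-acid sequence) pair of the given length."""
    states, residues = [], []
    s = rng.choices(range(len(pi)), weights=pi)[0]
    for _ in range(length):
        states.append(STATES[s])
        residues.append(rng.choices(AMINO, weights=E[s])[0])
        s = rng.choices(range(len(pi)), weights=A[s])[0]
    return "".join(states), "".join(residues)


def count_transitions(paths):
    """Empirical transition matrix from known state paths (a sanity check)."""
    n = len(STATES)
    counts = [[0] * n for _ in range(n)]
    for path in paths:
        for a, b in zip(path, path[1:]):
            counts[STATES.index(a)][STATES.index(b)] += 1
    return [[c / max(sum(row), 1) for c in row] for row in counts]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [sample_hmm(50, TRUE_PI, TRUE_A, TRUE_E, rng) for _ in range(5)]
    for path, seq in data:
        print(seq)
        print(path)
```

Because the state paths are known for simulated data, `count_transitions` gives a direct estimate to compare against both the chosen parameters and whatever the forward-backward training recovers. A handful of short sequences is small enough to paste into Excel, though estimates from so little data will be noisy.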

Reply written 7.4 years ago by Stick