How to calculate the Secondary Structure similarity score between a sequence and a protein template?
7.8 years ago

Hi

I am reading the paper "Fold Recognition by Predicted Alignment Accuracy". In this paper, the authors first align the input sequence with the sequence of a template protein.

Then, to define a structural similarity between the sequence and the template, the secondary structure of the input sequence is first predicted with a tool such as PSIPRED. PSIPRED assigns three values to each residue I of the input sequence: Alpha_Helix(I), Beta_Sheet(I), and Loop(I). These values can be read as confidence levels for residue I being in an alpha-helix, beta-sheet, or loop region (assume they sum to one).

The contribution of the paper is that, for an aligned pair of residues (for example, residue J from the sequence and residue K from the template), the algorithm defines a structural similarity measure for the pair as follows:

(Recall that J's confidence for each region was already calculated with PSIPRED, and that K's region is known because K is a residue of the template protein, whose structure is known.)

if K is in an alpha-helix region of the template:
    similarity(J,K) = Alpha_Helix(J) - Loop(J)
if K is in a beta-sheet region of the template:
    similarity(J,K) = Beta_Sheet(J) - Loop(J)
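
The rule above can be written as a small Python sketch. This only illustrates the scoring as described in this post, not the paper's actual implementation. The names `SSConfidence` and `similarity` and the one-letter state codes `"H"` (helix) and `"E"` (sheet) are assumptions made for the example. The post does not say what happens when K is in a loop region, so the sketch raises an error in that case rather than guessing.

```python
from typing import NamedTuple


class SSConfidence(NamedTuple):
    """Three-state confidences for one query residue (assumed to sum to 1)."""
    helix: float
    sheet: float
    loop: float


def similarity(query: SSConfidence, template_state: str) -> float:
    """Score an aligned pair (J, K) as described in the post.

    query          -- predicted confidences for query residue J
    template_state -- known state of template residue K:
                      "H" for alpha helix, "E" for beta sheet (assumed codes)
    """
    if template_state == "H":
        return query.helix - query.loop
    if template_state == "E":
        return query.sheet - query.loop
    # The loop case is not described in the post, so it is left undefined here.
    raise ValueError(f"no rule given for template state {template_state!r}")


# Made-up confidences for a single residue J.
j = SSConfidence(helix=0.5, sheet=0.25, loop=0.25)
print("K in helix:", similarity(j, "H"))
print("K in sheet:", similarity(j, "E"))
```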


What I can't understand is why Loop(J) is subtracted from Alpha_Helix(J) or Beta_Sheet(J). I think there should be a biological rationale for it, but I don't know what it is. What I would have expected is:

if K is in an alpha-helix region of the template:
    similarity(J,K) = Alpha_Helix(J)
if K is in a beta-sheet region of the template:
    similarity(J,K) = Beta_Sheet(J)
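
To make the difference between the two definitions concrete, here is a minimal Python sketch that evaluates both on made-up confidence values. The function names, the dictionary keys `"H"`, `"E"`, `"C"` (helix, sheet, loop), and the numbers are all assumptions for illustration; none of them come from the paper. The sketch only shows the arithmetic: with the loop term subtracted, the score can become negative when the query residue is predicted to be mostly loop, whereas the score without it always stays between 0 and 1.

```python
def score_with_loop_penalty(conf: dict, template_state: str) -> float:
    """Variant described for the paper: confidence minus loop confidence."""
    return conf[template_state] - conf["C"]


def score_without_loop_penalty(conf: dict, template_state: str) -> float:
    """Variant proposed in the question: confidence only."""
    return conf[template_state]


# Made-up three-state confidences for two query residues (each sums to 1).
examples = {
    "confident helix": {"H": 0.75, "E": 0.0, "C": 0.25},
    "mostly loop": {"H": 0.25, "E": 0.125, "C": 0.625},
}

# Both residues are assumed to be aligned to a helical template residue K.
for label, conf in examples.items():
    with_penalty = score_with_loop_penalty(conf, "H")
    without_penalty = score_without_loop_penalty(conf, "H")
    print(f"{label}: with loop term = {with_penalty}, without = {without_penalty}")
```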

Fold-Recognition Protein Secondary-Structure • 2.9k views