Question: How to calculate the Secondary Structure similarity score between a sequence and a protein template?

Hi

I am reading the paper "Fold Recognition by Predicted Alignment Accuracy". In this paper, the author first aligns the input sequence with the sequence of a template protein.

Then, to define a structural similarity between the sequence and the template protein, the secondary structure of the sequence is first predicted with a tool such as PSIPRED. PSIPRED assigns three values to each residue I of the input sequence: Alpha_Helix(I), Beta_Sheet(I), and Loop(I). These values can be read as confidence levels for residue I being in an alpha-helix, beta-sheet, or loop region (assume they sum to one).

The contribution of the paper is that, for an aligned pair of residues (for example, residue J from the sequence and residue K from the template), the algorithm defines a structural similarity measure for the pair as follows:

(Remember that J's confidence for each region was already calculated with PSIPRED, and K's region is known because K is a residue of the template protein, whose structure is known.)

If K is in an alpha-helix region of the template:

similarity(J,K) = Alpha_Helix(J) - Loop(J);

If K is in a beta-sheet region of the template:

similarity(J,K) = Beta_Sheet(J) - Loop(J);
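
To make the definition concrete, here is a minimal Python sketch of the score as I have described it above. The function name, the dictionary layout, and the keys "helix", "sheet", and "loop" are my own illustrative choices, not taken from the paper or from PSIPRED's output format. No rule is given above for the case where K is in a loop region, so the sketch raises an error there instead of guessing.

```python
def similarity(conf_j, template_state_k):
    """Score an aligned pair (J, K) using the two rules quoted above.

    conf_j: predicted confidences for sequence residue J, keyed by
            "helix", "sheet" and "loop" (hypothetical key names).
    template_state_k: known secondary-structure state of template residue K.
    """
    if template_state_k == "helix":
        return conf_j["helix"] - conf_j["loop"]
    if template_state_k == "sheet":
        return conf_j["sheet"] - conf_j["loop"]
    # No rule is stated in this post for a template residue in a loop.
    raise ValueError(f"no rule given for template state {template_state_k!r}")


# Example residue J: values chosen for illustration, summing to one.
residue_j = {"helix": 0.75, "sheet": 0.0, "loop": 0.25}
print(similarity(residue_j, "helix"))  # 0.5
print(similarity(residue_j, "sheet"))  # -0.25
```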

What I can't understand is why we subtract Loop(J) from Alpha_Helix(J) or Beta_Sheet(J). I think there should be a biological rationale for this, but I don't know what it is. What I would have expected is:

If K is in an alpha-helix region of the template:

similarity(J,K) = Alpha_Helix(J);

If K is in a beta-sheet region of the template:

similarity(J,K) = Beta_Sheet(J);
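
To show where the two versions actually differ, here is a small Python sketch that evaluates both on two made-up residues aligned to a helix residue of the template. The function names and the confidence values are hypothetical and chosen only for illustration. The sketch shows the arithmetic only and says nothing about the paper's reasoning.

```python
def score_with_loop_term(conf_j, state_k):
    # Version with the loop confidence subtracted.
    return conf_j[state_k] - conf_j["loop"]


def score_without_loop_term(conf_j, state_k):
    # Version that uses only the confidence for K's state.
    return conf_j[state_k]


# Two hypothetical residues with the same helix confidence (0.5);
# the remaining confidence goes to loop in one case and to sheet in the other.
helix_or_loop = {"helix": 0.5, "sheet": 0.0, "loop": 0.5}
helix_or_sheet = {"helix": 0.5, "sheet": 0.5, "loop": 0.0}

for name, conf in [("helix_or_loop", helix_or_loop),
                   ("helix_or_sheet", helix_or_sheet)]:
    print(name,
          score_with_loop_term(conf, "helix"),
          score_without_loop_term(conf, "helix"))
```

Without the loop term, both residues score 0.5 against a helix position. With the loop term, the residue that might be a loop scores 0.0 and the other still scores 0.5. The loop-subtracted score also ranges over [-1, 1] instead of [0, 1], so a residue confidently predicted as loop gets a negative score when aligned to a helix or sheet position.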