Protein sequence identity calculation


9.7 years ago

ajingnk ▴ 130

I am new to this field and a little confused about how to define an overall sequence identity between two proteins.

A protein can have multiple chains, so what I have been doing is comparing chain to chain. For proteins A and B, I take, for each chain in A, its maximum similarity to any chain in B, and then take the minimum over all of those maxima. Alternatively, I concatenate all the chain sequences of a protein into one whole sequence and compare the two concatenated sequences.
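To make the two approaches concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `identity` counts identical residues with a gap-penalty-free alignment (the longest common subsequence) and divides by the shorter length, which is just one of several possible conventions, and the function names and example chains are made up for this post. A real analysis would normally take identities from an alignment tool instead.

```python
from typing import Dict


def identical_positions(a: str, b: str) -> int:
    """Longest common subsequence length: the largest number of identical
    residues a gapped alignment of a and b can pair up (no gap penalty)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Identical positions divided by the shorter length (assumed convention)."""
    if not a or not b:
        return 0.0
    return identical_positions(a, b) / min(len(a), len(b))


def min_of_max_identity(chains_a: Dict[str, str], chains_b: Dict[str, str]) -> float:
    """For each chain in A take its best match in B, then take the worst of those."""
    best_per_chain = [
        max(identity(seq_a, seq_b) for seq_b in chains_b.values())
        for seq_a in chains_a.values()
    ]
    return min(best_per_chain)


def concatenated_identity(chains_a: Dict[str, str], chains_b: Dict[str, str]) -> float:
    """Join the chains of each protein (in dict order) and compare the two strings."""
    return identity("".join(chains_a.values()), "".join(chains_b.values()))


# Hypothetical two-chain proteins: chain A matches chain X exactly,
# chain B shares no residues with any chain of the second protein.
protein_a = {"A": "MKTAYIAK", "B": "GGSG"}
protein_b = {"X": "MKTAYIAK", "Y": "PPPP"}
print(min_of_max_identity(protein_a, protein_b))    # 0.0
print(concatenated_identity(protein_a, protein_b))  # 8 / 12
```

The example shows how far apart the two numbers can be: the min-of-max score is dominated by the single unmatched short chain, while the concatenated score depends on the order in which the chains are joined.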

However, I think I should probably give more weight to long sequences, because it is easier for short sequences to look similar.
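One way to express "more weight to long sequences" is a length-weighted mean of the per-chain identities. The sketch below assumes the per-chain identities have already been computed by some other means; the function name and the numbers are hypothetical, and this is not a canonical score.

```python
from typing import List, Tuple


def length_weighted_identity(chains: List[Tuple[int, float]]) -> float:
    """Mean of per-chain identities, each weighted by its chain length.

    `chains` holds (chain_length, identity) pairs, with identity in [0, 1].
    """
    total_length = sum(length for length, _ in chains)
    if total_length == 0:
        return 0.0
    return sum(length * ident for length, ident in chains) / total_length


# Hypothetical protein: a 300-residue chain at 40% identity and
# a 20-residue peptide at 100% identity.
example = [(300, 0.4), (20, 1.0)]
print(length_weighted_identity(example))
```

For this example the plain average of the two identities is 0.7, while the weighted value is about 0.44, so the short peptide no longer dominates the result.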

Is there any canonical way to get the identity/similarity score between two protein sequences?

=======================================================================================

I can also add protein structure information. But for multi-domain proteins, I could not find a score that scales from 0 to 1 or that is as easy to understand as sequence identity.

Thanks

sequence protein • 4.6k views

updated 2.4 years ago by Ram 43k • written 9.7 years ago by ajingnk ▴ 130
