Question

The Best Similarity Measure For Protein Alignment

12.3 years ago

Rein • 0

I am using the pairwise2 module in Biopython to do alignments, and I am confused about how to quantify the similarity of two aligned sequences.

I have searched the literature and found that there are many types of similarity measure for protein alignment, for example E-value, p-value, bit score, percentage identity, etc.

Which one is the most commonly used?

pairwise alignment scoring • 4.7k views

ADD COMMENT • link updated 12.1 years ago by Damian Kao 16k • written 12.3 years ago by Rein • 0

score 0 · Answer 1 · 2011-12-29

12.3 years ago

Damian Kao 16k

You should really use the similarity measure that fits your situation and what you want to investigate, rather than whichever one is most commonly used.

All the similarity measures you've listed essentially come from the same raw score, which is calculated from the best alignment according to a scoring matrix. The different measures are just that score normalized or scaled differently. See the NCBI explanation of the similarity scores you've listed.

The pairwise2 module uses a standard dynamic programming algorithm driven by a scoring matrix. I would read up on how dynamic programming is used for sequence alignment. Start with the Wikipedia entry on sequence alignment and go from there.
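To make the dynamic programming idea concrete, here is a minimal sketch of a global (Needleman-Wunsch style) alignment score. It is not pairwise2's actual implementation. The function name `global_alignment_score` and the match/mismatch/gap values are assumptions chosen for illustration; a real protein alignment would look up substitution scores in a matrix such as BLOSUM62 instead of using a flat match/mismatch score.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not pairwise2 defaults.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # dp[i][j] holds the best score for aligning seq_a[:i] with seq_b[:j].
    dp = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty string costs one gap per residue.
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + sub,  # align the two residues
                dp[i - 1][j] + gap,      # gap in seq_b
                dp[i][j - 1] + gap,      # gap in seq_a
            )
    return dp[-1][-1]


print(global_alignment_score("ACGT", "AGT"))
```

The value in the bottom-right cell is the raw score that measures such as the bit score are computed from.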

Personally, I would use the bit score, as it is normalized for the scoring system (so it is independent of the substitution matrix) and does not depend on the size of the search space.
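For reference, the usual Karlin-Altschul conversion from a raw score S to a bit score is S' = (lambda * S - ln K) / ln 2, and the E-value for a search space of size m * n is m * n * 2^(-S'). Below is a rough sketch of that arithmetic. The function names are hypothetical, and the `LAMBDA` and `K` values are an assumption: they are the figures commonly quoted for gapped BLAST with BLOSUM62 and gap costs 11/1. pairwise2 does not supply them, and they only apply if your scoring system matches.

```python
import math

# Assumed statistical parameters (commonly quoted for BLOSUM62, gap open 11,
# gap extend 1). They must match the scoring system that produced the score.
LAMBDA = 0.267
K = 0.041


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits scoring at least `bits`."""
    return query_len * db_len * 2.0 ** (-bits)


bits = bit_score(100)
print(round(bits, 1), e_value(bits, query_len=250, db_len=1_000_000))
```

Because lambda and K are folded into the bit score, two bit scores can be compared even if they were produced with different substitution matrices.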

ADD COMMENT • link 12.3 years ago by Damian Kao 16k

Thank you very much for the information. One point is still unclear to me: computing the bit score requires the parameters K and lambda. Most documents only mention them as statistical parameters and do not explain how to compute them. Does the pairwise2 module provide enough information to compute K and lambda, or should I use another algorithm to find the bit score?

ADD REPLY • link 12.3 years ago by Rein • 0

I also found some documents mentioning that K and lambda can be derived from my scoring system. Does that mean my score matrix?

ADD REPLY • link 12.3 years ago by Rein • 0
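On the "derived from my scoring system" point: for ungapped alignments, lambda is the unique positive solution of sum over residue pairs of p_a * p_b * exp(lambda * s_ab) = 1, where s_ab is the score matrix and p_a are background residue frequencies. So it depends on both the matrix and the assumed frequencies. The sketch below solves that equation by bisection. The function name `solve_lambda` and the toy four-letter alphabet with +1/-1 scores and uniform frequencies are assumptions for illustration only. K takes a more involved calculation, and for gapped alignments both parameters are normally estimated by simulation rather than computed analytically.

```python
import math


def solve_lambda(scores, freqs, tol=1e-10):
    """Solve sum_ab p_a * p_b * exp(lam * s_ab) = 1 for lam > 0 by bisection.

    scores: dict of dicts, scores[a][b] is the substitution score.
    freqs:  dict mapping each residue to its background frequency.
    """
    pairs = [(freqs[a] * freqs[b], scores[a][b]) for a in freqs for b in freqs]

    expected = sum(p * s for p, s in pairs)
    if expected >= 0 or max(s for _, s in pairs) <= 0:
        raise ValueError("need a negative expected score and a positive score")

    def f(lam):
        return sum(p * math.exp(lam * s) for p, s in pairs) - 1.0

    # f(0) = 0 is the trivial root; f dips below zero and then rises again.
    lo, hi = 1e-6, 1.0
    while f(hi) < 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Toy scoring system (assumed): match +1, mismatch -1, uniform frequencies.
alphabet = "ACGT"
toy_scores = {a: {b: (1 if a == b else -1) for b in alphabet} for a in alphabet}
toy_freqs = {a: 0.25 for a in alphabet}
lam = solve_lambda(toy_scores, toy_freqs)
print(lam)
```

For this toy system the equation reduces to 0.25 * e^lam + 0.75 * e^(-lam) = 1, whose positive solution is ln 3.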
