How To Evaluate The Scoring Scheme Used In Pairwise Alignment
8.1 years ago
Maria ▴ 170

Hello, I want to align two nucleotide sequences using a semi-global alignment method in which gaps at the ends of the sequences are free. I'm using a simple scoring scheme, i.e. constants for match, mismatch and gap. The problem is that I don't know which scores to choose, given that changing them also changes the length of the aligned region. Below is an example of the sequences:

1. longer sequence: cggacgtgccattgcatgccccgggacgc
2. shorter sequence: acgtggattacgagagaga

The alignment should look like this:

cgggacgtgccattgcatgccccgggacgc
----acgtggatt-----------------


*Question: how do I evaluate the scoring scheme I'm using in order to choose the best one?*

Thanks in advance for any suggestion

alignment programming scoring genetics • 3.1k views

What do you mean by what score should you choose? The score for matching, mismatching and gaps?


Yes. For example, match/mismatch/gap of +1, -1, -2 is one option, but how do I evaluate this choice? And why not +1, -1, -3, for example?
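
To make the comparison between two such schemes concrete, here is a minimal sketch of a semi-global aligner with free end gaps, run on the two example sequences with both parameter sets. The function name `semiglobal_align` and its parameter names are made up for illustration; this is a plain dynamic-programming sketch with a linear gap cost, not the implementation of any particular tool.

```python
def semiglobal_align(a, b, match=1, mismatch=-1, gap=-2):
    """Semi-global alignment: leading and trailing gaps cost nothing.

    Hypothetical helper for illustration. Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    # Row 0 and column 0 stay at 0, which makes leading gaps free.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # (mis)match
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trailing gaps are free: the alignment may end anywhere in the
    # last row or the last column.
    ends = [(score[i][m], i, m) for i in range(n + 1)]
    ends += [(score[n][j], n, j) for j in range(m + 1)]
    best, end_i, end_j = max(ends)

    # Trace back from the best end cell until a sequence start is reached.
    i, j = end_i, end_j
    top, bottom = [], []
    while i > 0 and j > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1

    # Add the unpenalised overhangs before and after the scored core.
    top_str = a[:i] + "-" * j + "".join(reversed(top)) \
        + a[end_i:] + "-" * (m - end_j)
    bottom_str = "-" * i + b[:j] + "".join(reversed(bottom)) \
        + "-" * (n - end_i) + b[end_j:]
    return best, top_str, bottom_str


long_seq = "cggacgtgccattgcatgccccgggacgc"
short_seq = "acgtggattacgagagaga"
for scheme in [(1, -1, -2), (1, -1, -3)]:
    s, top, bottom = semiglobal_align(long_seq, short_seq, *scheme)
    print(scheme, "score =", s)
    print(top)
    print(bottom)
```

Printing the alignments side by side shows directly whether a change in the gap penalty alters the aligned region or only the score.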

8.1 years ago
Niek De Klein ★ 2.6k

It depends on your dataset. For simple alignments (like the one you showed), +1/-1 for match/mismatch is usually good enough. However, this does not distinguish between types of mutation, such as transitions and transversions. If you want to account for that, you probably want to use the Kimura model, which gives a scoring matrix like this:

  A C G T
A 6 1 2 1
C 1 6 1 2
G 2 1 6 1
T 1 2 1 6
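
As a rough illustration of how such a matrix replaces the single match/mismatch constants, the sketch below scores an already aligned pair of strings with the values from the table above. The helper name `score_aligned` and the gap value of -2 are assumptions made for the example, not part of the Kimura model.

```python
# Values copied from the table above: 6 for a match, 2 for a transition
# (A<->G, C<->T), 1 for a transversion.
BASES = "ACGT"
MATRIX = [
    [6, 1, 2, 1],  # A
    [1, 6, 1, 2],  # C
    [2, 1, 6, 1],  # G
    [1, 2, 1, 6],  # T
]
SUBST = {(x, y): MATRIX[i][j]
         for i, x in enumerate(BASES)
         for j, y in enumerate(BASES)}


def score_aligned(top, bottom, gap=-2):
    """Score two equal-length aligned strings; '-' marks a gap.

    Hypothetical helper; the gap value is an assumed constant.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have the same length")
    total = 0
    for x, y in zip(top.upper(), bottom.upper()):
        if x == "-" or y == "-":
            total += gap            # any column containing a gap
        else:
            total += SUBST[(x, y)]  # matrix lookup for the base pair
    return total


# The aligned core of the example in the question (no gap columns):
print(score_aligned("acgtgccat", "acgtggatt"))
```

The same lookup can be dropped into the inner loop of an aligner in place of the match/mismatch test.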

The gap penalty depends on how divergent the sequences you are aligning are. For closely related organisms, use a low gap penalty; for more divergent organisms, use a higher one.


Is there any method for evaluating the resulting alignment produced with that score?