Gap penalty in Smith-Waterman
9.5 years ago

Hi all,

I am looking for advice on how to calculate the gap-open and gap-extension penalties used with affine gap scoring in dynamic programming approaches to sequence alignment. I understand that substitution matrices are simply log-odds (LOD) scores, but I always see somewhat hand-wavy justifications for gap penalties.

As an aside, why does there seem to be so much less literature on substitution matrices for DNA than for proteins?

Cheers

sequence-alignment • 3.7k views
9.5 years ago

Interesting question on a subject I never actually thought about.

I would say the lack of results on substitution matrices for DNA comes down to there being so few options: each base has just three alternatives, and the score for each will depend on the context in which it is used. In addition, noncoding DNA is much less conserved and has much less well-defined functionality than protein-coding regions, so it is hard to come up with a general rule.
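To make the "so few options" point concrete, here is a minimal sketch of how a log-odds DNA matrix could be derived. It assumes uniform base frequencies (0.25 each) and that mismatches are spread evenly over the three other bases; the function name `dna_log_odds` and the `identity` parameter are hypothetical choices for illustration, not values taken from any particular tool.

```python
import math

BASES = "ACGT"

def dna_log_odds(identity=0.99):
    """Log-odds scores (in bits) for DNA under a very simple model.

    identity is the ASSUMED fraction of aligned positions that match.
    Background frequencies are assumed uniform (0.25 per base), and the
    mismatch probability is split evenly over the 12 ordered mismatch pairs.
    """
    if not 0.0 < identity < 1.0:
        raise ValueError("identity must be strictly between 0 and 1")
    p = 0.25                              # background frequency of each base
    q_match = identity / 4.0              # target frequency of each identical pair
    q_mismatch = (1.0 - identity) / 12.0  # target frequency of each mismatched pair
    scores = {}
    for a in BASES:
        for b in BASES:
            q = q_match if a == b else q_mismatch
            # log-odds: how much more likely the pair is under the
            # "related" model than by chance
            scores[(a, b)] = math.log2(q / (p * p))
    return scores

if __name__ == "__main__":
    matrix = dna_log_odds(0.99)
    print(matrix[("A", "A")], matrix[("A", "C")])
```

Under these assumptions the whole matrix collapses to two numbers (one match score, one mismatch score), all set by the single `identity` choice. That may be part of why DNA scoring tends to look like a qualitative choice of match/mismatch values rather than a published matrix.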

As for gaps: the information in a mismatch is easy to capture and formalize, whereas the significance of a gap depends on what is being replaced, how long the gap is, and so on.
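For reference, this is roughly where the two gap parameters enter the dynamic programming. It is a score-only sketch of Smith-Waterman with affine gaps using Gotoh-style recurrences. The function name `sw_affine_score` and all default scores are illustrative assumptions, not recommended values.

```python
def sw_affine_score(a, b, match=2, mismatch=-3, gap_open=-5, gap_extend=-2):
    """Best local alignment score with affine gap penalties.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    All default scores are illustrative assumptions only.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of a local alignment ending at (i, j)
    # E: best score ending at (i, j) with a gap in a (b[j-1] unaligned)
    # F: best score ending at (i, j) with a gap in b (a[i-1] unaligned)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # either open a new gap or extend an existing one
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # the 0 lets a local alignment restart anywhere
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best

if __name__ == "__main__":
    print(sw_affine_score("AAAACCCC", "AAAAGGCCCC"))
    # Setting gap_extend equal to gap_open gives a linear gap penalty.
    print(sw_affine_score("AAAACCCC", "AAAAGGCCCC", gap_extend=-5))
```

With the affine defaults, bridging the two-base insertion is worthwhile, whereas with the linear setting the best local alignment stops at the gap. So the open/extend ratio is effectively a statement about how plausible one long indel is compared with several short ones.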

So what you are saying is that the gap penalty needs to be calculated for a given base match, taking the context of the replaced bases into account? Do you have a sense of how people generally come up with substitution matrices for Smith-Waterman local alignment of DNA? It seems to be mostly a qualitative choice. Is this a fair characterisation?

Scoring is a measure of similarity - it is used to compare sequences and serves as a metric. For that to work properly, it has to actually be able to quantify the differences. And when it comes to DNA alone, there is just not enough information - it is a bit like trying to infer someone's height from their shoe size. It works for the extreme cases - a baby vs Shaq - but it does not contain enough information to properly characterize a person of average height.
