Cons of Smith-Waterman Alignment
12 months ago
Student ▴ 30

Hello.

I was reading this article about Whole-Genome Alignment and Comparative Annotation. I find it difficult to understand this extract from the article, which describes the drawbacks of using the Smith-Waterman algorithm for the alignment problem:

Another consideration is how genome rearrangements complicate the alignment problem. Smith–Waterman and Needleman–Wunsch both produce alignments that have fixed order and orientation; that is, insertions, deletions, and substitutions are the only allowed edit operations. When looking within short or well-conserved sequences, like genes, this requirement is usually fulfilled. But at large evolutionary distances and looking within a sufficiently large window, genomes almost always contain more complex rearrangements with respect to each other—inversions, transpositions, and duplications all cause breaks in order and orientation that cannot be captured under constant order and orientation.

My question is: what do they mean by "orientation"?

Smith-Waterman Genomics Alignment DNA Sequences • 831 views
12 months ago
Guillermo ▴ 10

Hello there!

Normally, when people talk about gene orientation, they are referring to whether the gene is encoded on the positive (forward) or negative (reverse) strand of the DNA.
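As a small illustration (a sketch with a made-up sequence, not something taken from the article): the two strands are reverse complements of each other, so a gene on the negative strand reads as the reverse complement of what you see on the positive strand. The helper name `reverse_complement` and the toy gene below are assumptions for the example only.

```python
# Hypothetical helper for illustration; maps each base to its partner base.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Made-up toy gene, written 5'->3' on the positive strand.
gene_on_plus = "ATGGCCTAA"

# The same stretch of DNA as it reads on the negative strand.
gene_on_minus = reverse_complement(gene_on_plus)

print(gene_on_plus)   # ATGGCCTAA
print(gene_on_minus)  # TTAGGCCAT
```

Applying `reverse_complement` twice gives back the original sequence, which is why "orientation" only has two possible values.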

I found a complementary video that may help: https://www.youtube.com/watch?v=JC6ew2xnJBA

12 months ago

Well, inversions and translocations are expected in a genome. A piece that is oriented as ---===>----> in the genome of one organism may be oriented as ---<====----> in another organism of the same species. Dynamic programming algorithms such as Smith-Waterman and Needleman-Wunsch cannot detect this, because they only compare the two sequences in one fixed order and orientation.
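To make the arrows concrete, here is a rough Python sketch of an inversion. The sequences are made-up toy data and the helper names (`reverse_complement`, `invert_segment`) are hypothetical. The inverted piece is still present in the second genome, but only in the opposite orientation, so a search for the forward copy no longer finds it.

```python
# Hypothetical helpers for illustration only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(genome: str, start: int, end: int) -> str:
    """Return genome with genome[start:end] replaced by its reverse complement."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


# Toy genome: left flank + segment + right flank (all made up).
genome_a = "TTTTTT" + "ATGCCGTAA" + "GGGGGG"
segment = genome_a[6:15]

# Same genome, but the middle segment has been inverted in place.
genome_b = invert_segment(genome_a, 6, 15)

print(genome_b)                                 # TTTTTTTTACGGCATGGGGGG
print(segment in genome_b)                      # False: forward copy is gone
print(reverse_complement(segment) in genome_b)  # True: present, but flipped
```

The flanks are unchanged and in the same order; only the middle piece has switched orientation, which is the ---===>----> versus ---<====----> situation.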


I do not understand why dynamic programming algorithms cannot detect this. For example, if I align a gene whose sequence is written 5'->3' with another one that is written 3'->5', is the Smith-Waterman algorithm not able to align them?


You can check it here: https://www.ebi.ac.uk/Tools/psa/emboss_water/ (choose DNA as the sequence type). Enter this sequence:

ACGACACGTAGCAGCATGCAGCATCATACAGCATCACAGTCAGTTTCAGCAGCAAACTACAGT

and its reverse (the same bases read backwards, not the reverse complement):

TGACATCAAACGACGACTTTGACTGACACTACGACATACTACGACGTACGACGATGCACAGCA

The result:

EMBOSS_001         1 ACGAC--------ACGTAGCAGCATGCAGCATCATACAGCA     33
                     |||||        |||||..|..||||        ||||||
EMBOSS_001        31 ACGACATACTACGACGTACGACGATGC--------ACAGCA     63

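Not really what we would expect, right? The tool only reports a short, gapped local match and does not notice that the second sequence is simply the first one read backwards.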
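If you would rather try it locally, here is a minimal sketch of the Smith-Waterman score in Python. It is not what EMBOSS Water runs: the function name `smith_waterman_score` and the scoring values (match 2, mismatch -1, linear gap -2) are my own assumptions, so the numbers will differ from the web tool. The point is only that each cell depends on its left, upper and upper-left neighbours, so both sequences are always consumed in the same direction.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of a and b with a linear gap penalty.

    Scoring values are arbitrary assumptions chosen for illustration.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment ending at (i-1, j-1) by one aligned pair.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Only moves to the right/down are possible; there is no move
            # that lets one sequence be read backwards.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


seq = "ACGACACGTAGCAGCATGCAGCATCATACAGCATCACAGTCAGTTTCAGCAGCAAACTACAGT"
rev = seq[::-1]  # same bases read backwards, as in the example above

print(smith_waterman_score(seq, seq))        # 126: all 63 bases match
print(smith_waterman_score(seq, rev))        # lower: only partial local matches
print(smith_waterman_score(seq, rev[::-1]))  # 126 again after flipping rev back
```

The full score only comes back once the second sequence is flipped by hand before aligning. The algorithm will not do that step for you.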


OK, thank you. I looked at the algorithm again, and it does make sense to me now that it cannot detect this.


Yep, I wanted to answer "look at the algorithm itself", but then I realized that you might not come from a computer science background, so I gave an example =)
