Question: How to Align Three DNA Sequences in Perl?
8.2 years ago by
Ika83830
Ika83830 wrote:

Hi! I have to align three DNA sequences in Perl. I found a subroutine in "Genomic Perl: From Bioinformatics Basics to Working Code" that is based on the Needleman–Wunsch algorithm, but I am having trouble building a working script around it. Can anybody help me write a program that works with this subroutine?

```perl
my $s1 = 'AAAATATATTTCGCTTTTTTATA';
my $s2 = 'AGAATATATTTCGGTTAATTATA';
my $s3 = 'AGAATATAATTCGGTCCATTATA';

sub similarity {
    my($s1,$s2,$s3) = @_;

    ### fill in edges of cube
    foreach my $i1 (0..length($s1)) { $M[$i1][0][0]=$g*$i1*2; }
    foreach my $i2 (0..length($s2)) { $M[0][$i2][0]=$g*$i2*2; }
    foreach my $i3 (0..length($s3)) { $M[0][0][$i3]=$g*$i3*2; }

    ### fill in sides of cube
    ## Side 1
    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            $M[$i1][$i2][0]=max($M[$i1-1][$i2][0]+$g+$g,
                                $M[$i1][$i2-1][0]+$g+$g,
                                $M[$i1-1][$i2-1][0]+$g+p($aa1,$aa2));
        }
    }

    ## Side 2
    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            $M[$i1][0][$i3] = max($M[$i1-1][0][$i3]+$g+$g,
                                  $M[$i1][$i2-1][0]+$g+$g,
                                  $M[$i1-1][$i2-1][0]+$g+p($aa1,$aa2));
        }
    }

    ## Side 3
    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            $M[0][$i2][$i3] = max($M[0][$i2-1][$i3]+$g+$g,
                                  $M[$i1][$i2-1][0]+$g+$g,
                                  $M[$i1-1][$i2-1][0]+$g+p($aa1,$aa2));
        }
    }

    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            my $p12 = p($aa1,$aa2);
            foreach my $i3 (1..length($s3)) {
                my $aa3 = substr($s3,$i3-1,1);
                my $p13 = p($aa1,$aa3);
                my $p23 = p($aa2,$aa3);
                $M[$i1][$i2][$i3]
                    = max($M[$i1-1][$i2-1][$i3-1]+$p12+$p13+$p23,
                          $M[$i1-1][$i2-1][$i3]+$p12+$g+$g,
                          $M[$i1-1][$i2][$i3-1]+$g+$p13+$g,
                          $M[$i1][$i2-1][$i3-1]+$g+$g+$p23,
                          $M[$i1][$i2][$i3-1]+0+$g+$g,
                          $M[$i1][$i2-1][$i3]+$g+0+$g,
                          $M[$i1-1][$i2][$i3]+$g+$g+0);
            }
        }
    }
    return ($M[length($s1)][length($s2)][length($s3)]);
}
```

$g is not defined in the code sample; what is it?


This is a classic homework problem. You will learn a lot more if you solve it yourself.

8.2 years ago by Neilfws (Sydney, Australia)
Neilfws wrote:

The basic answer is that the subroutine expects three arguments (the three sequences) and returns a single value, the score stored in the final cell of the matrix, which you can either print or assign to a new variable. Something like:

```perl
my $align = similarity($s1, $s2, $s3);
```

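However, this will not work with the code in the question; there are several problems with it. One of these is the variable $g, which is not defined. The array @M and the functions max and p are not declared or defined in the sample either, and the "Side 2" and "Side 3" loops use $i3 outside any loop that declares it. Another problem is the inconsistent use of my. If you're going to use my, include "use strict;" at the top and use it consistently.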
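To make the missing pieces concrete, here is a minimal sketch of a complete script that runs under "use strict;". It is not the book's code: the values of $g, $match and $mismatch are assumptions (the question gives none), and pair_score and column_score are hypothetical helper names introduced only for this example. Instead of reproducing the separate edge and "Side" loops, it applies the sum-of-pairs rule from the question's innermost recurrence to every cell of the cube, so the edges and faces fall out of the same loop.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Assumed scoring scheme; the question does not say what $g or p() should be.
my $g        = -2;    # penalty for a residue aligned against a gap
my $match    =  1;    # score for two identical residues
my $mismatch = -1;    # score for two different residues

# Score one pair of symbols from an alignment column ('-' marks a gap).
sub pair_score {
    my ($x, $y) = @_;
    return 0  if $x eq '-' && $y eq '-';    # gap against gap costs nothing
    return $g if $x eq '-' || $y eq '-';    # residue against gap
    return $x eq $y ? $match : $mismatch;
}

# Sum-of-pairs score of one three-symbol column.
sub column_score {
    my ($x, $y, $z) = @_;
    return pair_score($x, $y) + pair_score($x, $z) + pair_score($y, $z);
}

# Best global alignment score of three sequences.
sub similarity {
    my @seq = map { [ split //, $_ ] } @_[0 .. 2];
    my @n   = map { scalar @$_ } @seq;
    my @M;
    $M[0][0][0] = 0;

    for my $i (0 .. $n[0]) {
        for my $j (0 .. $n[1]) {
            for my $k (0 .. $n[2]) {
                next if $i == 0 && $j == 0 && $k == 0;
                my @pos = ($i, $j, $k);
                my @candidates;
                # Bit b of $move says whether sequence b contributes a
                # residue (1) or a gap (0) to the last column.
                for my $move (1 .. 7) {
                    my @d = map { ($move >> $_) & 1 } 0 .. 2;
                    next if $d[0] > $i || $d[1] > $j || $d[2] > $k;
                    my @col = map { $d[$_] ? $seq[$_][ $pos[$_] - 1 ] : '-' } 0 .. 2;
                    push @candidates,
                        $M[ $i - $d[0] ][ $j - $d[1] ][ $k - $d[2] ]
                        + column_score(@col);
                }
                $M[$i][$j][$k] = max(@candidates);
            }
        }
    }
    return $M[ $n[0] ][ $n[1] ][ $n[2] ];
}

my $s1 = 'AAAATATATTTCGCTTTTTTATA';
my $s2 = 'AGAATATATTTCGGTTAATTATA';
my $s3 = 'AGAATATAATTCGGTCCATTATA';

print "Similarity score: ", similarity($s1, $s2, $s3), "\n";
```

Like the subroutine in the question, this returns only the score, not the alignment itself; recovering the aligned strings would need a traceback through @M. Time and memory grow with the product of the three sequence lengths, which is fine for sequences this short.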

I don't think the code sample is particularly useful if you're trying to implement Needleman-Wunsch in Perl. Try a Google search for something more useful. The first hit is a CPAN module for Needleman-Wunsch, which probably contains better code.
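For anyone who wants to see the pairwise case before looking at a module, the following is a minimal sketch of Needleman-Wunsch for two sequences, including the traceback that recovers the aligned strings. It is not taken from the CPAN module; the function name needleman_wunsch and the scoring values are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Assumed scoring values, chosen only for illustration.
my ($MATCH, $MISMATCH, $GAP) = (1, -1, -2);

# Score for aligning residue $i of $s with residue $j of $t (1-based).
sub residue_score {
    my ($s, $i, $t, $j) = @_;
    return substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? $MATCH : $MISMATCH;
}

# Globally align two strings; returns (score, aligned_s, aligned_t).
sub needleman_wunsch {
    my ($s, $t) = @_;
    my ($n, $m) = (length $s, length $t);

    # F[i][j] = best score for aligning the first i residues of $s
    # with the first j residues of $t.
    my @F;
    $F[$_][0] = $_ * $GAP for 0 .. $n;
    $F[0][$_] = $_ * $GAP for 0 .. $m;

    for my $i (1 .. $n) {
        for my $j (1 .. $m) {
            my $best = $F[ $i - 1 ][ $j - 1 ] + residue_score($s, $i, $t, $j);
            my $up   = $F[ $i - 1 ][$j] + $GAP;    # gap in $t
            my $left = $F[$i][ $j - 1 ] + $GAP;    # gap in $s
            $best = $up   if $up > $best;
            $best = $left if $left > $best;
            $F[$i][$j] = $best;
        }
    }

    # Trace back from the bottom-right cell, preferring diagonal moves.
    my ($i, $j) = ($n, $m);
    my ($out_s, $out_t) = ('', '');
    while ($i > 0 || $j > 0) {
        if ($i > 0 && $j > 0
            && $F[$i][$j] == $F[ $i - 1 ][ $j - 1 ] + residue_score($s, $i, $t, $j)) {
            $out_s = substr($s, $i - 1, 1) . $out_s;
            $out_t = substr($t, $j - 1, 1) . $out_t;
            $i--; $j--;
        }
        elsif ($i > 0 && $F[$i][$j] == $F[ $i - 1 ][$j] + $GAP) {
            $out_s = substr($s, $i - 1, 1) . $out_s;
            $out_t = '-' . $out_t;
            $i--;
        }
        else {
            $out_s = '-' . $out_s;
            $out_t = substr($t, $j - 1, 1) . $out_t;
            $j--;
        }
    }
    return ($F[$n][$m], $out_s, $out_t);
}

my ($pair_score, $top, $bottom) = needleman_wunsch('ACGT', 'AGT');
print "$pair_score\n$top\n$bottom\n";
```

With these assumed scores, aligning 'ACGT' against 'AGT' gives a score of 1 with the strings 'ACGT' and 'A-GT'. The three-sequence version in the question follows the same idea with a cube in place of the matrix and seven possible moves per cell instead of three.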