Question

How to Align Three DNA Sequences in Perl?

13.0 years ago

Ika838 ▴ 30

Hi! I have to align three DNA sequences in Perl. I found a subroutine in "Genomic Perl: From Bioinformatics Basics to Working Code" that is based on the Needleman–Wunsch algorithm, but I am having trouble building a working script around it. Can anybody help me write a program that works with this subroutine?

my $s1 = 'AAAATATATTTCGCTTTTTTATA';
my $s2 = 'AGAATATATTTCGGTTAATTATA';
my $s3 = 'AGAATATAATTCGGTCCATTATA';

sub similarity {
    my($s1,$s2,$s3) = @_;

    ### fill in edges of cube
    foreach my $i1 (0..length($s1)) { $M[$i1][0][0]=$g*$i1*2; }
    foreach my $i2 (0..length($s2)) { $M[0][$i2][0]=$g*$i2*2; }
    foreach my $i3 (0..length($s3)) { $M[0][0][$i3]=$g*$i3*2; }

    ### fill in sides of cube
    ## Side 1
    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            $M[$i1][$i2][0]=max($M[$i1-1][$i2][0]+$g+$g,
                                $M[$i1][$i2-1][0]+$g+$g,
                                $M[$i1-1][$i2-1][0]+$g+p($aa1,$aa2));
        }
    }

    ## Side 2
    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            $M[$i1][0][$i3] = max($M[$i1-1][0][$i3]+$g+$g,
                                  $M[$i1][$i2-1][0]+$g+$g,
                                  $M[$i1-1][$i2-1][0]+$g+p($aa1,$aa2));
        }
    }

    ## Side 3
    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            $M[0][$i2][$i3] = max($M[0][$i2-1][$i3]+$g+$g,
                                  $M[$i1][$i2-1][0]+$g+$g,
                                  $M[$i1-1][$i2-1][0]+$g+p($aa1,$aa2));
        }
    }

    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            my $p12 = p($aa1,$aa2);
            foreach my $i3 (1..length($s3)) {
                my $aa3 = substr($s3,$i3-1,1);
                my $p13 = p($aa1,$aa3);
                my $p23 = p($aa2,$aa3);
                $M[$i1][$i2][$i3]
                    = max($M[$i1-1][$i2-1][$i3-1]+$p12+$p13+$p23,
                          $M[$i1-1][$i2-1][$i3]+$p12+$g+$g,
                          $M[$i1-1][$i2][$i3-1]+$g+$p13+$g,
                          $M[$i1][$i2-1][$i3-1]+$g+$g+$p23,
                          $M[$i1][$i2][$i3-1]+0+$g+$g,
                          $M[$i1][$i2-1][$i3]+$g+0+$g,
                          $M[$i1-1][$i2][$i3]+$g+$g+0);
            }
        }
    }
    return ($M[length($s1)][length($s2)][length($s3)]);
}

perl multiple homework • 5.1k views

updated 5.8 years ago by Biostar 20 • written 13.0 years ago by Ika838 ▴ 30

$g is not defined in the code sample; what is it?

13.0 years ago by Neilfws 49k

This is a classic homework problem. You will learn a lot by solving it yourself.

13.0 years ago by Aleksandr Levchuk 3.2k

score 2 · Answer 1 · 2011-05-07

The basic answer is that the subroutine expects three arguments (the three sequences) and returns a single value, the final cell of the scoring matrix, which you can either print or assign to a new variable. Something like:

my $align = similarity($s1, $s2, $s3);

However, this will not work with the code in the question, which has several problems. One is that the variable $g is never defined. Another is the inconsistent use of my: if you are going to declare variables with my, put "use strict;" at the top of the script and declare every variable.
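To make those problems concrete, here is one possible cleaned-up version of the script. It is a sketch under stated assumptions, not the book's code: the value of $g and the body of p() are placeholders I made up (the question never gives them), max() is taken from the core List::Util module, and @M is declared inside the subroutine. It also fills Side 2 and Side 3 with their own loop indices (in the question they reuse the Side 1 loops, so $i3 is never set), and it charges two gap pairs on the diagonal move of each side so that the sides agree with the interior recurrence, where the question's code charges only one.

```perl
use strict;
use warnings;
use List::Util qw(max);    # core module; supplies the max() the subroutine calls

# ASSUMPTION: the question never gives this value, so it is a placeholder.
my $g = -2;                # score for one residue aligned against a gap

# ASSUMPTION: simple match/mismatch scoring for a pair of residues.
sub p {
    my ($x, $y) = @_;
    return $x eq $y ? 1 : -1;
}

sub similarity {
    my ($s1, $s2, $s3) = @_;
    my @a1 = split //, $s1;
    my @a2 = split //, $s2;
    my @a3 = split //, $s3;
    my ($n1, $n2, $n3) = (scalar @a1, scalar @a2, scalar @a3);
    my @M;                 # declared locally so the code passes "use strict"

    # Edges: one sequence advances, giving two residue/gap pairs per column.
    $M[$_][0][0] = 2 * $g * $_ for 0 .. $n1;
    $M[0][$_][0] = 2 * $g * $_ for 0 .. $n2;
    $M[0][0][$_] = 2 * $g * $_ for 0 .. $n3;

    # Side 1 (i3 = 0): sequences 1 and 2 advance, sequence 3 is all gaps.
    for my $i1 (1 .. $n1) {
        for my $i2 (1 .. $n2) {
            $M[$i1][$i2][0] = max(
                $M[$i1-1][$i2][0]   + 2 * $g,
                $M[$i1][$i2-1][0]   + 2 * $g,
                $M[$i1-1][$i2-1][0] + 2 * $g + p($a1[$i1-1], $a2[$i2-1]),
            );
        }
    }
    # Side 2 (i2 = 0): sequences 1 and 3 advance.
    for my $i1 (1 .. $n1) {
        for my $i3 (1 .. $n3) {
            $M[$i1][0][$i3] = max(
                $M[$i1-1][0][$i3]   + 2 * $g,
                $M[$i1][0][$i3-1]   + 2 * $g,
                $M[$i1-1][0][$i3-1] + 2 * $g + p($a1[$i1-1], $a3[$i3-1]),
            );
        }
    }
    # Side 3 (i1 = 0): sequences 2 and 3 advance.
    for my $i2 (1 .. $n2) {
        for my $i3 (1 .. $n3) {
            $M[0][$i2][$i3] = max(
                $M[0][$i2-1][$i3]   + 2 * $g,
                $M[0][$i2][$i3-1]   + 2 * $g,
                $M[0][$i2-1][$i3-1] + 2 * $g + p($a2[$i2-1], $a3[$i3-1]),
            );
        }
    }

    # Interior: the same seven moves as in the question's final loop.
    for my $i1 (1 .. $n1) {
        for my $i2 (1 .. $n2) {
            my $p12 = p($a1[$i1-1], $a2[$i2-1]);
            for my $i3 (1 .. $n3) {
                my $p13 = p($a1[$i1-1], $a3[$i3-1]);
                my $p23 = p($a2[$i2-1], $a3[$i3-1]);
                $M[$i1][$i2][$i3] = max(
                    $M[$i1-1][$i2-1][$i3-1] + $p12 + $p13 + $p23,
                    $M[$i1-1][$i2-1][$i3]   + $p12 + 2 * $g,
                    $M[$i1-1][$i2][$i3-1]   + $p13 + 2 * $g,
                    $M[$i1][$i2-1][$i3-1]   + $p23 + 2 * $g,
                    $M[$i1][$i2][$i3-1]     + 2 * $g,
                    $M[$i1][$i2-1][$i3]     + 2 * $g,
                    $M[$i1-1][$i2][$i3]     + 2 * $g,
                );
            }
        }
    }
    return $M[$n1][$n2][$n3];
}

# The three sequences from the question.
my $score = similarity(
    'AAAATATATTTCGCTTTTTTATA',
    'AGAATATATTTCGGTTAATTATA',
    'AGAATATAATTCGGTCCATTATA',
);
print "Sum-of-pairs similarity score: $score\n";
```

Note that this only returns the score in the last cell of the cube. Recovering the alignment itself would need a traceback step, which the subroutine in the question does not include.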

I don't think the code sample is particularly useful if you're trying to implement Needleman-Wunsch in Perl. Try a Google search for something more useful; the first hit is a CPAN module for Needleman-Wunsch, which probably contains better code.
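If it helps to see the pairwise algorithm before tackling three sequences, a minimal sketch of Needleman-Wunsch for two sequences might look like the following. This is not the CPAN module's code or interface. The subroutine name needleman_wunsch and the scores (match +1, mismatch -1, gap -2) are my own assumptions for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max);

# ASSUMPTION: illustrative scores, not taken from the book or any module.
my ($MATCH, $MISMATCH, $GAP) = (1, -1, -2);

# Hypothetical helper: returns (score, aligned first row, aligned second row).
sub needleman_wunsch {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;
    my ($n, $m) = (scalar @s, scalar @t);
    my @F;    # $F[$i][$j] = best score for the first $i chars of $s and $j of $t

    # First row and column: a prefix aligned entirely against gaps.
    $F[$_][0] = $GAP * $_ for 0 .. $n;
    $F[0][$_] = $GAP * $_ for 0 .. $m;

    # Fill the matrix with the standard three-way recurrence.
    for my $i (1 .. $n) {
        for my $j (1 .. $m) {
            my $pair = $s[$i-1] eq $t[$j-1] ? $MATCH : $MISMATCH;
            $F[$i][$j] = max(
                $F[$i-1][$j-1] + $pair,    # align the two residues
                $F[$i-1][$j]   + $GAP,     # gap in the second sequence
                $F[$i][$j-1]   + $GAP,     # gap in the first sequence
            );
        }
    }

    # Trace back from the bottom-right corner to recover one optimal alignment.
    my ($i, $j) = ($n, $m);
    my ($row_s, $row_t) = ('', '');
    while ($i > 0 || $j > 0) {
        if ($i > 0 && $j > 0
            && $F[$i][$j] == $F[$i-1][$j-1]
                + ($s[$i-1] eq $t[$j-1] ? $MATCH : $MISMATCH)) {
            $row_s = $s[--$i] . $row_s;
            $row_t = $t[--$j] . $row_t;
        }
        elsif ($i > 0 && $F[$i][$j] == $F[$i-1][$j] + $GAP) {
            $row_s = $s[--$i] . $row_s;
            $row_t = '-' . $row_t;
        }
        else {
            $row_s = '-' . $row_s;
            $row_t = $t[--$j] . $row_t;
        }
    }
    return ($F[$n][$m], $row_s, $row_t);
}

# Two of the sequences from the question.
my ($nw_score, $top, $bottom) = needleman_wunsch(
    'AAAATATATTTCGCTTTTTTATA',
    'AGAATATATTTCGGTTAATTATA',
);
print "score: $nw_score\n$top\n$bottom\n";
```

The three-sequence subroutine in the question is the same idea with a cube instead of a matrix and seven possible moves per cell instead of three.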