How To Align Three DNA Sequences In Perl?
13.0 years ago
Ika838 ▴ 30

Hi! I have to align three DNA sequences in Perl. I found a subroutine in "Genomic Perl: From Bioinformatics Basics to Working Code" that is based on the Needleman–Wunsch algorithm, but I am having trouble building a working script around it. Can anybody help me write a program that works with this subroutine?

my $s1 = 'AAAATATATTTCGCTTTTTTATA';
my $s2 = 'AGAATATATTTCGGTTAATTATA';
my $s3 = 'AGAATATAATTCGGTCCATTATA';

sub similarity {
    my($s1,$s2,$s3) = @_;

    ### fill in edges of cube
    foreach my $i1 (0..length($s1)) { $M[$i1][0][0]=$g*$i1*2; }
    foreach my $i2 (0..length($s2)) { $M[0][$i2][0]=$g*$i2*2; }
    foreach my $i3 (0..length($s3)) { $M[0][0][$i3]=$g*$i3*2; }

    ### fill in sides of cube
    ## Side 1
    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            $M[$i1][$i2][0]=max($M[$i1-1][$i2][0]+$g+$g,
                                $M[$i1][$i2-1][0]+$g+$g,
                                $M[$i1-1][$i2-1][0]+$g+p($aa1,$aa2));
        }
    }
    ## Side 2: the face spanned by $s1 and $s3 (i2 = 0)
    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i3 (1..length($s3)) {
            my $aa3 = substr($s3,$i3-1,1);
            $M[$i1][0][$i3] = max($M[$i1-1][0][$i3]+$g+$g,
                                  $M[$i1][0][$i3-1]+$g+$g,
                                  $M[$i1-1][0][$i3-1]+$g+p($aa1,$aa3));
        }
    }

    ## Side 3: the face spanned by $s2 and $s3 (i1 = 0)
    foreach my $i2 (1..length($s2)) {
        my $aa2 = substr($s2,$i2-1,1);
        foreach my $i3 (1..length($s3)) {
            my $aa3 = substr($s3,$i3-1,1);
            $M[0][$i2][$i3] = max($M[0][$i2-1][$i3]+$g+$g,
                                  $M[0][$i2][$i3-1]+$g+$g,
                                  $M[0][$i2-1][$i3-1]+$g+p($aa2,$aa3));
        }
    }

    foreach my $i1 (1..length($s1)) {
        my $aa1 = substr($s1,$i1-1,1);
        foreach my $i2 (1..length($s2)) {
            my $aa2 = substr($s2,$i2-1,1);
            my $p12 = p($aa1,$aa2);
            foreach my $i3 (1..length($s3)) {
                my $aa3 = substr($s3,$i3-1,1);
                my $p13 = p($aa1,$aa3);
                my $p23 = p($aa2,$aa3);
                $M[$i1][$i2][$i3]
                    = max($M[$i1-1][$i2-1][$i3-1]+$p12+$p13+$p23,
                          $M[$i1-1][$i2-1][$i3]+$p12+$g+$g,
                          $M[$i1-1][$i2][$i3-1]+$g+$p13+$g,
                          $M[$i1][$i2-1][$i3-1]+$g+$g+$p23,
                          $M[$i1][$i2][$i3-1]+0+$g+$g,
                          $M[$i1][$i2-1][$i3]+$g+0+$g,
                          $M[$i1-1][$i2][$i3]+$g+$g+0);
            }
        }
    }
    return ($M[length($s1)][length($s2)][length($s3)]);
}

$g is not defined in the code sample; what is it?


This is a classic homework problem. You will learn a lot by solving it yourself.

13.0 years ago
Neilfws 49k

The basic answer is that the subroutine expects three arguments (the three sequences) and returns a value, which you can either print or assign to a variable. Something like:

my $align = similarity($s1, $s2, $s3);

However, this will not work with the code in the question; there are several problems with it. One of these is the variable $g, which is not defined. Another is the inconsistent use of my: the loop variables are declared with it, but the array @M is not. If you're going to use my, include "use strict;" at the top and use it consistently.
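
For what it's worth, here is a minimal sketch of a complete script that fills in the pieces the question leaves out. The scoring values (match +1, mismatch -1, gap $g = -2) and the helper column_score are my own assumptions, not taken from the book; max comes from the core List::Util module. Instead of separate edge, side and interior loops, it uses a single triple loop that tries all seven column types and skips any that would step outside the cube. Every column, including those on the faces, is scored with the sum-of-pairs rule from the interior recurrence, so face scores can differ from the posted Side 1 line.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Assumed scoring scheme (not given in the question).
my $g = -2;    # penalty for a residue aligned against a gap

# Pair score for two aligned residues: +1 match, -1 mismatch (assumed).
sub p {
    my ($x, $y) = @_;
    return $x eq $y ? 1 : -1;
}

# Sum-of-pairs score of one alignment column; '-' marks a gap.
sub column_score {
    my @col = @_;
    my $score = 0;
    for my $i (0 .. $#col - 1) {
        for my $j ($i + 1 .. $#col) {
            my ($x, $y) = @col[$i, $j];
            next if $x eq '-' && $y eq '-';    # gap against gap costs nothing
            $score += ($x eq '-' || $y eq '-') ? $g : p($x, $y);
        }
    }
    return $score;
}

sub similarity {
    my ($s1, $s2, $s3) = @_;
    my @M;    # lexical score cube, indexed [i1][i2][i3]
    $M[0][0][0] = 0;

    # The seven ways a column can consume a residue (1) or a gap (0)
    # from each of the three sequences.
    my @moves = ([1,1,1], [1,1,0], [1,0,1], [0,1,1],
                 [1,0,0], [0,1,0], [0,0,1]);

    for my $i1 (0 .. length($s1)) {
        for my $i2 (0 .. length($s2)) {
            for my $i3 (0 .. length($s3)) {
                next unless $i1 || $i2 || $i3;    # origin is already set
                my @candidates;
                for my $m (@moves) {
                    my ($d1, $d2, $d3) = @$m;
                    # Skip moves that would leave the cube.
                    next if $d1 > $i1 || $d2 > $i2 || $d3 > $i3;
                    my @col = (
                        $d1 ? substr($s1, $i1 - 1, 1) : '-',
                        $d2 ? substr($s2, $i2 - 1, 1) : '-',
                        $d3 ? substr($s3, $i3 - 1, 1) : '-',
                    );
                    push @candidates,
                        $M[$i1 - $d1][$i2 - $d2][$i3 - $d3] + column_score(@col);
                }
                $M[$i1][$i2][$i3] = max(@candidates);
            }
        }
    }
    return $M[length($s1)][length($s2)][length($s3)];
}

# The three sequences from the question.
my $s1 = 'AAAATATATTTCGCTTTTTTATA';
my $s2 = 'AGAATATATTTCGGTTAATTATA';
my $s3 = 'AGAATATAATTCGGTCCATTATA';

my $align = similarity($s1, $s2, $s3);
print "Similarity score: $align\n";
```

Note that this only returns the optimal score. Recovering the alignment itself would need a traceback through the cube, which is not shown here.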

I don't think the code sample is particularly useful if you're trying to implement Needleman-Wunsch in Perl. Try a Google search for something more useful. The first hit is a CPAN module for Needleman-Wunsch, which probably contains better code.
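
If it helps to see the two-sequence case first, this is a rough sketch of plain pairwise Needleman-Wunsch in Perl, with a traceback that recovers one optimal alignment. It does not use the CPAN module; the function name needleman_wunsch and the scoring constants (match +1, mismatch -1, gap -2) are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Assumed scoring (illustrative only).
my ($MATCH, $MISMATCH, $GAP) = (1, -1, -2);

# Score for aligning character $i of $s with character $j of $t (1-based).
sub pair_score {
    my ($s, $i, $t, $j) = @_;
    return substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? $MATCH : $MISMATCH;
}

# Global alignment of two strings; returns (score, aligned_s, aligned_t).
sub needleman_wunsch {
    my ($s, $t) = @_;
    my ($n, $m) = (length($s), length($t));

    # Fill the score matrix: first column and row, then the interior.
    my @F;
    $F[$_][0] = $GAP * $_ for 0 .. $n;
    $F[0][$_] = $GAP * $_ for 0 .. $m;
    for my $i (1 .. $n) {
        for my $j (1 .. $m) {
            $F[$i][$j] = max(
                $F[$i - 1][$j - 1] + pair_score($s, $i, $t, $j),
                $F[$i - 1][$j] + $GAP,
                $F[$i][$j - 1] + $GAP,
            );
        }
    }

    # Trace back from the bottom-right corner, preferring the diagonal.
    my ($i, $j, $out_s, $out_t) = ($n, $m, '', '');
    while ($i > 0 || $j > 0) {
        if ($i > 0 && $j > 0
            && $F[$i][$j] == $F[$i - 1][$j - 1] + pair_score($s, $i, $t, $j)) {
            $out_s = substr($s, $i - 1, 1) . $out_s;
            $out_t = substr($t, $j - 1, 1) . $out_t;
            $i--; $j--;
        }
        elsif ($i > 0 && $F[$i][$j] == $F[$i - 1][$j] + $GAP) {
            $out_s = substr($s, $i - 1, 1) . $out_s;
            $out_t = '-' . $out_t;
            $i--;
        }
        else {
            $out_s = '-' . $out_s;
            $out_t = substr($t, $j - 1, 1) . $out_t;
            $j--;
        }
    }
    return ($F[$n][$m], $out_s, $out_t);
}

my ($score, $row1, $row2) = needleman_wunsch('ACGT', 'AGT');
print "$score\n$row1\n$row2\n";
```

The three-sequence version in the question is the same idea with a cube in place of the matrix and seven predecessor cells in place of three.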
