Hi everyone,
I've been working on a personal project in PHP, namely an alignment algoritm using Needlemann-Wunsh. I've added some fixed PAM and BLOSSUM matrices however I wanted to add the option to use a variable PAM matrix. For instance PAM65 would be (PAM1)^65. After implementing everything in PHP and running it, I get the correct PAM250 matrix when compared to the official PAM250 matrix of NCBI (some rounding errors probably). However when testing the PAM30 and comparing that one I get totally different results.
My matrix multiplication code looks like this:
function matrixmult($m1,$m2){
$r=count($m1);
$c=count($m2[0]);
$p=count($m2);
if(count($m1[0])!=$p){throw new Exception('Incompatible matrixes');}
$m3=array();
for ($i=0;$i< $r;$i++){
for($j=0;$j<$c;$j++){
$m3[$i][$j]=0;
for($k=0;$k<$p;$k++){
$m3[$i][$j]+=$m1[$i][$k]*$m2[$k][$j];
}
}
}
for ($i = 0; $i < $r; $i++) {
$sum=0;
for ($j = 0; $j < $c; $j++) {
$sum+=$m3[$j][$i];
}
$sum-=$m3[$i][$i];
echo $sum;
$m3[$i][$i]=1-$sum;
}
return($m3);
}
This would then be run N-1 times with m1 being PAM(i) and m2 being PAM1.
The second part of the function makes sure that the sum of each column = 1
Next I would calculate the Log-odds scoring matrix as following:
foreach ($PAMN2 as $key => $line) {
foreach ($line as $key2 => $word) {
$scomat[$key][$key2]=round(10*log(($word/($rm[$key])),10));
}
}
With the keys being amino acid 1 letter codes and rm being the relative frequencies of the amino acids.
I did not really understand how the log odds are symmetric so I just copied the lower half of the scoring matrix to the upper half:
$allkeys = array_keys($scomat);
for ($i = 0; $i < 20; $i++) {
for ($j = 0; $j < $i; $j++) {
$scomat[$allkeys[$j]][$allkeys[$i]]=$scomat[$allkeys[$i]][$allkeys[$j]];
}
}
I wondered if anyone knows why the scoring matrix is wrong? The PAM1 that I use is the standard one as well as the amino acid frequencies.
Best regards
Cedric