log-odds score from PAM matrix
6.5 years ago

Hi all,

I wrote a small prototype that computes PAM matrices. The generated PAM matrices agree with other sources. However, I have trouble deriving the score matrix from the PAM matrix: depending on the order (e.g. PAM10, PAM250), I have to pick a different logarithm base to make my score matrix match the reference ones, for instance those on the NCBI FTP site (ftp://ftp.ncbi.nih.gov/blast/matrices).

Specifically:

  • for PAM10: S10 = 2*log2(PAM10/f)
  • for PAM250: S250 = 10.0*log10(PAM250/f)

where f is the vector of normalized amino acid frequencies. I am puzzled by this apparent inconsistency between the formulas, and I have probably misunderstood something. I cannot find anything about this in Dayhoff's seminal paper. Do you have any idea what I am doing wrong? Thanks
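
For what it's worth, the two formulas above differ only by a constant factor before rounding, which a few lines of Python can check. This is a minimal sketch: the two-letter alphabet, the `freqs` and `pam` values, and the `log_odds` helper are made up for illustration and are not taken from Dayhoff's data or from the NCBI files.

```python
import math

def log_odds(pam, freqs, multiplier, base):
    """Unrounded multiplier * log_base(pam[i][j] / freqs[i]) for every cell."""
    n = len(freqs)
    return [[multiplier * math.log(pam[i][j] / freqs[i], base)
             for j in range(n)]
            for i in range(n)]

# Hypothetical toy data: a two-letter alphabet, NOT real amino acid values.
# Each column of pam sums to 1 and freqs is its stationary distribution.
freqs = [0.6, 0.4]
pam = [[0.90, 0.15],
       [0.10, 0.85]]

s_log2 = log_odds(pam, freqs, 2.0, 2)      # the 2*log2(PAM/f) variant
s_log10 = log_odds(pam, freqs, 10.0, 10)   # the 10*log10(PAM/f) variant

# Cell-by-cell ratio of the two score matrices
ratios = [s_log10[i][j] / s_log2[i][j] for i in range(2) for j in range(2)]
expected = 5 * math.log10(2)
print(ratios, expected)
```

Every ratio equals the same constant, 5*log10(2), about 1.505, so the unrounded matrices carry the same information and only the size of one score unit changes.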


Why would you not expect different results from these (they are different algorithms)?


Thanks for the reply. The NCBI matrices (see link above) provide log-odds on different logarithmic scales. That puzzles me because the standard formula for the score matrix is unique: the log of the PAM matrix divided by the amino acid frequencies. Whatever logarithm base is used, I would expect it to stay constant regardless of the PAM number (2, 10, 50, 250, ...). Otherwise, how can one compare sequence alignments performed with PAM10, PAM50, or PAM250?
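
One way to put scores from differently scaled matrices on a common unit is to convert them to bits. The sketch below assumes the score was computed as `multiplier * log_base(odds)`, as in the two formulas from my question; `score_to_bits` is a hypothetical helper, not part of any NCBI tool.

```python
import math

def score_to_bits(score, multiplier, base):
    """Convert a score computed as multiplier * log_base(odds) into bits."""
    return score / multiplier * math.log2(base)

# 4 units on the 2*log2 scale are 2 bits
print(score_to_bits(4, 2.0, 2))
# 10 units on the 10*log10 scale are log2(10) bits, about 3.32
print(score_to_bits(10, 10.0, 10))
```

This only fixes the unit; it does not by itself say whether alignments made with different PAM distances are statistically comparable.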
