Question: log-odds score from PAM matrix
2.6 years ago, ericpellegrini76 wrote:

Hi all,

I wrote a small prototype that computes PAM matrices. The generated PAM matrices agree with other sources. However, I have trouble deriving the score matrix from the PAM matrix. Depending on the order (e.g. PAM10, PAM250), I have to select a different base for the logarithm to make my score matrix match the reference ones found, for instance, on the NCBI FTP site (ftp://ftp.ncbi.nih.gov/blast/matrices).

Specifically:

  • for PAM10: S10 = 2*log2(PAM10/f)
  • for PAM250: S250 = 10.0*log10(PAM250/f)

where f are the normalized amino acid frequencies. I am puzzled by this apparent inconsistency between the two formulas. I have probably misunderstood something, but I cannot find anything about this in Dayhoff's seminal paper. Would you have any idea what I am doing wrong? Thanks
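One way to look at the two formulas above is as a numerical check, sketched here under assumptions: `log_odds`, `pam_entry` and `freq` are hypothetical names, and the input values are made up for illustration rather than taken from a real PAM matrix. Changing the base of the logarithm only rescales the score by a constant, so each convention can be written as a single factor applied to the natural log. Note that 10*log10 is numerically very close to 3*log2 (third-bit units), so after rounding to integers the two would usually be indistinguishable. If the reference files state their scale in a header comment, that would be the place to confirm which convention each one uses.

```python
import math

def log_odds(pam_entry, freq, multiplier, base):
    """Scaled log-odds score: multiplier * log_base(pam_entry / freq)."""
    return multiplier * math.log(pam_entry / freq, base)

# Hypothetical inputs, for illustration only (not from a real PAM matrix).
pam_entry = 0.13   # one entry of a PAM mutation probability matrix
freq = 0.087       # normalized frequency of the corresponding amino acid

half_bits = log_odds(pam_entry, freq, 2.0, 2)     # 2 * log2
third_bits = log_odds(pam_entry, freq, 3.0, 2)    # 3 * log2
decibans = log_odds(pam_entry, freq, 10.0, 10)    # 10 * log10

# Each convention expressed as a constant factor on the natural log.
factors = {
    "2*log2": 2.0 / math.log(2),       # about 2.885
    "3*log2": 3.0 / math.log(2),       # about 4.328
    "10*log10": 10.0 / math.log(10),   # about 4.343
}

# Relative difference between 10*log10 and 3*log2 (well under 1%).
rel_diff = factors["10*log10"] / factors["3*log2"] - 1.0
print(half_bits, third_bits, decibans, rel_diff)
```

In this sketch the `10*log10` and `3*log2` scores differ by roughly 0.3%, while the `2*log2` score is smaller by a factor of about 1.5.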

alignment • 1.3k views

Why would you not expect different results from these (they are different algorithms)?

written 2.6 years ago by Kevin Blighe

Thanks for the reply. The NCBI implementation (see link above) provides log-odds scores on different logarithmic scales. That puzzles me because the standard formula for computing the score matrix is unique, i.e. the log of the PAM matrix divided by the amino acid frequencies. Whatever logarithmic base is used in that formula, I would expect it to be constant regardless of the PAM matrix number (2, 10, 50, 250 ...). Otherwise, how can one compare sequence alignments performed with PAM10, PAM50 or PAM250?
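On the comparison question, a minimal sketch of how scores on different logarithmic scales could be brought to a common unit, assuming each matrix was built as `multiplier * log_base(...)` with a known multiplier and base. The helper `to_bits` is a hypothetical name, and the example scores are made up. Published matrices are rounded to integers, so the conversion is only approximate.

```python
import math

def to_bits(score, multiplier, base):
    """Undo a `multiplier * log_base(...)` scaling and express the score in bits."""
    return (score / multiplier) * math.log2(base)

# Hypothetical scores, for illustration only.
bits_from_half_bits = to_bits(4, 2.0, 2)      # a score of 4 in 2*log2 units
bits_from_decibans = to_bits(6, 10.0, 10)     # a score of 6 in 10*log10 units
print(bits_from_half_bits, bits_from_decibans)
```

Here both example scores come out at about 2 bits, even though the raw integers (4 and 6) differ.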

modified 2.6 years ago • written 2.6 years ago by ericpellegrini76