Question: log-odds score from PAM matrix
21 months ago, ericpellegrini76 wrote:

Hi all,

I wrote a small prototype that computes PAM matrices. The generated PAM matrices agree with other sources. However, I run into trouble when deriving the score matrix from a PAM matrix: depending on the PAM order (e.g. PAM10, PAM250), I have to use a different logarithm base and multiplier to make my score matrix match the reference ones, for instance those on the NCBI FTP site (ftp://ftp.ncbi.nih.gov/blast/matrices).

Specifically:

  • for PAM10: S10 = 2*log2(PAM10/f)
  • for PAM250: S250 = 10.0*log10(PAM250/f)

where f denotes the normalized amino-acid frequencies. I am puzzled by this (apparent) inconsistency between the formulas. I have probably misunderstood something, but I cannot find anything about this in Dayhoff's seminal paper. Do you have any idea what I am doing wrong? Thanks
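To make the two formulas easier to compare, here is a minimal Python sketch. It assumes that any expression of the form c*log_b(x) can be rewritten as ln(x)/lambda with lambda = ln(b)/c, which is a plain change of logarithm base. The function names and the sample probability and frequency are hypothetical and only serve as illustration; this is not the NCBI code.

```python
import math

def log_odds(p_ij, f_j, multiplier, base):
    """Scaled log-odds score, rounded to the nearest integer.

    p_ij: PAM-N transition probability (hypothetical input)
    f_j:  normalized frequency of the target amino acid
    """
    return round(multiplier * math.log(p_ij / f_j, base))

def scale_lambda(multiplier, base):
    """Equivalent natural-log scale, so that score = ln(ratio) / lambda."""
    return math.log(base) / multiplier

# The two formulas from the question, plus 3*log2 for comparison.
lam_pam10 = scale_lambda(2, 2)      # ln(2)/2  ~ 0.346574
lam_pam250 = scale_lambda(10, 10)   # ln(10)/10 ~ 0.230259
lam_third_bits = scale_lambda(3, 2) # ln(2)/3  ~ 0.231049

# Hypothetical entry whose odds ratio is about 4.
p_ij, f_j = 0.2, 0.05
score_half_bits = log_odds(p_ij, f_j, 2, 2)
score_log10 = log_odds(p_ij, f_j, 10, 10)
score_third_bits = log_odds(p_ij, f_j, 3, 2)
```

Under this view the two formulas differ only in the scale lambda. Note that 10*log10 and 3*log2 have nearly the same lambda, so after rounding to integers they usually produce the same entries; it may be worth checking whether the reference files state their scale explicitly.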

alignment • 953 views
written 21 months ago by ericpellegrini76

Why would you not expect different results from these (they are different algorithms)?

written 21 months ago by Kevin Blighe

Thanks for the reply. The NCBI implementation (see link above) provides log-odds matrices on different logarithmic scales. That puzzles me because the standard formula for the score matrix is unique, i.e. the log of the PAM matrix divided by the amino-acid frequencies. Whatever logarithm base is used in that formula, I would expect it to stay constant regardless of the PAM number (2, 10, 50, 250, ...). Otherwise, how can one compare sequence alignments performed with PAM10, PAM50 or PAM250?
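On the comparison question, one possible approach is to convert scores from each matrix to a common unit such as bits before comparing them. The sketch below assumes each matrix was built as multiplier*log_base(odds ratio); the helper name `to_bits` is hypothetical, and the conversion ignores the rounding error already baked into integer scores.

```python
import math

def to_bits(score, multiplier, base):
    """Convert a score built as multiplier*log_base(ratio) into bits (log2 units)."""
    return score * math.log2(base) / multiplier

# A score of 4 on the 2*log2 scale and a score of 6 on the 10*log10 scale
# both correspond to an odds ratio of roughly 4, i.e. about 2 bits.
bits_from_half_bits = to_bits(4, 2, 2)
bits_from_log10 = to_bits(6, 10, 10)
```

Once expressed in the same unit, alignment scores obtained with different PAM matrices can at least be placed on one scale, although matrices of different PAM distance still model different evolutionary divergence.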

modified 21 months ago • written 21 months ago by ericpellegrini76