Question: BLOSUM80 values differ - but what's the reason?
RamRS (Houston, TX) wrote 2.4 years ago:

Hi all,

I've a simple but intriguing question: I was trying to pull the BLOSUM80 matrix online, and I see that the substitution scores differ among sources.

For example, the Ter (stop) scores in this matrix are -8, while the same scores in this NCBI BLOSUM80 are -6.

Is there a "right" matrix? What do these different scores indicate as an underlying factor? I'd appreciate the community's input on this please.

Thank you!


This may also be related to the miscalculation discovered in the BLOSUM matrices a few years back. Surprisingly, the miscalculated matrix was found to perform better in searches.

written 2.4 years ago by genomax

This observation is sort of scary, since it will eventually affect the results.

written 2.4 years ago by Antonio R. Franco
gearoid wrote 2.4 years ago:

It looks like the two versions are using different units. In the first matrix the unit is 1/3 bit, while in the second matrix, it's 1/2 bit.

From the second matrix:

a scale of ln(2)/2.0

relates 'nats' to half-bit units: a score in half bits multiplied by ln(2)/2 gives the same value in nats.
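To spell out where that factor comes from, here is a short sketch using the usual log-odds form of the score. The symbols $p_{ij}$ (observed pair frequency) and $q_i q_j$ (expected frequency) are my notation, not something taken from the matrix file:

```latex
S_{ij} = \frac{1}{\lambda}\,\ln\frac{p_{ij}}{q_i q_j},
\qquad
\lambda = \frac{\ln 2}{2}
\;\Longrightarrow\;
S_{ij} = 2\,\log_2\frac{p_{ij}}{q_i q_j}
\quad\text{(half-bit units)}
```

By the same reasoning, $\lambda = \ln(2)/3$ would give scores in third-bit units, $S_{ij} = 3\,\log_2\bigl(p_{ij}/(q_i q_j)\bigr)$.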


good point,

the concerning thing here is that, while the relative ratios (probabilities) are the same, the total alignment scores will differ between the two, so you have to state which BLOSUM80 matrix you used. "Oh, I used the BLOSUM80 with this scaling, not the other one" ... as if there weren't enough issues to deal with already.

Frankly I never thought that you could download two BLOSUM80 matrices from NCBI and get very different data ... well now I know

ftp://ftp.ncbi.nlm.nih.gov/blast/matrices/BLOSUM80

http://www.ncbi.nlm.nih.gov/IEB/ToolBox/C_DOC/lxr/source/data/BLOSUM80

written 2.4 years ago by Istvan Albert

I did not realize both were available in NCBI!

written 2.4 years ago by RamRS

I noticed that too, but I'm confused - I recall having to use log_base_2( P(Obs) / P(Exp) ) to get the scores. How would I convert between these two units?

written 2.4 years ago by RamRS

It is a constant scaling factor in the equation:

https://en.wikipedia.org/wiki/BLOSUM#Score_of_the_BLOSUM_matrices

The factor \lambda is a scaling factor, set such that the matrix contains easily computable integer values.
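As a rough illustration of that scaling, here is a minimal Python sketch. The function names and the example frequencies are hypothetical (made up for this post, not taken from either NCBI file); it only assumes the log-odds form of the score with lambda = ln(2)/n for 1/n-bit units.

```python
import math


def scale_lambda(units_per_bit):
    """Lambda for scores in 1/units_per_bit bit units (2 = half bits, 3 = third bits)."""
    return math.log(2) / units_per_bit


def log_odds_score(p_obs, p_exp, units_per_bit):
    """Rounded integer score: (1 / lambda) * ln(p_obs / p_exp)."""
    return round(math.log(p_obs / p_exp) / scale_lambda(units_per_bit))


def rescale(score, from_units_per_bit, to_units_per_bit):
    """Convert a score between units; the result is generally not an integer."""
    return score * to_units_per_bit / from_units_per_bit


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only (odds ratio of 4, i.e. 2 bits).
    p_obs, p_exp = 0.04, 0.01
    print(log_odds_score(p_obs, p_exp, 2))  # score in half-bit units
    print(log_odds_score(p_obs, p_exp, 3))  # score in third-bit units
    print(rescale(-8, 3, 2))                # a third-bit score expressed in half bits
```

Note that rescaling an already-rounded integer entry usually does not land on an integer, so converting one published matrix this way will only approximate the other; each file is rounded in its own units.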

written 2.4 years ago by Istvan Albert