Question: Blast raw score calculation
5.1 years ago by
Spain
cristianrohr76830 wrote:

Hello,

I have spent the past two days trying to find out how the BLAST raw score is calculated. I have read a lot, including the answers here, but I can't find the right answer to my problem.

I'm using a 240 aa query with blastp against RefSeq. The first result is:

Range 1: 140 to 379

Score Expect Method Identities Positives Gaps
462 bits(1190) 2e-160 Compositional matrix adjust. 239/240(99%) 240/240(100%) 0/240(0%)

Raw score 1190, with 1 mismatch

The protein is from H. sapiens, and the following result appears in fourth place:

Range 1: 1 to 240

Score Expect Method Identities Positives Gaps
450 bits(1157) 6e-158 Compositional matrix adjust. 240/240(100%) 240/240(100%) 0/240(0%)

The raw score is 1157.

How is it possible that a perfect match has a lower raw score?

blast raw score calculation • 2.9k views
modified 4.4 years ago by Biostar ♦♦ 20 • written 5.1 years ago by cristianrohr76830

The aligned sequences are different: "Range 1: 140 to 379" and "Range 1: 1 to 240". Have a look at the diagonal of the BLOSUM62 matrix and you will understand the reason.

5.1 years ago by
Damian Kao15k
USA
Damian Kao15k wrote:

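Two possible reasons. 1) The raw score is not length normalized, so longer alignments can have bigger raw scores. (Both of your alignments span 240 positions, so this is probably not what separates them.) 2) The raw score is calculated from a substitution matrix. The purpose of the matrix is to give each aligned pair a score based on how likely one amino acid is to be substituted for another. Some pairs score better than others, and that includes identical pairs: in BLOSUM62 a W/W match scores 11 while an A/A match scores only 4. So depending on the amino acid content of the aligned region, a perfect match can end up with a lower score than an alignment containing a mismatch.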
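To make the second point concrete, here is a minimal sketch in Python. It assumes an ungapped alignment scored with the plain BLOSUM62 matrix; the two peptides are made up for illustration, and `raw_score` is a hypothetical helper, not a BLAST function. Only the matrix diagonal and a handful of off-diagonal entries are included. Note that your hits were scored with "Compositional matrix adjust.", which rescales the matrix for each query/subject pair, so real BLAST scores will not match a plain BLOSUM62 sum exactly.

```python
# Diagonal of the standard BLOSUM62 matrix: the score for aligning a
# residue with an identical residue.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}

# A few off-diagonal BLOSUM62 entries, just enough for this example.
BLOSUM62_MISMATCH = {
    frozenset("KR"): 2,
    frozenset("IV"): 3,
    frozenset("IL"): 2,
    frozenset("DE"): 2,
}


def raw_score(query, subject):
    """Sum the per-position scores of an ungapped alignment (hypothetical helper)."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment needs sequences of equal length")
    total = 0
    for a, b in zip(query, subject):
        if a == b:
            total += BLOSUM62_DIAGONAL[a]
        else:
            total += BLOSUM62_MISMATCH[frozenset((a, b))]
    return total


# Made-up peptide rich in high-scoring residues, aligned with one mismatch (K/R).
print(raw_score("WCHPYWKI", "WCHPYWRI"))  # 59

# Made-up peptide of low-scoring residues, aligned perfectly to itself.
print(raw_score("ALSVIALS", "ALSVIALS"))  # 32
```

Both alignments are 8 residues long, yet the one with a mismatch scores 59 and the perfect match scores 32, because W, C, H, P and Y contribute far more per identical position than A, L, S, V and I.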

Edit: By the way, if you want to learn how BLAST works, Ian Korf's BLAST essentials book is probably the most comprehensive resource:

http://pedagogix-tagc.univ-mrs.fr/courses/bioinfo_intro/articles/sequence_alignment/Korf_BLAST_essential_OReilly.pdf
