Discrepancy between E() Values in Blastp Search - HELP
2.9 years ago
minasmayth • 0

Hi everyone,

I've been running a blastp search to compare the histone H1t amino acid sequences of Ursus Maritimus and Mus Musculus, and have run into a problem. BLAST reports an E() value of 1e-58, which would be fine, but I get a different result when I calculate the E() value myself using the equation:

E=m×n×2^(-S')


where E is the expected value, m is the total number of residues in the database/subject sequence (in this case 212), n is the number of residues in the query sequence (209), and S' is the bit score (171). This gives:

E=212×209×2^(-171)=1.48031073×10^(-47)≈1e-47


This doesn't match the value BLAST reports. The bit score is the same in both BLAST's output and my own calculations, so I'm wondering what the problem could be?
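For anyone wanting to check the arithmetic, here is a minimal Python sketch of the formula above. The function name `expected_value` is made up for illustration and is not part of BLAST; it uses the raw sequence lengths exactly as quoted in this post.

```python
# Sketch of the formula from the post: E = m * n * 2^(-S').
# expected_value is a hypothetical helper, not a BLAST function.

def expected_value(m: int, n: int, bit_score: float) -> float:
    """Return E = m * n * 2**(-bit_score) using raw (uncorrected) lengths."""
    return m * n * 2.0 ** (-bit_score)

# Values quoted in the post: subject length 212, query length 209, bit score 171.
e_manual = expected_value(m=212, n=209, bit_score=171)
print(f"{e_manual:.3e}")  # about 1.480e-47, matching the hand calculation
```

So the hand calculation itself is consistent with the formula; the gap to 1e-58 has to come from somewhere other than the arithmetic.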

Any help at all would be greatly appreciated! Minas

BLAST Protein Blastp E-value E() Value • 692 views

Your formulas are correct, that's for sure.

Why do you want to recalculate the BLAST E-value in the first place?

This isn't very helpful, but it turns out that recalculating a BLAST E-value is far from straightforward (perhaps even impossible); unfortunately, the BLAST algorithm is a black box in that sense. So I would let it rest.


Ah okay, I'll leave it at that then. I wanted to do the calculations because the project I am doing recommends it, but if it's not a good idea I won't do it. Thanks for the help!

2.8 years ago
mhampton • 0

Sorry to be pedantic, but you should put the species epithets in lower case (Ursus maritimus, Mus musculus).

I tried to reproduce your result and I got a bit score of 183, not 171, but that isn't enough to resolve things. I think the bulk of the discrepancy you are seeing is because of the compositional scoring adjustment, described here:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1343503/

It is irritating that they don't report the statistics in a more transparent way.
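To put numbers on how large the gap is, here is a rough Python sketch that plugs both bit scores from this thread (171 and 183) into E = m×n×2^(-S') and then inverts the formula to ask what bit score would be needed to reach 1e-58. The helper names are hypothetical, and the sketch assumes the raw lengths 212 and 209 from the question; BLAST itself works with length-corrected values and adjusted scores, so this is only an illustration of the simple formula, not of what BLAST does internally.

```python
import math

# Hypothetical helpers for the simple formula E = m * n * 2^(-S').
# Raw lengths from the question are assumed; BLAST's own statistics differ.

def expected_value(m: int, n: int, bit_score: float) -> float:
    """E-value from raw lengths and a bit score."""
    return m * n * 2.0 ** (-bit_score)

def implied_bit_score(m: int, n: int, e_value: float) -> float:
    """Invert the formula: S' = log2(m * n / E)."""
    return math.log2(m * n / e_value)

m, n = 212, 209
e_171 = expected_value(m, n, 171)             # bit score from the question
e_183 = expected_value(m, n, 183)             # bit score from this answer
s_implied = implied_bit_score(m, n, 1e-58)    # score needed for E = 1e-58

print(f"E at S'=171: {e_171:.2e}")            # about 1.48e-47
print(f"E at S'=183: {e_183:.2e}")            # about 3.61e-51
print(f"S' implied by E=1e-58: {s_implied:.1f}")  # about 208.1
```

Under these assumptions, neither 171 nor 183 bits gets close to 1e-58 with the raw lengths, which fits the idea that the reported E-value rests on adjusted statistics rather than on this formula alone.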