How Are the Score and E-value Calculated in BLAST Output?
10.4 years ago
Beeth ▴ 170

Hi!

I am working on a BLAST-like tool and want to implement the Score (raw alignment score) and the E-value. Could someone please explain how they are calculated, using an example? I found lots of explanations, but none gave me the answers I need. I really need a worked example of how these scores are calculated.

Thanks,

Beeth

blast scoring • 18k views
1

As I think the answers below suggest... If you don't know what BLAST really does and how its core outputs are calculated, why then would you start working on something that does something similar?

0

I've also tried to emulate BLAST's scoring scheme in another program, and it's close to impossible. I've managed to get the bit scores more or less comparable (except in the range of roughly 20 to 40 bits), but I've not attempted to calculate E-values. There's a lot of advanced statistics, strange behavior, hard-coded options etc. hidden inside BLAST's bowels.

6
10.4 years ago

For a step-by-step example, see the Wikipedia article on BLAST; also read about sequence alignment and dynamic programming.

The Statistics of Sequence Similarity Scores: IMHO this is the best explanation of the BLAST score and E-value.

Next would be the '98 article: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. I recommend reading through the references of this article for a detailed overview.

Also check the latest BLAST+ article for a brief overview of the new improvements.
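
As a rough illustration of what those references describe, the Karlin-Altschul relations between a raw score S, the bit score S' and the E-value are S' = (λS - ln K) / ln 2 and E = K·m·n·e^(-λS) = m·n·2^(-S'). The sketch below simply evaluates them in Python. The function names are made up for this example, and the λ and K values in the usage note are placeholders: the real values depend on the scoring scheme and gap costs, and BLAST prints the ones it used in its output. BLAST also uses edge-corrected "effective" lengths for m and n, which this sketch leaves out.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score: (lam*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, lam, k, query_len, db_len):
    """Expected number of chance hits scoring at least raw_score:
    E = K * m * n * exp(-lam * S).

    query_len and db_len are used as given; no effective-length
    correction is applied in this sketch.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


if __name__ == "__main__":
    # Placeholder parameters, NOT taken from any particular BLAST run.
    lam, k = 1.28, 0.46
    m, n = 29, 1_000_000  # hypothetical query length and database size
    s = 16
    print("bit score:", bit_score(s, lam, k))
    print("E-value:  ", e_value(s, lam, k, m, n))
```

Because the bit score already absorbs λ and K, E-values computed as m·n·2^(-S') agree with the raw-score formula, which makes a convenient sanity check for your own implementation.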

1

+1 for "The Statistics of Sequence Similarity Scores"

5
10.4 years ago

I recommend:

http://www.amazon.com/Blast-Ian-Korf/dp/0596002998

This book has all the information you are looking for.

1
10.4 years ago
Woa ★ 2.9k

Check the book "Genomic Perl: From Bioinformatics Basics to Working Code" by Rex A. Dwyer. See particularly Chapter 7, "Local alignment and BLAST heuristic".

0

Though the script is a 'toy' implementation, you can get some idea from it. You can also go through the actual source code of BLAST, which I think is freely available.

0
10.3 years ago
Beeth ▴ 170

Thanks for all the answers; I've started working through the sources listed here. They are giving me a good introduction to BLAST.

Now that I understand a little more, I can ask more specific questions, such as the following.

Consider the following BLAST result:

ATTTGCAGAATTTGCAAAAAAATGTTTGT
|||||||||||||||    ||||||||||
ATTTGCAGAATTTGC----AAATGTTTGT


I'd like to calculate the raw alignment score (scoring scheme: match = 1, mismatch = -2, gap opening = 3, gap extension = 2).

We have 25 matches and one gap spanning 4 positions. Is the score S = 25 - 9 = 16, or do I need to count the 4 gap positions as 4 mismatches as well, giving S = 25 - 9 - 8 = 8?

Which one is correct?

Thanks! Beeth.
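
On the two candidates above: under affine gap scoring, gapped columns are charged only through the gap penalties and are not also counted as mismatches, so the 25 - 9 - 8 variant counts them twice. What does differ between tools is how the opening penalty is applied. As far as I know, BLAST charges opening + k × extension for a gap of length k, which gives 25 - (3 + 4·2) = 14 here. The 25 - 9 = 16 figure matches the other common convention, where the opening penalty already covers the first gapped position. The sketch below scores an alignment under either convention; the function name and the `open_covers_first` flag are invented for this example.

```python
def raw_score(top, bottom, match=1, mismatch=-2,
              gap_open=3, gap_extend=2, open_covers_first=False):
    """Score a pairwise alignment given as two equal-length gapped strings.

    open_covers_first=False: a gap of length k costs gap_open + k * gap_extend.
    open_covers_first=True:  a gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")

    score = 0
    gap_side = None  # which sequence currently holds a run of '-', if any
    for a, b in zip(top, bottom):
        if a == "-" and b == "-":
            raise ValueError("column with gaps in both sequences")
        if a == "-" or b == "-":
            side = "top" if a == "-" else "bottom"
            if side != gap_side:
                # A new gap starts here: charge the opening penalty.
                score -= gap_open
                if not open_covers_first:
                    score -= gap_extend
            else:
                # The current gap continues: charge one extension.
                score -= gap_extend
            gap_side = side
        else:
            score += match if a == b else mismatch
            gap_side = None
    return score


if __name__ == "__main__":
    top = "ATTTGCAGAATTTGCAAAAAAATGTTTGT"
    bottom = "ATTTGCAGAATTTGC----AAATGTTTGT"
    print(raw_score(top, bottom))                          # open + k * extend
    print(raw_score(top, bottom, open_covers_first=True))  # open + (k - 1) * extend
```

Whichever convention you pick, it is worth checking it against a few real BLAST hits run with the same match, mismatch and gap parameters before building anything on top of it.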

0

Please re-post this as a new question. The general format of this site is one well-defined question per post, rather than a series of follow-up questions with replies in the manner of a forum or listserv.