blastp alignment output : scores not in descending order?
3
1
6.9 years ago
willing_mh ▴ 20

My understanding is that alignments reported by the NCBI BLAST tools are always listed in order of decreasing score. However, after setting up blastp with a local database, I am seeing output where this is not the case, as shown below: the alignment scores are not listed in decreasing order. Any thoughts on what might cause this? Thanks.

Sequences producing significant alignments:                          (Bits)  Value

HLA:HLA00355 B*51:10 232 bp                                           471   0.0
HLA:HLA03555 G*01:04:05 273 bp                                        469   0.0
HLA:HLA00950 G*01:04:02 273 bp                                        469   0.0
HLA:HLA03552 G*01:01:16 273 bp                                        470   0.0
HLA:HLA03553 G*01:01:15 273 bp                                        470   0.0
HLA:HLA00945 G*01:01:07 273 bp                                        470   0.0
HLA:HLA03159 G*01:01:14 273 bp                                        470   0.0
HLA:HLA03558 G*01:01:19 273 bp                                        470   0.0
HLA:HLA03556 G*01:01:17 273 bp                                        470   0.0
HLA:HLA03557 G*01:01:18 273 bp                                        470   0.0
HLA:HLA13776 G*01:19 273 bp                                           468   0.0
HLA:HLA03396 G*01:12 273 bp                                           468   0.0
HLA:HLA01802 A*02:67 270 bp                                           471   0.0
blast blastp score • 1.9k views
0

Thanks for your comment. I reformatted the blastp output so it is easier to read: the e-value is reported as 0.0 for all entries, but the score both decreases and increases.

0

Because blastp sorts by e-value, equal e-values are output in random order.

2
6.9 years ago

BLAST always reports hits in order of decreasing max_score, i.e. the score of the best HSP found for each hit (the E-value is also calculated from the maximal HSP score).

What you see in your output is the total_score for a hit. The total score is the sum of the HSP scores for a given hit. I can only see a fragment of your BLAST output, but I bet that BLAST found one HSP for your top-ranked sequence, HLA:HLA00355. That means it is aligned to your query in one segment, which achieved the highest score (471 bits). In this case total_score equals max_score. However, the sequence at the bottom of your output (HLA:HLA01802) achieved the same total_score (471 bits), but its alignment is split into more than one HSP. For example, one HSP could have a score of 400 and the other 71; together they sum to a total score of 471.

Scroll down your results and look at the alignments to see these differences between scores for HLA:HLA00355 and HLA:HLA01802.
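
To make the max_score versus total_score distinction concrete, here is a small Python sketch. The per-HSP scores are hypothetical: only the 471 and 470 totals and the 400 + 71 split come from the discussion above, and the number of HSPs for each hit is an assumption for illustration, not something read from the actual alignments.

```python
# Hypothetical per-HSP bit scores for three hits (assumed, not taken from real output).
hits = {
    "HLA:HLA00355": [471.0],        # assumed: a single HSP
    "HLA:HLA03552": [470.0],        # assumed: a single HSP
    "HLA:HLA01802": [400.0, 71.0],  # assumed: alignment split into two HSPs
}

def summarize(hsp_scores):
    """Return max_score (best HSP) and total_score (sum of HSPs) per hit."""
    return {
        subject: {"max_score": max(scores), "total_score": sum(scores)}
        for subject, scores in hsp_scores.items()
    }

def rank_by_max_score(hsp_scores):
    """Order hit names by decreasing max_score."""
    summary = summarize(hsp_scores)
    return sorted(summary, key=lambda s: summary[s]["max_score"], reverse=True)

summary = summarize(hits)
for subject in rank_by_max_score(hits):
    print(subject, summary[subject]["max_score"], summary[subject]["total_score"])
```

With these assumed numbers, the hits come out ordered by max_score (471, 470, 400), while the total_score column reads 471, 470, 471, which is the kind of apparently unsorted list shown in the question.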

0
6.9 years ago

As far as I remember, the default sort order is based on e-value.

0
6.8 years ago
willing_mh ▴ 20
Thanks for your comment. At this point I would be grateful for any pointers to BLAST documentation that address either of the following points: (1) the criterion used for hit ordering (e-value? score?); (2) a way of setting num_alignments to the maximum, i.e. to the number of database items searched. Since many e-values are 0 in this case, the default ordering is not helpful. Listing all alignments would allow the maximum score to be obtained by scanning the output.
1

Read the output of blastp -help. Also, given tabular output (-outfmt 6 with its default columns), you can simply sort by column 1 (query sequence) and column 12 (bit score), i.e.

export LC_ALL=C; export LANG=C; sort -k1,1 -k12,12gr blast_output > bscore_sorted_blast_output
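
If a script is more convenient than the shell pipeline, the same ordering can be sketched in Python. This assumes the default -outfmt 6 column layout (query ID in column 1, bit score in column 12); the sample rows are made up, and the helper names sort_by_bitscore and best_hit_per_query are hypothetical, not part of any BLAST tooling.

```python
import csv
import io

# Made-up rows in the default -outfmt 6 layout (12 tab-separated fields).
SAMPLE = "\n".join([
    "q1\tsubjA\t99.0\t270\t2\t0\t1\t270\t1\t270\t0.0\t469",
    "q1\tsubjB\t99.6\t270\t1\t0\t1\t270\t1\t270\t0.0\t471",
    "q2\tsubjC\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-140\t400",
    "q1\tsubjC\t99.3\t270\t2\t0\t1\t270\t1\t270\t0.0\t470",
])

def sort_by_bitscore(handle):
    """Sort rows by query ID, then by bit score (column 12) in descending order."""
    rows = [
        r for r in csv.reader(handle, delimiter="\t")
        if r and not r[0].startswith("#")  # skip blank and comment lines
    ]
    return sorted(rows, key=lambda r: (r[0], -float(r[11])))

def best_hit_per_query(sorted_rows):
    """Keep the first (highest-scoring) row for each query."""
    best = {}
    for row in sorted_rows:
        best.setdefault(row[0], row)
    return best

rows = sort_by_bitscore(io.StringIO(SAMPLE))
for query, row in best_hit_per_query(rows).items():
    print(query, row[1], row[11])
```

For a real file, pass an open handle instead of the in-memory sample, e.g. `open("blast_output", newline="")`. Each tabular row is one HSP, so a subject whose alignment is split into several HSPs appears on several rows.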