Bl2Seq/Blastall And The E Value Help
12.9 years ago
Gturco ▴ 40

I am using bl2seq (2.2.25) to run blastn on two FASTA files. I am not sure I fully understand how the -e parameter works.

I thought the -e parameter was a cutoff e-value and would therefore show all results with an e-value under 10 given my current parameters. This is not what I am seeing. Can someone please explain why?

EXAMPLE: When I set my e-value to 10 (-e 10) I am missing some results that have an e-value under 10. I know this because when I set it to -e 50 (using the same parameters) I get more results with e-values under 10.

NOTICE: there is one more hit with an e-value of 0.090 when I set the e-value to 50.

/blast-2.2.25/bin/bl2seq -p blastn -D 1 -E 2 -q -2 -r 1 -G 5 -W 7 -F T -e 10.11 -i '1.fasta' -j '3.fasta' -I 35606641,35636750 -J 66649067,66682497

BLASTN 2.2.25 [Feb-01-2011]
Query: 
Fields: Query id, Subject id, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score
QUERY        88.89    72    5    1    35624525    35624593    66670273    66670344    8e-17    83.4
QUERY        79.37    63    13    0    35625030    35625092    66670905    66670967    8e-06    46.8
QUERY        79.66    59    12    0    35618518    35618576    66660799    66660857    3e-05    44.9
QUERY        82.81    64    9    2    35624843    35624905    66670679    66670741    4e-04    41.1
QUERY        95.45    22    1    0    35618611    35618632    66660909    66660930    0.006    37.2
QUERY        91.30    23    2    0    35616154    35616176    66656321    66656343    0.090    33.4
QUERY        90.32    31    2    1    35622387    35622416    66666325    66666355    0.090    33.4
QUERY        100.00    17    0    0    35636434    35636450    66679138    66679154    0.090    33.4
QUERY        94.74    19    1    0    35619531    35619549    66661864    66661882    0.34    31.5
QUERY        100.00    15    0    0    35624258    35624272    66660468    66660482    1.3    29.5
QUERY        90.48    21    2    0    35623234    35623254    66667186    66667206    1.3    29.5

/blast-2.2.25/bin/bl2seq -p blastn -D 1 -E 2 -q -2 -r 1 -G 5 -W 7 -F T -e 50.11 -i '1.fasta' -j '3.fasta' -I 35606641,35636750 -J 66649067,66682497

BLASTN 2.2.25 [Feb-01-2011]
Query: 
Fields: Query id, Subject id, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score
QUERY        88.89    72    5    1    35624525    35624593    66670273    66670344    8e-17    83.4
QUERY        79.37    63    13    0    35625030    35625092    66670905    66670967    8e-06    46.8
QUERY        79.66    59    12    0    35618518    35618576    66660799    66660857    3e-05    44.9
QUERY        82.81    64    9    2    35624843    35624905    66670679    66670741    4e-04    41.1
QUERY        95.45    22    1    0    35618611    35618632    66660909    66660930    0.006    37.2
QUERY        91.30    23    2    0    35616154    35616176    66656321    66656343    0.090    33.4
QUERY        90.32    31    2    1    35622387    35622416    66666325    66666355    0.090    33.4
QUERY        96.00    25    0    1    35625785    35625809    66672077    66672100    0.090    33.4
QUERY        100.00    17    0    0    35636434    35636450    66679138    66679154    0.090    33.4
QUERY        94.74    19    1    0    35619531    35619549    66661864    66661882    0.34    31.5
QUERY        100.00    15    0    0    35624258    35624272    66660468    66660482    1.3    29.5
QUERY        90.48    21    2    0    35623234    35623254    66667186    66667206    1.3    29.5


12.9 years ago

The missing alignment with e-value 0.09 is relatively short and contains a gap. It may have originated as two HSPs, each with an e-value above 10. In the first run those would be discarded, but in the second run they are kept and merged, yielding an alignment with a higher bit score and a lower e-value. (See steps 9 and 10 of the Wikipedia explanation.)

Update: In my own BLAST queries, I saw a case where increasing the e-value threshold (while using Smith-Waterman traceback) caused some hits to vanish and others to appear. So there is more non-determinism to this than my explanation accounts for. :-/
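
To put rough numbers on the merging idea: BLAST relates a bit score S' to an e-value through E = m * n * 2^(-S'), where m and n are the effective lengths of the search space. The sketch below assumes m and n are simply the raw lengths of the -I and -J ranges from the question (the effective lengths BLAST really uses are somewhat shorter), and the function names are my own. Under that assumption it estimates the e-value for a few bit scores from the output and the minimum bit score a fragment would need to pass each -e cutoff.

```python
import math

def evalue_from_bits(bit_score, m, n):
    """Estimate E = m * n * 2**(-S') for a bit score S'."""
    return m * n * 2.0 ** (-bit_score)

def min_bits_for_evalue(evalue, m, n):
    """Smallest bit score whose estimated e-value is at or below `evalue`."""
    return math.log2(m * n / evalue)

# Assumption: raw lengths of the -I and -J ranges, not BLAST's effective lengths.
m = 35636750 - 35606641 + 1
n = 66682497 - 66649067 + 1

for bits in (33.4, 31.5, 29.5):
    print(f"bit score {bits}: estimated E = {evalue_from_bits(bits, m, n):.3g}")

for cutoff in (10, 50):
    print(f"-e {cutoff}: needs at least {min_bits_for_evalue(cutoff, m, n):.1f} bits")
```

With these assumed lengths, a fragment scoring between roughly 24.3 and 26.6 bits would pass -e 50 but be dropped at -e 10. That is the window in which the two-HSP explanation could apply; it does not show that this is what actually happened.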


Ah, that seems likely, +1 to you. You should be able to check this by looking at the XML output from BLAST, where each <Hit> may have multiple <Hsp> elements. Thanks for the WP link, hadn't seen that one.
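
A minimal sketch of that check, assuming the XML follows the usual BLAST layout with <Hit>, <Hsp>, <Hsp_bit-score> and <Hsp_evalue> elements. The SAMPLE_XML string and its values are made up for illustration, and hsps_per_hit is a hypothetical helper name, not part of BLAST.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the BLAST XML layout; the numbers are illustrative only.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>subject</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>30.0</Hsp_bit-score>
              <Hsp_evalue>1.0</Hsp_evalue>
            </Hsp>
            <Hsp>
              <Hsp_bit-score>25.0</Hsp_bit-score>
              <Hsp_evalue>30.0</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""

def hsps_per_hit(xml_text):
    """Return [(hit_def, [(bit_score, evalue), ...]), ...] for each <Hit>."""
    root = ET.fromstring(xml_text)
    result = []
    for hit in root.iter("Hit"):
        hsps = [
            (float(hsp.findtext("Hsp_bit-score")), float(hsp.findtext("Hsp_evalue")))
            for hsp in hit.iter("Hsp")
        ]
        result.append((hit.findtext("Hit_def"), hsps))
    return result

for hit_def, hsps in hsps_per_hit(SAMPLE_XML):
    print(hit_def, "has", len(hsps), "HSP(s):", hsps)
```

Running this on the XML from both e-value settings and comparing the HSP lists per hit would show whether the extra alignment appears as a separate HSP in the second run.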


Thanks! I didn't even think of that! Also, thanks for the additional tip on how to check it. =)

12.9 years ago
Ketil 4.1k

Looks like a BLAST bug (or feature)? I'd be tempted to think that BLAST would adjust its stringency based on the e-value, but you seem to have all the other parameters nailed down. BLAST uses a heuristic, so perhaps some non-determinism is to be expected, especially as your e-value thresholds are high. At any rate, I think failing to report an alignment with an e-value of 0.090 must be considered a bug.

What happens if you switch the sequences? What happens if you use other e-values?
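
If you try several e-values, diffing the tabular output makes the differences easy to spot. The sketch below is a hypothetical helper (parse_hits is my own name) that keys each row on its four coordinates. As input it uses only the 0.090 rows from the two runs in the question, and it indexes columns from the end of the line because the rows shown there have no subject id column.

```python
# Excerpt: the rows with e-value 0.090 from the -e 10.11 run in the question.
RUN_E10 = """\
QUERY 91.30 23 2 0 35616154 35616176 66656321 66656343 0.090 33.4
QUERY 90.32 31 2 1 35622387 35622416 66666325 66666355 0.090 33.4
QUERY 100.00 17 0 0 35636434 35636450 66679138 66679154 0.090 33.4
"""

# Excerpt: the rows with e-value 0.090 from the -e 50.11 run in the question.
RUN_E50 = """\
QUERY 91.30 23 2 0 35616154 35616176 66656321 66656343 0.090 33.4
QUERY 90.32 31 2 1 35622387 35622416 66666325 66666355 0.090 33.4
QUERY 96.00 25 0 1 35625785 35625809 66672077 66672100 0.090 33.4
QUERY 100.00 17 0 0 35636434 35636450 66679138 66679154 0.090 33.4
"""

def parse_hits(text):
    """Map (q_start, q_end, s_start, s_end) -> (evalue, bit_score)."""
    hits = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue
        try:
            # The last six columns are q. start, q. end, s. start, s. end,
            # e-value and bit score.
            coords = tuple(int(x) for x in fields[-6:-2])
            hits[coords] = (float(fields[-2]), float(fields[-1]))
        except ValueError:
            continue  # skip header lines such as "Fields: ..."
    return hits

e10 = parse_hits(RUN_E10)
e50 = parse_hits(RUN_E50)

only_in_e50 = sorted(set(e50) - set(e10))
only_in_e10 = sorted(set(e10) - set(e50))
print("only in -e 50 run:", only_in_e50)
print("only in -e 10 run:", only_in_e10)
```

The same comparison works after swapping -i and -j, as long as the query and subject coordinates are swapped in the key as well.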
