Question: Why Do Blastall And Blast+ (Blastn) Give Different Results For What I Believe Is The Same Search?
David M wrote:

I've recently run two searches, one with blastall (from legacy blast) and one with modern blastn. Both versions are 2.2.25 (the most recent).

The blastall search is: blastall -i test.fasta -p blastn -d old_formatted_database -m 8 -e 1e-10

The blastn search is: blastn -query test.fasta -db new_formated_database -outfmt 6 -evalue 1e-10

Note that each database was created with its program's own formatting utility (formatdb for legacy BLAST, makeblastdb for BLAST+).

The blastn search yields 3 results, while the blastall search results in over 50. Why is this? I realize the programs themselves are not readily comparable, but I was under the impression that their output should be similar since they're based on the same scoring methods.

From a more theoretical point of view, shouldn't there be an exact number of hits matching a given query sequence and scoring scheme? Shouldn't these all be reported regardless of the blast approach used?

blast • 5.3k views
modified 6.7 years ago by Jon Binkley • written 6.7 years ago by David M

Can you upload and share your data so this can be reproduced (both databases and test.fasta)? If that is not appropriate, could you make another test database? Otherwise, I would guess, or hope, that some differing parameter defaults explain this. Also, what happens if you run a query identical to the blastall one through the legacy command wrapper in BLAST+?

written 6.7 years ago by Michael Dondrup

I can't upload the data, but I'll make a test DB later today. I did try running the query in the legacy wrapper, and I got a 3rd, different set of results (over 150 hits). I'm also curious about the defaults; I suspect this is where the difference arises. I was hoping someone here might have some insight into the differences in these defaults.

written 6.7 years ago by David M

Was the query particularly short or long? Or the database huge? I ask because BLAST+ will automatically alter "-task" to enable megablast or blastn-short modes instead of regular blastn, whereas legacy BLAST has megablast as a separate application.

written 6.7 years ago by Torst

Also, low complexity filtering could be different between BLAST and BLAST+, limiting search results in one of them. Use "blastall -F F" and "blastn -dust no" to compare properly.
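
As a rough sketch of that comparison, the two commands from the question could be wrapped as below with filtering switched off. The file and database names are the ones given in the question; the function names are made up here for illustration.

```shell
# Sketch only: query and database names are copied from the question;
# the function names are hypothetical.
# -F F disables low-complexity filtering in legacy blastall.
run_legacy_unfiltered() {
    blastall -p blastn -i test.fasta -d old_formatted_database -m 8 -e 1e-10 -F F
}

# -dust no disables DUST low-complexity filtering in BLAST+ blastn.
run_plus_unfiltered() {
    blastn -query test.fasta -db new_formated_database -outfmt 6 -evalue 1e-10 -dust no
}
```

Each function writes tabular hits to standard output, so something like `run_legacy_unfiltered > legacy_unfiltered.tsv` (an arbitrary output name) would save a run for later comparison.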

written 6.7 years ago by Torst

Any news on the test-case?

written 6.7 years ago by Michael Dondrup
Jon Binkley wrote:

BLAST+ has a "task" option for blastn and blastp searches. The available blastn tasks include "blastn", "blastn-short", "megablast", and "dc-megablast". For some reason, the default task for blastn is not blastn, but megablast! So, to get it to work like your blastall search, your BLAST+ command should be:

blastn -task blastn -query test.fasta -db new_formated_database -outfmt 6 -evalue 1e-10
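
One way to check whether the two runs then agree is to compare the query/subject pairs in their tabular outputs. The sketch below assumes each program's output was saved to a file; the function name and the file names in the usage line are hypothetical.

```shell
# Sketch: compare the query/subject pairs reported in two tabular BLAST outputs.
# Both -m 8 (blastall) and -outfmt 6 (blastn) put the query ID in column 1
# and the subject ID in column 2, separated by tabs.
compare_hits() {
    legacy_out=$1   # saved output of the blastall run (hypothetical file)
    plus_out=$2     # saved output of the blastn run (hypothetical file)
    tmpdir=$(mktemp -d)

    # Reduce each file to its unique, sorted query/subject pairs.
    cut -f1,2 "$legacy_out" | sort -u > "$tmpdir/legacy"
    cut -f1,2 "$plus_out"   | sort -u > "$tmpdir/plus"

    # comm needs sorted input: -12 keeps shared lines, while -23 and -13
    # keep lines unique to the first and second file respectively.
    echo "shared: $(comm -12 "$tmpdir/legacy" "$tmpdir/plus" | wc -l | tr -d ' ')"
    echo "legacy only: $(comm -23 "$tmpdir/legacy" "$tmpdir/plus" | wc -l | tr -d ' ')"
    echo "blast+ only: $(comm -13 "$tmpdir/legacy" "$tmpdir/plus" | wc -l | tr -d ' ')"

    rm -r "$tmpdir"
}
```

For example, `compare_hits blastall_hits.tsv blastn_hits.tsv` prints three counts. It compares pairs rather than individual alignments, so several HSPs against the same subject count once.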

written 6.6 years ago by Jon Binkley

whoa! that's unexpected!

written 6.6 years ago by Istvan Albert