What's faster? "blastall -p blastp" or "blastp"?
10.6 years ago
Lee Katz ★ 3.1k

I am just wondering if anyone has done a timed test and/or a memory test with blastall compared to blastp. In other words, classic BLAST vs BLAST+. It'd be really interesting to get some kind of frequency distribution of the times it takes to blast one set of genes vs a database using each of the tools. The comparison could include any flavor of blast.

I know it's something I can do myself, but I'm being a little lazy about it and it could be something interesting that the biostar community can get together on.

blast blast comparison frequency • 4.2k views

I haven't done the formal test you're requesting, but I have seen a steady increase in performance with each new release. Unfortunately, I've sometimes seen bugs in the newer releases too.

10.6 years ago

This is pretty crude, but here we go. A very rapid comparison using nt and a 219-base query sequence:

$ time blastall -p blastn -i gene.fa -d /data/blastdb/nt -o out.blst
32.19s user 3.01s system 99% cpu 35.469 total

$ time blastn -db /data/blastdb/nt -query gene.fa -out out.blstpl
11.61s user 2.22s system 99% cpu 13.846 total


So, from this it looks like BLAST+ is about 2.6x faster in wall-clock time (13.8 s vs 35.5 s) for a single sequence vs nt.

This is good, but I think we can do better. The BLAST+ paper reports that significant speed-ups are possible when querying with long sequences. So, let's try a whole genome (even if it's only a bacterial one)...

$ time blastall -a 4 -p blastn -i NC_011353.fna -d /data/blastdb/nt -o out.blst
46115.20s user 24.96s system 388% cpu 3:17:58.91 total

$ time blastn -num_threads 4 -db /data/blastdb/nt -query NC_011353.fna -out out.blstpl
1462.23s user 7.98s system 233% cpu 10:29.37 total


This obviously isn't like-for-like with the first test: I ran these on 4 CPUs because I suspected they might take a while, and I was right. BLAST+ needs >30x less user CPU time here, and finishes about 19x sooner in wall-clock time (roughly 10.5 minutes vs 3.3 hours). I would be interested to see if the reported reduction in memory usage is achieved too.
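For anyone who wants to check the arithmetic, here is a small sketch that turns the four `time` lines above into speed-up ratios. The helper name `parse_time_line` is made up for illustration, and the regular expression assumes the `Xs user Ys system Z% cpu T total` layout shown above (other shells print `time` output differently).

```python
import re

def parse_time_line(line):
    """Parse 'Xs user Ys system Z% cpu T total' into seconds (hypothetical helper)."""
    m = re.search(
        r"([\d.]+)s user\s+([\d.]+)s system\s+\d+% cpu\s+([\d:.]+) total", line
    )
    if m is None:
        raise ValueError(f"unrecognised time output: {line!r}")
    user, system, total = m.groups()
    # 'total' may be SS.sss, MM:SS.ss or H:MM:SS.ss; fold it into seconds.
    wall = 0.0
    for part in total.split(":"):
        wall = wall * 60 + float(part)
    return {"user": float(user), "system": float(system), "wall": wall}

# Timings copied from the two tests in this answer: (blastall, BLAST+).
runs = {
    "single query": (
        "32.19s user 3.01s system 99% cpu 35.469 total",
        "11.61s user 2.22s system 99% cpu 13.846 total",
    ),
    "genome, 4 threads": (
        "46115.20s user 24.96s system 388% cpu 3:17:58.91 total",
        "1462.23s user 7.98s system 233% cpu 10:29.37 total",
    ),
}

ratios = {}
for label, (old, new) in runs.items():
    a, b = parse_time_line(old), parse_time_line(new)
    ratios[label] = (a["wall"] / b["wall"], a["user"] / b["user"])
    wall_x, user_x = ratios[label]
    print(f"{label}: wall-clock {wall_x:.1f}x, user CPU {user_x:.1f}x")
```

The wall-clock and user-CPU ratios differ for the genome run because the two programs did not keep the four CPUs equally busy (388% vs 233% in the output above).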

So it looks like for this test BLAST+ is faster (further scenarios are also offered in the paper linked above).
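Since the question asked for a distribution of timings rather than a single run, a repeatable version of this comparison could be sketched as below. This is only a rough harness, not something I ran for the numbers above: the function names (`time_command`, `summarize`, `compare`) are made up for illustration, the BLAST command lines are the single-query ones from this answer, and the demo at the bottom times a do-nothing Python process so the script runs even without BLAST installed.

```python
import statistics
import subprocess
import sys
import time

def time_command(cmd, repeats=5):
    """Run cmd `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings

def summarize(timings):
    """Reduce a list of timings to a few summary statistics."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }

def compare(commands, repeats=5):
    """Time every named command and return its summary."""
    return {name: summarize(time_command(cmd, repeats))
            for name, cmd in commands.items()}

# Command lines taken from the single-query test above; paths are site-specific.
COMMANDS = {
    "blastall": ["blastall", "-p", "blastn", "-i", "gene.fa",
                 "-d", "/data/blastdb/nt", "-o", "out.blst"],
    "blastn": ["blastn", "-db", "/data/blastdb/nt",
               "-query", "gene.fa", "-out", "out.blstpl"],
}

# Harmless stand-in so the sketch runs anywhere; pass COMMANDS to time the real tools.
demo = compare({"noop": [sys.executable, "-c", "pass"]}, repeats=3)
print(demo)
```

The first run against a large database like nt may be slower than later ones if the database files are not yet in the operating system's file cache, so it is probably worth looking at the spread and not just the mean.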