BLAST: Influence of database file size on performance, and purpose of the -max_file_sz option
11.9 years ago

Obviously BLAST is going to have better on-average performance on a smaller database than on a larger one. But for a particular large database (let's say tens of GB), how does splitting the database into smaller files affect the performance of subsequent BLAST searches? The default is to split the database into files no larger than 1GB, but is performance significantly affected if I decide to split the same database into chunks of 0.5GB or 5GB? It doesn't seem like this would make much difference...

...which leads to the second part of my question: what is the purpose of the max_file_sz option? Does file size indeed affect performance? Or is this perhaps a holdover from the days when 32-bit architectures placed constraints on file size?

11.9 years ago
Niek De Klein ★ 2.6k

One use for max_file_sz (I don't know whether the option was introduced for this reason) is running BLAST on a GPU. GPUs typically have less global memory than the host machine has RAM, so it may sometimes be necessary to lower the maximum file size (see the GPU_BLAST README for more details). I haven't seen it used directly as a performance improvement.

I don't know how splitting the database affects performance (you could test it and post the results here ;)
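
For anyone who wants to run that test, here is a minimal Python sketch of one way to do it, not a definitive benchmark. It assumes the BLAST+ tools `makeblastdb` and `blastn` are on the PATH and that the database is nucleotide. The function names, the `db_<size>` prefixes, and the output file names are made up for illustration. It formats the same FASTA file with several `-max_file_sz` values and times the same search against each copy.

```python
import subprocess
import time
from typing import Callable, Dict, List, Sequence


def makeblastdb_cmd(fasta: str, out_prefix: str, max_file_sz: str,
                    dbtype: str = "nucl") -> List[str]:
    """Build a makeblastdb command that caps each volume at max_file_sz."""
    return ["makeblastdb", "-in", fasta, "-dbtype", dbtype,
            "-out", out_prefix, "-max_file_sz", max_file_sz]


def blastn_cmd(query: str, db_prefix: str, out_path: str) -> List[str]:
    """Build a blastn search command with tabular output."""
    return ["blastn", "-query", query, "-db", db_prefix,
            "-out", out_path, "-outfmt", "6"]


def time_command(cmd: Sequence[str], runner: Callable = subprocess.run) -> float:
    """Run cmd once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    runner(cmd, check=True)
    return time.perf_counter() - start


def benchmark(fasta: str, query: str, sizes: Sequence[str],
              runner: Callable = subprocess.run,
              repeats: int = 3) -> Dict[str, float]:
    """Format the database once per size, then time the same search.

    The best of `repeats` runs is kept to reduce noise from disk caching;
    this is a rough comparison, not a rigorous measurement.
    """
    results: Dict[str, float] = {}
    for size in sizes:
        prefix = f"db_{size}"  # hypothetical naming scheme
        runner(makeblastdb_cmd(fasta, prefix, size), check=True)
        search = blastn_cmd(query, prefix, f"hits_{size}.tsv")
        results[size] = min(time_command(search, runner) for _ in range(repeats))
    return results
```

A call such as `benchmark("mydb.fasta", "queries.fasta", ["500MB", "1GB", "2GB"])` would return a dictionary mapping each size to its best search time. The file names here are placeholders, and the largest `-max_file_sz` value that `makeblastdb` accepts may depend on the BLAST+ version, so larger values such as 5GB may be rejected.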
