Question: How Reproducible Are The Results Of Programs Compiled With BLAST Libs Vs Calling The NCBI Binary?
pld (4.9k, United States) wrote 6.7 years ago:

Is there a risk that using the blast libs to do blast work could result in reproducibility issues if someone were to spot check the results with the NCBI executable? I am working on a program to do ortholog mining (and other things) that relies on BLAST as a key tool. Using the libraries directly gives me the best control over the behavior of the program, but I'm worried that there could be a case where my implementation of BLAST gives a different result than the NCBI compiled binary. Is this a reasonable concern, or will it be fine?

The other option is to make system calls, but using system() makes me cringe.

Tags: C, blast
modified 6.6 years ago by Biostar ♦♦ 20 • written 6.7 years ago by pld 4.9k

I think it all comes down to what reproducible means: does it have to produce exactly the same answer every time, or does it have to support the same biological discovery? For the former, you would need to apply all the same parameters and default settings that the command-line invocation does.

I think you should be fine as long as the results are correct, because there may be multiple correct answers.
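One way to make the "same parameters" requirement checkable is to pin every setting explicitly and log the exact command, so that anyone spot-checking with the NCBI executable can rerun it verbatim. The sketch below is a rough illustration, not a recommended configuration: the function names and the `PINNED_PARAMS` dictionary are hypothetical, and the values are placeholders rather than a statement of NCBI's defaults.

```python
import shlex

# Hypothetical pinned settings. The flags are standard blastp options,
# but the values are illustrative only, not claimed to be NCBI defaults.
PINNED_PARAMS = {
    "evalue": "1e-5",
    "matrix": "BLOSUM62",
    "gapopen": "11",
    "gapextend": "1",
    "outfmt": "6",
}


def build_blastp_argv(query, db, params=PINNED_PARAMS):
    """Return an argv list for blastp with every parameter stated explicitly."""
    argv = ["blastp", "-query", query, "-db", db]
    # Sort the names so the same settings always yield the same command line.
    for name in sorted(params):
        argv += ["-" + name, str(params[name])]
    return argv


def as_shell_line(argv):
    """Render argv as a copy-pasteable shell command for a run log."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

Writing the output of `as_shell_line(build_blastp_argv(...))` into the results file gives a reviewer the exact invocation to compare against, whichever route produced the hits.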

written 6.7 years ago by Istvan Albert ♦♦ 85k

Not many people use the NCBI toolkit; virtually everyone just parses BLAST output. That's not to say your approach is wrong, or not reproducible. It's just not that common.

modified 11 months ago by _r_am30k • written 6.6 years ago by Jeremy Leipzig 19k

That's what I ended up going with: spawning subprocesses that run the pre-compiled binaries. It caused some headaches with the streams not flushing correctly, but the main concern (performance) wasn't an issue. 480 Xeon cores will let me do an all-vs-all BLAST for two mammalian species in under 3 minutes.
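For anyone taking the same subprocess route, here is a minimal Python sketch of spawning a binary without a shell while letting the library handle the pipes. It is not the code used above, and it assumes the stream trouble comes from reading and writing pipes by hand, which is only one possible cause. The helper name `run_command` is hypothetical.

```python
import subprocess


def run_command(argv, stdin_text=None, timeout=None):
    """Run argv directly (no shell) and return (returncode, stdout, stderr).

    subprocess.run feeds stdin and drains stdout/stderr together, which
    avoids the deadlocks and partial reads that manual pipe handling can cause.
    """
    completed = subprocess.run(
        argv,
        input=stdin_text,      # text sent to the child's stdin, if any
        capture_output=True,   # collect stdout and stderr in full
        text=True,             # decode the streams as str instead of bytes
        timeout=timeout,
        check=False,           # leave exit-code handling to the caller
    )
    return completed.returncode, completed.stdout, completed.stderr
```

A call might look like `run_command(["blastp", "-query", "query.faa", "-db", "mydb", "-outfmt", "6"])`, where `query.faa` and `mydb` are placeholder names. Passing an argument list rather than a single string also sidesteps the quoting problems that come with `system()`.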

written 6.6 years ago by pld 4.9k