How Reproducible Are The Results Of Programs Compiled With The BLAST Libs Vs Calling The NCBI Binary?
7.6 years ago
pld 5.0k

Is there a risk that using the BLAST libraries directly could cause reproducibility issues if someone were to spot-check the results with the NCBI executable? I am working on a program for ortholog mining (among other things) that relies on BLAST as a key tool. Using the libraries directly gives me the best control over the program's behavior, but I'm worried that there could be cases where my implementation of BLAST gives a different result than the NCBI-compiled binary. Is this a reasonable concern, or will it be fine?

The other option is to make system calls, but using system() makes me cringe.

blast c • 1.6k views

I think it comes down to what reproducible means: does it have to produce exactly the same answer every time, or does it only have to support the same biological discovery? For the former, you would need to make sure you apply all the same parameters and default settings that the command-line invocation does.

I think you should be fine as long as the results are correct, because there may be multiple correct answers.
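
To make the two notions of reproducibility concrete, here is a minimal Python sketch of a spot check. It assumes both runs were saved in BLAST's standard 12-column tabular format (-outfmt 6, with the bit score in the last column); the function names best_hits and compare_runs are made up for illustration and are not part of any NCBI tool.

```python
from typing import Dict, Tuple


def best_hits(tabular: str) -> Dict[str, Tuple[str, float]]:
    """Map each query ID to the (subject ID, bit score) of its top-scoring hit.

    Assumes the standard 12-column tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        # Keep the first hit seen on ties, so tied hits can legitimately differ.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def compare_runs(run_a: str, run_b: str) -> str:
    """Classify two runs as 'identical', 'same best hits', or 'different'."""
    if run_a == run_b:
        return "identical"  # strict, byte-for-byte reproducibility
    subjects_a = {q: s for q, (s, _) in best_hits(run_a).items()}
    subjects_b = {q: s for q, (s, _) in best_hits(run_b).items()}
    if subjects_a == subjects_b:
        return "same best hits"  # same conclusion, different formatting or order
    return "different"


if __name__ == "__main__":
    cols = ["q1", "s1", "98.0", "100", "2", "0", "1", "100", "1", "100", "1e-50", "180.0"]
    run_a = "\t".join(cols) + "\n"
    run_b = "\t".join(cols[:11] + ["180"]) + "\n"  # same hit, score printed differently
    print(compare_runs(run_a, run_b))
```

Tied bit scores are exactly the "multiple correct answers" case: two runs can report different but equally good subjects for the same query, and this sketch would flag them as different even though neither is wrong.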


Not many people use the NCBI toolkit; virtually everyone just parses BLAST output. That's not to say your approach is wrong or not reproducible. It's just not that common.


That's what I ended up going with: spawning subprocesses that run the pre-compiled binaries. It caused some headaches with the streams not flushing correctly, but the main concern (performance) wasn't an issue. 480 Xeon cores let me do an all-to-all BLAST for two mammalian species in under 3 minutes.
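
For anyone hitting similar stream problems, here is a rough Python sketch of the subprocess approach (not the code I actually used). It spells out every BLAST option explicitly so the run can be repeated by hand, and it lets subprocess.run drain stdout and stderr together so neither pipe fills up and stalls the child. The helper names build_blastp_cmd and run_tool, the default values, and any file or database names you pass in are assumptions for illustration; the flags themselves are standard BLAST+ options.

```python
import subprocess
from typing import List, Tuple


def build_blastp_cmd(query_fasta: str, db: str,
                     evalue: float = 1e-5, threads: int = 1) -> List[str]:
    """Build an explicit blastp command line (the defaults here are illustrative)."""
    return [
        "blastp",
        "-query", query_fasta,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "6",              # 12-column tabular output
        "-num_threads", str(threads),
    ]


def run_tool(cmd: List[str]) -> Tuple[str, str]:
    """Run a command without a shell and return (stdout, stderr) as text.

    capture_output=True reads both pipes until the child exits, which avoids
    the partial-output problems that come from reading one pipe at a time.
    check=True raises CalledProcessError on a non-zero exit status.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return proc.stdout, proc.stderr


if __name__ == "__main__":
    # Printing the exact command gives a record that can be re-run by hand.
    print(" ".join(build_blastp_cmd("queries.fasta", "target_db")))
```

Passing the arguments as a list also sidesteps the shell entirely, which is most of what makes system() uncomfortable in the first place.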