Forum: How do you perform functional testing for your Bioinformatics tools?
2.1 years ago by shenwei356 (4.6k) wrote:

Hi all,

Functional testing is essential to guarantee software reliability.

But I find it takes a huge effort to write hundreds or even thousands of test cases to cover all potential usage scenarios. This is especially true for utility tools like bedtools2, bedops, and my own seqkit and csvtk, which have 10+ or 20+ commands.

Bedtools2 creates testing shell scripts for all commands, and automates the tests via Travis CI, which is a very good practice.

I follow @brentp's approach and use the shell test framework ssshtest to write test scripts (example). Travis is also used to test the source code and functions.
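The same style of functional test (run the command, then check exit code and output) can be sketched in Python. This is a minimal illustration, not ssshtest or the seqkit test suite: `run_cli` and `TOOL` are hypothetical names, and the "tool" is a stand-in one-liner that upper-cases its stdin so the example runs anywhere.

```python
import subprocess
import sys


def run_cli(args, stdin_text=""):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    proc = subprocess.run(
        args,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=False,  # we assert on the exit code ourselves
    )
    return proc.returncode, proc.stdout, proc.stderr


# Hypothetical stand-in for a real command-line tool: a Python one-liner
# that upper-cases whatever arrives on stdin.
TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def test_uppercases_sequence():
    code, out, _ = run_cli(TOOL, "acgtn\n")
    assert code == 0
    assert out == "ACGTN\n"


def test_empty_input_gives_empty_output():
    code, out, _ = run_cli(TOOL, "")
    assert code == 0
    assert out == ""


if __name__ == "__main__":
    test_uppercases_sequence()
    test_empty_input_gives_empty_output()
```

Replacing `TOOL` with the real command line of one subcommand gives one test per usage scenario; a test runner such as pytest would pick up the `test_` functions as written.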

However, I did not start writing tests at the very beginning of development :(, so it is impossible to finish writing them in a short time. As a consequence, bugs come up frequently, and I have to fix them and release new versions often (hence the long release history of seqkit within one year).

Can you share your practices for testing bioinformatics software?

Wishing your tools no bugs,


PS: Can any Mac user tell me why the command paste -sd"+" or paste -s -d "+" fails on OS X?

modified 2.1 years ago by Istvan Albert ♦♦ 80k • written 2.1 years ago by shenwei356 (4.6k)

paste works as expected for me on macOS Sierra. What kind of error do you get?

written 2.1 years ago by Jean-Karim Heriche (18k)

No error details, but the result is blank (nothing). E.g.:

echo -e "acgtn\nACGTN" | paste -s -d "+"

This command outputs acgtn+ACGTN on Linux, but it outputs nothing on OS X (version unknown) according to the Travis log.

In the end, I used md5sum to compare contents and the tests passed.

echo -e "acgtn\nACGTN" | md5sum | cut -d " " -f 1
written 2.1 years ago by shenwei356 (4.6k)

It is a slightly different paste that does not automatically recognize that standard input is being used, so it requires you to specify - to indicate that the input comes from stdin:

echo -e "acgtn\nACGTN" | paste -s -d "+" -
modified 2.1 years ago • written 2.1 years ago by Istvan Albert ♦♦ 80k

That's right, and on my Mac, omitting - (or a file name) gives a usage error:

echo -e "acgtn\nACGTN" | paste -s -d "+"
usage: paste [-s] [-d delimiters] file ...

echo -e "acgtn\nACGTN" | paste -s -d "+" -

modified 2.1 years ago • written 2.1 years ago by Jean-Karim Heriche (18k)

Got it! Thank you, Istvan!

written 2.1 years ago by shenwei356 (4.6k)
2.1 years ago by Istvan Albert ♦♦ 80k (University Park, USA) wrote:

Based on my own experience over longer periods of time, I have come to believe that the so-called "test-first" philosophy is misguided and leads to substantial amounts of wasted effort. The vast majority of the tests are too simple and the code never breaks around them. You look back five years later and wonder why you did all that work, since it was never needed: basically, it never caught anything.

I think testing needs to be done as part of the documentation process: the code is tested while demonstrating and explaining the inputs and outputs of the program. This covers the 90% of situations where testing helps, and if the usage changes you want that to be reflected in the documentation as well.
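In Python, one way to tie tests to documentation like this is the standard library's doctest module, which runs the examples embedded in a docstring. The sketch below is only an illustration of that idea: `gc_content` and `check_docstring_examples` are hypothetical names, not code from any tool mentioned in this thread.

```python
import doctest


def gc_content(seq):
    """Return the GC fraction of a nucleotide sequence.

    The examples below are both documentation and tests:

    >>> gc_content("ACGT")
    0.5
    >>> gc_content("acgtn")
    0.4
    >>> gc_content("")
    0.0
    """
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_docstring_examples(func):
    """Run the doctest examples of one function and return the results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # TestResults with .failed and .attempted


result = check_docstring_examples(gc_content)
assert result.failed == 0
```

For a whole file, running `python -m doctest yourmodule.py` does the same for every docstring, so a change in usage that is not reflected in the documented examples shows up as a failure.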

Now, since it is impossible to foresee which section of the code will cause problems, I have found that reactive testing works very well: when I have a bug, I add a test for it and perhaps explore the functionality around that use case. It turns out that bugs usually congregate in the same regions.
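A reactive test of this kind might look like the following sketch. The bug report is invented for illustration (a hypothetical `reverse_complement` that once passed lowercase bases and N through unchanged): the first test pins the reported case, and the second explores inputs around it.

```python
# Translation table covering upper- and lowercase bases plus N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def test_regression_lowercase_and_n():
    # Hypothetical bug report: lowercase bases and N were not complemented.
    assert reverse_complement("acgtn") == "nacgt"
    assert reverse_complement("ACGTN") == "NACGT"


def test_neighbouring_cases():
    # Explore the area around the bug: empty and mixed-case input.
    assert reverse_complement("") == ""
    assert reverse_complement("AcGt") == "aCgT"


if __name__ == "__main__":
    test_regression_lowercase_and_n()
    test_neighbouring_cases()
```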

written 2.1 years ago by Istvan Albert ♦♦ 80k

Interesting topic... But I personally disagree with Istvan. I cannot claim long or deep experience, but I have found test-driven development (the test-first philosophy) to be good. It took (actually, it's still taking) me a while to get my head around it, but this book was very useful.

By writing tests first, you are forced to establish what your method or class should do without worrying upfront about how it does it; that's of secondary importance. I think one is often tempted to test the implementation of the program rather than its behaviour, and this leads to useless tests that break too easily.

Tests that show what a method does (again, forget about the how) are also useful as stand-alone examples and they can be part of the documentation. In fact, test examples ensure that code and documentation are in sync.

Finally, in my experience, even dead simple tests catch a good number of bugs when you change code elsewhere.

Just my 2p, curious to know others' experience...

written 2.1 years ago by dariober (10.0k)

I support test-driven development, because it preserves reliability as more functions are added, which may bring code changes everywhere. I will add as many tests as possible (package unit tests and software functional tests) for foreseeable cases from the very beginning of development, not just documentation.

Thank you all. :)

written 2.1 years ago by shenwei356 (4.6k)
2.1 years ago by John (12k) wrote:

Testing is a relatively new concept where you tell your program to check that it does what you expect it to do. For example, in Python:

a = 1
b = 2
assert a+b == 3

Despite the current wide support for unit tests, there has been little evidence that tests actually prevent unexpected outcomes from programs in real-world situations. A lot of development shops do it because it gives a feeling of safety/productivity when you hit test and all those little green arrows get ticked off. That isn't to say tests have no value: any tool that forces a developer to look at what they expect and what is really happening will result in a better program. Testing is just one very basic and commonly not-properly-implemented way to do that. Type hints are another. Code review is another. Closely following an architectural design pattern is another.

But testing doesn't work well for file parsers, or for most things that work with data, because file parsers do not have a defined input/output. Or rather, the range of potential inputs is unlimited, and thus the number of tests must be unlimited for a "successful test" to mean what developers think it means: the code works as expected. A parser can pass a million tests but still fail when something unexpected appears in user data. This is because the developer didn't write the data. Actually, it goes one step further than that: a sufficiently complex parser can end up being Turing complete, and thus data can program the parser to perform any calculation the data wishes. The data controls the program. It is impossible to write tests for a program you do not control.

There are even more modern, even more highly recommended, and even more not-properly-implemented options a developer could take. For example, LangSec is a movement dedicated to formally defining what makes a bad file format from a security point of view, although security has only been chosen because it's the most practical reason not to write bad file formats. But the recommendations of LangSec apply to writing testable code too. This video seems to cover the major topics in the first 5 minutes:

So, if writing testable code for file parsers depends on the file format itself, then asking "how do we guarantee software reliability in bioinformatics?" is like asking how one holds ice without melting it. You cannot write a FASTA parser that is testable, because your definition of FASTA is not the same as someone else's. What does a tool do when it encounters a FASTA file with a ">" as the 121st character of the sequence? Split it onto a new line, as the spec says rows must be wrapped to 120 characters, resulting in a new entry? Report an error that ">" shouldn't be in the sequence, even though the spec says it should be ignored? Wrap the line to 80 characters (which is also valid FASTA)? Not perform any wrapping at all (as many consider linearised FASTA to be valid FASTA these days)?
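The ">" case can be made concrete with a small sketch. `parse_fasta` and `wrap` below are hypothetical helpers that encode just one possible reading of FASTA (any line starting with ">" begins a new record); they are not taken from any real tool. The same sequence yields one record when left unwrapped and two records after wrapping at 120 characters.

```python
def parse_fasta(text):
    """Parse FASTA text into (name, sequence) pairs.

    Assumption: any line starting with '>' begins a new record.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            if name is None:
                raise ValueError("sequence data before first header")
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def wrap(seq, width):
    """Break a sequence into lines of at most `width` characters."""
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))


# A sequence whose 121st character is '>'.
seq = "A" * 120 + ">" + "CGT"

linear = ">seq1\n" + seq + "\n"              # no wrapping
wrapped = ">seq1\n" + wrap(seq, 120) + "\n"  # wrapped at 120 characters

# Same data, different wrapping, different answer.
assert parse_fasta(linear) == [("seq1", seq)]
assert parse_fasta(wrapped) == [("seq1", "A" * 120), ("CGT", "")]
```

Both assertions pass, so a test suite built on either file would report success while the two files disagree about how many sequences exist.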

And FASTA is one of the simpler formats! Try writing a parser for BAM, VCF, etc. I'm not sure it's actually possible. ValidateSamFile tried, and it flags pretty much everything as invalid :P And my definition of invalid is probably different from yours, which is different from the specification, etc.

Long story short, writing reliable software that has to parse unreliable data formats is fundamentally impossible. Unit-tests will not save us :P

written 2.1 years ago by John (12k)

This is a problem that comes up from time to time with my convert2bed.

If I'm lucky, the format has some kind of specification I can follow. Even then, some labs put out datasets with a little tweak that breaks the parser, and I need to either patch in a fix for it, if it is a big lab, or patch in an error message otherwise. If I'm unlucky, the format has no specification and I have to "reverse-engineer" it from real-world examples; and then some other real-world file comes along that doesn't work with the parsing logic.

If we keep bioinformatics data in text form, I hope that JSON and versioned JSON Schema schemes can replace current formats. A consistent scheme would at least decide the ground rules with a little more clarity, and versioning would allow for modifications down the road.
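As a rough sketch of what a versioned record could buy a parser, the example below checks a version field before interpreting anything else. It uses only the standard library's json module with a hand-written check, not an actual JSON Schema validator, and every field name (`schema_version`, `chrom`, `start`, `end`, `strand`) and version number is an assumption made for illustration.

```python
import json

# Hypothetical schema versions this reader understands.
SUPPORTED_VERSIONS = {1, 2}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def load_interval(text):
    """Parse one interval record, rejecting unknown schema versions."""
    record = json.loads(text)
    if not isinstance(record, dict):
        raise ValueError("record must be a JSON object")

    version = record.get("schema_version")
    if not _is_int(version) or version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported schema_version: {version!r}")

    chrom, start, end = record.get("chrom"), record.get("start"), record.get("end")
    if not isinstance(chrom, str) or not _is_int(start) or not _is_int(end):
        raise ValueError("chrom must be a string; start and end must be integers")
    if start < 0 or end < start:
        raise ValueError("require 0 <= start <= end")

    # Hypothetical version 2 adds an optional strand field.
    strand = record.get("strand", ".") if version >= 2 else "."
    return chrom, start, end, strand


v1 = '{"schema_version": 1, "chrom": "chr1", "start": 10, "end": 20}'
v2 = '{"schema_version": 2, "chrom": "chr1", "start": 10, "end": 20, "strand": "-"}'
assert load_interval(v1) == ("chr1", 10, 20, ".")
assert load_interval(v2) == ("chr1", 10, 20, "-")
```

A record from an unknown version fails loudly at the version check instead of being half-parsed, which is the kind of ground rule a versioned scheme would provide.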

written 2.1 years ago by Alex Reynolds (28k)