Question

Forum: How do you perform functional testing for your Bioinformatics tools?

7

8.3 years ago

shenwei356 8.7k

Hi all,

Functional testing is essential to guarantee software reliability.

But I find it takes a huge effort to write hundreds or even thousands of test cases to cover all the potential usage scenarios. This is especially true for utility tools like bedtools2, bedops, and my own seqkit and csvtk, which have 10+ or 20+ commands.

Bedtools2 creates testing shell scripts for all commands, and automates the tests via Travis CI, which is a very good practice.

I follow brentp's way and use the shell test framework ssshtest to write testing scripts (example). Travis is also used to test the source code and functions.

However, I did not start writing tests from the very beginning of development :(, so it's impossible to finish writing them in a short time. The consequence is that bugs come up frequently, and I have to fix them and release new versions often (hence the long release history of seqkit within one year).

Can you talk about your practice of bioinformatics software testing?

Wishing your tools no bugs,

Wei

PS: Can any Mac user tell me why the command paste -sd"+" or paste -s -d "+" fails on OS X?

development testing bug reliability • 4.1k views

ADD COMMENT • link updated 2.2 years ago by Ram 45k • written 8.3 years ago by shenwei356 8.7k

0

paste works as expected for me on macOS Sierra. What kind of error do you get?

ADD REPLY • link 8.3 years ago by Jean-Karim Heriche 27k

0

No error details, but the result is blank (nothing). E.g.:

echo -e "acgtn\nACGTN" | paste -s -d "+"

This command outputs acgtn+ACGTN on Linux, but it outputs nothing on OS X (version unknown) according to the Travis log.

In the end, I used md5sum to compare the contents, and the tests passed.

echo -e "acgtn\nACGTN" | md5sum | cut -d " " -f 1

ADD REPLY • link 8.3 years ago by shenwei356 8.7k

2

It is a slightly different paste that does not automatically recognize that standard input is being used; hence it requires that you specify - to indicate that the input comes from stdin:

echo -e "acgtn\nACGTN" | paste -s -d "+" -

ADD REPLY • link 8.3 years ago by Istvan Albert 102k

1

That's right and on my mac, omitting - (or a file name) gives a usage error:

echo -e "acgtn\nACGTN" | paste -s -d "+"
usage: paste [-s] [-d delimiters] file ...

echo -e "acgtn\nACGTN" | paste -s -d "+" -
acgtn+ACGTN

ADD REPLY • link 8.3 years ago by Jean-Karim Heriche 27k

0

Got it! Thank you, Istvan!

ADD REPLY • link 8.3 years ago by shenwei356 8.7k

score 8 · Answer 1 · 2017-03-12

8

8.3 years ago

Istvan Albert 102k

Based on my own experience over a long period of time, I have come to believe that the so-called "test-first" philosophy is misguided and leads to substantial amounts of wasted effort. The vast majority of the tests are too simple and the code never breaks around them. You look back five years later and wonder why you did all that work, since it was never needed; basically, it never caught anything.

I think testing needs to be done as part of the documentation process: the code is tested while demonstrating and explaining the inputs and outputs of the program. This covers 90% of the situations where testing helps, and if the usage changes you want that to be reflected in the documentation as well.
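
One way to combine documentation and testing like this in Python is the standard doctest module, which runs the examples written in a docstring and checks their output. The following is only a minimal sketch; the function reverse_complement is a hypothetical illustration and is not taken from any of the tools mentioned in this thread.

```python
import doctest


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence.

    The examples below serve as documentation and as tests:

    >>> reverse_complement("ACGTN")
    'NACGT'
    >>> reverse_complement("acgtn")
    'nacgt'
    >>> reverse_complement("")
    ''
    """
    # Hypothetical helper: map each base to its complement, keeping case.
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


if __name__ == "__main__":
    # Run every example found in the docstrings of this module.
    print(doctest.testmod())
```

If the documented usage changes, the docstring example fails until it is updated, which keeps the documentation and the code in step.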

Now, since it is impossible to foresee which section of the code will cause problems, I have found that reactive testing works very well: when I have a bug I add a test for it, and perhaps explore the functionality around that use case. It turns out that bugs usually congregate in the same regions.
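
As a rough sketch of what such a reactive test might look like in Python: assume a hypothetical function gc_content that once crashed on an empty sequence. Both the function and the bug are invented for illustration. After the fix, one test pins the reported case and a second explores behaviour close to it.

```python
def gc_content(seq):
    """Return the GC fraction of a sequence (hypothetical example function)."""
    if not seq:
        # Fix for the hypothetical bug report: an empty sequence used to
        # raise ZeroDivisionError in the division below.
        return 0.0
    gc = sum(1 for base in seq.upper() if base in "GC")
    return gc / len(seq)


def test_empty_sequence_returns_zero():
    # Added in response to the (hypothetical) bug; pins the fixed behaviour.
    assert gc_content("") == 0.0


def test_lowercase_input():
    # Explores functionality near the bug: lowercase bases are counted too.
    assert gc_content("acgt") == 0.5


if __name__ == "__main__":
    test_empty_sequence_returns_zero()
    test_lowercase_input()
    print("regression tests passed")
```

The test functions follow the test_ naming convention, so a runner such as pytest should be able to collect them as well.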

ADD COMMENT • link 8.3 years ago by Istvan Albert 102k

2

Interesting topic... But I personally disagree with Istvan. I cannot claim either a long or deep experience but I found test-driven development (test-first philosophy) good. It took (actually it's taking) me a while to get my head around it but this book was very useful.

By writing tests first you are forced to establish what your method or class should do without worrying upfront about how it does it; that's of secondary importance. I think one is often tempted to test the implementation of the program rather than the behaviour, and this leads to useless tests that break too easily.

Tests that show what a method does (again, forget about the how) are also useful as stand-alone examples and they can be part of the documentation. In fact, test examples ensure that code and documentation are in sync.

Finally, in my opinion/experience I find that even dead simple tests capture a good deal of bugs when you change code elsewhere.

Just my 2p, curious to know others' experience...

ADD REPLY • link 8.3 years ago by dariober 15k

0

I support test-driven development, because it preserves reliability as more functions are added, which may bring code changes everywhere. I will add as many tests as possible (package unit tests and software functional tests) for foreseeable cases from the very beginning of development, not just documentation.

Thank you all. :)

ADD REPLY • link 8.3 years ago by shenwei356 8.7k

score 7 · Answer 2 · 2017-03-12

Testing is a relatively new concept where you tell your program to check that it does what you expect it to do. For example, in Python:

a = 1
b = 2
assert a+b == 3

Despite the current wide support for unit tests, there has been little evidence that tests actually prevent unexpected outcomes from programs in real-world situations. A lot of development shops do it because it gives a feeling of safety/productivity when you hit "test" and all those little green arrows get ticked off. That isn't to say tests have no value: any tool that forces a developer to look at what they expect and what is really happening will result in a better program. Testing is just one very basic and commonly not-properly-implemented way to do that. Type hints are another. Code review, another. Closely following an architectural design pattern, another.

But testing doesn't work well for file parsers, or most things that work with data, because file parsers do not have a defined input/output. Or rather, the range of potential inputs is unlimited, and thus the number of tests must be unlimited for a "successful test" to mean what developers think it means: the code works as expected. A parser can pass a million tests but still fail when something unexpected appears in user data. This is because the developer didn't write the data. Actually, it goes one step further than that: a sufficiently complex parser can end up being Turing-complete, and thus data can program the parser to perform any calculation the data wishes. The data controls the program. It is impossible to write tests for a program you do not control.

There are even more modern, even more highly recommended, and even more not-properly-implemented options a developer could take. For example, LangSec is a movement dedicated to formally defining what makes a bad file format from the point of view of security, although security has only been chosen because it's the most practical reason not to write bad file formats. But the recommendations of LangSec apply to writing testable code too. This video seems to cover the major topics in the first 5 minutes:

So, if writing testable code for file parsers depends on the file format itself, then "how do we guarantee software reliability in bioinformatics?" is like asking how one holds ice without melting it. You cannot write a FASTA parser that is testable, because your definition of FASTA is not the same as someone else's. What does a tool do when it encounters a FASTA file with a ">" as the 121st character of the sequence? Split it onto a new line, as the spec says rows must be wrapped to 120 characters, resulting in a new entry? Report an error that ">" shouldn't be in the sequence, even though the spec says it should be ignored? Wrap the line to 80 characters (which is also valid FASTA)? Not perform any wrapping at all (as many consider linearised FASTA to be valid FASTA these days)?
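
The ">" case can be made concrete with a small Python sketch. It assumes a deliberately naive parser that treats any line starting with ">" as a new header; parse_fasta, wrap_fasta and the sequence data are all made up for illustration and do not represent any particular tool.

```python
def parse_fasta(text):
    """Naive parser: any line starting with '>' opens a new record."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif records:
            records[-1][1] += line.strip()
    return [(name, seq) for name, seq in records]


def wrap_fasta(name, seq, width):
    """Write one record with the sequence wrapped to `width` characters."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Made-up sequence whose 121st character is '>'.
seq = "A" * 120 + ">" + "C" * 10

# Same data, two layouts: unwrapped, and wrapped to 120 characters.
unwrapped = parse_fasta(wrap_fasta("read1", seq, len(seq)))
wrapped = parse_fasta(wrap_fasta("read1", seq, 120))

print(len(unwrapped), "record(s) when the sequence is on a single line")
print(len(wrapped), "record(s) when the sequence is wrapped to 120 characters")
```

With this parser the same sequence yields a different number of records depending only on the line width, so a test suite that only ever used one layout would not notice the problem.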

And FASTA is one of the simpler formats! Try writing a parser for BAM, VCF, etc. I'm not sure it's actually possible. ValidateSamFile tried, and it flags pretty much everything as invalid :P And my definition of invalid is probably different from yours, which is different from the specification, etc.

Long story short, writing reliable software that has to parse unreliable data formats is fundamentally impossible. Unit-tests will not save us :P