5.5 years ago by
France, Lille, CNRS
Hi Bob, you're right, these terms were not defined in the readme!
A kmer is said to be solid if it occurs at least a minimum number of times in the dataset. That threshold is set by the parameter "-abundance-min". DSK returns all the solid kmers (and their counts) as a result, and filters out all non-solid kmers.
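To make the idea of solidity concrete, here is a toy Python sketch. It is not DSK's code: the function names `count_kmers` and `solid_kmers` are made up for illustration, the threshold is assumed to be inclusive (count >= abundance_min), and unlike real k-mer counters it does not merge a kmer with its reverse complement.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-length substring across all reads (toy version)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def solid_kmers(counts, abundance_min):
    """Keep only kmers seen at least abundance_min times (assumed inclusive)."""
    return {kmer: n for kmer, n in counts.items() if n >= abundance_min}

reads = ["ACGTACGT", "CGTAAA"]
counts = count_kmers(reads, k=4)
solid = solid_kmers(counts, abundance_min=2)
print(solid)  # {'ACGT': 2, 'CGTA': 2}
```

The four kmers seen only once (GTAC, TACG, GTAA, TAAA) are dropped, which is the usual way to discard kmers that most likely come from sequencing errors.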
Minimizers are technical objects we use internally during k-mer counting, following the KMC 2 algorithm. A typical user should not care about them. If you'd still like to know about minimizers, you can read more about them here: http://arxiv.org/abs/1407.1507
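For the curious, a minimal sketch of what a minimizer is, assuming the simplest textbook definition: the lexicographically smallest substring of length m inside a kmer. The function name `minimizer` is hypothetical, and KMC 2 and DSK use a modified ordering rather than plain lexicographic order, so treat this purely as an illustration.

```python
def minimizer(kmer, m):
    """Return the lexicographically smallest m-length substring of kmer."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))

# Two overlapping 6-mers from the same read share the same minimizer.
print(minimizer("GTACGT", 3))  # ACG
print(minimizer("TACGTA", 3))  # ACG
```

Because neighbouring kmers tend to share a minimizer, it can serve as a key to send them to the same bin during counting, which is roughly the role it plays in the approach described in the paper linked above.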
I've made some changes: the next release of DSK (2.0.8) will have an updated README, and all command-line parameters regarding minimizers will be moved to a separate "developer" section of the help so as not to confuse users.
Rayan