Kover is an out-of-core implementation of the Set Covering Machine algorithm, tailored for genomic biomarker discovery. Given two groups of phenotypically distinct individuals represented by their genomes, Kover seeks an intelligible model that accurately discriminates between them. The resulting models are conjunctions (logical-AND) or disjunctions (logical-OR) of rules that capture the presence or absence of k-mers.
For example, when applied to 462 C. difficile isolates divided into two groups (resistant or sensitive to the antibiotic azithromycin), Kover found that the following model is a good predictor of resistance to this drug:
Presence(AGCCAGGTTCTTCATTTAAGATGCTAACTTC) OR Presence(CTTAAGCTGCCAGCGGAATGCTTTCATCCTA) OR Presence(AAGTCGCCCTTTTTTAAGGATACGGCGGTAT)
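To make the form of such a model concrete, here is a minimal Python sketch of how a conjunction or disjunction of k-mer presence/absence rules could be evaluated on a genome sequence. This is not Kover's code or API: the names `Rule`, `rule_holds`, `predict` and `AZITHROMYCIN_MODEL` are hypothetical, the toy sequences are made up for illustration, and the sketch assumes a plain substring match on a single sequence string.

```python
from typing import List, Tuple

# Hypothetical encoding of a rule: (k-mer, expect_present).
# True stands for Presence(k-mer), False for Absence(k-mer).
Rule = Tuple[str, bool]


def rule_holds(genome: str, rule: Rule) -> bool:
    """Return True if the rule is satisfied by the genome.

    Assumption: a k-mer is "present" if it occurs as a substring
    of the sequence string passed in.
    """
    kmer, expect_present = rule
    return (kmer in genome) == expect_present


def predict(genome: str, rules: List[Rule], model_type: str = "disjunction") -> bool:
    """Evaluate a conjunction (AND) or disjunction (OR) of rules."""
    outcomes = [rule_holds(genome, rule) for rule in rules]
    if model_type == "disjunction":
        return any(outcomes)
    if model_type == "conjunction":
        return all(outcomes)
    raise ValueError("model_type must be 'conjunction' or 'disjunction'")


# The three-rule disjunction quoted above, in the hypothetical encoding.
AZITHROMYCIN_MODEL: List[Rule] = [
    ("AGCCAGGTTCTTCATTTAAGATGCTAACTTC", True),
    ("CTTAAGCTGCCAGCGGAATGCTTTCATCCTA", True),
    ("AAGTCGCCCTTTTTTAAGGATACGGCGGTAT", True),
]

# Made-up toy sequences, not real isolates.
resistant_like = "ACGT" * 5 + "CTTAAGCTGCCAGCGGAATGCTTTCATCCTA" + "TTGA" * 5
sensitive_like = "ACGT" * 20

if __name__ == "__main__":
    print(predict(resistant_like, AZITHROMYCIN_MODEL))  # one k-mer present
    print(predict(sensitive_like, AZITHROMYCIN_MODEL))  # no k-mer present
```

With a disjunction, a single matching k-mer is enough to predict the positive class (here, resistance); the same rules evaluated as a conjunction would require all three k-mers to be present.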
Kover has been found to outperform the widely used pipeline that couples univariate feature selection (e.g., chi2 test) with a learning algorithm (e.g., SVM, CART). These results are described in:
Drouin, A., Giguère, S., Déraspe, M., Marchand, M., Tyers, M., Loo, V. G., ... & Corbeil, J. (2016). Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons. bioRxiv, 045153.
We will answer Biostars questions regarding:
- Help for using the tool
Post a question: link
See all posts: https://www.biostars.org/t/kover/
-- Kover authors