I've written a software tool that makes genome scaffolds reliably reproducible by expressing the instructions for building a scaffold in a domain-specific language. The software, "Scaffolder," parses this instruction file, fetches the corresponding contig sequences, and joins them into a continuous super-sequence. Keeping the contig-joining instructions in their own file decouples the sequence data from the steps required to build the scaffold.
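To illustrate the general idea, here is a minimal Python sketch of building a scaffold from a list of instructions. The instruction keys (`sequence`, `reverse`, `unresolved`), the function names, and the example contigs are all made up for this illustration; this is not Scaffolder's actual file syntax or API, so please see the documentation linked below for the real format.

```python
# Hypothetical illustration only: the instruction keys and function names
# below are assumptions, not Scaffolder's real syntax or API.

# Translation table used to complement DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_scaffold(instructions, contigs):
    """Join contigs into one super-sequence by following the instructions in order."""
    parts = []
    for step in instructions:
        if "sequence" in step:
            # Look up the named contig, optionally reverse-complementing it.
            seq = contigs[step["sequence"]]
            if step.get("reverse", False):
                seq = reverse_complement(seq)
            parts.append(seq)
        elif "unresolved" in step:
            # A gap of unknown sequence is filled with N characters.
            parts.append("N" * step["unresolved"])
        else:
            raise ValueError(f"unknown instruction: {step!r}")
    return "".join(parts)


# The data (contigs) and the build steps (instructions) are kept separate.
contigs = {"contig1": "ATGC", "contig2": "AAGG"}
instructions = [
    {"sequence": "contig1"},
    {"unresolved": 3},
    {"sequence": "contig2", "reverse": True},
]

scaffold = build_scaffold(instructions, contigs)
print(scaffold)
```

Because the instructions live apart from the sequences, the same scaffold can be rebuilt at any time by re-running the build, and changing the scaffold means editing the instructions rather than the sequence data.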
I'm writing on BioStar because I hope this software will be useful to the bioinformatics and genomics community. Any patches, comments, or constructive criticism will help improve the software and, ideally, make it a useful resource.
In addition, this software has been submitted to the journal Open Research Computation, so any comments made on this question feed directly into the peer-review process for the article. I believe this could be an interesting approach to peer review, and it will add to the suggestions made by the two reviewers.
- Scaffolder website and documentation
- Preprint of the Scaffolder manuscript
- GitHub repo for manuscript LaTeX source
- GitHub repo of Scaffolder API
- GitHub repo of Scaffolder CLI
Please separate suggestions into individual answers so they can be voted on individually. Multiple answers and votes are very welcome.