The training set is a file of genes in GenBank format that is used for training. The test set is a separate file of genes, also in GenBank format, that you can use to assess the quality of the training. The meta parameters are various parameters used by AUGUSTUS for prediction.
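To make the training/test distinction concrete, here is a minimal Python sketch that randomly splits one GenBank file of genes into the two sets. It relies only on the fact that each record in a GenBank flat file ends with a line containing `//`. The function names and the seed are my own choices for illustration and are not part of AUGUSTUS. As far as I recall, the retraining page uses a bundled script, randomSplit.pl, for this step, so treat this as an illustration of the idea rather than a replacement.

```python
import random


def read_genbank_records(text):
    """Split GenBank flat-file text into records.

    Each record is terminated by a line consisting of '//'.
    """
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.strip() == "//":
            records.append("".join(current))
            current = []
    # Keep a trailing record that lacks the terminator, skip blank leftovers.
    leftover = "".join(current)
    if leftover.strip():
        records.append(leftover)
    return records


def split_train_test(records, n_test, seed=0):
    """Return (train, test) lists, with n_test records chosen at random."""
    if not 0 < n_test < len(records):
        raise ValueError("n_test must be between 1 and len(records) - 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    return shuffled[n_test:], shuffled[:n_test]
```

To use it, read your gene file into a string, call `read_genbank_records`, then `split_train_test`, and write each list out with `"".join(...)` to two new files. Keeping the test genes out of the training file is what makes the accuracy estimate meaningful.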
You must choose your own training and test set of genes. The "retraining AUGUSTUS" page suggests a number of possible sources:
- Spliced alignments of ESTs against the assembled genomic sequence, e.g. PASA
- Spliced alignments of protein sequences of a related species against the assembled genomic sequence, e.g. GeneWise
- Data from a related species
- Iterate retraining with predicted genes
The meta parameters should be based on the generic ones that come with AUGUSTUS in
The first link you provide (http://augustus.gobics.de/binaries/retraining.html) describes how to train AUGUSTUS manually. The second link (http://bioinf.uni-greifswald.de/augustus/binaries/README.autoAug) describes another program, autoAug.pl, that can automate this training for you.
Training AUGUSTUS can seem intimidating at first, but if you follow the retraining document it is reasonably straightforward. In particular, the steps in section 3, "RUN THE SCRIPT optimize_augustus.pl", are easy to follow.
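For orientation, the manual workflow boils down to a short sequence of commands. The Python sketch below only assembles and prints that sequence; it does not run anything. The order and the `--species=` option are as I remember them from the retraining document, so please check them against your installed version. The species name and file names are placeholders, and `retraining_commands` is a hypothetical helper, not part of AUGUSTUS.

```python
import shlex


def retraining_commands(species, train_gb, test_gb):
    """Return the manual retraining steps as argument lists (nothing is executed)."""
    opt = f"--species={species}"
    return [
        ["new_species.pl", opt],                  # create parameter files for the new species
        ["etraining", opt, train_gb],             # initial training on the training set
        ["augustus", opt, test_gb],               # first accuracy check on the test set
        ["optimize_augustus.pl", opt, train_gb],  # optimize the meta parameters
        ["etraining", opt, train_gb],             # retrain with the optimized meta parameters
        ["augustus", opt, test_gb],               # second accuracy check on the test set
    ]


def print_commands(commands):
    """Print each step as a shell-quoted command line for review."""
    for cmd in commands:
        print(shlex.join(cmd))
```

Calling `print_commands(retraining_commands("myspecies", "genes.gb.train", "genes.gb.test"))` lists the steps so you can compare them with the document before running them yourself. Comparing the two `augustus` runs on the test set shows whether the optimization helped.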