I am going to give another biased answer: CG-Pipeline, which is available at nbase.biology.gatech.edu. I am currently its lead developer. It combines GeneMark, Glimmer3, and BLAST against SwissProt, and outputs a GenBank file with CDS gene predictions. It also uses RNAmmer and tRNAscan-SE to detect RNA genes, which appear in the same GenBank file. Our publication discusses the algorithm in depth, but basically a gene is called if two out of the three predictors call it, and the longest prediction is used. CG-Pipeline has been used for several bacterial pathogens, including but not limited to Neisseria, Haemophilus, Bordetella, E. coli, and Vibrio.
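To make the two-out-of-three rule concrete, here is a rough Python sketch. It is not CG-Pipeline's actual code. The `Prediction` record, the `consensus_calls` function, and above all the rule for deciding that two predictions are the same gene (same contig, strand, and 3' end) are my assumptions for illustration; the publication describes what the pipeline really does.

```python
from collections import defaultdict
from typing import NamedTuple


class Prediction(NamedTuple):
    """One CDS prediction (hypothetical record, not a CG-Pipeline type)."""
    predictor: str  # e.g. "genemark", "glimmer3", "blast"
    contig: str
    start: int      # 1-based, start < end on either strand
    end: int
    strand: str     # "+" or "-"


def gene_key(p):
    # Assumption: predictions describe the same gene when they share
    # contig, strand, and 3' end (the stop codon side).
    three_prime = p.end if p.strand == "+" else p.start
    return (p.contig, p.strand, three_prime)


def consensus_calls(predictions, min_votes=2):
    """Keep genes supported by at least min_votes distinct predictors,
    reporting the longest prediction for each gene."""
    groups = defaultdict(list)
    for p in predictions:
        groups[gene_key(p)].append(p)

    calls = []
    for group in groups.values():
        voters = {p.predictor for p in group}  # count each predictor once
        if len(voters) >= min_votes:
            calls.append(max(group, key=lambda p: p.end - p.start))
    return sorted(calls, key=lambda p: (p.contig, p.start))


if __name__ == "__main__":
    preds = [
        Prediction("genemark", "contig1", 100, 400, "+"),
        Prediction("glimmer3", "contig1", 130, 400, "+"),
        Prediction("blast", "contig1", 900, 1200, "-"),
    ]
    for call in consensus_calls(preds):
        print(call)
```

In this toy input the first two predictions share a stop position, so the gene is called and the longer GeneMark coordinates are kept, while the lone BLAST hit is dropped.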
CG-Pipeline has modules for assembly, gene prediction, and annotation, but you can readily use only the run_prediction script if you want. In other words, it is modular: you can pick and choose the scripts you need, or run the entire pipeline with one command.