Question: Bacterial Annotation Pipeline
scapella (390), Barcelona, Spain, wrote 7.7 years ago:

Hi, this might be an old question, but I haven't found a real answer. Does anyone know of an annotation pipeline (automatic or not) for working with bacterial species? In my case, there is no reference genome close to my species.

Modified 4.4 years ago by dago (2.5k) • written 7.7 years ago by scapella (390)

Thanks guys for your answers! I'll try RAST and BG7. Both look very promising!

Reply written 7.7 years ago by scapella (390)

Hopefully I'll be releasing and publishing my Prokka in early 2012.

Reply written 7.6 years ago by Torst (900)

Hello everybody,

Can anyone suggest a solution? I annotated my genome sequence with PROKKA, but when I checked the sequence with BLAST, I found that some ORFs don't start or end at the same positions as in the BLAST results. Is there another server that gives a good annotation with correct ORFs, or a server that I can use to correct them manually?

Thank you very much


Reply modified 4.1 years ago • written 4.1 years ago by etudiantscience (0)
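One way to get a first overview of the boundary disagreements described in the reply above is to compare the two coordinate sets directly. Gene finders often agree on the stop codon of a gene but pick a different start codon, so grouping ORFs by strand and stop position makes such cases easy to spot. The following is only a minimal sketch: the `Orf` record, the function names, and the example coordinates are all hypothetical, and it assumes both sets have already been converted to full ORF boundaries (1-based, inclusive, with start < end on either strand), which raw BLAST alignment coordinates usually are not.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    """Hypothetical ORF record; coordinates are 1-based and inclusive."""
    name: str
    start: int   # leftmost coordinate on the contig
    end: int     # rightmost coordinate on the contig
    strand: str  # "+" or "-"


def stop_position(orf):
    """Coordinate of the 3' (stop codon) end of the ORF."""
    return orf.end if orf.strand == "+" else orf.start


def start_position(orf):
    """Coordinate of the 5' (start codon) end of the ORF."""
    return orf.start if orf.strand == "+" else orf.end


def compare_orfs(predicted, reference):
    """Classify each predicted ORF against reference ORFs sharing its stop."""
    ref_by_stop = {(r.strand, stop_position(r)): r for r in reference}
    report = {}
    for p in predicted:
        r = ref_by_stop.get((p.strand, stop_position(p)))
        if r is None:
            report[p.name] = "no reference ORF with the same stop"
        elif start_position(r) == start_position(p):
            report[p.name] = "identical"
        else:
            shift = abs(start_position(r) - start_position(p))
            report[p.name] = f"same stop, start differs by {shift} bp"
    return report


# Made-up coordinates, for illustration only.
predicted = [Orf("orf1", 100, 400, "+"), Orf("orf2", 500, 800, "-"),
             Orf("orf3", 900, 1200, "+")]
reference = [Orf("refA", 130, 400, "+"), Orf("refB", 500, 800, "-")]

for name, verdict in compare_orfs(predicted, reference).items():
    print(name, verdict)
```

ORFs reported as "same stop, start differs" are the natural candidates for manual inspection of the start codon.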
Marina Manrique (1.3k) wrote 7.7 years ago:


We (at Oh no sequences!) have developed an annotation system specially designed for bacterial genomes and NGS data. It's called BG7, and probably the most interesting feature for you is that a close reference genome is not needed.

Unlike other annotation pipelines, such as those based on ORF prediction with Glimmer, where the annotation strongly depends on having a close reference genome, the BG7 system works very well even when you don't have one. You just need a set of what we call 'reference proteins' to guide the annotation. These proteins don't need to be very similar to the proteins you expect to find in your genome, so it's no problem if you don't have a close reference. We've tested it on lots of genomes (some of them with no similar sequences) and are very happy with the results.

The system is open-source (AGPL-V3 license) so you can freely use it.

We're about to launch its website. Meanwhile, you can take a look at these slides describing it and at the result files for the E. coli Germany outbreak that we published in this GitHub repository (the system also outputs annotations in other formats, such as gbk and embl; this is just one example).

Please let me know if you want to know anything else. @pablopareja is the main developer, so you can also ask him.



EDIT: We've just launched the BG7 website. Please feel free to try it (any feedback is highly appreciated) :)

Modified 7.6 years ago • written 7.7 years ago by Marina Manrique (1.3k)

Could you please let me know whether there is an installation manual, and how to run BG7?

Reply written 5.3 years ago by HG (1.1k)
Martin A Hansen (3.0k) wrote 7.7 years ago:

RAST works really well.

RAST (Rapid Annotation using Subsystem Technology) is a fully-automated service for annotating bacterial and archaeal genomes. It provides high quality genome annotations for these genomes across the whole phylogenetic tree.

Scott Cain (750) wrote 7.7 years ago:

The GMOD project has several alternatives, of which MAKER (mentioned above) is one, though it leans a little towards the euks. Another option, designed for prokaryotes, is DIYA (though looking at that page now, it looks like SourceForge is messing with our wiki page). There is also Ergatis, which was designed by the people at TIGR/JCVI for bacterial annotation, something they know how to do very well (they are now at the University of Maryland). Ergatis is by far the most powerful, but it is overkill to install if you are only doing one genome. In that case, you might want to look at CloVR, which I am pretty sure is powered by Ergatis but runs inside a virtual machine that you can download (I think they have options for running it on the cloud too, but I haven't talked to them in a while).


Any update on DIYA? I would like to include it as a Galaxy module for routine annotation of environmental clones. However, it seems like it hasn't seen much action in a while.

Reply written 7.5 years ago by Zach Powers (340)
Haibao Tang (3.0k), Mountain View, CA, wrote 7.7 years ago:

It takes a bit of time to set up, but try MAKER.

dago (2.5k) wrote 4.4 years ago:

PROKKA is quite good and fast, and you do not need a reference genome.

It performs ORF prediction and annotation for you using several well-established tools.

wrf (50) wrote 4.4 years ago:

This thread seems to have died even though this is not a solved problem. One could also check PRODIGAL. It predicts protein-coding genes very quickly, in around 10 seconds. It is a single binary to download, and it runs fast since bacterial genomes are small. If it doesn't work, then not much time is lost.
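
For readers unsure what the most basic form of ORF prediction looks like, here is a toy sketch that scans all six reading frames for stretches running from an ATG to the next in-frame stop codon. It is an illustration only and not how PRODIGAL works: real gene finders score candidate genes rather than reporting every open frame. The function names, the `min_len` cutoff, and the test sequence are all assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=30):
    """Return (strand, frame, start, end) tuples for ATG..stop ORFs.

    Coordinates are 0-based, half-open, and relative to the scanned strand
    (for "-" they refer to the reverse-complemented sequence). The stop
    codon is included in the reported span.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG in the current open frame
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if end - start >= min_len:
                        orfs.append((strand, frame, start, end))
                    start = None
    return orfs


# Tiny made-up sequence containing one short ORF on the forward strand.
toy = "CCATGAAATTTGGGTAACC"
for strand, frame, start, end in find_orfs(toy, min_len=12):
    print(strand, frame, start, end, toy[start:end])
```

Taking the first ATG in each open frame reports the longest possible ORF, which is one reason naive predictions can disagree with curated start sites.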
