Question: Gene annotation pipeline for bacteria
bird77 wrote (2.1 years ago):

I have a draft genome of an alphaproteobacterium (genome size about 8 Mbp), and I want to perform gene prediction and annotation.

Until now I have used the RAST server for this task, but a very high proportion of the predicted genes (about 60%) are annotated as "hypothetical protein".

What can you recommend for prokaryotic gene prediction and annotation workflows? I can certainly combine different tools for gene prediction and annotation, but I do not know what the current standard is.

Thank you so much for your assistance.

colindaven (Hannover Medical School) wrote (2.1 years ago):

Have a look at Prokka by Torsten Seemann, or the more recent Genix.

Other institutions also offer web-based annotation pipelines; the NCBI one probably still exists.

predeus (Russia) wrote (13 months ago):

Prokka is probably the most widely used tool in the field now. Using --proteins with a specific database (e.g. a species-specific one) seems like a good way to annotate most of the ORFs. Just make sure the reference proteins are in the right format; the FASTA headers need to look something like this:

>gene_id ~~~gene_name~~~putative protein function

There are a few scripts that convert protein annotations into this format. The problem is discussed in one of the issues on the Prokka GitHub repository.
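
For illustration, here is a minimal Python sketch of such a conversion. It is not one of the scripts from the Prokka repository. It assumes you already have a mapping from sequence ID to gene name and product (the `annotations` dict is hypothetical), and it leaves the field before the first ~~~ empty, as in the header above. As far as I understand the Prokka README, that first field is an optional EC number, so treat the `ec_number` parameter as an assumption. The helper names and file names are made up; `--proteins`, `--outdir` and `--prefix` are existing Prokka options.

```python
import shlex
from typing import Dict, Iterable, Iterator, List, Tuple


def prokka_header(seq_id: str, gene: str, product: str, ec_number: str = "") -> str:
    """Build a FASTA header of the form '>ID EC~~~gene~~~product'."""
    return f">{seq_id} {ec_number}~~~{gene}~~~{product}"


def rewrite_fasta(
    lines: Iterable[str],
    annotations: Dict[str, Tuple[str, str]],
) -> Iterator[str]:
    """Yield FASTA lines with every header rewritten in the ~~~ format.

    annotations maps sequence ID -> (gene name, product). IDs missing
    from the mapping get an empty gene name and 'hypothetical protein'.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            gene, product = annotations.get(seq_id, ("", "hypothetical protein"))
            yield prokka_header(seq_id, gene, product)
        else:
            # Sequence lines are passed through unchanged.
            yield line


def prokka_command(contigs: str, proteins: str, outdir: str, prefix: str) -> List[str]:
    """Assemble a Prokka call that uses the reformatted proteins first."""
    return ["prokka", "--proteins", proteins,
            "--outdir", outdir, "--prefix", prefix, contigs]


if __name__ == "__main__":
    # Hypothetical two-line protein FASTA and annotation table.
    fasta = [">p1 some old description\n", "MKVLAA\n"]
    table = {"p1": ("recA", "recombinase A")}
    print("\n".join(rewrite_fasta(fasta, table)))
    # Print the command instead of running it; file names are placeholders.
    print(shlex.join(prokka_command("contigs.fasta", "reference.faa", "annot", "mystrain")))
```

To actually run the annotation, pass the list returned by `prokka_command` to `subprocess.run` once the reformatted protein file has been written to disk.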
