How to go from a FASTA file to an annotated genome assembly in GenBank format?
1
0
12 months ago
vixelaa ▴ 20

Does anyone know the best/fastest way to create an annotated bacterial genome assembly (GenBank format) starting from a FASTA file containing the whole genome sequence? (A GFF3 file with the gene annotations is available.) Thank you.

genbank fasta genome assembly • 704 views
1

Can we assume this is a new bacterial species (as in: one that has not been annotated before)?

If so, you will have to go through a step called genome annotation. That is not always an easy process, but luckily for most bacterial species it is not that hard. I can suggest Prokka for this, though there are many other tools around (have you looked around/googled for any?).

0

Sorry, my question was probably not entirely clear. The sequence I would like to annotate is an ancestral genome of a bacterial species that is already annotated. In fact, I mapped new isolates against this ancestor, so it is also already assembled. Now I'm wondering whether there is a fast way to take the annotation from the currently known annotated bacterial genome and "paste" it onto the ancestral one...

0

vixelaa : Please do not delete posts once they have received at least one comment/answer.

1
12 months ago
Joe 19k

You could use something like RATT, but I think it requires the genomes to be very similar, which it sounds like yours might not be. In that case, you can still use Prokka and provide a database of 'trusted' proteins from which to start the annotation.
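For the Prokka route, here is a minimal sketch of how the command could be assembled from Python. It assumes Prokka is installed and on your PATH; the file names (`ancestor.fasta`, `reference.gbk`) and the output directory and prefix are hypothetical placeholders. The `--proteins` option takes a protein FASTA or GenBank file that Prokka consults first when naming genes.

```python
import shlex


def prokka_command(genome_fasta, trusted_proteins,
                   outdir="prokka_out", prefix="ancestor"):
    """Build a Prokka command that uses a trusted protein set as first priority."""
    return [
        "prokka",
        "--outdir", outdir,              # directory Prokka writes its results to
        "--prefix", prefix,              # base name for the output files
        "--proteins", trusted_proteins,  # trusted annotations (FASTA or GenBank)
        genome_fasta,                    # the genome to annotate
    ]


# Hypothetical file names, for illustration only.
cmd = prokka_command("ancestor.fasta", "reference.gbk")
print(shlex.join(cmd))

# To actually run it (requires Prokka to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Prokka writes a GenBank file among its outputs, so this route would get you to the requested format directly.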

1

Indeed, in this case I would also recommend something like RATT. A recent alternative that I think is worth a try is Liftoff (https://www.biorxiv.org/content/10.1101/2020.06.24.169680v1)
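A rough sketch of what a Liftoff call could look like, wrapped in Python. It assumes Liftoff is installed; all file names are hypothetical placeholders. Note the argument order: the target genome (the one you want to annotate) comes before the reference genome (the one the GFF3 belongs to).

```python
import shlex


def liftoff_command(target_fasta, reference_fasta, reference_gff,
                    output_gff="ancestor_lifted.gff3"):
    """Build a Liftoff command that maps reference annotations onto a target genome."""
    return [
        "liftoff",
        "-g", reference_gff,  # annotation of the reference genome (GFF3)
        "-o", output_gff,     # where the lifted-over annotation is written
        target_fasta,         # genome to annotate (here: the ancestor)
        reference_fasta,      # genome the annotation was made for
    ]


# Hypothetical file names, for illustration only.
cmd = liftoff_command("ancestor.fasta", "reference.fasta", "reference.gff3")
print(shlex.join(cmd))

# To actually run it (requires Liftoff to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```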

0

Thank you! I'll try both.

1

Just FYI, I used Liftoff and it worked almost perfectly: I got a GFF file that was missing only around 10 genes, which can be added manually. I then converted this, together with the FASTA file, to GenBank format.
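In case it helps anyone doing the same, here is a small sketch for listing which genes did not make it across, by comparing gene IDs between the reference GFF3 and the lifted one. It assumes the lifted file keeps the reference gene IDs in the `ID=` attribute; the two tiny GFF3 snippets are made-up examples, not real data.

```python
def gene_ids(gff_text):
    """Collect the ID attribute of every 'gene' feature in GFF3 text."""
    ids = set()
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # only well-formed gene rows are of interest
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        if "ID" in attrs:
            ids.add(attrs["ID"])
    return ids


# Made-up example annotations (tab-separated GFF3 columns).
reference_gff = "\n".join([
    "##gff-version 3",
    "chr\tRefSeq\tgene\t1\t900\t.\t+\t.\tID=gene0001;Name=dnaA",
    "chr\tRefSeq\tgene\t1000\t1800\t.\t-\t.\tID=gene0002;Name=dnaN",
    "chr\tRefSeq\tgene\t2000\t2600\t.\t+\t.\tID=gene0003;Name=recF",
])
lifted_gff = "\n".join([
    "##gff-version 3",
    "anc\tLiftoff\tgene\t5\t904\t.\t+\t.\tID=gene0001;Name=dnaA",
    "anc\tLiftoff\tgene\t1990\t2590\t.\t+\t.\tID=gene0003;Name=recF",
])

# Genes present in the reference but absent from the lifted annotation.
missing = sorted(gene_ids(reference_gff) - gene_ids(lifted_gff))
print(missing)
```

With real files you would read each GFF3 from disk (for example with `open(path).read()`) and pass the text to `gene_ids`.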

1

Thanks for the feedback (appreciated), and good to hear Liftoff is promising.