Tool: Goodbye, Genbank: A Python package that salvages feature annotations from GenBank records
Lars Schöning (DTU Biosustain, Copenhagen, Denmark) wrote, 3.0 years ago:

Hi,

While building a parts library for internal use, I ran into the quirks of the GenBank format and the fact that almost no GenBank file is up to spec. I started building a tool to iron out those quirks and salvage only the usable parts of GenBank feature annotations for use elsewhere. It became a larger task than I initially anticipated, and I thought other people might find it useful or wish to contribute, so I made it open source:

https://biosustain.github.io/goodbye-genbank/

In summary:

  • It is a Python package for use with Biopython.
  • It maps GenBank feature keys (and in some cases qualifiers) to Sequence Ontology terms.
  • It fixes/normalizes GenBank feature qualifiers (annotations) and discards qualifiers that cannot be fixed. This is customizable, so you can add your own salvaging code for certain qualifiers.
  • The output is a set of clean, predictable features that can be used elsewhere.
  • Masochists can also use this package simply to clean up GenBank feature annotations into valid GenBank.
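
To give a rough idea of what "mapping feature keys" and "salvaging qualifiers" mean, here is a minimal sketch in plain Python. It is not the goodbye-genbank API: the names `SO_TERMS`, `KEPT_QUALIFIERS`, `normalize_qualifiers` and `salvage_feature` are made up for illustration, the whitelist of qualifiers is an assumption, and qualifiers are assumed to be a dict of name to list of strings, which is how Biopython usually exposes them.

```python
# Illustrative sketch only; these names are hypothetical and are NOT
# part of the goodbye-genbank package.

# A few GenBank feature keys and their Sequence Ontology accessions.
SO_TERMS = {
    "CDS": "SO:0000316",
    "gene": "SO:0000704",
    "promoter": "SO:0000167",
    "terminator": "SO:0000141",
}
FALLBACK_TERM = "SO:0000001"  # "region", used when the key is not mapped

# Assumed whitelist: qualifiers considered worth keeping in this sketch.
KEPT_QUALIFIERS = {"gene", "product", "note"}


def normalize_qualifiers(qualifiers):
    """Keep whitelisted qualifiers, collapse whitespace, drop empty values."""
    cleaned = {}
    for name, values in qualifiers.items():
        if name not in KEPT_QUALIFIERS:
            continue  # discard qualifiers we do not know how to salvage
        if isinstance(values, str):
            values = [values]  # tolerate a bare string instead of a list
        # Collapse line breaks and runs of spaces left over from wrapping.
        values = [" ".join(value.split()) for value in values]
        values = [value for value in values if value]
        if values:
            cleaned[name] = values
    return cleaned


def salvage_feature(key, qualifiers):
    """Return a plain dict with an SO term and the cleaned qualifiers."""
    return {
        "so_term": SO_TERMS.get(key, FALLBACK_TERM),
        "qualifiers": normalize_qualifiers(qualifiers),
    }


feature = salvage_feature(
    "CDS",
    {
        "product": ["green  fluorescent\n protein"],
        "ApEinfo_fwdcolor": ["#ff0000"],  # editor-specific, gets discarded
        "note": [""],                     # empty, gets discarded
    },
)
print(feature)
```

With Biopython, the `key` and `qualifiers` arguments would typically come from `feature.type` and `feature.qualifiers` of each `SeqFeature` in a parsed record.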

(A GFF3 exporter is also planned, but it may be a while.)
