Question: Simplest genome protein annotation pipeline possible
3.5 years ago by Eric Normandeau (Quebec, Canada):

I often work with draft genomes of non-model species (mostly fishes) that we need to annotate. In these cases, we do not really care about putative proteins predicted from ORFs or by any ab initio method.

What we really need is a GFF3 annotation file listing known proteins (from Swiss-Prot, for example), with an accompanying .csv file that gives more information about each protein (scaffold, position, protein name, etc.).
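To make the desired output concrete, here is a minimal sketch of how the accompanying .csv could be derived from a GFF3 file once one exists. It assumes the protein name is stored in the `Name` attribute of each `gene` feature, which depends on the tool that wrote the file; the function names and the sample records are hypothetical and only for illustration.

```python
import csv
import io
from urllib.parse import unquote

# Hypothetical two-feature GFF3 snippet, used only to illustrate the format.
SAMPLE_GFF3 = (
    "##gff-version 3\n"
    "scaffold_1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=hypothetical%20protein%20A\n"
    "scaffold_1\texample\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=gene1\n"
)

CSV_FIELDS = ["scaffold", "start", "end", "strand", "id", "protein_name"]


def parse_attributes(field):
    """Parse GFF3 column 9 (';'-separated key=value pairs) into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gff3_features_to_rows(lines, feature_type="gene"):
    """Return one dict per feature of the requested type."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = parse_attributes(cols[8])
        rows.append({
            "scaffold": cols[0],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            "id": attrs.get("ID", ""),
            # Assumption: the protein name lives in the Name attribute.
            "protein_name": attrs.get("Name", ""),
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=CSV_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(gff3_features_to_rows(SAMPLE_GFF3.splitlines())), end="")
```

For a real file, the same functions could be fed an open file handle instead of `SAMPLE_GFF3.splitlines()`, since both yield one line at a time.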

What would be the simplest approach to achieve that goal while treating introns/exons properly and producing annotations for features such as gene, CDS, exon, UTR...?

Right now, I am considering a workflow like this:

  1. RepeatMasker
  2. PASA
  3. EVidenceModeler (EVM)

And skipping anything to do with ab initio detection (AUGUSTUS, Exonerate...).

Am I missing a simpler approach? The approach needs to work for eukaryote genomes (~1-3 Gbp).

EDIT: Ah well... Please do not suggest MAKER 1 or 2. I am not going to use MAKER unless my actual survival depends on it ;)


In the end, it looks like MAKER is still the best/correct approach... Eukaryote genome annotation needs some serious streamlining.

written 3.5 years ago by Eric Normandeau

Aren't "eukaryotic genome annotation" and "simple" an oxymoron (unless there is a "not" between them)?

I have never used them (and they seem to be anything but simple), but do you know JAMg and JAMp?

written 3.5 years ago by h.mon

Yes, I know firsthand that genome annotation and simple don't go hand in hand ;)

I'm investigating JAMg. Thanks for the suggestion!

written 3.5 years ago by Eric Normandeau

JAMg looks a bit more complex than our current pipeline (which fails) and depends on some of the same software that fails on our genomes... ¯\_(ツ)_/¯

written 3.5 years ago by Eric Normandeau

3.5 years ago by Eric Normandeau (Quebec, Canada):

I ended up developing a genome annotation pipeline based on suggestions from a colleague. You can find more about it in this Biostar post: GAWN - Genome Annotation Without Nightmares
