From ESTs To Gene Models
13.0 years ago
Ketil 4.1k

I'm planning to use Augustus to annotate a genome. It needs some initial gene models to train on, and the suggested approach is to build these from ESTs with PASA. However, PASA is a bit complicated and requires things like MySQL, which seems unnecessary for this. Are there any simpler alternatives I could use?

est gene clustering • 1.9k views
13.0 years ago
Darked89 4.6k

Not EST-based, but used in a few studies with novel genomes: CEGMA. It has the advantage of a curated gene set (i.e. you do not overtrain Augustus on the 100 most highly expressed kinases), but it is a bit dated (gene/protein sequences have been updated since), and the set of "core genes" present in all eukaryotes seems to be shrinking.

In the case of Augustus, the most important part (speaking of a plant genome) was to have as accurate a set of hints as possible, obtained from RNA-Seq data, followed by repeat masking. Even Arabidopsis-trained Augustus was surprisingly accurate, even for some giant plant genes. The situation may obviously be very different for other groups/organisms.
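
To make the "hints" part a bit more concrete: Augustus reads hints as 9-column GFF lines, with the evidence source given by a `src=` key in the last column. Below is a rough sketch of turning RNA-Seq splice junctions into intron hints. The function names, the `rnaseq` label in column 2, the `min_support` cutoff and the `pri=4` value are all my own assumptions for illustration, not anything Augustus requires; `src=E` is, as far as I know, the source letter commonly used for EST/RNA-Seq evidence in the extrinsic config files, so check it against the config you actually run with.

```python
def intron_hint(seqid, start, end, strand, mult=1, source="E", priority=4):
    """Format one intron hint as a tab-separated 9-column GFF line.

    Coordinates are 1-based and inclusive, as in GFF.
    `source` and `priority` are illustrative defaults (assumptions).
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if strand not in {"+", "-", "."}:
        raise ValueError("strand must be '+', '-' or '.'")
    attributes = f"mult={mult};pri={priority};src={source}"
    columns = [seqid, "rnaseq", "intron", str(start), str(end),
               "0", strand, ".", attributes]
    return "\t".join(columns)


def junctions_to_hints(junctions, min_support=2):
    """Convert (seqid, start, end, strand, read_count) tuples to hint lines.

    Junctions supported by fewer than `min_support` reads are dropped;
    the threshold is a hypothetical choice, tune it for your data.
    """
    return [
        intron_hint(seqid, start, end, strand, mult=count)
        for seqid, start, end, strand, count in junctions
        if count >= min_support
    ]


if __name__ == "__main__":
    example = [("chr1", 100, 200, "+", 3), ("chr1", 300, 400, "-", 1)]
    for line in junctions_to_hints(example):
        print(line)
```

The resulting lines would go into a file passed to Augustus as the hints file; in practice tools shipped with Augustus can generate this from alignments directly, so the sketch is only meant to show what the hints look like.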
