Question: Annotation pipelines in 2018
8 days ago by
Ric190
Australia
Ric190 wrote:

Hi, I would like to annotate a plant genome. I also have some RNA-Seq data.

Here ( https://www.sunflowergenome.org/annotations/ ) it is described that they used the technologies below to annotate the sunflower genome:

I also found other pipelines:

and there are even more here ( https://omictools.com/genome-annotation-category )

Which one to choose?

Thank you in advance.

gene rna-seq assembly genome • 209 views
ADD COMMENTlink modified 8 days ago by Roxane Boyer690 • written 8 days ago by Ric190

I would like to annotate a plant genome.

Can you provide some stats about how good that assembly is now (number of contigs, average length, N50)? It is unlikely to be complete, so you should keep your expectations in line with that.
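For reference, here is a minimal sketch of how these three numbers can be computed once you have the contig lengths. The function name `assembly_stats` and the toy lengths are made up for illustration; in practice the lengths would come from your assembly FASTA (for example from an index file), which is not shown here.

```python
def assembly_stats(contig_lengths):
    """Return (number of contigs, mean length, N50) for a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    # Walk the contigs from longest to shortest
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        # N50 is the length of the contig at which the running total
        # first reaches half of the total assembly size
        if running * 2 >= total:
            n50 = length
            break
    return len(lengths), total / len(lengths), n50


# Toy lengths, invented for illustration (not from the assembly discussed here)
print(assembly_stats([100, 200, 300, 400, 500]))
```

N50 is weighted by length, so a handful of long contigs can give a decent N50 even when there are many short ones. That is why it helps to report it together with the contig count and the average length.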

ADD REPLYlink written 8 days ago by genomax54k

The genome is an allotetraploid, 3 Gb in size, with 5000 contigs. The N50 is 1.3 Mb.

ADD REPLYlink written 8 days ago by Ric190

Looks decent at first sight, but given that genome size, be prepared to spend time on it, as mentioned by genomax.

ADD REPLYlink written 8 days ago by lieven.sterck2.1k
8 days ago by
lieven.sterck2.1k
Belgium, Ghent, VIB
lieven.sterck2.1k wrote:

It all depends a little on how serious (high quality) you want the result to be. If you only want a global idea of what it would look like, any of the pipelines you mention will do, I assume.

The protocol as described for the sunflower paper will deliver a nice result but is much more work (compute / time / man / ...) to run than the pipeline packages. Being a big fan of EuGene, I can certainly recommend that one, but keep in mind it will require some tweaking and time investment to obtain the best result.

Generally, keep in mind that the bigger your genome is, and the more data you want to feed into the pipelines, the more computational power and time you will need.

ADD COMMENTlink written 8 days ago by lieven.sterck2.1k

The expectation is to get a high-quality annotation. Do you only run EuGene, and how do you know which parameters have to be tweaked?

ADD REPLYlink written 8 days ago by Ric190

Parameters may depend on your genome, since "one size fits all" will not apply.

Be prepared to spend much longer on the annotation than you did on the sequencing/assembly. While parts of the annotation can be automated, human intervention will be required in many places, and the fact that this is an allotetraploid is going to make your task that much harder.

If you intend to make the genome public then you can leverage NCBI's Eukaryotic annotation pipeline.
Edit: The "Request Annotation" link on NCBI's Eukaryotic annotation page is not working. I have emailed their support.

Edit 2: NCBI support agreed that the wording/link is misleading. Clicking on "Request Annotation" takes you to a help desk ticket page. You are supposed to fill out a request for annotation there. They will then get in touch with you.

ADD REPLYlink modified 4 days ago • written 8 days ago by genomax54k

Yes, EuGene is usually our 'end' tool, with which we combine all the other data, but many other recipes are possible.

You will definitely need to do parameter optimization, which can take quite some time, but the end result will reflect the effort you put into it!

ADD REPLYlink written 8 days ago by lieven.sterck2.1k
8 days ago by
Roxane Boyer690
France / Toulouse / GeT-Plage
Roxane Boyer690 wrote:

Hello!

I have never worked with plant genomes, but I've heard that Maker has a particular pipeline adapted for plant genomes. Maybe you can find what you need here: http://www.yandell-lab.org/software/maker-p.html

Cheers,

Roxane

ADD COMMENTlink written 8 days ago by Roxane Boyer690

Correct, it is part of the Maker "family". However, after all these years I still have to figure out what makes this particular one suited to plant genomes, especially compared to the 'normal' Maker.

The problem with all those pipelines is that there is no "one size fits all", as mentioned by genomax, while that's exactly what they try to offer.

Anyway, a good annotation can be made with several tools, and a bad annotation can be made with any tool if you're not paying attention to the details ;-)

ADD REPLYlink written 8 days ago by lieven.sterck2.1k