Question: De novo Genome Annotation without extra data
6 months ago, margab wrote:

I would like some suggestions for tools or pipelines I can use for de novo genome annotation without transcriptional data (ESTs, RNA-seq, transcripts, Iso-Seq), without proteins from my organism or hints from proteins of unknown evolutionary distance, and without any other extra data.

Genome information: mammalian genome, about 2.5 Gb in size. Illumina (36-fold coverage) and Nanopore (4-fold coverage) data were used for the assembly.


Whilst you might not be able to leverage transcriptional data, perhaps your mammal has some "slightly distant relatives" whose predicted proteins you can use as evidence when annotating with MAKER. You could train Augustus with BUSCO and include the proteins from those "distant relatives". I've done this before for a rodent genome and a turtle genome.

Comment written 6 months ago by jean.elbers
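
A minimal sketch of how that setup might be wired into MAKER, assuming the control files have already been generated (e.g. with `maker -CTL`) and an Augustus species model exists from the BUSCO-based training. It fills in three options in the text of `maker_opts.ctl`: `genome`, `protein` and `augustus_species`. The helper `set_maker_options`, the file names `assembly.fa` and `relatives_proteins.faa`, and the species name `my_mammal` are hypothetical placeholders, and the control-file lines shown are abbreviated stand-ins for the real ones.

```python
import re


def set_maker_options(ctl_text, options):
    """Return ctl_text with the given options filled in.

    Control-file lines have the form 'key=value #comment'. The value is
    replaced and the trailing comment is kept. Raises KeyError if an
    option is not present in the control file text.
    """
    remaining = dict(options)
    out = []
    for line in ctl_text.splitlines():
        match = re.match(r"^(\w+)=([^#]*)(#.*)?$", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            comment = match.group(3) or ""
            line = f"{key}={remaining.pop(key)} {comment}".rstrip()
        out.append(line)
    if remaining:
        raise KeyError(f"options not found in control file: {sorted(remaining)}")
    return "\n".join(out) + "\n"


# Abbreviated stand-in for a few lines of maker_opts.ctl (not the full file).
example_ctl = (
    "#-----Genome\n"
    "genome= #genome sequence\n"
    "protein= #protein sequences\n"
    "augustus_species= #Augustus species model\n"
)

updated = set_maker_options(
    example_ctl,
    {
        "genome": "assembly.fa",              # hypothetical assembly file
        "protein": "relatives_proteins.faa",  # hypothetical proteins from related species
        "augustus_species": "my_mammal",      # hypothetical name of the trained model
    },
)
print(updated)
```

Editing the file programmatically like this is only a convenience; changing the same three lines by hand in a text editor does the same job.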
6 months ago, lieven.sterck (VIB, Ghent, Belgium) wrote:

Given that you cannot (or do not want to?) use extrinsic info, you have to rely on intrinsic, or ab initio, prediction tools, e.g. Augustus, EuGene, GeneMark and many others. The big issue here is that, to get reasonably good results, you will have to train/optimise them for your organism, which is not a simple task (but it is doable!).

I'm wondering, however, why you say you want to do this without extrinsic data? As jean.elbers pointed out as well, there is valuable info in all the proteins known so far, even if they are not specifically from the organism you are working with. Transcript data might indeed be a little less straightforward, but the protein info is for sure going to be valuable!

Actually, you will get the best performance from your genome annotation tools when you combine both intrinsic and extrinsic info in your approach. So perhaps reconsider using any available data source.
