Question: Is genome heterozygosity a problem for gene annotation?
ricardoguerreiro2121 (Germany) wrote, 8 months ago:

Quick question:

How important is it to reduce uncollapsed heterozygosity in a genome assembly before proceeding to gene annotation?

By uncollapsed heterozygosity I mean the existence of alternative contigs (haplotigs) for the same region of the genome, in an organism that carries multiple copies of each chromosome (diploid, triploid, tetraploid, etc.).

I have heard that uncollapsed heterozygosity is harmful for scaffolding attempts, but I don't know about gene annotation.

I use duplication in BUSCO results as a proxy for heterozygosity, but there is a tradeoff between reducing duplication and avoiding missing genes.

BUSCO results for the assembly

Assembly                            Complete     Single-copy  Duplicated  Fragmented  Missing
Before eliminating some haplotigs   2070 (98 %)  1646 (78 %)  424 (20 %)  28 (1 %)    23 (1 %)
After eliminating some haplotigs    2065 (97 %)  1710 (81 %)  355 (17 %)  25 (1 %)    31 (1 %)
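The tradeoff in the two BUSCO rows can be made explicit with a little arithmetic. Below is a minimal sketch in Python that uses only the counts quoted above; the variable and function names are illustrative and not part of BUSCO itself.

```python
# Counts copied from the two BUSCO rows above.
before = {"complete": 2070, "single_copy": 1646, "duplicated": 424,
          "fragmented": 28, "missing": 23}
after = {"complete": 2065, "single_copy": 1710, "duplicated": 355,
         "fragmented": 25, "missing": 31}


def total_buscos(counts):
    """Total BUSCOs searched: complete + fragmented + missing."""
    return counts["complete"] + counts["fragmented"] + counts["missing"]


def deltas(before_counts, after_counts):
    """Per-category change (after minus before)."""
    return {key: after_counts[key] - before_counts[key] for key in before_counts}


def percent(counts, key):
    """Share of one category among all BUSCOs searched, in percent."""
    return 100.0 * counts[key] / total_buscos(counts)


change = deltas(before, after)
for key, value in change.items():
    print(f"{key:12s} {value:+d}")
print(f"duplicated: {percent(before, 'duplicated'):.1f} % -> "
      f"{percent(after, 'duplicated'):.1f} %")
```

With these numbers, removing haplotigs lowers the duplicated count by 69 while the missing count rises by 8, so each BUSCO lost cost roughly nine resolved duplicates. Whether that exchange is acceptable depends on what the annotation will be used for.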

Cheers, Ricardo

lieven.sterck (VIB, Ghent, Belgium) wrote, 8 months ago:

Solely for the technical aspect of gene prediction: no, I would say. Perhaps you might encounter some issues with RNA-seq data (if you're using that) having a higher multi-mapping rate than it should, but other than that I don't really see any issue.
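One way to check whether the multi-mapping rate is elevated is to count how many mapped reads report more than one alignment. The sketch below is an assumption-laden illustration, not a standard tool: it reads plain SAM text and relies on the optional `NH:i:` tag (number of reported alignments), which many spliced aligners such as STAR and HISAT2 write but others may not. Records without the tag are treated as uniquely mapped, and the function name `multimap_rate` is hypothetical.

```python
def multimap_rate(sam_lines):
    """Fraction of mapped primary alignment records with NH > 1.

    Assumes the aligner writes the optional NH:i: tag; records
    lacking it are counted as uniquely mapped.
    """
    mapped = 0
    multi = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        if flag & 0x900:
            continue  # secondary or supplementary record
        nh = 1
        for tag in fields[11:]:  # optional fields follow the 11 mandatory ones
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        mapped += 1
        if nh > 1:
            multi += 1
    return multi / mapped if mapped else 0.0


def record(name, flag, *tags):
    """Build a toy SAM record; all values are made up for illustration."""
    core = [name, str(flag), "ctg1", "100", "255", "4M", "*", "0", "0", "ACGT", "IIII"]
    return "\t".join(core + list(tags))


toy = [
    "@HD\tVN:1.6",
    record("r1", 0, "NH:i:1"),    # unique
    record("r2", 0, "NH:i:2"),    # multi-mapped, primary record
    record("r2", 256, "NH:i:2"),  # secondary record of the same read, skipped
    record("r3", 4),              # unmapped, skipped
    record("r4", 16, "NH:i:1"),   # unique, reverse strand
]
rate = multimap_rate(toy)
print(f"multi-mapping rate: {rate:.3f}")
```

Running it on alignments against the assembly before and after haplotig removal would show whether the retained haplotigs are what drives reads to map to more than one place. For paired-end data each mate is counted as its own record.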

Interpretation of the results will be a different matter, though. E.g. the final gene count (as an estimate of the genes truly present in the genome) will of course not be accurate.

The proxy you're using is perhaps also not the best one: if the genome of your species (or some regions of it) is genuinely duplicated, you will overestimate the heterozygosity.
