Question: Is genome heterozygosity a problem for gene annotation?
8 months ago, ricardoguerreiro2121 wrote:

Quick question:

How important is it to reduce uncollapsed heterozygosity in a genome assembly before proceeding to gene annotation?

By uncollapsed heterozygosity I mean the existence of alternative contigs (haplotigs) for the same region of the genome, in an organism that carries multiple copies of each chromosome (diploid, triploid, tetraploid, etc.).

I have heard that uncollapsed heterozygosity is harmful for scaffolding attempts, but I don't know whether the same holds for gene annotation.

I use the duplication rate in the BUSCO results as a proxy for heterozygosity. But there is a tradeoff between reducing duplication and avoiding missing genes.

BUSCO results for the assembly:

                                   Complete     Single-copy  Duplicated  Fragmented  Missing
Before eliminating some haplotigs  2070 (98 %)  1646 (78 %)  424 (20 %)  28 (1 %)    23 (1 %)
After eliminating some haplotigs   2065 (97 %)  1710 (81 %)  355 (17 %)  25 (1 %)    31 (1 %)
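To make the tradeoff concrete, here is a minimal Python sketch that compares the two BUSCO summaries quoted above. The counts are copied from the question. The variable and function names are made up for illustration and are not part of BUSCO. It also assumes that the total number of BUSCOs searched equals Complete + Fragmented + Missing.

```python
# BUSCO counts copied from the question (before / after removing some haplotigs).
CATEGORIES = ("complete", "single_copy", "duplicated", "fragmented", "missing")
before = dict(zip(CATEGORIES, (2070, 1646, 424, 28, 23)))
after = dict(zip(CATEGORIES, (2065, 1710, 355, 25, 31)))


def total_buscos(counts):
    """Total BUSCOs searched, assuming Complete + Fragmented + Missing."""
    return counts["complete"] + counts["fragmented"] + counts["missing"]


def percent(counts):
    """Each category as a percentage of the total, rounded to one decimal."""
    total = total_buscos(counts)
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Change in each category caused by the haplotig removal step.
delta = {name: after[name] - before[name] for name in CATEGORIES}

if __name__ == "__main__":
    pct_before, pct_after = percent(before), percent(after)
    for name in CATEGORIES:
        print(f"{name:12s} {before[name]:5d} ({pct_before[name]:5.1f} %) -> "
              f"{after[name]:5d} ({pct_after[name]:5.1f} %)  change: {delta[name]:+d}")
```

With these numbers, removing haplotigs takes out 69 duplicated BUSCOs at the cost of 8 additional missing ones.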

Cheers, Ricardo

8 months ago, lieven.sterck (VIB, Ghent, Belgium) wrote:

Solely for the technical aspect of gene prediction: no, I would say. You might encounter some issues with RNA-seq data (if you're using that) having a higher multi-mapping rate than it should, but other than that I don't really see any issue.

Interpretation of the results will be a different matter, though. E.g. the final number of predicted genes will of course not accurately reflect the number truly present in the genome.

The proxy you're using is perhaps also not the best one: if the genome of your species (or some regions of it) is genuinely duplicated, then you will overestimate the heterozygosity.
