Question: Is Genome heterozygosity a problem for gene annotation?
13 months ago by
ricardoguerreiro2121 wrote:

Quick question:

How important is it to reduce uncollapsed heterozygosity in a genome assembly before proceeding to gene annotation?

By uncollapsed heterozygosity I mean the existence of alternative contigs (haplotigs) for the same region of the genome, in an organism that has multiple copies of each chromosome (diploid, triploid, tetraploid, etc.).

I have heard that uncollapsed heterozygosity is harmful for scaffolding attempts, but I don't know whether it matters for gene annotation.

I use the duplication rate in BUSCO results as a proxy for heterozygosity. But there is a tradeoff: removing haplotigs reduces duplication, yet it can also increase the number of missing genes.

BUSCO results for the assembly:

                                   Complete     Single-copy  Duplicated  Fragmented  Missing
Before eliminating some haplotigs  2070 (98 %)  1646 (78 %)  424 (20 %)  28 (1 %)    23 (1 %)
After eliminating some haplotigs   2065 (97 %)  1710 (81 %)  355 (17 %)  25 (1 %)    31 (1 %)
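
To put a number on the tradeoff, the two BUSCO summaries above can be compared directly. The following is only a minimal sketch: the counts are copied from the results in this post, while the names `BuscoCounts` and `purge_tradeoff` are hypothetical helpers made up for illustration and are not part of BUSCO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuscoCounts:
    """BUSCO category counts for one assembly (hypothetical helper)."""
    single_copy: int
    duplicated: int
    fragmented: int
    missing: int

    @property
    def complete(self) -> int:
        # Complete = single-copy + duplicated
        return self.single_copy + self.duplicated

    @property
    def total(self) -> int:
        # Every BUSCO falls into exactly one of complete/fragmented/missing
        return self.complete + self.fragmented + self.missing


def purge_tradeoff(before: BuscoCounts, after: BuscoCounts) -> dict:
    """Summarise what haplotig removal gained and cost, in BUSCO counts."""
    return {
        "duplicated_removed": before.duplicated - after.duplicated,
        "complete_lost": before.complete - after.complete,
        "missing_gained": after.missing - before.missing,
    }


# Counts taken from the two BUSCO runs reported in this post.
before = BuscoCounts(single_copy=1646, duplicated=424, fragmented=28, missing=23)
after = BuscoCounts(single_copy=1710, duplicated=355, fragmented=25, missing=31)

tradeoff = purge_tradeoff(before, after)
print(tradeoff)
```

With these numbers, removing the haplotigs resolved 69 duplicated BUSCOs at the cost of 5 complete BUSCOs and 8 additional missing ones. Whether that is an acceptable exchange depends on the downstream analysis.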

Cheers, Ricardo

13 months ago by
VIB, Ghent, Belgium
lieven.sterck wrote:

Purely for the technical aspect of gene prediction: I would say no. You might encounter some issues with RNA-seq data (if you're using that) showing a higher multi-mapping rate than it should, but other than that I don't really see any issue.

Interpretation of the results will be a different matter, though. E.g. the final number of predicted genes will of course not accurately reflect the number truly present in the genome.

The proxy you're using is perhaps also not the best one: if in your species the genome (or some regions of it) is genuinely duplicated, then you will overestimate the heterozygosity.
