How are haplotypes/heterozygosity resolved in sequence assembly?
17 months ago
DNAlias ▴ 10

I am under the impression that many sequence assemblers are unable to resolve heterozygosity, and handle it either by separating each variant into a different contig or by fusing the two into a hybrid of both variants.

1) Which of these outcomes is preferable and why?

2) I know that there are variant calling pipelines that require a reference genome. Is there a way to recognize alleles during de novo assembly?

assembly • 332 views
16 months ago
Vitis ★ 2.4k

I think the ultimate goal for assembling a heterozygous genome is to fully resolve the two haplotypes, essentially producing two genome assemblies. Platanus seems to do a fairly good job of dealing with heterozygous genomes. Also, long-read sequencing technologies like Nanopore and PacBio enable variant phasing and resolution of alleles over long distances. Sometimes the genetic trick of "trio binning" also helps. Basically, you sequence the two parents plus the F1 offspring, so that you can use the parental variant information to partition the offspring reads into haplotypes and do two assemblies simultaneously.
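
To make the trio binning idea concrete, here is a toy Python sketch of the read-partitioning step. It is only an illustration and not how any particular assembler implements it. The function names (`kmers`, `parent_specific_kmers`, `bin_read`), the k-mer size, and the simple "more maternal-only or more paternal-only k-mers" rule are all my own assumptions, and it ignores reverse complements and sequencing errors, which a real tool has to handle.

```python
def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets.

    k-mers seen in both parents carry no haplotype information,
    so they are dropped from both sets.
    """
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    return mat - pat, pat - mat


def bin_read(read, mat_only, pat_only, k):
    """Assign one offspring read to a haplotype bin (hypothetical rule)."""
    ks = kmers(read, k)
    m = len(ks & mat_only)  # k-mers found only in the mother
    p = len(ks & pat_only)  # k-mers found only in the father
    if m > p:
        return "maternal"
    if p > m:
        return "paternal"
    return "unassigned"  # no informative k-mers, or a tie


# Toy data: the parents differ at a single base.
maternal_reads = ["ACGTACGTAA"]
paternal_reads = ["ACGTTCGTAA"]
mat_only, pat_only = parent_specific_kmers(maternal_reads, paternal_reads, k=4)

bins = {"maternal": [], "paternal": [], "unassigned": []}
for read in ["GTACGT", "GTTCGT", "CGTAA"]:
    bins[bin_read(read, mat_only, pat_only, k=4)].append(read)

print(bins)
```

Each of the "maternal" and "paternal" bins would then be assembled separately, and reads left in "unassigned" (typically from homozygous regions) would have to be handled by whatever policy the pipeline chooses, for example adding them to both assemblies.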

