Question: How to visualize duplicated segments in a de novo Whole Genome Assembly
2.2 years ago by
gbdias • 90
I am searching for tools to visualize regions of a whole genome assembly that are represented two or more times.
- Most genome assemblers will collapse haplotypes that are similar enough, generating a "pseudo-haploid" final assembly.
- However, if the haplotypes diverge past a certain threshold, or if they carry large-scale structural variants, the assembler will likely output two contigs (or haplotigs), one for each haplotype. I am searching for a good way to visualize such haplotypes.
- I am aware of some methods to estimate the overall duplication level of the assembly, such as read mapping and depth analysis (like purge_haplotigs by Mike Roach) and the BUSCO duplication level.
- However, I'd like to know if a more visual tool is available.
- I am also aware of the MUMmer package and mummerplot, but those are not the easiest way to visualize duplicated regions, since MUMmer orders the contigs along a diagonal based on their best alignments to the reference. Using the --maxmatch option will display all alignments, but they are not ordered in an intelligible manner, so the whole plot becomes too cluttered.
Any suggestions are welcome.
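To make the goal more concrete, here is a minimal sketch of the kind of pre-filtering that could reduce the clutter before any plotting. It is not a visualization tool and not a method from any of the packages above. It assumes the assembly has been aligned against itself and that the alignments are available as PAF lines (the tab-separated format written by minimap2, for example). The function names and the `min_aln_len` cutoff are hypothetical and chosen only for illustration. For each contig it reports the fraction of its length covered by alignments to other contigs, so that only contigs with a high fraction need to be drawn in a dot plot.

```python
from collections import defaultdict


def parse_paf_line(line):
    """Extract the coordinate columns from one PAF record."""
    f = line.rstrip("\n").split("\t")
    return {
        "qname": f[0], "qlen": int(f[1]), "qstart": int(f[2]), "qend": int(f[3]),
        "tname": f[5], "tlen": int(f[6]), "tstart": int(f[7]), "tend": int(f[8]),
    }


def merged_length(intervals):
    """Total length covered by a list of (start, end) intervals, overlaps counted once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block (if any) and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def duplicated_fraction(paf_lines, min_aln_len=1000):
    """Map each contig name to the fraction of it aligned to a different contig.

    min_aln_len is an arbitrary illustrative cutoff on the query span.
    """
    intervals = defaultdict(list)
    lengths = {}
    for line in paf_lines:
        if not line.strip():
            continue
        rec = parse_paf_line(line)
        lengths[rec["qname"]] = rec["qlen"]
        lengths[rec["tname"]] = rec["tlen"]
        if rec["qname"] == rec["tname"]:
            continue  # ignore a contig aligned to itself
        if rec["qend"] - rec["qstart"] < min_aln_len:
            continue  # ignore short alignments
        intervals[rec["qname"]].append((rec["qstart"], rec["qend"]))
        intervals[rec["tname"]].append((rec["tstart"], rec["tend"]))
    return {name: merged_length(intervals[name]) / length
            for name, length in lengths.items()}
```

Sorting the resulting dictionary by value would give a ranked list of candidate haplotigs, and the dot plot could then be restricted to those contigs and their alignment partners.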