I'm trying to assemble a small (20 Mb) diploid fungal genome from MiSeq reads (~400 bp after merging, 100x coverage).
The tricky thing: it's heterozygous. The divergence is ~4%, but there are hundreds (thousands?) of loss of heterozygosity (LOH) regions, accounting in total for almost half of the genome...
Do you know of any assembler (or methodology) capable of handling such data?
I have tried many assemblers and tricks:
- de Bruijn graph: Velvet, ABySS, SOAPdenovo
- overlap-based: MIRA, Newbler
- clustering reads and then assembling with CAP3
but with rather poor results. Every time, the assembly was very fragmented (N50 ~10 kb): homozygous (LOH) regions were collapsed at ~200x coverage, while heterozygous regions were assembled separately at ~100x.
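
To make the coverage pattern concrete, here is a minimal Python sketch of how contigs could be labelled by read depth. It is only an illustration: the function name `classify_contigs`, the `tolerance` parameter, and the contig names are hypothetical, and the per-haplotype depth of 100x is an assumption taken from the numbers above. Mean depth per contig would have to come from mapping the reads back to the assembly.

```python
def classify_contigs(coverage_by_contig, haploid_cov=100.0, tolerance=0.25):
    """Label contigs by mean read depth relative to the per-haplotype depth.

    coverage_by_contig: dict mapping contig name -> mean read depth.
    haploid_cov: assumed depth of a single haplotype (100x here).
    tolerance: allowed relative deviation from the expected depth (assumption).
    """
    labels = {}
    for name, cov in coverage_by_contig.items():
        ratio = cov / haploid_cov
        if abs(ratio - 2.0) <= 2.0 * tolerance:
            # About twice the haplotype depth: both copies collapsed (LOH-like).
            labels[name] = "collapsed"
        elif abs(ratio - 1.0) <= tolerance:
            # About the haplotype depth: haplotypes assembled separately.
            labels[name] = "separated"
        else:
            labels[name] = "ambiguous"
    return labels


if __name__ == "__main__":
    # Hypothetical depths, for illustration only.
    example = {"contig_1": 205.0, "contig_2": 98.0, "contig_3": 140.0}
    for contig, label in classify_contigs(example).items():
        print(contig, label)
```

A split like this only describes the symptom; it does not fix the fragmentation at the boundaries between LOH and heterozygous regions.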
I would be happy to hear any ideas :)