Comparing Hi-C/Dovetail, BioNano, and PacBio assemblies. Pick the best one?
6.8 years ago
mmats010 ▴ 80

We work with a VERY complex genome. It is a pathogen with a large genome of roughly 240 Mb. By complex, I mean that we have done PacBio sequencing and FALCON assembly, yet only reached an N50 of about 0.6 Mb.

In order to consolidate our genome into manageable segments (e.g. pseudochromosomes), we decided to use both BioNano optical maps and Dovetail Hi-C. Both methods relied on high molecular weight DNA.

Alas, neither method significantly improved our assembly in terms of N50, even though both assembled the FALCON PacBio contigs in different ways. Dovetail increased N50 from about 0.60 Mb to 1.15 Mb, and a relaxed version of the BioNano pipeline increased the N50 to 916 kb. The default parameters, of course, gave a lower N50.
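
For anyone comparing the numbers above, here is a minimal sketch of how N50 is usually computed from contig or scaffold lengths. It is not taken from the FALCON, Dovetail, or BioNano pipelines; the function names `n50` and `fasta_lengths` are made up for illustration, and the FASTA reader assumes an uncompressed plain-text file.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("need at least one sequence length")


def fasta_lengths(path):
    """Yield the length of each record in an uncompressed FASTA file."""
    length = None
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if length is not None:
                    yield length
                length = 0
            elif length is not None:
                length += len(line.strip())
    # Emit the final record.
    if length is not None:
        yield length
```

Calling `n50(fasta_lengths("assembly.fasta"))` on each assembly (the file name is a placeholder) gives directly comparable values. Since N50 is always the length of one of the sequences, it can never exceed the largest scaffold, let alone the total assembly size.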

MY QUESTION IS... are there any commonly used programs that can consolidate Hi-C/optical map/PacBio assemblies? Many of the examples I see rely solely on Illumina assemblies, but those typically include mate-pair libraries, which nonetheless don't contain the same kind of data as our Illumina PE datasets + optical maps + Hi-C maps. I have looked around and found "Metassembler" and "GAM-NGS", as well as "runBNG" and "BioNanoAnalyst", but our group isn't very experienced in this kind of de novo assembly with a VERY difficult genome.

Any advice would be appreciated.

Mike

bionano dovetail pacbio denovo illumina • 5.5k views

Both methods relied on high molecular weight DNA.

And is that a problem for this organism?

Not an answer to your question, but it might be an idea to use some nanopore reads. With careful extraction, manipulation and library prep you can get reads of hundreds of kb (longest 970 kb). That might be able to span complex sequences...


I mention the HMW DNA because even though we have had success isolating it, the technologies we've employed to utilize it haven't really worked. Something chromatin-based, like what Phase Genomics does, might be better for us, though it isn't really an option now.

A group down the hall from us actually has a nanopore, but they haven't spoken very highly of it in the time that they've been using it. Perhaps we could ask anyway.


a relaxed version of the BioNano pipeline increased the N50 to 916 Mb.

How can a ~240 Mb genome have an N50 of 916 Mb?


Whoops, meant to write "Kb" there

6.8 years ago

So there are a few papers which did what you're trying to do.

In the goat genome paper they used both Hi-C and BioNano and found that Hi-C worked a bit better for them: http://www.nature.com/ng/journal/v49/n4/full/ng.3802.html They used Lachesis for Hi-C scaffolding with optimised parameters (somewhere in the supplementary), and then merged the Hi-C and BioNano scaffolds. Look at the supplementary for the details; it's not straightforward (a lot of it looks like manual checking of MUMmer output).

There is also a recent mosquito genome paper where they used their own pipeline to scaffold, with no BioNano data: http://science.sciencemag.org/content/early/2017/03/22/science.aal3327.full The pipeline is here: https://github.com/theaidenlab/3d-dna

Lastly, maybe you can do what the most recent wheat genome paper did and merge your two assemblies using MUMmer, though I'm not 100% sure how exactly: http://www.biorxiv.org/content/biorxiv/early/2017/07/03/159111.full.pdf

I have not tried GAM-NGS or Metassembler, but neither runBNG nor BioNanoAnalyst will merge your assemblies (I'm a co-author on those two papers).


Thanks, I'll take a look at those!

I think I understand why runBNG can't merge our Hi-C and BioNano assemblies, but could we not simply take the optical maps and the Hi-C FASTA output and assemble those together using runBNG? It sounds like, while it can't quite merge the FASTA files from our previous BioNano and Hi-C assemblies, it can at least use a better starting material (N50 = 1.15 Mb vs. N50 = 0.60 Mb) for the optical assembly.


Yes, runBNG can scaffold your assembly FASTA using the BioNano data, so that would work!

BTW, there's also OMSim, which can simulate optical mapping data from your assembly: https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btx293/3791407/OMSim-a-simulator-for-optical-map-data
