I have done a lot of de novo assembly with NGS data (Illumina NextSeq and MiSeq) and expect to get only a "pretty good" final assembly. However, I was under the impression that PacBio improves on this greatly. I'm struggling to finalize my assemblies, though.
Currently I have tried the following assemblers:
- HGAP4/Whatever the pbsmrtpipe de novo assembly pipeline is
- SOAPdenovo with hybrid mode (pacbio+illumina)
I generated my data from a multiplexed run on a PacBio Sequel machine and demultiplexed with
Of the assemblies, the hybrid did the best. The overall assembly contained ~500 contigs and was twice the expected genome size. However, when I filtered out contigs <10,000 base pairs, I ended up with 80 contigs whose combined length is extremely close to the expected genome size.
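For anyone who wants to reproduce the length filter, here is a minimal sketch of how it could be done in plain Python. It is not the exact command I ran: the function names (`read_fasta`, `filter_contigs`, `summarize`) and the file name in the usage note are hypothetical, and it assumes the assembly is an ordinary (possibly line-wrapped) FASTA file.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (contig name, sequence)


def read_fasta(handle: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence) pairs from an open FASTA file or any iterable of lines."""
    name = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            # Use the first whitespace-delimited token as the contig name.
            name = fields[0] if fields else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_contigs(records: Iterable[Record], min_len: int = 10_000) -> List[Record]:
    """Keep only contigs that are at least min_len bases long."""
    return [(name, seq) for name, seq in records if len(seq) >= min_len]


def summarize(records: Iterable[Record]) -> Dict[str, int]:
    """Return contig count, total length and N50 for a set of contigs."""
    lengths = sorted((len(seq) for _, seq in records), reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        # N50: length of the contig at which half the total assembly is covered.
        if running * 2 >= total:
            n50 = length
            break
    return {"contigs": len(lengths), "total_bp": total, "n50": n50}
```

Usage would look something like `with open("assembly.fasta") as fh: kept = filter_contigs(read_fasta(fh))` followed by `summarize(kept)`, where `assembly.fasta` stands in for the hybrid assembly output. Comparing `total_bp` before and after filtering against the expected genome size is how I arrived at the numbers above.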
What do I do from here? I've tried circlator, which seems to only try to circularize the contigs themselves. My next step is to consider quickmerge to possibly finalize the assembly.
Has anyone else hit a similar stumbling block in trying to finish a genome using PacBio reads?