Question: PacBio assemblies only ending up somewhere between 80 and 250 contigs.
5 days ago by
dylan.lawrence10 wrote:

I have done a lot of de novo assembly with NGS data (Illumina NextSeq and MiSeq) and expect to get only a "pretty good" final assembly. However, I was under the impression that PacBio improves on this greatly. I'm struggling to finalize assemblies, though.

Currently I have tried the following assemblers:

  • Canu
  • HGAP4 (or whatever the pbsmrtpipe de novo assembly pipeline is)
  • SOAPdenovo in hybrid mode (PacBio + Illumina)

I generated my data from a multiplexed run on a PacBio Sequel machine and demultiplexed with lima.

Of the assemblies, the hybrid did the best. The overall assembly contained ~500 contigs and was twice the expected genome size. However, if I filtered out contigs <10,000 base pairs, I ended up with 80 contigs whose combined length is extremely close to the expected genome size.
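For anyone who wants to reproduce the length filter described above, here is a minimal sketch in plain Python. It assumes the assembly is an ordinary FASTA file; the function names (`read_fasta`, `filter_contigs`, `summarize`) are made up for illustration and are not part of any assembler's tooling.

```python
import io
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (contig_name, sequence) pairs from a FASTA-formatted handle."""
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first word of the header as the contig name.
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_contigs(handle: TextIO, min_length: int = 10_000) -> Dict[str, str]:
    """Keep only contigs that are at least min_length bases long."""
    return {
        name: seq
        for name, seq in read_fasta(handle)
        if len(seq) >= min_length
    }


def summarize(contigs: Dict[str, str]) -> Tuple[int, int]:
    """Return (number of contigs, total assembled bases)."""
    return len(contigs), sum(len(seq) for seq in contigs.values())


if __name__ == "__main__":
    # Tiny in-memory example; the threshold is lowered so the toy data is usable.
    demo = io.StringIO(">contig_1 demo\nACGTACGT\n>contig_2\nAC\n")
    kept = filter_contigs(demo, min_length=5)
    print(summarize(kept))
```

With a real assembly, you would pass an open file handle instead of the in-memory example and compare the total from `summarize` against the expected genome size.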

What do I do from here? I've tried circlator, which seems to only try to circularize the contigs themselves. My next step is to consider quickmerge to possibly finalize the assembly.

Has anyone else hit a similar stumbling block in trying to finish a genome using PacBio reads?

pacbio assembly de novo • 81 views

What is the organism? I suppose it is a bacterium, as you were trying circlator. It is really strange that the final assembly is twice the expected genome size. Did you check for contaminants?
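One rough way to screen for contaminants, offered as a sketch and not a tested workflow, is to compare each contig's GC content against the value expected for the organism. The function names, the `expected_gc` value, and the `tolerance` threshold below are all assumptions chosen for illustration. A flagged contig is only a candidate for closer inspection, not proof of contamination.

```python
from typing import Dict, List


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def flag_gc_outliers(
    contigs: Dict[str, str],
    expected_gc: float,
    tolerance: float = 0.10,
) -> List[str]:
    """Return names of contigs whose GC fraction differs from expected_gc
    by more than tolerance (both given as fractions between 0 and 1)."""
    return sorted(
        name
        for name, seq in contigs.items()
        if abs(gc_fraction(seq) - expected_gc) > tolerance
    )


if __name__ == "__main__":
    # Toy sequences; real contigs would come from the assembly FASTA.
    toy = {"contig_1": "ATGCATGC", "contig_2": "GGGGCCCC"}
    print(flag_gc_outliers(toy, expected_gc=0.5))
```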

Reply written 5 days ago by h.mon

Not in depth, but I have performed Illumina sequencing on this same sample and there were no contaminants.

Reply written 5 days ago by dylan.lawrence