Question: PacBio genome assembly with canu shorter than expected
3.9 years ago by
Rob120 wrote:


I have ~70x PacBio reads and I did an assembly with Canu. I expected a 216 Mb genome, but I got a 146 Mb assembly of 465 contigs, with 1.3G of unassembled data.

I tried modifying some parameters (overlap length, coverage), but I can't get an assembly that reaches the size I expected. The changes either just increased the number of contigs or had no visible effect.

Is there any way to increase the size of my assembly by adjusting the assembler's parameters, or could there be a problem with my data? (I haven't polished yet because I'm having trouble with Quiver at the moment, but I don't expect Quiver to increase the size of my assembly. Am I right?)

What could explain this difference, and what can I do about it?

Thanks for your help!

canu pacbio assembly genome • 2.1k views
modified 10 months ago by Lancer0 • written 3.9 years ago by Rob120

Did you first try correcting your reads with a self-correction method like [...], then assembling and looking at the results?

modified 3.9 years ago • written 3.9 years ago by Medhat8.8k

I assume Canu corrects the reads by itself, so no, I didn't correct them. But I will try your tool, it seems useful.

modified 3.9 years ago • written 3.9 years ago by Rob120

You are right, Canu will correct the reads. I missed that.

modified 3.9 years ago • written 3.9 years ago by Medhat8.8k

Hi, have you got the expected genome size?

written 2.1 years ago by jianjianbo0
10 months ago by
Lancer0 wrote:
  1. Have you changed minReadLength (default 1000)? I suggest you look at your read length distribution, since many of your reads may be too short. Reads shorter than 1000 bp are discarded. Please read the Canu documentation and FAQ carefully. That will be helpful.
  2. In the assembly step, you can change correctedErrorRate according to your coverage:
     1. For low coverage: For less than 30X coverage, increase the allowed difference in overlaps by a few percent (from 4.5% to 8.5% (or more) with correctedErrorRate=0.105 for PacBio and from 14.4% to 16% (or more) with correctedErrorRate=0.16 for Nanopore), to adjust for inferior read correction. Canu will automatically reduce corMinCoverage to zero to correct as many reads as possible.
     2. For high coverage: For more than 60X coverage, decrease the allowed difference in overlaps (from 4.5% to 4.0% with correctedErrorRate=0.040 for PacBio, from 14.4% to 12% with correctedErrorRate=0.12 for Nanopore), so that only the better corrected reads are used. This is primarily an optimization for speed and generally does not change assembly continuity.
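To make the two suggestions above concrete, here is a minimal Python sketch (not part of Canu) that checks how much coverage survives a minReadLength cutoff and then looks up the correctedErrorRate value from the rule quoted above. The function names, the plain 4-line FASTQ assumption, and the choice to apply the rule to the coverage that remains after filtering are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional


def fastq_read_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the sequence length of each record in 4-line FASTQ text.

    Assumes every record is exactly four lines (no wrapped sequences).
    """
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record is the sequence
            yield len(line.strip())


def coverage_after_filter(lengths: List[int], genome_size: int,
                          min_read_length: int = 1000) -> Dict[str, float]:
    """Compare coverage before and after dropping reads shorter than the cutoff."""
    kept = [n for n in lengths if n >= min_read_length]
    return {
        "reads_total": len(lengths),
        "reads_kept": len(kept),
        "coverage_raw": sum(lengths) / genome_size,
        "coverage_kept": sum(kept) / genome_size,
    }


def suggested_corrected_error_rate(coverage: float,
                                   technology: str = "pacbio") -> Optional[float]:
    """Return the correctedErrorRate from the rule quoted above.

    Below 30X and above 60X use the quoted values; in between, None
    means "leave the default alone".
    """
    low = {"pacbio": 0.105, "nanopore": 0.16}
    high = {"pacbio": 0.040, "nanopore": 0.12}
    if coverage < 30:
        return low[technology]
    if coverage > 60:
        return high[technology]
    return None


if __name__ == "__main__":
    # Toy read lengths and a toy genome size, only to show the calls.
    stats = coverage_after_filter([500, 1500, 3000], genome_size=100)
    print(stats)
    rate = suggested_corrected_error_rate(stats["coverage_kept"])
    print("correctedErrorRate:", "default" if rate is None else rate)
```

For real data, the lengths could come from something like `list(fastq_read_lengths(open("reads.fastq")))`, where `reads.fastq` is a placeholder file name. If a large share of the bases sits in reads below the cutoff, the effective coverage may be much lower than the nominal 70x.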
modified 10 months ago • written 10 months ago by Lancer0