Question: Genome Assembly at low coverage?
2.2 years ago by
AP90 wrote:

Hi all,

I am about to run a PacBio sequencing project (Sequel), and after a first run on a single cell it looks like I will end up with an overall coverage of 8X if I continue the run. My knowledge of genome assembly is limited, but I am curious to know what folks think:

Would it be OK to do de novo assembly with 8X coverage and good quality fragments of ~10 kb? I know it is of course possible, but I am wondering whether it is worth the money at this point or whether I should start again from scratch (although I am not sure that would really change anything).


coverage assembly • 1.1k views
written 2.2 years ago by AP90

From PacBio's recommendations:

We recommend PacBio-only de novo assembly when it is possible to get at least 50X PacBio coverage.

For a hybrid assembly involving both PacBio and short read sequencing, PBcR and ECTools can work well with around 20X PacBio coverage. If a high quality set of scaffolds exists, then PBJelly 2 can be used. We recommend at least PacBio 5X coverage to fill gaps; higher coverage enables better consensuses in gap filled regions and increases the number of addressable gaps, as random sampling at lower coverage can lead to coverage gaps.

You may get something but don't expect a lot. Then you may be pleasantly surprised. What is the expected genome size BTW?
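
To put rough numbers on the "coverage gaps from random sampling" point, here is a minimal sketch of the classic Lander-Waterman estimate. It assumes reads land uniformly at random and are all the same length, and it ignores sequencing error, repeats, and the overlap length an assembler actually needs, so treat the output as an optimistic bound rather than a prediction. The function name and the 1 Gb / 10 kb inputs are illustrative assumptions only.

```python
import math


def lander_waterman(genome_size, read_length, coverage, min_overlap=0):
    """Idealized Lander-Waterman estimates for a shotgun sequencing run.

    Assumes uniformly random read placement and a fixed read length.
    min_overlap is the overlap (in bp) needed to join two reads; 0 is
    the most optimistic setting.
    """
    n_reads = coverage * genome_size / read_length
    theta = min_overlap / read_length  # fraction of a read needed for overlap
    frac_uncovered = math.exp(-coverage)  # Poisson probability of zero depth
    expected_contigs = n_reads * math.exp(-coverage * (1 - theta))
    return {
        "n_reads": n_reads,
        "frac_uncovered": frac_uncovered,
        "uncovered_bp": frac_uncovered * genome_size,
        "expected_contigs": expected_contigs,
    }


if __name__ == "__main__":
    # Hypothetical inputs: 1 Gb genome, 10 kb reads, several coverage levels.
    for cov in (5, 8, 20, 50):
        est = lander_waterman(1_000_000_000, 10_000, cov)
        print(
            f"{cov:>2}X: {est['uncovered_bp']:,.0f} bp with zero depth, "
            f"~{est['expected_contigs']:,.0f} contigs in the ideal case"
        )
```

Even in this idealized model, low coverage leaves some bases with zero depth. In practice the limiting factor at low coverage is more likely to be too few reads per position to correct errors in raw long reads, which this sketch does not model.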

written 2.2 years ago by genomax83k

Thanks for the quick reply. Yes I am aware of the PacBio recommendations. The expected size is ~ 1Gb.

written 2.2 years ago by AP90

If you've got a good reference and aren't planning to call SNPs or something then you might be OK for certain use cases. No point throwing the data away though. You can always combine the library if you do a second run and improve your coverage still.

written 2.2 years ago by Joe16k

Yes indeed, I am not planning on throwing the data away - although a new library would require a new individual. So not ideal to combine data from two different individuals.

written 2.2 years ago by AP90

You can always run more cells using the current library, right?

modified 2.2 years ago • written 2.2 years ago by genomax83k

Not really, unfortunately. I won't have enough material for more cells. I only have enough to reach a coverage of about 8X.

written 2.2 years ago by AP90

Is this human/animal data?

written 2.2 years ago by Joe16k

It is animal data.

written 2.1 years ago by AP90

I'd just assemble it and see what you get. If your assembly stats look reasonable (decent N50s etc.), then I would probably, cautiously, continue. It depends on what exactly you intend to do with this data. Even with a very good 8X assembly you probably won't be able to call SNPs etc.
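
For reference, N50 is straightforward to compute from a list of contig lengths. The following is a minimal sketch; the function name and the toy lengths are made up for illustration, and a real assembly report from a dedicated tool would give more than this one number.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Toy contig lengths (hypothetical), in bp.
    contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("N50:", n50(contigs))
```

N50 only describes contiguity. A fragmented low-coverage assembly can still be usable for some purposes, and a high N50 says nothing about base-level accuracy.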

written 2.1 years ago by Joe16k