PacBio coverage question for a plant genome
7.8 years ago
arnstrm ★ 1.8k

Hi all,

Can anybody suggest how much PacBio coverage we need for a de novo assembly of a 1.7 Gb genome? We already have about 47X of Illumina reads (paired-end, plus mate pairs of 9 and 11 kb), but the assembly we generated is not great (scaffold N50 of 36 kb, with 50% Ns). Most importantly, the genome has ~50% repeat content. We have a shared budget that we need to split between PacBio DNA sequencing and Iso-Seq (for future genome annotation: this species doesn't have any transcriptome data). What do you guys think is the ideal coverage we need to aim for to get a quality assembly?

Any suggestions on this will be greatly appreciated!

coverage Assembly pacbio • 4.1k views

Just as a gut feeling: to complement your Illumina data, I'd say 10-15x coverage. But I might be way off here!
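For budgeting, a coverage target translates directly into a sequencing yield. Here is a minimal sketch of that arithmetic for the 1.7 Gb genome from the question; the function name `bases_needed` is just an illustrative helper, and the calculation ignores read filtering and uneven coverage.

```python
GENOME_SIZE = 1_700_000_000  # 1.7 Gb, taken from the question


def bases_needed(genome_size, coverage):
    """Total sequenced bases required for a given average coverage."""
    return genome_size * coverage


# Yield needed for the 10-15x range suggested above
for cov in (10, 15):
    gb = bases_needed(GENOME_SIZE, cov) / 1e9
    print(f"{cov}x -> {gb:.1f} Gb of PacBio data")
```

So 10-15x on this genome means roughly 17 to 25.5 Gb of raw PacBio bases, before any losses to filtering.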

7.8 years ago
thackl ★ 3.0k

It depends a lot on what you are planning to do with your PacBio data. For assembly and error correction, there are different strategies:

Also, I've seen a lot of issues with library prep and size selection for PacBio on large, repetitive plant genomes. Make sure to test your protocols. Good subread length is very important if you do low-coverage PacBio. With short subreads you may get a theoretical coverage of 10X, but since all subreads from one read stack onto each other, you'll have far fewer truly overlapping fragments than you need for assembly/scaffolding.
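To illustrate the stacking problem, here is a rough sketch that compares nominal coverage (all subread bases) with a more conservative estimate that counts only the longest subread per read. The input format (a list of `(read_id, subread_length)` pairs), the function names, and the toy numbers are assumptions for illustration, not output from any PacBio tool.

```python
from collections import defaultdict


def nominal_coverage(subreads, genome_size):
    """Coverage if every subread base is counted."""
    return sum(length for _, length in subreads) / genome_size


def effective_coverage(subreads, genome_size):
    """Coverage counting only the longest subread per read.

    Subreads from the same read cover the same molecule, so the
    extra passes add depth but no new overlapping fragments.
    """
    longest = defaultdict(int)
    for read_id, length in subreads:
        longest[read_id] = max(longest[read_id], length)
    return sum(longest.values()) / genome_size


# Toy data (hypothetical): one read with three short subreads,
# one read with a single long subread, and a tiny 1 kb "genome".
toy_subreads = [
    ("read1", 2000),
    ("read1", 2000),
    ("read1", 1500),
    ("read2", 9000),
]
toy_genome_size = 1000

print(nominal_coverage(toy_subreads, toy_genome_size))
print(effective_coverage(toy_subreads, toy_genome_size))
```

The gap between the two numbers grows as inserts get shorter relative to the polymerase read length, which is why size selection matters so much at low coverage.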

