Question: PacBio coverage question for a plant genome
2.4 years ago
Ames, IA
arnstrm wrote:

Hi all,

Can anybody suggest how much PacBio coverage we need for a de novo assembly of a 1.7 Gb genome? We already have about 47X Illumina reads (paired end, plus mate pairs of 9 and 11 Kb), and the assembly we generated is not great (scaffold N50 of 36 Kb, but 50% Ns). Most importantly, the genome has ~50% repeat content. We have a shared budget that we need to split between PacBio DNA sequencing and Iso-Seq (for future genome annotation: this species doesn't have any transcriptome data). What do you guys think is the ideal coverage to aim for to get a quality assembly?

Any suggestions on this will be greatly appreciated!


Just as a gut feeling, to complement your Illumina data I'd say 10-15x coverage. But I might be way off here!
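To put that range in terms of sequencing yield, here is a quick back-of-the-envelope sketch. It only uses the 1.7 Gb genome size from the question and the usual definition of average coverage (total bases divided by genome size); the function name `bases_needed` is made up for illustration.

```python
# Rough yield estimate: total bases = target coverage x genome size.
GENOME_SIZE_BP = 1_700_000_000  # 1.7 Gb, taken from the question


def bases_needed(coverage, genome_size_bp=GENOME_SIZE_BP):
    """Total sequenced bases required for a target average coverage."""
    return coverage * genome_size_bp


for cov in (10, 15):
    print(f"{cov}x -> {bases_needed(cov) / 1e9:.1f} Gb of PacBio reads")
```

So 10-15x works out to roughly 17 to 25.5 Gb of raw sequence, before any filtering.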

2.4 years ago
European Union
thackl wrote:

It depends a lot on what you are planning to do with your PacBio data. For assembly and error correction, there are different strategies:

Also, I've seen a lot of issues with library prep and size selection for PacBio with large, repetitive plant genomes. Make sure to test your protocols. Good subread length is very important if you go for low-coverage PacBio. With short subreads you may get a theoretical coverage of 10X, but since all subreads from one read stack on top of each other (they cover the same molecule), you'll have far fewer truly overlapping fragments than you need for assembly/scaffolding.
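A minimal sketch of why stacked subreads inflate the coverage number. It assumes you have already extracted (ZMW id, subread length) pairs from your data; the function name and the toy values are hypothetical. Counting only the longest subread per ZMW as "effective" coverage is a simplification for illustration, not an established metric.

```python
from collections import defaultdict


def coverage_estimates(subreads, genome_size_bp):
    """Return (nominal, effective) coverage.

    subreads: iterable of (zmw_id, subread_length_bp) pairs.
    nominal:   all subread bases / genome size.
    effective: longest subread per ZMW only / genome size, since subreads
               from the same ZMW cover the same molecule (assumption: one
               independent fragment per ZMW).
    """
    total_bases = 0
    longest_per_zmw = defaultdict(int)
    for zmw_id, length in subreads:
        total_bases += length
        longest_per_zmw[zmw_id] = max(longest_per_zmw[zmw_id], length)
    nominal = total_bases / genome_size_bp
    effective = sum(longest_per_zmw.values()) / genome_size_bp
    return nominal, effective


# Toy data: zmw1 has a short insert read three times, zmw2 one long subread.
toy = [("zmw1", 2000), ("zmw1", 2000), ("zmw1", 1500), ("zmw2", 9000)]
nominal, effective = coverage_estimates(toy, genome_size_bp=10_000)
print(f"nominal {nominal:.2f}x, effective {effective:.2f}x")
```

The shorter the inserts, the more passes per molecule and the wider the gap between the two numbers, so it is worth checking this on a pilot run before committing the budget.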

