PacBio coverage question for a plant genome
7.8 years ago
arnstrm ★ 1.8k

Hi all,

Can anybody suggest how much PacBio coverage we need for a de novo assembly of a 1.7 Gb genome? We already have about 47X Illumina reads (paired-end, plus mate-pair libraries of 9 and 11 Kb), and the assembly we generated is not great (scaffold N50 of 36 Kb, but 50% Ns). Most importantly, the genome has ~50% repeat content. We have a shared budget that we need to split between PacBio DNA sequencing and Iso-Seq (for future genome annotation; this species doesn't have any transcriptome data). What do you think is the ideal coverage we need to aim for to get a quality assembly?

Any suggestions on this will be greatly appreciated!

coverage Assembly pacbio • 4.1k views

Just as a gut feeling: to complement your Illumina data, I'd say 10-15X coverage. But I might be way off here!
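To put the 10-15X figure in terms of sequencing yield, here is a rough back-of-the-envelope sketch. The 1.7 Gb genome size comes from the question; the 8 kb mean read length is purely a hypothetical placeholder, so swap in whatever your library actually delivers.

```python
import math

GENOME_SIZE_BP = 1_700_000_000  # 1.7 Gb, as stated in the question


def bases_needed(coverage, genome_size_bp=GENOME_SIZE_BP):
    """Total sequenced bases required for a given average coverage."""
    return coverage * genome_size_bp


def reads_needed(coverage, mean_read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Approximate number of reads, given an assumed mean read length."""
    return math.ceil(bases_needed(coverage, genome_size_bp) / mean_read_length_bp)


if __name__ == "__main__":
    ASSUMED_MEAN_READ_BP = 8_000  # hypothetical value, not from the thread
    for cov in (10, 15):
        gb = bases_needed(cov) / 1e9
        n_reads = reads_needed(cov, ASSUMED_MEAN_READ_BP)
        print(f"{cov}X -> {gb:.1f} Gb of sequence, ~{n_reads:,} reads")
```

This is average coverage only; it says nothing about how evenly the reads are spread or how well they span the repeats.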

7.8 years ago
thackl ★ 3.0k

It depends a lot on what you are planning to do with your PacBio data. For assembly and error correction, there are different strategies:

Also, I've seen a lot of issues with library prep and size selection for PacBio on large, repetitive plant genomes. Make sure to test your protocols. Good subread length is very important if you go for low-coverage PacBio. With short subreads you may get a theoretical coverage of 10X, but since all subreads from one read stack onto each other, you'll have far fewer truly overlapping fragments than you need for assembly/scaffolding.
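A minimal sketch of the stacking effect described above: it compares coverage computed from all subreads with coverage computed from only the longest subread per ZMW, since subreads from the same ZMW are repeated passes over the same molecule. The input format (pairs of ZMW ID and subread length) and the function name are assumptions for illustration, not the output of any particular tool.

```python
from collections import defaultdict


def coverage_estimates(subreads, genome_size_bp):
    """Return (raw, effective) coverage.

    subreads: iterable of (zmw_id, length_bp) pairs (hypothetical format).
    raw       counts every subread.
    effective counts only the longest subread per ZMW.
    """
    total_bp = 0
    longest_per_zmw = defaultdict(int)
    for zmw_id, length_bp in subreads:
        total_bp += length_bp
        if length_bp > longest_per_zmw[zmw_id]:
            longest_per_zmw[zmw_id] = length_bp
    raw = total_bp / genome_size_bp
    effective = sum(longest_per_zmw.values()) / genome_size_bp
    return raw, effective


if __name__ == "__main__":
    # Toy data: three ZMWs on a 10 kb "genome"
    toy_subreads = [
        ("zmw1", 2000), ("zmw1", 2000), ("zmw1", 1500),
        ("zmw2", 6000),
        ("zmw3", 3000), ("zmw3", 2500),
    ]
    raw, effective = coverage_estimates(toy_subreads, 10_000)
    print(f"raw: {raw:.2f}X, effective: {effective:.2f}X")
```

The shorter the inserts, the more passes each ZMW produces and the wider the gap between the two numbers, which is why the nominal coverage can overstate what is usable for assembly or scaffolding.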
