Question: proovread Illumina coverage for hybrid genome assembly
3.8 years ago, Josué Barrera wrote:

Hello everyone!

I'm planning to use proovread to correct some PacBio sequences and use them to assemble a plant genome (around 400 Mbp). I currently have 30x coverage of PacBio data, 88x coverage of HiSeq2000 data and 24x coverage of MiSeq data (after quality filters and paired-end merging of Illumina sequences). The proovread manual suggests a coverage around 30-50x.

Is there any reason, aside from computational time, to limit the short-read coverage to ≤50x? Would a higher coverage (112x) improve the results obtained from proovread?

Or is there any other hybrid method you suggest I could use to benefit from both my Illumina and PacBio data (e.g., DBG2OLC, ABySS)?
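
As a back-of-the-envelope check on the figures quoted above, the sketch below adds up the Illumina coverage and estimates what fraction of the short reads would be kept to land near the top of the 30-50x range. It assumes coverage is simply total sequenced bases divided by genome size; the function names are illustrative and not part of proovread.

```shell
#!/bin/sh
# Coverage arithmetic for the numbers in the question.
# Assumption: coverage = total sequenced bases / genome size.

# total_bases GENOME_SIZE_BP COVERAGE -> bases needed for that coverage
total_bases() {
    awk -v g="$1" -v c="$2" 'BEGIN { printf "%.0f\n", g * c }'
}

# subsample_fraction CURRENT_COVERAGE TARGET_COVERAGE -> fraction of reads to keep
subsample_fraction() {
    awk -v cur="$1" -v tgt="$2" 'BEGIN {
        f = tgt / cur
        if (f > 1) f = 1      # cannot keep more reads than exist
        printf "%.3f\n", f
    }'
}

illumina_cov=$((88 + 24))                  # HiSeq2000 + MiSeq = 112x
total_bases 400000000 "$illumina_cov"      # prints 44800000000 (about 44.8 Gbp)
subsample_fraction "$illumina_cov" 50      # prints 0.446
```

So a 50x subset would be a little under half of the available short reads; whether that or the full 112x gives the better correction is the open question here.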


3.8 years ago, Medhat wrote:

More coverage is always better. With proovread, though, the author suggests correcting your PacBio reads in chunks rather than all at once, because of the memory it needs. From the manual:

"Don't run proovread on entire SMRT cells directly, it will only blast your memory and take forever. Split your data in handy chunks of a few Mbp first:"

The commands suggested there are:

# located in /path/to/proovread/bin
SeqChunker -s 20M -o pb-%03d.fq pb-subreads.fq

proovread -l pb-001.fq -s reads.fq [-u unitigs.fa] --pre pb-001
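
The command above corrects only the first chunk. A minimal sketch for looping over every chunk follows. It assumes the chunks are named pb-NNN.fq as produced by the SeqChunker call above and that the short reads are in reads.fq. The `correct_chunks` function and the `PROOVREAD` variable are hypothetical helpers, not part of proovread; only the `-l`, `-s` and `--pre` options shown above are used.

```shell
#!/bin/sh
# Run proovread once per chunk produced by SeqChunker.
# Assumption: PROOVREAD points at the proovread executable (defaults to
# whatever "proovread" resolves to on PATH).
PROOVREAD=${PROOVREAD:-proovread}

# correct_chunks SHORT_READS CHUNK...
# Calls proovread for each chunk and stops at the first failure.
correct_chunks() {
    short_reads=$1
    shift
    for chunk in "$@"; do
        prefix=${chunk%.fq}          # pb-001.fq -> pb-001
        "$PROOVREAD" -l "$chunk" -s "$short_reads" --pre "$prefix" || return 1
    done
}

# Usage (not executed here):
#   correct_chunks reads.fq pb-[0-9][0-9][0-9].fq
```

Each chunk is handled by its own proovread call, so the same calls could be submitted as separate cluster jobs instead of run in sequence.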

On the other hand, if you have more than 20x PacBio coverage, you can try Canu.

If correction speed matters to you, you can use LoRDEC.

As for assembly: if you want a hybrid assembly, you can use PBcR (although, again, its author suggests using Canu instead). You can also use DBG2OLC, which is relatively fast.


Thank you very much for your reply!

I think I'll try both proovread + canu and DBG2OLC to see which gives me the best results.

written 3.8 years ago by Josué Barrera

Canu takes uncorrected PacBio reads, so there is no need to run proovread before Canu.

written 3.8 years ago by Medhat

The Canu documentation (release 1.3) still recommends polishing for 'best accuracy' (sic).

written 3.6 years ago by jahn.davik