Question: Polish PacBio assembly with Hi-C reads
6 days ago, alex.zaccaron wrote:


I have a small haploid genome (85 Mb) that was assembled with Canu from ~100x of PacBio Sequel reads. In addition, 40 Gbp of Illumina Hi-C reads were sequenced for scaffolding. The assembly has been polished with Arrow, but there is no separate Illumina shotgun dataset to polish with Pilon. I was wondering whether I could instead use the Hi-C reads for the Illumina polishing step, by mapping one or both ends of each read pair to the assembly as single-end reads. However, given the nature of Hi-C reads, I am a little concerned that the uneven coverage and chimeric reads could have a negative impact. Does anyone have previous experience with this approach? Is it a good idea to use Hi-C reads to polish an assembly?



The uneven coverage of Hi-C data means polishing will be uneven too, with low-coverage regions left unpolished. As for the chimeric reads, you could keep only reads that map end-to-end to the assembly, e.g., by filtering out clipped alignments with samclip.
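
To make the end-to-end filtering idea concrete, here is a minimal Python sketch of what such a filter does. It is not samclip's implementation, only an illustration under stated assumptions: the input is plain SAM text, each Hi-C mate was mapped as a single-end read, and the function names (`clipped_ends`, `is_end_to_end`, `filter_sam`) and the `max_clip` parameter are hypothetical. An alignment is kept only if it is mapped, primary, and has no more than `max_clip` soft- or hard-clipped bases at either end.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_ends(cigar):
    """Return (left, right): total soft/hard-clipped bases at each end."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    left = 0
    for n, op in ops:
        if op not in "SH":
            break
        left += n
    right = 0
    for n, op in reversed(ops):
        if op not in "SH":
            break
        right += n
    return left, right


def is_end_to_end(sam_line, max_clip=0):
    """True if the record is a mapped, primary alignment with at most
    max_clip clipped bases at each end (max_clip is an assumed knob)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    if flag & 0x4 or cigar == "*":   # unmapped
        return False
    if flag & 0x900:                 # secondary (0x100) or supplementary (0x800)
        return False
    left, right = clipped_ends(cigar)
    return left <= max_clip and right <= max_clip


def filter_sam(lines, max_clip=0):
    """Yield header lines unchanged and only end-to-end alignment records."""
    for line in lines:
        if line.startswith("@") or is_end_to_end(line, max_clip):
            yield line
```

In practice the records would be streamed from the aligner's output, and the surviving alignments sorted and indexed before being handed to the polisher. Reads spanning a Hi-C ligation junction typically appear as a clipped primary alignment plus a supplementary one, so both are dropped here.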

written 6 days ago by h.mon

Thanks, h.mon, for the suggestion. As you pointed out, using only end-to-end mapped reads could still be useful for polishing the regions of the genome that have enough coverage. I will give it a shot and see how it looks.

written 6 days ago by alex.zaccaron