Question: Improving an existing assembly using PacBio reads
4 weeks ago, bhagyathimmappa wrote:

Hello all,

I am new to the bioinformatics field. I have an existing assembly for a fungus (14.5 Mb), but it is not an end-to-end assembly. We have had PacBio sequencing done for the same strain of the organism.

a. I want to check the quality of these reads (something like FastQC).
b. I want to improve the existing assembly using PacBio long reads.
c. I want to calculate the N50 value for the final assembly.

Please forgive me if this question has already been asked in some other forum; I tried my best to find the answer.

Thanks a lot in advance :)

Bhagya C T

Modified 29 days ago • written 4 weeks ago by bhagyathimmappa

This recent tutorial may be of great help to you: Polish PacBio assembly with latest PacBio tools : an affordable solution for everyone

Reply written 4 weeks ago by Kevin Blighe

Thank you Kevin.

Nice one, but it only covers polishing. I want to know how to improve the scaffolding using PacBio long reads.

Reply written 29 days ago by bhagyathimmappa

Okay, I would ask the person who created that tutorial, Roxane I believe, as she appears to have been working in that area for the past few years. Apologies that I cannot help further.

Reply written 29 days ago by Kevin Blighe

29 days ago, colindaven wrote:

a) Use Canu to do a PacBio assembly. It produces an HTML report with information about the PacBio read quality. BBMap will also give you great stats on the reads.

b) Tell us more about the stats of the PacBio sequencing. You're probably better off doing an entirely new assembly from the PacBio reads alone with Canu, then using any existing reads to polish that assembly with Pilon.

c) Again, use the tools from a).

Good luck. Canu and BBMap can be easily installed using Bioconda.
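
For point c), N50 can also be computed by hand, which helps when comparing the existing assembly with a new one. The following is a minimal sketch in plain Python, not the method used by Canu or BBMap; the function names and the toy records are made up for illustration.

```python
from typing import Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            # Sequence lines may be wrapped, so keep adding to the current record.
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Toy records (hypothetical data): contig lengths 8, 6, 4 and 2.
toy_fasta = [">c1", "ACGT", "ACGT", ">c2", "ACGTAC", ">c3", "ACGT", ">c4", "AC"]
toy_lengths = read_fasta_lengths(toy_fasta)
print(len(toy_lengths), sum(toy_lengths), n50(toy_lengths))
```

For a real assembly, pass an open file handle instead of the toy list, for example `read_fasta_lengths(open("assembly.fasta"))`, where `assembly.fasta` is a placeholder file name. Running it on both the old and the new assembly gives directly comparable numbers.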

Modified 29 days ago • written 29 days ago by colindaven

29 days ago, bhagyathimmappa wrote:

Dear colindaven, I have used Canu and tried to assemble; here are the global stats.


      40 (expected coverage)
       0 (don't use overlaps shorter than this)
   0.000 (don't use overlaps with erate less than this)
   1.000 (don't use overlaps with erate more than this)

       0 (< 0.0000 fraction error)
       0 (> 0.4095 fraction error)
       0 (< 0 bases long)
       0 (> 2097151 bases long)

12147773 (too many overlaps, discard these shortest ones)
 2071295 (longest overlaps)
14219068 (all overlaps)

      66 (no overlaps)
   11145 (no overlaps filtered)
   22024 (<  50% overlaps filtered)
   33266 (<  80% overlaps filtered)
   37123 (<  95% overlaps filtered)
   43712 (< 100% overlaps filtered)

I do not think PacBio alone will give me a good assembly: the Canu PacBio assembly has 120 contigs, whereas the existing assembly has only 24. Based on the PacBio read stats, I thought it was reasonable to try filling the gaps.

Please share your input so that I can take this forward.

Thanks,
Bhagya C T

Written 29 days ago by bhagyathimmappa

What PacBio coverage do you have? Good PacBio data + Canu should assemble a genome of this size with ease.
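
Coverage here means total sequenced bases divided by genome size. A rough way to estimate it is sketched below in plain Python, assuming the reads are available as an uncompressed FASTQ with exactly four lines per record; the function names and the toy reads are hypothetical, and the 14.5 Mb genome size is the figure given in the original question.

```python
from typing import Iterable, List


def fastq_read_lengths(lines: Iterable[str]) -> List[int]:
    """Read lengths from FASTQ lines, assuming strict 4-line records."""
    # In a 4-line record the sequence is the second line (index 1 modulo 4).
    return [len(line.strip()) for i, line in enumerate(lines) if i % 4 == 1]


def estimated_coverage(read_lengths: Iterable[int], genome_size: int) -> float:
    """Average depth: total sequenced bases divided by the genome size."""
    return sum(read_lengths) / genome_size


# Toy reads (hypothetical data): 12 + 8 = 20 bases over a 10-base "genome".
toy_fastq = [
    "@r1", "ACGTACGTACGT", "+", "!!!!!!!!!!!!",
    "@r2", "ACGTACGT", "+", "!!!!!!!!",
]
toy_reads = fastq_read_lengths(toy_fastq)
print(estimated_coverage(toy_reads, genome_size=10))
```

For the real data, pass an open file handle for the reads and use `genome_size=14_500_000`. If the reads were delivered as BAM rather than FASTQ, they would need converting first, which this sketch does not cover.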

Reply written 27 days ago by colindaven