Question: Improving an existing assembly using PacBio reads
10 months ago by
bhagyathimmappa10 wrote:

Hello all,

I am new to the bioinformatics field. I have an assembly available for a fungus (14.5 Mb), but it is not an end-to-end assembly. We have had PacBio sequencing done for the same strain of the organism.

a. I want to check the quality of these reads (something like FastQC).
b. I want to improve the existing assembly using PacBio long reads.
c. I want to calculate the N50 value for the final assembly.

Please forgive me if this question has already been asked in another forum; I tried my best to find the answer.

Thanks a lot in advance :)

Bhagya C T


This recent tutorial may be of great help to you: Polish PacBio assembly with latest PacBio tools : an affordable solution for everyone

Reply written 10 months ago by Kevin Blighe

Thank you Kevin.

Nice one, but it only covers polishing. I want to know how to improve the scaffolding using PacBio long reads.

Reply written 10 months ago by bhagyathimmappa

Okay, I would ask the person who created that tutorial, Roxane I believe, as she appears to have been working in that area for the past few years. Apologies that I cannot help further.

Reply written 10 months ago by Kevin Blighe
10 months ago by
colindaven (Hannover Medical School) wrote:

a) Use Canu to do a PacBio assembly. It gives you an HTML report with output about the PacBio read quality. BBMap's stats.sh or readlength.sh will give you useful statistics on the reads.

b) Tell us more about the stats of the PacBio sequencing. You're probably better off doing an entirely new assembly with PacBio alone using Canu, then using any existing reads to polish that PacBio assembly with Pilon.

c) Again, use stats.sh from a).

Good luck. Canu and BBMap can be easily installed using Bioconda.
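If you would like to see what the N50 figure reported by a tool such as stats.sh actually means, here is a minimal Python sketch. It is not how BBMap computes it internally; it simply sorts contig lengths from longest to shortest and returns the length at which the running total first reaches half of the assembly size. The function names are made up for illustration.

```python
def fasta_lengths(lines):
    """Yield one sequence length per record from an iterable of FASTA lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to half the
        # assembly size defines the N50.
        if running >= half:
            return length
    return 0
```

To try it on an assembly, open the FASTA file and pass the file handle through both functions, for example `n50(fasta_lengths(open("assembly.fasta")))`, where `assembly.fasta` is a placeholder file name.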

ADD COMMENTlink modified 10 months ago • written 10 months ago by colindaven740
10 months ago by
bhagyathimmappa10 wrote:

Dear colindaven, I used Canu to try an assembly; here are the global stats.

PARAMETERS:

 40 (expected coverage)
  0 (don't use overlaps shorter than this)

0.000 (don't use overlaps with erate less than this)
1.000 (don't use overlaps with erate more than this)

OVERLAPS:

IGNORED:

       0 (< 0.0000 fraction error)
       0 (> 0.4095 fraction error)
       0 (< 0 bases long)
       0 (> 2097151 bases long)

FILTERED:

12147773 (too many overlaps, discard these shortest ones)

EVIDENCE:

 2071295 (longest overlaps)

TOTAL:

14219068 (all overlaps)

READS:

      66 (no overlaps)
   11145 (no overlaps filtered)
   22024 (<  50% overlaps filtered)
   33266 (<  80% overlaps filtered)
   37123 (<  95% overlaps filtered)
   43712 (< 100% overlaps filtered)

I do not think PacBio alone will give me a good assembly: the Canu PacBio assembly has 120 contigs, whereas the existing assembly has only 24 contigs. Based on the PacBio read stats, I thought it would be reasonable to try filling the gaps instead.

Please share your input so that I can take this forward.

Thanks,
Bhagya C T


What PacBio coverage do you have? Good PacBio data plus Canu should assemble a genome of this size with ease.
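A rough way to estimate coverage is to divide the total number of sequenced bases by the genome size. The Python sketch below assumes the reads are in a plain four-line-per-record FASTQ file and uses the 14.5 Mb genome size mentioned in the question; the function names are made up for illustration.

```python
GENOME_SIZE = 14_500_000  # 14.5 Mb, taken from the original question


def fastq_read_lengths(lines):
    """Yield read lengths, assuming strict four-line FASTQ records."""
    for i, line in enumerate(lines):
        # Line 2 of every 4-line record holds the sequence.
        if i % 4 == 1:
            yield len(line.strip())


def estimated_coverage(read_lengths, genome_size=GENOME_SIZE):
    """Return total sequenced bases divided by genome size."""
    return sum(read_lengths) / genome_size
```

For example, `estimated_coverage(fastq_read_lengths(open("reads.fastq")))` would give the approximate fold coverage, where `reads.fastq` is a placeholder file name. This is raw coverage before any read correction or trimming.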

Reply written 10 months ago by colindaven