Question: What is the appropriate assembler for PacBio long reads?
16 months ago by
United States
bioinforesearchquestions270 wrote:

Hi folks,

We had 10 bacterial isolates sequenced on the PacBio platform, so we now have long reads. Five of them have no reference strain, and the other five have a closely related strain available.

I have to identify antimicrobial resistance genes in these 10 bacteria. This is the first time I am handling PacBio data.

Is there an assembler that handles long reads? As of now, I don't know the coverage of the samples. Please point me to a reference article if you have come across one for this kind of requirement. I found HGAP from PacBio (https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/HGAP#implementations).
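Since the coverage is not known yet, a rough estimate is total sequenced bases divided by the expected genome size. Below is a minimal sketch, assuming the reads are available as plain 4-line FASTQ (PacBio data often arrive as BAM and would need converting first) and assuming you can guess the genome size from a related species. The function names and the toy data are hypothetical, for illustration only.

```python
import io


def fastq_read_lengths(handle):
    """Yield the length of each read in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the sequence is the second line of each record
            yield len(line.strip())


def estimate_coverage(read_lengths, genome_size):
    """Return (read count, total bases, mean fold coverage)."""
    lengths = list(read_lengths)
    total = sum(lengths)
    return len(lengths), total, total / genome_size


# Toy data: two reads (10 bp and 5 bp) against an assumed 5 bp "genome".
toy_fastq = io.StringIO(
    "@r1\nACGTACGTAC\n+\n!!!!!!!!!!\n"
    "@r2\nACGTA\n+\n!!!!!\n"
)
n_reads, total_bases, coverage = estimate_coverage(
    fastq_read_lengths(toy_fastq), genome_size=5
)
print(n_reads, total_bases, coverage)  # 2 15 3.0
```

For real data you would open the FASTQ file instead of the toy string and pass a genome size of a few million bases, which is typical for bacteria.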

The Celera® Assembler link is broken.


Check this recent review (Table 2 lists many useful programs).

written 16 months ago by genomax75k

There are numerous long-read assemblers available. Many are listed here:

https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbx147/4590140

written 16 months ago by Andy20

Do you only have PacBio data? You should get some Illumina data as well:

On stuck records and indel errors; or “stop publishing bad genomes”

written 16 months ago by h.mon28k

As of now, I have been told that I am going to get only the PacBio long reads. Why do you say that I should get some Illumina?

written 16 months ago by bioinforesearchquestions270

From the blog post I linked:

If you can’t be bothered reading, then the summary is:

  • BOTH single molecule sequencing technologies (PacBio and Nanopore), their major error mode is insertions / deletions

  • Once a genome is assembled, some of these errors remain in the assembly

  • If they are uncorrected, they inevitably cause a frameshift or premature stop codon in protein-coding regions

  • It’s not that you can’t correct these errors, it’s that mostly, outside of the top assembly groups in the world, people don’t

The main error type in PacBio and Nanopore reads is insertions / deletions. Illumina reads contain very few insertions / deletions, so you can use them to correct the indel errors that remain in a PacBio assembly.
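To see why one uncorrected indel matters for gene finding, here is a minimal sketch that translates a short made-up coding sequence before and after a single-base deletion. The sequence is a toy example, not real data, and the `translate` helper is hypothetical. It uses the standard genetic code and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


gene = "ATGGCTGTAAACGGTTGA"           # toy gene: ATG GCT GTA AAC GGT TGA
with_deletion = gene[:3] + gene[4:]   # drop one base right after the start codon

print(translate(gene))           # MAVNG*
print(translate(with_deletion))  # ML*
```

The single missing base shifts the reading frame and brings a stop codon into frame after two residues, so a gene caller would report a truncated protein. That is exactly the kind of artefact that could hide or break a resistance gene in an unpolished assembly.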

modified 16 months ago • written 16 months ago by h.mon28k
16 months ago by
gconcepcion60
Menlo Park, CA
gconcepcion60 wrote:

Your best bets are:

HGAP4 (GUI) as a pipeline provided in SMRTLink: https://www.pacb.com/support/software-downloads/

FALCON (command line) (bleeding edge HGAP): http://pb-falcon.readthedocs.io/en/latest/quick_start.html#quick-start

or Canu (command line), essentially the successor to Celera Assembler: https://canu.readthedocs.io/en/latest/quick-start.html https://github.com/marbl/canu
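For the Canu route with 10 isolates, one way to keep the runs consistent is to generate the command lines from a small script. This is only a sketch: the sample names, file names, and the 5 Mb genome size are assumptions for illustration, and the function just builds the command without running it. The `-p`, `-d`, `genomeSize=` and `-pacbio-raw` options follow the Canu quick start; newer Canu releases use `-pacbio` instead of `-pacbio-raw`, so check the documentation for your installed version.

```python
import shlex


def canu_command(sample, reads, genome_size="5m", read_flag="-pacbio-raw"):
    """Build (but do not run) a Canu command line for one sample."""
    return [
        "canu",
        "-p", sample,                 # prefix for output file names
        "-d", f"{sample}-canu",       # output directory (hypothetical naming)
        f"genomeSize={genome_size}",  # rough estimate, assumed here
        read_flag, reads,
    ]


# Hypothetical sample sheet: isolate name -> reads file.
samples = {
    "isolate01": "isolate01.fastq.gz",
    "isolate02": "isolate02.fastq.gz",
}

for name, path in samples.items():
    print(shlex.join(canu_command(name, path)))
```

Each printed line can be pasted into a shell or a cluster job script, or the list can be passed to `subprocess.run` once you are happy with the settings.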


I'd stay away from PacBio-based assemblers: they're pretty difficult to get working and take FOREVER to run. Use a third-party assembler, like Canu.

written 21 days ago by andorjkiss10