Question: What is the appropriate assembler for PacBio long reads?
bioinforesearchquestions (United States) wrote 13 months ago:

Hi folks,

We had 10 bacterial isolates sequenced on the PacBio platform and received long reads. Five of them have no reference strain, and five have a closely related bacterial strain available.

I have to identify antimicrobial resistance genes in these 10 bacteria. This is the first time I am handling PacBio data.

Is there an assembler that handles long reads? As of now I don't know the coverage of the samples. Please point me to a reference article if you have come across one for this kind of task. I found HGAP from PacBio (https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/HGAP#implementations).

The Celera® Assembler link is broken.
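
On the unknown coverage mentioned above: a rough estimate is just total sequenced bases divided by the expected genome size. Here is a minimal Python sketch of that arithmetic. It assumes the reads are available as plain 4-line FASTQ records (PacBio data often arrives in other formats and would need converting first), and the genome size has to be guessed, for example from a related strain. The function names are made up for illustration.

```python
import gzip
from typing import Iterable


def total_bases(fastq_lines: Iterable[str]) -> int:
    """Sum the sequence lengths in 4-line FASTQ records (assumes unwrapped sequences)."""
    total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record holds the bases
            total += len(line.strip())
    return total


def estimate_coverage(fastq_lines: Iterable[str], genome_size: int) -> float:
    """Approximate depth of coverage: sequenced bases / assumed genome size."""
    return total_bases(fastq_lines) / genome_size


def open_fastq(path: str):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)
```

For example, `estimate_coverage(open_fastq("reads.fastq"), 5_000_000)` would give the approximate depth for an assumed 5 Mb genome; the file name and genome size here are placeholders.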

alignment assembly • 1.1k views
modified 13 months ago by gconcepcion • written 13 months ago by bioinforesearchquestions

Check this recent review (Table 2 lists many useful programs).

written 13 months ago by genomax

There are numerous long-read assemblers available. Many are listed here:

https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbx147/4590140

written 13 months ago by Andy

Do you only have PacBio data? You should get some Illumina data as well; see:

On stuck records and indel errors; or “stop publishing bad genomes”

written 13 months ago by h.mon

As of now, I have been told that I am only going to get the PacBio long reads. Why do you say that I should get some Illumina data?

written 13 months ago by bioinforesearchquestions

From the blog post I linked:

If you can’t be bothered reading, then the summary is:

  • BOTH single molecule sequencing technologies (PacBio and Nanopore), their major error mode is insertions / deletions

  • Once a genome is assembled, some of these errors remain in the assembly

  • If they are uncorrected, they inevitably cause a frameshift or premature stop codon in protein-coding regions

  • It’s not that you can’t correct these errors, it’s that mostly, outside of the top assembly groups in the world, people don’t

The main error type in PacBio and Nanopore reads is insertions / deletions. Illumina reads have few insertions / deletions, so you can use them to correct the errors that remain in a PacBio assembly.
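
To make the frameshift point concrete, here is a small Python sketch with a made-up toy gene (not real data): deleting a single base shifts the reading frame and introduces a premature stop codon. It uses the standard genetic code and a hypothetical `translate` helper written just for this illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


reference = "ATGAAATTGACCGGTTAA"              # toy ORF: ATG AAA TTG ACC GGT TAA
with_deletion = reference[:3] + reference[4:]  # one base lost, as an indel error would do

print(translate(reference))      # full-length toy protein
print(translate(with_deletion))  # truncated by a premature stop codon
```

An annotation tool looking for resistance genes would see the second sequence as a truncated or broken gene, which is why uncorrected indels matter for this kind of analysis.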

modified 13 months ago • written 13 months ago by h.mon
gconcepcion (Menlo Park, CA) wrote 13 months ago:

Your best bets are:

HGAP4 (GUI) as a pipeline provided in SMRTLink: https://www.pacb.com/support/software-downloads/

FALCON (command line) (bleeding edge HGAP): http://pb-falcon.readthedocs.io/en/latest/quick_start.html#quick-start

or Canu (command line), essentially the successor to Celera Assembler: https://canu.readthedocs.io/en/latest/quick-start.html https://github.com/marbl/canu
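
As a rough idea of what a Canu run looks like, here is a Python sketch that only builds the command line (it does not run anything). The option names follow the Canu quick start linked above as of the 1.x releases; newer releases changed the read-type option (`-pacbio` instead of `-pacbio-raw`), so check the documentation for the installed version. The sample name, file name, and 5 Mb genome size are placeholders, and `canu_command` is a made-up helper.

```python
import shlex
from typing import List


def canu_command(prefix: str, out_dir: str, genome_size: str, reads: str) -> List[str]:
    """Build a Canu 1.x style argument list for raw PacBio reads (assumed options)."""
    return [
        "canu",
        "-p", prefix,                  # prefix for output file names
        "-d", out_dir,                 # assembly directory
        f"genomeSize={genome_size}",   # rough genome size estimate, e.g. "5m"
        "-pacbio-raw", reads,          # uncorrected PacBio reads
    ]


cmd = canu_command("isolate01", "isolate01-canu", "5m", "isolate01.subreads.fastq")
print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to actually launch it
```

Building the command as a list makes it easy to loop over all 10 isolates with one output directory per sample.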

modified 13 months ago • written 13 months ago by gconcepcion