Question: Using Bandage to finish ambiguous long-read assembly?
13 months ago by
predeus1.3k wrote:

Hello all,

I have a Unicycler assembly of a bacterial genome from PacBio and Illumina reads. It's a rather small but repetitive genome, and it didn't assemble into one circular chromosome, despite having 500x long-read coverage.

I've tried a few other options (i.e. long-read-only assemblers), and they didn't produce a finished genome either.

I've read that it's possible to finish the assembly by inspecting it with Bandage and aligning reads to it. However, it's not obvious how to do that, or with which reads. Bandage offers BLAST functionality, but I don't think I can BLAST 500x of PacBio reads onto the graph. Would it make sense to get a consensus set of well-corrected reads with Racon? And how does one identify the subset of long reads that potentially span the contig ends?

Thank you for any suggestions.

modified 11 months ago by Biostar ♦♦ 20 • written 13 months ago by predeus1.3k

What's the quality and length of the long reads? If you have ILMN reads handy, I'd recommend using Ryan Wick's Filtlong to keep only the longest and highest-quality reads. Maybe down to about 100X coverage? You can use the ILMN reads as a reference for filtering the Nanopore reads. There is also a very handy script included w/ Filtlong to quickly generate stats on the reads before/after filtering -
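
To make the "down to about 100X" idea concrete, here is a minimal sketch that works out the number of bases to keep and builds a Filtlong command line without running it. The file names and the 2 Mb genome size are made-up placeholders, and the minimum read length is only an illustrative choice; the `-1`/`-2`, `--min_length` and `--target_bases` options are the ones described in the Filtlong README.

```python
def filtlong_command(long_reads, illumina_1, illumina_2,
                     genome_size_bp, target_coverage=100, min_length=1000):
    """Build (but do not run) a Filtlong command keeping ~target_coverage of reads."""
    # Bases to keep = genome size x desired depth, e.g. 2 Mb x 100 = 200 Mb.
    target_bases = genome_size_bp * target_coverage
    return [
        "filtlong",
        "-1", illumina_1,            # Illumina reads used as an external reference
        "-2", illumina_2,
        "--min_length", str(min_length),
        "--target_bases", str(target_bases),
        long_reads,
    ]


# Hypothetical inputs: placeholder file names and an assumed 2 Mb genome.
cmd = filtlong_command("pacbio.fastq.gz",
                       "illumina_R1.fastq.gz", "illumina_R2.fastq.gz",
                       genome_size_bp=2_000_000)
print(" ".join(cmd))
```

Filtlong writes the retained reads to standard output, so in practice the printed command would be redirected into a file (for example piped through gzip).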

modified 12 months ago • written 12 months ago by kapsakcj30

As a matter of fact, I do have some Illumina! Thank you, very good suggestion.

written 12 months ago by predeus1.3k

Wow, I completely misread PB reads as Nanopore, sorry. Does Filtlong work on PacBio reads?

written 12 months ago by kapsakcj30

Yes, I've run it on my data and it worked very well (I ran it against Illumina reads).

written 12 months ago by predeus1.3k

How many contigs did you have in your assembly? I played just a little with Bandage; it helped me decide where to design PCR primers to link contigs and (possibly later) sequence and finish the assembly. I had only MiSeq data, PacBio is still a rarity around here. Of course, this approach is only useful if you have small gaps, and not too many.

I think you can try mapping consensus reads with minimap2 and identifying chimeric reads that map to different contigs.
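
A rough sketch of that idea, assuming the reads were mapped to the contigs with minimap2 and the output is in PAF format (the standard 12 leading columns: query name, query length, query start, query end, strand, target name, target length, target start, target end, matches, alignment length, MAPQ). The function name, the 2 kb end window and the MAPQ cutoff are arbitrary choices for illustration, not recommended values, and the three PAF lines are invented.

```python
from collections import defaultdict


def find_bridging_reads(paf_lines, end_window=2000, min_mapq=20):
    """Return {read: [(contig, end), ...]} for reads aligned near the ends of >= 2 contigs."""
    touched = defaultdict(set)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete PAF record
        read, contig = fields[0], fields[5]
        contig_len, t_start, t_end = int(fields[6]), int(fields[7]), int(fields[8])
        if int(fields[11]) < min_mapq:
            continue  # skip ambiguous placements, e.g. inside repeats
        if t_start <= end_window:
            touched[read].add((contig, "start"))
        if contig_len - t_end <= end_window:
            touched[read].add((contig, "end"))
    # Keep only reads that touch ends of at least two different contigs.
    return {
        read: sorted(ends)
        for read, ends in touched.items()
        if len({contig for contig, _ in ends}) >= 2
    }


# Invented example records: read1 runs off the end of contig_1 into the start of contig_2,
# read2 maps entirely inside contig_1.
example_paf = [
    "read1\t15000\t0\t7000\t+\tcontig_1\t500000\t493000\t500000\t6800\t7000\t60",
    "read1\t15000\t7100\t15000\t+\tcontig_2\t300000\t0\t7900\t7700\t7900\t60",
    "read2\t12000\t0\t12000\t-\tcontig_1\t500000\t200000\t212000\t11800\t12000\t60",
]
bridges = find_bridging_reads(example_paf)
print(bridges)
```

Reads reported this way are only candidates for joining contigs. This sketch does not handle a read that spans both ends of the same (circular) contig, and contigs shorter than the window will have both ends flagged.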

written 13 months ago by h.mon29k

I have just a few contigs - the assembly is almost complete.

written 12 months ago by predeus1.3k