Question: Using Bandage to finish ambiguous long-read assembly?
5 months ago by
predeus890 wrote:

Hello all,

I have a Unicycler assembly of a bacterial genome from PacBio and Illumina reads. It's a rather small but repetitive genome, and it didn't assemble into one circular chromosome, despite having 500x long-read coverage.

I've tried a few other options (i.e. long-read-only assemblers), and they didn't produce a finished genome either.

I've read that it's possible to finish the assembly by inspecting it with Bandage and by aligning reads to it. However, it's not obvious how to do this, or with which tools. Bandage offers BLAST functionality, but I don't think I can BLAST 500x of PacBio reads onto the graph. Would it make sense to get a consensus set of well-corrected reads with Racon? And how does one identify a subset of long reads that potentially span the contig ends?

Thank you for any suggestions.

ADD COMMENTlink modified 3 months ago by Biostar ♦♦ 20 • written 5 months ago by predeus890

What's the quality and length of the long reads? If you have ILMN reads handy, I'd recommend using Ryan Wick's Filtlong to keep only the longest, highest-quality reads, maybe down to ~100X coverage. You can use the ILMN reads as a reference for filtering the Nanopore reads. Filtlong also includes a very handy script to quickly generate stats on the reads before/after filtering -
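A minimal sketch of the subsampling step described above, assuming a hypothetical 5 Mbp genome (the thread does not state the genome size) and hypothetical file names. It only builds the Filtlong command line; the `-1`/`-2` options supply the Illumina reads as an external reference, and `--target_bases` is computed as genome size times the desired coverage. The `min_length` default is an arbitrary illustration, not a recommendation from the thread.

```python
def filtlong_command(reads, illumina_1, illumina_2, genome_size,
                     target_coverage=100, min_length=1000):
    """Build a Filtlong command that keeps roughly `target_coverage`x
    of the best long reads, scored against an Illumina reference."""
    # Total bases to keep = genome size * desired depth of coverage.
    target_bases = genome_size * target_coverage
    return [
        "filtlong",
        "-1", illumina_1,                    # Illumina forward reads
        "-2", illumina_2,                    # Illumina reverse reads
        "--min_length", str(min_length),     # drop very short reads
        "--target_bases", str(target_bases), # stop once this many bases are kept
        reads,
    ]


if __name__ == "__main__":
    # All file names and the genome size below are hypothetical.
    cmd = filtlong_command("pacbio.fastq.gz", "ilmn_R1.fastq.gz",
                           "ilmn_R2.fastq.gz", genome_size=5_000_000)
    print(" ".join(cmd))
```

Filtlong writes the filtered reads to standard output, so in practice the printed command would be redirected to a file (or run via `subprocess.run` with `stdout` pointed at an open file).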

ADD REPLYlink modified 4 months ago • written 4 months ago by kapsakcj30

As a matter of fact, I do have some Illumina! Thank you, very good suggestion.

ADD REPLYlink written 4 months ago by predeus890

Wow, I completely misread PB reads as Nanopore, sorry. Does Filtlong work on PacBio reads?

ADD REPLYlink written 4 months ago by kapsakcj30

Yes, I've run it on my data and it worked very well (I ran it against the Illumina reads).

ADD REPLYlink written 4 months ago by predeus890

How many contigs did you have in your assembly? I've only played a little with Bandage; it helped me decide where to design PCR primers to link contigs and (possibly later) sequence and finish the assembly. I had only MiSeq data, since PacBio is still a rarity around here. Of course, this approach is only useful if you have small gaps, and not too many of them.

I think you can try to map consensus reads with minimap2 and identify chimeric reads mapping to different contigs.
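One possible way to do this, sketched under assumptions: map the reads to the contigs with minimap2 in its default PAF output (for example `minimap2 -x map-pb contigs.fasta reads.fastq > reads_vs_contigs.paf`, where the file names are hypothetical), then scan the PAF for reads whose alignments touch the ends of two or more different contigs. The function name, the `min_mapq` threshold, and the `end_window` size are illustrative choices, not values from this thread.

```python
from collections import defaultdict


def reads_bridging_contigs(paf_lines, min_mapq=20, end_window=1000):
    """Return {read_name: sorted contig names} for reads whose alignments
    reach within `end_window` bp of the ends of >= 2 different contigs."""
    hits = defaultdict(set)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        read, contig = fields[0], fields[5]
        # PAF columns 7-9: target length, target start, target end.
        contig_len = int(fields[6])
        t_start, t_end = int(fields[7]), int(fields[8])
        mapq = int(fields[11])  # PAF column 12: mapping quality
        if mapq < min_mapq:
            continue  # ignore ambiguous (e.g. repeat-derived) alignments
        # Keep only alignments that reach one end of the contig.
        near_end = t_start <= end_window or contig_len - t_end <= end_window
        if near_end:
            hits[read].add(contig)
    return {read: sorted(contigs)
            for read, contigs in hits.items() if len(contigs) >= 2}


if __name__ == "__main__":
    # Hypothetical PAF file produced by the minimap2 command above.
    with open("reads_vs_contigs.paf") as paf:
        for read, contigs in reads_bridging_contigs(paf).items():
            print(read, ",".join(contigs), sep="\t")
```

The reported reads are candidates for joining contigs; in a repetitive genome they would still need manual inspection, for instance by loading them into Bandage alongside the assembly graph.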

ADD REPLYlink written 5 months ago by h.mon25k

I have just a few contigs - the assembly is almost complete.

ADD REPLYlink written 4 months ago by predeus890