Question: Improvement of genome assembly using Illumina contigs and Nanopore reads
KG wrote, 5 months ago:

Hi, I have Nanopore reads and a very fragmented genome assembly (~500 contigs for a 16-20 Mb genome), but not the Illumina reads it was built from. I used Canu to generate a de novo assembly (44 contigs) from the Nanopore reads (~30x coverage). Since I do not have Illumina reads, I could not polish this de novo assembly, and many of the ORFs could not be annotated because of base-level errors. Is there any way to use the contigs assembled from Illumina reads to improve the assembly quality, i.e. to correct the base-level errors? I have also tried LINKS and SMIS, which took the fragmented assembly from ~500 contigs to ~200 contigs, but we need a better assembly for our downstream analysis. I would appreciate any suggestions. We might get some Illumina reads in a month or so, but I would like to know whether anything can be done with what we have now.

Thanks!

h.mon wrote, 5 months ago:

You will get the best polishing results with Illumina reads (or Illumina + Nanopore), but polishing with Nanopore reads alone can still give a good improvement. Try Racon, Nanopolish, or the polisher that ships with the wtdbg2 assembler. There are other polishers, but I have never used them.
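
As a rough illustration of Nanopore-only polishing with Racon, here is a minimal Python sketch that maps the reads back to the draft with minimap2 and then runs Racon, repeating for a few rounds. The file names, the `polish` working directory, the thread count, and the choice of two rounds are assumptions made for the example, not settings taken from this thread; both tools are assumed to be installed and on `PATH`.

```python
import subprocess
from pathlib import Path


def racon_round_commands(draft, reads, round_idx, threads=8, workdir="polish"):
    """Build the minimap2 and Racon command lines for one polishing round.

    Returns (minimap2_cmd, racon_cmd, paf_path, polished_path).
    Both tools write their result to stdout, so the caller redirects it.
    """
    out = Path(workdir)
    paf = out / f"round{round_idx}.paf"          # read-to-draft overlaps
    polished = out / f"round{round_idx}.fasta"   # polished contigs
    # Map Nanopore reads to the draft; map-ont is minimap2's ONT preset.
    minimap2_cmd = ["minimap2", "-x", "map-ont", "-t", str(threads),
                    str(draft), str(reads)]
    # Racon positional arguments: <reads> <overlaps> <target sequences>.
    racon_cmd = ["racon", "-t", str(threads), str(reads), str(paf), str(draft)]
    return minimap2_cmd, racon_cmd, paf, polished


def polish(draft, reads, rounds=2, threads=8, workdir="polish"):
    """Run several Racon rounds, feeding each round's output into the next."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    current = Path(draft)
    for i in range(1, rounds + 1):
        mm_cmd, rc_cmd, paf, polished = racon_round_commands(
            current, reads, i, threads, workdir)
        with open(paf, "w") as fh:
            subprocess.run(mm_cmd, stdout=fh, check=True)
        with open(polished, "w") as fh:
            subprocess.run(rc_cmd, stdout=fh, check=True)
        current = polished
    return current
```

Calling `polish("canu.contigs.fasta", "nanopore_reads.fastq")` (hypothetical file names) would leave the final contigs in `polish/round2.fasta`. How many rounds are worthwhile depends on the data, so it is worth checking annotation results after each one.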

You can also try assembling with Flye, which has a built-in polishing step.
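
For the Flye route, a minimal sketch of how the run could be launched from Python is shown below. The read file name, output directory, thread count, the `20m` genome-size estimate (picked from the upper end of the 16-20 Mb range in the question), and the number of polishing iterations are all assumptions for illustration; `flye` is assumed to be installed and on `PATH`.

```python
import subprocess


def flye_command(reads, out_dir, genome_size="20m", threads=8,
                 polish_iterations=2):
    """Build a Flye command line for raw Nanopore reads.

    --iterations sets how many rounds of Flye's built-in polishing are run.
    --genome-size is an estimate; some Flye versions require it, others
    treat it as optional.
    """
    return [
        "flye",
        "--nano-raw", str(reads),
        "--genome-size", genome_size,
        "--out-dir", str(out_dir),
        "--threads", str(threads),
        "--iterations", str(polish_iterations),
    ]


def run_flye(reads, out_dir, **kwargs):
    """Run Flye and raise CalledProcessError if it exits with an error."""
    subprocess.run(flye_command(reads, out_dir, **kwargs), check=True)
```

With `run_flye("nanopore_reads.fastq", "flye_out")` (hypothetical names), Flye writes the polished contigs to `assembly.fasta` inside the output directory, which can then be compared against the Canu assembly.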

How come you have an assembly made with Illumina reads, but not the Illumina data itself?


Thanks for your suggestions. I'll give the Nanopore polishing tools a try.

We did not generate that Illumina assembly ourselves. It is available from NCBI, but the raw reads are not.

Reply written 5 months ago by KG

I have not yet tried polishing a short-read assembly with long reads (and I would assume one shouldn't if other options are available).

My first suggestion would be to contact the authors of the paper and ask them for the Illumina reads.

If you really have to work with the short-read assembly plus Nanopore reads, then I guess your goal is not to improve the quality of the existing sequences, but rather to link contigs and resolve repeats. I would not expect Racon or Nanopolish to be of much use there, but you could of course try.

From a cursory search: Long Read Gapcloser and GMcloser seem to be built specifically for this task.

Reply modified 5 months ago • written 5 months ago by Tom520