Question: SNP calling for correcting errors?
marcela.uliano (European Union) wrote 4.3 years ago:

Hey guys,

Let's say someone has scaffolded a draft genome with PacBio subreads and PBJelly, but is still concerned that the Illumina contigs had too little coverage and that PacBio indel errors persist in the final draft.

This person also has high-quality Illumina RNA-seq data that can be mapped to the draft genome. Could one use a SNP caller, not to call SNPs, but to evaluate errors in the draft genome?

Keep in mind that the species is diploid, and that only RNA-seq from one individual (one sample) would be aligned.

Or do you guys know of any other post-assembly error correctors for draft genomes?

Thank you, guys!

modified 2.8 years ago by tjduncan • written 4.3 years ago by marcela.uliano

Google: "pilon broad".

written 4.3 years ago by lh3

Thank you lh3 and ALchEmiXt,

Running Pilon iteratively is exactly what I'm doing and it's working great! I can tell from the number of CEGs I find in the draft genome before and after Pilon.

Thanks guys!

written 4.2 years ago by marcela.uliano
ALchEmiXt (The Netherlands) wrote 4.3 years ago:

As mentioned, you could use Pilon as a curation tool to polish the assembly.

We actually run Pilon in iterated mode, since we noticed that when reads from different technologies are used, not all correctable errors are fixed in the first round. We usually run a short loop of at most 4 to 6 iterations, taking the output of each Pilon iteration as the input to the next. This has worked quite well for us.
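
A minimal sketch of that loop, in case it helps. It only builds the shell commands for each round and does not run anything. The file names, the `polish` output directory, the `pilon.jar` path and the choice of `bwa mem` plus `samtools sort` for the mapping step are all assumptions for illustration, not our exact setup, so adjust them to your own data and aligner.

```python
import shlex


def pilon_round(draft, reads1, reads2, round_no, pilon_jar="pilon.jar", outdir="polish"):
    """Build the shell commands for one polishing round; nothing is executed here."""
    q = shlex.quote
    prefix = f"round{round_no}"
    bam = f"{outdir}/{prefix}.bam"
    polished = f"{outdir}/{prefix}.fasta"  # Pilon writes <outdir>/<output>.fasta
    commands = [
        f"mkdir -p {q(outdir)}",
        f"bwa index {q(draft)}",
        # Assumed mapping step: paired-end Illumina reads, coordinate-sorted BAM.
        f"bwa mem {q(draft)} {q(reads1)} {q(reads2)} | samtools sort -o {q(bam)} -",
        f"samtools index {q(bam)}",
        f"java -jar {q(pilon_jar)} --genome {q(draft)} --frags {q(bam)} "
        f"--output {prefix} --outdir {q(outdir)} --changes",
    ]
    return commands, polished


def plan_polishing(draft, reads1, reads2, rounds=4):
    """Chain rounds so that each round polishes the previous round's output."""
    plan, current = [], draft
    for round_no in range(1, rounds + 1):
        commands, polished = pilon_round(current, reads1, reads2, round_no)
        plan.append({"input": current, "output": polished, "commands": commands})
        current = polished
    return plan


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for step in plan_polishing("draft.fasta", "reads_1.fq.gz", "reads_2.fq.gz"):
        print(f"# {step['input']} -> {step['output']}")
        print("\n".join(step["commands"]))
```

With `--changes`, Pilon also writes a `.changes` file for each round. Stopping early once that file comes back empty is one possible stop criterion (my suggestion here, not something built into Pilon).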

Another approach we use is to map the reads with bowtie2 or bwa and then use samtools to generate and extract a consensus. That consensus is then used as the reference for the next iteration. Usually (depending on quality) this consensus building saturates after 5 iterations. The same approach can also be used to generate your own consensus starting from a closely related sequence (your mileage may vary, though, depending on how close that sequence is).
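
To make "saturated" concrete, here is a rough sketch of how one could check it: compare the consensus from each round with the previous one and stop as soon as nothing changes. The `run_round` argument is a hypothetical placeholder for whatever map-and-consensus step you use (it takes a dict of sequences and returns the new consensus as a dict); it is not a real samtools call. The difference count is a simple positional comparison, so it overcounts after an indel, but it is zero only when the two rounds are identical, which is all the stop test needs.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}, upper-casing the bases."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = (line[1:].split() or [""])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def count_differences(old, new):
    """Positional mismatches plus length differences, summed over all sequences."""
    total = 0
    for name in set(old) | set(new):
        a, b = old.get(name, ""), new.get(name, "")
        total += sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
    return total


def iterate_until_saturated(start, run_round, max_rounds=5):
    """Repeat run_round until the consensus stops changing or max_rounds is hit."""
    current, history = start, []
    for _ in range(max_rounds):
        updated = run_round(current)
        changed = count_differences(current, updated)
        history.append(changed)
        current = updated
        if changed == 0:
            break
    return current, history
```

The returned `history` lists how many positions changed in each round, which makes it easy to see whether the consensus is still moving when the iteration cap is reached.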

modified 4.3 years ago • written 4.3 years ago by ALchEmiXt
tjduncan (Indianapolis, IN) wrote 2.8 years ago:

The Hercules package that recently appeared on bioRxiv may be a good fit for this. It is a profile HMM-based hybrid error correction algorithm for long reads.

written 2.8 years ago by tjduncan