Question: Assessing Quality And Accuracy Of De Novo Genome Assembly
7.1 years ago by
a81526a60 wrote:

All, I am curious whether anyone has a method for assessing the quality and accuracy of de novo genome assemblies. I am currently running in silico simulations of de novo genome assembly from a previously sequenced genome to determine the best assembly parameters (k-mer size, coverage cutoff, etc.) and the optimal dataset (mate-pair library size, coverage, etc.). The ultimate goal is to use these parameters to assemble the genome of a related species de novo.

However, the difficulty is that after simulating the data and making a de novo assembly, I don't know of any statistics or methods for comparing the assembled contigs back to the original sequence they were simulated from. This requires two steps: (1) align the assembled contigs to the reference genome, and (2) assess the fit.

People often optimize N50, assembly size, contig number and other length-based measurements, but this only rewards bigger and bigger contigs and says little about whether those contigs are accurate. I have been using BLAST to compare the contigs to the reference, asking how well they fit, how long the alignments are and how many mis-assembled contigs there are. If anyone has ideas or methods for assessing the accuracy (or the overall similarity of an assembly and a genome), I would be grateful to hear about them.
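
For readers unfamiliar with the length-based measurements mentioned above, here is a minimal Python sketch of N50, and of NG50 when a reference length is available. The function name `nx50`, its parameters and the contig lengths are made up for illustration; they do not come from any particular assembler or tool.

```python
def nx50(lengths, genome_size=None, fraction=0.5):
    """Return an N50-style statistic for a list of contig lengths.

    With genome_size=None the target is a fraction of the total assembly
    length (N50). With a reference length supplied the target is a fraction
    of that length instead (NG50). Returns 0 if the contigs never reach
    the target.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    target = total * fraction
    running = 0
    # Walk contigs from longest to shortest until the target is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0


contig_lengths = [80, 70, 50, 40, 30, 20]  # hypothetical contig lengths
print(nx50(contig_lengths))                   # N50 of the assembly itself
print(nx50(contig_lengths, genome_size=400))  # NG50 against a 400 bp reference
```

Both numbers depend only on contig lengths, so a mis-joined contig raises them just as much as a correct one. That is why they cannot answer the accuracy question on their own.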

modified 5.6 years ago by h.mon31k • written 7.1 years ago by a81526a60

Duplicate of: How to assess the quality of an assembly? (Is there no magic formula?)

written 7.1 years ago by SES8.4k
7.1 years ago by
Rohit1.4k wrote:

There is a tool named QUAST for assessing genome assemblies.

I have never used it, though.


If there is a similar reference to compare against, then QUAST is very good at giving a "real" N50 value.

written 7.1 years ago by rob234king600
5.6 years ago by
h.mon31k wrote:

QUAST, the software Rohit mentioned in his answer, accepts a reference genome and provides an analysis of the assembly against it, including %overlap, %missing and misassemblies. I believe QUAST uses nucmer from the MUMmer package to perform this analysis.
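
To make the "%overlap" and "%missing" idea concrete, here is a rough Python sketch that computes the fraction of a reference covered by contig alignments. It is not QUAST's implementation. It assumes you have already extracted alignment coordinates on a single reference sequence (for example from an aligner's coordinate output) as 1-based inclusive `(start, end)` pairs. The function name and the example intervals are hypothetical.

```python
def covered_fraction(intervals, reference_length):
    """Fraction of the reference covered by at least one alignment.

    intervals: (start, end) pairs on one reference sequence, 1-based and
    inclusive. Reverse-strand hits given as (end, start) are normalised.
    """
    covered = 0
    current_start = current_end = None
    # Sort by start, then merge overlapping or directly adjacent intervals.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if current_end is None or start > current_end + 1:
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / reference_length


# Hypothetical alignments of three contigs to a 500 bp reference.
alignments = [(1, 100), (51, 150), (400, 301)]
overlap = covered_fraction(alignments, 500)
print(f"covered: {overlap:.1%}, missing: {1 - overlap:.1%}")
```

Overlapping alignments are merged before counting, so redundant contigs do not inflate the covered fraction. Detecting misassemblies needs more than this, because it depends on the order and orientation of the alignments within each contig.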

BLAST Ring Image Generator may also be helpful for your purpose.
