Question: How To Decide Which Genome Assembly Is Better?
thiagomafra (Brazil) wrote, 6.8 years ago:

Hi everybody,

I have two draft assemblies of a yeast genome, one assembled by Velvet and the other by CLC Workbench, with the following reports:

Assembled by Velvet

Length     11354767
GC%        38.22%
N's        1579
Scaffolds  214
N50        360969
Min        302
Max        705532

Assembled by CLC

Length     11533444
GC%        37.22%
N's        2490
Scaffolds  851
N50        72755
Min        301
Max        215625

My question: is the N50 value the main parameter for determining which assembly is best? If so, that would be the Velvet draft in this case.

assembly denovo velvet • 4.2k views
modified 6.1 years ago by Hayssam • written 6.8 years ago by thiagomafra

Take a look at How To Assess The Quality Of An Assembly? (Is There No Magic Formula?) for some other ways to compare the assemblies. The Velvet assembly looks more contiguous and has fewer Ns but there are some additional comparisons you could make (discussed in that previous question) which might help you decide.

written 6.8 years ago by SES

I would also have a look at FRCBam (http://arxiv.org/abs/1210.1095); I have had good experiences with it.

written 6.8 years ago by Hayssam
Ryan Thompson (TSRI, La Jolla, CA) wrote, 6.8 years ago:

Assuming that the scaffolds have all been assembled correctly, the Velvet assembly is clearly superior. However, you should align your scaffolds to the reference genome to verify that there are no assembly errors. When tool A has a larger N50 than tool B, the question is always whether tool A is superior because it correctly assembled more, or whether tool B is superior because it correctly rejected more incorrect assemblies. You can't know which one is true unless you align to the reference genome.

(Of course, the reference genome can also have errors, but these should show up consistently against both of your assemblies.)
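
To make the N50 point concrete, here is a minimal Python sketch of how N50 is usually computed, using the common definition (the scaffold length at which the running total of lengths, sorted longest first, reaches half of the assembly). The function name and the toy lengths are made up for illustration and are not taken from either assembly in the question.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are sorted longest first; N50 is the length of the scaffold
    at which the running total first reaches half of the assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no scaffold lengths given")


# Toy scaffold lengths, invented for illustration only.
careful = [40, 30, 20, 10]

# Same total sequence, but the two longest scaffolds have been joined.
# If that join is a misassembly, N50 went up while correctness went down.
aggressive = [70, 20, 10]

print("careful N50:   ", n50(careful))
print("aggressive N50:", n50(aggressive))
```

The second list has the higher N50 even though nothing was gained except one join, which is why N50 alone cannot tell a better assembly from a more aggressive one.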


If the alignments look similar, see the thread "Assessing The Quality Of De Novo Assembled Data", which lists a couple of tools that you can use to judge the quality of your assembly. I've had good experiences with hagfish (although that one goes into much more detail). I have never tried QUAST, but it looks like it could help you out more.

written 6.8 years ago by Philipp Bayer
Hayssam (France) wrote, 6.1 years ago:

Hi,

Confirming Philipp's suggestion, you should have a look at the QUAST tool. We used it extensively for our publication on finishing bacterial genome assemblies by mixing the results of various assemblers, and QUAST stood out in terms of features, stability and customization. It can run in reference mode (even if you only have a closely related species) or in no-reference mode. In the specific case of bacterial genomes (this might hold for other kingdoms), we showed in our MIX publication (there if you're interested) that N50 does a good job of picking the best assemblies in no-reference mode.
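
As a rough sketch of the two modes, the snippet below only builds the QUAST command line for both assemblies, with or without a reference. The file names, labels and output directory are hypothetical placeholders. The options used (`-o`, `-r`, `--labels`) are the ones I know from the QUAST manual, but please check `quast.py --help` for your installed version before relying on them.

```python
import shlex


def quast_command(assemblies, labels, out_dir, reference=None):
    """Build an argument list for quast.py.

    Passing a reference FASTA selects reference mode; leaving it as None
    gives a no-reference run. All paths here are supplied by the caller.
    """
    cmd = ["quast.py", "-o", out_dir, "--labels", ",".join(labels)]
    if reference is not None:
        cmd += ["-r", reference]
    cmd += list(assemblies)
    return cmd


# Hypothetical file names for the two drafts.
drafts = ["velvet_scaffolds.fasta", "clc_scaffolds.fasta"]
names = ["Velvet", "CLC"]

no_ref = quast_command(drafts, names, "quast_noref")
with_ref = quast_command(drafts, names, "quast_ref", reference="reference.fasta")

# Print the commands so they can be inspected or pasted into a shell.
for cmd in (no_ref, with_ref):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually launch a run, each list can be handed to `subprocess.run(cmd, check=True)`, assuming `quast.py` is on your PATH.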
