Question: Spades vs Velvelt_why do I get much more contigs with SPADEs?
gravatar for anna
19 days ago by
anna0 wrote:

Hi all,

I am running SPADEs assembly for Illumina reads on several bacterial genomes. I used this script: -1 ...R1.fastq -2 ...R2.fastq --careful -t 3 -m 30 -o

The same genomes were previously assembled with Velvet (exactly same raw data).

When I looked at the results, I have much more contigs with Spades!! Any idea why? Is there something I can do after running spades to improve the assembly quality of my genomes? Here an example of the differences I get with the 2 assemblers (after running "seqkit stats"):

scaffolds.fasta  FASTA   DNA        877  2,223,301       56  2,535.1  116,526  111  216    297       17  24,875 SPADES
Velvet.fa       FASTA   DNA        313  2,108,423      197  6,736.2  116,024  240  414  7,858        0  24,461  VELVET

file             format  type  num_seqs    sum_len  min_len  avg_len  max_len   Q1   Q2     Q3  sum_gap     N50 
scaffolds.fasta  FASTA   DNA      1,234  2,319,934       56    1,880  132,700  168  223    295       18  25,849 SPADES
Velvet.fa       FASTA   DNA        332  2,122,470      193    6,393  132,734  234  344  6,301        0  26,473  VELVET

thanks for any possible help! Anna

assembly • 141 views
ADD COMMENTlink modified 19 days ago by WouterDeCoster26k • written 19 days ago by anna0
gravatar for h.mon
19 days ago by
h.mon12k wrote:

It seems you are applying different minimal contig length to both datasets, with a lower threshold for SPAdes. This would increase contig count, without improving assembly quality - in fact, the opposite is probably true, these small contigs are most likely noise (unresolved repeats, low quality / low coverage contigs, etc).

Apply the same filter to both assemblies before comparing. And, most importantly, remember that length metrics alone are not a good indicator of assembly quality.

ADD COMMENTlink written 19 days ago by h.mon12k

thanks for your suggestion. However, I could not find a way to set the min contig size in Spades. Any idea if this is possible and how?

ADD REPLYlink written 19 days ago by anna0

Why don't you filter the assembly fasta? Suggestions on how to do this here, and here. You can also use from the BBTools package.

ADD REPLYlink written 19 days ago by h.mon12k
gravatar for Carambakaracho
19 days ago by
Carambakaracho60 wrote:

I'd recommend using quast for the comparison of assemblies. It gives you the number of contigs bigger than certain thresholds, N50, plus some overview graphs. Definitely worth trying

ADD COMMENTlink written 19 days ago by Carambakaracho60
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1585 users visited in the last hour