Question: SPAdes vs Velvet: why do I get many more contigs with SPAdes?
anna10 wrote:

Hi all,

I am running SPAdes assemblies of Illumina reads for several bacterial genomes. I used this command:

spades.py -1 ...R1.fastq -2 ...R2.fastq --careful -t 3 -m 30 -o

The same genomes were previously assembled with Velvet (exactly the same raw data).

When I looked at the results, I had many more contigs with SPAdes. Any idea why? Is there something I can do after running SPAdes to improve the assembly quality of my genomes? Here is an example of the differences I get with the two assemblers (output of "seqkit stats"):

file             format  type  num_seqs    sum_len  min_len  avg_len  max_len   Q1   Q2     Q3  sum_gap     N50  assembler
scaffolds.fasta  FASTA   DNA        877  2,223,301       56  2,535.1  116,526  111  216    297       17  24,875  SPAdes
Velvet.fa        FASTA   DNA        313  2,108,423      197  6,736.2  116,024  240  414  7,858        0  24,461  Velvet

scaffolds.fasta  FASTA   DNA      1,234  2,319,934       56    1,880  132,700  168  223    295       18  25,849  SPAdes
Velvet.fa        FASTA   DNA        332  2,122,470      193    6,393  132,734  234  344  6,301        0  26,473  Velvet

Thanks for any help! Anna

assembly • 290 views
modified 3 months ago by WouterDeCoster29k • written 3 months ago by anna10
h.mon15k wrote:

It seems you are applying a different minimum contig length to each assembly, with a lower threshold for SPAdes (min_len is 56 for SPAdes versus 193 to 197 for Velvet in your tables). This increases the contig count without improving assembly quality. In fact, the opposite is probably true: these small contigs are most likely noise (unresolved repeats, low-quality or low-coverage contigs, etc.).

Apply the same filter to both assemblies before comparing. And, most importantly, remember that length metrics alone are not a good indicator of assembly quality.
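
To illustrate why the cutoff matters, here is a minimal Python sketch that summarizes two assemblies at one shared minimum length. The contig lengths and the 500 bp cutoff are made up for illustration (they are not the poster's data), and the function names are hypothetical.

```python
def n50(lengths):
    """Return N50: the contig length at which the running total of
    lengths, taken from longest to shortest, reaches half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def summarize(lengths, min_len):
    """Summarize only the contigs that are at least min_len long."""
    kept = [n for n in lengths if n >= min_len]
    return {"num_seqs": len(kept), "sum_len": sum(kept), "N50": n50(kept)}


# Hypothetical contig lengths, for illustration only.
assembly_a = [50000, 30000, 20000, 300, 200, 150, 100, 60]
assembly_b = [48000, 31000, 19000, 400]

MIN_LEN = 500  # assumed cutoff; pick one value and use it for both assemblies
for name, lengths in [("A", assembly_a), ("B", assembly_b)]:
    print(name, summarize(lengths, MIN_LEN))
```

With the cutoff applied, both toy assemblies report three contigs; without it, assembly A would report eight, even though the extra five add almost no sequence.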

written 3 months ago by h.mon15k

Thanks for your suggestion. However, I could not find a way to set the minimum contig size in SPAdes. Any idea if this is possible, and how?

written 3 months ago by anna10

Why don't you filter the assembly FASTA? There are suggestions on how to do this here, and here. You can also use reformat.sh from the BBTools package.
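
If you would rather not install anything, a length filter is also easy to sketch in plain Python. This is only a rough standard-library example; the function names, the file names in the usage note, and the cutoff are assumptions for illustration.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    # Emit the last record in the file.
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(in_path, out_path, min_len, width=60):
    """Write records whose sequence is at least min_len long.

    Sequences are re-wrapped at `width` characters per line.
    Returns the number of records kept.
    """
    kept = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            if len(seq) >= min_len:
                out.write(f">{header}\n")
                for i in range(0, len(seq), width):
                    out.write(seq[i:i + width] + "\n")
                kept += 1
    return kept
```

For example, filter_fasta("scaffolds.fasta", "scaffolds.min500.fasta", 500) would keep scaffolds of 500 bp or more; the 500 is an arbitrary choice, and the same value should then be used for the Velvet file.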

written 3 months ago by h.mon15k
Carambakaracho270 wrote:

I'd recommend using QUAST to compare the assemblies. It gives you the number of contigs above certain length thresholds, the N50, and some overview graphs. Definitely worth trying.
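
To show what a per-threshold contig count looks like, here is a small Python sketch. It is not QUAST itself, only the same idea in a few lines; the thresholds and contig lengths are assumed for illustration.

```python
def contigs_at_thresholds(lengths, thresholds=(0, 1000, 5000, 10000)):
    """Map each threshold to (number of contigs >= threshold, their total length)."""
    table = {}
    for t in thresholds:
        kept = [n for n in lengths if n >= t]
        table[t] = (len(kept), sum(kept))
    return table


# Hypothetical contig lengths, for illustration only.
lengths = [120000, 25000, 7000, 900, 300, 200, 60]

for t, (count, total) in contigs_at_thresholds(lengths).items():
    print(f">= {t:>6} bp: {count:>3} contigs, {total:>8} bp")
```

Reading the counts side by side for two assemblies shows quickly whether a difference in contig number comes only from very short sequences.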

written 3 months ago by Carambakaracho270