Hi guys!
I am trying to assemble a genome of ~5 Mb from 250 bp paired-end reads. I first tried assembling with ABySS:
abyss-pe k=64 name=novo in='r1.fastq r2.fastq'
This gave a scaffold N50 of 50 kb and a total assembly size of 5 Mb (with k=128, the N50 was 60 kb and the total assembly size 4.5 Mb). I wanted to see whether I could improve on this with SPAdes, so I ran it like this:
spades.py -1 r1.fastq -2 r2.fastq --careful -k 21,33,55,77,99,127 -o spades_assembly
And then used QUAST to get assembly stats like so:
quast-5.0.2/quast.py scaffolds.fasta -o report
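For anyone comparing the numbers, this is how I understand N50 and L50 to be defined, as a minimal Python sketch. The function name `n50_l50` and the `min_len` parameter are my own, not QUAST's code; the 500 bp default mirrors what I believe is QUAST's default lower contig length threshold. The toy lengths are made up.

```python
def n50_l50(lengths, min_len=500):
    """Return (N50, L50) for contig lengths, ignoring contigs shorter than min_len."""
    kept = sorted((n for n in lengths if n >= min_len), reverse=True)
    half = sum(kept) / 2
    running = 0
    # Walk from the longest contig down until half of the total length is covered.
    for count, length in enumerate(kept, start=1):
        running += length
        if running >= half:
            return length, count
    return 0, 0  # no contigs passed the length filter


# Toy lengths, made up for illustration (not from my assembly).
toy = [100, 80, 50, 30, 20, 10, 10]
with_debris = toy + [5] * 40  # same contigs plus many very short ones

print(n50_l50(toy, min_len=0))          # (80, 2)
print(n50_l50(with_debris, min_len=0))  # (30, 4)
print(n50_l50(with_debris, min_len=10)) # (80, 2)
```

The point of the toy example is that a large number of short contigs pulls N50 down even when the long contigs are unchanged, and that the length cutoff matters when comparing N50 between tools.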
The surprising result is that the stats are a lot worse (a low N50 of 1455 and a very high total length of about 18 Mb), so I have to think that something went wrong or that I am missing something. The --careful flag adds a mismatch-correction step that ABySS does not run, but I don't think this is the reason? The full QUAST output is below:
Assembly                      scaffolds
# contigs (>= 0 bp)           32842
# contigs (>= 1000 bp)        2918
# contigs (>= 5000 bp)        170
# contigs (>= 10000 bp)       85
# contigs (>= 25000 bp)       50
# contigs (>= 50000 bp)       34
Total length (>= 0 bp)        25940092
Total length (>= 1000 bp)     11070707
Total length (>= 5000 bp)     6648867
Total length (>= 10000 bp)    6094453
Total length (>= 25000 bp)    5557460
Total length (>= 50000 bp)    4920491
# contigs                     14063
Largest contig                446405
Total length                  18374693
GC (%)                        42.66
N50                           1455
N75                           708
L50                           1327
L75                           6199
# N's per 100 kbp             84.79
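To see where the extra length sits, the cumulative "Total length (>= X bp)" rows can be differenced into size bins. This is only arithmetic on the QUAST numbers above, sketched in Python; the names `cumulative` and `length_per_bin` are mine.

```python
# Cumulative "Total length (>= X bp)" values copied from the QUAST report.
cumulative = [
    (0, 25940092),
    (1000, 11070707),
    (5000, 6648867),
    (10000, 6094453),
    (25000, 5557460),
    (50000, 4920491),
]


def length_per_bin(rows):
    """Turn cumulative totals into (lower, upper, bases) per size bin.

    The upper bound is exclusive; the last bin is open-ended (upper is None).
    """
    bins = []
    for (lo, total), (hi, next_total) in zip(rows, rows[1:]):
        bins.append((lo, hi, total - next_total))
    last_lo, last_total = rows[-1]
    bins.append((last_lo, None, last_total))
    return bins


bins = length_per_bin(cumulative)
for lo, hi, bases in bins:
    label = f"{lo}-{hi} bp" if hi is not None else f">= {lo} bp"
    print(f"{label:>16}: {bases:>10} bases")
```

By this breakdown, about 14.9 Mb of the 25.9 Mb is in scaffolds shorter than 1 kb, while the 34 scaffolds of at least 50 kb add up to about 4.9 Mb, which is close to the ~5 Mb I expected.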
Thanks for any input!