Hello Everyone,
I am very new to next-generation sequencing and have a question about choosing which assembly is best to use going forward. My samples are isolates of Helicobacter pylori that I sent for whole-genome sequencing. I have paired-end Illumina reads, which I processed with Trimmomatic and checked with FastQC to make sure everything looked acceptable. I then tried de novo assembly of the forward and reverse paired reads using Velvet, ABySS and SPAdes. I took the contig files produced by these three assemblers and ran them through QUAST to evaluate which assembly worked best. I have attached links to the alignment produced and the summary file.
Alignment:
https://drive.google.com/file/d/0B1G2M5ad3_x4Vm0tbzBhMGJSVVk/view?usp=sharing
Summary file:
https://drive.google.com/file/d/0B1G2M5ad3_x4amtFc3RiNkxRMVE/view?usp=sharing
ABySS and SPAdes had similar output, with SPAdes perhaps being marginally better based on number of contigs, largest contig, and N50. Velvet was quite different from ABySS and SPAdes and had far fewer misassemblies (6 vs 35 for ABySS and 31 for SPAdes). I am not sure what would account for this large difference.
If anyone could point me in the right direction as to which assembly is best to use and/or how to improve my assemblies, I would really appreciate it. Like I said, I am super new to NGS and have limited computing skills, so this has been a huge learning experience for me (but a fun one!).
Thanks in advance!
Did you use the same minimum contig length threshold for all three assemblies? If not, the number of contigs and the N50 values can't be compared between assemblies and are pretty much meaningless.
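To show why the threshold matters, here is a rough Python sketch. The contig lengths are made up and the `n50` helper is a hypothetical name for illustration only; it is not how any of the assemblers or QUAST compute their statistics internally. It just applies the usual N50 definition (the length of the contig at which the running total, going from longest to shortest, reaches half of the total assembly length) after dropping contigs below a cutoff.

```python
def n50(lengths, min_len=0):
    """Return (contigs kept, N50) after dropping contigs shorter than min_len."""
    # Keep only contigs at or above the cutoff, longest first
    kept = sorted((n for n in lengths if n >= min_len), reverse=True)
    if not kept:
        return 0, 0
    half = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        # The first contig that pushes the running total to half the assembly is the N50
        if running >= half:
            return len(kept), n


# Made-up contig lengths, purely for illustration
lengths = [900, 500, 300, 150, 100, 50]

for cutoff in (0, 200, 500):
    count, value = n50(lengths, cutoff)
    print(f"min length {cutoff}: {count} contigs, N50 = {value}")
```

With these toy numbers the same set of contigs gives 6 contigs and an N50 of 500 with no cutoff, but 3 contigs and an N50 of 900 with a cutoff of 200, so two assemblies filtered at different thresholds would look different even if they were identical.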
Yes, I used the same minimum contig length of 200 for all three assemblies.