Question: Generate single genome consensus fasta file from SPAdes output: viral genome sequencing
12 weeks ago by
I am assembling a ~30 kb viral genome de novo using SPAdes. I have FASTA files for scaffolds and contigs. How do I process these into one final genome sequence, i.e. one FASTA file with one sequence? What software is there for doing this? I would like to do pairwise alignment with published viral genome sequences from other labs/sources. Thanks in advance!

12 weeks ago by
The contigs file you have is the best the algorithm could do in assembling the genome. The scaffolds are contigs stitched together using paired-end data and some guessing about the insert size. I'm surprised you didn't get the entire genome in one contig. Do you have a reference to compare to? (My guess is you have thousands to choose from by now.)
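
To make the contig/scaffold distinction concrete, here is a minimal Python sketch. It assumes that gaps between the joined contigs are written as runs of N in the scaffold sequence, which is worth checking against your own scaffolds file. The function name `split_scaffold` and the toy sequence are made up for illustration.

```python
import re


def split_scaffold(seq, min_gap=1):
    """Split a scaffold at runs of N (assumed gap placeholders).

    Returns the ungapped pieces in order, i.e. roughly the contigs
    that the scaffolder joined. min_gap is the shortest run of N
    that is treated as a gap.
    """
    gap = re.compile(r"[Nn]{%d,}" % min_gap)
    return [piece for piece in gap.split(seq) if piece]


# Toy scaffold: two short "contigs" joined by a gap of 10 N's.
scaffold = "ACGTACGT" + "N" * 10 + "GGCCTTAA"
pieces = split_scaffold(scaffold)
print(len(pieces), [len(p) for p in pieces])
```

If a scaffold comes back as a single piece, it contains no gap and is effectively just a contig.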

Hi, thanks for your reply! Yes, I'm sequencing SARS-CoV-2. My largest contig is about 3 kb, and all scaffolds/contigs together cover only about 85-90% of the genome. Before assembly, I trimmed adapters and primers and normalized to a depth of 100X. The per-nucleotide coverage looked fine before assembly and after normalization, so I'm not quite sure why I am getting poor assembly results.
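
For anyone wanting to put numbers on a fragmented assembly like this, below is a rough Python sketch that summarises contig lengths (count, longest, total, N50) against an expected genome size. The helper names, the toy records, and the genome size of 30 are all made up for illustration. Note that total contig length divided by genome size is only a crude proxy for coverage of the genome, since contigs can overlap; mapping the contigs to a reference gives the real figure.

```python
def read_fasta(lines):
    """Parse FASTA text (an iterable of lines) into {name: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the first word of the header
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def assembly_stats(lengths, genome_size):
    """Summarise contig lengths against an expected genome size."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running, n50 = 0, 0
    for length in ordered:
        running += length
        if running >= total / 2:  # N50: contig at which half the total is reached
            n50 = length
            break
    return {
        "contigs": len(ordered),
        "longest": ordered[0] if ordered else 0,
        "total": total,
        "n50": n50,
        "fraction_of_genome": total / genome_size,
    }


# Toy input with illustrative headers; real data would come from a file.
toy = """>contig_1
ACGTAC
GTACGT
>contig_2
GGCCTTAA
>contig_3
ACGT
""".splitlines()

contigs = read_fasta(toy)
stats = assembly_stats([len(s) for s in contigs.values()], genome_size=30)
print(stats)
```

On real data, replace the toy lines with an open file handle, e.g. `with open("contigs.fasta") as fh: contigs = read_fasta(fh)`, and set `genome_size` to the length of the reference you are comparing against.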

Maybe take a look at the assembly graph. Also, you can do amplicon sequencing:

