Question: Generate single genome consensus fasta file from SPAdes output: viral genome sequencing
12 weeks ago by
snow_seq wrote:

I am assembling a 30 kb viral genome de novo using SPAdes. I have FASTA files for scaffolds and contigs. How do I process these into one final genome sequence, i.e. one FASTA file with one sequence? What software is there for doing this? I would like to do pairwise alignment with published viral genome sequences from other labs/sources. Thanks in advance!

12 weeks ago by
Asaf wrote:

The contigs file you have is the best the algorithm could do in assembling the genome. The scaffolds are contigs stitched together using paired-end data and an estimate of the insert size. I'm surprised you didn't get the entire genome in one contig. Do you have a reference to compare to? (My guess is you have thousands by now.)
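To see where a scaffold was stitched together from separate contigs, one option is to look for runs of N in the scaffold sequence. This is only a minimal sketch: it assumes the joins are written as runs of N (the usual FASTA convention for gaps of estimated size), and the function names `find_gaps` and `split_at_gaps` are made up for illustration, not part of SPAdes.

```python
import re


def find_gaps(sequence, min_run=1):
    """Return (start, end) pairs, 0-based and half-open, for each run of N/n
    that is at least min_run bases long."""
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", sequence)
        if m.end() - m.start() >= min_run
    ]


def split_at_gaps(sequence, min_run=1):
    """Split a scaffold back into its gap-free pieces at each qualifying N run."""
    pieces = []
    prev = 0
    for start, end in find_gaps(sequence, min_run):
        if start > prev:
            pieces.append(sequence[prev:start])
        prev = end
    if prev < len(sequence):
        pieces.append(sequence[prev:])
    return pieces


# Toy scaffold (hypothetical data): three pieces joined by two N runs.
toy_scaffold = "ACGTNNNNACGTNACG"
print(find_gaps(toy_scaffold))
print(split_at_gaps(toy_scaffold))
```

Raising `min_run` ignores isolated ambiguous bases, so only longer N runs are treated as joins between contigs.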


Hi, thanks for your reply! Yes, I'm sequencing SARS-CoV-2. My largest contig is about 3 kb, and all scaffolds/contigs together cover only about 85-90% of the genome. Before assembly, I trimmed adapters and primers and normalized to a depth of 100X. The per-nucleotide coverage looked fine before assembly and after normalization, so I'm not quite sure why I am getting poor assembly results.

Reply written 12 weeks ago by snow_seq
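For numbers like the ones in the reply above (largest contig, how much sequence was assembled relative to the genome size), a quick summary of the contigs FASTA can help. The following is a rough sketch using only the standard library; the function names `read_fasta_lengths` and `summarize` are hypothetical, and the toy records are invented for the demo. Note that total contig length divided by expected genome size is only a crude proxy: contigs can overlap or be redundant, so it is not the same as genome coverage measured by aligning to a reference.

```python
import io


def read_fasta_lengths(handle):
    """Return {record_name: sequence_length} for a FASTA file-like object."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize(lengths, expected_genome_size):
    """Basic assembly statistics from a {name: length} mapping."""
    sizes = sorted(lengths.values(), reverse=True)
    total = sum(sizes)
    # N50: length of the contig at which the running total (largest first)
    # reaches at least half of all assembled bases.
    running = 0
    n50 = 0
    for size in sizes:
        running += size
        if running * 2 >= total:
            n50 = size
            break
    return {
        "n_contigs": len(sizes),
        "largest": sizes[0] if sizes else 0,
        "total_bases": total,
        "n50": n50,
        "total_vs_expected": total / expected_genome_size,
    }


# Toy input (hypothetical records) standing in for a real contigs file.
demo = io.StringIO(">NODE_1\nACGTACGTAC\n>NODE_2\nACGTAC\n>NODE_3\nACGT\n")
demo_stats = summarize(read_fasta_lengths(demo), expected_genome_size=25)
print(demo_stats)
```

With real data, you would replace the `io.StringIO` demo with `open("contigs.fasta")` (the contigs file SPAdes writes to its output directory) and pass the expected genome size, roughly 30000 for the genome discussed here.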

Maybe take a look at the assembly graph. Also, you can do amplicon sequencing:

Reply written 12 weeks ago by Asaf