Question: Generate single genome consensus fasta file from SPAdes output: viral genome sequencing
8 months ago, snow_seq wrote:

I am assembling a 30 kb viral genome de novo using SPAdes. I have FASTA files for the scaffolds and the contigs. How do I process these into one final genome sequence, i.e. one FASTA file containing one sequence? What software is available for doing this? I would like to do pairwise alignments with published viral genome sequences from other labs/sources. Thanks in advance!

8 months ago, Asaf wrote:

The contigs file you have is the best the algorithm could do in assembling the genome. The scaffolds are contigs stitched together using paired-end data and some guessing about the insert size. I'm surprised you didn't get the entire genome in one contig. Do you have a reference to compare to? (My guess is you have thousands by now.)
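
To get a quick sense of how fragmented the assembly is, the contigs (or scaffolds) FASTA can be summarized with a few lines of Python. This is only a minimal sketch using the standard library. The function names are made up for illustration, and the file name in the usage note assumes the default SPAdes output name, so adjust it to your own run.

```python
def read_fasta_lengths(path):
    """Return {record_name: sequence_length} for every record in a FASTA file."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the name.
                name = line[1:].split()[0]
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def n50(lengths):
    """Length L such that records of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def summarize(path):
    """Summarize record count, total size, largest record, and N50 of a FASTA file."""
    values = list(read_fasta_lengths(path).values())
    return {
        "n_contigs": len(values),
        "total_bp": sum(values),
        "largest_bp": max(values, default=0),
        "n50_bp": n50(values),
    }
```

For example, `print(summarize("contigs.fasta"))` reports how many contigs there are and how long the largest one is. For a 30 kb genome, a complete assembly would ideally show a single record close to that length.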


Hi, thanks for your reply! Yes, I'm sequencing SARS-CoV-2. My largest contig is about 3 kb, and all scaffolds/contigs together cover only about 85-90% of the genome. Before assembly, I trimmed adapters and primers and normalized to a depth of 100X. The per-nucleotide coverage looked fine before assembly and after normalization, so I'm not quite sure why I am getting poor assembly results.

Reply written 8 months ago by snow_seq

Maybe take a look at the assembly graph. Also, you can do amplicon sequencing.

Reply written 8 months ago by Asaf
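
The assembly graph is usually inspected visually (for example in a viewer such as Bandage), but a rough text summary can also point to where the assembly breaks. The sketch below assumes the graph is in GFA 1 format, which SPAdes normally writes as `assembly_graph_with_scaffolds.gfa`. The function names are hypothetical, and for simplicity the sketch ignores segment orientation, so treat the result as a hint rather than a full graph analysis.

```python
from collections import defaultdict


def parse_gfa(path):
    """Read segment lengths and undirected segment adjacency from a GFA 1 file."""
    segments = {}                 # segment name -> length in bp
    neighbours = defaultdict(set)  # segment name -> names of linked segments
    n_links = 0
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if fields[0] == "S" and len(fields) >= 3:
                name, seq = fields[1], fields[2]
                length = 0 if seq == "*" else len(seq)
                # If the sequence is omitted ("*"), fall back to the LN:i: tag.
                for tag in fields[3:]:
                    if seq == "*" and tag.startswith("LN:i:"):
                        length = int(tag[5:])
                segments[name] = length
            elif fields[0] == "L" and len(fields) >= 5:
                a, b = fields[1], fields[3]
                neighbours[a].add(b)
                neighbours[b].add(a)
                n_links += 1
    return segments, neighbours, n_links


def graph_report(path):
    """List segments with no links and segments linked to more than two others."""
    segments, neighbours, n_links = parse_gfa(path)
    isolated = sorted(s for s in segments if not neighbours.get(s))
    branching = sorted(s for s in segments if len(neighbours.get(s, ())) > 2)
    return {
        "n_segments": len(segments),
        "n_links": n_links,
        "isolated": isolated,
        "branching": branching,
    }
```

Calling `graph_report("assembly_graph_with_scaffolds.gfa")` lists segments that are not connected to anything and segments where the graph branches, which are the places worth looking at first when the genome does not come out as a single contig.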