I am testing different methods for assembling viral genomes from Illumina data and am getting some surprising results from SPAdes. For comparison, I mapped the reads to a reference genome in Geneious and can see that they cover 99.1% of the 10 kb genome at an average depth of 11,258x (I sequenced very deeply and enriched the library for viral reads). So I assumed there would be more than enough data for SPAdes to output the entire genome.
However, when I run SPAdes (in paired-end mode), two unusual things happen. First, it does not assemble the entire viral genome, or even any substantial contigs of it. When I map the scaffolds back to the reference genome, they cover only about 46% of it. I can bring this up to 85% by using the "trusted contigs" option (--trusted-contigs), but this is still well below the 99% I get from mapping all the reads directly. Does anyone have an idea why this might be the case, or where I should start looking for the problem? I know SPAdes works for many people, and the data I am inputting seems like it should be more than sufficient to get back a full genome.
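To make the coverage figure concrete, here is a minimal sketch of how a breadth-of-coverage number like the 46% above could be computed from the reference coordinates of mapped scaffolds. The function name and the example coordinates are hypothetical, and Geneious may compute its figure differently.

```python
def breadth_of_coverage(intervals, genome_length):
    """Return the fraction of the reference covered by at least one interval.

    intervals: iterable of (start, end) reference positions, 1-based inclusive.
    genome_length: length of the reference in bases.
    """
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / genome_length


# Hypothetical scaffold placements on a 10 kb reference (not real data).
example = [(1, 2000), (1500, 3000), (6000, 7600)]
print(f"{breadth_of_coverage(example, 10_000):.1%}")
```

Overlapping scaffolds are merged before counting, so bases covered by more than one scaffold are only counted once.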
Second, when I map the scaffolds back to the reference, I can see that many of them overlap substantially with each other. Can anyone explain why they aren't joined into a larger contig/scaffold? And are there any SPAdes options I can add to join them?
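For anyone wanting to check the same thing on their own data, this is a rough sketch of how overlapping scaffold pairs could be listed from their mapped reference coordinates. The scaffold names, coordinates, and the min_overlap threshold are hypothetical. It only reports overlaps in reference space and says nothing about why the assembler left the scaffolds separate.

```python
from itertools import combinations


def overlapping_pairs(placements, min_overlap=1):
    """List scaffold pairs whose reference placements overlap.

    placements: list of (name, start, end) tuples, 1-based inclusive.
    min_overlap: smallest overlap in bases worth reporting (assumed threshold).
    """
    pairs = []
    for (name_a, start_a, end_a), (name_b, start_b, end_b) in combinations(placements, 2):
        # Shared bases between the two intervals; zero or negative means no overlap.
        overlap = min(end_a, end_b) - max(start_a, start_b) + 1
        if overlap >= min_overlap:
            pairs.append((name_a, name_b, overlap))
    return pairs


# Hypothetical placements (not real data).
example = [("s1", 1, 2000), ("s2", 1500, 3000), ("s3", 6000, 7600)]
for name_a, name_b, overlap in overlapping_pairs(example, min_overlap=100):
    print(name_a, name_b, overlap)
```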
I would appreciate any suggestions on how to figure out what is going wrong. Thanks in advance!