I have isolated a virus (a phage), sequenced it, and run a taxonomic search using Kraken. The most similar taxon is phage BT-1011:
Then, I created a consensus sequence with BBMap and mapped it against the genome of this virus. With FastANI, I got this visualization of the mapping:
There are two issues I am not sure about:
- There is still a large portion of the genome that is not classified (the pink area in the first plot).
- The genome of the isolate is much smaller than the reference (second plot).
I fear that mapping against BT-1011 has left out a large chunk of the genome.
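To put a number on that fear, one option would be to compute how much of the reference is actually covered by my reads. Below is a minimal sketch, assuming a tab-separated per-base depth table with three columns (reference name, position, depth), which is the layout `samtools depth -a` writes. The function name `breadth_of_coverage` and the toy input are made up for illustration and are not my real data.

```python
def breadth_of_coverage(depth_lines, min_depth=1):
    """Return the fraction of reference positions with depth >= min_depth.

    depth_lines: iterable of tab-separated lines "ref<TAB>pos<TAB>depth"
    (assumed layout; every reference position must be listed, including
    positions with zero depth).
    """
    total = 0
    covered = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _ref, _pos, depth = line.split("\t")[:3]
        total += 1
        if int(depth) >= min_depth:
            covered += 1
    return covered / total if total else 0.0


# Toy input (hypothetical): 4 reference positions, 2 of them covered.
toy_depth = ["ref\t1\t12", "ref\t2\t0", "ref\t3\t5", "ref\t4\t0"]
print(breadth_of_coverage(toy_depth))
```

A low fraction would mean that large parts of BT-1011 have no reads from my isolate. It would not say anything about regions of my isolate that are absent from BT-1011, since reads from those regions simply fail to map.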
My question is: what is the correct procedure for the assembly of a phage genome?
Is mapping against the most closely related phage enough (as I did)?
Or should I generate a de novo assembly and leave it as is?
NOTE: I made such an assembly but I got several contigs; how can I get a single consensus sequence from those?
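To make "several contigs" more concrete, the assembly could be summarized by contig count, total length, and N50. This is only a rough sketch: the helper names `contig_lengths` and `n50` are my own, and the toy FASTA lines are invented, not taken from my assembly.

```python
def contig_lengths(fasta_lines):
    """Return the length of each record in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50: the contig length at which the running total,
    taken from the longest contig down, reaches half the assembly size."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


# Toy FASTA (hypothetical): three contigs of 10, 4 and 2 bases.
toy_fasta = [">c1", "ACGTACGT", "AC", ">c2", "ACGT", ">c3", "AC"]
lengths = contig_lengths(toy_fasta)
print(len(lengths), sum(lengths), n50(lengths))
```

If one contig is close to the expected genome size and the rest are short, that would suggest the assembly is nearly complete rather than badly fragmented.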
Thank you