I am fairly new to bioinformatics and am trying to wrap my head around genome sequencing and assembly. I need to analyze the genome of a bacterium called Shewanella benthica KT99 to predict the genes it encodes. However, the reported assembly level of the genome is "Contig". I understand that contigs are contiguous fragments of the genome within which the order of the bases is considered correct. What I don't quite understand is why the authors of the paper reporting the draft genome sequence could not assemble it to the "Complete Genome" level. On another note, there is a very closely related bacterium called Shewanella piezotolerans WP3 whose assembly level on NCBI is "Complete Genome". Both were sequenced on ABI 3730 family DNA sequencers, and the two projects were only a year or two apart. Why are the assembly levels different? The details I got from NCBI for my bacterium of interest are listed at the end of this post.
So far I have used an established pipeline to work with complete genome sequences of bacteria. Is there a way to take all of these contigs and assemble them into the full genome of this bacterium? If so, how do I do it, and which open-source software packages can I use?
Thank you in advance!
Organism name: Shewanella benthica KT99 (g-proteobacteria)
Infraspecific name: Strain: KT99
Project: PRJNA13387
Submitter: The Gordon and Betty Moore Foundation Marine Microbiology Initiative
Assembly level: Contig
Genome representation: full
RefSeq category: representative genome
GenBank assembly accession: GCA_000172075.1 (latest)
RefSeq assembly accession: GCF_000172075.1 (latest)
RefSeq assembly and GenBank assembly identical: yes
WGS Project: ABIC01*
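To show what "contig level" means in practice, here is a minimal sketch of how the contigs could be summarized (number of contigs, total length, N50). It assumes the contigs for this assembly have been downloaded as a single multi-FASTA file. The function names and the file name `kt99_contigs.fasta` are my own placeholders, not anything provided by NCBI.

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after '>'.
            fields = line[1:].split(maxsplit=1)
            current = fields[0] if fields else f"unnamed_{len(lengths) + 1}"
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Contig count, total bases, and N50 for a multi-FASTA input."""
    lengths = list(read_fasta_lengths(lines).values())
    return {"contigs": len(lengths), "total_bp": sum(lengths), "n50": n50(lengths)}
```

With a real file this could be called as `summarize(open("kt99_contigs.fasta"))` (hypothetical file name). A complete genome would typically show one sequence per replicon, whereas a contig-level assembly shows many shorter ones.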