We had some bacterial genomes sequenced by a biotech company. They sequenced on an Illumina NovaSeq and did de novo assembly with Shovill. I need to submit the genome data to NCBI, but I am confused by the modifier details they provided: for one bacterium they report a genome coverage of, e.g., 497. Could anyone help me understand what this actually means?
As far as I know, genome coverage can be at most 100%, so how can it exceed 100? And if a good sequencing depth is approximately 30x, then what does a genome coverage of 497 mean?
Please share your point of view or any article or web link. Thanks in advance.
Okay, thank you so much for your response. Can you tell me what the standard depth is for a bacterial genome?
A rule of thumb is that a coverage of 50 is usually enough to make good assemblies of prokaryotic and eukaryotic genomes. N50 approximately reaches a plateau when the coverage is 50 or so. For example, see https://pubmed.ncbi.nlm.nih.gov/23593174/ , https://pubmed.ncbi.nlm.nih.gov/26315384/ , https://pubmed.ncbi.nlm.nih.gov/32781410/ , https://pubmed.ncbi.nlm.nih.gov/34485177/ , https://pubmed.ncbi.nlm.nih.gov/32385271/ .
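To make the numbers concrete: coverage in this sense is the depth, i.e. the average number of reads covering each base, which is why it can be far above 100. Here is a minimal Python sketch of that arithmetic. The genome size, read length, and read count are made-up illustrative values, not numbers from your dataset, and the function names are my own.

```python
import math


def mean_depth(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth = total sequenced bases / genome size (in bp)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


def reads_for_depth(target_depth: float, read_length: int, genome_size: int) -> int:
    """Number of reads of a given length needed to reach a target average depth."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(target_depth * genome_size / read_length)


if __name__ == "__main__":
    # Hypothetical example: a 5 Mbp genome sequenced with 150 bp reads.
    genome_size = 5_000_000
    read_length = 150

    # About 16.6 million such reads give a depth close to 497x.
    print(round(mean_depth(16_566_667, read_length, genome_size)))

    # Reads needed for the 50x rule of thumb.
    print(reads_for_depth(50, read_length, genome_size))
```

Note that this is only the nominal depth from raw reads. The depth measured after trimming and mapping is usually somewhat lower.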
However, there are exceptions. The coverage by Illumina reads (unlike the coverage by Nanopore or PacBio reads) depends strongly on the GC content (https://pubmed.ncbi.nlm.nih.gov/22323520/). Coverage is highest in regions with a GC content of approximately 50% and is lower in GC-rich or AT-rich regions. I once assembled a bacterial genome that had several very GC-rich regions (https://pubmed.ncbi.nlm.nih.gov/32793774/). Even though the average coverage by Illumina reads was 467, the coverage in these regions was 0. We had to sequence this genome with a Nanopore sequencer to make a complete assembly.
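Since a high average can hide such gaps, it is worth checking the per-base depth and not only the mean. Below is a rough Python sketch that reports zero-coverage stretches from a list of per-base depths. It assumes you already have one depth value per position of a single contig (for example, the third column of `samtools depth -a` output). The function name and the toy numbers are only for illustration.

```python
from typing import List, Tuple


def zero_coverage_regions(depths: List[int]) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) intervals where depth is 0."""
    regions = []
    start = None  # start of the current zero-depth run, if any
    for pos, depth in enumerate(depths, start=1):
        if depth == 0:
            if start is None:
                start = pos
        elif start is not None:
            # The run of zeros ended at the previous position.
            regions.append((start, pos - 1))
            start = None
    if start is not None:
        # The contig ends inside a zero-depth run.
        regions.append((start, len(depths)))
    return regions


if __name__ == "__main__":
    # Toy per-base depths: high on average, but with two uncovered stretches.
    depths = [500, 480, 0, 0, 0, 510, 0]
    print(sum(depths) / len(depths))      # mean depth still looks high
    print(zero_coverage_regions(depths))  # uncovered intervals
```

On a real genome you would run this per contig and then look at the GC content of the reported intervals.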
Thank you so much for your informative response.