Question: What is the maximum number/threshold of gaps (NNNNNN) permitted in a draft genome?
Kumar40 wrote:

I sequenced a bacterial whole genome on the Illumina platform. The quality-checked raw reads were assembled de novo using the CGE Bacterial Analysis Pipeline, which gave a 4.9 Mb draft genome, close to the expected genome size for this bacterium. I have used the assembled draft genome for several comparative genomic analyses, such as genomic island prediction and pan-genome analysis. At some point I noticed that the assembly contains 1463 bp of N residues (0.001463 Mb out of 4.9 Mb). Is this negligible for further downstream analysis? If not, please let me know how I can fix it.

Thank you in advance.

There is no defined threshold for the permitted number of gaps. You may need additional data, e.g. long reads (Oxford Nanopore or PacBio), to fill the remaining gaps.

cschu181 wrote:

I'd say this is expected. Since you're only using short reads, your contigs/scaffolds cannot bridge certain low-complexity or repetitive regions, so the assembly process introduces stretches of N where assembly paths cannot be uniquely resolved. I wouldn't worry about this.
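
If it helps to see where the Ns sit before deciding whether they matter, here is a minimal sketch in plain Python that lists each run of N in a FASTA file and reports the overall N fraction. The function names (`parse_fasta`, `gap_summary`) and the toy sequences are made up for illustration; this is not part of the CGE pipeline, and it assumes a standard FASTA layout with `>` header lines.

```python
import re


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gap_summary(text):
    """Summarise runs of N (upper or lower case) across all records."""
    total_bases = 0
    n_bases = 0
    gaps = []  # (record name, 0-based start, run length)
    for name, seq in parse_fasta(text):
        total_bases += len(seq)
        for match in re.finditer(r"[Nn]+", seq):
            length = match.end() - match.start()
            gaps.append((name, match.start(), length))
            n_bases += length
    fraction = n_bases / total_bases if total_bases else 0.0
    return {
        "total_bases": total_bases,
        "n_bases": n_bases,
        "n_fraction": fraction,
        "gaps": gaps,
    }


if __name__ == "__main__":
    # Hypothetical toy input; replace with open("assembly.fasta").read()
    toy = ">scaffold1\nACGTNNNNACGT\nNNAC\n>scaffold2\nACGT\n"
    summary = gap_summary(toy)
    print(summary["n_bases"], "N bases in", summary["total_bases"], "bp")
    for name, start, length in summary["gaps"]:
        print(name, start, length)
```

For the numbers in the question, 1463 / 4,900,000 is roughly 0.03% of the assembly. The gap coordinates are the more useful output: they let you check whether any gap falls inside a predicted genomic island or a gene you care about.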

Thank you cschu181

