Question: What is the maximum number/threshold of gaps (NNNNNN) permitted in a draft genome?
18 months ago, Kumar wrote:

I sequenced a bacterial whole genome on the Illumina platform. The high-quality raw reads were assembled de novo with the CGE Bacterial Analysis Pipeline, giving a 4.9 Mb draft genome, which is close to the expected genome size for this bacterium. I have used the draft assembly for several comparative genomics analyses, such as genomic island prediction and pan-genome analysis. At some point I noticed that the assembly contains 1463 bp of N residues (0.001463 Mb out of 4.9 Mb). Is this negligible for downstream analysis? If not, please let me know how I could fix it.

Thank you in advance.


There is no defined threshold for the permitted number of gaps. You may need additional data, e.g. long reads (Oxford Nanopore or PacBio), to fill the remaining gaps.

written 18 months ago by lakhujanivijay
18 months ago, cschu181 wrote:

I'd say this is expected. Since you're only using short reads, your contigs/scaffolds cannot bridge certain low-complexity or repetitive regions, so the assembly process introduces stretches of N wherever assembly paths cannot be uniquely resolved. I wouldn't worry about this.
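
If you want to see where those stretches of N sit and how much of the assembly they cover, something like the following might help. It is only a minimal sketch using the Python standard library; the function names (`read_fasta`, `find_n_runs`, `summarize_gaps`) are made up for illustration and are not part of any pipeline mentioned in this thread, and it assumes the assembly is a plain, uncompressed FASTA file.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_n_runs(seq: str) -> List[Tuple[int, int]]:
    """Return (0-based start, length) for every run of N or n in seq."""
    return [(m.start(), m.end() - m.start()) for m in re.finditer(r"[Nn]+", seq)]


def summarize_gaps(lines: Iterable[str]) -> Tuple[Dict[str, List[Tuple[int, int]]], int, float]:
    """Return (runs per sequence, total N count, N fraction of total length)."""
    runs: Dict[str, List[Tuple[int, int]]] = {}
    total_len = 0
    total_n = 0
    for name, seq in read_fasta(lines):
        total_len += len(seq)
        seq_runs = find_n_runs(seq)
        if seq_runs:
            runs[name] = seq_runs
            total_n += sum(length for _, length in seq_runs)
    fraction = total_n / total_len if total_len else 0.0
    return runs, total_n, fraction
```

You could call it as `summarize_gaps(open("assembly.fasta"))`, where `assembly.fasta` is a placeholder for your own file. For the numbers in the question, 1463 N out of roughly 4.9 Mb is about 0.03% of the assembly.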


Thank you, cschu181.

written 18 months ago by Kumar