Question: Is it better to annotate contigs or scaffolds?
6.1 years ago by
United Kingdom
mgalactus750 wrote:


I'm annotating some bacterial genomes, and I was wondering whether it makes more sense to annotate the contigs and then scaffold them, or whether it is fine to annotate the scaffolds directly. I'm planning to submit these genomes to NCBI, so the result should comply with their standards as well.


6.1 years ago by
dago2.6k wrote:

I would say that annotating scaffolds makes much more sense. One scaffold can be made up of many contigs, and at the end of a contig you may find a CDS or a gene cluster broken in the middle. This can produce incorrect annotations or give you only partial information on the gene order in the genome. In the scaffolds this problem should be reduced.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  -> scaffold

xxxxxx                     xxxxxxxxxxxxx 

          xxxxxxxxxxxx                          xxxxxxxxxxxxx -> contigs
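To make the point about broken CDSs concrete, here is a minimal sketch of how one might flag features that run right up to a contig end, since those are the ones most likely to be truncated. The function name `flag_edge_features`, the `(name, start, end)` tuple layout, and the 0-based, end-exclusive coordinates are all assumptions made for illustration; they do not come from any particular annotation tool.

```python
def flag_edge_features(contig_length, features, margin=0):
    """Return names of features that reach within `margin` bases of a contig end.

    Assumes 0-based, end-exclusive coordinates: a feature starting at 0 touches
    the left end, and a feature ending at `contig_length` touches the right end.
    """
    flagged = []
    for name, start, end in features:
        if start <= margin or end >= contig_length - margin:
            flagged.append(name)
    return flagged


# Hypothetical features on a hypothetical 2000 bp contig.
features = [("geneA", 0, 450), ("geneB", 600, 1500), ("geneC", 1700, 2000)]
edge_hits = flag_edge_features(2000, features)
print(edge_hits)
```

Features flagged this way are worth re-checking after scaffolding, because the missing part may sit on the neighbouring contig.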


6.1 years ago

Thanks for the reply: you are probably right, but do you have any information on whether NCBI wants the scaffolding information to be provided in some form?

6.1 years ago

Take a look here

6.1 years ago by
HG1.1k wrote:

Here is an email response I got from NCBI a long time ago:

"We do accept gapped submissions if N's represent gaps between ordered
and oriented contiguous sequences.  If you are using estimated gap sizes,
then the number of N's should exactly match the estimated gap size.
If you are unsure of the gap size, you should add 100 N's in the sequence

For more information on preparing a gapped submission please see

Please note we offer two submission pathways (Complete and WGS):

1. The genome assembly could be submitted as a complete genome if it falls into either of these cases:
  a. You have sequenced the complete circular genome and there are no gaps
  b. You know the order and orientation of the contigs and were able to assemble your sequences, with Ns between the contigs, into a single scaffold representing the circular genome with no extra unplaced contigs

Genomes in the complete category should be submitted as .sqn files with or without annotation using GenomesMacroSend ( as described in

2. If the genome assembly is in multiple pieces that you were unable to assemble into a complete chromosome, then submit the contigs to our Whole Genome Shotgun (WGS) database using the WGS submission portal (  See the WGS page, for details.

Please contact us at if you have additional questions."
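The gap rule in the email (use exactly the estimated gap size in N's, or 100 N's when the size is unknown) can be sketched roughly as follows. This is only an illustration, not an NCBI tool: `build_scaffold`, `reverse_complement`, the `(sequence, orientation)` input layout, and the use of `None` for an unknown gap are hypothetical choices, and the contigs are assumed to be already ordered.

```python
DEFAULT_GAP = 100  # length suggested in the email when the gap size is unknown


def reverse_complement(seq):
    """Reverse-complement a DNA string (only A, C, G, T, N are handled)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def build_scaffold(contigs, gaps):
    """Join ordered, oriented contigs into one scaffold with runs of N as gaps.

    contigs: list of (sequence, orientation), orientation being '+' or '-'
    gaps:    estimated gap sizes between consecutive contigs; None means unknown
    """
    if len(gaps) != len(contigs) - 1:
        raise ValueError("need exactly one gap entry between each pair of contigs")
    parts = []
    for i, (seq, strand) in enumerate(contigs):
        parts.append(seq if strand == "+" else reverse_complement(seq))
        if i < len(gaps):
            gap = gaps[i]
            parts.append("N" * (DEFAULT_GAP if gap is None else gap))
    return "".join(parts)


# Toy example: first gap estimated at 5 bp, second gap of unknown size.
scaffold = build_scaffold([("ACGT", "+"), ("AAGG", "-"), ("TTTT", "+")], [5, None])
print(len(scaffold), scaffold.count("N"))
```

Real submissions should still follow whatever NCBI's current instructions say; the sketch only shows the arithmetic of the gap rule quoted above.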

6.1 years ago
