Question: Is it better to annotate contigs or scaffolds?
mgalactus (United Kingdom) wrote, 6.1 years ago:

Hi,

I'm annotating some bacterial genomes, and I was wondering whether it makes more sense to annotate the contigs and then scaffold them, or whether it would be OK to annotate the scaffolds directly. I'm planning to submit these genomes to NCBI, so the result should comply with their standards as well.

Thanks

dago (Germany) wrote, 6.1 years ago:

I would say that annotating scaffolds makes much more sense. One scaffold can be made up of many contigs, and at the end of a contig you may find a CDS or a gene cluster broken in the middle. This can produce incorrect annotation or give you only partial information on gene order in the genome. In the scaffolds this bias should likely be reduced.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  -> scaffold

xxxxxx                     xxxxxxxxxxxxx 

          xxxxxxxxxxxx                          xxxxxxxxxxxxx -> contigs
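
To make the point concrete, here is a minimal Python sketch. The coordinates are made up for illustration (0-based, half-open, on a single scaffold), and `contig_pieces` is a hypothetical helper, not part of any annotation tool. It shows how a CDS that is one feature on the scaffold falls into two fragments when the contigs are considered separately.

```python
# Hypothetical layout: two contigs on one scaffold, separated by a 100 bp gap.
contigs = [(0, 1200), (1300, 2500)]
# Hypothetical CDS that spans the gap between the two contigs.
cds = (1100, 1400)


def contig_pieces(feature, contigs):
    """Return the parts of `feature` that overlap each contig interval."""
    start, end = feature
    pieces = []
    for c_start, c_end in contigs:
        lo, hi = max(start, c_start), min(end, c_end)
        if lo < hi:  # non-empty overlap with this contig
            pieces.append((lo, hi))
    return pieces


pieces = contig_pieces(cds, contigs)
print(pieces)
```

A feature that comes back as more than one piece would only be seen as partial fragments if each contig were annotated on its own.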

 


Thanks for the reply. You are probably right, but do you know whether NCBI wants the scaffolding information to be provided in some form?

Reply written 6.1 years ago by mgalactus

Take a look here

Reply written 6.1 years ago by dago
HG (Germany) wrote, 6.1 years ago:

Please find below an email response I received from NCBI a long time ago:

"We do accept gapped submissions if N's represent gaps between ordered
and oriented contiguous sequences.  If you are using estimated gap sizes,
then the number of N's should exactly match the estimated gap size.
If you are unsure of the gap size, you should add 100 N's in the sequence
file.

For more information on preparing a gapped submission please see
http://www.ncbi.nlm.nih.gov/genbank/wgs_gapped

Please note we offer two submission pathways (Complete and WGS):

1. The genome assembly could be submitted as a complete genome if it falls into either of these cases:
  a. You have sequenced the complete circular genome and there are no gaps
  b. You know the order and orientation of the contigs and were able to assemble your sequences, with Ns between the contigs, into a single scaffold representing the circular genome with no extra unplaced contigs

Genomes in the complete category should be submitted as .sqn files with or without annotation using GenomesMacroSend (http://www.ncbi.nlm.nih.gov/projects/GenomeSubmit/genome_submit.cgi) as described in http://www.ncbi.nlm.nih.gov/Genbank/genomesubmit.html.

2. If the genome assembly is in multiple pieces that you were unable to assemble into a complete chromosome, then submit the contigs to our Whole Genome Shotgun (WGS) database using the WGS submission portal (https://submit.ncbi.nlm.nih.gov/subs/wgs/).  See the WGS page, http://www.ncbi.nlm.nih.gov/Genbank/wgs.submit.html for details.

Please contact us at genomes@ncbi.nlm.nih.gov if you have additional questions."
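
The N-gap convention described in the email can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not an NCBI tool: `build_scaffold` and `reverse_complement` are hypothetical helpers, the contigs are assumed to be already ordered and oriented, and an unknown gap size is passed as `None`.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    comp = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(comp)[::-1]


def build_scaffold(contigs, gaps, unknown_gap=100):
    """Join ordered, oriented contigs with runs of N.

    contigs: list of (sequence, strand) tuples, strand is "+" or "-".
    gaps:    estimated gap size between consecutive contigs, or None
             when the size is unknown (then `unknown_gap` Ns are used,
             following the 100 N convention quoted above).
    """
    if len(gaps) != len(contigs) - 1:
        raise ValueError("need exactly one gap between each pair of contigs")
    parts = []
    for i, (seq, strand) in enumerate(contigs):
        parts.append(seq if strand == "+" else reverse_complement(seq))
        if i < len(gaps):
            size = gaps[i] if gaps[i] is not None else unknown_gap
            parts.append("N" * size)
    return "".join(parts)


# Toy example: an estimated 5 bp gap, then a gap of unknown size.
scaffold = build_scaffold([("ACGT", "+"), ("AAAC", "-"), ("GG", "+")], [5, None])
print(len(scaffold))
```

The sequences here are toy data. Real submissions should follow the gapped-submission page linked in the email.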
