Question: Submission of a draft genome with or without repetitive contigs?
minions-b wrote:

I have a draft fungal genome assembled and scaffolded with SPAdes (ca. 1,400 scaffolds), which I plan to submit to EMBL. However, ca. 150 of the scaffolds are very short (200-500 bp) and were identified as repetitive contigs when I ran nucmer/funannotate as a sanity check before gene prediction. Should I exclude these repetitive scaffolds from the submission?

All suggestions are highly appreciated!

genome • 469 views
written 17 months ago by minions-b

I would hazard a guess that EMBL would reject those contigs anyway.

I think you’re best off removing them, as contigs shorter than a kilobase or so are uninformative and most likely junk.

written 17 months ago by Joe

Thanks! I will remove them. :)

written 17 months ago by minions-b
Juke-34 wrote:

It depends. They don’t care whether a sequence is a repeat element or not; they only care about sequence length. As far as I remember, they ask you to justify your choice when you want to keep short sequences/contigs (<100 bp). The best approach would be to keep all of them; then, once you have your EMBL flat file, run the EMBL flat file validator and it will tell you what you have to remove (none, I guess, in your case).
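If you want a quick look at scaffold lengths before building the flat file, something like the sketch below may help. It is only an illustration and not a substitute for the validator: the function names `fasta_lengths` and `shorter_than` are made up for this example, and the 100 bp cutoff is just the figure recalled above, not a confirmed submission rule.

```python
import sys
from typing import Dict, Iterable, List


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def shorter_than(lengths: Dict[str, int], min_len: int) -> List[str]:
    """IDs of sequences strictly shorter than min_len, in input order."""
    return [seq_id for seq_id, n in lengths.items() if n < min_len]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python check_lengths.py scaffolds.fasta
    with open(sys.argv[1]) as handle:
        lengths = fasta_lengths(handle)
    short = shorter_than(lengths, 100)  # 100 bp: cutoff assumed from the answer above
    print(f"{len(lengths)} sequences, {len(short)} shorter than 100 bp")
    for seq_id in short:
        print(seq_id, lengths[seq_id])
```

With scaffolds of 200-500 bp, as in the question, this would be expected to list nothing at the 100 bp cutoff; raising the cutoff shows what a stricter length filter would drop.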

ADD COMMENTlink modified 17 months ago • written 17 months ago by Juke-343.7k