Question: Submission of a draft genome with or without repetitive contigs?
minions-b wrote (6 months ago):

I have a draft fungal genome assembled and scaffolded with SPAdes (ca. 1,400 scaffolds), which I plan to submit to EMBL. However, ca. 150 of the scaffolds are very short (200-500 bp) and were identified as repetitive contigs when I ran nucmer/funannotate as a sanity check before gene prediction. Should I exclude the repetitive scaffolds from the submission?

All suggestions are highly appreciated!


I would hazard a guess that EMBL would reject those contigs anyway.

I think you're best off removing them, as contigs shorter than a kilobase or so are uninformative and most likely junk.

Reply written 6 months ago by jrj.healey

Thanks! I will remove them. :)

Reply written 6 months ago by minions-b
Juke-34 (Sweden) wrote (6 months ago):

It depends. They don't care whether a sequence is a repeat element or not; they only care about sequence length. As far as I remember, they ask you to justify your choice if you want to keep short sequences/contigs (<100 bp). The best approach would be to keep all of them; then, once you have your EMBL flat file, run the EMBL flat file validator and it will tell you what you have to remove (none, I guess, in your case).
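If you want to see in advance which scaffolds fall under a length cutoff, a minimal Python sketch along these lines could help. It is only an illustration: the function names `read_fasta` and `split_by_length` are made up for this example, and the 100 bp default is the figure recalled in this thread, not a confirmed EMBL rule. The validator remains the authority on what has to be removed.

```python
from typing import Dict, Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {sequence_id: sequence}.

    The ID is the first whitespace-delimited token of each header line.
    Duplicate IDs are not handled: a later record overwrites an earlier one.
    """
    records: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def split_by_length(
    records: Dict[str, str], min_len: int = 100
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (kept, short): sequences with length >= min_len, and the rest.

    min_len=100 is an assumption taken from the thread, not an official cutoff.
    """
    kept = {name: seq for name, seq in records.items() if len(seq) >= min_len}
    short = {name: seq for name, seq in records.items() if len(seq) < min_len}
    return kept, short
```

For a real assembly you would pass an open file handle to `read_fasta`, for example `read_fasta(open("scaffolds.fasta"))` with a hypothetical file name, and then inspect the `short` dictionary before deciding what to drop.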
