Question: Submission of a draft genome with or without repetitive contigs?
minions-b wrote (23 months ago):

I have a draft fungal genome assembled and scaffolded with SPAdes (ca. 1,400 scaffolds), which I plan to submit to EMBL. However, ca. 150 of the scaffolds are very short (200-500 bp) and were identified as repetitive contigs when I ran nucmer/funannotate as a sanity check before gene prediction. Should I exclude the repetitive scaffolds from the submission?

All suggestions are highly appreciated!


I would hazard a guess that EMBL would reject those contigs anyway.

I think you’re best off removing them, as contigs shorter than about a kilobase are generally uninformative and most likely junk.

written 23 months ago by Joe

Thanks! I will remove them. :)

written 23 months ago by minions-b
Juke34 (Sweden) wrote (23 months ago):

It depends. They don’t care whether a sequence is a repeat element or not; they only care about sequence length. As far as I remember, they ask you to justify your choice when you want to keep short sequences/contigs <100 bp. The best approach would be to keep all of them; then, once you have your EMBL flat file, run the EMBL flat file validator and it will tell you what you have to remove (none, I guess, in your case).
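
If you want a quick look at which scaffolds would fall under a length cutoff before running the validator, a minimal sketch along these lines might help. It is plain Python with no external libraries. The function names `read_fasta` and `short_sequences` are made up for illustration, the toy scaffolds are invented, and the 100 bp default only reflects the recollection above, not a confirmed EMBL rule. It does not replace the validator.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""  # keep only the ID part of the header
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def short_sequences(
    records: Iterable[Tuple[str, str]], min_len: int = 100
) -> List[Tuple[str, int]]:
    """Return (name, length) for every record shorter than min_len.

    min_len=100 is an assumed cutoff, taken from the recollection in this
    thread; check the current submission guidelines for the real value.
    """
    return [(name, len(seq)) for name, seq in records if len(seq) < min_len]


# Toy input standing in for a real assembly file (invented data).
toy_fasta = [
    ">scaffold_1 length=600",
    "ACGT" * 150,
    ">scaffold_2 length=250",
    "ACGTA" * 50,
    ">scaffold_3 length=80",
    "ACGT" * 20,
]

flagged = short_sequences(read_fasta(toy_fasta))
for name, length in flagged:
    print(f"{name}\t{length}")
```

For a real assembly you would pass an open file handle to `read_fasta` instead of the toy list. With scaffolds in the 200-500 bp range, nothing would be flagged at a 100 bp cutoff, which matches the guess above.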
