I have a draft fungal genome assembled/scaffolded with SPAdes (ca. 1,400 scaffolds), which I plan to submit to EMBL. However, ca. 150 of the scaffolds are very short (200-500 bp) and were flagged as repetitive contigs when I ran nucmer/funannotate as a sanity check before gene prediction. Should I exclude these repetitive scaffolds from the submission?
All suggestions are highly appreciated!
I would hazard a guess that EMBL would reject those contigs anyway.
I think you’re best off removing them, as contigs shorter than a kilobase or so are generally uninformative and most likely junk.
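If it helps, here is a minimal sketch of a length filter in plain Python, with no extra libraries. The 1,000 bp cutoff, the function names, and the scaffold names in the demo are only assumptions for illustration, not an EMBL requirement, so adjust the cutoff to whatever you decide on.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=1000):
    """Keep only records whose sequence is at least min_len bases long.

    min_len=1000 is an assumed cutoff, not a database rule.
    """
    for header, seq in records:
        if len(seq) >= min_len:
            yield header, seq


def write_fasta(records, handle, width=60):
    """Write records as FASTA, wrapping sequence lines at `width` columns."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


# Small in-memory demo with made-up scaffold names and sequences.
demo = io.StringIO(
    ">scaffold_1\n" + "ACGT" * 300 + "\n"   # 1,200 bp, kept
    ">scaffold_2\n" + "AT" * 150 + "\n"     # 300 bp, dropped
)
kept = list(filter_by_length(read_fasta(demo), min_len=1000))
out = io.StringIO()
write_fasta(kept, out)
print([name for name, _ in kept])
```

For a real assembly you would open the scaffolds file for reading and a new file for writing in place of the `io.StringIO` objects. Filtering on length alone will also drop short scaffolds that were not flagged as repetitive, so it may be worth checking the list of removed names against the nucmer/funannotate output first.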
Thanks! I will remove them. :)