Hello everyone,
We've got a set of S. meliloti genomes sequenced on Illumina MiSeq as paired-end reads. After processing the reads with Trimmomatic in paired-end mode, we use SPAdes to assemble each genome and then Prokka to annotate it.
The problem is that these genomes are expected to be rich in repetitive sequences (e.g. IS elements), which are lost in the final assembly/annotation. To make things worse, several housekeeping genes and other coding sequences are also missing from the annotation of the assembled genome.
We assume that assembling against a reference in which the IS elements are masked with poly-N stretches may help solve the problem, but we do not know whether it would work (or how to perform it properly), so I'd like to get advice on this issue.
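
To make the masking idea concrete, here is a minimal Python sketch of what we mean by poly-N masking. It is only an illustration, not something we have run on real data: the function names (mask_intervals, mask_reference), the toy sequence, and the IS coordinates are all made up for the example, and it assumes the IS positions in the reference are already known as 0-based, half-open intervals.

```python
def mask_intervals(seq, intervals):
    """Return seq with each 0-based, half-open [start, end) interval replaced by N.

    Intervals may overlap; coordinates outside the sequence are clipped.
    """
    masked = list(seq)
    for start, end in intervals:
        start = max(0, start)
        end = min(len(masked), end)
        if end > start:
            masked[start:end] = ["N"] * (end - start)
    return "".join(masked)


def mask_reference(records, is_coords):
    """Mask several replicons at once.

    records:   dict mapping replicon name -> sequence string
    is_coords: dict mapping replicon name -> list of (start, end) intervals
               (hypothetical IS element positions; replicons without an
               entry are returned unchanged)
    """
    return {
        name: mask_intervals(seq, is_coords.get(name, []))
        for name, seq in records.items()
    }


# Toy example with invented coordinates, just to show the behaviour.
toy_reference = {"chr": "ACGTACGTAC", "pSymA": "GGGGCCCC"}
toy_is_coords = {"chr": [(2, 5)]}
masked_reference = mask_reference(toy_reference, toy_is_coords)
print(masked_reference["chr"])
```

The masked sequences would then be written back to FASTA and used as the reference. Whether this is a sensible input for reference-guided assembly is exactly what we are unsure about.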