I have assembled plasmids purified from E. coli DH10B, where they had been propagated. After plasmid isolation, genomic DNA contamination was reduced with Plasmid-Safe DNase. The samples were then sequenced on an Illumina platform, giving paired-end reads (250 bp each, with a total fragment size of 800 bp).
I used SPAdes to assemble the reads. Some of my scaffolds belong to DH10B, so I plan to remove them from my final set of scaffolds. If an alignment to DH10B covers the full length of a scaffold, it is OK to discard that scaffold, right? Or do I need to check whether that sequence has ever been found on plasmids? And what should I do if only 80% of the scaffold's length aligns? Or just half?
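To make the "full length, 80%, half" question concrete, here is a minimal sketch of how the aligned fraction per scaffold could be computed. It assumes the scaffolds were aligned to DH10B with BLAST and saved in the default tabular format (`-outfmt 6`, where columns 7 and 8 are the query start and end). The function names and the `scaffold_lengths` dictionary are hypothetical and only for illustration.

```python
from collections import defaultdict


def covered_fraction(intervals, length):
    """Fraction of a scaffold covered by the union of 1-based inclusive intervals."""
    covered = 0
    current_start, current_end = None, None
    # Normalise each interval so start <= end, then sweep from left to right.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if current_end is None or start > current_end + 1:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / length


def host_fractions(blast_lines, scaffold_lengths):
    """Map each scaffold name to the fraction of its length aligned to the host."""
    hits = defaultdict(list)
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Assumption: default outfmt 6 column order (qstart, qend at index 6, 7).
        hits[fields[0]].append((int(fields[6]), int(fields[7])))
    return {
        name: covered_fraction(hits.get(name, []), length)
        for name, length in scaffold_lengths.items()
    }
```

Overlapping hits are merged before counting, so a scaffold hit twice in the same region is not counted twice. Whatever cut-off is then applied to these fractions (1.0, 0.8, 0.5) is exactly the judgment call I am asking about.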
Unfortunately, DH10B contains mobile elements. IS elements longer than the sequenced fragments cannot be incorporated into scaffolds, so I cannot tell whether both the host and my plasmid carry a given element. However, SPAdes also reports paths, which show how contigs were built from graph edges and how scaffolds were built from contigs and their edges. Some IS elements are connected to several plasmid scaffolds, which is why they were not merged with other edges. These are plasmid IS elements, for sure. The IS element encoded by DH10B, in contrast, is not connected to anything. So maybe I can delete these isolated regions? May I delete all isolated regions, even if they do not align to DH10B?
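A minimal sketch of what "isolated" could mean in practice, assuming the assembly graph is available in GFA 1 format (SPAdes writes one, e.g. `assembly_graph_with_scaffolds.gfa`) and that each `S` line carries the actual sequence rather than `*`. The function name is hypothetical.

```python
def isolated_segments(gfa_lines):
    """Return {segment_name: length} for segments with no link to another segment."""
    segments = {}
    linked = set()
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            # S <name> <sequence> [optional tags]
            segments[fields[1]] = len(fields[2])
        elif fields[0] == "L" and fields[1] != fields[3]:
            # L <from> <orient> <to> <orient> <overlap>; self-links are ignored.
            linked.update((fields[1], fields[3]))
    return {name: length for name, length in segments.items() if name not in linked}
```

Note that a segment whose only link is to itself is still reported as isolated here, so a small circular plasmid assembled into a single edge would show up in this list too. That is one reason I am unsure whether deleting every isolated region is safe.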
And if SPAdes produced a scaffold shorter than a sequenced fragment, it means that the assembler hit a dead end in the graph, so these short scaffolds can be ignored, right? They are shorter than 800 bp, so they are meaningless too.
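For reference, the length filter I have in mind could look like the sketch below. It assumes the scaffolds are in a plain FASTA file and uses the 800 bp fragment size as the cut-off. The function names are hypothetical, and the short scaffolds are only set aside, not discarded, so they can still be inspected.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_by_length(records, min_length=800):
    """Split record names into those at least min_length long and the shorter rest."""
    kept, short = [], []
    for name, seq in records:
        (kept if len(seq) >= min_length else short).append(name)
    return kept, short
```

With a real file this would be called as `split_by_length(read_fasta(open("scaffolds.fasta")))`, where the file name is the usual SPAdes output name and may differ in other setups.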
A more general question: do you run any checks on your scaffolds after assembly? Do you check them for the presence of adapters (of course, I trimmed them, but it is better to check once again, right)? Any other tests?
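As an example of the kind of check I mean, here is a minimal sketch that scans scaffolds for exact copies of an adapter sequence on either strand. The adapter is passed in as a parameter. The `AGATCGGAAGAGC` used in the usage line is the common Illumina TruSeq adapter prefix and is only an assumption, since the right sequence depends on the library kit. The function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def find_adapter_hits(scaffolds, adapter):
    """Return (scaffold_name, 0-based position, strand) for every exact adapter match."""
    adapter = adapter.upper()
    patterns = (("+", adapter), ("-", reverse_complement(adapter)))
    hits = []
    for name, seq in scaffolds.items():
        seq = seq.upper()
        for strand, pattern in patterns:
            pos = seq.find(pattern)
            while pos != -1:
                hits.append((name, pos, strand))
                # Continue searching one base further to catch overlapping matches.
                pos = seq.find(pattern, pos + 1)
    return hits


# Usage, with an assumed adapter sequence:
# hits = find_adapter_hits({"scaffold_1": "ACGT..."}, "AGATCGGAAGAGC")
```

This only finds exact matches, so partial or mismatched adapter remnants would need a proper aligner or a trimming tool run in report-only mode.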