2.5 years ago
MSRS ▴ 580
Hi Biostars Community,
I have SPAdes assembly contigs from an E. coli complete genome. PlasmidFinder (https://cge.cbs.dtu.dk/services/PlasmidFinder/) reports several plasmid hits in these contigs, each around 150-650 bp long.
Plasmid        Identity  Query / Template length  Contig      Position in contig  Note  Accession number
Col(BS512)     100       233 / 233                contigs113  1956..2188                NC010656
IncFIA         100       388 / 388                contigs98   13867..14254              AP001918
IncFIB(pB171)  99.22     643 / 643                contigs99   1765..2407                AB024946
IncFII         98.08     261 / 261                contigs96   3490..3750                AY458016
IncI(Gamma)    100       137 / 141                contigs97   24329..24465              AP011954
Is there any way to separate those plasmids from contigs/fastq files? Thank you.
Not easily, no. Removing named sequences is easy (check the forum for lots of answers); however, unambiguously identifying a plasmid is an unsolved problem.
You can use plasmid-finder tools and then split the FASTA records by name, but it's never going to be 100% effective. A 150 bp sequence isn't so much a contig as a read. They're all so short that they're likely just junk. You aren't going to get a useful plasmid assembly out of those, no matter what you do with them.
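For the "separate by name" part, here is a minimal sketch in plain Python, assuming the FASTA headers start with the contig names shown in the PlasmidFinder table (contigs113, contigs98, and so on). The function names are hypothetical and made up for illustration. Note that this only pulls out whole contigs that carry a hit; it does not show that those contigs are actually plasmid sequence.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line.strip())
    if header is not None:
        yield header, "".join(seq)


def split_by_name(records, wanted):
    """Partition records into (hits, rest) using the first word of each header."""
    hits, rest = [], []
    for header, seq in records:
        words = header.split()
        name = words[0] if words else ""
        (hits if name in wanted else rest).append((header, seq))
    return hits, rest


def format_fasta(records, width=60):
    """Return the records as FASTA text, wrapping sequences at `width` columns."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n" if out else ""


# Contig names copied from the PlasmidFinder table in the question.
PLASMID_HIT_CONTIGS = {"contigs113", "contigs98", "contigs99", "contigs96", "contigs97"}
```

To use it, you would open your contigs file, pass the file handle to `read_fasta`, call `split_by_name(records, PLASMID_HIT_CONTIGS)`, and write each half out with `format_fasta`. If your headers look different (SPAdes output is often named differently from what the table shows), adjust how the name is extracted from the header.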