Separating plasmid sequences from assembly contigs/fastq files of a complete bacterial genome?
11 months ago
MSRS ▴ 520

Hi Biostars Community,

I have SPAdes assembly contigs for an E. coli complete genome. PlasmidFinder found several plasmid hits in these contigs, each around 150-650 bp long.

Plasmid         Identity  Query / Template length  Contig      Position in contig  Note  Accession number
Col(BS512)      100       233 / 233                contigs113  1956..2188                NC010656
IncFIA          100       388 / 388                contigs98   13867..14254              AP001918
IncFIB(pB171)   99.22     643 / 643                contigs99   1765..2407                AB024946
IncFII          98.08     261 / 261                contigs96   3490..3750                AY458016
IncI(Gamma)     100       137 / 141                contigs97   24329..24465              AP011954

Is there any way to separate those plasmids from contigs/fastq files? Thank you.

Assembly plasmid • 301 views

Not easily, no. Removing named sequences is easy (check the forum for lots of answers); however, unambiguously identifying a plasmid is an unsolved problem.

You can use plasmid finder tools and then separate the FASTA records by name, but it's never going to be 100% effective. 150 bp isn't so much a contig as a read. They're all so short they're likely just junk. You aren't going to get a useful plasmid assembly out of those no matter what you do with them.
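For the "separate by name" step, here is a minimal sketch in plain Python. It assumes the contig IDs in your FASTA headers match the names in the PlasmidFinder table (contigs113, contigs98, and so on); real SPAdes headers usually look different, so adjust the set accordingly. The function names and the demo sequences are made up for illustration. Keep in mind this only pulls out contigs that carry a replicon hit, which is not the same as recovering complete plasmids.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

# Contig names taken from the PlasmidFinder table in the question.
# Assumption: they match the first word of each FASTA header exactly.
PLASMID_CONTIGS: Set[str] = {
    "contigs113", "contigs98", "contigs99", "contigs96", "contigs97",
}


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_name(
    records: Iterable[Tuple[str, str]], wanted: Set[str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split records into (flagged, rest) by the first word of the header."""
    flagged: Dict[str, str] = {}
    rest: Dict[str, str] = {}
    for header, seq in records:
        contig_id = header.split()[0] if header.strip() else ""
        target = flagged if contig_id in wanted else rest
        target[contig_id] = seq
    return flagged, rest


def to_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Format records as FASTA text, wrapping sequences at `width` columns."""
    lines = []
    for contig_id, seq in records.items():
        lines.append(">" + contig_id)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "".join(line + "\n" for line in lines)


# Tiny in-memory demo with invented sequences, not real assembly data.
demo = """>contigs1 chromosome-like
ACGTACGT
ACGT
>contigs98 carries the IncFIA hit
TTGACA
"""
plasmid, other = split_by_name(read_fasta(demo.splitlines()), PLASMID_CONTIGS)
print(to_fasta(plasmid), end="")
```

For a real assembly you would pass an open file handle to `read_fasta` instead of `demo.splitlines()` and write the two outputs to separate files. Matching on the exact ID (rather than a substring) avoids contigs9 accidentally matching contigs98.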

