Question: Assembling a chromosome for an E. coli strain using contigs/scaffolds from two WGS projects
5.6 years ago by
rnnh30 wrote:

Hello everyone,

I'm working on improving the gene coverage for the chromosome of an E. coli strain, using existing WGS data. There are two WGS projects for this strain on NCBI (according to the assembly report, the assembly level for one of them is "scaffold", and for the other is "contig").

So far, I've obtained the contigs from both of these projects from NCBI (which I saved as two separate multifasta files), found a decent reference genome, and re-ordered the contig sets against it using Mauve. I then ran the re-ordered contigs through GLIMMER-3, which produced annotated Genbank format files for each WGS project contig set. Basically, I've been following the steps in this tutorial:

After plotting the two sets of annotated contigs against each other using DNAPlotter, it looks like together they can give decent gene coverage: each set has gaps in its CDS that are covered by the other.
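One rough way to put numbers on that complementarity is to compare the CDS intervals from the two annotations. The sketch below assumes the CDS coordinates from both contig sets have already been mapped onto the same reference coordinate system (for example after re-ordering against the reference), and that intervals are half-open `(start, end)` pairs. The function names and the example coordinates are hypothetical and not taken from either WGS project.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open: start inclusive, end exclusive


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the parts of `a` that are not covered by `b`."""
    result: List[Interval] = []
    b_merged = merge(b)
    for start, end in merge(a):
        cursor = start
        for b_start, b_end in b_merged:
            if b_end <= cursor:
                continue  # this b interval lies entirely before the cursor
            if b_start >= end:
                break  # remaining b intervals lie past this a interval
            if b_start > cursor:
                result.append((cursor, b_start))
            cursor = max(cursor, b_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result


# Hypothetical CDS intervals on shared reference coordinates.
cds_project_a = [(0, 100), (300, 400)]
cds_project_b = [(50, 200), (350, 500)]

only_in_a = subtract(cds_project_a, cds_project_b)
only_in_b = subtract(cds_project_b, cds_project_a)
combined = merge(cds_project_a + cds_project_b)
```

Here `only_in_a` and `only_in_b` list the regions where one annotation fills a gap in the other, and `combined` is the coverage obtained by using both together.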

I have been using contigs from both WGS projects so far, but I think I'm going to start using the scaffolds from the scaffold-level assembly. I've just found out that the scaffold-level project includes two plasmids (the contig-level WGS project is just a chromosome), and I know which scaffolds are the plasmids, so I can exclude them.
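Excluding the plasmid scaffolds from the multi-FASTA file could be done along these lines. This is a minimal sketch in plain Python, assuming each record ID is the first whitespace-delimited token of its header line. The scaffold IDs and sequences in the example are made up for illustration.

```python
from typing import Iterable, Iterator, Set, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from the lines of a multi-FASTA file."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_records(records: Iterable[Record], exclude_ids: Set[str]) -> Iterator[Record]:
    """Skip records whose ID (first token of the header) is in exclude_ids."""
    for header, seq in records:
        tokens = header.split()
        record_id = tokens[0] if tokens else ""
        if record_id not in exclude_ids:
            yield header, seq


def format_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Render records as FASTA text, wrapping sequence lines at `width`."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


# Hypothetical input: scaffold_7 stands in for a known plasmid scaffold.
example = """>scaffold_1 chromosome part
ACGTACGT
ACGT
>scaffold_7 plasmid
TTTT
>scaffold_2 chromosome part
GGGG
"""

kept = list(drop_records(read_fasta(example.splitlines()), {"scaffold_7"}))
chromosome_only = format_fasta(kept)
```

For a real file, the same functions can be applied to an open file handle instead of `example.splitlines()`, with the result written to a new file.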

How can I go about assembling a single chromosome using data from both WGS projects? Is this possible? If not, which programmes can highlight the differences between the assembled chromosomes?

Should I be using the scaffolds instead of the contigs for the WGS project which is at a scaffold level?

Are there any programmes for closing gaps between/in scaffolds using additional contigs?

Thanks for your time,


alignment assembly genome • 3.1k views

@r.harrington747 There is the Mauve tool. I have only used it once, but it allows you to compare different assemblies.

Here is a question that I think is similar to yours: From Contigs To Chromosome Scale Scaffold

Also, to get your contigs and scaffolds, maybe try SPAdes, a raw-read assembler. I've used that instead of Velvet.

modified 14 months ago by _r_am32k • written 5.6 years ago by Kirill300
5.6 years ago by
h.mon32k wrote:

If the raw sequencing reads were also released, you could try GAM-NGS. Another option is to use MIRA to perform a genome-guided assembly, using the scaffolds as the reference and the two sets of contigs as two additional strains to be assembled. A third option is to use CAP3 to merge the assemblies, but I remember reading (from more than one source, though I cannot recall which) that this is not recommended, as it introduces mis-assemblies.

edit: I saw the post "Plasmid Assembly Use Of Cisa Contig Intergrator" listed as similar to yours; it seems like a good alternative to try.
