Question: de novo genome assembly of bacterial genome
11 weeks ago by
rthapa40 wrote:

Hi, I am doing a de novo genome assembly with Canu. I got two contigs, one longer and one shorter. It seems that the longer one is the genome and the shorter one is a plasmid. When I aligned the longer contig against the reference genome, I saw that a big part of the genome aligns at a different position. This is probably due to the circular genome of bacteria. I want to find structural variants relative to the reference genome, and I am afraid the misalignment will affect their accurate estimation. Does anyone have suggestions on how to deal with a circular genome when estimating structural variants? Thanks

The aligned genome looks like the one in the following link.

Mauve alignment

bacteria assembly genome • 242 views
modified 10 weeks ago by h.mon32k • written 11 weeks ago by rthapa40

Are you sure this isn't simply that the order of your 2 contigs is different compared to the reference? You can just reorder them and it will be almost a perfect match - or am I misunderstanding?

Also, are you sure this is a chromosome and a plasmid? The reference sequence appears to be a single contiguous sequence, and that would be an enormous plasmid. Or is the plasmid you refer to the tiny turquoise block on the right?

modified 11 weeks ago • written 11 weeks ago by Joe19k

OP has said in the other post I linked above that there is only one contig. So this may just be a matter of identifying the correct origin of replication. Another possibility is that the published reference is incorrect, but that may be a long shot.

modified 11 weeks ago • written 11 weeks ago by GenoMax96k

Yes, I have only one contig to align with the reference genome. Since it is a bacterial genome and circular, it may be a matter of identifying the origin of replication. Do you have any suggestions for how I could proceed with a circular genome? My ultimate goal is to find the structural variants in the genome, so I need to align the assembled genome with the reference genome properly.

written 11 weeks ago by rthapa40

There is a prior post by the original poster about this here:

modified 11 weeks ago • written 11 weeks ago by GenoMax96k
10 weeks ago by
h.mon32k wrote:

There is no evidence of structural variation in your assembly.

The Mauve image above actually contains two contigs for both the reference genome and your assembled genome, probably the chromosome and a plasmid (this genomic structure seems to be common in this species, e.g., ).

If you look at the main bacterial chromosome, you can see that your assembly and the reference are completely colinear. The only difference is that the arbitrary break of the circular contigs was made at different locations in the two assemblies. Canu breaks a circular assembly at a random location, and it may also be slightly imprecise with the sequence at these breaks. You should consider running Circlator to fix these issues.
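To illustrate why a different break point is not a structural variant, here is a minimal Python sketch that rotates a circular contig so it starts at the same sequence as the reference. This is not how Circlator or Canu work internally; the function name `rotate_to_anchor` and the toy sequences are made up for illustration, and the sketch assumes the anchor (for example, the first bases of the reference) occurs exactly on the forward strand of the contig.

```python
def rotate_to_anchor(contig: str, anchor: str) -> str:
    """Rotate a circular contig so it starts at the first occurrence of anchor.

    Hypothetical helper for illustration only: forward strand, exact match.
    """
    if not anchor or len(anchor) > len(contig):
        raise ValueError("anchor must be non-empty and no longer than the contig")
    # Append the first len(anchor) - 1 bases so that an anchor spanning the
    # arbitrary break between contig end and contig start is still found.
    doubled = contig + contig[:len(anchor) - 1]
    start = doubled.find(anchor)
    if start == -1:
        raise ValueError("anchor not found on the forward strand")
    # Moving the prefix to the end changes the break point, not the circle.
    return contig[start:] + contig[:start]


# Toy example: the same circular sequence, linearised at two different places.
reference = "ATGCCGTTAGGA"
assembly = "TTAGGAATGCCG"
rotated = rotate_to_anchor(assembly, reference[:5])
assert rotated == reference
```

After this rotation the two linear sequences are identical, so an aligner would show one colinear block instead of two swapped ones. For real data the assembly may be on the reverse strand or differ slightly from the reference, so an exact string match like this is unlikely to be enough and a dedicated tool is the safer choice.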

written 10 weeks ago by h.mon32k