MUMmer plot visualization
3.2 years ago
rthapa ▴ 90

Hi, I am using MUMmer plot to compare a de novo assembly with a reference genome. The percent identity between the two genomes is more than 99%, but when I plot them with MUMmer plot, the plot (https://ibb.co/q08xsZw) doesn't reflect that. Does anyone have any idea why? Thanks


MUMmer plot • 3.7k views

Have you tried to reverse complement one of the sequences before trying this again? I don't recollect if MUMmer tries that.


No, I didn't try the reverse complement. Do you have any suggestions for tools we can use to get the reverse complement? With Mauve, the alignment looks like this: https://ibb.co/cL6ydYr


You can use reformat.sh from BBMap suite to do the reverse complement.

reformat.sh in=seq.fa out=rc.fa rcomp=t
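
If BBMap is not at hand, the same operation can be sketched in a few lines of Python. This is only an illustration of what reverse complementing a FASTA record means, not a replacement for reformat.sh; the function names are made up for this example, and it assumes plain A/C/G/T/N sequences (IUPAC ambiguity codes and any other characters are left unchanged).

```python
# Minimal reverse-complement sketch for FASTA text (pure Python, no dependencies).
# Assumption: sequences contain only A/C/G/T/N in upper or lower case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reverse_complement_fasta(lines):
    """Return FASTA text in which every record is reverse complemented."""
    out = []
    for header, seq in read_fasta(lines):
        out.append(f">{header}")
        out.append(reverse_complement(seq))
    return "\n".join(out) + "\n"
```

For a file on disk, something like `reverse_complement_fasta(open("seq.fa"))` would give the text to write to `rc.fa`. Each sequence comes out on a single line, with no line wrapping.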

The reverse complement does give a clearer plot. There seem to be two inversions in the assembled genome: https://ibb.co/k5MqYqZ.

Is the red bar in the right corner a duplicated region in the reference genome? Also, do I need to use the reverse complement to call structural variants with MUMmer, or can I just use the de novo assembly as it is?


Those blue segments represent inversions and translocations (which are apparent in your mauve plots too).


Thank you. But when I check the MUMmer results for structural variation, there are four inversions in the assembly compared to the reference genome. I wonder why the MUMmer plot shows only two inversions.

tig00000001  INV  228534   226344   -2189
tig00000001  GAP  745271   742507   -2763   -232    -2531
tig00000001  GAP  765321   764991   -329    -151    -178
tig00000001  INV  872150   866517   -5632
tig00000001  GAP  1199745  1199745  1       -47     48
tig00000001  GAP  1900829  1900827  -1      -13978  13977
tig00000001  GAP  2499469  2498978  -490    -336    -154
tig00000001  DUP  2561758  2562745  988
tig00000001  BRK  2562746  2563397  652
tig00000001  DUP  2563398  2563566  169
tig00000001  BRK  2563567  2563570  4
tig00000001  INV  2769150  2763519  -5630
tig00000001  INV  3375024  3372831  -2192
tig00000001  JMP  3688215  3688214  0
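
To tally the records by type instead of counting by eye, a small parser along these lines may help. It is a rough sketch that assumes each row is whitespace-separated with the columns sequence ID, feature type, start, end and length, followed by optional extra columns (as in the GAP rows above); the function names are invented for this example and are not part of MUMmer.

```python
from collections import Counter


def parse_features(text):
    """Parse rows of: seq_id, type, start, end, length [, extra columns...].

    Rows with fewer than five fields are skipped. Extra columns are ignored.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        seq_id, ftype = fields[0], fields[1]
        start, end, length = (int(x) for x in fields[2:5])
        rows.append((seq_id, ftype, start, end, length))
    return rows


def count_by_type(rows):
    """Count how many rows there are for each feature type (INV, GAP, ...)."""
    return Counter(ftype for _, ftype, _, _, _ in rows)


def lengths_of(rows, wanted):
    """Return (start, length) for every row of the wanted feature type."""
    return [(start, length) for _, ftype, start, _, length in rows if ftype == wanted]
```

Listing the INV rows this way makes it easier to compare their positions and lengths. In the table above they fall into two pairs of nearly equal length (-2189 and -2192, -5632 and -5630), which may be worth checking against the two inversions visible in the plot.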


Hi, I just saw your post. I know it's been a long time since you started the discussion, but MUMmer does actually show all four inversions in your plot. Also, from the Mauve output, it seems like the reference sequence wasn't properly assembled. I'd guess that the red regions at the beginning and at the end are overlaps; is this genome circular? If so, you can reverse complement that small region from either end (5' or 3') and align it to the other. If they match, you can crop that region from the genome and then redo your analysis with your contigs.

I've seen cases like that before, and that approach worked for me. The tool probably reports four inversions because that region is repeated in the reference sequence.
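
As a quick first look before running a proper alignment, the end-overlap idea can be sketched in Python. This is a simplified illustration with made-up function names: it only finds exact matches between the two ends, so any mismatch or indel in the overlap will hide it, and on a real genome an aligner such as nucmer is the better check. The `max_len` cap is an arbitrary choice to keep the loop short.

```python
# Assumption: sequence contains only A/C/G/T/N; other characters pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_terminal_overlap(seq, max_len=10000, inverted=False):
    """Length of the longest exact match between the start and the end of seq.

    inverted=False: the first k bases equal the last k bases.
    inverted=True:  the first k bases equal the reverse complement
                    of the last k bases.
    Returns 0 if no overlap of length 1..limit is found.
    """
    limit = min(max_len, len(seq) // 2)
    for k in range(limit, 0, -1):
        tail = seq[-k:]
        if inverted:
            tail = reverse_complement(tail)
        if seq[:k] == tail:
            return k
    return 0
```

A result of only a few bases is expected by chance and says nothing. A result in the hundreds or thousands would suggest the ends overlap and that one copy could be trimmed before redoing the comparison.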

Best,

Carlos Costa
