Is it possible to do mummerplot with reference and query sequences that have many contigs?
11 months ago

Hi, I have a reference genome with 9 contigs and a query genome with 56 contigs. The two are more than 99% identical at the nucleotide level. I tried to make a mummerplot but got really weird results. I don't have a screenshot to show you, but the lines were all over the plot. There are usually 2 lines, but I got many, and I can't really use that output to present my findings. Is there a way to run mummerplot with multiple contigs, or is there a similar tool that can handle multiple contigs?

alignment mummer mummerplot

It's hard to picture your description and figure out the issue.

A quick test would be D-GENIES, which lets you rapidly sort the contigs and filter out lower-identity matches.

But with 50 or so contigs it should work with mummerplot.


Sorry if it was not clearly written. The problem is that the plot shows a lot of lines that don't really make sense.

11 months ago

You could sort and concatenate your 56 contigs into 9 contigs using mummer based on the alignments.

nucmer your_9_contigs.fasta your_56_contigs.fasta
# this generates an out.delta
show-tiling -p concatenated_contigs.fasta out.delta > out.tiling


concatenated_contigs.fasta will contain your 56 contigs, concatenated based on how they align. You can then rerun mummer with those to get a cleaner plot.
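
A rough sketch of that rerun, in case it helps. The function name replot and the output prefix are placeholders I made up; --layout, --filter, -R and -Q are mummerplot options as I remember them from the MUMmer manual, so check `mummerplot --help` on your install before relying on this.

```shell
# Sketch only: re-align the query against the reference and draw an ordered plot.
# "replot" and its arguments are illustrative names, not part of MUMmer.
replot() {
    ref=$1      # reference FASTA (the 9 contigs)
    qry=$2      # query FASTA (for example concatenated_contigs.fasta)
    prefix=$3   # prefix for all output files

    # Align; nucmer writes "$prefix.delta"
    nucmer --prefix="$prefix" "$ref" "$qry" || return 1

    # --layout orders and orients the sequences along the diagonal
    #          (it needs the FASTA files passed via -R and -Q)
    # --filter keeps only the best one-to-one alignments, which removes
    #          most of the stray lines caused by repeats
    mummerplot --png --layout --filter -R "$ref" -Q "$qry" \
        --prefix="$prefix" "$prefix.delta"
}
```

You would call it as `replot your_9_contigs.fasta concatenated_contigs.fasta tiled`. It may also be worth trying the same function directly on the original 56-contig file, since --layout is meant for multi-contig plots.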

But you have to be careful: by default I think mummer cuts off the stuff that doesn't align, and fills up gaps with Ns instead of putting the leftover from your aligning contigs there. You could try whether RagTag does a better job: https://github.com/malonge/RagTag
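
To see how much gets dropped, you could list the query contigs that never made it into the tiling. This is a minimal sketch that assumes the out.tiling layout I remember from the MUMmer manual: ">" header lines for each reference sequence, followed by 8-column rows with the contig ID in the last column. The function name unplaced_contigs is made up for illustration.

```shell
# Sketch only: print the IDs of FASTA records that do not appear in a tiling file.
# Assumes column 8 of each non-header tiling row is the contig ID.
unplaced_contigs() {
    fasta=$1    # original query FASTA (the 56 contigs)
    tiling=$2   # show-tiling standard output (out.tiling)

    awk '
        # First file (the tiling): remember every placed contig ID
        FILENAME == ARGV[1] {
            if ($0 !~ /^>/ && NF >= 8) placed[$8] = 1
            next
        }
        # Second file (the FASTA): report headers whose ID was never placed
        /^>/ {
            id = substr($1, 2)
            if (!(id in placed)) print id
        }
    ' "$tiling" "$fasta"
}
```

Running `unplaced_contigs your_56_contigs.fasta out.tiling` prints one contig ID per line. If the list is long, the pseudomolecule is missing a lot of sequence and the plot made from it will look cleaner than the assembly really is.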

I don't remember whether show-tiling wants a single-contig reference or whether it's OK with nine. Give it a try.