Question: Mauve Similarity plots - similar or not?
rrcutler (United States) wrote, 2.2 years ago:

Hello all. I have been using Mauve to visualize draft genome assemblies compared against a reference genome, and I have been reading the website on how to interpret the similarity plots displayed in the viewer. Specifically, it says: "The height of the similarity profile is calculated to be inversely proportional to the average alignment column entropy over a region of the alignment." I understand this to mean that the greater the height, the more similar the sequences are (correct me if I'm wrong). However, I am having trouble interpreting the plot when the similarity ranges are displayed.

So it seems that when comparing similar sequences, the similarity plots (purple) should be fairly high and full, as in this one: [image: Mauve plot]

When I display both the similarity plot and the similarity ranges, my plots look like this (the similarity range lines are dark purple):

[image: Mauve plot]

Furthermore, I visualized two identical sequences in Mauve and got this:

[image: Mauve plot]

So how do I interpret what the similarity ranges mean? My guess is that the higher a similarity range line is, the narrower the range of differences between the sequences. The spiky purple shading also seems to indicate ranges, but it looks more exaggerated than the dark purple line.

Also, in the Mauve output, what are the .backbone and .bbcols files?

Many Thanks

aaron.darling wrote, 2.1 years ago:

Starting with the 2.4 releases, Mauve no longer shades the entire area under the similarity curve; instead, it draws a bold line for the median similarity value over a region. The 'ranges', which create the fattened, jagged areas in lighter purple in the 2nd plot of your post, show the range of similarity values in that region. This adds context to the median line, making it possible to see regions where small pieces may be missing or changed that are below the resolution of the current zoom level. As a result, the plot can become visually intense, to the point that I thought of dubbing the 2.4 release the 'visual assault' release. I am certain that Mauve is violating some fundamental laws and regulations of data visualization.
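To make the median-plus-range idea concrete, here is a minimal Python sketch. It is not Mauve's actual code: the function name `summarize_bins`, the bin size, and the per-column similarity values are all hypothetical and chosen only for illustration. Each bin stands in for the stretch of alignment that one screen position covers at the current zoom level.

```python
from statistics import median

def summarize_bins(similarity, bin_size):
    """Return (median, low, high) for each consecutive bin of similarity values.

    Illustrative only: 'similarity' is a hypothetical list of per-column
    similarity scores, and 'bin_size' is an assumed number of columns per
    drawn position. Neither reflects Mauve's internal data structures.
    """
    summaries = []
    for start in range(0, len(similarity), bin_size):
        window = similarity[start:start + bin_size]
        summaries.append((median(window), min(window), max(window)))
    return summaries

# Hypothetical scores: one poorly conserved column hidden among identical ones.
values = [1.0, 1.0, 0.2, 1.0, 1.0, 1.0, 1.0, 1.0]
for med, low, high in summarize_bins(values, 4):
    print(f"median={med:.2f} range={low:.2f}-{high:.2f}")
```

In the first bin the median stays at the top even though one column scores low; only the range reveals it. That is the kind of sub-resolution change the lighter shading is meant to expose.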

In your 3rd plot the two sequences are identical, so the median similarity is 100% and the range covers 100%-100%; it all just appears as a single purple line.
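The identical-sequence case also follows from the entropy definition quoted in the question. The sketch below is a rough illustration in Python, not Mauve's implementation: it uses Shannon entropy per alignment column, and the mapping from average entropy to a 0-1 similarity score (`1 - average / max_entropy`) is my own assumption, picked simply because it decreases as entropy grows. Gap handling is ignored, and the names `column_entropy` and `similarity` are hypothetical.

```python
from collections import Counter
from math import log2

def column_entropy(column):
    """Shannon entropy (in bits) of the characters in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum((n / total) * log2(total / n) for n in counts.values())

def similarity(columns, max_entropy):
    """Assumed scoring: 1.0 when every column is uniform, lower as entropy rises."""
    average = sum(column_entropy(c) for c in columns) / len(columns)
    return 1.0 - average / max_entropy

# Two aligned sequences, so each column is a 2-character string.
# With two sequences the largest possible column entropy is 1 bit.
identical = ["AA", "CC", "GG", "TT"]
one_mismatch = ["AA", "CG", "GG", "TT"]

print(similarity(identical, 1.0))     # every column has zero entropy
print(similarity(one_mismatch, 1.0))  # one column contributes 1 bit
```

With identical sequences every column has zero entropy, so every position gets the maximum score, and the median and both ends of the range coincide.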

The user guide contains a description of the backbone file, so I won't recapitulate it here.

Hope that helps!


@Dr. Darling: Glad to have you on Biostars. Now we can get authoritative support/answers for Mauve!

Reply written 2.1 years ago by genomax

Thanks a lot for the clarification!

Reply written 2.1 years ago by rrcutler