Question: Bray-Curtis dissimilarity for comparing genomes, how to normalize?
9 weeks ago by Buffo (1.8k) wrote:

Hi everyone,

I am comparing whole genomes of closely related species using Nucmer. I want to calculate the % similarity based on the bases aligned from each genome using Bray-Curtis dissimilarity (I have also tried the Sorensen coefficient). The problem is that some genomes have very different sizes (ranging from, say, 40 Mb to 170 Mb). In those cases, I get similarity values above 1 (or 100%), which is impossible. I have tried some normalizations, such as those recommended by Somerfield (2008) and Yoshioka (2008), but nothing worked.

Any suggestions or alternatives?
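
For reference, a minimal Python sketch of how a bounded Bray-Curtis-style similarity could be computed from alignment coordinates. The post does not say how the aligned bases were counted, so this rests on an assumption: that values above 1 can appear when overlapping or repeated alignments are summed directly, so the same bases are counted more than once. The function names and the (start, end) interval input are hypothetical, not part of Nucmer; the intervals would have to be extracted from the alignment coordinates first.

```python
def covered_bases(intervals):
    """Count unique bases covered by 1-based, inclusive (start, end) intervals.

    Overlapping intervals are merged so that no base is counted twice.
    Reverse-strand hits (start > end) are normalised first.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or start > cur_end:
            # Disjoint from the current run: close it and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlaps the current run: extend it if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def bray_curtis_similarity(aligned_a, aligned_b, size_a, size_b):
    """Bray-Curtis similarity, 2C / (Sa + Sb), with C = min(aligned_a, aligned_b).

    Assumes aligned_a <= size_a and aligned_b <= size_b (unique covered bases),
    which keeps the result within [0, 1].
    """
    shared = min(aligned_a, aligned_b)
    return 2.0 * shared / (size_a + size_b)


# Three alignments on one genome; the first two overlap, the third is reversed.
hits = [(1, 100), (50, 150), (300, 200)]
unique = covered_bases(hits)
```

Note that with this definition the similarity can never exceed 2·min(Sa, Sb) / (Sa + Sb). For genomes of 40 Mb and 170 Mb that ceiling is 80/210 ≈ 0.38, even if the smaller genome aligns completely, so the size difference alone dominates the score.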


I suggest you try FastANI. It compares sequences without actual alignment by calculating k-mer similarity, which in most cases is related to sequence identity. The upside for your purposes is that sequences of different lengths can be used.
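
As a rough illustration of working with the results, here is a small Python sketch that parses FastANI output. It assumes the tab-delimited layout described in the FastANI documentation (query, reference, ANI, count of bidirectional fragment mappings, total query fragments), for example from a run such as `fastANI -q genome1.fa -r genome2.fa -o out.txt` with placeholder file names. The function name and the "aligned_fraction" field are hypothetical additions for illustration, not something FastANI reports itself.

```python
import csv
import io


def parse_fastani(handle):
    """Parse FastANI tab-delimited output into a list of dicts.

    Assumed columns: query, reference, ANI (%), mapped fragments,
    total query fragments. Shorter lines are skipped.
    """
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) < 5:
            continue
        query, reference, ani, mapped, total = fields[:5]
        mapped, total = int(mapped), int(total)
        rows.append({
            "query": query,
            "reference": reference,
            "ani": float(ani),
            # Share of query fragments that mapped; a rough coverage
            # indicator, useful when genome sizes differ a lot.
            "aligned_fraction": mapped / total if total else 0.0,
        })
    return rows


# Made-up example line, only to show the expected shape of the input.
example = io.StringIO("genome1.fa\tgenome2.fa\t97.5\t800\t1000\n")
results = parse_fastani(example)
```

Reading the mapped-fragment fraction alongside ANI helps when one genome is much larger than the other, since a high ANI can still come from a small shared portion.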

written 9 weeks ago by Mensur Dlakic (7.2k)

FastANI is faster than other programs such as Nucmer, but it has a few disadvantages:

.- It is not recommended for fragmented or short genomes (N50 should be ≥10 Kbp).
.- No ANI output is reported for a genome pair if the ANI value is much below 80%.

But in my case it would be a good alternative. Thanks.
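
To check the N50 recommendation before running FastANI, a minimal Python sketch could look like the one below. It assumes the contig or scaffold lengths have already been extracted from the assembly; the function names and the `min_n50` parameter are hypothetical, with the 10 Kbp default taken from the limitation mentioned above.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the shortest contig in the set of longest
    contigs that together cover at least half of the assembly.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def meets_n50_threshold(lengths, min_n50=10_000):
    """True if the assembly N50 is at least min_n50 bases (assumed 10 Kbp)."""
    return n50(lengths) >= min_n50


# Toy assembly with lengths in bases, for illustration only.
contigs = [80_000, 70_000, 50_000, 40_000, 30_000, 20_000]
ok = meets_n50_threshold(contigs)
```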

modified 9 weeks ago • written 9 weeks ago by Buffo (1.8k)