Question: Neighbor Joining algorithm: why n-2?
0
gravatar for michael
2.4 years ago by
michael0
michael0 wrote:

I have been looking into how the neighbor joining algorithm works and I have two questions about the first step, i.e. computing the matrix Q using the following formula:

Q(i,j) = (n - 2)d(i,j) - Σd(i,k) - Σd(j,k)

Where both summations are from k = 1 to n, n is the number of taxons, d() is the distance between two elements and i and j are specific taxons.

From my understanding this formula creates a new matrix containing the distance between leaves i and j multiplied by (n-2). From this value we subtract the total distance from i to all other nodes and the total distance from j to all other nodes. This gives us a sense of how far away any given pair of nodes is from all other clusters.

I think we are using the term (n-2) as a scaling factor to ensure that the value calculated by the two sums does not dominate the result. If this is the case, why is it n-2 and not n-1 or just n?

Finally, I have also encountered the following formula for Q:

Q(i,j) = d(i,j) - 1/(n-2)Σd(i,k) - 1/(n-2)Σd(j,k)

Are these two formulas equivalent? If so why?

Thank you.

ADD COMMENTlink modified 2.4 years ago by kloetzl1.1k • written 2.4 years ago by michael0

This is not a scaling factor, it comes from doing the sum over all leaves except i and j.

ADD REPLYlink written 2.4 years ago by Jean-Karim Heriche22k
2
gravatar for kloetzl
2.4 years ago by
kloetzl1.1k
European Union
kloetzl1.1k wrote:

All of my references have a different notation, but I will try to answer anyway.

If this is the case, why is it n-2 and not n-1 or just n?

You kind of answered the question yourself in the paragraph above. You want (try to) to fuse two nodes i and j into a new node p. By doing that you remove one edge from the tree, hence n-1. Knock off another -1 for each "self distance" i.e. value of 0 in the sum, thus n-2.

Are these two formulas equivalent? If so why?

They differ only by a factor of n-2. However, what it most important is to join the nodes with the minimum distance; and that doesn't change.

ADD COMMENTlink written 2.4 years ago by kloetzl1.1k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1780 users visited in the last hour