Fast algorithm to compute the distance matrix for a weighted tree (Stepic/Rosalind/Coursera Problem)
1
0
Entering edit mode
6.2 years ago

I'm stuck with this Rosalind problem. I'm trying to find time and memory efficient algorithm to compute the distance matrix from a weighted tree. 

Input: a weighted tree with n leaves

Output: An n x n matrix (di,j) where di,j is the length of the path between leaves i and j

Sample Input

4

0->4:11
1->4:2
2->5:6
3->5:7
4->0:11
4->1:2
4->5:4
5->4:4
5->3:7
5->2:6

Sample Output

0   13  21  22
13  0   12  13
21  12  0   13
22  13  13  0

 

 

Any clues?

 

python Rosalind evolutionary tree distance matrix • 2.4k views
ADD COMMENT
3
Entering edit mode
6.2 years ago
Ryan Layer ▴ 60

This is the "all paris shortest path" question.  The Floyd-Warshall algorithm (https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm) is a good place to start.

ADD COMMENT
0
Entering edit mode

Thanks a lot!

ADD REPLY
0
Entering edit mode

Coming back to this years late unfortunately, but I think the Floyd-Warshall algorithm would be a poor choice for this, right? For a graph with V vertices, the algorithm is O(V^3), whereas this problem can be solved in O(V^2) for a binary tree. For example, for a binary tree with n leaves:

M = empty *n* by *n* matrix
for u in tree.postorder_traversal():
    if u is a leaf:
        u.leaf_dists = [(u,0)]
    else:
        u.leaf_dists = empty list
        for c1,d1 in u.left_child.leaf_dists:
            add (c1, d1+c1.edge_length) to u.leaf_dists
            for c2,d2 in u.right_child.leaf_dists:
                add (c2, d2+c2.edge_length) to u.leaf_dists
                M(c1,c2) = M(c2,c1) = d1+c1.edge_length+d2+c2.edge_length

The two extremes are a perfectly balanced binary tree and a perfectly unbalanced (i.e., ladder-like) tree. In the case of a perfectly balanced binary tree, each level of the tree will perform a O(n) operation, and there are O(log n) total levels, making the overall time complexity O(n log n). For a perfectly unbalanced (i.e., ladder-like) tree, the internal nodes will perform n + n-1 + n-2 + ... + 2 operations and each leaf will perform 1 operation (so n total for all the leaves), so the overall time complexity will be n(n+1)/2 - 1 + n = O(n^2)

ADD REPLY

Login before adding your answer.

Traffic: 2491 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6