Phylogenetic Tree Comparison
1
1
Entering edit mode
12.2 years ago
Mehdi ▴ 10

I asked similar question in stackoverflow, I didn't get good answer there, So I ask it here.

I developed new algorithm for phylogenetic tree comparison(phylogenetic tree is simply rooted binary tree). As an input we have two trees, we want to calculate their similarity percentage. one example of these type of algorithms is here.

But most of these algorithms(as I know all of them) did not offer a good way to check the accuracy of their algorithms. e.g if you look at the following figure, you can see there is more similarity between T1 and T3, versus T1 and T2.

examples of three phylogenetic trees

I need a method for checking its accuracy of similarity measure, To be sure that my algorithm is better than previous algorithms !!! (it is not difficult in most of the case by human eyes but I don't know how to extend it to my application)

your validity measure should be independent from algorithm.

phylogenetics graphs similarity • 5.5k views
ADD COMMENT
5
Entering edit mode

Sorry, I guess that doesn't work that way. There is no accuracy per-se for distance measures (of trees) unless you define it, and then I think it doesn't make much sense to define it intrinsically. Given two distance or similarity measures e.g. 'Euclidean' and 'Manhattan' (yes not for trees I know, but the prob is the same), how would you define that euclidean is better than manhattan distance or vice versa. One way to get around might be to define it in the sense of being more meaningful biologically, then you need some good example.

ADD REPLY
0
Entering edit mode

Are you looking for 'tree edit distance'? Then this question on Stackoverflow might contain what you are looking for.

ADD REPLY
4
Entering edit mode
12.2 years ago
lh3 33k

A useful distance for measuring LGT is the SPR distance. However, computing SPR is NP-hard. More frequently we use the Robinson-Foulds distance (RF). It is perhaps more suitable for evaluating tree-building algorithms. Both distances are originally defined on unrooted trees, but it is trivial to define them on rooted trees as well.

In your case, the SPR distance for both T1-T2 and T1-T3 equals 1. I forget how to compute RF exactly (you may google it), but for sure under RF, T1 is closer to T3 than to T2.

ADD COMMENT

Login before adding your answer.

Traffic: 2596 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6