Is there a script somewhere around (MATLAB, R, PERL) that calculates a distance matrix based on a tree file? Hence, I already have the tree but want all the distance information in a matrix.

Script To Calculate A Distance Matrix Based On Tree File

3

Entering edit mode

13.3 years ago

Arjen Ten Have
▴
360

Is there a script somewhere around (MATLAB, R, PERL) that calculates a distance matrix based on a tree file? Hence, I already have the tree but want all the distance information in a matrix.

11

Entering edit mode

13.3 years ago

Raquel Dsl
▴
110

If you are interested in an another solution (in R) try this

```
library(ape)
tree<-read.tree("/file.tre")
PatristicDistMatrix<-cophenetic(tree)
```

0

Entering edit mode

This is an old reply, but I found this and it misled me to spend time understanding the `ape::cophenetic.phylo()`

function.

I have an integer distance tree and it returned nonsense (as far as I can tell). The function this answer alludes is stats::cophenetic() and it works wonderfully with phylo objects.

*I validated the results with the output from the T-Rex server.*

4

Entering edit mode

13.3 years ago

Casey Bergman
18k

There is a Newick->distance matrix converter in the T-rex package, available as a web-server and as source code. There are no details as far as I can see describing what this method actually does.

To calculate the patristic distance (i.e. the distance between two leaves measured along the tree) from a variety of tree formats between all pairs of leaves, you can try the PATRISTIC Java application.

It also looks like the DendroPy package can generate patristic distance from a variety of tree formats. E.g. from the Treestats tutorial: http://packages.python.org/DendroPy/tutorial/treestats.html

```
#! /usr/bin/env python
import dendropy
from dendropy import treecalc
tree = dendropy.Tree.get_from_path("input.nex", "nexus")
pdm = treecalc.PatristicDistanceMatrix(tree)
for i, t1 in enumerate(tree.taxon_set):
for t2 in tree.taxon_set[i+1:]:
print("Distance between '%s' and '%s': %s" % (t1.label, t2.label, pdm(t1, t2)))
```

0

Entering edit mode

Casey!

Great I ran the webserver with some simple trees and it runs OK. It has some problems with nodes but that is likely a matter of not including the nodes in the tree file. Great! No I will try with the source (prefer to pipe it) and see what the others do! Thanks really a great help.

Arjen

PS Didn't you use to work on polygalacturonases?

2

Entering edit mode

13.3 years ago

Michael Kuhn
5.0k

Normally, trees are generated from a distance matrix and you certainly lose information while constructing the tree. What you could get from a tree would be the sum of branch lengths for each pair of nodes. This might be possible in R by loading the tree with ape, then converting to igraph with `as.igraph`

and finally computing the shortest paths.

0

Entering edit mode

Thanks for this solution, seems the way except that I never worked with R. I managed to install R as well as ape and igraph, opened the tree using ape but then I cannot figure out how to use as.igraph The usage is as.igraph(object) but I do not have a clue how I should refer to my object.

0

Entering edit mode

I managed a bit further. I got it converted to igraph but now igraph does not do what I want, probably due to my misunderstanding of how to use the commands. If I put

```
shortest.paths(graph, v=132, weights = NULL)
```

I get the same as when I put

```
shortest.paths(graph, v=132)
```

which is basically the amount of edges between nodes

Hence weight is not included properly

To be sure: I transferred "mytree" from ape to igraph as "graph", it has 133 edges, hence 0-132. Then by putting `weights = NULL`

it should include the weights per edge BUT some edges have a value of 0.0, could this be the problem?

0

Entering edit mode

I haven't used igraph very much, but perhaps you can write your graph e.g. by `write.graph(graph, "graph.ncol", format="ncol")`

and see how it looks? http://igraph.sourceforge.net/igraphbook/igraphbook-foreign.html

2

Entering edit mode

9.2 years ago

guanhomer
▴
20

I wrote a code in Perl myself. It works for me.

I used the example tree from http://en.wikipedia.org/wiki/Newick_format

The output of the following code is:

```
A B C D
A 0 0.3 0.9 1
B 0.3 0 1 1.1
C 0.9 1 0 0.7
D 1 1.1 0.7 0
```

```
#!/usr/bin/perl -w
use strict;
my $tree = "(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);";
##record the distance of parentheses
my %dis;
my $par = -1;
my @current;
while($tree =~ /./g)
{if ($& eq '(')
{$par ++;
next if $par == 0;
$current[$#current+1] = $par;
}
elsif($& eq ')')
{(my $tem) = $' =~ /:(\d+\.\d+|\d+)/;
next if $#current == -1;
$dis{'node_'.$current[$#current]} = $tem;
pop @current;
}
}
##record the distance of leaves
my @order;
while ($tree =~ /([^\(\):,]+):(\d+\.\d+|\d+)/g)
{$dis{$1} = $2;
$order[$#order+1] = $1;
}
##record parents of leaves
my %pare;
@current = ();
$par = -1;
while($tree =~ /(\(|\)|([^\(\):,]+):)/g)
{if ($& eq '(')
{$par ++;
next if $par == 0;
$current[$#current+1] = $par;
}
elsif($& eq ')')
{pop @current;
}
else{map {$pare{$2}{$_} = 1} @current;
$pare{$2} = [@current];
}
}
##Distance matrix
my %dis2;
foreach my $i (0..$#order)
{foreach my $j ($i..$#order)
{if ($i == $j)
{$dis2{$order[$i]}{$order[$j]} = 0;
}
else{my $tem = $dis{$order[$i]} + $dis{$order[$j]};
my $tem2 = -1;
foreach my $k (0..$#{$pare{$order[$i]}})
{last if ($k > $#{$pare{$order[$j]}});
if ($pare{$order[$i]}[$k] eq $pare{$order[$j]}[$k])
{$tem2 = $k;
}
}
if ($#{$pare{$order[$i]}} != -1)
{map {$tem += $dis{'node_'.$_}} map {$pare{$order[$i]}[$_]} ($tem2+1)..$#{$pare{$order[$i]}};
}
if ($#{$pare{$order[$j]}} != -1)
{map {$tem += $dis{'node_'.$_}} map {$pare{$order[$j]}[$_]} ($tem2+1)..$#{$pare{$order[$j]}};
}
$dis2{$order[$i]}{$order[$j]} = $dis2{$order[$j]}{$order[$i]} = $tem;
}
}
}
##output
print join("\t",'',@order),"\n";
foreach my $i (@order)
{print join("\t",$i,map {$dis2{$i}{$_}} @order),"\n";
}
close(OUT);
```

1

Entering edit mode

13.3 years ago

Zhidkov
▴
600

You can try BioPerl modules: http://bioperl.org/wiki/HOWTO:Trees

Similar Posts

Loading Similar Posts

Traffic: 3618 users visited in the last hour

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

So, the matrix would show distances between all leafs of the tree?

The matrix would show distances between all leafs of the tree?

Indeed, probably it would also calculate the distance between all internal nodes but that might even turn out handy