How to scale a cladogram to reflect branching time?
5.3 years ago
ddowlin ▴ 70

Hi all,

I have a cladogram generated using PhyloT for 13 yeast species.

The tree just shows the branching order rather than the distance between branching events in terms of geological time.

Similar to the middle tree here.

I was wondering whether there is an easy way to scale the tree so that the distances between branch points are proportional to time.

phylogenetics cladistics • 2.5k views

Thanks, this looks like a very useful package.

Does it have a function to add pre-determined divergence times to a tree file?


PhyloT-generated trees cannot be used with ggtree without adding branch lengths first, because they cannot be imported. In fact, I have to take back my statement about phyloT being useful, because the generated trees cannot be opened in R, at least not with the ape package.


PhyloT is a nice tool for extracting trees from the taxonomy. You need to set branch lengths using some divergence timing information. You could try to get divergence time estimates from TimeTree, but while TimeTree might work well for higher level taxa, it might not have enough information on the species level. Maybe you could resort to making your own phylogeny like here: http://femsyr.oxfordjournals.org/content/15/6/fov050 ?
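Once you have divergence time estimates, turning them into branch lengths is simple arithmetic: the length of a branch is the age of the parent node minus the age of the child node, with present-day tips at age 0. Here is a minimal Python sketch of that step. The function name, the toy tree and the ages are all made up for illustration; they are not real divergence times.

```python
def branch_lengths_from_ages(parent, age):
    """Return {child: age[parent] - age[child]} for every non-root node.

    parent maps each node name to the name of its parent.
    age maps internal node names to their age; nodes missing from
    age are treated as present-day tips (age 0).
    """
    lengths = {}
    for child, par in parent.items():
        length = age[par] - age.get(child, 0.0)
        if length < 0:
            raise ValueError(f"{child} is older than its parent {par}")
        lengths[child] = length
    return lengths


# Toy tree ((t1,t2)n1,t3)root; with placeholder ages (arbitrary units).
parent = {"n1": "root", "t3": "root", "t1": "n1", "t2": "n1"}
age = {"root": 10.0, "n1": 4.0}

lengths = branch_lengths_from_ages(parent, age)
print(lengths)
```

Because every tip sits at age 0, the lengths along any root-to-tip path add up to the root age, which is what makes the resulting tree ultrametric.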


Thanks for your suggestion. Do you know if there is a way to add divergence times to the tree without having to edit the tree file (Newick format in my case) in a text editor?


I think I would parse the tree with a Perl script using Bio::TreeIO, then set the length of each branch using the Bio::Tree::NodeI method branch_length() and print the tree again, as in the example at http://search.cpan.org/~cjfields/BioPerl-1.6.924/Bio/Tree/NodeI.pm

5.2 years ago
Guangchuang Yu ★ 2.5k

I just created a function, read.phyloT, in ggtree to parse the Newick output of phyloT.

read.phyloT(file=textConnection("((Escherichia_coli,(Drosophila_melanogaster,((Homo_sapiens,Mus_musculus)Euarchontoglires,Gallus_gallus)Amniota)Bilateria)cellular_organisms);"))

Phylogenetic tree with 5 tips and 4 internal nodes.

Tip labels:
[1] "Escherichia_coli"        "Drosophila_melanogaster"
[3] "Homo_sapiens"            "Mus_musculus"
[5] "Gallus_gallus"
Node labels:
[1] "cellular_organisms" "Bilateria"          "Amniota"
[4] "Euarchontoglires"

Rooted; no branch lengths.


This function is available in ggtree (version >=1.5.16).

5.2 years ago

There are several options to visualize the output of PhyloT and assign branch lengths to it; however, the output needs to be generated with certain settings or it will not work.

Parse tree using R-package ape, further use with ggtree

Choose example #1 in PhyloT. Set Internal nodes to Collapsed and Polytomy to No, then export as Newick. This gives:

     ((((Gallus_gallus,(Homo_sapiens,Mus_musculus)Euarchontoglires)Amniota,Drosophila_melanogaster)Bilateria,Escherichia_coli)cellular_organisms);


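This is not a Newick string that ape accepts: the extra outer pair of parentheses wraps a single child, which ape reports as a singleton node. When you try to import it using read.tree, you will get: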

    > read.tree('~/perl/test.nwk')
    Error in read.tree("test.nwk") :
    The tree has apparently singleton node(s): cannot read tree file.
    Reading Newick file aborted at tree no. 1


To avoid the error, edit the Newick string and remove the outer parentheses, so that it looks like this:

   (((Gallus_gallus,(Homo_sapiens,Mus_musculus)Euarchontoglires)Amniota,Drosophila_melanogaster)Bilateria,Escherichia_coli)cellular_organisms;
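
If you would rather not do this by hand for every export, the edit can be scripted. The following is only a rough Python sketch: strip_outer_parens is a name I made up, and it assumes the labels contain no quoted parentheses or commas. It drops the outermost pair of parentheses only when that pair wraps a single child and is followed directly by the semicolon.

```python
def strip_outer_parens(newick):
    """Drop one redundant outer pair of parentheses from a Newick string.

    The string is returned unchanged (apart from surrounding whitespace)
    if the outermost pair holds more than one child or is followed by a
    root label.
    """
    text = newick.strip()
    body = text[:-1] if text.endswith(";") else text
    if not (body.startswith("(") and body.endswith(")")):
        return text
    depth = 0
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            # A comma directly inside the outer pair means it has
            # several children, so the pair is not redundant.
            return text
    return body[1:-1] + ";"


raw = ("((((Gallus_gallus,(Homo_sapiens,Mus_musculus)Euarchontoglires)Amniota,"
       "Drosophila_melanogaster)Bilateria,Escherichia_coli)cellular_organisms);")
fixed = strip_outer_parens(raw)
print(fixed)
```

A string whose outer parentheses hold two or more children, such as (A,B);, is left alone, so running the function on an already corrected file should be harmless.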


You can now read the tree into R and set the edge length in the following way:

    library(ape) # provides read.tree and write.tree
    library(ggtree)
    tree <- read.tree('~/perl/test.nwk') # read the corrected Newick file
    ggtree(tree) + geom_tiplab(size=2) # plot tree without assigned lengths
    tree$edge.length <- rpois(nrow(tree$edge), 10) # assign random edge lengths (placeholders)
    ggtree(tree) + geom_tiplab(size=2) # plot tree again
    write.tree(tree) # output the modified tree
    [1] "(((Gallus_gallus:9,(Homo_sapiens:16,Mus_musculus:10)Euarchontoglires:11)Amniota:8,Drosophila_melanogaster:8)Bilateria:10,Escherichia_coli:8)cellular_organisms;"


If you get the following error instead, you likely exported the tree with Internal nodes set to Expanded:

    Error in if (sum(obj[[i]]$edge[, 1] == ROOT) == 1 && dim(obj[[i]]$edge)[1] > : missing value where TRUE/FALSE needed

Using BioPerl's Bio::TreeIO module

You can add an edge length to every branch by using the following simple demo script:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Bio::TreeIO;
    # read in a tree in newick format
    my $treeio  = Bio::TreeIO->new( -format => 'newick', -file => $ARGV[0] );
    my $treeout = Bio::TreeIO->new( -format => 'newick' );
    my $tree     = $treeio->next_tree; # you might want to test that it was defined
    my $rootnode = $tree->get_root_node;
    # collect every node below the root
    my @nodes = $rootnode->get_all_Descendents();
    # set constant branch length for all nodes
    $_->branch_length(1) for @nodes;
    # for a more realistic scenario you have to parse a node length file and match with $node->id
    $treeout->write_tree($tree);
    print "\n";


Output:

    $ ./tree_add_branchlength.pl test.nwk
(((Gallus_gallus:1,(Homo_sapiens:1,Mus_musculus:1)Euarchontoglires:1)Amniota:1,Drosophila_melanogaster:1)Bilateria:1,Escherichia_coli:1)cellular_organisms;
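
For the more realistic scenario mentioned in the script comment (a separate length for each node, matched by node name), here is a rough Python sketch that works directly on the Newick string. add_branch_lengths is a hypothetical helper, the labels and lengths are placeholders, and it assumes that the input has no branch lengths yet and that every node you want to annotate carries a label, as in the collapsed phyloT export.

```python
import re


def add_branch_lengths(newick, lengths):
    """Append ':length' to every label in newick that has an entry in lengths.

    Labels without an entry (for example the root) are left untouched.
    """
    def annotate(match):
        label = match.group(0)
        if label in lengths:
            return f"{label}:{lengths[label]}"
        return label

    # A label is a run of characters that are not Newick punctuation.
    return re.sub(r"[^(),:;\s]+", annotate, newick)


# Placeholder tree and lengths, for illustration only.
tree = "((A,B)x,C)root;"
lengths = {"A": 1.5, "B": 1.5, "x": 2.0, "C": 3.5}

print(add_branch_lengths(tree, lengths))
```

In practice the lengths dictionary would be read from a two-column file of node names and branch lengths, which avoids editing the tree file in a text editor.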