Question: Are There Multiple Ways To Write The Same Unrooted Tree Using Newick Format?
7.8 years ago by
jli99150 wrote:


I have the same unrooted tree, manipulated by different software, resulting in different Newick files. I'm fairly sure that the tree remains unrooted after the manipulation and therefore should still have the same topology, even if the branch lengths differ. My question is: can the same unrooted tree have different Newick forms? And if so, how can I rewrite the Newick files so that they look the same (apart from branch lengths)?


7.8 years ago by
Pune, India
Farhat2.9k wrote:

When you represent an unrooted tree in Newick format, an arbitrary position is chosen as the root. Thus, you can have different Newick forms for the same tree depending on where this root is placed. You can also reorder the children of any node without changing the topology, e.g. ((A,B),(C,D)) and ((C,D),(A,B)) represent the same tree.
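
One way to check this in practice is to compare the sets of leaf bipartitions (splits) induced by the internal branches, since those do not depend on root placement or child order. The following is a minimal sketch, not a full Newick parser: it assumes unquoted, unique leaf names without spaces, and the function names (`parse_newick`, `splits`, `same_unrooted_topology`) are made up for illustration.

```python
import re

PUNCT = {"(", ")", ",", ";"}

def parse_newick(text):
    """Parse a simple Newick string into nested lists; leaves are name strings.

    Branch lengths and internal labels (e.g. support values) are dropped.
    Quoted labels and comments are not handled.
    """
    tokens = re.findall(r"[(),;]|[^(),;]+", re.sub(r"\s+", "", text))
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while tokens[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
            # skip an optional internal label and/or branch length
            if pos < len(tokens) and tokens[pos] not in PUNCT:
                pos += 1
            return children
        name = tokens[pos].split(":")[0]  # drop the branch length
        pos += 1
        return name

    return node()

def leaf_set(tree):
    """Return the set of leaf names below a parsed node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def splits(tree):
    """Return the non-trivial bipartitions of the leaves.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the result is independent of the rooting.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found = set()

    def walk(n):
        if isinstance(n, str):
            return frozenset([n])
        below = frozenset().union(*(walk(child) for child in n))
        side = all_leaves - below if ref in below else below
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    walk(tree)
    return found

def same_unrooted_topology(newick_a, newick_b):
    """True if both strings describe the same unrooted topology."""
    a, b = parse_newick(newick_a), parse_newick(newick_b)
    return leaf_set(a) == leaf_set(b) and splits(a) == splits(b)
```

With this sketch, `same_unrooted_topology("((A,B),(C,D));", "((C,D),(A,B));")` returns `True`, while comparing `((A,B),(C,D));` with `((A,C),(B,D));` returns `False`.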


In that case, is (A,(C,D),B) also the same tree? I was confused about how to attach support values to the arbitrary root, but now I think the root simply doesn't have a support value?

written 7.8 years ago by jli99150

For bootstrapping, the value is attached to a branch, not to a node. With four leaves there is only one bootstrap value, on the branch between the (A,B) clade and the (C,D) clade. In the (A,(C,D),B) form, that single value sits on the branch connecting the root to the parent of C and D. In general, a binary unrooted tree with n leaves has n-3 internal branches and therefore n-3 bootstrap values.
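
The n-3 count can be derived with a short counting argument. The symbols i (number of internal nodes) and E (number of branches) are introduced here only for this sketch; it assumes every internal node of the unrooted tree has degree 3.

```latex
\begin{align*}
E &= n + i - 1            && \text{(a tree with $n+i$ nodes has $n+i-1$ branches)} \\
2E &= n + 3i              && \text{(sum of node degrees: leaves 1, internal nodes 3)} \\
\Rightarrow\ i &= n - 2, \quad E = 2n - 3 \\
E - n &= n - 3            && \text{(remove the $n$ terminal branches)}
\end{align*}
```

For n = 4 this gives one internal branch, matching the single bootstrap value in the four-leaf example.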

modified 7.5 years ago • written 7.5 years ago by lh332k

No, that would have three descendants at the root node and would no longer be a binary tree. However, (A,((C,D),B)) would be the same.

modified 7.8 years ago • written 7.8 years ago by Farhat2.9k

If the string represents an unrooted tree, (A,(C,D),B) is exactly the same as ((A,B),(C,D)). In fact, some software intentionally puts a trifurcation at the root to emphasize that the tree is unrooted.
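
To make differently written files "look the same", one possible approach (a sketch, not what any particular program does) is to rewrite each tree in a canonical form: drop lengths and labels, remove a degree-2 root, re-root at the internal node next to the alphabetically smallest leaf, and sort the children at every node. For a binary unrooted tree this yields the trifurcating-root style mentioned above. The function names are hypothetical, and the code assumes at least three leaves with unique, unquoted names.

```python
import re
from itertools import count

PUNCT = {"(", ")", ",", ";"}

def newick_to_graph(text):
    """Build an undirected adjacency map from a simple Newick string.

    Leaves are keyed by name (str), internal nodes by integer ids.
    Branch lengths and internal labels are dropped.
    Returns (adjacency, id_of_the_written_root).
    """
    tokens = re.findall(r"[(),;]|[^(),;]+", re.sub(r"\s+", "", text))
    adj, ids, pos = {}, count(), 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            name = tokens[pos].split(":")[0]  # drop the branch length
            pos += 1
            adj[name] = []
            return name
        me = next(ids)
        adj[me] = []
        while True:
            pos += 1  # consume "(" on the first pass, "," afterwards
            child = node()
            adj[me].append(child)
            adj[child].append(me)
            if tokens[pos] != ",":
                break
        pos += 1  # consume ")"
        # skip an optional internal label and/or branch length
        if pos < len(tokens) and tokens[pos] not in PUNCT:
            pos += 1
        return me

    return adj, node()

def canonical_unrooted(text):
    """Return a rooting- and order-independent Newick string (topology only)."""
    adj, root = newick_to_graph(text)
    if len(adj[root]) == 2:
        # A bifurcating root is only an artifact of the notation:
        # remove it and join its two neighbours directly.
        a, b = adj.pop(root)
        adj[a][adj[a].index(root)] = b
        adj[b][adj[b].index(root)] = a
    first_leaf = min(n for n in adj if isinstance(n, str))
    start = adj[first_leaf][0]  # the only neighbour of that leaf

    def write(n, parent):
        kids = [write(k, n) for k in adj[n] if k != parent]
        if not kids:
            return n  # a leaf: just its name
        return "(" + ",".join(sorted(kids)) + ")"

    return write(start, None) + ";"
```

With this sketch, `((A,B),(C,D));`, `(A,(C,D),B);` and `(A,((C,D),B));` are all rewritten to `((C,D),A,B);`, so the outputs can be compared as plain strings. Support values are discarded here; carrying them along would need extra bookkeeping, because they belong to branches rather than nodes.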

modified 7.5 years ago • written 7.5 years ago by lh332k
7.5 years ago by
aidan-budd1.9k wrote:

A description of the Newick tree format is given here, on Joe Felsenstein/Mary Kuhner's lab webpages

As Farhat says above, Newick represents a rooted tree. The convention is to represent a binary, unrooted tree with a polytomy (multifurcation) at the root node. Note, however, that this is only a convention. The tree (A,B,(C,D)) may represent a binary, unrooted, four-taxon tree. But it might also represent a non-binary rooted tree with a polytomy at the root: the root node is linked to two terminal branches (one leading to OTU A, the other to OTU B) and to an internal branch leading to the internal node from which the terminal branches to OTUs C and D descend (OTU: Operational Taxonomic Unit).

Note the cute "paradox" that most tree-estimation methods in use these days estimate unrooted trees, yet for many applications of trees we want or need to make some inference about where the root is (or isn't). Hence we need some method or set of assumptions for positioning the root (ideally ones that we can defend, and that we state when presenting our rooted trees) before we can use the trees we estimate in applications that require rooted trees.
