Inferring Information About The Phylogenetic Tree For A Set Of Species From A Tree Of A Superset Of These
13.1 years ago
Ehamberg ▴ 130

Say that I have found the following phylogenetic tree for four species – a, b, c, and d, and this tree has a high likelihood:

      /\
   t₁/  \ t₂
    /    \
   a     /\
      t₃/  \ t₄      [tree 1]
       /    \
      /\     d
   t₅/  \ t₆
    /    \
   b      c

If I want to “extract information” about the tree of only the two species a and b, from the above tree – to the degree that this is possible,

    /\
 t₁/  \ ?           [tree 2]
  /    \
 a      b

what should be my guess for the branch length marked with a question mark in the second tree?

I am going to use this “subtree” as the starting point for further heuristic search, so I want a good guess to reduce the search time.

After “pruning” c and d from tree 1, there are several options for the branch length between root and b in tree 2:

  • it could be set to t₂+t₃+t₅
  • b could be moved up to the t₃/t₄ branch, making it t₂
  • it could be set to the average value of t₂, t₃, and t₅.
  • it could be set to the branch length connecting b to its parent in the first tree, i.e. t₅.
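
To make the four options concrete, here is a minimal sketch that evaluates each candidate for one set of branch lengths. The numeric values are hypothetical placeholders chosen for illustration; they are not taken from any real analysis.

```python
# Hypothetical branch lengths for tree 1 (illustrative values only).
t = {"t1": 0.10, "t2": 0.05, "t3": 0.08, "t4": 0.30, "t5": 0.12, "t6": 0.15}

# Branches on the path from the root down to b in tree 1.
path_to_b = ["t2", "t3", "t5"]
lengths = [t[name] for name in path_to_b]

# The four candidate values for the "?" branch in tree 2.
candidates = {
    "sum of the path (t2+t3+t5)": sum(lengths),
    "b moved up to the t3/t4 branch (t2)": t["t2"],
    "mean of t2, t3, t5": sum(lengths) / len(lengths),
    "terminal branch only (t5)": t["t5"],
}

for label, value in candidates.items():
    print(f"{label}: {value:.4f}")
```

Only the first candidate keeps the root-to-b path length of tree 1; the other three all give b a shorter branch.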

Do any of these options make more sense than the others? (Is there an obvious answer?) Is there any theory on this that I could look up?

[My initial thought was that t₂+t₃+t₅ is the best estimate since this conserves the time between the root and b which – assuming tree 1 is a good one – makes the states observed in b most likely.]
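
One way to state this intuition more precisely, as a sketch under an assumption the post does not make explicit: suppose sequences evolve under a continuous-time Markov substitution model with the same rate matrix $Q$ on every branch, so that the transition matrix over a branch of length $t$ is $P(t) = e^{Qt}$. Transition matrices along a path then compose by adding branch lengths:

```latex
P(t_2)\,P(t_3)\,P(t_5) \;=\; e^{Q t_2}\, e^{Q t_3}\, e^{Q t_5} \;=\; e^{Q (t_2 + t_3 + t_5)} \;=\; P(t_2 + t_3 + t_5)
```

Under that assumption, a single branch of length t₂+t₃+t₅ gives exactly the same root-to-b transition probabilities as the three branches in tree 1. This says nothing about whether re-estimating the length on the reduced data set would return the same value.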

phylogenetics • 2.5k views
13.1 years ago
Rvosa ▴ 580

I agree that the sum of branch lengths (t₂+t₃+t₅) is quite sensible from a biological point of view. Having said that, if you use the pruned subtree as input for a further iteration, the branch length that is subsequently recovered will be smaller because of the lessened "node density effect" (see e.g. http://scholar.google.co.uk/scholar?q=node+density+effect, especially the work done by Pagel, Meade and Venditti in various publications). If all you are after is reduced search time, you might as well pick a shorter branch (t₅?), because it will quite likely be nearer the optimum that is subsequently recovered.

Thanks! That's really interesting, I will definitely look into the node density effect. :-)

13.1 years ago

Without being an expert on phylogenetics by any means, I would say that your initial thought is correct. Assuming that tree 1 correctly reflects the evolution of the four species, the total branch length leading from the last common ancestor of a and b to b is t₂+t₃+t₅.
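
The summing rule can be sketched as a small pruning routine: drop the unwanted leaves, and whenever an internal node is left with a single child, remove that node and add its branch length to the child's. The tuple representation, the `prune` helper, the node names `n1`/`n2`, and the branch-length values below are all assumptions made for illustration, not part of any particular phylogenetics library.

```python
def prune(node, keep):
    """Restrict a tree to the leaves named in `keep`.

    A node is a tuple (name, length, children); leaves have an empty
    children list. Internal nodes left with exactly one child are
    suppressed, and their branch length is added to that child's.
    Returns None if no leaf below `node` is kept.
    """
    name, length, children = node
    if not children:
        return node if name in keep else None
    kept = [c for c in (prune(child, keep) for child in children) if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        child_name, child_length, grandchildren = kept[0]
        return (child_name, length + child_length, grandchildren)
    return (name, length, kept)


# Tree 1 with hypothetical branch lengths (illustrative values only).
t1, t2, t3, t4, t5, t6 = 0.10, 0.05, 0.08, 0.30, 0.12, 0.15
tree1 = ("root", 0.0, [
    ("a", t1, []),
    ("n1", t2, [
        ("n2", t3, [("b", t5, []), ("c", t6, [])]),
        ("d", t4, []),
    ]),
])

pruned = prune(tree1, {"a", "b"})
for leaf_name, leaf_length, _ in pruned[2]:
    print(leaf_name, round(leaf_length, 4))
```

With these values, the branch to a stays at t₁ and the branch to b becomes t₂+t₃+t₅.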
