Question

Verifying Phylogeny Trees


13.4 years ago

Jamand ▴ 110

Hi all

I've built a phylogenetic tree using PhyML (ML method), but some branches have low bootstrap values and so seem poorly supported. How can I fix it?

Best regards

phylogenetics tree • 7.0k views


score 8 · Answer 1 · 2010-12-06

"Fix it" is not really the correct way to think about this.

Bootstrapping is a statistical method, used to get around the problem that you have imperfect data (i.e. a finite alignment rather than as much sequence as you would like). Typically, the way it works is to generate a "fake" alignment by resampling the columns of the real alignment with replacement, rebuild the tree from it, and see whether the same sequences still cluster on the same branch. You repeat that a chosen number of times (perhaps 100 or 1000) and count the proportion of times that you got the same branch.
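To make the resampling step concrete, here is a minimal sketch, assuming the usual nonparametric bootstrap in which alignment columns are drawn with replacement. The names `bootstrap_alignment` and `branch_support` are made up for illustration, and the tree-building step (PhyML or similar) is left out entirely; each replicate tree is assumed to be summarised as a set of clades.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: same taxa, columns resampled with replacement.

    `alignment` is assumed to be a dict of taxon name -> aligned sequence,
    with all sequences the same length.
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def branch_support(replicate_clades, clade):
    """Percentage of replicate trees that contain `clade`.

    `replicate_clades` is assumed to be a list with one set of clades
    (frozensets of taxon names) per replicate tree.
    """
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)

# Usage: draw one replicate from a toy alignment.
toy = {"human": "ACGTAC", "chimp": "ACGTAT", "mouse": "ATGTCT"}
replicate = bootstrap_alignment(toy, random.Random(42))
```

In a real analysis you would build a tree from each replicate and feed the resulting clades to something like `branch_support`; PhyML does all of this internally when you ask for bootstrap replicates.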

Low values mean lower confidence. You may be able to improve the scores using better data, such as more real sequences or better optimization of the alignment. However, it's important to realise that these are statistical methods and they do not really give "right" or "wrong" answers. Rather, you should think of the statistic as telling you something about your data.

score 3 · Answer 2 · 2010-12-08

As Neil alluded to, the way to "fix" low bootstrap values is to improve the data from which the tree is constructed. Here are a few things that you may consider doing:

If you are building a species tree, you should strongly consider basing it on multiple genes using either concatenated alignments or super-tree methods.
Poor support for a given branch pattern may be due to poor species coverage in that part of the tree. Including additional species that diverged around the same point in the tree may improve your resolution and hence bootstrap values.
Bootstrap values (and trees in general) can be badly affected by alignment errors. Using a better multiple alignment program, improving the alignment with refinement programs such as RASCAL, manually checking the alignment, and eliminating the less confident parts of your alignment (e.g. with GBlocks) can thus all improve your results.
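Two of the suggestions above (concatenating per-gene alignments and dropping unreliable columns) can be sketched roughly as follows. This is only an illustration under simple assumptions: alignments are dicts of taxon name to aligned sequence, missing taxa are padded with gaps, and the gap-fraction filter is a crude stand-in that does not reproduce what Gblocks actually does. The function names and the `max_gap_fraction` parameter are hypothetical.

```python
def concatenate(gene_alignments):
    """Join per-gene alignments into one supermatrix, padding missing taxa with gaps."""
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in gene_alignments:
        length = len(next(iter(aln.values())))
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * length))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}

def drop_gappy_columns(alignment, max_gap_fraction=0.5):
    """Keep only the columns whose fraction of gap characters is at most the cutoff."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    keep = [
        i for i in range(n_sites)
        if sum(alignment[n][i] == "-" for n in names) / len(names) <= max_gap_fraction
    ]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}

# Usage: two toy genes, the second one missing taxon "B".
gene1 = {"A": "ACGT", "B": "ACGA"}
gene2 = {"A": "TT", "C": "TG"}
supermatrix = concatenate([gene1, gene2])
trimmed = drop_gappy_columns(supermatrix, max_gap_fraction=0.5)
```

For real data you would still want a proper trimming tool and a look at the alignment by eye; the point here is only to show what "concatenate, then remove low-confidence columns" means mechanically.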

score 1 · Answer 3 · 2013-01-11

One way of addressing part of this "problem" uses the concept of "rogue" taxa (see Wilkinson M. 1996. Majority-rule reduced consensus trees and their use in bootstrapping. Mol. Biol. Evol. 13: 437–444.), taxa which are found in different "positions" in different trees (e.g. the trees estimated from different bootstrapped replicates of your alignment dataset).

By removing/pruning these taxa from your analysis, you can often see a gratifying "improvement" in the support values for other branches.

Andre and Denis, working with Alexis Stamatakis in Heidelberg, just published a web service that can help identify such taxa, which you might like to take a look at.

Obviously, if the taxa you're most interested in working with are identified as rogues, you're going to need to think carefully about your analysis!
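To give a feel for the idea, here is a toy sketch that ranks taxa by how much the average clade frequency across bootstrap trees goes up when each taxon is pruned. This is a crude heuristic of my own for illustration, not the algorithm used by the web service mentioned above or by Wilkinson (1996). Each bootstrap tree is assumed to be given as a set of non-trivial rooted clades (frozensets of taxon names); all function names are hypothetical.

```python
from collections import Counter

def prune(clades, taxon, all_taxa):
    """Remove `taxon` from every clade and drop clades that become trivial."""
    remaining = all_taxa - {taxon}
    pruned = set()
    for clade in clades:
        reduced = clade - {taxon}
        if 2 <= len(reduced) < len(remaining):
            pruned.add(reduced)
    return pruned

def mean_support(trees):
    """Average frequency (0..1) of the distinct clades seen across the trees."""
    counts = Counter(clade for tree in trees for clade in tree)
    if not counts:
        return 0.0
    return sum(counts.values()) / (len(counts) * len(trees))

def rank_rogue_candidates(trees, all_taxa):
    """Return (taxon, gain) pairs, largest gain in mean support first."""
    base = mean_support(trees)
    gains = {}
    for taxon in sorted(all_taxa):
        pruned_trees = [prune(tree, taxon, all_taxa) for tree in trees]
        gains[taxon] = mean_support(pruned_trees) - base
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)

# Usage: taxon "X" sits in a different place in each of three toy bootstrap trees.
fs = frozenset
taxa = {"A", "B", "C", "D", "X"}
trees = [
    {fs("AB"), fs("ABX"), fs("CD")},
    {fs("AB"), fs("CD"), fs("CDX")},
    {fs("AX"), fs("ABX"), fs("CD")},
]
ranking = rank_rogue_candidates(trees, taxa)
```

In this toy case "X" comes out on top because the remaining clades become identical in every tree once it is pruned. Real rogue detection optimises a more careful criterion, so treat this only as a picture of what "pruning improves support" means.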

score 0 · Answer 4 · 2010-12-10

Very good answers from neilfws and Lars Juhl Jensen. I'll add just a minor point: rooting the tree and changing the root can also have an effect on the relationships of the other taxa to one another. In other words, your problematic area(s) may look different if you change or add an outgroup to root the tree. I prefer to use more than one outgroup when doing this. For example, if I'm building a tree with primate sequences, I may throw in dog and bovine sequences, maybe cat as well, so that my outgroup can also form a branching pattern.

score 0 · Answer 5 · 2011-03-29


13.1 years ago

Yo_O ▴ 130

Hi,

You can cut off your low bootstrap values, if you want to, with this web service: http://www.ibarcode.org/collapse

Your input has to be a Newick file.
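If you would rather do the collapsing locally, the operation can be sketched in plain Python. This is a minimal illustration, not what the web service does internally: it assumes a simple Newick string with support values stored as numeric internal node labels (no quoted labels or comments), and the function names are made up for this example.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts (label, length, children)."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1                      # skip "("
            children.append(node())
            while text[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1                      # skip ")"
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        label = text[start:pos]
        length = None
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return {"label": label, "length": length, "children": children}

    return node()

def collapse_low_support(node, cutoff):
    """Splice out internal nodes whose support label is below `cutoff` (in place)."""
    new_children = []
    for child in node["children"]:
        collapse_low_support(child, cutoff)
        if child["children"] and child["label"] and float(child["label"]) < cutoff:
            for grandchild in child["children"]:
                # Add the removed branch's length so root-to-tip distances are kept.
                if grandchild["length"] is not None and child["length"] is not None:
                    grandchild["length"] += child["length"]
                new_children.append(grandchild)
        else:
            new_children.append(child)
    node["children"] = new_children
    return node

def write_newick(node):
    """Serialise the nested dicts back to Newick (lengths to 6 significant digits)."""
    def render(n):
        out = ""
        if n["children"]:
            out += "(" + ",".join(render(c) for c in n["children"]) + ")"
        out += n["label"]
        if n["length"] is not None:
            out += ":" + format(n["length"], "g")
        return out
    return render(node) + ";"

# Usage: collapse every branch with bootstrap support below 50.
tree = parse_newick("((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.1)40:0.2);")
collapsed = write_newick(collapse_low_support(tree, 50))
```

Collapsing turns a poorly supported split into a polytomy; it only changes how the tree is displayed and does not make the underlying data any more informative.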

Best regards.
