Tree input for PAML ancestral state reconstruction
6.4 years ago
memory_donk ▴ 330

Hi Biostars,

I have a set of about 6,000 groups of orthologous proteins. Each orthologous group has a representative protein from anywhere between 7 and 16 species (1 per species). I'm trying to do ancestral state reconstructions for each of these groups based on their well-established phylogenies, and then to compare each amino acid position in a species of interest to the reconstructed sequences at key ancestral nodes.

I've been trying to use PAML (codeml) for this purpose, following the method used in this blog post http://evosite3d.blogspot.com.au/2014/09/tutorial-on-ancestral-sequence.html

My problem is that when using clock = 0 (no molecular clock), PAML requires an unrooted tree. According to PAML's manual, "a rooted tree has a bifurcation at the root, while an unrooted tree has a trifurcation or multifurcation at the root."

The problem is that the mammalian tree that I have (composed of eutherians and marsupials) is a well-established bifurcating tree. How can I do an ancestral state reconstruction with PAML, which requires a multifurcating tree, when the true phylogeny is bifurcating? Adding an outgroup like platypus or chicken would still give a rooted tree, since they're both outgroups, and a polytomy of chicken/platypus, eutherians and marsupials would be false. I'm sure I've just deeply misunderstood something along the way. Any help would be greatly appreciated!

gene software error • 3.7k views
6.4 years ago
Brice Sarver ★ 3.7k

The short answer is to unroot the tree. This can be accomplished easily in R using unroot() in ape.

A tree that is strictly bifurcating when rooted can still have a node with more than two descendants once it is unrooted. Consider the three-taxon case: there is one possible unrooted tree (shaped like a 'Y') and three possible bifurcating rooted trees (rooted on the branch leading to the first, second, or third taxon). Remember that the placement of the root is a hypothesis.
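
To make that concrete, here is a minimal Python sketch of what unrooting does to a Newick string. It is only an illustration, not ape's implementation: the function names (unroot, split_top_level, split_length) and the example species are made up here, and it assumes a plain Newick string with no label or branch length on the root itself. In R you would simply call unroot() on the tree object.

```python
def split_top_level(s):
    """Split a comma-separated list of Newick clades at depth-0 commas."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(s[start:i])
            start = i + 1
    parts.append(s[start:])
    return parts


def split_length(clade):
    """Return (clade, branch_length); branch_length is None if absent."""
    close = clade.rfind(")")
    colon = clade.rfind(":")
    if colon > close:  # a ':' after the last ')' belongs to this clade
        return clade[:colon], float(clade[colon + 1:])
    return clade, None


def unroot(newick):
    """Collapse a bifurcating root into a trifurcation (illustrative only)."""
    body = newick.strip().rstrip(";").strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("expected a Newick tree without a root label")
    children = split_top_level(body[1:-1])
    if len(children) != 2:
        return body + ";"  # root is already multifurcating

    parsed = [split_length(c) for c in children]
    internal = [i for i, (clade, _) in enumerate(parsed) if clade.startswith("(")]
    if not internal:
        raise ValueError("cannot unroot a tree with only two tips")

    # Dissolve the first internal child; which one is chosen does not
    # change the unrooted topology.
    idx = internal[0]
    dissolved, dissolved_len = parsed[idx]
    other, other_len = parsed[1 - idx]

    # The dissolved node's children are promoted to the top level.
    promoted = split_top_level(dissolved[1:dissolved.rfind(")")])

    # The two branches that met at the old root merge into one branch.
    if other_len is not None:
        total = other_len + (dissolved_len or 0.0)
        other = f"{other}:{total:.10g}"

    return "(" + ",".join(promoted + [other]) + ");"


if __name__ == "__main__":
    print(unroot("((human,mouse),(opossum,wallaby));"))
```

The rooted tree ((human,mouse),(opossum,wallaby)); comes out as (human,mouse,(opossum,wallaby));. The relationships are unchanged; the tree is just written with a trifurcation at the top, which is the form codeml asks for with clock = 0.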


I can confirm the above. 'unroot' will work well for this.

On Brice's suggestion I have been using 'ape' for a while now. It's easily one of the most useful phylogenetics packages for R out there.


Thanks Brice! Codeml is happy now.