Question: etetoolkit, Visualization of phylogenetic trees
I'm trying to adapt an example from "Visualization of phylogenetic trees". Unfortunately, it fails on the line recon_tree, events = genetree.reconcile(sptree) with the error "The following species are not contained in the species tree: ARA,THE". And indeed, there are no "ARA" and "THE" species defined.

However the script is able to generate both genetree and sptree.

I don't understand where this error comes from (this is my second day with Python); there is of course no "ARA" or "THE" species in the tree.

Thank you!

The whole code is:

from ete2 import PhyloTree, TreeStyle

def get_example_tree():

    # Performs a tree reconciliation analysis
    gene_tree_nw = '(PHYPA_A9T28,((ARATH_Q9C5A9,THELL_Tp6g34190),((ORYSJ_Q0DG48,BRADI_I1HQG7),SOLTU_M1A5X1)));'
    species_tree_nw = "((SELML, PHYPA), ((ORYSJ, BRADI), (SOLTU, (ARATH, THELL))));"
    genetree = PhyloTree(gene_tree_nw)
    sptree = PhyloTree(species_tree_nw)
    recon_tree, events = genetree.reconcile(sptree)
    return recon_tree, TreeStyle()

if __name__ == "__main__":
    # Visualize the reconciled tree
    t, ts = get_example_tree()
    # t.render("phylotree.png", w=750, tree_style=ts)


ete2 ete python phylogeny • 1.2k views
written 3.8 years ago by droidlove0
PhyloTree instances extract species names/codes directly from the sequence name. By default, the rule is to use the first three letters as the species code, which is why ARATH_Q9C5A9 and THELL_Tp6g34190 are reported as "ARA" and "THE".
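To see where "ARA" and "THE" come from without running ETE at all, the rule can be mimicked in plain Python. This is only a rough sketch of the behavior described above, not ETE's actual code: default_spcode and underscore_spcode are hypothetical helper names, and comparing the codes against the species tree's leaf names is an assumption about how the check works.

```python
# Leaf names copied from the gene tree in the question.
gene_leaves = [
    "PHYPA_A9T28", "ARATH_Q9C5A9", "THELL_Tp6g34190",
    "ORYSJ_Q0DG48", "BRADI_I1HQG7", "SOLTU_M1A5X1",
]

# Leaf names of the species tree in the question.
species_names = {"SELML", "PHYPA", "ORYSJ", "BRADI", "SOLTU", "ARATH", "THELL"}


def default_spcode(nodename):
    # Hypothetical stand-in for the default rule: first three letters.
    return nodename[:3]


def underscore_spcode(nodename):
    # Alternative rule: everything before the first underscore.
    return nodename.split("_")[0]


default_codes = [default_spcode(name) for name in gene_leaves]
underscore_codes = [underscore_spcode(name) for name in gene_leaves]

# Codes that do not match any leaf name in the species tree.
missing_default = sorted(set(default_codes) - species_names)
missing_underscore = sorted(set(underscore_codes) - species_names)

print(default_codes)       # three-letter codes such as 'ARA' and 'THE'
print(missing_default)     # none of the three-letter codes match
print(missing_underscore)  # empty: every five-letter code matches
```

With three-letter codes nothing lines up with names like ARATH or THELL, whereas the full prefix before the underscore matches the species tree exactly.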

In your example, you just need to change that rule so that the species name is extracted by splitting the sequence name at the first underscore:

def get_spcode(nodename):
    # Species code = everything before the first underscore
    return nodename.split("_")[0]
genetree = PhyloTree(gene_tree_nw, sp_naming_function=get_spcode)

The same should be done with the reference species tree. For instance, if it only contains plain species names:

def get_spcode2(nodename):
    # Leaf names are already plain species names, so use them unchanged
    return nodename
sptree = PhyloTree(species_tree_nw, sp_naming_function=get_spcode2)

This is documented in more detail at

written 3.7 years ago by jhc
