Extract monophyletic subtree with ETE3 toolkit
3.1 years ago

Hello,

I have a gene tree that contains a duplication, and I need to extract a monophyletic subtree from it to get orthologs. I have the following tree:

from ete3 import PhyloTree

t2 = PhyloTree("((RED_18455.t1:0.00147625,(YEL_2874.t1:0.0138986,YEL_23839.t1:0.0506878)n2:0.00563198)n1:0.000294125,BLU_7991.t1:0.00203445)n0;", format=1)
print(t2)
      /-RED_18455.t1
   /-|
  |  |   /-YEL_2874.t1
--|   \-|
  |      \-YEL_23839.t1
  |
   \-BLU_7991.t1
  
R = t2.get_midpoint_outgroup()
t2.set_outgroup(R)
print(t2)
  /-YEL_23839.t1
--|
  |   /-YEL_2874.t1
   \-|
     |   /-RED_18455.t1
      \-|
         \-BLU_7991.t1
  
t2.set_species_naming_function(lambda node: node.name.split("_")[0])
print(t2.get_species())

{'RED', 'BLU', 'YEL'}

for node in t2.split_by_dups():
    print(node)
--YEL_23839.t1

   /-YEL_2874.t1
--|
  |   /-RED_18455.t1
   \-|
      \-BLU_7991.t1
  

As you can see, there are two subtrees. How can I pick the second subtree, which has all three species represented exactly once? Or, more generally, how can I extract a single-copy orthogroup from this tree?

Thanks.

ETE3 python3 • 579 views
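
One possible approach, sketched under assumptions rather than offered as the definitive ETE3 way: keep only the subtrees whose leaves cover every species exactly once. The helpers `species_of` and `is_single_copy` below are hypothetical names introduced for illustration, not part of ETE3. To keep the example runnable without the library, it works on plain lists of leaf names copied from the `split_by_dups()` output above; the commented lines at the end show how the same filter might be applied to the actual subtrees.

```python
from collections import Counter

def species_of(leaf_name):
    # Same convention as the naming function in the question:
    # the species code is the prefix before the first underscore.
    return leaf_name.split("_")[0]

def is_single_copy(leaf_names, required_species):
    # True when every required species occurs exactly once
    # and no other species is present.
    counts = Counter(species_of(name) for name in leaf_names)
    return (set(counts) == set(required_species)
            and all(c == 1 for c in counts.values()))

# Leaf names of the two subtrees printed by split_by_dups() above.
subtrees = [
    ["YEL_23839.t1"],
    ["YEL_2874.t1", "RED_18455.t1", "BLU_7991.t1"],
]
all_species = {"RED", "BLU", "YEL"}

single_copy = [s for s in subtrees if is_single_copy(s, all_species)]
print(single_copy)
# [['YEL_2874.t1', 'RED_18455.t1', 'BLU_7991.t1']]

# With ETE3, assuming t2 is the rooted PhyloTree from the question:
# orthogroups = [st for st in t2.split_by_dups()
#                if is_single_copy(st.get_leaf_names(), t2.get_species())]
```

If full species coverage is too strict for other trees, the equality check on the species set could be relaxed to a subset check, so that subtrees with each species at most once are kept.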