BioPython: collapsing nodes on tree if node in list.
9.2 years ago
atapee ▴ 10

I have a phylogenetic tree in Newick format, and I would like to remove from it every species that appears on a specific list.

This is the tree:

((((((((('Capra hircus', 'Ovis aries'), 'Pantholops hodgsonii'), ('Bubalus bubalis', ('Bison bison bison', ('Bos mutus', ('Bos primigenius', 'Bos taurus'))))), 'Moschus sp. RWM-2011'), ('Cervus nippon taiouanus', 'Muntiacus muntjak vaginalis')), 'Okapia johnstoni'), 'Antilocapra americana'), 'Tragulus napu'), ((((((((('Stenella coeruleoalba', 'Tursiops truncatus'), 'Lagenorhynchus acutus'), 'Orcinus orca'), ('Phocoena phocoena', 'Delphinapterus leucas')), (('Pontoporia blainvillei', 'Inia geoffrensis'), 'Lipotes vexillifer')), 'Mesoplodon bidens'), 'Platanista minor'), ('Physeter catodon', 'Kogia breviceps')), (((('Megaptera novaeangliae', 'Eschrichtius robustus'), ('Balaenoptera acutorostrata', 'Balaenoptera acutorostrata scammoni')), 'Caperea marginata'), ('Eubalaena glacialis', 'Eubalaena australis'))));

It looks like this:

                             ___ Capra hircus
                          __|
                      ___|  |___ Ovis aries
                     |   |
                   __|   |__ Pantholops hodgsonii
                  |  |
                  |  |    __ Bubalus bubalis
                  |  |___|
                  |      |   ___ Bison bison bison
                  |      |__|
               ___|         |    __ Bos mutus
              |   |         |___|
              |   |             |   ___ Bos primigenius
              |   |             |__|
            __|   |                |___ Bos taurus
           |  |   |
           |  |   |__ Moschus sp. RWM-2011
           |  |
        ___|  |    __ Cervus nippon taiouanus
       |   |  |___|
       |   |      |__ Muntiacus muntjak vaginalis
     __|   |
    |  |   |__ Okapia johnstoni
  __|  |
 |  |  |___ Antilocapra americana
 |  |
 |  |__ Tragulus napu
 |
 |                               __ Stenella coeruleoalba
 |                           ___|
 |                        __|   |__ Tursiops truncatus
 |                       |  |
 |                    ___|  |___ Lagenorhynchus acutus
 |                   |   |
 |                 __|   |__ Orcinus orca
 |                |  |
 |                |  |    __ Phocoena phocoena
 |                |  |___|
 |             ___|      |__ Delphinapterus leucas
_|            |   |
 |            |   |       __ Pontoporia blainvillei
 |            |   |   ___|
 |          __|   |__|   |__ Inia geoffrensis
 |         |  |      |
 |         |  |      |___ Lipotes vexillifer
 |      ___|  |
 |     |   |  |___ Mesoplodon bidens
 |     |   |
 |   __|   |__ Platanista minor
 |  |  |
 |  |  |    __ Physeter catodon
 |  |  |___|
 |  |      |__ Kogia breviceps
 |  |
 |  |              __ Megaptera novaeangliae
 |__|          ___|
    |         |   |__ Eschrichtius robustus
    |       __|
    |      |  |    __ Balaenoptera acutorostrata
    |   ___|  |___|
    |  |   |      |__ Balaenoptera acutorostrata scammoni
    |  |   |
    |__|   |__ Caperea marginata
       |
       |    __ Eubalaena glacialis
       |___|
           |__ Eubalaena australis

I've been calling tree.collapse for every species on the list, and it works as documented (description: "Deletes target from the tree, relinking its children to its parent"). The only problem is that it leaves empty clades behind in many of the places where it collapsed the tree:

                              ___ ____ ____ Ovis aries
                    ____ ____|
                   |         |    ____ Bubalus bubalis
                   |         |___|
                   |             |     ____ Bison bison bison
                   |             |____|
           ____ ___|                  |     ____ Bos mutus
          |        |                  |____|
          |        |                       |____ Clade #HERE
          |        |
      ____|        |     ____ Cervus nippon taiouanus
     |    |        |____|
     |    |             |____ Muntiacus muntjak vaginalis
  ___|    |
 |   |    |____ Antilocapra americana
 |   |
 |   |____ Tragulus napu
 |
 |                            ___ ____ ____ ____ Stenella coeruleoalba
 |                       ____|
_|                      |    |___ Clade #HERE
 |         ____ ___ ____|
 |        |             |         ____ Pontoporia blainvillei
 |    ____|             |____ ___|
 |   |    |                      |____ Inia geoffrensis
 |   |    |
 |   |    |____ ___ Kogia breviceps
 |___|
     |                   ____ Megaptera novaeangliae
     |              ____|
     |     ____ ___|    |____ Eschrichtius robustus
     |    |        |
     |____|        |____ Clade #HERE
          |
          |     ___ Eubalaena glacialis
          |____|
               |___ Eubalaena australis

And this is what the newick file looks like:

((((((((('Ovis aries':0.00000):0.00000):0.00000,('Bubalus bubalis':0.00000,('Bison bison bison':0.00000,('Bos mutus':0.00000,:0.00000):0.00000):0.00000):0.00000):0.00000):0.00000,('Cervus nippon taiouanus':0.00000,'Muntiacus muntjak vaginalis':0.00000):0.00000):0.00000):0.00000,'Antilocapra americana':0.00000):0.00000,'Tragulus napu':0.00000):0.00000,((((((((('Stenella coeruleoalba':0.00000):0.00000):0.00000):0.00000,:0.00000):0.00000,(('Pontoporia blainvillei':0.00000,'Inia geoffrensis':0.00000):0.00000):0.00000):0.00000):0.00000):0.00000,('Kogia breviceps':0.00000):0.00000):0.00000,(((('Megaptera novaeangliae':0.00000,'Eschrichtius robustus':0.00000):0.00000,:0.00000):0.00000):0.00000,('Eubalaena glacialis':0.00000,'Eubalaena australis':0.00000):0.00000):0.00000):0.00000):0.00000[&r];

The problem gets even worse with a larger tree. I would like to remove these empty clades, or prevent them from forming in the first place. I have also tried running tree.collapse(""), but that does not remove the empty clades, because it insists on having a keyword.

Also, how can I remove the distances (branch lengths) that are now present in the output?

This is the piece of code that does this:

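import copy
from Bio import Phylo

# listofnames holds the species names to remove (defined elsewhere)
tree = Phylo.read("C:/Users/.../tree.nwk", "newick")
newtree = copy.deepcopy(tree)  # work on a copy so the original tree stays intact
for name in listofnames:
    newtree.collapse(name)
Phylo.write(newtree, 'C:/Users/.../newtree.nwk', 'newick')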
BioPython Phyllo phylogenetics • 3.0k views
9.2 years ago
Ram 43k

Will newtree.prune(target='Clade') be useful in any way?
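
For context, here is a minimal sketch of how the two methods differ on a small piece of the tree from the question. As far as I understand Bio.Phylo, collapse only detaches the target from its parent, so a parent that loses all of its leaves stays behind as an unnamed clade, while prune also tidies up the parent. The NEWICK string and the leaf_names helper are made up for illustration and are not part of the original post.

```python
from io import StringIO

from Bio import Phylo

# Illustrative subset of the tree from the question (an assumption, not the full tree)
NEWICK = "(('Bos primigenius','Bos taurus'),'Bos mutus','Bison bison bison');"


def leaf_names(tree):
    """Return the terminal names; a leftover empty clade shows up as None."""
    return [leaf.name for leaf in tree.get_terminals()]


collapsed = Phylo.read(StringIO(NEWICK), "newick")
pruned = Phylo.read(StringIO(NEWICK), "newick")

for name in ("Bos primigenius", "Bos taurus"):
    collapsed.collapse(name)   # detaches the leaf, parent clade is left in place
    pruned.prune(target=name)  # detaches the leaf and cleans up the parent

print(leaf_names(collapsed))  # contains a None entry for the emptied clade
print(leaf_names(pruned))     # only named species remain
```

The None entry in the first list corresponds to the "Clade" placeholders in the drawing from the question.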

Well, not directly, but replacing

newtree.collapse(name)

with

newtree.prune(target=name)

fixed everything, thank you very much! :)
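
For anyone landing here later, a rough sketch of that fix which also covers the question about the distances. It assumes the Newick writer's plain=True option, which to my knowledge writes the tree without branch lengths or confidences. The helper names prune_species and to_plain_newick are hypothetical, and unlike a bare prune(target=name) this version silently skips names that are not in the tree instead of raising ValueError.

```python
from io import StringIO

from Bio import Phylo


def prune_species(tree, names_to_remove):
    """Prune every listed terminal from the tree in place and return the tree.

    Hypothetical helper: names that do not occur in the tree are ignored.
    """
    unwanted = set(names_to_remove)
    # Collect the matching leaves first so the tree is not modified while iterating
    targets = [leaf for leaf in tree.get_terminals() if leaf.name in unwanted]
    for leaf in targets:
        tree.prune(leaf)
    return tree


def to_plain_newick(tree):
    """Serialize the tree as Newick text without branch lengths (plain=True)."""
    handle = StringIO()
    Phylo.write(tree, handle, "newick", plain=True)
    return handle.getvalue().strip()


if __name__ == "__main__":
    # Small illustrative tree, not the one from the question
    demo = Phylo.read(
        StringIO("(('Bos primigenius':1.0,'Bos taurus':1.0):1.0,'Bos mutus':1.0);"),
        "newick",
    )
    prune_species(demo, ["Bos primigenius"])
    print(to_plain_newick(demo))
```

To write straight to a file, the same keyword can be passed along with a path: Phylo.write(newtree, 'C:/Users/.../newtree.nwk', 'newick', plain=True).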

Glad it helped :)
