Question: Construct Newick Tree from tab-delimited or csv file of phylogeny, preferably in Python
5.7 years ago by
weslfield90
European Union
weslfield90 wrote:

I am looking for a way, preferably a Pythonic one (packages are OK), to convert a tab-delimited or CSV file of hierarchical phylogenies into the classic Newick format for tree visualization.

Each line of the file has ['Phylum','Class','Order','Family','Genus','Species','Subspecies','gi'] as values, and I would like to create a Newick tree representation. Any help is greatly appreciated. Thanks!

newick tree python phylogeny • 5.5k views

Hello weslfield!

It appears that your post has been cross-posted to another site: http://stackoverflow.com/questions/26146623

This is typically not recommended as it runs the risk of annoying people in both communities.

Written 5.7 years ago by Pierre Lindenbaum (128k)
5.7 years ago by
Asaf7.6k
Israel
Asaf7.6k wrote:

You can use dendropy.

The easiest way I can think of (without thinking too much) is, for each line, to go from the phylum down to the gi and at each level either create a child node or select the node you have already created. Then you can export the tree in Newick format.
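
The create-or-select idea described above can be sketched in plain Python without any package. This is only an illustration and not DendroPy's API: the `Node` class, the `add_path` and `to_newick` helpers, and the sample rows are all hypothetical names and data made up for the example.

```python
class Node:
    """One taxon in the hierarchy; children are keyed by label."""

    def __init__(self, label=""):
        self.label = label
        self.children = {}  # label -> Node, kept in insertion order

    def child(self, label):
        # Select the child if it already exists, otherwise create it.
        if label not in self.children:
            self.children[label] = Node(label)
        return self.children[label]


def add_path(root, path):
    """Walk from the root down one row (phylum ... gi), creating nodes as needed."""
    node = root
    for label in path:
        node = node.child(label)


def to_newick(node):
    """Serialize a node and its descendants (without the trailing ';')."""
    if not node.children:
        return node.label
    inner = ",".join(to_newick(c) for c in node.children.values())
    return "(" + inner + ")" + node.label


# Invented sample rows, shortened to four levels for readability.
rows = [
    ["Chordata", "Mammalia", "Primates", "gi1"],
    ["Chordata", "Mammalia", "Rodentia", "gi2"],
    ["Arthropoda", "Insecta", "Diptera", "gi3"],
]
root = Node()  # unlabeled root so several phyla can share one tree
for row in rows:
    add_path(root, row)
newick = to_newick(root) + ";"
print(newick)
```

Using an unlabeled root means that files with more than one phylum still produce a single tree. Labels are written as-is here, so names containing spaces or punctuation would need quoting before a Newick parser accepts them.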


Thanks Asaf, the code below in my answer addressed the problem using the node approach. 

Thank you very much from the Technion :)

Written 5.7 years ago by weslfield (90)

Say hi to Ruti

Written 5.7 years ago by Asaf (7.6k)
5.7 years ago by
weslfield90
European Union
weslfield90 wrote:
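import csv
from collections import defaultdict
from pprint import pprint

def tree():
    # Autovivifying nested dict: looking up a missing key creates a subtree.
    return defaultdict(tree)

def tree_add(t, path):
    # Walk down one row (Phylum ... gi), creating nodes as needed.
    for node in path:
        t = t[node]

def pprint_tree(tree_instance):
    # Convert the defaultdicts to plain dicts for readable debug output.
    def dicts(t): return {k: dicts(t[k]) for k in t}
    pprint(dicts(tree_instance))

def csv_to_tree(infile):
    # infile is an open file or any iterable of CSV lines.
    # For tab-delimited input, also pass delimiter='\t' to csv.reader.
    t = tree()
    for row in csv.reader(infile, quotechar='\''):
        tree_add(t, row)
    return t

def tree_to_newick(root):
    # Serialize sibling nodes; a node's children come before its own label.
    items = []
    for k in root:  # dict.iterkeys() exists only in Python 2
        s = ''
        if len(root[k]) > 0:
            sub_tree = tree_to_newick(root[k])
            if sub_tree != '':
                s += '(' + sub_tree + ')'
        s += k
        items.append(s)
    return ','.join(items)

def csv_to_weightless_newick(infile):
    t = csv_to_tree(infile)
    #pprint_tree(t)
    # Wrap the top level in parentheses so several phyla share one root,
    # and end with ';' as the Newick format requires.
    return '(' + tree_to_newick(t) + ');'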
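
For a tab-delimited file, or for labels that contain spaces or Newick punctuation (species names, for example), the same nested-dictionary idea might be adapted as follows. This is a rough sketch under assumptions: `quote_label`, `add_row`, `dict_to_newick`, `rows_to_newick`, and the sample data are hypothetical, and the quoting follows the common Newick convention of single-quoting such labels and doubling any embedded single quote.

```python
import csv
import io
import re


def quote_label(label):
    """Single-quote labels containing whitespace or Newick punctuation."""
    if re.search(r"[\s()\[\]':;,]", label):
        return "'" + label.replace("'", "''") + "'"
    return label


def add_row(tree, row):
    """Insert one lineage into a nested dict, skipping empty cells."""
    node = tree
    for label in row:
        if label:
            node = node.setdefault(label, {})


def dict_to_newick(tree):
    """Serialize sibling nodes; children are written before their parent label."""
    parts = []
    for label, subtree in tree.items():
        children = dict_to_newick(subtree)
        prefix = "(" + children + ")" if children else ""
        parts.append(prefix + quote_label(label))
    return ",".join(parts)


def rows_to_newick(handle, delimiter="\t"):
    """Read delimited lineage rows and return one rooted Newick string."""
    tree = {}
    for row in csv.reader(handle, delimiter=delimiter):
        add_row(tree, row)
    return "(" + dict_to_newick(tree) + ");"


# Invented tab-delimited sample standing in for a real file handle.
sample = io.StringIO(
    "Chordata\tMammalia\tHomo sapiens\t1001\n"
    "Chordata\tMammalia\tMus musculus\t1002\n"
)
result = rows_to_newick(sample)
print(result)
```

Passing `delimiter=","` handles the CSV case. Skipping empty cells is a design choice for rows that lack a subspecies; whether that is appropriate depends on how the file encodes missing ranks.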