Unable to parse Open Tree of Life Newick with ete3 (python)
2.1 years ago
liorglic ★ 1.4k

I am trying to parse a Newick file downloaded from the Open Tree of Life server using the ete3 python package:

from ete3 import Tree
tree = Tree('Vertebrata.tre', format=1)

and getting the following error:

raise NewickError('Broken newick structure at: %s' %chunk)
ete3.parser.newick.NewickError: Broken newick structure at: Malacothrix_typica_ott600700)'Malacothrix
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.

I also tried every other value of the 'format' option, but none of them solved the problem.
I've seen this mentioned in an old GitHub issue, but it was not very helpful.
Has anyone tried this, or can anyone help me figure it out? In case you want to try it yourself, the download link is here.

Thanks!

newick python ete3 • 895 views

As the error suggests, it sounds like your input file is broken. Typically with Newick this is because one or more brackets are missing. ETE3 can be quite fussy, especially if it's a multi-line Newick representation.
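If you want to test the missing-bracket theory before blaming the parser, a quick count is easy to do by hand. The sketch below is only an illustration: `check_newick_brackets` is a made-up helper, not part of ETE3, and it assumes labels are quoted with single quotes (with `''` as an escaped quote) and ignores `[...]` comments.

```python
def check_newick_brackets(newick: str) -> dict:
    """Count parentheses outside single-quoted labels (hypothetical helper)."""
    opens = closes = depth = min_depth = 0
    in_quote = False
    i, n = 0, len(newick)
    while i < n:
        ch = newick[i]
        if in_quote:
            if ch == "'":
                if i + 1 < n and newick[i + 1] == "'":
                    i += 1  # '' inside a quoted label is an escaped quote
                else:
                    in_quote = False
        elif ch == "'":
            in_quote = True
        elif ch == "(":
            opens += 1
            depth += 1
        elif ch == ")":
            closes += 1
            depth -= 1
            min_depth = min(min_depth, depth)  # negative means ')' came too early
        i += 1
    return {
        "opens": opens,
        "closes": closes,
        "unterminated_quote": in_quote,
        "balanced": depth == 0 and min_depth >= 0 and not in_quote,
    }


if __name__ == "__main__":
    # Toy strings, not taken from the Open Tree of Life file.
    print(check_newick_brackets("((A,B)'my clade (x)',C);"))
    print(check_newick_brackets("((A,B),C;"))
```

If this reports a balanced tree, the brackets are probably fine and the problem is more likely in how the labels are written.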

I'd hazard a guess that the workaround you found was less fussy about the input format and 'fixed' your file when you wrote out a new tree.

2.1 years ago
liorglic ★ 1.4k

I was able to work around this problem by opening the file with R phytools read.tree and then writing it back out with write.tree. Somehow this fixed the problem, but I still don't know what the problem was.
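One possible explanation, and this is a guess rather than something confirmed in this thread: the error stops right at a single quote after a closing bracket (`)'Malacothrix`), which looks like a quoted internal node label, and the round trip through R may have rewritten such labels. A pure-Python sketch of that kind of clean-up is below. `unquote_labels` is a hypothetical helper, and it assumes single-quoted labels with `''` as an escaped quote; it replaces whitespace and Newick punctuation inside the label with underscores.

```python
import re

# A single-quoted label; '' inside it stands for a literal quote.
_QUOTED = re.compile(r"'((?:[^']|'')*)'")
# Characters that carry meaning in Newick, plus whitespace and quotes.
_UNSAFE = re.compile(r"[\s()\[\]:;,']+")


def unquote_labels(newick: str) -> str:
    """Rewrite quoted labels as plain underscore-separated labels (sketch)."""

    def fix(match: re.Match) -> str:
        label = match.group(1).replace("''", "'")
        return _UNSAFE.sub("_", label).strip("_")

    return _QUOTED.sub(fix, newick)


if __name__ == "__main__":
    # Toy string, not taken from the Open Tree of Life file.
    raw = "((A_ott1,B_ott2)'Genus one',C)'my clade (x)';"
    print(unquote_labels(raw))
```

The cleaned string could then be handed to `Tree(..., format=1)` as in the question. The error message itself also points at the 'quoted_node_names' flag, which may be worth trying first; whether either route fixes this particular file is untested here.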
