Question: Extract Tree Topology From NCBI Taxonomy Database
Lhl wrote:

Hi There,

I am wondering whether I can extract a tree for a given list of species from the NCBI Taxonomy database using BioPerl.

I hope some of you can help me figure this out.

Many thanks in advance.

Kind Regards,

Lhl

taxonomy bioperl • 2.7k views
modified 12 months ago by peterlageweg603 • written 7.6 years ago by Lhl

Not a BioPerl solution, but this Python script will do the job: https://github.com/jhcepas/ncbi_taxonomy . I wrote some more information at http://jhcepas.cgenomics.org/?p=216

modified 7.6 years ago • written 7.6 years ago by jhc
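
For readers wondering what "extracting a tree" from a taxonomy involves, the idea can be sketched without any library: every species has a lineage (a root-to-leaf path), merging the lineages gives a tree, and internal nodes with a single child can then be skipped. This is only an illustration under stated assumptions, not the code of the linked script. The function names `merge_lineages` and `to_newick` and the placeholder labels (`root`, `cladeA`, `sp1`, ...) are invented for the example; real lineages would have to come from the NCBI taxonomy dump.

```python
def merge_lineages(lineages):
    """Merge root-to-leaf label lists into a nested dict {label: children}."""
    tree = {}
    for lineage in lineages:
        node = tree
        for label in lineage:
            node = node.setdefault(label, {})
    return tree


def to_newick(label, children):
    """Serialize one node, skipping internal nodes that have a single child."""
    while len(children) == 1:
        (label, children), = children.items()
    if not children:
        return label
    inner = ",".join(to_newick(name, sub) for name, sub in children.items())
    return f"({inner}){label}"


# Placeholder lineages (made-up labels, not real taxonomy records).
lineages = [
    ["root", "cladeA", "cladeB", "sp1"],
    ["root", "cladeA", "cladeB", "sp2"],
    ["root", "cladeA", "cladeC", "sp3"],
]

newick = to_newick("", merge_lineages(lineages)) + ";"
print(newick)
```

One limitation of this sketch: if a requested taxon is itself the only ancestor on the path to another requested taxon, it is collapsed away like any other single-child node.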

Something like "Last Common Ancestor From Ncbi Taxonomy Using Java"?

written 7.6 years ago by Pierre Lindenbaum

Not exactly, but I think it is useful. I might use it in the future. Thanks a lot.

Lhl

written 7.6 years ago by Lhl

Yes. This is exactly what I want.

Many thanks.

Lhl

written 7.6 years ago by Lhl

I'm trying this method as well.

from ete3 import NCBITaxa
ncbi = NCBITaxa()

tree = ncbi.get_topology([9606, 9598, 10090, 7707, 8782])
print(tree.get_ascii(attributes=["sci_name", "rank"]))

Can someone tell me how I can convert the output to a Newick file?

written 12 months ago by peterlageweg603

a.zielezinski wrote:
from ete3 import NCBITaxa

ncbi = NCBITaxa()

tree = ncbi.get_topology([9606, 9598, 10090, 7707, 8782])
print(tree.write(format=9, features=["sci_name", "rank"]))
written 12 months ago by a.zielezinski

Thanks! This works as well. Is there a way to replace the TaxIDs with the scientific names?

written 12 months ago by peterlageweg603

SMK wrote:
from ete3 import NCBITaxa
ncbi = NCBITaxa()

tree = ncbi.get_topology([9606, 9598, 10090, 7707, 8782])
tree.write(features=["sci_name", "rank"], outfile="tree.nw")

gives you an output tree.nw:

(7707:1[&&NHX:sci_name=Dendrochirotida:rank=order],(((9606:1[&&NHX:sci_name=Homo sapiens:rank=species],9598:1[&&NHX:sci_name=Pan troglodytes:rank=species])1:1[&&NHX:sci_name=Homininae:rank=subfamily],10090:1[&&NHX:sci_name=Mus musculus:rank=species])1:1[&&NHX:sci_name=Euarchontoglires:rank=superorder],8782:1[&&NHX:sci_name=Aves:rank=class])1:1[&&NHX:sci_name=Amniota:rank=no rank]);
written 12 months ago by SMK
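
The NHX comments in tree.nw carry the scientific name and rank of each node, so a TaxID-to-name lookup for the leaves can also be pulled straight from the text. The following is a rough sketch rather than an ete3 feature: `leaf_annotations` is a hypothetical helper, and the regular expression assumes exactly the layout shown above (`sci_name` first, then `rank`, and no colons or brackets inside the names).

```python
import re

# Leaf pattern: a numeric TaxID right after "(" or ",", a branch length,
# then the NHX comment with sci_name and rank (the layout shown above).
# Internal nodes follow a ")" and are therefore not matched.
LEAF = re.compile(
    r"(?<=[(,])(\d+):[\d.]+\[&&NHX:sci_name=([^:\]]+):rank=([^:\]]+)\]"
)


def leaf_annotations(newick):
    """Return {taxid: (sci_name, rank)} for the leaves of an NHX string."""
    return {int(t): (name, rank) for t, name, rank in LEAF.findall(newick)}


# The tree.nw content from the post above, split across lines for readability.
nhx = (
    "(7707:1[&&NHX:sci_name=Dendrochirotida:rank=order],"
    "(((9606:1[&&NHX:sci_name=Homo sapiens:rank=species],"
    "9598:1[&&NHX:sci_name=Pan troglodytes:rank=species])"
    "1:1[&&NHX:sci_name=Homininae:rank=subfamily],"
    "10090:1[&&NHX:sci_name=Mus musculus:rank=species])"
    "1:1[&&NHX:sci_name=Euarchontoglires:rank=superorder],"
    "8782:1[&&NHX:sci_name=Aves:rank=class])"
    "1:1[&&NHX:sci_name=Amniota:rank=no rank]);"
)

names = leaf_annotations(nhx)
print(names[9606])
```

For anything beyond a quick lookup, a real Newick parser is the safer choice, since a regular expression like this breaks as soon as the feature order or the label format changes.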

Thanks for the fast response! Works like a charm :)

written 12 months ago by peterlageweg603

Is there a way to replace the TaxIDs with the scientific names?

written 12 months ago by peterlageweg603

A handy way would be:

ete3 ncbiquery --tree --search 9606 9598 10090 7707 8782

You'll get a tree in Newick format and scientific names as node names:

[image: ete3_ncbiquery_tree]

modified 12 months ago • written 12 months ago by SMK

You can do:

for node in tree.traverse():
    node.name = node.sci_name
written 12 months ago by SMK

Thanks SMK. I tried this but it does not work.

tree.write(features=["sci_name", "rank"], outfile="tree.nw")
t_file = Tree("tree.nw")

for node in t_file.traverse():
    node.name = node.sci_name

tree.write(features=["node.name"], outfile="labeled_tree.nw")

AttributeError: 'TreeNode' object has no attribute 'sci_name'

modified 12 months ago • written 12 months ago by peterlageweg603
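
The traceback says that at least one node visited by the loop has no `sci_name` attribute. A defensive variant of the loop, sketched here on the assumption that keeping the existing name is an acceptable fallback, uses `getattr` with a default. `SimpleNamespace` objects stand in for tree nodes so that the snippet runs without ete3 or the taxonomy database, and `relabel` is a hypothetical helper, not part of ete3.

```python
from types import SimpleNamespace


def relabel(nodes):
    """Copy sci_name into name, keeping the old name where sci_name is absent."""
    for node in nodes:
        node.name = getattr(node, "sci_name", node.name)


# Stand-in nodes (hypothetical); with ete3 this would be tree.traverse().
nodes = [
    SimpleNamespace(name="9606", sci_name="Homo sapiens"),
    SimpleNamespace(name="10090", sci_name="Mus musculus"),
    SimpleNamespace(name=""),  # a node without the sci_name attribute
]

relabel(nodes)
print([node.name for node in nodes])
```

The fallback only hides the symptom, so it is still worth checking which nodes lack the attribute.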

Hi peterlageweg603,

You have to traverse the one that you got from ncbi.get_topology():

from ete3 import NCBITaxa
ncbi = NCBITaxa()

tree = ncbi.get_topology([9606, 9598, 10090, 7707, 8782])

for node in tree.traverse():
    node.name = node.sci_name

print(tree.write())

This gives you:

(Dendrochirotida:1,(((Homo sapiens:1,Pan troglodytes:1)1:1,Mus musculus:1)1:1,Aves:1)1:1);
modified 12 months ago • written 12 months ago by SMK
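
One caveat for anyone passing this output to other tools: the labels now contain spaces ("Homo sapiens"), and the Newick convention is that unquoted labels contain no blanks, with underscores standing in for them. A small sketch of the usual workaround follows; `newick_safe` is a hypothetical helper, and the blanket replacement assumes the string contains no spaces other than those inside labels. With ete3 the same effect could be had by assigning `node.sci_name.replace(" ", "_")` in the loop above.

```python
def newick_safe(newick):
    """Replace spaces with underscores so labels stay valid when unquoted."""
    return newick.replace(" ", "_")


# The labeled tree printed in the post above.
labeled = (
    "(Dendrochirotida:1,(((Homo sapiens:1,Pan troglodytes:1)1:1,"
    "Mus musculus:1)1:1,Aves:1)1:1);"
)

safe = newick_safe(labeled)
print(safe)
```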

Ah, thank you for the fast response and the help! This is exactly what I needed :)

modified 12 months ago • written 12 months ago by peterlageweg603

briano wrote:

Lhl,

The BioPerl Bio::Tree::TreeFunctionsI module has a get_lca method. Is this what you're looking for?

Brian O.

written 7.6 years ago by briano

Hi Briano,

I want to get a species tree for a given list of species. Given a list containing species A, species B, and species C, I want to know how to get the tree for those species. Thank you anyway. I think I might need Bio::Tree::Tree and get_lca in the future.

Cheers

Lhl

written 7.6 years ago by Lhl