Question: All-by-all phylogenetic distances for a massive set of NCBI taxon IDs
17 months ago by
bioguy50
bioguy50 wrote:

Any ideas on how to extract phylogenetic tree distances/dissimilarities for a massive set of NCBI taxonomy IDs, on the order of 50-100K? Ideally I'd be able to generate or download a file of the form:

Taxa1,Taxa2,Distance
Taxa1,Taxa3,Distance

......

This was the closest thing I could find (https://www.biostars.org/p/312148/), but it seems to require R, and I'm pretty sure R can't handle matrices at the scale I'm thinking of.

modified 17 months ago by i.sudbery7.8k • written 17 months ago by bioguy50

I think you can use the ETE3 toolkit to build a tree representation of NCBI taxids. I believe you can give it a hierarchy level (e.g. primates) and get all the taxa below it, then calculate all-vs-all inter-tip distances. I have some code that does this last step, but no idea how it will scale.
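
For what it's worth, here is a minimal sketch of the all-vs-all step that avoids building a matrix at all. It is not the code mentioned above, just one way it could look. It assumes "distance" means the number of edges between two taxa in the taxonomy (the NCBI taxonomy has no branch lengths), and that each taxid comes with a root-first lineage, which I believe ETE3's `NCBITaxa.get_lineage` returns. The function names and the toy IDs are made up for illustration.

```python
import csv
import io
from itertools import combinations


def edge_distance(lineage_a, lineage_b):
    """Number of edges on the path between two taxa, given root-first lineages."""
    shared = 0
    for x, y in zip(lineage_a, lineage_b):
        if x != y:
            break
        shared += 1
    # Steps up from each taxon to their deepest shared ancestor.
    return len(lineage_a) + len(lineage_b) - 2 * shared


def write_pairwise(lineages, handle):
    """Stream one 'Taxa1,Taxa2,Distance' row per unordered pair to handle."""
    writer = csv.writer(handle, lineterminator="\n")
    for a, b in combinations(sorted(lineages), 2):
        writer.writerow([a, b, edge_distance(lineages[a], lineages[b])])


# Toy lineages with made-up IDs (root = 1), not real NCBI taxids.
# With ETE3 installed, the real input would presumably be built like:
#   ncbi = NCBITaxa()
#   lineages = {t: ncbi.get_lineage(t) for t in taxids}
toy = {
    4: [1, 2, 4],
    5: [1, 2, 5],
    6: [1, 3, 6],
}

out = io.StringIO()
write_pairwise(toy, out)
print(out.getvalue(), end="")
```

Because rows are written one at a time, memory use stays at the size of the lineage table rather than the full matrix. The output itself is still large: 100K taxa give roughly 5 billion pairs, so in practice you would write to a file (or a compressed stream) rather than a `StringIO`.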

modified 17 months ago • written 17 months ago by Joe16k

ok cool, this helps a lot, thank you. Giving it a shot now, we'll see how it works out...

written 17 months ago by bioguy50