Question: All-by-all phylogenetic distances for a massive set of NCBI taxon IDs
7 weeks ago by
bioguy20
bioguy20 wrote:

Any ideas on how to extract phylogenetic tree distances or dissimilarities for a very large set of NCBI taxonomy IDs, around 50,000-100,000? Ideally I'd be able to generate or download a file of the form:

Taxa1,Taxa2,Distance
Taxa1,Taxa3,Distance

......

The closest thing I could find was https://www.biostars.org/p/312148/, but it seems to require R, and I'm fairly sure R can't handle matrices at the scale I have in mind.
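
To put rough numbers on that scale, here is a small Python sketch. It assumes a dense square matrix of 8-byte floats, which is an illustrative assumption rather than a statement about what any particular R package allocates; `pair_count` and `dense_matrix_gib` are hypothetical helper names.

```python
def pair_count(n: int) -> int:
    """Number of unordered pairs among n taxa: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def dense_matrix_gib(n: int, bytes_per_value: int = 8) -> float:
    """Approximate size in GiB of a dense n x n matrix (assumed 8-byte values)."""
    return n * n * bytes_per_value / 2**30


for n in (50_000, 100_000):
    print(f"{n:>7} taxa: {pair_count(n):,} pairs, "
          f"dense matrix ~{dense_matrix_gib(n):.1f} GiB")
```

For 100,000 taxa that is 4,999,950,000 unordered pairs, so a long-format Taxa1,Taxa2,Distance file is probably better streamed to disk row by row than built as an in-memory matrix.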

modified 7 weeks ago by i.sudbery3.8k • written 7 weeks ago by bioguy20

I think you can use the ETE3 toolkit to build a tree representation of NCBI taxids. I believe you can give it a taxonomic group (e.g. primates) and get all the taxa below it, then calculate all-vs-all inter-tip distances. I have some code that does this last step, but no idea how it will scale.
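
A minimal sketch of the all-vs-all step, using only the Python standard library. It assumes each taxon already has a root-to-taxon lineage of taxids (ETE3's `NCBITaxa().get_lineage(taxid)` could supply these, though that step is not shown here), and it counts edges through the lowest common ancestor. That is a topological distance on the taxonomy, not an evolutionary branch length. The names `lineage_distance`, `write_pairwise` and `toy_lineages` are hypothetical, and the lineages below are abbreviated toy data, not full NCBI lineages.

```python
import csv
import itertools
import sys


def lineage_distance(lin_a, lin_b):
    """Count edges between two taxa via their lowest common ancestor.

    Each lineage is a root-to-taxon list of taxids and is assumed to
    start at the same root.
    """
    shared = 0
    for x, y in zip(lin_a, lin_b):
        if x != y:
            break
        shared += 1
    return len(lin_a) + len(lin_b) - 2 * shared


def write_pairwise(lineages, handle):
    """Stream one Taxa1,Taxa2,Distance row per unordered pair; return row count."""
    writer = csv.writer(handle, lineterminator="\n")
    rows = 0
    for (a, lin_a), (b, lin_b) in itertools.combinations(lineages.items(), 2):
        writer.writerow([a, b, lineage_distance(lin_a, lin_b)])
        rows += 1
    return rows


# Abbreviated toy lineages for illustration only (real lineages are much longer).
toy_lineages = {
    9606: [1, 9443, 9606],   # human, under Primates (9443)
    9598: [1, 9443, 9598],   # chimpanzee, under Primates (9443)
    10090: [1, 10090],       # mouse, attached directly to the root here
}

write_pairwise(toy_lineages, sys.stdout)
```

Because rows are written as they are generated, memory use stays proportional to the stored lineages rather than to the number of pairs, although the run time still grows quadratically with the number of taxa.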

modified 7 weeks ago • written 7 weeks ago by jrj.healey10k

ok cool, this helps a lot, thank you. Giving it a shot now, we'll see how it works out...

written 7 weeks ago by bioguy20