Question: All-by-all phylogenetic distances for a massive set of NCBI taxon IDs
bioguy wrote:

Any ideas on how to extract pairwise phylogenetic tree distances (or dissimilarities) for a massive set of NCBI taxonomy IDs, on the order of 50-100K? Ideally I'd be able to generate or download a file of the form:

Taxa1,Taxa2,Distance
Taxa1,Taxa3,Distance
...

The closest thing I could find was https://www.biostars.org/p/312148/, but it seems to require R, and I'm pretty sure R can't handle matrices at the scale I have in mind.
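For a sense of scale, here is a rough back-of-the-envelope count of the unordered pairs involved. The helper name `n_pairs` and the figure of 8 bytes per stored distance (one double-precision float) are assumptions made for illustration only.

```python
def n_pairs(n):
    """Number of unordered taxon pairs among n taxa: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (50_000, 100_000):
    pairs = n_pairs(n)
    # Assumption: each distance is stored as one 8-byte float.
    print(f"{n} taxa -> {pairs} pairs, ~{pairs * 8 / 1e9:.0f} GB at 8 bytes per distance")
```

Under those assumptions the pair list alone is roughly 10 GB for 50K taxa and 40 GB for 100K, so writing pairs to disk as they are computed is likely more practical than holding a full matrix in memory.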

modified 6 months ago by i.sudbery • written 6 months ago by bioguy

I think you can use the ETE3 toolkit to build a tree of NCBI taxids. I believe you can give it a taxonomic group (e.g. primates) and get all the taxa below it. You can then calculate all-vs-all inter-tip distances. I have some code that does this last step, but I have no idea how it will scale.
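As a rough illustration of the idea (not the code mentioned above), here is a minimal sketch that streams `Taxa1,Taxa2,Distance` rows without building a matrix. It assumes "distance" means the number of edges between two taxa in the NCBI taxonomy hierarchy, since the taxonomy carries no branch lengths. The function names `fetch_lineages`, `topological_distance` and `write_pairwise` are hypothetical, and the toy lineages at the bottom are made up. The only ETE3 call used is `NCBITaxa.get_lineage`, which returns the root-to-taxon list of taxids; creating `NCBITaxa()` for the first time downloads the NCBI taxonomy dump.

```python
import csv
import io
from itertools import combinations


def fetch_lineages(taxids):
    """Root-to-taxon lineages from ETE3's local copy of the NCBI taxonomy."""
    # Imported lazily: assumes ete3 is installed and the taxonomy database is available.
    from ete3 import NCBITaxa
    ncbi = NCBITaxa()
    return {t: ncbi.get_lineage(t) for t in taxids}


def topological_distance(lin_a, lin_b):
    """Edge count between two taxa, given their root-to-taxon lineages."""
    shared = 0
    for x, y in zip(lin_a, lin_b):
        if x != y:
            break
        shared += 1
    # Steps from each taxon up to the deepest shared ancestor.
    return (len(lin_a) - shared) + (len(lin_b) - shared)


def write_pairwise(lineages, out):
    """Stream one CSV row per unordered pair, so no matrix is held in memory."""
    writer = csv.writer(out)
    for a, b in combinations(sorted(lineages), 2):
        writer.writerow([a, b, topological_distance(lineages[a], lineages[b])])


# Toy lineages with made-up IDs (not real NCBI taxids), just to show the output shape.
toy = {10: [1, 2, 10], 11: [1, 2, 11], 20: [1, 3, 4, 20]}
buf = io.StringIO()
write_pairwise(toy, buf)
print(buf.getvalue())
```

For real data, something like `write_pairwise(fetch_lineages(my_taxids), open("distances.csv", "w", newline=""))` would be the intended use, where `my_taxids` is your own list. The pairwise loop is still quadratic, so for 50-100K taxa it may need to be split into chunks or parallelised.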

modified 6 months ago • written 6 months ago by jrj.healey

OK cool, this helps a lot, thank you. Giving it a shot now; we'll see how it works out.

written 6 months ago by bioguy