All-by-all phylogenetic distances for a massive set of NCBI taxon IDs
3.6 years ago
bioguy ▴ 50

Any ideas on how to extract phylogenetic tree distances/dissimilarities for a massive set of NCBI taxonomy IDs, on the order of 50-100K? Ideally I'd be able to generate or download a file of the form:

Taxa1,Taxa2,Distance
Taxa1,Taxa3,Distance
...

The closest thing I could find was this thread (https://www.biostars.org/p/312148/), but it seems to require doing it in R, and I'm pretty sure R can't handle matrices at the scale I have in mind.

phylogeny microbiology taxonomy ncbi taxon id • 1.3k views

I think you can use the ETE3 toolkit to generate a tree representation of NCBI taxids and then calculate all-vs-all inter-tip distances. I believe you can give it a hierarchy level (e.g. primates) and get all the taxa below it. I have some code that can do the distance step, but no idea how it'll scale.
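
For illustration, here is a minimal sketch of the inter-tip distance step in plain Python. It is not the code mentioned above and it does not call ETE3. It assumes you already have a child-to-parent map of taxids (for example parsed from NCBI's `nodes.dmp`, or built from ETE3 lineages), and it takes "distance" to be the number of edges between two taxa, since the NCBI taxonomy carries no branch lengths. The toy `parent` map, its taxids, and the function names are all made up for the example.

```python
import csv
import io
from itertools import combinations


def lineage(taxid, parent):
    """Return the path from taxid up to the root, with taxid first."""
    path = [taxid]
    # Stop at a node with no parent or one that is its own parent (root).
    while parent.get(path[-1]) not in (None, path[-1]):
        path.append(parent[path[-1]])
    return path


def tip_distance(lineage_a, steps_from_b):
    """Count edges from a to b through their lowest common ancestor."""
    for steps_a, node in enumerate(lineage_a):
        if node in steps_from_b:
            return steps_a + steps_from_b[node]
    raise ValueError("taxa share no common ancestor")


def write_pairwise(taxids, parent, out):
    """Stream 'Taxa1,Taxa2,Distance' rows without building a full matrix."""
    lineages = {t: lineage(t, parent) for t in taxids}
    # For each taxon, map every ancestor to its number of steps above it.
    steps = {t: {n: i for i, n in enumerate(l)} for t, l in lineages.items()}
    writer = csv.writer(out)
    for a, b in combinations(taxids, 2):
        writer.writerow([a, b, tip_distance(lineages[a], steps[b])])


# Hypothetical toy taxonomy: 1 is the root, 4 and 5 are siblings under 2,
# and 6 sits under 3.
parent = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3}

buffer = io.StringIO()
write_pairwise([4, 5, 6], parent, buffer)
rows = buffer.getvalue().splitlines()
print(rows)
```

Writing rows as they are computed keeps memory use proportional to the number of taxa rather than the number of pairs. The output itself will still be large: 100K taxa give roughly 5 billion pairs, so in practice you would probably want to write to a compressed file or keep only pairs below some distance cutoff.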


OK, cool, this helps a lot, thank you. Giving it a shot now; we'll see how it works out...