Question: Hierarchical tree based on phylogenetic profile matrix (presence/absence per species)?
4.8 years ago by
a1ultima700
London
a1ultima700 wrote:

I have a matrix that represents presence/absence of some character (rows) in a list of species (columns), e.g.

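| character | species 1 | species 2 | species 3 |
|-----------|-----------|-----------|-----------|
| 1         | 0         | 0         | 1         |
| 2         | 0         | 1         | 1         |
| 3         | 0         | 1         | 0         |
| 4         | 0         | 0         | 0         |
| 5         | 0         | 0         | 0         |
| 6         | 0         | 0         | 0         |
| 7         | 0         | 1         | 0         |
| 8         | 0         | 1         | 0         |
| 9         | 0         | 0         | 1         |
| 10        | 0         | 0         | 0         |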

Is there a way that I can process this into a hierarchical tree such that similar rows group closer together?

In order of preference, I would hope that the solution comes in either:

• A python/R script
• A python/R package that I can make a script from
• Linux command-line software
• Webtool

4.8 years ago by
David W4.7k
New Zealand
David W4.7k wrote:

You can do it all in base R, using `dist` and `hclust`:

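```r
# Simulate a 10 x 10 presence/absence matrix (rows = characters, columns = species)
fake_pa <- t(replicate(10, rbinom(10, 1, 0.1)))

head(fake_pa)
# Example output (values vary between runs because the data are random):
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,]    1    0    0    0    0    0    1    0    0     0
#[2,]    1    0    0    0    0    0    0    0    1     0
#[3,]    0    0    1    0    0    0    0    0    0     0
#[4,]    0    0    0    1    0    0    0    0    0     1
#[5,]    0    0    0    1    0    0    0    0    0     0
#[6,]    0    0    0    0    0    0    0    0    1     0

# Manhattan distance between rows: for 0/1 data, the number of columns where two rows differ
dm <- dist(fake_pa, method="manhattan")

# Hierarchical clustering (complete linkage is the default), drawn as a dendrogram
plot(hclust(dm))
```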
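For the matrix in the question, the same steps could look like the sketch below. The object names (`pa`, `hc`, `groups`) and the column labels are made up for illustration, and the final `cutree` call is an optional extra that is not part of the recipe above: it simply groups rows whose profiles are identical.

```r
# Presence/absence matrix from the question: rows = characters, columns = species
pa <- matrix(
  c(0, 0, 1,
    0, 1, 1,
    0, 1, 0,
    0, 0, 0,
    0, 0, 0,
    0, 0, 0,
    0, 1, 0,
    0, 1, 0,
    0, 0, 1,
    0, 0, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(as.character(1:10),
                  c("species1", "species2", "species3"))
)

# Pairwise Manhattan distance between rows (characters)
dm <- dist(pa, method = "manhattan")

# Complete-linkage clustering; rows with similar profiles end up on nearby leaves
hc <- hclust(dm)
plot(hc)

# Optional: cut the tree at height 0 so that only identical profiles share a group
groups <- cutree(hc, h = 0)
```

To cluster the species instead of the characters, pass the transposed matrix, `dist(t(pa), method = "manhattan")`.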

As you may know, a lot has been written on the question of the "best" distance and clustering methods for binary data. You might want to check out some of the functions in the `ade4` package (for example `dist.binary`), which implements several alternative distance measures.
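To see why the choice of distance matters, here is a small sketch that stays in base R rather than using `ade4`. It compares Manhattan distance with `dist(method = "binary")`, which ignores positions where both rows are 0 (a Jaccard-type distance). The toy matrix and the choice of average linkage are assumptions made only for this illustration.

```r
# Toy presence/absence profiles (hypothetical data)
pa <- rbind(
  a = c(1, 1, 0, 0),
  b = c(1, 0, 0, 0),
  c = c(0, 0, 1, 1),
  d = c(0, 0, 0, 0)
)

# Manhattan: counts every mismatch, so shared absences make rows look similar
d_manhattan <- dist(pa, method = "manhattan")

# "binary": mismatches divided by the positions where at least one row is 1,
# so shared absences carry no weight
d_jaccard <- dist(pa, method = "binary")

print(as.matrix(d_manhattan))
print(as.matrix(d_jaccard))

# Cluster on the Jaccard-type distance, here with average linkage
hc_jaccard <- hclust(d_jaccard, method = "average")
plot(hc_jaccard)
```

In this toy case `b` and `d` are only 1 apart under Manhattan distance, but at the maximum distance of 1 under the binary measure, because they share no presences. Which behaviour is preferable depends on whether a shared absence is meaningful for your characters.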

that was brill thanks!
