Why is an hclust plot not ultrametric with "average" agglomeration method?
5.1 years ago
BlastedBadger ▴ 160

I created a gene distance matrix based on GO term similarity (using the R package GOSemSim).

I am then basically doing the following, where gene_dist is my distance matrix:

plot(hclust(gene_dist, method="average"))

I don't understand why the tree displayed by plot.hclust does not look ultrametric, even though "average" agglomeration is UPGMA, which should produce an ultrametric tree. The similarities from GOSemSim::mgeneSim (measure "Wang", combine "BMA", i.e. Best-Match Average) probably do not yield Euclidean distances, but that shouldn't matter.

Also, if I use the library ape for plotting:

library(ape)
plot(as.phylo(hclust(gene_dist, method="average")))

Then the tree is drawn as ultrametric. Why? Is it just a display choice?


Here is a reproducible example:

library(GOSemSim)

examplelist <- c("ZNF575", "GALNT11", "GJC3", "POLRMT", "PKDCC", "COL18A1", "INS-IGF2", "IQSEC1", "CFC1", "OPA3")
hsGOex <- godata('org.Hs.eg.db', ont='BP', computeIC=FALSE, keytype='SYMBOL')
gene_sim_ex <- mgeneSim(genes=examplelist, semData=hsGOex, measure='Wang', combine='BMA')
isSymmetric(gene_sim_ex)
## [1] TRUE

gene_dist_ex <- as.dist(1 - gene_sim_ex)
labels(gene_dist_ex)
## [1] "GALNT11" "POLRMT"  "PKDCC"   "COL18A1" "IQSEC1"  "CFC1"    "OPA3"

plot(hclust(gene_dist_ex, method="average"))

library(ape)
dev.new()
plot(as.phylo(hclust(gene_dist_ex, method="average")))
axisPhylo()
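
One way to test whether the difference is only a display choice is to check the tree itself rather than the picture. The following is a minimal sketch in base R, assuming a toy distance matrix in place of gene_dist_ex; toy_dist and is_ultrametric are hypothetical names introduced here, not part of GOSemSim or ape. It checks the ultrametric inequality on the cophenetic distances of the hclust result, then redraws the dendrogram with hang = -1, since plot.hclust defaults to hang = 0.1, which lets the leaf labels hang a fixed fraction below their merge height instead of reaching 0.

```r
# Toy stand-in for gene_dist_ex (assumption: any "dist" object behaves the same way)
set.seed(1)
toy_dist <- dist(matrix(rnorm(40), nrow = 10))
hc <- hclust(toy_dist, method = "average")

# Hypothetical helper: TRUE if d(i, j) <= max(d(i, k), d(k, j)) for all i, j, k
is_ultrametric <- function(d, tol = 1e-8) {
  m <- as.matrix(d)
  for (k in seq_len(nrow(m))) {
    # bound[i, j] = max(d(i, k), d(k, j))
    bound <- outer(m[, k], m[k, ], pmax)
    if (any(m > bound + tol)) return(FALSE)
  }
  TRUE
}

# Distances implied by the tree (heights at which pairs of leaves are merged)
tree_is_ultrametric <- is_ultrametric(cophenetic(hc))
input_is_ultrametric <- is_ultrametric(toy_dist)
print(tree_is_ultrametric)
print(input_is_ultrametric)

# Draw all leaves down to height 0 instead of hanging them below each merge
plot(hc, hang = -1)
```

If the first check returns TRUE for the real data as well, the tree stored in the hclust object is ultrametric and only the default drawing with hanging leaves makes it look otherwise.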
hierarchical clustering R