Question: Why is an hclust plot not ultrametric with "average" agglomeration method?
23 months ago, BlastedBadger (Thule) wrote:

I created a gene distance matrix based on GO term similarity (using the R package GOSemSim).

I then run essentially the following, where gene_dist is my distance matrix:

plot(hclust(gene_dist, method="average"))

I don't understand why the tree displayed by plot.hclust is not ultrametric, even though "average" agglomeration is UPGMA. The distances derived from the GOSemSim::mgeneSim similarities (measure "Wang", combine "BMA", i.e. best-match average) are probably not Euclidean, but that shouldn't matter.

Also, if I use the ape package for plotting:

library(ape)
plot(as.phylo(hclust(gene_dist, method="average")))

the tree is displayed as ultrametric. Why? Is the difference just a display choice?
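One thing that may be worth checking here is the `hang` argument of plot.hclust, whose documented default is 0.1: leaves are then drawn a fraction of the plot height below the node where they merge, instead of at height 0. The sketch below uses made-up symmetric dissimilarities (`toy_dist` and the `g1`..`g7` labels are hypothetical, not the GOSemSim matrix) to compare the default plot with `hang = -1`, and to inspect the leaf heights after conversion with as.dendrogram.

```r
# Synthetic symmetric dissimilarities (made-up data, NOT the GOSemSim matrix)
set.seed(42)
n <- 7
m <- matrix(runif(n * n), n, n)
m <- (m + t(m)) / 2                     # force symmetry
diag(m) <- 0
dimnames(m) <- list(paste0("g", 1:n), paste0("g", 1:n))
toy_dist <- as.dist(m)

hc <- hclust(toy_dist, method = "average")

# Same tree, two display settings
op <- par(mfrow = c(1, 2))
plot(hc, main = "default (hang = 0.1)") # leaves hang just below their merge
plot(hc, hang = -1, main = "hang = -1") # all leaves drawn down to height 0
par(op)

# as.dendrogram() uses hang = -1 by default, so every leaf sits at height 0
dend <- as.dendrogram(hc)
leaf_heights <- function(node) {
  if (is.leaf(node)) {
    attr(node, "height")
  } else {
    unlist(lapply(unclass(node), leaf_heights))
  }
}
leaf_heights(dend)
```

If the two panels differ only in where the leaves end, the ragged look of the default plot would be a drawing convention rather than a property of the clustering.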


Here is a reproducible example:

library(GOSemSim)

examplelist <- c("ZNF575", "GALNT11", "GJC3", "POLRMT", "PKDCC", "COL18A1", "INS-IGF2", "IQSEC1", "CFC1", "OPA3")
hsGOex <- godata('org.Hs.eg.db', ont='BP', computeIC=FALSE, keytype='SYMBOL')
gene_sim_ex <- mgeneSim(genes=examplelist, semData=hsGOex, measure='Wang', combine='BMA')
isSymmetric(gene_sim_ex)
## [1] TRUE

gene_dist_ex <- as.dist(1 - gene_sim_ex)
labels(gene_dist_ex)
## [1] "GALNT11" "POLRMT"  "PKDCC"   "COL18A1" "IQSEC1"  "CFC1"    "OPA3"

plot(hclust(gene_dist_ex, method="average"))

library(ape)
dev.new()
plot(as.phylo(hclust(gene_dist_ex, method="average")))
axisPhylo()
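
Independently of any plotting function, whether the clustering itself is ultrametric can be tested numerically on the cophenetic distances, using the condition d(i,k) <= max(d(i,j), d(j,k)) for every triple of leaves. This is a minimal sketch on synthetic data; `toy_dist` and the helper `is_ultrametric` are hypothetical names introduced for illustration, and the same check could be run on gene_dist_ex.

```r
# Synthetic symmetric dissimilarities standing in for 1 - similarity
# (made-up data, NOT the GOSemSim matrix)
set.seed(1)
n <- 7
m <- matrix(runif(n * n), n, n)
m <- (m + t(m)) / 2                     # force symmetry
diag(m) <- 0
toy_dist <- as.dist(m)

hc <- hclust(toy_dist, method = "average")

# Cophenetic distance between two leaves = height at which they are first merged
coph <- as.matrix(cophenetic(hc))

# Hypothetical helper: three-point ultrametric condition on a distance matrix
is_ultrametric <- function(d, tol = 1e-12) {
  k <- nrow(d)
  for (a in seq_len(k)) for (b in seq_len(k)) for (cc in seq_len(k)) {
    if (d[a, cc] > max(d[a, b], d[b, cc]) + tol) return(FALSE)
  }
  TRUE
}

is_ultrametric(coph)       # ultrametricity of the tree implied by hclust
!is.unsorted(hc$height)    # merge heights never decrease (no inversions)
```

Because this check only looks at merge heights, it does not depend on how plot.hclust or ape choose to draw the leaves.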