Question: pheatmap vs. manual hclust with different results
3.5 years ago by
sdominguez0 wrote:

I am stuck on a problem with hierarchical clustering. I want to make a dendrogram and a heatmap using a correlation-based distance (d_mydata = dist(1 - cor(t(mydata)))) and ward.D2 as the clustering method.

The pheatmap package has a handy feature: it can plot the dendrogram on the left side of the heatmap to visualize the clusters.

The pipeline of my analysis would be this:

1. create the dendrogram
2. test how many clusters would be optimal (k)
3. extract the subjects in each cluster
4. create a heatmap

To my surprise, the dendrogram plotted in the heatmap is not the same as the one plotted before, even though the methods are the same.

So I decided to create a pheatmap coloured by the clusters previously assigned by cutree, to test whether the colours correspond to the clusters in the dendrogram.

This is my code:

Create test matrix

test = matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")

Create a dendrogram with this test matrix

dist_test <- dist(test)
hc <- hclust(dist_test, method = "ward.D2")


dend<-as.dendrogram(hc, check=F, nodePar=list(cex = .000007),leaflab="none", cex.main=3, axes=F, adjust=F)

clus2 <- as.factor(cutree(hc, k = 2))  # cut tree into 2 clusters
groups <- data.frame(clus2)
groups$id <- rownames(groups)


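# test is a matrix, so build a data frame with an id column instead of using test$id
test_df <- data.frame(id = rownames(test), test)
clusters <- merge(groups, test_df, by = "id")
rownames(clusters) <- clusters$id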

clusters$clus2 <- as.character(clusters$clus2)
clusters$clus2[clusters$clus2 == "1"] <- "cluster1"
clusters$clus2[clusters$clus2 == "2"] <- "cluster2"

plot(dend, main = "test", horiz = TRUE, leaflab = "none")

d_clusters <- dist(1 - cor(t(clusters[, 7:10])))
hc_cl <- hclust(d_clusters, method = "ward.D2")

annotation_col <- data.frame(Path = factor(colnames(clusters[3:12])))
rownames(annotation_col) <- colnames(clusters[3:12])

annotation_row <- data.frame(Group = factor(clusters$clus2))
rownames(annotation_row) <- rownames(clusters)

Specify colors

ann_colors <- list(
  Path = c(Test1 = "darkseagreen", Test2 = "lavenderblush2", Test3 = "lightcyan3",
           Test4 = "mediumpurple", Test5 = "red", Test6 = "blue", Test7 = "brown",
           Test8 = "pink", Test9 = "black", Test10 = "grey"),
  Group = c(cluster1 = "yellow", cluster2 = "blue")
)

library(RColorBrewer)
library(pheatmap)

cols <- colorRampPalette(brewer.pal(10, "RdYlBu"))(20)

pheatmap(clusters[, 3:12],
         color = rev(cols),
         scale = "column",
         kmeans_k = NA,
         show_rownames = FALSE,
         show_colnames = TRUE,
         main = "Heatmap CK14, CK5/6, GATA3 and FOXA1 n=492 SCALE",
         clustering_method = "ward.D2",
         cluster_rows = TRUE,
         cluster_cols = TRUE,
         clustering_distance_rows = "correlation",
         clustering_distance_cols = "correlation",
         annotation_row = annotation_row,
         annotation_col = annotation_col,
         annotation_colors = ann_colors)
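
The check described above, whether the cutree groups agree with the tree drawn in the heatmap, can also be done numerically by cross-tabulating the memberships from two trees. This is only a minimal sketch in base R on the same kind of simulated matrix. The names hc_euclid, hc_cor and agreement are illustrative, the fixed seed is my addition, and the second tree reflects my understanding of what scale = "column" plus a "correlation" row distance amounts to, not something confirmed in this thread:

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
test <- matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] <- test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] <- test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] <- test[15:20, seq(2, 10, 2)] + 4
colnames(test) <- paste("Test", 1:10, sep = "")
rownames(test) <- paste("Gene", 1:20, sep = "")

# Tree 1: Euclidean distance on the raw data, as in hclust(dist(test))
hc_euclid <- hclust(dist(test), method = "ward.D2")

# Tree 2: correlation dissimilarity between rows of the column-scaled data
hc_cor <- hclust(as.dist(1 - cor(t(scale(test)))), method = "ward.D2")

# Cut both trees into k = 2 groups and cross-tabulate the memberships
cl_euclid <- cutree(hc_euclid, k = 2)
cl_cor <- cutree(hc_cor, k = 2)
agreement <- table(euclidean = cl_euclid, correlation = cl_cor)
print(agreement)
```

If the two trees define the same two groups, all counts fall in two cells of the table (the cluster labels themselves may be swapped); counts spread over all four cells mean the groupings differ.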


You are not scaling your data when you call hclust(dist(data)), but in pheatmap you scale your data by column (scale = "column").

The pheatmap help page says it uses hclust, so I think the discrepancy comes from not giving both the same input. pheatmap also returns its clustering results (the tree_row and tree_col hclust objects), so: 1) check whether your clustering matches pheatmap's, and 2) make sure you scale your data the same way when clustering outside pheatmap.
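
Both suggestions can be sketched as follows. This is a hedged example, not a verified reproduction of pheatmap's internals: it assumes that pheatmap scales the matrix before clustering and that its "correlation" option corresponds to as.dist(1 - cor(t(x))) for rows. The names scaled, d_rows, hc_manual and ph are illustrative, and the pheatmap part only runs if the package is installed:

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
test <- matrix(rnorm(200), 20, 10,
               dimnames = list(paste0("Gene", 1:20), paste0("Test", 1:10)))
test[1:10, seq(1, 10, 2)] <- test[1:10, seq(1, 10, 2)] + 3

# Centre and scale each column, mirroring scale = "column"
scaled <- scale(test)

# Correlation dissimilarity between rows, then Ward clustering
d_rows <- as.dist(1 - cor(t(scaled)))
hc_manual <- hclust(d_rows, method = "ward.D2")

if (requireNamespace("pheatmap", quietly = TRUE)) {
  # silent = TRUE asks pheatmap not to draw the plot
  ph <- pheatmap::pheatmap(test,
                           scale = "column",
                           clustering_distance_rows = "correlation",
                           clustering_distance_cols = "correlation",
                           clustering_method = "ward.D2",
                           silent = TRUE)
  # Compare the merge heights of the manual row tree and pheatmap's row tree
  print(all.equal(hc_manual$height, ph$tree_row$height))
}
```

If the heights differ, the two trees were built from different inputs, and the scaling step or the distance definition is the first place to look.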

Reply written 3.5 years ago by morovatunc

I assume you have to replace

dist_test<-dist(test)

with something like

dist_test<-as.dist((1 - cor(test))/2)

to use correlation distance.
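
The distinction between dist() and as.dist() is easy to miss, so here is a small sketch of it. as.dist() reuses a dissimilarity matrix as it is, whereas dist() computes new Euclidean distances between the rows of whatever matrix it is given, so dist(1 - cor(...)) is not a correlation distance. The variable names are illustrative and the seed is my addition. Note also that cor(test) compares columns, so clustering the rows (genes) would need cor(t(test)):

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
test <- matrix(rnorm(200), 20, 10)
rownames(test) <- paste0("Gene", 1:20)

# Row-by-row correlation dissimilarity (20 x 20 matrix)
dissim <- 1 - cor(t(test))

d_as <- as.dist(dissim)    # keeps the dissimilarities as they are
d_dist <- dist(dissim)     # Euclidean distance between rows of dissim

# Same pair of genes, two different values
print(as.matrix(d_as)["Gene2", "Gene1"])
print(as.matrix(d_dist)["Gene2", "Gene1"])

# cor() works on columns: without t() the result describes the 10 tests
dim_cols <- dim(cor(test))      # 10 x 10
dim_rows <- dim(cor(t(test)))   # 20 x 20
```

Passing d_as or d_dist to hclust() will generally give different trees, which may be one reason a manual dendrogram does not match the one from pheatmap.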

Reply written 3.5 years ago by e.rempel
3.5 years ago by
United States
igor11k wrote:

I had a related issue before. You may find this thread helpful: Clustering differences between heatmap.2 and pheatmap
