Question: Clustering differences between heatmap.2 and pheatmap
Asked 3.7 years ago by igor (United States):

I have been using heatmap.2 for a while, but just discovered pheatmap. In heatmap.2, you can specify clustering settings via distfun and hclustfun. In pheatmap, you have clustering_distance_rows and clustering_method. However, if I set those parameters to use the same algorithms, the resulting heatmaps do not look similar. How can that be? Does pheatmap perform additional manipulations that heatmap.2 does not?

My code:

# pheatmap
library(pheatmap)
pheatmap(vals, color = colors, scale = "row", cluster_rows = TRUE, cluster_cols = TRUE, clustering_distance_rows = "euclidean", clustering_distance_cols = "euclidean", clustering_method = "complete")

# heatmap.2
library(gplots)
hclust_fun = function(x) hclust(x, method = "complete")
dist_fun = function(x) dist(x, method = "euclidean")
heatmap.2(as.matrix(vals), scale = "row", trace = "none", dendrogram = "both", Rowv = TRUE, Colv = TRUE, distfun = dist_fun, hclustfun = hclust_fun, col = colors)

I have the same problem!

Comment written 3.5 years ago by informatics bot
Answered 3.3 years ago by Lerong (United States):

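Basically, when you display scaled data, heatmap.2 scales the data after clustering, whereas pheatmap scales the data before clustering. So heatmap.2 builds its dendrograms from the raw values and pheatmap builds them from the row-scaled values, which I am guessing is what makes the final output differ in some cases.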
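To see why the order of scaling and clustering matters, here is a minimal base R sketch. The toy matrix, the +100 shift, and the variable names are made up for illustration, and the row scaling is written by hand as a z-score, which I assume matches what scale="row" does in both functions.

```r
# Toy data (invented for illustration): 8 rows, 5 columns.
set.seed(1)
vals <- matrix(rnorm(40), nrow = 8,
               dimnames = list(paste0("g", 1:8), paste0("s", 1:5)))
vals[1:4, ] <- vals[1:4, ] + 100   # four rows differ mainly in magnitude

# Row z-scores: centre each row and divide by its standard deviation.
scaled <- t(scale(t(vals)))

# Cluster the rows on raw values (cluster first, scale later for display).
hc_raw <- hclust(dist(vals, method = "euclidean"), method = "complete")

# Cluster the rows on scaled values (scale first, then cluster).
hc_scaled <- hclust(dist(scaled, method = "euclidean"), method = "complete")

# Compare the two-group cuts and the leaf orders of the two trees.
groups_raw <- cutree(hc_raw, k = 2)
groups_scaled <- cutree(hc_scaled, k = 2)
print(rbind(raw = groups_raw, scaled = groups_scaled))
print(rbind(raw = hc_raw$order, scaled = hc_scaled$order))
```

On the raw values the shifted rows form their own cluster purely because of their magnitude. After row scaling that offset is gone, so the tree is driven by the shape of each row's profile and will generally come out different.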


A related thread about scaling data and these two heatmap functions: "cannot replicate the pheatmap scale function"

Comment written 4 months ago by Kevin Blighe
Answered 3.5 years ago by informatics bot (United States):

heatmap.2 applies some reordering to the dendrogram that is not done by pheatmap. Here is an excerpt from the heatmap.2 manual:

"If either is a vector (of “weights”) then the appropriate dendrogram is reordered according to the supplied values subject to the constraints imposed by the dendrogram, by reorder(dd, Rowv), in the row case. If either is missing, as by default, then the ordering of the corresponding dendrogram is by the mean value of the rows/columns, i.e., in the case of rows, Rowv <- rowMeans(x, na.rm=na.rm). If either is NULL, no reordering will be done for the corresponding side."

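The reordering described in the excerpt can be reproduced in base R. This is only a sketch on invented data: it assumes the default behaviour amounts to reorder(dd, rowMeans(x)) as quoted, and that the unreordered hclust order is what a function without this step would plot.

```r
# Toy data (invented for illustration): 6 rows, 5 columns.
set.seed(2)
x <- matrix(rnorm(30), nrow = 6, dimnames = list(paste0("r", 1:6), NULL))

hc <- hclust(dist(x, method = "euclidean"), method = "complete")
dd <- as.dendrogram(hc)

# Reorder the branches by row means, as the manual excerpt describes.
dd_reordered <- reorder(dd, rowMeans(x))

order_plain <- order.dendrogram(dd)               # same as hc$order
order_reordered <- order.dendrogram(dd_reordered)
print(rbind(plain = order_plain, reordered = order_reordered))

# The tree itself is unchanged: only branches are flipped at their nodes,
# so the cophenetic distances between leaves stay the same.
lbl <- rownames(x)
coph_plain <- as.matrix(cophenetic(dd))[lbl, lbl]
coph_reordered <- as.matrix(cophenetic(dd_reordered))[lbl, lbl]
print(all.equal(coph_plain, coph_reordered))
```

So two heatmaps can show the same clustering and still list rows in a different sequence, because each node of a dendrogram can be flipped without changing which rows are grouped together.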

I decided to specify the clustering method for both rows and columns in heatmap.2:

# build the row and column trees up front and pass them in as dendrograms
distance.row = dist(as.matrix(vals), method = "euclidean")
cluster.row = hclust(distance.row, method = "ward.D")
distance.col = dist(t(as.matrix(vals)), method = "euclidean")
cluster.col = hclust(distance.col, method = "ward.D")
heatmap.2(as.matrix(vals), scale="row", trace="none", dendrogram="both", Rowv=as.dendrogram(cluster.row), Colv=as.dendrogram(cluster.col))

The order now is more similar to pheatmap, but not completely identical...
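One way to narrow the remaining gap might be to scale the rows yourself, build the trees once, and hand the same trees to both functions with scale="none". The sketch below is an assumption-laden illustration, not a verified recipe: shared_trees and plot_both are hypothetical helper names, the data is invented, and it relies on cluster_rows/cluster_cols in pheatmap accepting hclust objects and on heatmap.2 using a supplied dendrogram without reordering it.

```r
# Hypothetical helper: row-scale first, then build one tree per dimension.
shared_trees <- function(m, dist_method = "euclidean", hclust_method = "ward.D") {
  m <- as.matrix(m)
  scaled <- t(scale(t(m)))   # row z-scores, computed before clustering
  list(
    scaled = scaled,
    row = hclust(dist(scaled, method = dist_method), method = hclust_method),
    col = hclust(dist(t(scaled), method = dist_method), method = hclust_method)
  )
}

# Hypothetical helper: draw both heatmaps from the same scaled matrix and trees.
# Requires the pheatmap and gplots packages to be installed.
plot_both <- function(tr) {
  pheatmap::pheatmap(tr$scaled, scale = "none",
                     cluster_rows = tr$row, cluster_cols = tr$col)
  gplots::heatmap.2(tr$scaled, scale = "none", trace = "none",
                    dendrogram = "both",
                    Rowv = as.dendrogram(tr$row), Colv = as.dendrogram(tr$col))
}

# Toy data (invented for illustration).
set.seed(3)
demo <- matrix(rnorm(60), nrow = 10,
               dimnames = list(paste0("g", 1:10), paste0("s", 1:6)))
tr <- shared_trees(demo)
print(tr$row$order)
# plot_both(tr)   # uncomment if pheatmap and gplots are available
```

Even with identical trees, the rows may still appear in mirrored order between the two plots, since the two functions need not draw the leaves in the same direction.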

