Question: pheatmap vs. manual hclust with different results
sdominguez wrote (2.6 years ago):

I am stuck on a problem with hierarchical clustering. I want to make a dendrogram and a heatmap, using a correlation-based distance (d_mydata=dist(1-cor(t(mydata)))) and ward.D2 as the clustering method.

As an extra, the pheatmap package can plot the dendrogram on the left side of the heatmap to visualize the clusters.

The pipeline of my analysis would be this:

1. create the dendrogram
2. test how many clusters would be optimal (k)
3. extract the subjects in each cluster
4. create a heatmap

To my surprise, the dendrogram plotted in the heatmap is not the same as the one plotted before, even though the methods are the same.
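The four steps above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the toy matrix `mat`, the fixed seed, and `k = 2` are placeholders that are not part of the original analysis. The idea is to build the row tree once and hand that same `hclust` object to pheatmap through `cluster_rows`, so the heatmap cannot end up with a different dendrogram.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Toy data (hypothetical stand-in for the real matrix)
mat <- matrix(rnorm(200), 20, 10,
              dimnames = list(paste0("Gene", 1:20), paste0("Test", 1:10)))
mat[1:10, seq(1, 10, 2)] <- mat[1:10, seq(1, 10, 2)] + 3

# 1. Build the row dendrogram once, from a correlation dissimilarity
d_rows  <- as.dist(1 - cor(t(mat)))
hc_rows <- hclust(d_rows, method = "ward.D2")

# 2. Number of clusters (assumed here; choosing k is a separate question)
k <- 2

# 3. Extract the members of each cluster
clus    <- cutree(hc_rows, k = k)
members <- split(names(clus), clus)

# 4. Draw the heatmap with the very same tree (only if pheatmap is installed)
if (requireNamespace("pheatmap", quietly = TRUE)) {
  annotation_row <- data.frame(Group = factor(paste0("cluster", clus)),
                               row.names = names(clus))
  ph <- pheatmap::pheatmap(mat,
                           cluster_rows = hc_rows,      # reuse the tree from step 1
                           annotation_row = annotation_row,
                           silent = TRUE)               # drop this to draw the plot
}
```

Because the tree is computed outside pheatmap, any scaling or distance choice has to be applied before step 1; pheatmap's own `clustering_distance_rows` and `clustering_method` settings no longer affect the rows in this setup.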

So I decided to create a pheatmap coloured by the clusters previously assigned by cutree, and to test whether the colours correspond to the clusters in the dendrogram.

This is my code:

Create test matrix

test = matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")
test<-as.data.frame(test)

Create a dendrogram with this test matrix

dist_test<-dist(test)
hc=hclust(dist_test, method="ward.D2")

plot(hc)

dend<-as.dendrogram(hc, check=F, nodePar=list(cex = .000007),leaflab="none", cex.main=3, axes=F, adjust=F)

clus2 <- as.factor(cutree(hc, k=2)) # cut tree into 2 clusters
groups<-data.frame(clus2)
groups$id<-rownames(groups)

-----------DATAFRAME WITH mydata AND THE CLASSIFICATION OF CLUSTERS AS FACTORS---------------------

test$id<-rownames(test)
clusters<-merge(groups, test, by.x="id")
rownames(clusters)<-clusters$id

clusters$clus2<-as.character(clusters$clus2)
clusters$clus2[clusters$clus2=="1"]<-"cluster1"
clusters$clus2[clusters$clus2=="2"]<-"cluster2"

plot(dend, main = "test", horiz = TRUE, leaflab = "none")

d_clusters<-dist(1-cor(t(clusters[,7:10])))
hc_cl=hclust(d_clusters, method="ward.D2")

annotation_col = data.frame( Path = factor(colnames(clusters[3:12])) )
rownames(annotation_col) = colnames(clusters[3:12])

annotation_row = data.frame( Group = factor(clusters$clus2) )
rownames(annotation_row) = rownames(clusters)

Specify colors

ann_colors = list(
  Path = c(Test1="darkseagreen", Test2="lavenderblush2", Test3="lightcyan3", Test4="mediumpurple", Test5="red",
           Test6="blue", Test7="brown", Test8="pink", Test9="black", Test10="grey"),
  Group = c(cluster1="yellow", cluster2="blue")
)

library(RColorBrewer)
cols <- colorRampPalette(brewer.pal(10, "RdYlBu"))(20)
library(pheatmap)
pheatmap(clusters[ ,3:12],
         color = rev(cols),
         scale = "column",
         kmeans_k = NA,
         show_rownames = F,
         show_colnames = T,
         main = "Heatmap CK14, CK5/6, GATA3 and FOXA1 n=492 SCALE",
         clustering_method = "ward.D2",
         cluster_rows = TRUE,
         cluster_cols = TRUE,
         clustering_distance_rows = "correlation",
         clustering_distance_cols = "correlation",
         annotation_row = annotation_row,
         annotation_col = annotation_col,
         annotation_colors = ann_colors)

Tags: clustering, pheatmap, R, cluster

You are not scaling your data when you run hclust(dist(data)), but in pheatmap you scale your data by column (scale = "column"), don't you?

The pheatmap help page says it uses hclust, so I think the discrepancy comes from not giving both routes the same input. pheatmap also returns the clustering it computed (the tree_row and tree_col hclust objects in its return value), so check 1) whether your own clustering matches pheatmap's, and 2) that you also scale your data in the non-pheatmap route.
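A rough way to run that check is sketched below. It assumes, from reading the pheatmap source, that `scale = "column"` standardises each column before clustering and that `clustering_distance_rows = "correlation"` means `as.dist(1 - cor(t(mat)))`; the toy matrix `mat` and the seed are placeholders, not the original data.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Toy data (hypothetical stand-in for the real matrix)
mat <- matrix(rnorm(200), 20, 10,
              dimnames = list(paste0("Gene", 1:20), paste0("Test", 1:10)))
mat[1:10, seq(1, 10, 2)] <- mat[1:10, seq(1, 10, 2)] + 3

# Route 1: what the first dendrogram in the question does
# (Euclidean distance on the unscaled rows)
hc_raw <- hclust(dist(mat), method = "ward.D2")

# Route 2: the same input pheatmap is assumed to use
mat_scaled <- scale(mat)                         # centre and scale each column
d_manual   <- as.dist(1 - cor(t(mat_scaled)))    # correlation dissimilarity between rows
hc_manual  <- hclust(d_manual, method = "ward.D2")

# Do the two manual routes even agree with each other?
same_as_raw <- identical(hc_raw$merge, hc_manual$merge)
print(same_as_raw)

# Compare route 2 with pheatmap's own row tree (only if pheatmap is installed)
if (requireNamespace("pheatmap", quietly = TRUE)) {
  ph <- pheatmap::pheatmap(mat,
                           scale = "column",
                           clustering_method = "ward.D2",
                           clustering_distance_rows = "correlation",
                           clustering_distance_cols = "correlation",
                           silent = TRUE)
  print(all.equal(ph$tree_row$height, hc_manual$height))
  print(identical(ph$tree_row$order, hc_manual$order))
}
```

If the last two comparisons report agreement, the remaining difference between the dendrograms comes from the input (scaling and distance), not from hclust itself.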

(comment by morovatunc, written 2.6 years ago)

I assume you have to replace

dist_test<-dist(test)

with something like

dist_test<-as.dist((1 - cor(t(test)))/2)

to use correlation distance.
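The difference between `dist()` and `as.dist()` matters here, and a small sketch may make it concrete. `as.dist()` takes the dissimilarity matrix as the distances themselves, whereas `dist()` treats that matrix as data and computes Euclidean distances between its rows. The toy matrix `x` and the seed are assumptions for illustration only.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Toy data (hypothetical): 20 rows to cluster, 10 columns
x <- matrix(rnorm(200), 20, 10,
            dimnames = list(paste0("Gene", 1:20), paste0("Test", 1:10)))

diss <- 1 - cor(t(x))              # 20 x 20 correlation dissimilarity between rows

d_reinterpreted <- as.dist(diss)   # uses the entries of diss as the distances
d_recomputed    <- dist(diss)      # Euclidean distance between the rows of diss

# The two objects have the same size but hold different values
differs <- !isTRUE(all.equal(as.vector(d_reinterpreted), as.vector(d_recomputed)))
print(differs)

# Dividing the dissimilarity by 2 only rescales the tree heights
hc_full <- hclust(as.dist(diss),     method = "ward.D2")
hc_half <- hclust(as.dist(diss / 2), method = "ward.D2")
same_topology  <- identical(hc_full$merge, hc_half$merge)
heights_halved <- isTRUE(all.equal(hc_half$height, hc_full$height / 2))
print(c(same_topology, heights_halved))
```

So `dist(1 - cor(t(mydata)))` from the question is not the same distance as `as.dist(1 - cor(t(mydata)))`, while the division by 2 changes only the scale of the dendrogram, not which items are merged.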

(comment by e.rempel, written 2.6 years ago)
igor wrote (2.6 years ago):

I had a related issue before. You may find this thread helpful: Clustering differences between heatmap.2 and pheatmap
