Question: Saving individual clusters from heatmaps
krushnach80690 wrote:

I am doing gene-based clustering and I see sets of genes that cluster together. How can I extract those clusters from the heatmap for further analysis? Can I do it while I create the heatmap, in other words can I define a function in the heatmap code to extract the clusters, or do I have to do it manually through visual inspection?

Any suggestion or help would be highly appreciated.

R • 4.0k views
modified 12 months ago by Ron990 • written 2.7 years ago by krushnach80690

I edited the title of this post so that others can find it easily in the future.

What function are you using? There are many ways of making heatmaps in R. Depending on the function, the easiest way may be to run the same clustering function, with the same parameters, that your heatmap function uses internally; for example, heatmap() uses hclust() on dist() by default. Note that for hierarchical clustering you need to cut the tree to get discrete clusters.
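As a minimal sketch of that idea, the snippet below repeats the default clustering of heatmap() (Euclidean distance, complete linkage) outside the plotting call and then cuts the row tree. The toy matrix `mat` and the choice of `k = 3` are assumptions made for illustration only.

```r
set.seed(1)
# Toy expression matrix (hypothetical data): 20 genes (rows) x 6 samples (columns)
mat <- matrix(rnorm(120), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:6)))

# Same defaults as heatmap(): dist() gives Euclidean distance,
# hclust() uses complete linkage
row_tree <- hclust(dist(mat))

# Cut the row dendrogram into k = 3 clusters (k is an arbitrary choice here)
row_clusters <- cutree(row_tree, k = 3)

# Gene names per cluster, as a list with one element per cluster
genes_by_cluster <- split(names(row_clusters), row_clusters)
```

`genes_by_cluster[["1"]]` then holds the gene names of the first cluster, which can be written to a file or used to subset `mat`.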

EagleEye wrote:

1) Below is an example of a heatmap with three clusters, saving the entities from each cluster as a list (plain text file). Check the ComplexHeatmap documentation for more clustering options.

``````
library("ComplexHeatmap") ## For heatmap
library("circlize") ## For color options

## Creating heatmap with three clusters (See the ComplexHeatmap documentation for more options)
ht = Heatmap(mymatrix, km = 3, col = colorRamp2(c(min(mymatrix), 0, max(mymatrix)), c("green", "white", "red")))
ht = draw(ht)

## row_order(ht) is a list with one vector of row indices per cluster

# Saving row names of cluster one
c1 <- rownames(mymatrix)[row_order(ht)[[1]]]
write.table(c1, "c1_ids.list", sep = "\n", quote = FALSE, row.names = FALSE, col.names = FALSE)

# Saving row names of cluster two
c2 <- rownames(mymatrix)[row_order(ht)[[2]]]
write.table(c2, "c2_ids.list", sep = "\n", quote = FALSE, row.names = FALSE, col.names = FALSE)

# Saving row names of cluster three
c3 <- rownames(mymatrix)[row_order(ht)[[3]]]
write.table(c3, "c3_ids.list", sep = "\n", quote = FALSE, row.names = FALSE, col.names = FALSE)
``````

2) If you are using pheatmap, you can extract the rows in the same order as they appear in the heatmap. Check this post.

3) If you are using single-cell data, SC3 is the best option to consider (article).

Suggestion/Request: It will be good and easy for other users to find the answer if you change the topic similar to 'Saving individual clusters from heatmaps'.

I tried your code and it works perfectly fine with ComplexHeatmap. But when I try the same with pheatmap, this is basically what I am doing:

``````
out <- pheatmap(data,
                color = myColor,
                breaks = myBreaks,
                show_rownames = TRUE, cluster_cols = TRUE, cluster_rows = TRUE,
                cex = .5, clustering_distance_rows = "euclidean",
                clustering_distance_cols = "euclidean", clustering_method = "complete",
                border_color = NA)  # NA draws no cell borders

# Reorder the matrix to match the row and column order shown in the heatmap
res <- data[out$tree_row[["order"]], out$tree_col[["order"]]]
``````

When I `View(res)`, I am not getting the individual clusters that I got with ComplexHeatmap; instead I get the complete list.

With pheatmap you will not get individual clusters from this code; rather, you get the entities in the same order as they appear in the heatmap.
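One possible way to go from that ordering to discrete clusters is to cut the row dendrogram that pheatmap returns in `out$tree_row`, which is an ordinary `hclust` object. The sketch below assumes a toy matrix `data` and an arbitrary `k = 3`; neither comes from the original posts.

```r
library(pheatmap)

set.seed(1)
# Toy matrix (hypothetical data): 20 genes x 6 samples
data <- matrix(rnorm(120), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("s", 1:6)))

# silent = TRUE builds the heatmap object without drawing it
out <- pheatmap(data,
                clustering_distance_rows = "euclidean",
                clustering_method = "complete",
                silent = TRUE)

# Cut the row tree into 3 clusters (3 is an arbitrary choice)
row_clusters <- cutree(out$tree_row, k = 3)

# Cluster label plus values, in the same row order as the heatmap
res <- data.frame(cluster = row_clusters, data)[out$tree_row$order, ]
```

`split(res, res$cluster)` would then give one data frame per cluster.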

Would it help if I use the `kmeans_k =` argument for pheatmap?

@EagleEye, the code with ComplexHeatmap works fine and I am getting the clusters of genes. But can I get the genes in each cluster along with their values from the heatmap? When SeqMonk plots a heatmap and makes clusters, exporting a cluster gives the list of genes in that cluster as well as their values. Of course I can take the gene names from each cluster and map them back to the matrix, but if there is a way to extract the genes together with their values directly, that would be really good.
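A rough sketch of one way to do this: the row indices returned by `row_order()` can be used to subset the matrix itself rather than only its row names, so each cluster is written out with its values. The toy `mymatrix` and the output folder `out_dir` are assumptions for illustration.

```r
library(ComplexHeatmap)

set.seed(1)
# Toy matrix (hypothetical data): 20 genes x 6 samples
mymatrix <- matrix(rnorm(120), nrow = 20,
                   dimnames = list(paste0("gene", 1:20), paste0("s", 1:6)))

# Draw once so the k-means row clusters are fixed
ht <- draw(Heatmap(mymatrix, km = 3))

# One vector of row indices per cluster
clusters <- row_order(ht)

out_dir <- tempdir()  # assumption: change to your own output folder
for (i in seq_along(clusters)) {
  # Sub-matrix of cluster i: gene names as row names, values as columns
  sub <- mymatrix[clusters[[i]], , drop = FALSE]
  write.table(sub,
              file = file.path(out_dir, paste0("cluster", i, "_values.tsv")),
              sep = "\t", quote = FALSE, col.names = NA)
}
```

Each file then contains the genes of one cluster in heatmap order, with one column per sample.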

Alex Reynolds wrote:

If you're doing k-means clustering, where you choose the number of clusters k up front, you could do something like the following:

``````
output_dir_prefix <- "results"
# Values of k to try; k cannot exceed the number of distinct rows in `data`
kclusters <- c(2, 3, 4, 6, 8, 10, 15, 20, 25, 50, 100)
for (kcluster in kclusters) {
  print(kcluster)
  # One output folder per k, e.g. results/3
  dir.create(file.path(output_dir_prefix, kcluster), recursive = TRUE, showWarnings = FALSE)
  clustering <- kmeans(data, kcluster)
  for (i in seq(1, kcluster, 1)) {
    print(paste(kcluster, i, sep=":"))
    out_fn <- paste(output_dir_prefix, kcluster, paste(i, ".mtx", sep=""), sep="/")
    # Rows of `data` assigned to cluster i
    body <- data[clustering$cluster == i, , drop = FALSE]
    write.table(body, file=out_fn, quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE, append=TRUE)
  }
}
``````

This gives submatrices of results from running k-means clustering on your `data` over k clusters in `kclusters`.

You could use `matrix2png` on each clustering submatrix `i.mtx` file, in order to generate a heatmap visualization for that submatrix.

If you're doing hierarchical clustering, where the number of clusters is not fixed in advance, you could use `cutree` to cut the tree into segments at a specified tree height. You might visually inspect the clustering to decide on the height, or use some sensibly arbitrary heuristic. Examples of this are described in answers to another Biostars question.
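A small sketch of cutting by height rather than by a fixed number of clusters. The toy matrix `mat` and the heuristic of cutting at 60% of the maximum merge height are assumptions, not a recommendation.

```r
set.seed(1)
# Toy matrix (hypothetical data): 20 genes x 6 samples
mat <- matrix(rnorm(120), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:6)))

tree <- hclust(dist(mat), method = "complete")

# Arbitrary heuristic: cut at 60% of the tallest merge in the tree
h_cut <- 0.6 * max(tree$height)

# Inspect the dendrogram with the cut line drawn on it
plot(tree)
abline(h = h_cut, lty = 2)

# Cluster assignment per gene; the number of clusters follows from the height
clusters_h <- cutree(tree, h = h_cut)
table(clusters_h)
```

Lowering `h_cut` produces more, smaller clusters; raising it merges them.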

How are you defining these: `kclusters <- c(2, 3, 4, 6, 8, 10, 15, 20, 25, 50, 100)`?

I am a bit confused by your code; maybe it is a bit advanced for me. Would you explain it? What I understand so far is that you are defining the possible numbers of clusters, but after that I do not follow. If you could explain it, that would be really helpful and I would be glad.


The `for (kcluster in kclusters)` line loops through each of the values in `kclusters`.

Inside this loop, I apply k-means clustering on the rows of `data`, for some value k, which is just the variable `kcluster`: 2, 3, and so on up to 100 clusters. I run `kmeans(data, kcluster)` and store the clustering result in a variable called `clustering`.

The variable `clustering` stores an assignment of each row in `data` to one of k clusters. So when `kcluster` is `3`, for example, there are three clusters in `clustering` that I can access: `clustering$cluster == 1`, `clustering$cluster == 2` and `clustering$cluster == 3`.

The line `for (i in seq(1, kcluster, 1))` simply loops over each value from 1 to `kcluster` and stores that loop counter in a variable called `i`. The loop then uses `write.table` to write out the rows of `data` where `clustering$cluster == i`.
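To make the indexing concrete, here is a tiny illustration on made-up data. The matrix `data` and `centers = 3` are assumptions chosen only to keep the output small.

```r
set.seed(1)
# Toy matrix (hypothetical data): 12 genes x 5 samples
data <- matrix(rnorm(60), nrow = 12,
               dimnames = list(paste0("gene", 1:12), paste0("s", 1:5)))

clustering <- kmeans(data, centers = 3)

# One integer label (1, 2 or 3) per row of `data`
labels <- clustering$cluster

# Logical indexing: keep only the rows assigned to cluster 1
rows_in_1 <- data[labels == 1, , drop = FALSE]

# How many rows ended up in each cluster
cluster_sizes <- table(labels)
```

The inner loop in the answer does the same subsetting for every value of `i` and writes each result to its own file.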

Thank you very much for a very clear explanation.

Ron990 wrote:

To extract clusters based on columns (ComplexHeatmap):

``````
library(ComplexHeatmap)

mat = matrix(rnorm(80, 2), 8, 10)
mat = rbind(mat, matrix(rnorm(40,-2), 4, 10))
rownames(mat) = letters[1:12]
colnames(mat) = letters[1:10]
HM <- Heatmap(mat, km=3 , column_km = 3)
HM <- draw(HM)  # draw once so the k-means clusters are fixed

# Build a two-column table: column name and the cluster it belongs to
for (i in 1:length(column_order(HM))){
  if (i == 1) {
    clu <- t(t(colnames(mat)[column_order(HM)[[i]]]))
    out <- cbind(clu, paste("cluster", i, sep=""))
    colnames(out) <- c("GeneID", "Cluster")
  } else {
    clu <- t(t(colnames(mat)[column_order(HM)[[i]]]))
    clu <- cbind(clu, paste("cluster", i, sep=""))
    out <- rbind(out, clu)
  }
}
``````

The above example is similar to the row-based cluster extraction given here:

https://github.com/jokergoo/ComplexHeatmap/issues/136