Hi everyone, I have a question about ComplexHeatmap. I am aware of the post about reordering, "Complex Heatmap: Changing order of clusters", but it does not solve my problem.
I have a data frame with four columns (genes 1-4) and many rows, one per protein. The cells hold values 0-3, which encode the 2 different methods used in the experiments: 0 means nothing was found, 1 and 2 each stand for one of the 2 methods, and 3 means it was found using both methods.
I am looking for a heatmap that can
a) be split into 4 slices according to the number of genes in which a protein is found (value >0)
b) be sorted so that the proteins found in all 4 genes (columns) are on top, followed by those found in 3, then 2, then 1. There are no proteins found in 0 genes.
c) show by color which method it was found with (values 0-3)
d) contain a dendrogram with clustering inside each of these slices, to show the clusters within them.
Now, since it's a complex problem, ComplexHeatmap should be appropriate.
What I am doing so far:
My initial dataframe looks like this, just way more rows:
    a b c d
    3 2 2 2
    3 2 2 2
    3 0 0 0
    1 0 3 3
To split the heatmap into 4 sections, I convert the data to a matrix that contains 1 wherever the value is >0 and 0 otherwise, then use rowSums to count in how many genes each protein was found. I save a copy as plot_matrix_frame, since I will need the original data later. (My data originally had an additional first column with names, which is not shown in the example data; I drop it with [,-1].)
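    # copy of the original values (0-3), used later for plotting
    plot_matrix_frame <- matrix_frame[,-1]
    # TRUE for values 2 and 3; values 0 and 1 are already 0/1
    bol_matrix <- matrix_frame[,-1] > 1
    red_matrix_frame <- matrix_frame[,-1]
    red_matrix_frame[bol_matrix] <- 1
    class(red_matrix_frame) <- "numeric"
    # append the number of genes in which each protein was found
    red_matrix_frame <- cbind(red_matrix_frame, rowSums(red_matrix_frame))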
The resulting matrix, converted to numeric, looks like this. The last column holds the row sums.
    a b c d V5
    1 1 1 1  4
    1 1 1 1  4
    1 0 0 0  1
    1 0 1 1  3
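The same binarise-and-count step can be reproduced on the four example rows. This is only a minimal sketch: it assumes a plain numeric matrix without the name column, and `example_mat`, `found` and `n_genes` are names made up for illustration.

```r
# The four example rows from above, as a numeric matrix (assumed layout).
example_mat <- matrix(
  c(3, 2, 2, 2,
    3, 2, 2, 2,
    3, 0, 0, 0,
    1, 0, 3, 3),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c", "d"))
)

# 1 where a value is > 0, otherwise 0.
found <- (example_mat > 0) * 1

# Number of genes (columns) in which each protein (row) was found.
n_genes <- rowSums(found)

print(cbind(found, n_genes))
```

The `n_genes` column should match the last column of the matrix shown above.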
I then use the dendsort package to get a better ordering of the clusters, with hclust method ward.D2, since kmeans only gave very poor clustering. The clustering is computed on red_matrix_frame (the matrix with 1 and 0) and passed to the Heatmap function, which is called on the original data (plot_matrix_frame). I use the split argument to get 4 sections.
    dend = dendsort(hclust(dist(red_matrix_frame), method="ward.D2"))
    default.hmap <- Heatmap(plot_matrix_frame[,c(1,2,3,4)], split=4,
        column_names_side="top",
        column_order=c("a","b","c","d"),
        cluster_row_slices=FALSE,
        #heatmap_legend_param=list(
        #    labels=c("none","Bi","C","Both")),
        cluster_rows=dend,
        col=c("white","blue","green","red"))
This leaves me with a heatmap that is clustered by the number of columns in which the value is >0, and it shows the colors according to the experiment. But I cannot get the clusters sorted so that all rows with values >0 in 3 columns are in the second slice, all with >0 in 2 columns are in the third slice, and all with >0 in only one column are in the fourth slice.
My alternative approach leaves me with a heatmap that is nicely sliced according to my needs, but it has no clustering within the slices and no dendrogram. I believe the issue is that I am using clustering from a different matrix, but I am not aware of a way to get both while keeping the colors mapped to the experiment, without including those values in the clustering.
    split <- paste0("Cl\n", kclus$cluster)
    default.hmap <- Heatmap(plot_matrix_frame[,c(1,2,3,4)], split=split,
        column_names_side="top",
        column_order=c("a","b","c","d"),
        cluster_row_slices=FALSE,
        cluster_rows=TRUE,
        #heatmap_legend_param=list(
        #    labels=c("none","Bi","C","Both")),
        col=c("white","blue","green","red"))
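For reference, this is roughly how the two attempts might be combined. It is a sketch only, not tested on the full data, and it rests on several assumptions: the matrix is numeric, `row_split` is the current name of the split argument, a factor passed to it keeps its level order when `cluster_row_slices=FALSE`, and a one-argument function given to `clustering_distance_rows` is applied within each slice. The names `mat`, `n_found`, `slice` and `binary_dist` and the row names p1 to p4 are made up for illustration.

```r
# Example rows from above; the row names are hypothetical.
mat <- matrix(
  c(3, 2, 2, 2,
    3, 2, 2, 2,
    3, 0, 0, 0,
    1, 0, 3, 3),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("p", 1:4), c("a", "b", "c", "d"))
)

# Number of genes (columns) in which each protein was found.
n_found <- rowSums(mat > 0)

# Levels in descending order, so the slice with the most genes comes first.
# Only counts that actually occur are used as levels.
slice <- factor(n_found, levels = sort(unique(n_found), decreasing = TRUE))

# Distance on presence/absence only, so the method code (1, 2, 3)
# does not influence the clustering.
binary_dist <- function(m) dist((m > 0) * 1)

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  ht <- ComplexHeatmap::Heatmap(
    mat,
    name = "method",
    # named colors so the numeric values 0-3 are treated as discrete
    col = c("0" = "white", "1" = "blue", "2" = "green", "3" = "red"),
    row_split = slice,
    cluster_row_slices = FALSE,
    cluster_rows = TRUE,
    clustering_distance_rows = binary_dist,
    clustering_method_rows = "ward.D2",
    cluster_columns = FALSE,
    column_names_side = "top"
  )
  # ComplexHeatmap::draw(ht) would render the heatmap.
}
```

Whether the dendrograms inside the slices are informative on the real data is a separate question; rows with identical 0/1 patterns have distance 0 from each other under `binary_dist`.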
Any help is greatly appreciated! I have also thought about altering the colors afterwards, but I am at a loss as to whether that is possible at all.