How to Plot Heatmap for Lower Triangular Matrix Using R
10 weeks ago
ruth ▴ 10

Hi everyone,

I’m currently working on a project where I need to visualize a lower triangular phylogenetic distance matrix as a heatmap in R. My data is stored in matrix format, but I only want to display the lower triangular part of it in the heatmap. I tried suggestions from related questions, but to no avail. I keep getting this error:

Error in hclust(d, method = method): NA/NaN/Inf in foreign function call (arg 10)
In addition: Warning message:
In dist(mat, method = distance) : NAs introduced by coercion


Could anyone guide me on how to create a heatmap for this lower triangular matrix? Any help with the code or functions to use would be greatly appreciated!

Thank you!

R Heatmap • 537 views

Could you share the code you have tried and, if possible, some example data? That would help determine whether the error is code-related or data-related.


Surely! Here's the code:

library(readxl)    # read_xlsx()
library(pheatmap)  # pheatmap()

# Mirror the lower triangle (including the diagonal) into a full symmetric matrix
convert_to_full_matrix <- function(lower_tri_matrix) {
  n <- nrow(lower_tri_matrix)
  full_matrix <- matrix(0, n, n)
  colnames(full_matrix) <- rownames(lower_tri_matrix)
  rownames(full_matrix) <- rownames(lower_tri_matrix)

  for (i in seq_len(n)) {
    for (j in seq_len(i)) {
      full_matrix[i, j] <- lower_tri_matrix[i, j]
      full_matrix[j, i] <- lower_tri_matrix[i, j]
    }
  }
  return(full_matrix)
}

distance_matrix <- read_xlsx("F:/Users/phylogenetic_data/distancematrix.xlsx", sheet = "MatrixOutput")
lower_tri_matrix <- as.matrix(distance_matrix, row.names = 1)

dist_matrix <- convert_to_full_matrix(lower_tri_matrix)

pheatmap(dist_matrix,
         cluster_rows = TRUE,
         cluster_cols = TRUE,
         main = "Heatmap of Distance Matrix",
         color = colorRampPalette(c("red", "white", "blue"))(100),
         show_rownames = TRUE,
         show_colnames = TRUE)


An example data table:

8 weeks ago

The problem is that pheatmap does not just plot the data; it also runs hierarchical clustering to determine the row and column order, and that step fails because of the missing values in the distance matrix.
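To illustrate the point, here is a minimal base-R sketch on a made-up 4 x 4 matrix (the labels and values are hypothetical, not taken from the poster's spreadsheet). It checks whether the matrix contains missing or non-numeric entries, mirrors the lower triangle into the upper one, and then clusters on the completed matrix. The "NAs introduced by coercion" warning suggests that some entries could not be converted to numbers, which may also happen if a label column ends up inside the matrix, so both checks are worth running.

```r
# Toy lower-triangular distance matrix (hypothetical data): the upper
# triangle is NA, as it might be after importing a half-filled sheet.
labels <- c("A", "B", "C", "D")
lower <- matrix(NA_real_, 4, 4, dimnames = list(labels, labels))
lower[lower.tri(lower, diag = TRUE)] <- c(0, 2, 6, 10,  0, 5, 9,  0, 4,  0)

# Inspect what the clustering step would receive
is.numeric(lower)  # FALSE would mean text (e.g. a label column) is in the matrix
anyNA(lower)       # TRUE here: the upper triangle is empty

# Mirror the lower triangle into the upper one to get a complete matrix
full <- lower
full[upper.tri(full)] <- t(lower)[upper.tri(lower)]

# With no missing entries left, hierarchical clustering runs without error
hc <- hclust(as.dist(full))
```

If `is.numeric()` returns FALSE on the real data, the label column probably needs to be moved into the row names before the matrix is built.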

You can either run the hierarchical clustering yourself on the full matrix and then draw the plot with a custom function, e.g. with ggplot2's geom_tile() or geom_raster(), or use the corrplot package. I recommend the latter, since it supports triangular plots with a single argument: corrplot(..., type="lower").
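A rough sketch of the corrplot route, again on a made-up symmetric matrix (hypothetical values). The clustering is done separately on the full matrix, the rows and columns are reordered, and only the lower triangle is drawn. Setting is.corr = FALSE is my assumption for distance data, since the values are not correlations in [-1, 1]. The plotting call is guarded so the script still runs if corrplot is not installed.

```r
# Toy symmetric distance matrix (hypothetical values)
labels <- c("A", "B", "C", "D")
full <- matrix(c( 0,  2,  6, 10,
                  2,  0,  5,  9,
                  6,  5,  0,  4,
                 10,  9,  4,  0),
               nrow = 4, byrow = TRUE, dimnames = list(labels, labels))

# Cluster on the full matrix, then reorder rows and columns the same way
hc <- hclust(as.dist(full))
ordered <- full[hc$order, hc$order]

# Draw only the lower triangle; order = "original" keeps our own ordering
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(ordered, type = "lower", is.corr = FALSE,
                     method = "color", order = "original")
}
```

Reordering before plotting keeps the clustering step independent of the plotting package, so the same `ordered` matrix could also be passed to a geom_tile() plot.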