Question

How to plot a heatmap with two different distance matrices for X and Y

6.6 years ago

Nikleotide ▴ 130

Hello all,

I have a series of .idat files that I want to cluster (I am replicating the results of a publication). Here is how the authors describe the methodology they used in their paper for clustering methylation data:

"Samples were clustered using Pearson correlation coefficient as the distance measure and average linkage (x-axis). Methylation probes were reordered by hierarchical clustering using Euclidean distance and average linkage (y-axis)".

I have worked around this but the problem is I am not getting the exact same heatmap as theirs in the paper.

Below is the heatmap command in R that I used for this clustering.

heatmap.2(rcorr(beta[o[1:probecount],],type = "pearson")$r, labRow = FALSE,labCol =phenoData$Diagnosis, margins = c(7,1), hclustfun = function(x) hclust(dist(x),"ave") , trace = "none", dendrogram = "column",Rowv = FALSE)

Can someone kindly let me know whether this is the right way to plot the heatmap based on the methodology described above? If something needs to be changed, would you please point out what that might be?

The beta values were quantile normalized and the SNP rows were removed beforehand. The rest of the methylation data extraction follows the routine methodology. I have not copied the entire code here to avoid cluttering the post, but I can add it as a comment if needed.

Thank you all very much for your help in advance.

Cheers

heatmap methylation Pearson clustering • 7.1k views

updated 6.6 years ago by Kevin Blighe 87k • written 6.6 years ago by Nikleotide ▴ 130

score 6 · Answer 1 · 2017-09-18

I think that I did this in the past with heatmap.2.

I would pay close attention to the wording of the text, though, particularly the word 'reordered'. In heatmap.2, there is a parameter called reorderfun, which is how they may have done it, but I'm not sure exactly what the code would be. This is how I normally set distance metric, linkage function, and re-order function in heatmap.2:

1 - Pearson correlation distance: distfun=function(x) as.dist(1-cor(t(x)))
Average linkage: hclustfun=function(x) hclust(x, method="average")
Re-order by mean: reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean)
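
As far as I know, heatmap.2 applies the same distfun to both rows and columns, so one way to get a different metric on each axis is to build the two dendrograms yourself and pass them in through Rowv and Colv. Below is a minimal sketch of that idea, not necessarily what the paper's authors did. The matrix beta_toy is simulated stand-in data (a hypothetical name, not the real beta values), and the heatmap.2 call is skipped if gplots is not installed.

```r
# Toy beta-value matrix: 50 probes (rows) x 8 samples (columns).
# Simulated stand-in for real methylation data, for illustration only.
set.seed(1)
beta_toy <- matrix(runif(50 * 8), nrow = 50,
                   dimnames = list(paste0("probe", 1:50),
                                   paste0("sample", 1:8)))

# Columns (samples): 1 - Pearson correlation as the distance, average linkage.
# cor() correlates the columns of its input, so no transpose is needed here.
col_dist <- as.dist(1 - cor(beta_toy, method = "pearson"))
col_dend <- as.dendrogram(hclust(col_dist, method = "average"))

# Rows (probes): Euclidean distance, average linkage.
# dist() measures distances between the rows of its input.
row_dist <- dist(beta_toy, method = "euclidean")
row_dend <- as.dendrogram(hclust(row_dist, method = "average"))

# Precomputed dendrograms let the two axes use different metrics.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(beta_toy,
                    Rowv = row_dend, Colv = col_dend,
                    dendrogram = "both", trace = "none",
                    labRow = FALSE)
}
```

Note that the matrix being plotted here is the beta matrix itself rather than a correlation matrix; the correlation is only used to derive the column distances.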

I do know that what you were thinking is achievable through ComplexHeatmap ( http://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html ), and here is some sample code to do it:

require(ComplexHeatmap)
require(circlize)
require(cluster)

pamClusters <- pam(heat, k=5)

hmap <- Heatmap(heat,
        name="Transcript Z-score",
        col=colorRamp2(myBreaks, myCol),
        heatmap_legend_param=list(
              color_bar="continuous",
              legend_direction="horizontal",
              legend_width=unit(5,"cm"),
              title_position="topcenter",
              title_gp=gpar(fontsize=15, fontface="bold")),
        split=paste0("", pamClusters$clustering),
        row_title="Transcripts",
        row_title_side="left",
        row_title_gp=gpar(fontsize=15, fontface="bold"),
        show_row_names=FALSE,
        column_title="",
        column_title_side="top",
        column_title_gp=gpar(fontsize=15, fontface="bold"),
        column_title_rot=0,
        show_column_names=FALSE,
        clustering_distance_columns=function(x) as.dist(1-cor(t(x))),
        clustering_method_columns="ward.D2",
        clustering_distance_rows="euclidean",
        clustering_method_rows="ward.D2",
        row_dend_width=unit(30,"mm"),
        column_dend_height=unit(30,"mm"),
        top_annotation=colAnn,
        top_annotation_height=unit(1.75,"cm"),
        bottom_annotation=sampleBoxplot,
        bottom_annotation_height=unit(4, "cm"))

draw(hmap, heatmap_legend_side="top", annotation_legend_side="right")

Note, in particular, the 4 parameters:

clustering_distance_columns
clustering_method_columns
clustering_distance_rows
clustering_method_rows
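
To match the wording quoted from the paper (Pearson with average linkage for samples, Euclidean with average linkage for probes), those four parameters could be set as in the stripped-down sketch below. It assumes samples are in columns and probes in rows, uses simulated stand-in data (beta_toy is a hypothetical name), and relies on "pearson" being one of the distance names that ComplexHeatmap accepts. The plot is skipped if ComplexHeatmap is not installed.

```r
# Toy beta-value matrix: 50 probes (rows) x 8 samples (columns).
# Simulated stand-in for real methylation data, for illustration only.
set.seed(1)
beta_toy <- matrix(runif(50 * 8), nrow = 50,
                   dimnames = list(paste0("probe", 1:50),
                                   paste0("sample", 1:8)))

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  hm <- ComplexHeatmap::Heatmap(
    beta_toy,
    name = "beta",
    # Samples (x-axis): Pearson-based distance, average linkage.
    clustering_distance_columns = "pearson",
    clustering_method_columns = "average",
    # Probes (y-axis): Euclidean distance, average linkage.
    clustering_distance_rows = "euclidean",
    clustering_method_rows = "average",
    show_row_names = FALSE
  )
  ComplexHeatmap::draw(hm)
}
```

My longer example above uses ward.D2 on both axes because that suited my data; "average" is swapped in here only to mirror the paper's description.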

I think that it would be worth your time looking at ComplexHeatmap because it gives you much greater flexibility in arranging heatmaps, and you can even put multiple heatmaps on the same print sheet. Here is the figure that was generated by my code above, with the key parts that could identify the data removed:

Here is a good tutorial for ComplexHeatmap: https://bioconductor.org/packages/devel/bioc/vignettes/ComplexHeatmap/inst/doc/s1.introduction.html

Good luck! Kevin