How to plot a heatmap with two different distance matrices for X and Y
5.2 years ago
Nikleotide ▴ 130

Hello all,

I have a series of .idat files that I want to cluster (I am replicating a publication's results). Here is how the authors describe the methodology they used in their paper for clustering methylation data:

"Samples were clustered using Pearson correlation coefficient as the distance measure and average linkage (x-axis). Methylation probes were reordered by hierarchical clustering using Euclidean distance and average linkage (y-axis)".

I have worked on this, but I am not getting exactly the same heatmap as the one in the paper.

Below is the heatmap command in R that I used for this clustering.

heatmap.2(rcorr(beta[o[1:probecount],],type = "pearson")$r, labRow = FALSE,labCol =phenoData$Diagnosis, margins = c(7,1), hclustfun = function(x) hclust(dist(x),"ave") , trace = "none", dendrogram = "column",Rowv = FALSE)

Can someone kindly let me know whether this is the right way to plot the heatmap based on the methodology described above? If something needs to be changed, would you please point out what that might be?

The beta values were quantile normalized and the SNP rows were removed beforehand. The rest of the methylation data extraction follows the routine methodology. I have not copied the entire code here to avoid clutter, but I can add it as a comment if needed.

Thank you all very much for your help in advance.

Cheers

heatmap methylation Pearson clustering • 5.9k views
5.2 years ago

I think that I did this in the past with heatmap.2.

I would pay close attention to the wording of the text, though, particularly the word 'reordered'. In heatmap.2, there is a parameter called reorderfun, which is how they may have done it, but I'm not sure exactly what the code would be. This is how I normally set the distance metric, linkage function, and re-order function in heatmap.2:

  • 1 - Pearson correlation distance: distfun=function(x) as.dist(1-cor(t(x)))
  • Average linkage: hclustfun=function(x) hclust(x, method="average")
  • Re-order by mean: reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean)
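
Since heatmap.2 takes a single distfun and a single hclustfun that are applied to both rows and columns, one way to use a different distance per axis is to build the two trees yourself and pass them in as dendrograms. The following is only a minimal sketch under that assumption: beta_toy is a made-up probes x samples matrix, not the poster's data, and base heatmap() is used so that nothing extra needs installing. As far as I know, heatmap.2 accepts dendrograms for Rowv and Colv in the same way.

```r
# Toy stand-in for a probes x samples beta matrix (hypothetical data).
set.seed(42)
beta_toy <- matrix(runif(30 * 8), nrow = 30,
                   dimnames = list(paste0("probe", 1:30),
                                   paste0("sample", 1:8)))

# Columns (samples): 1 - Pearson correlation as the distance, average linkage.
# cor() correlates the columns of its input, so no transpose is needed here.
col_hc <- hclust(as.dist(1 - cor(beta_toy)), method = "average")

# Rows (probes): Euclidean distance, average linkage.
# dist() compares rows, so the matrix is passed as-is.
row_hc <- hclust(dist(beta_toy, method = "euclidean"), method = "average")

# Supply the precomputed trees so each axis keeps its own metric.
# scale = "none" plots the values unchanged instead of row-scaling them.
heatmap(beta_toy,
        Rowv = as.dendrogram(row_hc),
        Colv = as.dendrogram(col_hc),
        scale = "none")
```

Note that the matrix being plotted is the data itself; the correlation is only used to build the column tree.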

I do know that what you have in mind is achievable through ComplexHeatmap ( http://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html ), and here is some sample code to do it:

require(ComplexHeatmap)
require(circlize)
require(cluster)

pamClusters <- pam(heat, k=5)

hmap <- Heatmap(heat,
        name="Transcript Z-score",
        col=colorRamp2(myBreaks, myCol),
        heatmap_legend_param=list(
              color_bar="continuous",
              legend_direction="horizontal",
              legend_width=unit(5,"cm"),
              title_position="topcenter",
              title_gp=gpar(fontsize=15, fontface="bold")),
        split=paste0("", pamClusters$clustering),
        row_title="Transcripts",
        row_title_side="left",
        row_title_gp=gpar(fontsize=15, fontface="bold"),
        show_row_names=FALSE,
        column_title="",
        column_title_side="top",
        column_title_gp=gpar(fontsize=15, fontface="bold"),
        column_title_rot=0,
        show_column_names=FALSE,
        clustering_distance_columns=function(x) as.dist(1-cor(t(x))),
        clustering_method_columns="ward.D2",
        clustering_distance_rows="euclidean",
        clustering_method_rows="ward.D2",
        row_dend_width=unit(30,"mm"),
        column_dend_height=unit(30,"mm"),
        top_annotation=colAnn,
        top_annotation_height=unit(1.75,"cm"),
        bottom_annotation=sampleBoxplot,
        bottom_annotation_height=unit(4, "cm"))

draw(hmap, heatmap_legend_side="top", annotation_legend_side="right")

Note, in particular, the 4 parameters:

  • clustering_distance_columns
  • clustering_method_columns
  • clustering_distance_rows
  • clustering_method_rows
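
For the settings quoted in the question (Pearson plus average linkage for samples, Euclidean plus average linkage for probes), those four parameters might be set roughly as below. This is a stripped-down sketch, not a reproduction of the paper's figure: beta_toy is made-up data, the ComplexHeatmap package is assumed to be installed, and "pearson" is the package's predefined option, which to my understanding is 1 - Pearson correlation.

```r
library(ComplexHeatmap)

# Toy stand-in for a probes x samples beta matrix (hypothetical data).
set.seed(1)
beta_toy <- matrix(runif(20 * 10), nrow = 20,
                   dimnames = list(paste0("probe", 1:20),
                                   paste0("sample", 1:10)))

hm <- Heatmap(beta_toy,
              name = "beta",
              # Samples (x-axis): Pearson-based distance, average linkage.
              clustering_distance_columns = "pearson",
              clustering_method_columns = "average",
              # Probes (y-axis): Euclidean distance, average linkage.
              clustering_distance_rows = "euclidean",
              clustering_method_rows = "average",
              show_row_names = FALSE)

# draw() performs the clustering and returns the drawn heatmap object.
hm <- draw(hm)
```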

I think that it would be worth your time looking at ComplexHeatmap because it gives you much greater flexibility in arranging heatmaps, and you can even put multiple heatmaps on the same print sheet. Here is the figure that was generated by my code above, with the parts removed that could identify what the data relates to:

[Figure: screenshot of the heatmap generated by the code above]

Here is a good tutorial for ComplexHeatmap: https://bioconductor.org/packages/devel/bioc/vignettes/ComplexHeatmap/inst/doc/s1.introduction.html

Good luck! Kevin


Thank you very much, Kevin, for your very thorough explanation. I appreciate you taking the time to answer. I will try both of the approaches you suggested.

Cheers


No problem. Good luck and let me know if I can help further.
