Question: How to plot a heatmap with two different distance matrices for X and Y
Asked 23 months ago by Nikleotide (Canada):

Hello all,

I have a series of .idat files that I want to cluster (I am replicating the results of a publication). Here is how the authors describe the methodology they used for clustering the methylation data:

"Samples were clustered using Pearson correlation coefficient as the distance measure and average linkage (x-axis). Methylation probes were reordered by hierarchical clustering using Euclidean distance and average linkage (y-axis)."

I have tried this, but I am not getting exactly the same heatmap as the one in the paper.

Below is the heatmap command in R that I used for this clustering.

heatmap.2(rcorr(beta[o[1:probecount],],type = "pearson")$r, labRow = FALSE,labCol =phenoData$Diagnosis, margins = c(7,1), hclustfun = function(x) hclust(dist(x),"ave") , trace = "none", dendrogram = "column",Rowv = FALSE)

Can someone kindly let me know whether this is the right way to plot the heatmap based on the methodology described above and, if something needs to be changed, point out what that might be?

The beta values were quantile normalized and the SNP rows were removed beforehand. The rest of the methylation data extraction follows the routine methodology. I have not copied the entire code here to avoid cluttering the post, but I can add it as a comment if needed.

Thank you all very much for your help in advance.

Cheers

Answer by Kevin Blighe, 23 months ago:

I think that I did this in the past with heatmap.2, but I have literally 1000s of scripts on my computer and I don't know where to start looking.

I would pay close attention to the wording of the text, though, particularly the word 'reordered'. In heatmap.2, there is a parameter called reorderfun, which is how they may have done it, but I'm not sure exactly what the code would be. This is how I normally set distance metric, linkage function, and re-order function in heatmap.2:

  • 1 - Pearson correlation distance: distfun=function(x) as.dist(1-cor(t(x)))
  • Average linkage: hclustfun=function(x) hclust(x, method="average")
  • Re-order by mean: reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean)
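
One thing to keep in mind is that heatmap.2 applies a single distfun and a single hclustfun to both rows and columns. A rough sketch of one way around this is to build the two dendrograms yourself and pass them in through Rowv and Colv. The matrix beta_demo below is random stand-in data (an assumption for illustration, not the real beta values), and I cannot promise this reproduces what the authors did:

```r
# Stand-in data (hypothetical): 50 probes (rows) x 12 samples (columns)
set.seed(1)
beta_demo <- matrix(runif(50 * 12), nrow = 50,
                    dimnames = list(paste0("probe", 1:50),
                                    paste0("sample", 1:12)))

# Columns (samples): 1 - Pearson correlation as distance, average linkage.
# cor() correlates the columns of its input, i.e. the samples.
col_dist <- as.dist(1 - cor(beta_demo, method = "pearson"))
col_hc   <- hclust(col_dist, method = "average")

# Rows (probes): Euclidean distance, average linkage.
# dist() measures distances between the rows of its input, i.e. the probes.
row_dist <- dist(beta_demo, method = "euclidean")
row_hc   <- hclust(row_dist, method = "average")

# Draw only if gplots is installed; rows are reordered by row_hc,
# but only the column dendrogram is displayed.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(beta_demo,
                    Rowv = as.dendrogram(row_hc),
                    Colv = as.dendrogram(col_hc),
                    dendrogram = "column",
                    trace = "none",
                    labRow = rep("", nrow(beta_demo)))
}
```

Note that the data matrix itself (not a correlation matrix) is what gets plotted here; the correlation is only used to derive the column distances.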

I do know that what you were thinking is achievable through ComplexHeatmap (http://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html), and here is some sample code to do it:

require(ComplexHeatmap)
require(circlize)
require(cluster)

pamClusters <- pam(heat, k=5)

hmap <- Heatmap(heat,
        name="Transcript Z-score",
        col=colorRamp2(myBreaks, myCol),
        heatmap_legend_param=list(
              color_bar="continuous",
              legend_direction="horizontal",
              legend_width=unit(5,"cm"),
              title_position="topcenter",
              title_gp=gpar(fontsize=15, fontface="bold")),
        split=paste0("", pamClusters$clustering),
        row_title="Transcripts",
        row_title_side="left",
        row_title_gp=gpar(fontsize=15, fontface="bold"),
        show_row_names=FALSE,
        column_title="",
        column_title_side="top",
        column_title_gp=gpar(fontsize=15, fontface="bold"),
        column_title_rot=0,
        show_column_names=FALSE,
        clustering_distance_columns=function(x) as.dist(1-cor(t(x))),
        clustering_method_columns="ward.D2",
        clustering_distance_rows="euclidean",
        clustering_method_rows="ward.D2",
        row_dend_width=unit(30,"mm"),
        column_dend_height=unit(30,"mm"),
        top_annotation=colAnn,
        top_annotation_height=unit(1.75,"cm"),
        bottom_annotation=sampleBoxplot,
        bottom_annotation_height=unit(4, "cm"))

draw(hmap, heatmap_legend_side="top", annotation_legend_side="right")

Note, in particular, the 4 parameters:

  • clustering_distance_columns
  • clustering_method_columns
  • clustering_distance_rows
  • clustering_method_rows
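
To show just those 4 parameters in isolation, here is a stripped-down sketch set to the combination described in the paper (Pearson and average linkage for the samples, Euclidean and average linkage for the probes). The matrix meth_demo is random stand-in data, assumed only for illustration; "pearson" is one of the predefined distance names that ComplexHeatmap accepts:

```r
# Stand-in data (hypothetical): 50 probes (rows) x 12 samples (columns)
set.seed(1)
meth_demo <- matrix(runif(50 * 12), nrow = 50,
                    dimnames = list(paste0("probe", 1:50),
                                    paste0("sample", 1:12)))

# Draw only if ComplexHeatmap is installed
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  hm <- ComplexHeatmap::Heatmap(meth_demo,
          name = "beta",
          clustering_distance_columns = "pearson",   # 1 - Pearson correlation
          clustering_method_columns   = "average",
          clustering_distance_rows    = "euclidean",
          clustering_method_rows      = "average",
          show_row_names = FALSE)
  ComplexHeatmap::draw(hm)
}
```

Swapping meth_demo for the real beta matrix (probes in rows, samples in columns) should be the only change needed, assuming the matrix has no missing values.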

I think that it would be worth your time looking at ComplexHeatmap because with it you have much greater flexibility in arranging heatmaps, and can even put multiple heatmaps on the same print sheet. Here is the figure that was generated by my code above, with key parts removed that could identify to what the data relates:

[Figure: screenshot of the heatmap generated by the code above]

Here is a good tutorial to ComplexHeatmap: https://bioconductor.org/packages/devel/bioc/vignettes/ComplexHeatmap/inst/doc/s1.introduction.html

Good luck! Kevin


Thank you very much Kevin for your very thorough explanation. I appreciate you taking your time to answer. I will try both ways you suggested.

Cheers

(Reply by Nikleotide, 23 months ago)

No problem. Good luck and let me know if I can help further.

(Reply by Kevin Blighe, 23 months ago)