Question

Plot heatmap by average based on sample groups

16 months ago

mohammedtoufiq91 ▴ 250

Hi,

I use ComplexHeatmap to plot heatmaps. Usually I plot every individual column of the data matrix and color the columns by their associated annotations. This time I would like to average the samples belonging to each group and plot the averaged values instead (see the example below). Is there a way to do this in ComplexHeatmap, or in another package such as pheatmap?

Or is the only way to average the values manually and then plot them, as in the second heatmap below?

dput(sample_metadata)
structure(list(Groups = c("Control", "Control", "Control", "Treated", 
                          "Treated", "Treated"), Factor = c("A", "B", "A", "B", "A", "B"
                          )), class = "data.frame", row.names = c("C1", "C2", "C3", "C4", 
                                                                  "C5", "C6"))
#>     Groups Factor
#> C1 Control      A
#> C2 Control      B
#> C3 Control      A
#> C4 Treated      B
#> C5 Treated      A
#> C6 Treated      B

column_ha = HeatmapAnnotation(df = data.frame(Groups = sample_metadata$Groups),
                              show_annotation_name = TRUE,
                              col = list(Groups = c('Control' = 'green', 'Treated' = 'brown')),
                              simple_anno_size = unit(1.5, "cm"))

dput(mat)
structure(c(0.740164959138429, -2.4099321614319, -0.659774236619576, 
            1.91950484556249, -0.650192056020683, -1.34086079396285, 0.905902019871148, 
            -2.37012805281368, -2.39964455270247, 0.229088582390984, 0.82424862266824, 
            -1.0219550974085, 0.507758786136239, 2.52898525952656, -0.337321730507543, 
            0.578083299127343, 0.819229800537084, 0.20181490681342, 1.24986388764609, 
            1.30607002214597, 0.31844153675257, -0.470576913255826, 0.385549814425907, 
            -1.21337741920392, -2.04002193751137, -1.08393531546505, -0.770593611498049, 
            -0.480755266166721, -0.798580343529609, -1.45728654517006, -0.132837552127625, 
            -0.657523388272781, -0.362183719051338, 0.147142052910538, -0.610173008121441, 
            0.0930305728786384, 0.739470089578251, 0.234072626585632, -0.74750971036684, 
            0.444774561828359, 0.861126444788902, 1.50797148685585, 0.153805667506553, 
            1.00428105980781, 1.14894021489583, 1.04106231136572, 0.0641890695429588, 
            -0.141144169407476, -0.431139540088279, -1.41529060711872, -0.498739314019377, 
            2.40301450349229, -1.12793345263585, 1.28961670292723, -0.598957876185795, 
            0.363547696086119, 1.3286605371786, 0.447120272367067, -0.61719107273225, 
            -1.06088593762651), dim = c(10L, 6L), dimnames = list(c("R1", 
                                                                    "R2", "R3", "R4", "R5", "R6", "R7", "R8", "R9", "R10"), NULL))
#>           [,1]       [,2]       [,3]        [,4]        [,5]       [,6]
#> R1   0.7401650  0.8242486  0.3184415 -0.13283755  0.86112644 -0.4987393
#> R2  -2.4099322 -1.0219551 -0.4705769 -0.65752339  1.50797149  2.4030145
#> R3  -0.6597742  0.5077588  0.3855498 -0.36218372  0.15380567 -1.1279335
#> R4   1.9195048  2.5289853 -1.2133774  0.14714205  1.00428106  1.2896167
#> R5  -0.6501921 -0.3373217 -2.0400219 -0.61017301  1.14894021 -0.5989579
#> R6  -1.3408608  0.5780833 -1.0839353  0.09303057  1.04106231  0.3635477
#> R7   0.9059020  0.8192298 -0.7705936  0.73947009  0.06418907  1.3286605
#> R8  -2.3701281  0.2018149 -0.4807553  0.23407263 -0.14114417  0.4471203
#> R9  -2.3996446  1.2498639 -0.7985803 -0.74750971 -0.43113954 -0.6171911
#> R10  0.2290886  1.3060700 -1.4572865  0.44477456 -1.41529061 -1.0608859

Heatmap(mat, cluster_rows = FALSE, cluster_columns = FALSE, name = "Abundance", top_annotation = column_ha)

dput(mat.avg)
structure(list(Control = c(0.318441537, -0.470576913, 0.385549814, 
                           -1.213377419, -2.040021938, -1.083935315, -0.770593611, -0.480755266, 
                           -0.798580344, -1.457286545), Treated = c(0.076516526, 1.084487534, 
                                                                    -0.445437168, 0.813679939, -0.020063556, 0.499213527, 0.710773232, 
                                                                    0.180016243, -0.598613441, -0.677133994)), class = "data.frame", row.names = c("R1", 
                                                                                                                                                   "R2", "R3", "R4", "R5", "R6", "R7", "R8", "R9", "R10"))
#>        Control     Treated
#> R1   0.3184415  0.07651653
#> R2  -0.4705769  1.08448753
#> R3   0.3855498 -0.44543717
#> R4  -1.2133774  0.81367994
#> R5  -2.0400219 -0.02006356
#> R6  -1.0839353  0.49921353
#> R7  -0.7705936  0.71077323
#> R8  -0.4807553  0.18001624
#> R9  -0.7985803 -0.59861344
#> R10 -1.4572865 -0.67713399

# mat.avg is a data.frame, so convert it to a matrix explicitly before plotting
Heatmap(as.matrix(mat.avg), cluster_rows = FALSE, cluster_columns = FALSE, name = "Avg.Abundance")

Thank you,

Mohammed

pheatmap complexheatmap heatmap ggplot2 R • 1.3k views

You are probably best off averaging the values separately and then plotting a matrix of the averages.
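
A minimal sketch of that approach in base R, not a built-in ComplexHeatmap feature. It assumes that column j of `mat` corresponds to row j of `sample_metadata` (the posted `mat` has no column names, so the order is taken on trust). For brevity it uses only the first three rows of the posted matrix, rounded as printed; the names `col_idx`, `mat_avg` and `group_ha` are introduced here for illustration.

```r
# First three rows of the posted matrix (values as printed in the question).
mat <- matrix(
  c( 0.7401650,  0.8242486,  0.3184415, -0.13283755, 0.86112644, -0.4987393,
    -2.4099322, -1.0219551, -0.4705769, -0.65752339, 1.50797149,  2.4030145,
    -0.6597742,  0.5077588,  0.3855498, -0.36218372, 0.15380567, -1.1279335),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("R1", "R2", "R3"), paste0("C", 1:6))
)
sample_metadata <- data.frame(
  Groups = c("Control", "Control", "Control", "Treated", "Treated", "Treated"),
  row.names = paste0("C", 1:6)
)

# Assumption: columns of `mat` are in the same order as rows of `sample_metadata`.
stopifnot(ncol(mat) == nrow(sample_metadata))
groups <- factor(sample_metadata$Groups)

# Split the column indices by group, then take the row means within each group.
col_idx <- split(seq_len(ncol(mat)), groups)
mat_avg <- sapply(col_idx, function(idx) rowMeans(mat[, idx, drop = FALSE]))

# mat_avg is a numeric matrix: one row per feature, one column per group.
print(mat_avg)

# Plot only if ComplexHeatmap is installed.
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  group_ha <- ComplexHeatmap::HeatmapAnnotation(
    Groups = colnames(mat_avg),
    col = list(Groups = c(Control = "green", Treated = "brown"))
  )
  ht <- ComplexHeatmap::Heatmap(
    mat_avg,
    cluster_rows = FALSE, cluster_columns = FALSE,
    name = "Avg.Abundance", top_annotation = group_ha
  )
  ComplexHeatmap::draw(ht)
}
```

Because `sapply()` returns a matrix here, the result can be passed to `Heatmap()` directly. Note that the Control column of the posted `mat.avg` appears to equal the third column of `mat` rather than the mean of the first three, so the Control averages computed this way will differ from the posted ones; the Treated averages agree.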

ADD REPLY • link 16 months ago by jv ★ 1.8k

jv OK. Yes, I usually average the data and then plot it. This time I was exploring whether any package already implements this kind of feature, which would make it easier.

ADD REPLY • link 16 months ago by mohammedtoufiq91 ▴ 250