Plotting heat map with significance based on multiple columns
14 months ago

Hello Everyone,

I have a dataframe with columns having different sample information. It looks like this:

Pathways  s1_adjPval  s1_logFC  s2_adjPval  s2_logFC  s3_adjPval  s3_logFC
X1        0.001       0.6       0.25        -0.6      0.002       0.34

I want to plot a heat map in which the colour of each cell is based on the logFC value, with a star added if the adjPval for that sample is less than 0.05.

I tried using heat map annotations, but I could not make them depend on the values of multiple columns. If anyone has experience with this, I would be grateful if you could share how to plot on the basis of multiple columns.

PS - Both R and Python visualization packages are okay for me.

Any help is greatly appreciated. Thanks

R Python Heatmap • 1.7k views
14 months ago

It is still a bit unclear to me what exactly you wish to achieve, but I hope this gives you a basis for further modifications:


#load packages
library(dplyr)
library(tidyr)
library(ggplot2)

#simulate data
sampledata <- data.frame(
  "Pathways" = paste("Pathway", LETTERS),
  "s1_adjPval" = runif(26, min = 0, max = 1),
  "s1_logFC" = log(rweibull(26, 1.1, 2), 2),
  "s2_adjPval" = runif(26, min = 0, max = 1),
  "s2_logFC" = log(rweibull(26, 1.1, 2), 2),
  "s3_adjPval" = runif(26, min = 0, max = 1),
  "s3_logFC" = log(rweibull(26, 1.1, 2), 2)
)

#reshape to long format for plotting: one row per pathway and sample
plotdata <- sampledata %>%
  pivot_longer(
    cols = !Pathways,
    names_to = c("sample", ".value"),
    names_sep = "_"
  ) %>%
  #label each cell by its adjusted p-value
  mutate(label = cut(
    adjPval,
    breaks = c(0, 0.001, 0.01, 0.05, 1),
    labels = c("***", "**", "*", "n.s."),
    include.lowest = TRUE
  ))

#cluster to order rows
clustering <- sampledata %>% select(ends_with("logFC")) %>% dist() %>% hclust(method="median")
plotdata[["Pathways"]] <- factor(plotdata[["Pathways"]],levels=paste("Pathway", LETTERS)[clustering[["order"]]])

# create plot
heatmap <- ggplot(plotdata,aes(x=sample,y=Pathways,fill=logFC,label=label)) + geom_tile() + geom_text() + scale_fill_viridis_c(option="magma")

For the clustering, you can play around with the settings depending on your values, e.g. choose a distance measure other than Euclidean, or a different clustering method.

Sample heatmap
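
Since Python was also mentioned as an option, here is a rough sketch of the same idea there: split column names such as `s1_adjPval` into a sample and a measure, attach a star label using the same cut points as the R code above, and draw the tiles with matplotlib. The function names (`star_label`, `to_long`, `plot_heatmap`) and the output file name are made up for illustration, and the input is assumed to be a list of dicts with the column names from the question.

```python
from bisect import bisect_left

# Same cut points as the R answer (right-closed intervals):
# <= 0.001 -> "***", <= 0.01 -> "**", <= 0.05 -> "*", larger -> "n.s."
BREAKS = [0.001, 0.01, 0.05]
LABELS = ["***", "**", "*", "n.s."]


def star_label(adj_pval):
    """Return the significance label for one adjusted p-value."""
    return LABELS[bisect_left(BREAKS, adj_pval)]


def to_long(rows):
    """Turn wide rows ({"Pathways": ..., "s1_adjPval": ..., "s1_logFC": ...})
    into one record per (pathway, sample), with a star label attached."""
    records = {}
    for row in rows:
        for column, value in row.items():
            if column == "Pathways":
                continue
            # Assumes names of the form <sample>_<measure>, e.g. "s1_logFC"
            sample, measure = column.split("_", 1)
            key = (row["Pathways"], sample)
            record = records.setdefault(
                key, {"Pathways": row["Pathways"], "sample": sample}
            )
            record[measure] = value
    long_rows = list(records.values())
    for record in long_rows:
        record["label"] = star_label(record["adjPval"])
    return long_rows


def plot_heatmap(long_rows, path="heatmap.png"):
    """Colour tiles by logFC and write the star label into each tile."""
    # Imported here so the helpers above also work without matplotlib installed
    import matplotlib.pyplot as plt

    pathways = list(dict.fromkeys(r["Pathways"] for r in long_rows))
    samples = list(dict.fromkeys(r["sample"] for r in long_rows))
    grid = [[float("nan")] * len(samples) for _ in pathways]

    fig, ax = plt.subplots()
    for r in long_rows:
        i, j = pathways.index(r["Pathways"]), samples.index(r["sample"])
        grid[i][j] = r["logFC"]
        ax.text(j, i, r["label"], ha="center", va="center")
    image = ax.imshow(grid, cmap="magma", aspect="auto")
    ax.set_xticks(range(len(samples)))
    ax.set_xticklabels(samples)
    ax.set_yticks(range(len(pathways)))
    ax.set_yticklabels(pathways)
    fig.colorbar(image, ax=ax, label="logFC")
    fig.savefig(path)
```

Calling `plot_heatmap(to_long(rows))` on the example row from the question would give one row of three tiles. If you only want a single star below 0.05, as in the original question, reduce `BREAKS` to `[0.05]` and `LABELS` to `["*", ""]`; note that, like the R `cut()` call, this treats a value of exactly 0.05 as significant.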

14 months ago
LauferVA 4.3k

Sidrah -

Two possible ways forward:

1) Instead of stars, consider other ways of showing significance. Imagine that your heatmap is on a square grid, as usual, but each cell is drawn with a different shape depending on its significance, while its colour still reflects the logFC value. This site should give you enough ideas on how to do this, here.

2) There is nothing wrong with using something like Adobe Illustrator to add asterisks (stars) for significance. What I mean is: make the plot in R or Python the way you want it, then add the stars in Illustrator. It is wrong to alter the appearance of data with Illustrator, but it is not wrong to simply add an annotation to a graph. This might be the fastest way.
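
To make option 1 a bit more concrete, here is a minimal Python sketch of a shape-by-significance grid: each cell is a scatter point coloured by logFC, drawn as a star when adjPval is below the threshold and as a square otherwise. The helper names, the 0.05 default, the marker choices and the output file name are illustrative assumptions, not something taken from the linked site. Cells are assumed to be `(x, y, logFC, adjPval)` tuples.

```python
def marker_for(adj_pval, alpha=0.05):
    """Star marker for significant cells, square otherwise (illustrative choice)."""
    return "*" if adj_pval < alpha else "s"


def group_by_marker(cells, alpha=0.05):
    """Group (x, y, logFC, adjPval) cells by marker, because matplotlib's
    scatter() takes a single marker style per call."""
    groups = {}
    for x, y, logfc, adj_pval in cells:
        groups.setdefault(marker_for(adj_pval, alpha), []).append((x, y, logfc))
    return groups


def plot_shapes(cells, path="shapes.png", alpha=0.05):
    """Draw one scatter layer per marker, sharing a single colour scale."""
    # Imported here so the helpers above also work without matplotlib installed
    import matplotlib.pyplot as plt

    # Symmetric colour limits so that logFC = 0 sits in the middle of the map
    limit = max(abs(cell[2]) for cell in cells)

    fig, ax = plt.subplots()
    layer = None
    for marker, points in group_by_marker(cells, alpha).items():
        xs, ys, colours = zip(*points)
        layer = ax.scatter(xs, ys, c=colours, marker=marker, s=400,
                           cmap="coolwarm", vmin=-limit, vmax=limit)
    fig.colorbar(layer, ax=ax, label="logFC")
    fig.savefig(path)
```

For the example row in the question, `plot_shapes([(0, 0, 0.6, 0.001), (1, 0, -0.6, 0.25), (2, 0, 0.34, 0.002)])` would draw two stars and one square. Passing the same `vmin` and `vmax` to every `scatter()` call keeps the colours comparable across the marker groups.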

