Question: gplots heatmap.2 scale function not generating Z-scores between -1 to +1
jaime.alvarez.benayas20 wrote:

Hello,

I am trying to display a heatmap using gplots heatmap.2 function, borrowing the data from: http://www.opiniomics.org/you-probably-dont-understand-heatmaps/

My problem is with the scale="row" parameter. In theory, this takes the raw data and scales each row (subtracting the row mean and dividing by the row standard deviation). However, when I run it, I see that my Z-scores (color key) go beyond the -1 to +1 range:

``````
library(gplots)

h1 <- c(10,20,10,20,10,20,10,20)
h2 <- c(20,10,20,10,20,10,20,10)

l1 <- c(1,3,1,3,1,3,1,3)
l2 <- c(3,1,3,1,3,1,3,1)

x <- rbind(h1,h2,l1,l2)

# Put samples as columns
x = t(x)

metric = "euclidean"

# dist() calculates distances between rows; samples are on columns,
# so transpose the matrix to get the distance matrix between samples
samples_distance_matrix = dist(t(x), method=metric)

# Then calculate the distance between features, which are on rows
features_distance_matrix = dist(x, method=metric)

# Cluster and build the dendrograms passed to heatmap.2 below.
# The linkage method is not stated in the post; hclust's default ("complete") is assumed here.
samples_dend = as.dendrogram(hclust(samples_distance_matrix))
features_dend = as.dendrogram(hclust(features_distance_matrix))

# Now produce the heatmap and dendrograms; the rows contain features
heatmap.2(x, Rowv = features_dend, Colv = samples_dend, trace="none", margins=c(8,5), labRow = NA, dendrogram="column", scale="row", col=colorRampPalette(c("white","darkblue")))
``````

When I scale the data manually and remove scale="row", I get the expected Z-scores between -1 and +1:

``````
heatmap.2(scale(x, center = TRUE, scale = TRUE), Rowv = features_dend, Colv = samples_dend, trace="none", margins=c(8,5), labRow = NA, dendrogram="column", col=colorRampPalette(c("white","darkblue")))
``````

I thought that heatmap.2 was just presenting the raw data after scaling the rows. Is something else going on? I have other data going from -3 to 3.5 in Z-scores.

modified 8 months ago by samkioko270 • written 23 months ago by jaime.alvarez.benayas20

I take it you don't know what a z-score is; otherwise, could you please clarify why it should be between -1 and 1?

Here's the code from the heatmap.2 function:

``````
if (scale == "row") {
    retval$rowMeans <- rm <- rowMeans(x, na.rm = na.rm)
    x <- sweep(x, 1, rm)
    retval$rowSDs <- sx <- apply(x, 1, sd, na.rm = na.rm)
    x <- sweep(x, 1, sx, "/")
}
``````

So, yes: it subtracts the row mean and then divides by the standard deviation of the mean-subtracted data. As Jean-Karim mentions, this is Z-score scaling.

Generate random data and try it out:

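``````
# 10 rows x 20 columns of exponentially distributed values
x <- matrix(rexp(200, rate=.1), ncol=20)
rm <- rowMeans(x, na.rm=TRUE)
x <- sweep(x, 1, rm)
sx <- apply(x, 1, sd, na.rm=TRUE)
x <- sweep(x, 1, sx, "/")
# The exact range changes with every random draw; it typically extends beyond -1 and +1
range(x)
``````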
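
As a quick check on the data from the question, here is a minimal sketch (base R only, no plotting) of the same row scaling. The names `z` and `n` are mine, not from heatmap.2. Each row ends up with mean 0 and standard deviation 1, but nothing forces the individual values into -1 to +1. With the sample standard deviation, the only general limit for a row of n values is |z| <= (n - 1) / sqrt(n) (Samuelson's inequality).

```r
# Rebuild the matrix from the question: 8 rows, 4 columns
h1 <- c(10,20,10,20,10,20,10,20)
h2 <- c(20,10,20,10,20,10,20,10)
l1 <- c(1,3,1,3,1,3,1,3)
l2 <- c(3,1,3,1,3,1,3,1)
x <- t(rbind(h1, h2, l1, l2))

# Row-wise Z-scores, following the same steps as scale = "row"
z <- sweep(x, 1, rowMeans(x))
z <- sweep(z, 1, apply(z, 1, sd), "/")

rowMeans(z)          # all (numerically) 0
apply(z, 1, sd)      # all 1
round(range(z), 2)   # -0.87  1.34

# Upper limit on |z| for a row of n values (illustrative check)
n <- ncol(x)
(n - 1) / sqrt(n)    # 1.5
```

So even this tiny example goes past +1, and with more columns per row the possible range gets wider.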

Sorry, silly mistake, as you say, I wasn't understanding Z-scores correctly.

So there are two issues here. One is that Z-scores can go beyond -1 and +1, so there isn't a problem. The other is that I made a mistake in my code: the scale() function scales columns, not rows. Scaling the transpose of the input matrix and then transposing back gives the same answer as scale="row":

``````
# Rows are already scaled by hand, so heatmap.2's own scaling is switched off
heatmap.2(t(scale(t(x), center = TRUE, scale = TRUE)), Rowv = features_dend, Colv = samples_dend, trace="none", margins=c(8,5), labRow = NA, dendrogram="column", scale="none", col=colorRampPalette(c("white","darkblue")))
``````
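
For anyone who wants to convince themselves without drawing a heatmap, the sketch below (base R only) compares the computations on the matrix from the question. The names `by_col`, `by_row` and `by_sweep` are just illustrative, and the sweep steps are my reading of the heatmap.2 snippet quoted earlier in the thread.

```r
# Matrix from the question: 8 rows, 4 columns
h1 <- c(10,20,10,20,10,20,10,20)
h2 <- c(20,10,20,10,20,10,20,10)
l1 <- c(1,3,1,3,1,3,1,3)
l2 <- c(3,1,3,1,3,1,3,1)
x <- t(rbind(h1, h2, l1, l2))

by_col <- scale(x)         # scale() standardises each column
by_row <- t(scale(t(x)))   # transpose, scale, transpose back: each row standardised

# Row scaling with sweep(), as in the quoted heatmap.2 code
by_sweep <- sweep(x, 1, rowMeans(x))
by_sweep <- sweep(by_sweep, 1, apply(by_sweep, 1, sd), "/")

# Row scaling via the double transpose matches the sweep version
isTRUE(all.equal(by_row, by_sweep, check.attributes = FALSE))   # TRUE
# Plain scale(x) does not
isTRUE(all.equal(by_col, by_sweep, check.attributes = FALSE))   # FALSE

range(by_col)   # stays inside -1..+1 for this particular data
range(by_row)   # goes above +1
```

That is why the manually scaled heatmap looked "right" before: it was showing column Z-scores, which happen to stay within -1 and +1 for this data.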

Scale should be either “none”, “row” or “column”
