Question: cannot replicate the pheatmap scale function
23 months ago, lessismore890 (Mexico) wrote:

Dear all,

I am using `pheatmap` to generate some heatmaps with `scale = "row"`, but I cannot replicate the results (at least visually, because I didn't manage to export the scaled matrix) when I scale the matrix manually with `t(scale(t(my.mat)))`. The transposes are needed because `scale()` works by column; its usual form is `scale(x, center = TRUE, scale = TRUE)`. I noticed this because when I replot the manually scaled matrix with `scale = "none"`, the heatmaps look different.
If somebody has any idea about why this happens, it would be really appreciated.

P.S. I've checked the exact function they use:

```r
scale_rows = function(x){
  m = apply(x, 1, mean, na.rm = T)
  s = apply(x, 1, sd, na.rm = T)
  return((x - m) / s)
}
```

the clustering in fact is exactly the same but it looks like a difference in the colour bar.


> the clustering in fact is exactly the same but it looks like a difference in the colour bar.

It should be the opposite. The scaling is the same, but the clustering is different due to the order of operations. See this previous thread: Clustering differences between heatmap.2 and pheatmap
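To illustrate the order-of-operations point, here is a minimal sketch, assuming one workflow clusters the raw rows and the other clusters the row-scaled values. The matrix `x`, the seed, and the object names are made up for illustration; only base R is used.

```r
# Hypothetical example data: 10 rows (features) x 20 columns (samples)
set.seed(1)
x <- matrix(rexp(200, rate = 0.1), ncol = 20)

# Row-wise z-scores
x.scaled <- t(scale(t(x)))

# Cluster the rows before scaling versus after scaling
hc.raw    <- hclust(dist(x))
hc.scaled <- hclust(dist(x.scaled))

# Leaf orders of the two row dendrograms; these need not agree
hc.raw$order
hc.scaled$order
identical(hc.raw$order, hc.scaled$order)
```

If the two orders differ, the heatmaps will show the same scaled values arranged under different dendrograms.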

Hey igor, it's the same clustering in my case because I am comparing the manually scaled matrix plotted by pheatmap with `scale = "none"` versus the original matrix plotted by pheatmap with `scale = "row"`.

23 months ago, Kevin Blighe wrote:

The row scaling functions from both `pheatmap()` (pheatmap) and `heatmap.2()` (gplots) should produce the same results as `t(scale(t(x)))`. Here is the proof using the functions from these packages:

# random data

```r
randomdata <- matrix(rexp(200, rate = 0.1), ncol = 20)
```

# heatmap.2 (gplots) row scaling

```r
heatmap.2.scale <- function(x, na.rm) {
  retval <- list()
  # centre each row on its mean
  retval$rowMeans <- rm <- rowMeans(x, na.rm = na.rm)
  x <- sweep(x, 1, rm)
  # divide each row by its standard deviation
  retval$rowSDs <- sx <- apply(x, 1, sd, na.rm = na.rm)
  x <- sweep(x, 1, sx, "/")
  x
}

randomdata.scaled1 <- round(heatmap.2.scale(randomdata, na.rm = TRUE), 3)
```

# pheatmap row scaling

```r
pheatmap.scale <- function(x) {
  m = apply(x, 1, mean, na.rm = TRUE)
  s = apply(x, 1, sd, na.rm = TRUE)
  return((x - m) / s)
}

randomdata.scaled2 <- round(pheatmap.scale(randomdata), 3)
```

# manual row scaling

```r
randomdata.scaled3 <- round(data.frame(t(scale(t(randomdata)))), 3)
```

# test if there are differences

```r
all((randomdata.scaled1 == randomdata.scaled2) == TRUE)
# [1] TRUE

all((randomdata.scaled1 == randomdata.scaled3) == TRUE)
# [1] TRUE

all((randomdata.scaled2 == randomdata.scaled3) == TRUE)
# [1] TRUE
```
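The same check can also be done without rounding, using a numeric tolerance instead of exact equality. This is only a sketch on made-up data; the names `m`, `by.apply`, and `by.scale` are hypothetical, and the row-scaling expression is the one quoted from pheatmap earlier in this thread.

```r
set.seed(1)
m <- matrix(rexp(200, rate = 0.1), ncol = 20)

# Row scaling as in the pheatmap snippet quoted in this thread
by.apply <- (m - apply(m, 1, mean, na.rm = TRUE)) / apply(m, 1, sd, na.rm = TRUE)

# Row scaling via base R scale() on the transposed matrix
by.scale <- t(scale(t(m)))

# all.equal() compares within a numeric tolerance; the attributes added by
# scale() ("scaled:center", "scaled:scale") are ignored here
all.equal(by.apply, by.scale, check.attributes = FALSE)

# Largest absolute difference between the two results
max(abs(by.apply - by.scale))
```

Any difference reported here should be at the level of floating-point noise.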

You should check your data for missing values.
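A quick way to do that check, sketched on a small made-up matrix (the values in `my.mat` below are hypothetical and chosen to contain one missing value and one constant row). Constant rows are worth looking for as well, since a standard deviation of zero makes the z-score undefined.

```r
# Hypothetical matrix with one constant row and one missing value
my.mat <- matrix(c(1, 2, 3, 4,
                   5, 5, 5, 5,
                   2, NA, 6, 8),
                 nrow = 3, byrow = TRUE)

anyNA(my.mat)              # are there any missing values at all?
rowSums(is.na(my.mat))     # number of missing values per row

# Rows with zero standard deviation cannot be z-scored (division by zero)
row.sd <- apply(my.mat, 1, sd, na.rm = TRUE)
which(row.sd == 0)

# The constant row becomes NaN after row scaling
scaled <- t(scale(t(my.mat)))
scaled[2, ]
```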

Kevin


You edited the post while I was answering. The two plots may use different breaks, which would directly affect the colour bar and colour shading.
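One way to rule this out is to pass the same `color` and `breaks` to both calls. The sketch below rests on an assumption worth verifying against your installed version: as far as I can tell from the pheatmap source, when `scale` is not `"none"` and no `breaks` are given, the breaks are centred on zero, whereas with `scale = "none"` they run from the minimum to the maximum of the matrix. The data, palette, and object names are made up for illustration.

```r
set.seed(1)
my.mat <- matrix(rexp(200, rate = 0.1), ncol = 20)   # hypothetical data
my.scaled <- t(scale(t(my.mat)))                     # manual row scaling

# Symmetric breaks around zero; 100 colours need 101 break points
n.col <- 100
lim <- max(abs(my.scaled), na.rm = TRUE)
my.breaks <- seq(-lim, lim, length.out = n.col + 1)
my.colors <- colorRampPalette(c("navy", "white", "firebrick3"))(n.col)

if (requireNamespace("pheatmap", quietly = TRUE)) {
  # Same colours and breaks in both calls, so the colour bars are comparable
  pheatmap::pheatmap(my.mat, scale = "row",
                     color = my.colors, breaks = my.breaks)
  pheatmap::pheatmap(my.scaled, scale = "none",
                     color = my.colors, breaks = my.breaks)
}
```

With identical breaks and colours, any remaining visual difference between the two plots would have to come from the values themselves.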