cannot replicate the pheatmap scale function
4.1 years ago
lessismore ★ 1.3k

Dear all,

I am using pheatmap to generate some heatmaps with scale="row", but I cannot replicate the results (at least visually, because I didn't manage to export the scaled matrix) when I scale the matrix manually with t(scale(t(my.mat))). The scale function scales by column and is normally called as scale(x, center = TRUE, scale = TRUE), hence the two transposes. I noticed the difference because, when I replot the manually scaled matrix with scale="none", the heatmaps look different.
If somebody has any idea about why this happens, it would be really appreciated.

P.S. I've checked the exact function pheatmap uses:

scale_rows = function(x){
m = apply(x, 1, mean, na.rm = T)
s = apply(x, 1, sd, na.rm = T)
return((x - m) / s)
}


the clustering in fact is exactly the same but it looks like a difference in the colour bar.

0

the clustering in fact is exactly the same but it looks like a difference in the colour bar.

It should be the opposite. The scaling is the same, but the clustering is different due to the order of operations. See this previous thread: Clustering differences between heatmap.2 and pheatmap

0

Hey igor, it's the same clustering in my case because I am comparing:
manually scaled matrix > pheatmap with scale="none" VS unscaled matrix > pheatmap with scale="row"

2
4.1 years ago

The row scaling functions from both pheatmap() (pheatmap) and heatmap.2() (gplots) should produce the same results as t(scale(t(x))). Here is the proof using the functions from these packages:

# random data

randomdata <- matrix(rexp(200, rate=.1), ncol=20)


# heatmap.2 (gplots) row scaling

heatmap.2.scale <- function(x, na.rm) {
  retval <- NULL
  # centre each row on its mean
  retval$rowMeans <- rm <- rowMeans(x, na.rm = na.rm)
  x <- sweep(x, 1, rm)
  # divide each row by its standard deviation
  retval$rowSDs <- sx <- apply(x, 1, sd, na.rm = na.rm)
  x <- sweep(x, 1, sx, "/")
  x
}

randomdata.scaled1 <- round(heatmap.2.scale(randomdata, na.rm=TRUE), 3)


# pheatmap row scaling

pheatmap.scale <- function(x) {
m = apply(x, 1, mean, na.rm = T)
s = apply(x, 1, sd, na.rm = T)
return((x - m) / s)
}

randomdata.scaled2 <- round(pheatmap.scale(randomdata), 3)


# manual row scaling

randomdata.scaled3 <- round(data.frame(t(scale(t(randomdata)))), 3)


# test if there are differences

all((randomdata.scaled1 == randomdata.scaled2) == TRUE)
[1] TRUE

all((randomdata.scaled1 == randomdata.scaled3) == TRUE)
[1] TRUE

all((randomdata.scaled2 == randomdata.scaled3) == TRUE)
[1] TRUE


You should check your data for missing values.
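
A minimal sketch of such a check, using a small made-up matrix (my.mat here is hypothetical toy data, not the poster's). It looks for NA entries and also for constant rows, because a row with zero standard deviation turns into NaN after scaling:

```r
# Toy matrix (hypothetical data): row 2 contains an NA, row 3 is constant
my.mat <- rbind(
  c(1, 2, 3, 4),
  c(2, NA, 6, 8),
  c(5, 5, 5, 5)
)

# Are there any missing values at all?
has.na <- anyNA(my.mat)

# Which rows contain at least one NA?
na.rows <- which(rowSums(is.na(my.mat)) > 0)

# Which rows have zero standard deviation? (x - mean) / 0 gives NaN
row.sd <- apply(my.mat, 1, sd, na.rm = TRUE)
constant.rows <- which(row.sd == 0)

# Row-scale and count the non-finite entries that result
scaled <- t(scale(t(my.mat)))
n.nonfinite <- sum(!is.finite(scaled))
```

If n.nonfinite is greater than zero, the affected rows are worth inspecting before comparing heatmaps.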

Kevin

1

You edited the post while I answered. The two plots may use different breaks, which would directly affect the colour bar and colour shading.
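
A rough sketch of how the breaks could differ. It rests on an assumption about pheatmap's defaults that is worth verifying against the installed version: with scale="row" and no breaks supplied, the breaks appear to be centred on zero (from -m to m, where m is the largest absolute scaled value), whereas with scale="none" they run from the minimum to the maximum of the matrix. The variable names are made up for illustration.

```r
set.seed(1)
randomdata <- matrix(rexp(200, rate = 0.1), ncol = 20)

# Manual row scaling, as in the answer above
scaled <- t(scale(t(randomdata)))

n.colours <- 100  # length of pheatmap's default palette

# Breaks spanning min..max (assumed behaviour with scale = "none")
breaks.minmax <- seq(min(scaled), max(scaled), length.out = n.colours + 1)

# Breaks symmetric around zero (assumed behaviour with scale = "row")
m <- max(abs(scaled))
breaks.centred <- seq(-m, m, length.out = n.colours + 1)

# Draw both heatmaps only if pheatmap is installed
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(randomdata, scale = "row")
  pheatmap::pheatmap(scaled, scale = "none", breaks = breaks.centred)
}
```

If the assumption holds, passing the centred breaks explicitly should make the colour bar of the manually scaled plot match the scale="row" plot.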

0

That's the only explanation! Thanks