In calculating z-scores for microarray or RNA-Seq data, I have found two main answers on how to obtain them.
For example, in R, having a log2 expression matrix x with genes in rows and samples in columns, I would do:
zscore <- function(x) {
z <- (x - mean(x)) / sd(x)
return(z)
}
But many people suggest using the base R function scale on the transposed matrix instead, like this:
mat_zscore <- t(scale(t(x)))
If I am not mistaken, the two approaches are different. In the first one I subtract the mean and divide by the SD of the whole matrix (all genes and samples pooled together), while scale operates on columns by default, so the transpose is needed to calculate the mean and SD of each gene (row) separately.
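To illustrate what I mean, here is a minimal sketch on a made-up matrix. The values and the names toy, global_z and row_z are only for illustration and do not come from real data.

```r
# Toy log2 expression matrix: 3 genes (rows) x 4 samples (columns); values are made up
toy <- matrix(c( 1,  2,  3,  4,
                10, 10, 12, 12,
                 5,  7,  5,  7),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# Approach 1: a single mean and a single SD computed over every value in the matrix
global_z <- (toy - mean(toy)) / sd(toy)

# Approach 2: scale() centers and scales columns, so transpose to standardize each gene
row_z <- t(scale(t(toy)))

rowMeans(global_z)    # differs between genes: each gene keeps its overall expression level
rowMeans(row_z)       # approximately 0 for every gene
apply(row_z, 1, sd)   # 1 for every gene
```

With the first approach a highly expressed gene stays high in every sample, whereas with the second each gene is put on its own scale, so only its variation across samples remains.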
My question is, is one of the two more correct than the other? And why are both given as valid alternatives?
Thanks