RNA-seq z-score normalization
2.2 years ago
margott.j ▴ 10

I have RNA-seq data that I have analysed with DESeq2 (R), and I would like to create heatmaps. I have also run my data through the GSEA software from the Broad Institute, and I would like to replicate the heatmaps it produces in R. I assume that the software normalises the data using z-scores. I have already log2-transformed the data using two methods:

1. vst transformation
2. a simple equation as seen in a previous post: tdata <- log2(counts + 1)

There are previous forum threads on z-score normalisation, but they don't explain how to actually do this in R. Would anyone know the code for performing this normalisation?

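There is a base R function for this: scale(). Just keep in mind that it centres and scales each column, so if your genes are in rows and you want per-gene z-scores, you have to feed it the transposed matrix and transpose the result back: t(scale(t(mat))).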
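As a minimal sketch of the transpose trick, assuming a small made-up matrix (mat is a toy stand-in for your log2-transformed data, with genes in rows and samples in columns):

```r
# Toy matrix of log2-scale expression values: 3 genes (rows) x 4 samples (columns).
set.seed(1)
mat <- matrix(rnorm(12, mean = 8, sd = 2), nrow = 3,
              dimnames = list(paste0("gene", 1:3), paste0("sample", 1:4)))

# scale() standardises columns, so transpose to put genes in columns,
# scale, then transpose back to the original orientation.
z_rows <- t(scale(t(mat)))

# Each gene should now have mean 0 and standard deviation 1 across samples.
rowMeans(z_rows)
apply(z_rows, 1, sd)
```

If you want z-scores per sample instead, scale(mat) without any transposing does that directly.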

Maybe you can have a look at this site: https://www.r-bloggers.com/2020/02/how-to-compute-the-z-score-with-r/

2.2 years ago

I suggest getting the normalized counts from DESeq2 using the counts() function with normalized = TRUE. Then filter your matrix of normalized counts, keeping the genes associated with the gene set/pathway of interest. Finally, use the pheatmap package to plot your heatmaps. Take a look at the scale = argument of the pheatmap() function, which scales your matrix by rows or by columns and so calculates z-scores. Have a look at this R documentation site.

Best regards!
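A rough sketch of those steps, under a few assumptions: the dds object here is a simulated stand-in from makeExampleDESeqDataSet() (use your own DESeqDataSet), gene_set is a hypothetical vector of IDs, and the log2(x + 1) step is borrowed from the question rather than required by pheatmap.

```r
library(DESeq2)
library(pheatmap)

# Stand-in dataset so the sketch runs on its own; replace with your own dds.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 200, m = 8)

# Normalised counts need size factors (running DESeq() also estimates them).
dds <- estimateSizeFactors(dds)
norm_counts <- counts(dds, normalized = TRUE)

# Hypothetical gene set: replace with the IDs from your pathway of interest.
gene_set <- paste0("gene", 1:20)
sub <- norm_counts[rownames(norm_counts) %in% gene_set, , drop = FALSE]

# Assumption: log2-transform before scaling, as in the question.
sub <- log2(sub + 1)

# Drop genes with no variation; their z-scores would be undefined (0/0).
sub <- sub[apply(sub, 1, sd) > 0, , drop = FALSE]

# scale = "row" converts each gene to z-scores across samples before plotting;
# scale = "column" would standardise each sample instead.
pheatmap(sub, scale = "row")
```

Note that pheatmap() only uses the scaled values for drawing; the matrix you passed in is left unchanged.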

Thank you! I see that the scale = argument of pheatmap() allows scaling by rows or by columns; in my case this would be columns. But how do I get the calculated z-scores? Sorry, this is unclear from the link you sent me!

To retrieve your calculated z-scores (by column), apply the scale() function to your matrix of normalized counts, as @Papyrus suggested. I understand that you want to retrieve the scaled matrix, am I right?
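For example, a minimal sketch with a made-up matrix (norm_mat is a hypothetical stand-in for your normalized counts) that keeps the column-wise z-scores as an object you can inspect or export:

```r
# Toy stand-in for a normalized count matrix: 5 genes (rows) x 4 samples (columns).
set.seed(42)
norm_mat <- matrix(rexp(20, rate = 1 / 100), nrow = 5,
                   dimnames = list(paste0("gene", 1:5), paste0("sample", 1:4)))

# Column-wise z-scores: (x - column mean) / column standard deviation.
z_cols <- scale(norm_mat)

# scale() stores the means and standard deviations it used as attributes.
attr(z_cols, "scaled:center")
attr(z_cols, "scaled:scale")

# The same calculation written out by hand, for comparison.
manual <- sweep(sweep(norm_mat, 2, colMeans(norm_mat), "-"),
                2, apply(norm_mat, 2, sd), "/")
all.equal(manual, z_cols, check.attributes = FALSE)

# Plain data frame of z-scores, e.g. for write.csv().
z_df <- as.data.frame(z_cols)
```

If you then plot z_cols with pheatmap(), set scale = "none" so the values are not scaled a second time.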