I would like to perform unsupervised hierarchical clustering on some RNA-seq data, but I was told I need to normalize the data by z-score per gene.
My question is: what type of RNAseq data should z-score normalization be performed on? Is it better to do the normalization on RPKM, CPM, log2 CPM, etc?
I typically represent my RNAseq data as mean-centered log2 CPM: Can I perform z-score normalization per gene on mean-centered log2 CPM? Or is this not advised?