RPM values of RNASeq for clustering and heatmap
4.7 years ago
cheischra • 0

Hi,

We have RPM (not RPKM) values for 7 different plant lines (3 replicates each). I want to perform unsupervised clustering and plot a heatmap to look for interesting patterns. My question is how to mean-center and normalise these RPM values for heatmap visualisation and clustering. Would a log2 transformation of the RPM values help? For example, the methods section (expression data analysis) of this paper http://www.plantphysiol.org/content/168/4/1684/tab-figures-data says that "RPM values were centered around the mean and normalized using the sum-of-squares method". How would I actually do this?

Thanks.

RNA-Seq normalization clustering RPM • 1.4k views
4.7 years ago
ATpoint 81k

Leaving aside whether RP(K)M is a good choice for count normalization (it is not), what you probably want is Z-normalization, also known as standardization.

In R, given an RPM matrix with samples as columns and genes as rows, that would be:

zscored <- t(scale(t(your.rpm)))

I suggest you log2-transform your data before you do that.
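
To make the order of operations concrete, here is a minimal sketch of the whole path from RPM to a clustered heatmap. The toy matrix, the pseudocount of 1, the Euclidean distance and the base R `heatmap()` call are all assumptions chosen for illustration, not something taken from the paper. The `unit.ss` step is only one possible reading of "normalized using the sum-of-squares method" (mean-center each gene, then divide by the square root of its sum of squared deviations); under that reading it differs from the Z-score by the constant factor sqrt(n - 1), so the heatmap pattern and the clustering are the same.

```r
# Hypothetical toy data: 6 genes x 21 samples (7 lines, 3 replicates each).
set.seed(1)
rpm <- matrix(rexp(6 * 21, rate = 0.01), nrow = 6,
              dimnames = list(paste0("gene", 1:6),
                              paste0("line", rep(1:7, each = 3),
                                     "_rep", rep(1:3, times = 7))))

# log2-transform first; the pseudocount of 1 (an assumption) keeps zeros finite.
log.rpm <- log2(rpm + 1)

# Row-wise Z-score: scale() standardizes columns, hence the double transpose.
# Genes with zero variance would give NaN here and should be filtered beforehand.
zscored <- t(scale(t(log.rpm)))

# One possible reading of "sum-of-squares" normalization (an assumption):
# mean-center each gene, then divide by sqrt(sum of squared deviations).
centered <- log.rpm - rowMeans(log.rpm)
unit.ss  <- centered / sqrt(rowSums(centered^2))
# zscored equals unit.ss * sqrt(ncol(log.rpm) - 1), i.e. the same pattern.

# Unsupervised hierarchical clustering of genes and of samples.
gene.clust   <- hclust(dist(zscored))
sample.clust <- hclust(dist(t(zscored)))

# scale = "none" because the rows are already standardized.
heatmap(zscored,
        Rowv = as.dendrogram(gene.clust),
        Colv = as.dendrogram(sample.clust),
        scale = "none")
```

With real data you would replace `rpm` with your own matrix; it usually also helps to drop genes that are zero or near-constant across all samples before scaling.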
