Impact Of Microarray Gene Expression Normalization On Clustering
11.0 years ago
elb83 ▴ 80

Hi guys, I have a question about microarray gene expression normalization techniques and clustering. I have a gene expression matrix of around 13,000 genes (the rows) and 200 samples (the columns). I normalized the matrix using RMA (which gives values on the log2 scale) and then clustered both the samples and the genes with hierarchical clustering (HCL), using Pearson correlation and average linkage. The genes and the samples cluster very well! If I repeat the normalization using MAS5 (followed by a log2 transform) and cluster again with the same criteria, the genes and the samples do not cluster anymore! I also tried centering the data, that is, for each row (gene) I subtracted its median value across the samples, both after RMA and after MAS5 normalization. Again, the genes and the samples cluster very well with RMA but not with MAS5. Then, for each gene (row), I computed the median across all samples. After RMA normalization the distribution of these per-gene medians is normal (according to the Shapiro test), while after MAS5 it is not. Can this affect the quality of the clustering? Why is there such a large difference between the two methods?
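For readers following along, here is a minimal sketch of the centering and distance steps described above, assuming the distance is taken as 1 minus the Pearson correlation. The toy matrix and the helper names (`pearson`, `median_center_rows`, `columns`) are made up for illustration and are not from the original analysis. It shows one thing worth keeping in mind: subtracting a per-gene median leaves gene-gene correlations unchanged, but it can change sample-sample correlations.

```python
import math
from statistics import median

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def median_center_rows(matrix):
    """Subtract each gene's (row's) median across samples."""
    centered = []
    for row in matrix:
        m = median(row)
        centered.append([v - m for v in row])
    return centered

def columns(matrix):
    """Return the matrix as a list of columns (one list per sample)."""
    return [list(col) for col in zip(*matrix)]

# Toy log2 matrix: 4 genes (rows) x 3 samples (columns); values are made up.
expr = [
    [8.0, 8.5, 9.1],
    [5.2, 5.0, 6.3],
    [11.0, 10.2, 10.9],
    [7.1, 7.9, 7.4],
]
centered = median_center_rows(expr)

# Gene-gene distance (between rows): a per-row shift does not change it.
gene_before = 1 - pearson(expr[0], expr[1])
gene_after = 1 - pearson(centered[0], centered[1])

# Sample-sample distance (between columns): row centering can change it.
cols_raw, cols_cen = columns(expr), columns(centered)
sample_before = 1 - pearson(cols_raw[0], cols_raw[1])
sample_after = 1 - pearson(cols_cen[0], cols_cen[1])

print(gene_before, gene_after)
print(sample_before, sample_after)
```

So if the gene dendrogram looks the same before and after centering, that is expected; only the sample dendrogram is sensitive to this step.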

microarray gene-expression clustering normalization • 5.0k views

What do you mean by cluster well? Are you getting more clusters with one versus the other? Are you getting better cluster densities? Are you getting better cluster separation? Do the clusters make more sense biologically?


Hi Damian! The genes and samples group well together. In other words, with RMA I get better cluster separation!

11.0 years ago

Clustering is probably not really the best method to evaluate normalization. You might take a look at boxplots, density plots, and some MA plots, if necessary. RMA is probably the better normalization for most situations, though. Finally, be sure that you are comfortable with the quality of the data before proceeding too far.
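As a rough illustration of the diagnostics mentioned above, the sketch below computes the numbers behind a boxplot (a five-number summary per array) and behind an MA plot for a pair of arrays, where M is the log2 ratio and A is the mean log2 intensity. The intensities and the function names (`ma_values`, `box_summary`) are made up for illustration; in practice you would plot these rather than print them.

```python
import math
from statistics import quantiles

def ma_values(x, y):
    """Return (M, A) per gene for two arrays of positive linear-scale intensities."""
    m = [math.log2(a) - math.log2(b) for a, b in zip(x, y)]
    a = [(math.log2(p) + math.log2(q)) / 2 for p, q in zip(x, y)]
    return m, a

def box_summary(values):
    """Five-number summary, i.e. the values a boxplot is drawn from."""
    q1, q2, q3 = quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}

# Made-up linear-scale intensities for the same 5 genes on two arrays.
array_1 = [120.0, 850.0, 64.0, 2048.0, 300.0]
array_2 = [100.0, 900.0, 128.0, 1024.0, 310.0]

M, A = ma_values(array_1, array_2)
summary_1 = box_summary([math.log2(v) for v in array_1])
summary_2 = box_summary([math.log2(v) for v in array_2])

print(M)
print(A)
print(summary_1)
print(summary_2)
```

After a reasonable normalization you would expect the per-array summaries to look similar and the M values to scatter around zero across the range of A.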


Hi Sean! Thanks a lot for the suggestions!

11.0 years ago
Neilfws 49k

Agree with Sean that clustering is not a good measure of normalization.

As for why RMA and MAS5 give different clusters: it is not so surprising when you consider that RMA values are log2-transformed whereas MAS5 values are not by default. Log transformation has the effect of "squashing" values closer together, which results in smaller values in the distance matrix and hence "tighter" clusters.
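A small sketch of the "squashing" effect, assuming Euclidean distance (the answer does not name a metric) and using made-up intensities for two samples:

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Made-up linear-scale intensities for 4 genes in two samples.
sample_a = [100.0, 2000.0, 50.0, 8000.0]
sample_b = [150.0, 1000.0, 80.0, 16000.0]

# Distance on the linear scale: dominated by the highest-intensity gene.
dist_linear = euclidean(sample_a, sample_b)

# Distance after log2 transformation: each gene contributes its fold change.
dist_log2 = euclidean(
    [math.log2(v) for v in sample_a],
    [math.log2(v) for v in sample_b],
)

print(dist_linear, dist_log2)
```

On the linear scale the single brightest gene accounts for almost all of the distance, while on the log2 scale all four genes contribute on a comparable footing.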
