Hello,

I am trying to do a microarray analysis on time-series (5 time points) gene expression (GE) data for 3000 genes. I want to calculate all-vs-all (3000 vs 3000 genes) Pearson correlation coefficient (PCC) values. I searched the literature and found that some studies use values that are the **log (base 2) of the ratio of the median of the test spot's intensity to the median of the control's intensity**. The intensity values are already normalised (background normalised). Other studies instead **subtract the control GE intensity value of a gene from the test value**.
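To make the two kinds of values concrete, here is a minimal sketch in plain Python. The function names and the intensity numbers are hypothetical, chosen only for illustration; they are not taken from any study or from my data.

```python
import math

def log2_ratio(test_median, control_median):
    """Log (base 2) of the test/control median intensity ratio."""
    # Assumes both intensities are positive (already background normalised).
    return math.log2(test_median / control_median)

def difference(test_median, control_median):
    """Control intensity subtracted from the test intensity."""
    return test_median - control_median

# Hypothetical median intensities for one gene at one time point.
test_median, control_median = 1200.0, 600.0

print(log2_ratio(test_median, control_median))
print(difference(test_median, control_median))
```

The log ratio is symmetric around zero (a two-fold increase and a two-fold decrease give values of equal size and opposite sign), whereas the difference scales with the absolute intensity of the spot.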

The problem is that when I plot the distribution of the correlation coefficient values (3K × 3K = 9,000,000 values), **about half (~50%) of the values fall between 0.6 and 1.00.** This suggests that most of the genes are very highly correlated with each other. This is unexpected, because usually only a very small fraction of genes show such a high correlation with each other. I therefore suspect that the kind of normalisation used to derive the ratios below might not be suitable if the normalised values are to be used to calculate correlation coefficients. Following are the values of two genes (for 5 time points), which I used for calculating PCC:

**Gene A:** 0.0715 -0.1203 0.0039 0.7151 1.202

**Gene B:** -0.288 -0.0900 0.2310 0.3510 0.415
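A minimal sketch of the calculation described above, in plain Python, applied to the two genes listed. The function names `pearson` and `fraction_in_range` are illustrative and do not come from any package; this sketch counts each unordered gene pair once and excludes self-pairs, which is an assumption about how the distribution is built.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    # Note: a constant profile has zero variance and raises ZeroDivisionError.
    r = cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, r))

def fraction_in_range(profiles, low=0.6, high=1.0):
    """Fraction of unordered gene pairs whose PCC lies in [low, high]."""
    hits = 0
    total = 0
    for x, y in combinations(profiles, 2):
        total += 1
        if low <= pearson(x, y) <= high:
            hits += 1
    return hits / total

gene_a = [0.0715, -0.1203, 0.0039, 0.7151, 1.202]
gene_b = [-0.288, -0.0900, 0.2310, 0.3510, 0.415]

r_ab = pearson(gene_a, gene_b)
print(round(r_ab, 3))
```

For the full 3000-gene matrix, NumPy's `corrcoef` computes the same coefficients for all rows at once and would be much faster than this loop.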

Kindly advise which of the above two kinds of values is appropriate for calculating Pearson correlation coefficients for all-vs-all genes.