My lab is interested in doing a meta-analysis of two microarray datasets downloaded from a public repository. In each dataset, microarrays were run on tissue collected at two timepoints. We want to ask whether, across both datasets, genes change in expression between timepoint 1 and timepoint 2.
Originally, I approached this problem with RankProd and found a few genes of interest that were significantly up- or down-regulated. A colleague took a different approach: first, the background-corrected, normalized data were adjusted for batch effects using ComBat. From this ComBat-adjusted expression matrix, ten genes of interest were isolated. She then ran a two-way ANOVA with factors for Gene (ten levels) and Timepoint (two levels). With this method, she found a significant Gene x Timepoint interaction. With Tukey's post-hoc test, she found that several genes were significantly up- or down-regulated. Some of these, but not all, were the same genes that RankProd had identified as up- or down-regulated.
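To make the model concrete, here is a minimal sketch in Python of the kind of two-way ANOVA described above, run on synthetic numbers rather than our data. The function name `two_way_anova`, the 18 pooled replicates per timepoint (3 + 15, treated as one group after ComBat), the noise level, and the shift planted in the first gene are all assumptions for illustration. My colleague's actual analysis may have been set up differently, and the sketch stops at the F statistics without the Tukey step.

```python
import random
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction.

    cells[g][t] is the list of replicate values for gene g at timepoint t.
    Every cell must hold the same number of replicates.
    Returns (sums of squares, degrees of freedom, F statistics, total SS).
    """
    G, T, n = len(cells), len(cells[0]), len(cells[0][0])
    values = [x for row in cells for cell in row for x in cell]
    grand = mean(values)

    gene_means = [mean([x for cell in row for x in cell]) for row in cells]
    time_means = [mean([x for row in cells for x in row[t]]) for t in range(T)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_gene = T * n * sum((m - grand) ** 2 for m in gene_means)
    ss_time = G * n * sum((m - grand) ** 2 for m in time_means)
    ss_cells = n * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_error = sum(
        (x - cell_means[g][t]) ** 2
        for g in range(G) for t in range(T) for x in cells[g][t]
    )
    ss_total = sum((x - grand) ** 2 for x in values)

    ss = {
        "gene": ss_gene,
        "time": ss_time,
        "gene:time": ss_cells - ss_gene - ss_time,  # interaction
        "error": ss_error,
    }
    df = {
        "gene": G - 1,
        "time": T - 1,
        "gene:time": (G - 1) * (T - 1),
        "error": G * T * (n - 1),
    }
    ms_error = ss["error"] / df["error"]
    f = {k: (ss[k] / df[k]) / ms_error for k in ("gene", "time", "gene:time")}
    return ss, df, f, ss_total


# Synthetic stand-in for a ComBat-adjusted matrix restricted to ten genes.
# All numbers below are made up for illustration.
rng = random.Random(0)
N_GENES, N_TIMES, N_REPS = 10, 2, 18
cells = [
    [
        # Only the first gene is shifted at the second timepoint.
        [rng.gauss(8.0, 0.5) + (2.0 if (g == 0 and t == 1) else 0.0)
         for _ in range(N_REPS)]
        for t in range(N_TIMES)
    ]
    for g in range(N_GENES)
]

ss, df, f, ss_total = two_way_anova(cells)
for term in ("gene", "time", "gene:time"):
    print(f"{term:10s} df={df[term]:3d}  SS={ss[term]:8.3f}  F={f[term]:8.3f}")
print(f"{'error':10s} df={df['error']:3d}  SS={ss['error']:8.3f}")
```

Each F statistic would be compared against an F distribution with the listed term and error degrees of freedom to get a p-value. Note that the sketch gives every gene-by-timepoint cell the same number of pooled samples and does not track which dataset a sample came from.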
My concerns are 1) the different numbers of samples in each dataset (Dataset 1 = 3 samples at each timepoint, Dataset 2 = 15 samples at each timepoint), and 2) statistical issues we may be overlooking by running an ANOVA on ComBat output.
Is it okay to apply ANOVA to microarray data in this way? If not, why not?
Thank you for your help!