I'm looking at an old microarray gene expression dataset and have a question about correcting for chip-to-chip batch effects. The samples were run on the Affymetrix Mouse Genome 430 2.0 Array.
The experimental design is as follows: paired mouse cutaneous skin (c) and oral mucosa (m) samples, 3 biological replicates each, taken at 8 time points (including t0 = control), for a total of 48 samples. My concern is that, according to the chip hybridization data, all samples from a given time point (both cutaneous and mucosa) were hybridized to the same chip, so each time point appears on only one chip. Since I'm looking for differential expression changes over time, I'm worried that I cannot separate the batch effect from the time effect. Please see below:
Chip 1: t0_c1, t0_c2, t0_c3, t1_c1, t1_c2, t1_c3, t0_m1, t0_m2, t0_m3, t1_m1, t1_m2, t1_m3
Chip 2: t2_c1, t2_c2, t2_c3, t3_c1, t3_c2, t3_c3, t2_m1, t2_m2, t2_m3, t3_m1, t3_m2, t3_m3
Chip 3: t4_c1, t4_c2, t4_c3, t5_c1, t5_c2, t5_c3, t4_m1, t4_m2, t4_m3, t5_m1, t5_m2, t5_m3
Chip 4: t6_c1, t6_c2, t6_c3, t7_c1, t7_c2, t7_c3, t6_m1, t6_m2, t6_m3, t7_m1, t7_m2, t7_m3
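To make the layout easier to check, here is a minimal Python sketch (just an illustration, not part of any analysis pipeline) that rebuilds the sample names from the list above and tabulates which chips each time point and each tissue appears on. The variable names are my own.

```python
from collections import defaultdict

# Chip layout rebuilt from the list above: each chip holds two time points,
# both tissues (c = cutaneous, m = mucosa), three replicates each.
layout = {
    chip: [f"t{t}_{tissue}{rep}"
           for tissue in "cm" for t in times for rep in (1, 2, 3)]
    for chip, times in {1: (0, 1), 2: (2, 3), 3: (4, 5), 4: (6, 7)}.items()
}

chips_per_time = defaultdict(set)
chips_per_tissue = defaultdict(set)
for chip, samples in layout.items():
    for name in samples:
        time, rest = name.split("_")   # e.g. "t0", "c1"
        chips_per_time[time].add(chip)
        chips_per_tissue[rest[0]].add(chip)

# Report which chips each time point and each tissue was hybridized on.
for time in sorted(chips_per_time):
    print(time, "-> chips", sorted(chips_per_time[time]))
for tissue in sorted(chips_per_tissue):
    print(tissue, "-> chips", sorted(chips_per_tissue[tissue]))
```

Each time point lands on exactly one chip, while both tissues appear on all four chips. If I understand correctly, that means tissue comparisons within a time point stay within a chip, but any comparison between time points on different chips (for example t0 vs t2) cannot be distinguished from a chip difference.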
When I look at the normalized data on a PCA plot, the samples group according to the chip they were run on: t0_c with t1_c, t0_m with t1_m, etc. Is there any way to correct for this batch effect given how the samples were distributed across the chips?