Detecting batch effects not via PCA
7.1 years ago
chris86 ▴ 370

Hi

I am trying to detect a batch effect in my microarray samples. Each sample belongs to several different groups, including one of five batches. I currently melt my expression data frame (log2-normalised values) into long format, with sample IDs in column 1 and expression values in column 2, and then add information such as serology or batch in extra columns. I then run an ANOVA to gauge how large the batch effect is relative to the other variables.
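The reshaping step described above could look roughly like the sketch below. It uses base R only and a tiny made-up data set; `expr`, `pheno`, and `long` are hypothetical names chosen for illustration, and only `merged`, `value`, `SEROLOGY`, and `HYB.BATCH` come from the model call in this post.

```r
# Toy data, invented purely for illustration: 3 probes x 4 samples of log2 values.
set.seed(1)
expr <- matrix(rnorm(12, mean = 8), nrow = 3,
               dimnames = list(paste0("probe", 1:3), paste0("S", 1:4)))

# Hypothetical sample annotation; column names mirror those in the model formula.
pheno <- data.frame(sampleID  = paste0("S", 1:4),
                    SEROLOGY  = factor(c("pos", "neg", "pos", "neg")),
                    HYB.BATCH = factor(c("b1", "b1", "b2", "b2")))

# "Melt" to long format: one row per probe/sample pair.
# as.vector() walks the matrix column by column, so probes cycle fastest.
long <- data.frame(probe    = rep(rownames(expr), times = ncol(expr)),
                   sampleID = rep(colnames(expr), each = nrow(expr)),
                   value    = as.vector(expr))

# Attach the annotation columns to every row by sample ID.
merged <- merge(long, pheno, by = "sampleID")

# Fit the same kind of additive model on the long table.
fit <- aov(value ~ SEROLOGY + HYB.BATCH, data = merged)
print(summary(fit))
```

Note that a long table built this way has one row per probe per sample, so every probe counts as a separate observation in the model.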

aov.ex2 = aov(value~CELL.TYPE+VISIT+SEROLOGY+HYB.BATCH,data=merged)

                 Df   Sum Sq Mean Sq F value Pr(>F)
CELL.TYPE         4     2221   555.3  109.61 <2e-16 ***
VISIT             4      552   138.0   27.25 <2e-16 ***
SEROLOGY          1      347   347.3   68.55 <2e-16 ***
BATCH             4     2123   530.9  104.79 <2e-16 ***
Residuals   5391730 27314376     5.1
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                 Df   Sum Sq Mean Sq F value Pr(>F)
CELL.TYPE         4     1318   329.6  65.683 <2e-16 ***
VISIT             4      410   102.6  20.440 <2e-16 ***
SEROLOGY          1      467   466.7  93.004 <2e-16 ***
BATCH             4        3     0.6   0.128  0.972
Residuals   5391730 27058055     5.0
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


I can see that the p-value for the batch effect has gone up a lot. However, I am a bit confused: the PCA plot did not show any batch effect, yet the ANOVA gives a highly significant p-value for batch. The F value for batch is also very high, higher than for other clinical variables, which I would not really expect. Any comments or thoughts? Am I doing this correctly?
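One way to put those F values in context is to look at effect size as well as significance. With more than five million residual degrees of freedom, even very small differences can produce tiny p-values. The sketch below computes eta squared (each term's share of the total sum of squares) from the numbers in the first table. Eta squared is an assumption here, used only as a rough gauge; it is not part of the analysis above.

```r
# Sums of squares copied from the first ANOVA table in this post.
ss <- c(CELL.TYPE = 2221, VISIT = 552, SEROLOGY = 347,
        BATCH = 2123, Residuals = 27314376)

# Eta squared: fraction of the total sum of squares attributed to each term.
eta_sq <- ss / sum(ss)
print(signif(eta_sq, 3))
```

On these figures every model term, batch included, accounts for less than 0.01% of the total sum of squares, which may be why nothing stands out on a PCA plot even though the p-values are very small.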

Cheers,
Robert

next-gen • 1.9k views

What are the two different analyses? BATCH is not significant in the second one.

Please post some of the data so we can see the structure.

7.1 years ago
vassialk ▴ 200

Try JMP, Genesis, Expander, MeV, and the relevant Bioconductor packages. Everything you need is there; read the documentation.