How To Transform Microarray Data To Adjust For Batch Effects
13.5 years ago

I've downloaded someone else's microarray data (Affymetrix HG-U133 Plus 2.0, normalized with GCRMA) and noticed that many unexpected genes were differentially expressed between the sexes (about 30 males, 30 females). Although a few genes (e.g. the Y-linked gene EIF1AY) will show obvious sex-linkage in any human sample, in my experience such effects are not usually this strong or pervasive. I checked the headers in the CEL files and found that processing date is heavily confounded with sex: files processed in years one and two were overwhelmingly male, while those from year three were all female. I concluded that the effect is due to technical variation, or at least that it cannot be distinguished from such bias.

Many tools, such as SAM, allow you to specify batches. However, I wish to do downstream analysis using my own methods. What is the best approach to transform the data set to reduce the batch effect? I am resigned to losing any ability to detect true sex-specific gene expression. If I were only performing linear modeling, I could include the batch as a factor in my model. However, I'd also like to (for example) analyze correlation using Spearman's rank correlation, and for that I don't know of an obvious way to account for batch.
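To make the question concrete, here is a minimal Python sketch of the simplest transformation of this kind: subtract each gene's per-batch mean, add back the gene's overall mean, and run the rank correlation on the adjusted matrix. This is plain per-batch mean centering, not the method from either paper cited below. The function names `center_by_batch` and `spearman`, and the genes-by-samples layout, are assumptions made for illustration. Because year three is entirely female here, this also removes most real sex differences along with the batch shift.

```python
import numpy as np


def center_by_batch(expr, batches):
    """Remove per-batch mean shifts from each gene (hypothetical helper).

    expr    : 2-D array of log-scale expression, genes in rows, samples in columns
    batches : 1-D sequence of batch labels, one per sample (column)
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    if batches.shape[0] != expr.shape[1]:
        raise ValueError("need exactly one batch label per sample (column)")

    # Overall mean of each gene, restored at the end so values stay on scale.
    grand_mean = expr.mean(axis=1, keepdims=True)

    adjusted = expr.copy()
    for b in np.unique(batches):
        cols = batches == b
        # Subtract this batch's mean for every gene.
        adjusted[:, cols] -= expr[:, cols].mean(axis=1, keepdims=True)
    return adjusted + grand_mean


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of ranks (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]


if __name__ == "__main__":
    # Synthetic data only: 5 genes, 12 samples, with year 3 shifted upward.
    rng = np.random.default_rng(0)
    expr = rng.normal(8.0, 1.0, size=(5, 12))
    batches = np.array(["year1"] * 4 + ["year2"] * 4 + ["year3"] * 4)
    expr[:, 8:] += 2.0

    adjusted = center_by_batch(expr, batches)
    print("rho before:", spearman(expr[0], expr[1]))
    print("rho after: ", spearman(adjusted[0], adjusted[1]))
```

A shared batch shift inflates the correlation between otherwise unrelated genes, which is why the adjustment is applied before ranking rather than after.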

A quick literature search turned up Johnson Biostatistics 2007, "Adjusting batch effects in microarray expression data using empirical Bayes methods", which in turn references Benito Bioinformatics 2003, "Adjustment of systematic microarray data biases". Before I dive in any further, would anyone with expertise in this area care to comment on best practices?

modeling data microarray • 5.3k views
13.5 years ago
User 59 13k

I have always used ComBat.R (from the Johnson Biostatistics paper you mention) to do batch correction on datasets. It's performed very well on our datasets with marked batch variation. I can't say it's best practice, but I can certainly recommend it.
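For readers who want a feel for what this kind of correction does, below is a rough Python sketch of a per-gene location and scale adjustment: within each batch, every gene is centered and rescaled so that all batches share the same mean and spread. This is not ComBat. ComBat additionally shrinks the per-batch estimates across genes with empirical Bayes and can keep covariates of interest in the model, neither of which is done here. The function name `location_scale_adjust` and the `eps` guard are assumptions for illustration.

```python
import numpy as np


def location_scale_adjust(expr, batches, eps=1e-8):
    """Give every batch the same per-gene mean and SD (hypothetical helper).

    expr    : 2-D array of log-scale expression, genes in rows, samples in columns
    batches : 1-D sequence of batch labels, one per sample (column)
    eps     : floor on the per-batch SD to avoid division by zero
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    if batches.shape[0] != expr.shape[1]:
        raise ValueError("need exactly one batch label per sample (column)")

    gene_mean = expr.mean(axis=1, keepdims=True)
    labels = np.unique(batches)

    # Residuals after removing each batch's own mean, gene by gene.
    resid = np.empty_like(expr)
    for b in labels:
        cols = batches == b
        resid[:, cols] = expr[:, cols] - expr[:, cols].mean(axis=1, keepdims=True)

    # Pooled within-batch SD per gene: the common scale every batch is mapped to.
    pooled_sd = np.sqrt((resid ** 2).mean(axis=1, keepdims=True))

    adjusted = np.empty_like(expr)
    for b in labels:
        cols = batches == b
        batch_sd = resid[:, cols].std(axis=1, keepdims=True)
        scaled = resid[:, cols] / np.maximum(batch_sd, eps)
        adjusted[:, cols] = scaled * pooled_sd + gene_mean
    return adjusted
```

With small batches the per-batch SD estimates are noisy, which is the situation the empirical Bayes shrinkage in the Johnson paper is meant to address, so a sketch like this is best treated as a way to understand the idea rather than a replacement for the published tool.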
