Using RLE (Relative Log Expression) Mean Values of Microarray Data to Adjust for Batch Effects
9.7 years ago
Luke ▴ 40

I am analyzing microarray data generated on Illumina HumanHT-12 chips, and the samples were processed in multiple batches. The data I have has been through the 'standard' GenomeStudio normalization steps, but has not been adjusted for any batch effects.

In analyses testing an outcome of interest against the expression values, it is common to 'adjust' for batch effects by including batch as an independent variable, coded as a factor.

I have also seen elsewhere that analysts may adjust for the relative log expression (RLE) mean to account for technical bias. RLE means are more commonly used to assess batch effects using boxplots. I can see from boxplots of my data that a couple of the batches have noticeably higher RLE means, but not all.
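For readers unfamiliar with the quantity, here is a minimal Python sketch of how a per-sample RLE mean could be computed. It assumes the common definition (each log-expression value minus that probe's median across all samples, then averaged within each sample). The function name `rle_means` and the toy values are made up for illustration and are not from my data.

```python
from statistics import mean, median

def rle_means(log_expr):
    """Per-sample mean RLE.

    log_expr maps a probe/gene name to its log2 expression values,
    one value per sample, with samples in the same order for every probe.
    """
    n_samples = len(next(iter(log_expr.values())))
    rle_by_sample = [[] for _ in range(n_samples)]
    for values in log_expr.values():
        probe_median = median(values)  # reference level for this probe
        for j, value in enumerate(values):
            rle_by_sample[j].append(value - probe_median)
    # Average the deviations within each sample
    return [mean(deviations) for deviations in rle_by_sample]

# Toy data (hypothetical): samples 3 and 4 are shifted up by one log2 unit,
# as might happen if they came from a different batch.
log_expr = {
    "probeA": [8.0, 8.0, 9.0, 9.0],
    "probeB": [6.0, 6.0, 7.0, 7.0],
    "probeC": [10.0, 10.0, 11.0, 11.0],
}
sample_rle_means = rle_means(log_expr)
print(sample_rle_means)
```

In the toy data the shifted samples end up with higher RLE means than the others, which is the kind of pattern the boxplots show.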

My question is which method most accurately accounts for the technical variability introduced by the batches?

My feeling is that using the RLE mean values is best because, not only is this a linear variable, but it is actually based on the data! The batches may not necessarily have affected the expression, but to include them as covariates anyway must introduce some noise to the model. Whereas including the RLE mean values as a covariate, which are based exclusively on the expression data itself, will only account for the observed technical variation. Is this rationale logical? Have I overlooked anything? Many thanks.
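To make the two options concrete, this is roughly how the covariates would differ in the model's design matrix. It is only a sketch in plain Python: the helper names, the reference-level dummy coding, and the toy values are my own assumptions, and no model is fitted here.

```python
def design_with_batch_factor(expression, batches):
    """Intercept + expression + one 0/1 column per non-reference batch."""
    levels = sorted(set(batches))
    others = levels[1:]  # first level is the reference batch
    columns = ["intercept", "expression"] + [f"batch_{b}" for b in others]
    rows = [
        [1.0, float(x)] + [1.0 if b == level else 0.0 for level in others]
        for x, b in zip(expression, batches)
    ]
    return columns, rows

def design_with_rle_mean(expression, rle_mean):
    """Intercept + expression + a single continuous RLE-mean column."""
    columns = ["intercept", "expression", "rle_mean"]
    rows = [[1.0, float(x), float(r)] for x, r in zip(expression, rle_mean)]
    return columns, rows

# Hypothetical values for four samples from three batches
expression = [7.1, 7.4, 8.2, 8.0]
batches = ["b1", "b1", "b2", "b3"]
rle_mean = [-0.4, -0.5, 0.5, 0.4]

factor_cols, factor_rows = design_with_batch_factor(expression, batches)
rle_cols, rle_rows = design_with_rle_mean(expression, rle_mean)
print(factor_cols)
print(rle_cols)
```

The batch factor adds one column per batch beyond the reference level, whereas the RLE mean adds a single continuous column regardless of how many batches there are.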

microarray expression analysis statistics r • 6.2k views
9.7 years ago
brentp 24k

This article compared several methods of removing batch effects: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0017238

By their metrics, ComBat ( http://www.bu.edu/jlab/wp-assets/ComBat/Abstract.html ) performed the best.

ComBat is available in the R/Bioconductor package sva: http://www.bioconductor.org/packages/release/bioc/html/sva.html

The advantage of these methods over simpler approaches is that some of them can remove batch effects while "protecting" your model of interest.
