I'm just starting analysis of an Affymetrix Human Gene 1.0 ST microarray dataset, and I was hoping to get a consensus on which libraries are 'best' for normalization of these data. I've analyzed a good few Affy datasets, but never with this chip, and I'm learning that the software options are a bit more limited. I've come across the aroma, XPS and oligo R libraries; however, there's no clear winner in the literature, from what I can tell. Thanks very much for any input.
I think you pretty much answered your own question there. Oligo is very straightforward for RMA normalisation of Exon/Gene ST arrays at the probeset and transcript levels. "Best" is a vague term, but most people use an RMA approach for Affy chips. In my experience aroma.affymetrix is a bit harder to get to grips with, but perhaps better for handling large numbers of chips because it has been optimised for lower memory overheads. XPS is similarly optimised.
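For reference, a minimal sketch of what the oligo route looks like, assuming oligo and the matching platform design package (pd.hugene.1.0.st.v1 for this chip) are installed from Bioconductor. The folder name cel_dir is hypothetical, and I have not run this on your data:

```r
# Minimal sketch: RMA on Human Gene 1.0 ST CEL files with oligo.
# Assumption: the CEL files live in a folder called "cel_dir" (hypothetical).
library(oligo)

cel_dir   <- "cel_dir"
cel_files <- list.celfiles(cel_dir, full.names = TRUE)  # locate the CEL files
raw       <- read.celfiles(cel_files)                   # read raw intensities

# Transcript-level (gene-level) summarisation
eset_core <- rma(raw, target = "core")

# Probeset-level summarisation
eset_probe <- rma(raw, target = "probeset")

# Matrix of log2 expression values: one row per transcript cluster,
# one column per array
expr <- exprs(eset_core)
dim(expr)
```

The target argument is what switches between the transcript and probeset levels; for most gene-level work "core" is probably the one you want.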
The affy package in Bioconductor is probably the most widely used R package for Affy analysis, and it is definitely the best maintained and most regularly updated one, because it's part of the Bioconductor core packages. It also offers the most possibilities for combining different methods in its expresso function, which possibly confuses most people. For doing just RMA, the justRMA function is both simpler and more memory efficient.
You will need an array description file (called .cdf) that fits your array. In most cases the affy package will download and install the cdf file automatically when you try to import .cel files. Just make sure that you are using only the latest version of Bioconductor.
Just in case, here is the annotation package that could work.
Disclaimer: haven't tested anything because my institution cannot afford such expensive stuff ;)
Aroma and XPS both require significantly more effort to use the first time. I've found XPS helpful for the new exon arrays, which are frankly a real pain to deal with, but don't use it for the ST arrays, as it's rather complex. Aroma is well-documented, but unless I had thousands of arrays to normalize, I'd go with oligo. Dead simple to use.
Note that if you happen to be using the GeneTitan system, the correct CDF is version 1.1 rather than 1.0.