Question: Quantile Normalization On Massive Dataset
7.7 years ago by
United States
I've got a massive dataset of 160 Affy Exon arrays that I need to quantile normalize at the probe level (the reasons for which are complicated!), which means ~6 million probes per sample.

Trying to load them into memory maxes out the 32 GB of memory on one of our cluster nodes.

I clearly need to find a way to do this without loading everything into memory. Would anyone have a pointer or suggestion?


modified 5.8 years ago by Biostar ♦♦ 20 • written 7.7 years ago by Paul (750)
7.7 years ago by
Cambridge, UK
A few suggestions:

  • aroma.affymetrix ("Aroma affy") or xps, which work from disk instead of loading everything into memory,
  • fRMA (frozen RMA), which pre-processes single arrays or batches of arrays against precomputed reference probe effects.
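
For intuition on why on-disk processing is enough here: quantile normalization only ever needs one sample in memory at a time, plus a running sum of the sorted values at each rank. Below is a minimal two-pass sketch of that idea in plain Python. It is not how any of the packages above are implemented. The file layout (one raw float64 binary file per sample, all with the same number of probes) and the names `read_column` and `quantile_normalize_files` are assumptions made for illustration; ties are broken by position rather than averaged, and missing values are not handled.

```python
import os
from array import array


def read_column(path):
    """Load one sample's probe intensities from a raw float64 file (assumed layout)."""
    n_values = os.path.getsize(path) // 8  # 8 bytes per float64
    column = array("d")
    with open(path, "rb") as fh:
        column.fromfile(fh, n_values)
    return column


def quantile_normalize_files(in_paths, out_paths):
    """Two-pass quantile normalization that holds one sample in memory at a time."""
    # Pass 1: build the reference distribution, i.e. the mean across samples
    # of the k-th smallest intensity, for every rank k.
    reference = None
    for path in in_paths:
        sorted_values = sorted(read_column(path))
        if reference is None:
            reference = array("d", sorted_values)
        else:
            if len(sorted_values) != len(reference):
                raise ValueError("all samples must have the same number of probes")
            for rank, value in enumerate(sorted_values):
                reference[rank] += value
    for rank in range(len(reference)):
        reference[rank] /= len(in_paths)

    # Pass 2: in each sample, replace every intensity by the reference value
    # at that intensity's rank, then write the normalized sample back to disk.
    for src, dst in zip(in_paths, out_paths):
        column = read_column(src)
        # Probe indices ordered by intensity; ties keep their original order.
        order = sorted(range(len(column)), key=column.__getitem__)
        normalized = array("d", [0.0]) * len(column)
        for rank, probe_index in enumerate(order):
            normalized[probe_index] = reference[rank]
        with open(dst, "wb") as fh:
            normalized.tofile(fh)
```

Peak memory then scales with a single sample plus the reference vector, not with all 160 arrays. Pure Python loops will be slow at ~6 million probes per sample, so for real data a vectorized array library or one of the packages above is the more practical route.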

Hope this helps.

written 7.7 years ago by Laurent (1.6k)
7.6 years ago by
Washington University School of Medicine, St. Louis, USA
I also recommend aroma.affymetrix, but you might also consider RMAExpress.

written 7.6 years ago by Malachi Griffith (17k)
