Question: What Are The Most Reliable Normalization Methods For Microarrays?
Asked 8.9 years ago by Jarretinha (São Paulo, Brazil):

Hi people,

I've just attended a seminar focused on microarray data, given essentially by experimentalists. It was somewhat shocking that they were unable to agree on which methods to use for data normalization (and why). So, you can imagine what happened in the downstream steps...

Hence, I'm wondering about a list of the most reliable methods for data normalization. Not a plain list of methods/models, but a list explaining why a given method/model is reliable (or why someone should use it).

Just to avoid confusion: in this context, reliability has its statistical meaning.

This question matters because the most popular normalization procedures depend on statistical models to address probe-level, background-level, etc., variation/correlation. For example, RMA and fRMA use a linear model.

So, given the number of microarray platforms and designs, reliability is of the utmost importance.

Tags: model, data, microarray • 5.7k views
Modified 7.9 years ago by Michael Dondrup • written 8.9 years ago by Jarretinha

Can you give a short definition of what you mean by reliability in the statistical sense? I confess I had to look up the definition myself, but if it is as Wikipedia puts it, "In statistics, reliability is the consistency of a set of measurements or measuring instrument, often used to describe a test. Reliability is inversely related to random error.", then the question makes no sense, because most or all normalization methods are deterministic.

Reply written 8.9 years ago by Michael Dondrup

It's true that many methods are deterministic. But the most widely used ones are model-based, hence stochastic/statistical in nature (e.g. RMA). If one treats normalization as an experiment (which indeed it is), this question makes a lot of sense, though.

Reply written 8.9 years ago by Jarretinha

It is simply wrong to say that a method becomes non-deterministic just because it involves 'a model'! A linear model, for example, always reproduces the same results given the same data. So reliability is of "utmost importance", but it is solved.

Reply written 8.9 years ago by Michael Dondrup

RMA depends on a linear statistical model; you can check this in the paper if you want. I agree with you that a given linear model fitted with OLS will give the same results. Still, it is a statistical model. But normalization is not so simple! People use a wealth of techniques. If it were just linear regression, this question would be trivial. But as it depends in many instances on M-estimators, specific training sets, etc., I still think it's not solved. Otherwise, people wouldn't gather in a room for two days to discuss which one is the most reliable.

Reply written 8.9 years ago by Jarretinha

I think asking for the "best method" ends up being less productive than asking for opinions on the strengths and weaknesses of a few existing methods.

Reply written 8.9 years ago by Istvan Albert
Answer by Neilfws (Sydney, Australia), 8.9 years ago:

I recommend that you visit PubMed, enter "microarray (normalisation OR normalization)" as a query, select some of the review articles and have a good read. Then, armed with appropriate keywords (RMA, GCRMA, MAS), head to Google and gather some more opinions.

It is not so surprising that people cannot agree on methods. A normalisation method is just a statistical model that tries to explain what happens when probes meet gene chips. Different models have different assumptions. Some of these are: how to distinguish within-array effects from between-array effects? Are mismatch probes ever useful? (RMA says no, because MM probes often, in fact, match). How is "background" distributed across a chip?

Experiments also vary. Which of your experiments are comparable? Should you even be comparing, say, samples prepared last week and frozen to samples freshly-prepared today? If you do want to compare, you can only hope that each set will show some characteristic (batch effect) for which you can correct.

How do you conclude that a method is "right", or "better"? You might try to validate using another experimental method, such as real-time PCR. Or you might conduct "spike-in" experiments, where you know what the "true positives" should be, then see how well each method picks them out. That's the approach taken in this paper. Or, you might try several methods on your own favourite dataset. Of course, someone else will then try them on their favourite dataset - and reach totally opposite conclusions! Or you might just ask "what do most people do?"
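The spike-in idea can be sketched in a few lines. The following toy simulation is purely illustrative (the gene count, fold-change, and noise levels are invented, and each "normalization" is reduced to nothing but the residual noise it leaves behind), assuming Python with NumPy:

```python
import numpy as np

rng = np.random.default_rng(7)
n_genes = 1000
spiked = set(range(20))  # indices of the 20 known spike-in genes

# Toy log2 fold-changes: only the spike-ins are truly changed
true_fc = np.zeros(n_genes)
true_fc[list(spiked)] = 2.0

def recovered_fraction(noise_sd, top_k=20):
    """Fraction of spike-ins ranked among the top-k observed fold-changes.
    A 'normalization method' is modelled only by the residual noise it leaves."""
    observed = true_fc + rng.normal(0.0, noise_sd, n_genes)
    top = set(np.argsort(observed)[::-1][:top_k])  # indices of largest values
    return len(top & spiked) / len(spiked)

good = recovered_fraction(noise_sd=0.3)  # a method that removes noise well
bad = recovered_fraction(noise_sd=1.5)   # a method that leaves a lot behind
print(good, bad)
```

With the truth known in advance, "better" becomes measurable: in this toy setup the low-noise method recovers nearly all spike-ins while the noisy one recovers only a few.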

That's the long answer. The short answer: RMA ;-)

Answer written 8.9 years ago by Neilfws

Let me say that I do understand the tests; I'm one of the people who develop statistical tests for such ends. My question is about reliability, not power or accuracy. I have found that my notion of reliability (which is based on ideas from statistical mechanics) is very different from that of people at the wet bench. IMHO, microarrays are not suitable for differential gene expression and similar experiments (the type-II error rate is way too high). But as experimentalists keep using them, reliability is still an important matter.

Reply written 8.9 years ago by Jarretinha

If you develop the tests, you should be telling us which are the most reliable :-)

I tend to agree with Michael's comment at the top. Normalization is not a measurement. If anything, the raw intensity is the measurement. But it is not a measurement in the same way that, say, putting a thermometer in water is a measurement. You might have a hypothesis about what observed intensities should be, but variation will ensure that this will never be consistent.

I also agree with you that many microarray experiments are poor measures of gene expression - but that's the way the science chose to go.

Reply written 8.9 years ago by Neilfws

As I said in other comments, one can treat the normalization procedure as an experiment on the raw intensities. There is no legal impediment to doing that. The normalization procedure will be your thermometer. And, just to remember: RMA does have hypotheses about intensities.

Reply written 8.9 years ago by Jarretinha
Answer by Istvan Albert (University Park, USA), 8.9 years ago:

The lack of reproducibility in microarray methods is a well-known problem. In my opinion, the reasons for this go way beyond the choice of normalization and are primarily caused by biological and experimental variability. Some are convinced that one method must be substantially better than the others, but I suspect that is because that particular method worked well for them under some specific circumstances.

I have read studies demonstrating that the upper 50% (the strongest signals) were recovered identically across just about all methodologies, whereas the bottom half contained a different subset for each method. So maybe the best strategy is to be stricter with the results, beyond the original estimate of significance; of course, it could be that this approach removes the genes of interest.
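That pattern is easy to reproduce in a toy simulation (purely illustrative numbers; the two "methods" are reduced to two independent noisy measurements of the same underlying truth), assuming Python with NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy log2 intensities: 500 genes with strong, well-separated signals and
# 500 genes sitting just above the background/noise floor.
strong_true = np.linspace(8.0, 16.0, 500)
weak_true = np.full(500, 5.0)

def measure(true_log2, noise_sd=0.5):
    """One hypothetical normalized measurement: truth plus method-specific noise."""
    return true_log2 + rng.normal(0.0, noise_sd, true_log2.size)

# Two "methods": here simply two independent noisy views of the same truth
strong_a, strong_b = measure(strong_true), measure(strong_true)
weak_a, weak_b = measure(weak_true), measure(weak_true)

strong_corr = np.corrcoef(strong_a, strong_b)[0, 1]
weak_corr = np.corrcoef(weak_a, weak_b)[0, 1]

print(f"agreement on strong signals: r = {strong_corr:.2f}")
print(f"agreement on weak signals:   r = {weak_corr:.2f}")
```

The two methods agree almost perfectly on the strong signals, because the signal dwarfs the method-specific noise; on genes near the noise floor the agreement collapses, so each method effectively reports a different subset.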

That's the long answer. The short answer: the best normalization is the one you understand the best.


Answer written 8.9 years ago by Istvan Albert

Absolutely agree; it's easy to get wrapped up in normalization choice and forget other factors.

Reply written 8.9 years ago by Neilfws

Again, I'm asking about reliability. After normalization, everything is pretty straightforward. Those studies seem very interesting! Could you name them?

Reply written 8.9 years ago by Jarretinha

I think it is the normalization that is pretty straightforward: you pick a method and run it. The interpretation that comes after is a lot more difficult. Search for "reproducibility of microarray data" for many papers on this. I was quoting from memory, not from a document.

Reply written 8.9 years ago by Istvan Albert

Normalization is the main source of variability (or lack of it) in microarray data. As the intensity scale is non-linear and most normalization procedures use a linear model, the choice of model does alter the final result.

Reply written 8.9 years ago by Jarretinha

You might first have to look at the definitions of some of the terms you are using and clean that up; you are using them wrongly. E.g. "intensity level is non-linear" has no meaning. "Most normalization procedures do use a linear model": where did you get that information from? "Normalization is the main source of variability (or lack of it) in microarray data": how did you arrive at this judgement? IMHO this is totally mistaken...

Reply written 8.9 years ago by Michael Dondrup

Clarifying the meaning: the intensity scale is non-linear. Just to mention: RMA, GC-RMA, fRMA and quantile normalization use a linear model, and I'm sure I can find more examples. And a random reference about methods says: "Consistent with previous results we observed a large effect of the normalization method on the outcome of the expression analyses." This observation is quite reasonable, as the intensity scale (which defines the experiment) is what gets normalized.

Reply written 8.9 years ago by Jarretinha

Answer by Michael Dondrup (Bergen, Norway), 8.9 years ago:

Following this definition, all deterministic methods are 100% reliable, because they always reproduce the same result when repeated. Reliability is, of course, important for measurements, but data transformations are not measurements. There are some non-deterministic statistical procedures, for example those involving the EM algorithm or k-means clustering, but no normalization methods that I know of.

So, my advice: check whether the methods are deterministic; if they are, they are reliable by definition. The question of reliability is certainly relevant for measurement techniques such as microarrays, qPCR and RNA-seq, but it is totally solved for normalization (that is: ALL these methods are deterministic, hence reliable). If you are looking for a problem to solve in normalization, this is definitely not the right place.

BTW: one can easily assess this reliability. If you want to check RMA, loess normalization, mean or quantile normalization, just run it on the same input data, say, 1000 times and look at the results. BTW2: RMA (robust multichip average), since it was mentioned, is not (only) normalization; it comprises background subtraction, quantile normalization (a totally deterministic method) and intensity summarization.
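That determinism check is easy to illustrate. Below is a minimal quantile normalization in Python with NumPy (a sketch on invented data, not the Bioconductor implementation; ties are broken by sort order here): sort each array, average across arrays rank by rank, and map each value back to its rank. Running it repeatedly on the same input always gives the same output.

```python
import numpy as np

def quantile_normalize(x):
    """Map each column (one array per column) onto the mean distribution:
    rank values within each column, then replace them by the across-column
    mean of the values at that rank."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)  # per-column ranks
    reference = np.sort(x, axis=0).mean(axis=1)        # mean quantile curve
    return reference[ranks]

rng = np.random.default_rng(0)
intensities = rng.lognormal(mean=8.0, sigma=1.0, size=(1000, 4))  # toy probes

runs = [quantile_normalize(intensities) for _ in range(10)]
identical = all(np.array_equal(runs[0], r) for r in runs[1:])
print(identical)  # the transformation is deterministic
```

After normalization every column shares the same set of values (the reference distribution), which is exactly what quantile normalization is designed to enforce.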

Edit: just to qualify what I said above, there are some reliability issues with normalization. I just saw a message on Bioconductor noting differences in the analysis using GCRMA on Windows/Linux. As said, most normalization and summarization methods are deterministic as long as the data and methods stay the same. However, there can be variations at the probe level, even when using the same array design. The most common source of such events is that the array annotation, and thereby the probe-level groups and their assignments to genes, is changed.

This is a sort of "pseudo-(un)reliability", because if all parameters are the same, the results are the same. But the annotations are changed frequently, and the annotation updates are mostly applied automagically without the user noticing the difference. This is specifically true for the Affy platform.

Answer modified 8.9 years ago • written 8.9 years ago by Michael Dondrup

I've checked. Most methods are statistical and rely on parameter estimation for normalization. Looking at 1000 runs is almost the same as Monte Carlo estimation; of course it will produce the same result, at least on average. It's not a complicated question: I asked about the reliability of the method! Even quantile normalization relies on parameter estimation. My question is totally unrelated to the precision of one's computer...

Reply written 8.9 years ago by Jarretinha

No, not on average: exactly. I recommend you really try this out before you claim something. Use one and the same dataset and run the same method, say RMA, 1000 times, then publish the result here. Also, you are mistakenly conflating parameter estimation with non-deterministic outcomes. So please, try it out with one technique first.

Reply written 8.9 years ago by Michael Dondrup

I do understand your point now. Default RMA will give the same results, and a default ML estimate of the parameters will, too. Both could be invalid, but reliable anyway. So my question was moot from the beginning. Maybe a question about validity would be more appropriate.

Reply written 8.9 years ago by Jarretinha


Powered by Biostar version 2.3.0