Question: Baseline Transformation/Normalization To Control For Microarray Data
Adamc640 wrote:

Hello everyone,

This is a question regarding microarray data pre-processing, which I think is at least partially an issue of terminology.

Background info:
I recently had a researcher request an analysis that involved "normalizing" one sample to another sample and then testing for differential expression against a third sample. This has come up before: an external collaborator did something similar with a data set that had three experimental conditions, in order to improve sensitivity for a given condition. They called it "delta differential expression". They used R for their analysis workflow, but I never received the code for that step.

I believe this should be possible in GeneSpring by baseline-transforming the arrays for one condition to the arrays of another condition. However, either I am misinterpreting the feature or the current version is buggy: it requires even samples that have been marked as controls to have a control set of their own, which doesn't seem to make sense.

Perhaps because the terminology varies, I haven't been able to find any papers mentioning this, simple as it is. Is there a more common formal name for it? Would it be appropriate to do something as simple as subtracting the per-probe mean/median of the "control" arrays from the probe levels of the arrays for the other two conditions, and then proceed with the normal DE analysis?

Thank you.

written 8.9 years ago by Adamc640

What is YOUR dataset and what do you want to do with it? In other words, how many arrays for each condition, what type of arrays, and what question(s) do you want to ask? Could you clarify those details?

written 8.9 years ago by Sean Davis26k

I was more interested in general methodology. I'm also developing some software that will work with microarray data analysis and will probably need to incorporate this sort of comparison. Overall it seems like an oversimplified way of attempting to improve sensitivity for one experimental condition when there are multiple conditions. I'll go back and ask our original collaborator what they did.

written 8.9 years ago by Adamc640
Maxime70 wrote:

I usually use GCRMA normalization from Bioconductor (the gcrma package, which builds on affy). But only three samples seems very few to me.

written 8.9 years ago by Maxime70

Three conditions, multiple replicates. What I meant by 'normalization' in this case is more of the general mathematical term than the microarray-specific term. More like 'standardization'.

written 8.9 years ago by Adamc640
Larry_Parnell16k wrote:

This is what my colleague does.

Two steps are involved: the data are first transformed and then normalized. Transform the base signal data to log2. Then normalize to the median of another sample before comparing state 1 to state 2, or control vs. experimental.
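
The two steps can be sketched roughly as follows. This is only one possible reading of the description, not the colleague's actual workflow: it assumes "normalize to the median of another sample" means subtracting, probe by probe, the median log2 value of the baseline arrays. The function names and the toy intensities are hypothetical.

```python
import math
from statistics import median


def log2_transform(intensities):
    """Log2-transform the raw intensities of one array (values must be > 0)."""
    return [math.log2(x) for x in intensities]


def baseline_transform(arrays, baseline_arrays):
    """Subtract the per-probe median of the baseline arrays from each array.

    All values are log2 intensities, and every array lists its probes
    in the same order.
    """
    n_probes = len(baseline_arrays[0])
    baseline = [median(a[i] for a in baseline_arrays) for i in range(n_probes)]
    return [[a[i] - baseline[i] for i in range(n_probes)] for a in arrays]


# Hypothetical toy data: 3 probes, 2 baseline replicates, 1 treated array.
baseline_raw = [[64.0, 128.0, 256.0], [64.0, 512.0, 256.0]]
treated_raw = [[128.0, 256.0, 64.0]]

baseline_log = [log2_transform(a) for a in baseline_raw]
treated_log = [log2_transform(a) for a in treated_raw]

# Log2 ratios of the treated array relative to the baseline median.
relative = baseline_transform(treated_log, baseline_log)
print(relative)  # [[1.0, 0.0, -2.0]]
```

Because the subtraction happens in log2 space, each resulting value is a log2 ratio against the baseline, so 1.0 means a two-fold increase and -2.0 a four-fold decrease.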

written 8.8 years ago by Larry_Parnell16k

It turns out that this isn't what they had done for our data, but I'll probably use this in the future. They were just defining nested contrasts for makeContrasts before model fitting in Bioconductor.
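
For anyone else reading, here is a minimal sketch of the arithmetic behind a contrast, in plain Python rather than Bioconductor. It is not the collaborator's code and not how limma is implemented: it only computes point estimates from per-probe group means, with no model fitting or variance estimation. The condition names A, B, C and the data are hypothetical.

```python
from statistics import mean


def group_means(log_values_by_group):
    """Per-probe mean log2 expression for each condition."""
    return {
        group: [mean(probe) for probe in zip(*arrays)]
        for group, arrays in log_values_by_group.items()
    }


def apply_contrast(means, weights):
    """Weighted sum of group means per probe, e.g. {"A": 1, "C": -1} for A - C."""
    n_probes = len(next(iter(means.values())))
    return [
        sum(w * means[g][i] for g, w in weights.items())
        for i in range(n_probes)
    ]


# Hypothetical log2 data: conditions A and B plus control C,
# 2 replicate arrays per condition, 2 probes per array.
data = {
    "A": [[9.0, 5.0], [11.0, 5.0]],
    "B": [[8.0, 6.0], [8.0, 8.0]],
    "C": [[7.0, 5.0], [7.0, 5.0]],
}
means = group_means(data)

a_vs_c = apply_contrast(means, {"A": 1, "C": -1})  # [3.0, 0.0]
b_vs_c = apply_contrast(means, {"B": 1, "C": -1})  # [1.0, 2.0]

# Difference of differences, (A - C) - (B - C): the C terms cancel,
# so the weights reduce to A - B.
delta = apply_contrast(means, {"A": 1, "B": -1})   # [2.0, -2.0]
```

One thing this makes visible: if the same control baseline is subtracted from both other conditions and they are then compared with each other, the baseline cancels, so the point estimate is the same as comparing the two conditions directly.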

written 8.8 years ago by Adamc640
W Langdon90 wrote:

We created a series of average Affymetrix GeneChips by downloading thousands of GeneChips from GEO and then using R to calculate the protected geometric mean for each probe on the array. The average was protected by discarding the extreme 1% of values (i.e. the top 0.5% and the bottom 0.5%), because the mean is highly susceptible to outliers. Today I think I would be tempted to work with the median instead. New chips, even just a single chip, can then be quantile normalised against the average chip. See
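
As a rough illustration of the two steps described in this post, a trimmed geometric mean per probe and quantile normalisation of a single chip against a reference chip might look like the sketch below. This is plain Python written for this thread, not the original R code; the function names, the handling of ties, and the toy numbers are all assumptions.

```python
import math


def trimmed_geometric_mean(values, trim_fraction=0.005):
    """Geometric mean after dropping trim_fraction of the values from each tail.

    With fewer than 1 / trim_fraction values nothing is trimmed.
    All values must be > 0.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return math.exp(math.fsum(math.log(v) for v in kept) / len(kept))


def quantile_normalise(chip, reference):
    """Give each probe on chip the reference value that has the same rank.

    chip and reference must have the same number of probes.
    Ties on chip are broken by probe order (a simplification).
    """
    target = sorted(reference)
    order = sorted(range(len(chip)), key=lambda i: chip[i])
    normalised = [0.0] * len(chip)
    for rank, probe_index in enumerate(order):
        normalised[probe_index] = target[rank]
    return normalised


# Hypothetical toy data: one probe measured on 200 chips, two of them outliers.
probe_values = [4.0] * 198 + [0.001, 1.0e6]
average = trimmed_geometric_mean(probe_values)  # close to 4.0, outliers dropped

# Hypothetical 3-probe reference ("average") chip and one new chip.
reference_chip = [10.0, 30.0, 20.0]
new_chip = [5.0, 1.0, 3.0]
normalised = quantile_normalise(new_chip, reference_chip)
print(normalised)  # [30.0, 10.0, 20.0]
```

After normalisation the new chip has exactly the same distribution of values as the reference chip, while each probe keeps its rank within the chip.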


modified 11 months ago by RamRS28k • written 8.8 years ago by W Langdon90

I forgot to give you the address of the R code:


modified 11 months ago by RamRS28k • written 8.7 years ago by W Langdon90