Question: Normalisation before log2 transformation or after in Microarray Gene expression data?
J. Smith wrote (2.4 years ago):

Hi friends.

I have a question about the order of steps performed on microarray gene expression data / RNA-seq data.

1) Should normalisation techniques such as quantile or lowess be applied to microarray gene expression data before the log2 transformation, or should the steps be performed the other way round? I have found both orders in different sources. Which one is correct?

2) And what about the order of steps in RNASeq data?

Thanks in advance.

Kevin Blighe wrote (2.4 years ago):

For [Affymetrix] microarrays, the broadly accepted normalisation method is Robust Multiarray Average (RMA):

  1. background correction
  2. quantile normalisation
  3. log (base 2) transformation
  4. probe summarisation (i.e. combining the probes of each probe set into a single expression value)

Extra notes:

  • An alternative to this which also adjusts for the GC content, and how it affects probe-binding affinities, is called GC-RMA.
  • Other types of normalisation (step 2) exist, namely: Qspline; LOESS; VSN (variance stabilising normalisation); et cetera
  • Step 4 is usually a 'median polish', which is fitted to the log2-transformed intensities
  • There are subtle differences in each step based on different microarray platforms
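
As a rough illustration of the 'median polish' used for probe summarisation, here is a minimal pure-Python sketch of Tukey's median polish for a single probe set. It assumes the input is a probes x arrays matrix that is already background corrected, normalised and on the log2 scale. The function name `median_polish` and the toy numbers are made up for illustration; this is not the code used by any particular RMA implementation.

```python
from statistics import median

def median_polish(matrix, n_iter=10, tol=1e-9):
    """Tukey's median polish on a probes x arrays matrix of log2 values.

    Returns (overall, probe_effects, array_effects, residuals).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    resid = [row[:] for row in matrix]
    overall = 0.0
    row_eff = [0.0] * n_rows   # probe effects
    col_eff = [0.0] * n_cols   # array effects
    for _ in range(n_iter):
        # Sweep the row (probe) medians out of the residuals.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep the column (array) medians out of the residuals.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
        # Stop once neither sweep changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff, resid

# Toy probe set: 3 probes (rows) x 3 arrays (columns), log2 scale.
log2_intensities = [
    [8.0, 9.0, 10.0],
    [9.0, 10.0, 11.0],
    [7.0, 8.0, 9.0],
]
overall, probe_eff, array_eff, resid = median_polish(log2_intensities)

# One summarised expression value per array for this probe set.
summarised = [overall + a for a in array_eff]
print(summarised)  # [8.0, 9.0, 10.0]
```

Because medians are used instead of means, a single outlying probe has little influence on the summarised value, which is the main reason for choosing this approach.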

Log transformation is not performed prior to normalisation.
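
To make the order concrete, here is a minimal pure-Python sketch that quantile-normalises a few toy arrays on the raw intensity scale and only then takes log2. The function name `quantile_normalise` and the intensities are made up for illustration, and ties are broken by position for simplicity (an assumption of this sketch; real implementations usually handle ties more carefully).

```python
import math

def quantile_normalise(columns):
    """Quantile-normalise a list of arrays (one list of intensities per array)."""
    n_probes = len(columns[0])
    # Sort each array, then average across arrays at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[r] for col in sorted_cols) / len(columns)
        for r in range(n_probes)
    ]
    normalised = []
    for col in columns:
        # Probe indices of this array ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            # Replace each value by the cross-array mean for its rank.
            out[probe] = rank_means[rank]
        normalised.append(out)
    return normalised

# Toy background-corrected intensities: 3 arrays x 4 probes (made-up numbers).
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]

# Normalise on the raw scale first, log2 afterwards.
normalised = quantile_normalise(arrays)
log2_values = [[math.log2(v) for v in col] for col in normalised]
```

After this step every array has exactly the same set of values, just assigned to different probes according to each array's own ranking.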

For more, read the really great review by Professor Quackenbush: Microarray data normalization and transformation.


Current RNA-seq normalisation methods / expression measures differ quite a bit from each other. We have:

  • FPKM
  • RPKM
  • RSEM
  • TPM
  • CPM
  • TMM
  • Median-of-ratios normalisation (DESeq2)

NB (added November 6th, 2019): some of these are not considered normalisation procedures per se, and are instead referred to as count measures / abundance measures / expression units that are produced by otherwise unnamed normalisation procedures, e.g., FPKM.
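
To show how some of these expression units relate to each other, here is a minimal pure-Python sketch of CPM, RPKM and TPM using their textbook definitions. The function names, the three-gene count vector and the gene lengths are all made up for illustration; real tools typically use effective lengths and handle many edge cases that are ignored here.

```python
def cpm(counts):
    """Counts per million: scale by library size only."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]

def rpkm(counts, lengths_bp):
    """Reads per kilobase per million: scale by gene length and library size."""
    total = sum(counts)
    return [c / (length / 1e3) / (total / 1e6)
            for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-scale first, then rescale to sum to 1e6."""
    rates = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r / scale * 1e6 for r in rates]

# Toy sample: 3 genes whose read counts are exactly proportional to length.
counts = [10, 20, 70]
lengths_bp = [1000, 2000, 7000]

cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
```

In this toy sample the CPM values differ between genes because longer genes collect more reads, while the RPKM and TPM values are equal across genes; TPM additionally always sums to one million within a sample.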

A log transformation is not typically part of the normalisation process for RNA-seq. Statistical comparisons are performed on the normalised, unlogged counts, which generally do not follow a normal distribution. RNA-seq count data are, in fact, usually modelled with a negative binomial distribution, which can be viewed as a Poisson with extra variance (overdispersion). However, one can later log the normalised counts, e.g. for plotting, in order to bring them closer to a normal distribution. DESeq2, for example, implements a regularised log transformation.
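
The "normalise first, log later" order for RNA-seq can be sketched as follows: a minimal pure-Python version of DESeq2-style median normalisation (commonly described as median of ratios), followed by a plain log2(x + 1) for plotting. Note that the log2(x + 1) step is a simple stand-in and is not DESeq2's regularised log transformation; the function name `size_factors` and the toy counts are made up for illustration.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    n_samples = len(counts[0])
    # Only genes with non-zero counts in every sample contribute.
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    # Log of each gene's geometric mean across samples.
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    # Per sample: median ratio of its counts to the geometric means.
    return [
        math.exp(median(row[j] - g for row, g in zip(log_rows, log_geo_means)))
        for j in range(n_samples)
    ]

# Toy counts: 4 genes x 2 samples; sample 2 was sequenced twice as deeply.
raw_counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 0],
]

sf = size_factors(raw_counts)

# Normalised, unlogged counts: the scale on which size factors are applied.
normalised = [[c / s for c, s in zip(row, sf)] for row in raw_counts]

# Only afterwards, log-transform for plotting (pseudocount of 1 avoids log(0)).
log_counts = [[math.log2(v + 1) for v in row] for row in normalised]
```

In this toy example the second size factor is twice the first, so the two samples have matching normalised counts for every gene.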

For more, read: A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis.



Dear Kevin, Many many thanks for your detailed response to my query.

Reply written 2.4 years ago by J. Smith

No problem - best of luck. Please do also read the mentioned publications.

Reply written 2.4 years ago by Kevin Blighe

Thanks... I will surely go through those publications...

Reply written 2.4 years ago by J. Smith