Question: Normalisation before log2 transformation or after in Microarray Gene expression data?
Saheb wrote:

Hi friends.

I have a question about the order of steps performed on microarray gene expression data / RNA-seq data.

1) Should normalisation techniques such as quantile or lowess be applied to microarray gene expression data first, followed by log2 transformation, or is it the other way round? I have found both orders in different sources. Which one is correct?

2) And what about the order of steps in RNASeq data?

Thanks in advance.

rna-seq • 1.2k views
modified 10 months ago • written 10 months ago by Saheb
Kevin Blighe (Republic of Ireland) wrote:

For microarray, the broadly accepted method of normalisation is known as Robust Multiarray Average (RMA):

  1. background correction
  2. quantile normalisation
  3. probe summarisation (i.e. combining the probes of each probeset into one expression value)
  4. log (base 2) transformation

Extra notes:

  • An alternative to this which also adjusts for the GC content and how it affects probe-binding affinities is called GC-RMA.
  • Other types of normalisation (step 2) exist, namely: Qspline; LOESS; VSN (variance stabilising normalisation); et cetera
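  • Step 3 is usually a 'median polish'; in the standard RMA implementation this is fitted to log2-scale intensities, so the log is taken after normalisation and the summarised values come out on the log2 scale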
  • There are intricate differences in each step based on different microarray platforms

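Within RMA, log transformation is not performed prior to normalisation. The usual exception is LOESS normalisation of two-colour arrays, which is fitted on log ratios; this is why both orders appear in different sources.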
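To make the order concrete, here is a minimal Python sketch of "normalise first, then log2". It is only an illustration under simplifying assumptions: the function names and the toy intensities are made up, there is no background correction or median polish, and it is not the RMA implementation from any Bioconductor package.

```python
import math

def quantile_normalise(matrix):
    """Quantile-normalise the columns (arrays) of a probes x arrays matrix."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    # Sort each array independently, then average across arrays at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[i] for col in sorted_cols) / n_cols for i in range(n_rows)]
    # Replace each value by the mean for its rank (ties broken by position).
    normalised = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalised[i][j] = rank_means[rank]
    return normalised

def log2_transform(matrix):
    """Log2-transform every value; applied only after normalisation."""
    return [[math.log2(v) for v in row] for row in matrix]

# Hypothetical raw intensities: 4 probes (rows) x 3 arrays (columns).
raw = [
    [120.0, 200.0, 90.0],
    [800.0, 1500.0, 600.0],
    [40.0, 70.0, 35.0],
    [2500.0, 4100.0, 1900.0],
]
normalised = quantile_normalise(raw)   # step 2: quantile normalisation
log_expr = log2_transform(normalised)  # step 4: log2 on the normalised values
```

After the first step every array shares the same set of values, so the distributions are identical before any log is taken.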

For more, read the really great review by Professor Quackenbush: Microarray data normalization and transformation.

--------------------------

Current RNA-seq normalisation methods / count values differ quite a bit from each other. We have:

  • FPKM
  • RPKM
  • FPKM-UQ
  • RSEM
  • TPM
  • CPM
  • TMM
  • Median normalisation (DESeq2)
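As a rough illustration of how some of these units differ, the sketch below computes CPM, RPKM/FPKM and TPM from their textbook definitions. The counts and gene lengths are invented, the function names are hypothetical, and real tools typically use effective rather than annotated lengths, so treat this as a sketch only.

```python
def cpm(counts):
    """Counts per million: scales by library size only."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]

def rpkm(counts, lengths_bp):
    """Reads (or fragments, for FPKM) per kilobase per million mapped reads."""
    total = sum(counts)
    return [c / (length / 1e3) / (total / 1e6)
            for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then scale to 1e6."""
    rates = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r / scale * 1e6 for r in rates]

# Hypothetical single sample with three genes.
counts = [500, 1500, 3000]
lengths_bp = [1000, 3000, 2000]

cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
```

In this toy sample the first two genes have different CPM but the same RPKM and TPM, because the second gene is three times longer. TPM values always sum to one million within a sample, which RPKM/FPKM values do not.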

A log transformation is not typically involved in the normalisation process for RNA-seq. Statistical comparisons are performed on the normalised, unlogged counts, which generally do not follow a normal distribution. RNA-seq count data are, in fact, usually modelled with a negative binomial distribution, akin to a Poisson but with extra dispersion. However, one can later log the normalised counts, e.g. for plotting functions, in order to bring them closer to a normal distribution. DESeq2, for example, implements a regularised log transformation.
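The same order can be sketched in Python for RNA-seq: normalise the raw counts first, and log only afterwards for plotting. This is a simplified take on the median-of-ratios idea and not DESeq2's actual code; the function names, the toy counts and the pseudocount of 1 are all assumptions, and the plain log2 used here is not the regularised log.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    n_samples = len(counts[0])
    # Keep genes with non-zero counts in every sample so the logs are defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalise(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]

def log2_for_plotting(norm_counts, pseudocount=1.0):
    """Log the normalised counts afterwards, e.g. for heatmaps or PCA."""
    return [[math.log2(v + pseudocount) for v in row] for row in norm_counts]

# Hypothetical raw counts: 4 genes (rows) x 2 samples (columns);
# the second sample was sequenced roughly twice as deeply.
counts = [
    [100, 200],
    [50, 100],
    [0, 3],
    [10, 20],
]
factors = size_factors(counts)
norm = normalise(counts, factors)   # used for comparisons, unlogged
logged = log2_for_plotting(norm)    # used only for visualisation
```

The pseudocount avoids taking the log of zero; genes with zero counts end up at 0 on the log2 scale.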

For more, read: A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis.

Kevin

modified 5 months ago • written 10 months ago by Kevin Blighe

Dear Kevin, Many many thanks for your detailed response to my query.

written 10 months ago by Saheb

No problem - best of luck. Please do also read the mentioned publications.

written 10 months ago by Kevin Blighe

Thanks... I will surely go through those publications...

written 10 months ago by Saheb