Question: What are median and quantile normalization?
2.4 years ago by
pyKey50 wrote:

Hello everyone,

Normally I use TPM for within-sample analysis. Recently I was advised to use median and quantile normalization for between-sample comparisons. I noticed that the DESeq and limma packages offer these methods. But what are they actually doing? What is the intuition behind them?

Thank you all,

rna-seq normalization • 4.7k views
ADD COMMENTlink modified 2.4 years ago • written 2.4 years ago by pyKey50

Right! So more explanation:

I have a bunch of RNA-Seq experiments and I am performing some simple gene expression comparisons between two conditions (wild type vs. mutants). Some conditions have at most two replicates. I have already TPM-normalized all the samples, but for the comparisons, an additional between-sample normalization step seems like a good idea.

So far your explanations are of great help. Thank you all!

ADD REPLYlink written 2.4 years ago by pyKey50

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

ADD REPLYlink written 2.4 years ago by genomax90k

I already TPM normalized all the samples

If you are performing differential expression with DESeq2 or limma, don't transform the data beforehand. DESeq2 expects raw counts. For RNA-seq with limma, the voom transformation is likewise applied to the raw counts. Repeating: start with raw counts, not TPM, for both packages (and edgeR, for that matter).

ADD REPLYlink modified 2.4 years ago • written 2.4 years ago by h.mon31k
2.4 years ago by
h.mon31k wrote:

Your question is poorly explained: what downstream analyses do you intend to perform? Are you moving from within-sample comparisons to differential expression analysis?

I believe DESeq2 performs neither quantile nor median normalization; only limma does.

About limma between-array normalization: quantile normalization forces the distribution of microarray intensity signals to be the same across all arrays being analysed. Median normalization (method="scale") rescales the samples so that they all have the same median.
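To make the intuition behind these two methods concrete, here is a minimal sketch in Python with NumPy. It is not limma's code: the function names `quantile_normalize` and `median_scale_normalize` are made up for illustration, the input is assumed to be a genes-by-samples matrix of log-scale intensities, and ties are handled by sort order instead of being averaged.

```python
import numpy as np

def quantile_normalize(x):
    """Give every column (sample) the same distribution of values.

    The k-th smallest value of each column is replaced by the mean of
    the k-th smallest values across all columns.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0, kind="stable")      # per-column sort order
    sorted_x = np.take_along_axis(x, order, axis=0)   # each column sorted
    reference = sorted_x.mean(axis=1)                 # shared target distribution
    out = np.empty_like(x)
    # Write the reference values back at each value's original position.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out

def median_scale_normalize(log_x):
    """Shift log-scale columns so that all of them share one median."""
    log_x = np.asarray(log_x, dtype=float)
    medians = np.median(log_x, axis=0)
    # Subtract each column's median, then add back the average median
    # so the overall level of the data is preserved.
    return log_x - medians + medians.mean()

# Toy matrix: 4 genes (rows) x 3 samples (columns).
toy = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.0, 6.0],
                [4.0, 2.0, 8.0]])
toy_quantile = quantile_normalize(toy)
toy_median = median_scale_normalize(toy)
```

After quantile normalization every column contains exactly the same set of values, just in a different gene order; after median normalization only the column medians agree, and the shape of each sample's distribution is left alone.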

DESeq2 and edgeR normalize for library size. Each package uses a different method, but the idea is to make the sequencing depth of all samples "the same". DESeq2 also offers some transformations (rlog and vst) for exploratory analyses and visualization, but these are not used for differential expression analysis.

Some resources:

ADD COMMENTlink written 2.4 years ago by h.mon31k

DESeq's method for library normalization is median based: it builds a pseudo-reference sample from the geometric mean of each gene across samples, divides each sample's counts by that reference gene by gene, and takes the median of those ratios as the sample's size factor. Each sample's counts are then divided by its size factor. Obviously this is only smart if you think that only a small subset of your genes is changing expression, so that the gene with the median ratio is a safe anchor.
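A rough sketch of this kind of median-based size factor, written in Python with NumPy and not taken from DESeq itself, could look like the following. The function name `size_factors` is hypothetical, the input is assumed to be a genes-by-samples matrix of raw counts, and genes with a zero count in any sample are simply left out of the calculation.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # The geometric mean is undefined on the log scale when a count is zero,
    # so only genes observed in every sample contribute.
    expressed = (counts > 0).all(axis=1)
    log_counts = np.log(counts[expressed])
    # Pseudo-reference sample: per-gene geometric mean, on the log scale.
    log_geo_mean = log_counts.mean(axis=1)
    # Per-gene log ratio of each sample to the pseudo-reference.
    log_ratios = log_counts - log_geo_mean[:, None]
    # The median ratio in each sample is that sample's size factor.
    return np.exp(np.median(log_ratios, axis=0))

# Toy example: the second sample was sequenced twice as deeply as the first.
toy_counts = np.array([[10, 20],
                       [100, 200],
                       [50, 100],
                       [0, 5]])
toy_factors = size_factors(toy_counts)
toy_normalized = toy_counts / toy_factors
```

In this toy case the second size factor is twice the first, so dividing by the factors brings the two samples to the same scale for every gene that is expressed in both.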

ADD REPLYlink written 2.4 years ago by swbarnes28.6k