Question: edgeR TMM normalization failure
Asked 2.9 years ago by pavenhuizen

Dear all,

I'm running into what seems to be an issue in my differential gene expression (DGE) analysis. I have three conditions (control, mutant_1, and mutant_2), each with 3 biological replicates. I quantified my transcripts with Salmon, imported and aggregated the data with tximport, and performed the DGE analysis with edgeR, comparing each mutant against the control. This gives 1488 up-regulated and 217 down-regulated genes for mutant_1, and 1984 up-regulated and 1286 down-regulated genes for mutant_2.

The distribution of up- and down-regulated genes for mutant_1 is heavily skewed towards up-regulation, which is very different from mutant_2 and from what was expected. I want to find out whether this result reflects the biology or is (in part) an artefact of the method. After reading related posts on DGE asymmetry, I understand that the TMM normalization may have failed, and that its performance can be examined using MD plots. The edgeR user guide says the following about MD plots:

Ideally, the bulk of genes should be centred at a log-fold change of zero. This indicates that any composition bias between libraries has been successfully removed.

This entry is accompanied by a single plot, without stating whether this is an example of "good" TMM normalization or a "bad" scenario.
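
To make the quoted criterion concrete, here is a minimal sketch of the numbers behind an MD plot: M is the per-gene log2 fold change between two libraries and A is the average log2 abundance. It is written in plain Python rather than edgeR, and several things are assumptions for illustration only: the toy counts, the helper names `log_cpm` and `md_values`, the fixed prior count of 0.5, the comparison against a single reference library (edgeR's own MD plot, as far as I understand it, compares a sample against an average of the other samples), and the hand-picked normalization factor of 0.5, which is not the output of TMM.

```python
import math
from statistics import median

def log_cpm(counts, norm_factor=1.0, prior=0.5):
    """log2 counts-per-million using an effective library size.

    Simplified on purpose: a fixed prior count is added to every gene.
    """
    effective_lib = sum(counts) * norm_factor
    return [math.log2((c + prior) / effective_lib * 1e6) for c in counts]

def md_values(sample, reference, nf_sample=1.0, nf_reference=1.0):
    """Return (M, A): per-gene log2 fold change and mean log2 abundance."""
    ls = log_cpm(sample, nf_sample)
    lr = log_cpm(reference, nf_reference)
    m = [s - r for s, r in zip(ls, lr)]
    a = [(s + r) / 2 for s, r in zip(ls, lr)]
    return m, a

# Toy data (hypothetical): both libraries have 1000 reads in total.
# In `sample`, one gene takes up most of the reads, so the other nine
# genes look halved even though their true expression is unchanged.
reference = [100] * 10
sample = [50] * 9 + [550]

# Without any scaling factor the bulk of genes sits near M = -1.
m_raw, _ = md_values(sample, reference)
median_raw = median(m_raw)

# With a scaling factor that corrects the composition bias
# (0.5 is chosen by hand here), the bulk moves back to about M = 0.
m_norm, _ = md_values(sample, reference, nf_sample=0.5)
median_norm = median(m_norm)

print("median M without normalization factor:", round(median_raw, 3))
print("median M with normalization factor:   ", round(median_norm, 3))
```

Read this way, a "good" MD plot is one where the median of M (the centre of the cloud) is close to zero for every sample after normalization, and a "bad" one is where the cloud is visibly shifted up or down for one or more samples.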

I would like to know how I can properly test whether the TMM normalization succeeded, and what I should do if it turns out that TMM normalization is not appropriate for my samples.

Thanks! Peter

Tags: dge, edger, rna-seq, tmm

I do not believe that there is any definitive measure that says whether normalisation has been successful or not. Such things are usually played out as you go through downstream analyses and then decide to go back a few steps and tweak some parameter until you are finally satisfied with your results.

The imbalance in DEGs in your case could be reflective of an outlier sample or outlier samples - if you generate a PCA bi-plot, this will quickly become evident if it is the case.

Generally speaking, you can improve the normalisation by filtering out variables (genes) that have low counts before performing the normalisation.
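
As a rough illustration of that filtering step, the sketch below keeps a gene only if it reaches a minimum counts-per-million (CPM) in a minimum number of samples. It is plain Python with made-up counts, not edgeR code; edgeR provides `filterByExpr` for this purpose, and this is not a reimplementation of it. The function name `keep_expressed` and the thresholds (CPM of at least 1 in at least 3 samples, matching the 3 replicates per group) are assumptions chosen for the example.

```python
def keep_expressed(count_matrix, min_cpm=1.0, min_samples=3):
    """Keep genes with CPM >= min_cpm in at least min_samples samples.

    count_matrix maps gene name -> list of raw counts, one per sample.
    """
    n_samples = len(next(iter(count_matrix.values())))
    # Library size = total counts per sample (column sums).
    lib_sizes = [
        sum(row[j] for row in count_matrix.values())
        for j in range(n_samples)
    ]
    kept = {}
    for gene, row in count_matrix.items():
        n_pass = sum(
            1 for count, lib in zip(row, lib_sizes)
            if count / lib * 1e6 >= min_cpm
        )
        if n_pass >= min_samples:
            kept[gene] = row
    return kept

# Toy data (hypothetical): 6 samples with roughly one million reads each.
counts = {
    "geneA": [1_000_000] * 6,        # highly expressed everywhere
    "geneB": [0, 1, 0, 0, 2, 0],     # reaches 1 CPM in only one sample
    "geneC": [10, 12, 9, 0, 0, 0],   # expressed in one group of three
}

kept = keep_expressed(counts)
print("genes kept for normalisation:", list(kept))
```

Requiring the threshold in as many samples as the smallest group, rather than in all samples, keeps genes such as geneC that are expressed in only one condition, which are often the genes of most interest.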

Written 2.9 years ago by Kevin Blighe