TMM followed by inverse normal transform
6 weeks ago

Hey all,

I am following a protocol from a paper that uses the following pre-processing procedure:

a. Read counts were normalized between samples using TMM (Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology 11, R25 (2010)).

b. Expression values for each gene were inverse normal transformed.

I used edgeR::calcNormFactors to compute TMM normalization factors for part a, but I am confused about how to apply an inverse normal transform to my read counts together with the normalized library sizes. What am I misunderstanding? I know that I can apply other transforms such as cpm or rpkm to the output of calcNormFactors, and they will use the normalized library sizes. Is there a similar function for the inverse normal transformation?

Appreciate any help.

edgeR GTEx RNAseq • 255 views
6 weeks ago
Gordon Smyth ★ 3.5k

I am not convinced that the inverse normal transformation is a good idea, so we don't provide a function for it in edgeR. Here is how you would do it, however:

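library(edgeR)
dge <- calcNormFactors(dge)     # TMM normalization factors (dge is a DGEList of counts)
logCPM <- cpm(dge, log=TRUE)    # log2-CPM computed with the TMM-adjusted library sizes
n <- ncol(logCPM)               # number of samples
zvalues <- qnorm(ppoints(n))    # standard normal quantiles, one per rank 1..n
z <- logCPM
# For each gene, order(order(x)) gives the rank of each sample (ties broken by position)
for (i in seq_len(nrow(z))) z[i,] <- zvalues[order(order(z[i,]))]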


The inverse normal values are now stored in z.
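
To see what the loop does, here is a minimal base-R sketch on a made-up 2 x 5 matrix. The values, the sample names and the helper `int_row` are illustrative assumptions, not part of edgeR, and the sketch assumes there are no missing values. It uses rank() instead of order(order()), which should give the same result when ties are broken by position.

```r
# Toy log-expression matrix: 2 genes (rows) x 5 samples (columns); values are made up.
x <- matrix(c(2.1, 5.3, 3.7, 0.4, 4.9,
              7.0, 6.2, 9.5, 8.1, 6.8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("geneA", "geneB"), paste0("s", 1:5)))

n <- ncol(x)
# ppoints(n) is (i - a) / (n + 1 - 2a) for i = 1..n,
# with a = 3/8 when n <= 10 and a = 1/2 otherwise.
zvalues <- qnorm(ppoints(n))

# Hypothetical helper: map one gene's values to normal quantiles by rank.
# ties.method = "first" mirrors the behaviour of order(order(v)).
int_row <- function(v) zvalues[rank(v, ties.method = "first")]

z <- t(apply(x, 1, int_row))   # apply() returns samples x genes, so transpose back
dimnames(z) <- dimnames(x)
z
```

Each row of z contains the same n quantiles, just arranged according to that gene's ranking of the samples, so every gene ends up with an identical marginal distribution.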


This looks fantastic, thank you! I have also read a few papers about the overuse of INT in settings where it is not appropriate, but right now I’m just trying to replicate the data in a paper… Are there any resources you would recommend for exploring other normalization methods? Thanks again.


The edgeR recommendation is simply to use logCPM for most purposes (other than the DE analysis itself, which does not require normalized expression values).