Question: output TMM normalized counts with edgeR
guillaume.rbt wrote (9 months ago):

Hi all,

Sorry, I know this question has been asked several times, but I either couldn't find the right answer or didn't understand it.

I'm trying to get TMM-normalized counts using edgeR.

I understand that I have to compute normalization factors:

dgList <- calcNormFactors(dgList, method="TMM")

which gives me a normalization factor for each sample:

head(dgList$samples)

   group lib.size norm.factors
S1     1 21087314    0.9654794
S2     1 16542810    1.1589117
S3     1 18875473    0.8763291
S4     1 15865414    1.0864038
S5     1 19179795    1.0488230
S6     1 15063992    1.0707007

But at this point I don't know how to get a matrix of TMM-normalized counts.

I know that I can get CPM-normalized counts with:

cpm(dgList)

But CPM and TMM are not the same, right?

Thanks in advance for any of your input on this topic.

James Ashmore wrote (9 months ago):

If you run the cpm function on a DGEList object which contains TMM normalisation factors then you will get TMM normalised counts. Here is a snippet of the source code for the cpm function:

cpm.DGEList <- function(y, normalized.lib.sizes=TRUE, log=FALSE, prior.count=0.25, ...)
#   Counts per million for a DGEList
#   Davis McCarthy and Gordon Smyth.
#   Created 20 June 2011. Last modified 10 July 2017
{
    lib.size <- y$samples$lib.size
    if(normalized.lib.sizes) lib.size <- lib.size*y$samples$norm.factors
    cpm.default(y$counts,lib.size=lib.size,log=log,prior.count=prior.count)
}

By default (normalized.lib.sizes=TRUE), the function multiplies each sample's lib.size by its norm.factors value, which calcNormFactors sets (it is 1 for every sample in a freshly created DGEList), and uses these effective library sizes to scale the raw counts. You were right in your original post: just run the following and you will have TMM-normalised values, expressed as counts per million:

dge <- calcNormFactors(dge, method = "TMM")
tmm <- cpm(dge)
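
To make the arithmetic concrete, here is a minimal base-R sketch of what the snippet above suggests cpm does with the normalisation factors. It is not edgeR's actual implementation. The library sizes and factors for S1 and S2 are copied from the table in the question; the three gene counts and the names counts, eff.lib.size, tmm.cpm and plain.cpm are made up for illustration.

```r
# Hypothetical raw counts for 3 genes (rows) in 2 samples (columns);
# only a handful of genes are shown, so columns do not sum to lib.size
counts <- matrix(c(100, 200, 700,
                   150, 250, 600),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("S1", "S2")))

# Library sizes and TMM factors taken from the table in the question
lib.size     <- c(S1 = 21087314, S2 = 16542810)
norm.factors <- c(S1 = 0.9654794, S2 = 1.1589117)

# Effective library size = raw library size * normalisation factor
eff.lib.size <- lib.size * norm.factors

# Divide each sample column by its effective library size, scale to per million
tmm.cpm <- t(t(counts) / eff.lib.size) * 1e6

# Plain CPM ignores the factors (the normalized.lib.sizes=FALSE case)
plain.cpm <- t(t(counts) / lib.size) * 1e6

print(tmm.cpm)
print(plain.cpm)
```

The two matrices differ only by the per-sample factor: a sample with a factor below 1 (S1 here) ends up with slightly higher values than its plain CPM, and a sample with a factor above 1 (S2) with slightly lower ones.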

guillaume.rbt replied (9 months ago):

OK great, I thought there was something going on with the cpm function, but I get it now.

lieven.sterck wrote (9 months ago):

No, indeed, CPM and TMM are not exactly the same.

Perhaps try this snippet of code:

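# Estimating the dispersions also stores pseudo-counts in the DGEList
dgList <- estimateCommonDisp(dgList)
dgList <- estimateTagwiseDisp(dgList)
# Multiply each sample's pseudo-counts (one column per sample) by its normalization factor
norm_counts.table <- t(t(dgList$pseudo.counts) * dgList$samples$norm.factors)
write.table(norm_counts.table, file="./normalizedCounts.txt", sep="\t", quote=FALSE)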
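
The double transpose in that snippet is there so that each sample column is multiplied by its own factor. A small base-R sketch of the idiom, using a made-up matrix m and made-up factors f (neither comes from edgeR or from the data in this thread):

```r
# Hypothetical matrix: 2 genes (rows) x 3 samples (columns)
m <- matrix(c(10, 20,
              30, 40,
              50, 60),
            nrow = 2,
            dimnames = list(c("geneA", "geneB"), c("S1", "S2", "S3")))

# One made-up factor per sample (column)
f <- c(S1 = 0.5, S2 = 1, S3 = 2)

# Idiom from the snippet: transpose so samples are rows, multiply, transpose back
scaled.t <- t(t(m) * f)

# Same result with sweep(), which states the margin (2 = columns) explicitly
scaled.sweep <- sweep(m, 2, f, "*")

# Pitfall: m * f recycles f in column-major order, so the factors
# are not matched to the sample columns
wrong <- m * f

print(scaled.t)
```

Whether to write this as t(t(m) * f) or as sweep() is a matter of taste; sweep() is arguably easier to read because the margin is spelled out.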

guillaume.rbt replied (9 months ago):

Thanks for your help.

Could you explain to me what the "pseudo.counts" are?

James Ashmore replied (9 months ago):

There's no need to calculate the TMM values yourself; the cpm function should do it for you, given a DGEList whose samples table has the lib.size and norm.factors columns (norm.factors being filled in when you run calcNormFactors).