At what point in the edgeR workflow should cpm() be called?
20 months ago
camelest ▴ 50

Hi, I have a question about the edgeR cpm() function. According to the user's guide, the standard workflow is:

library(edgeR)
x <- read.delim("TableOfCounts.txt", row.names="Symbol")
group <- factor(c(1,1,2,2))
y <- DGEList(counts=x, group=group)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y)        # point 1
design <- model.matrix(~group)
y <- estimateDisp(y, design)   # point 2

When we want a table of log2-CPM values from cpm(), for example for clustering or a heatmap, as in

logcpm <- cpm(y, log=TRUE)

should this be done at point 1, just after calcNormFactors(), or at point 2, after the dispersions have been estimated? Is it fine as long as it comes after calcNormFactors()?

RNA-seq edgeR CPM • 776 views
20 months ago
Gordon Smyth ★ 7.0k

Yes, anytime after calcNormFactors(). Dispersion estimates don't affect the logCPM.
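
For anyone who wants to check this on their own data, here is a minimal sketch that computes the logCPM at both points and compares them. The counts are simulated purely for illustration (the matrix, gene names and sample names are made up, not from the original post), and the variable names logcpm_point1 and logcpm_point2 are just labels for this example.

```r
library(edgeR)

set.seed(1)
# Simulated counts: 1000 genes x 4 samples (illustrative data only)
counts <- matrix(rnbinom(4000, mu = 50, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:4)))
group <- factor(c(1, 1, 2, 2))

y <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

# Point 1: right after the normalization factors are computed
logcpm_point1 <- cpm(y, log = TRUE)

design <- model.matrix(~group)
y <- estimateDisp(y, design)

# Point 2: after dispersion estimation
logcpm_point2 <- cpm(y, log = TRUE)

# Compare the two matrices
same <- isTRUE(all.equal(logcpm_point1, logcpm_point2))
print(same)
```

The comparison should come out equal, since estimateDisp() adds dispersion-related components to the DGEList but leaves the counts, library sizes and normalization factors alone. Calling cpm() before calcNormFactors() would be a different matter, because the normalization factors would still all be 1 at that point.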


Thank you so much for the clarification.
