Hi, I have a question about the edgeR cpm() function. According to the user manual, the standard workflow is:
x <- read.delim("TableOfCounts.txt", row.names="Symbol")
group <- factor(c(1,1,2,2))
y <- DGEList(counts=x, group=group)
keep <- filterByExpr(y)
y <- y[keep,,keep.lib.sizes=FALSE]
y <- calcNormFactors(y)  # *1
design <- model.matrix(~group)
y <- estimateDisp(y, design)  # *2
When we want a table of log2-CPM values from the cpm() function, for example for clustering or a heatmap, as in
logcpm <- cpm(y, log=TRUE)
should that call happen at point 1, just after calculating the normalization factors, or at point 2, after estimating the dispersions? Is it OK as long as it comes after calcNormFactors()?
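
To make the question concrete, here is a minimal sketch of how the two points could be compared directly. It assumes simulated negative binomial counts in place of my real table (the matrix `counts`, the gene names, and the names `logcpm1` and `logcpm2` are made up for illustration), and otherwise follows the same steps as above:

```r
library(edgeR)

# Simulated counts standing in for TableOfCounts.txt (hypothetical data)
set.seed(1)
counts <- matrix(rnbinom(4000, mu = 50, size = 5), ncol = 4)
rownames(counts) <- paste0("gene", seq_len(nrow(counts)))
group <- factor(c(1, 1, 2, 2))

y <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

# Point 1: log-CPM right after the normalization factors are computed
logcpm1 <- cpm(y, log = TRUE)

design <- model.matrix(~group)
y <- estimateDisp(y, design)

# Point 2: log-CPM after dispersion estimation
logcpm2 <- cpm(y, log = TRUE)

# Compare the two matrices
all.equal(logcpm1, logcpm2)
```

If the comparison shows the two matrices agree, that would suggest cpm() only uses the counts, library sizes, and normalization factors, so the position relative to estimateDisp() would not matter. I would still like confirmation that this holds in general.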