I am currently researching the differences in RNA-Seq data analysis, comparing the two well known EdgeR and Voom methods. However, there is one thing I can not manage to reproduce, namely the logCPM value in the output of the LRT table of EdgeR, after analyzing a certain contrast or coefficient. I understand from the manual and helpfile, that this logCPM value is a log2 counts per million, normalized for library sizes. I also understand that it is not a simple mean that is being used, but rather the mglmOneGroup() function. However, when I try to recalculate this myself, I can not obtain the same value. My workflow in doing so looks like this:
#Calculate through the table counts <- read.delim("file.txt") Treat <- factor(rep(c(rep(c("Cont","DPN","OHT"),4)),each=2))[-19] #Delete removed sample 19 by authors y.s <- DGEList(counts=counts.filtered.smart,group=Treat) y.s <- calcNormFactors(y.s) y.common <- estimateGLMCommonDisp(y, design, verbose=TRUE) y.trended <- estimateGLMTrendedDisp(y.common, design) y.tagwise <- estimateGLMTagwiseDisp(y.trended, design,prior.df=22) lrt <- glmLRT(fit,coef= c(5,6,9,8)) lrt$table$logCPM #Calculate manually cpmstest <- cpm(y, normalized.lib.sizes = TRUE) LogCpmstest <- log(cpmstest + 0.5, base = 2) mglmOneGroup(LogCpmstest) #not identical mglmOneGroup(y.tagwise) #also non-identical
Has anyone got any recommendations in what should be changed in this manual calculation workflow? What am I doing wrong? Any help would be greatly appreciated.