I've read and been told that edgeR must take in raw counts only. However, I noticed that
in the GeTMM code below, RPK values (not counts) are being fed into the
TMM normalization procedure (calcNormFactors). Is this a correct usage, assuming the assumptions of TMM hold?
Should the resulting values be used for downstream DGE analysis?
Note: I am no expert with these methods; I just wanted to ask the community.

    # calculate RPK
    rpk <- (x[,2:ncol(x)]/x[,1])
    # remove length col in x
    x <- x[,-1]
    # for normalization purposes, no grouping of samples
    group <- c(rep("A",ncol(x)))

    #EdgeR
    x.norm.edger <- DGEList(counts=x,group=group)
    x.norm.edger <- calcNormFactors(x.norm.edger)
    norm.counts.edger <- cpm(x.norm.edger)

    #GeTMM
    rpk.norm <- DGEList(counts=rpk,group=group)
    rpk.norm <- calcNormFactors(rpk.norm)
    norm.counts.rpk_edger <- cpm(rpk.norm)

    # Source:
    # https://static-content.springer.com/esm/art%3A10.1186%2Fs12859-018-2246-7/MediaObjects/12859_2018_2246_MOESM4_ESM.docx
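
To make the comparison concrete, here is my understanding of what the two pipelines above compute. This is only a sketch: the symbols are my own notation and do not come from the source, and I am assuming the first column of x holds the gene length in kb. Here c_{gi} is the raw count of gene g in sample i, L_g is the gene length, and f_i and f'_i are the TMM factors that calcNormFactors returns when given counts and RPK values respectively.

\[
\begin{aligned}
\mathrm{RPK}_{gi} &= \frac{c_{gi}}{L_g} \\
\text{edgeR:}\quad \mathrm{CPM}_{gi} &= 10^{6}\,\frac{c_{gi}}{f_i \sum_{g'} c_{g'i}} \\
\text{GeTMM:}\quad \mathrm{GeTMM}_{gi} &= 10^{6}\,\frac{\mathrm{RPK}_{gi}}{f'_i \sum_{g'} \mathrm{RPK}_{g'i}}
\end{aligned}
\]

So the only difference is that the second pipeline divides by gene length before the library sizes and the TMM factors are computed, which is the step my question is about.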