Question: Does gene-length-corrected TMM (GeTMM) violate any assumptions of TMM normalization?
9 months ago, O.rka wrote:

I've read and been told that edgeR must take in raw counts only. In the GeTMM code below, however, RPK values are fed into the TMM normalization procedure. Is this a valid use of TMM, given the assumptions the method makes?

Should this be used for downstream DGE analysis?

Note: I am no expert in these methods; I just wanted to ask the community.

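library(edgeR)

# calculate RPK (reads per kilobase); column 1 of x holds the gene length
# (assumed here to be in kilobases), the remaining columns hold raw counts
rpk <- (x[,2:ncol(x)]/x[,1])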
# remove length col in x
x <- x[,-1]
# for normalization purposes, no grouping of samples
group <- c(rep("A",ncol(x)))
x.norm.edger <- DGEList(counts=x,group=group)
x.norm.edger <- calcNormFactors(x.norm.edger)
norm.counts.edger <- cpm(x.norm.edger)

rpk.norm <- DGEList(counts=rpk,group=group)
rpk.norm <- calcNormFactors(rpk.norm)
norm.counts.rpk_edger <- cpm(rpk.norm)

# Source:
9 months ago, Damian Kao wrote:

Technically, RPK values do not violate assumptions of TMM.

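TMM (trimmed mean of M-values) is just a technique that tries to find the non-DE portion of the expression distribution by very liberally trimming off genes with extreme log fold changes and extreme abundances, then deriving a scaling factor from what remains. It doesn't matter what kind of expression units you are using.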

However, RPK values do violate the assumptions of the DE analysis itself, so you cannot use them for downstream DGE.
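
To illustrate the trimming idea, here is a minimal base-R sketch. It is not edgeR's implementation: `tmm_sketch` is a hypothetical helper that takes an unweighted trimmed mean with quantile-based cut-offs, whereas `calcNormFactors()` also applies precision weights. The trim fractions (30% on log ratios, 5% on abundances) follow edgeR's defaults, and the input data are simulated non-integer, RPK-like values.

```r
# Hypothetical helper: simplified, unweighted TMM-style scaling factor
# for one sample (obs) relative to a reference sample (ref).
tmm_sketch <- function(obs, ref, m_trim = 0.3, a_trim = 0.05) {
  keep  <- obs > 0 & ref > 0            # log ratios need positive values
  p_obs <- obs[keep] / sum(obs)         # share of the sample total
  p_ref <- ref[keep] / sum(ref)
  m <- log2(p_obs / p_ref)              # M: log fold change
  a <- 0.5 * log2(p_obs * p_ref)        # A: average log abundance
  # TRUE for values lying between the lower and upper trim quantiles
  in_mid <- function(v, trim) {
    v >= quantile(v, trim) & v <= quantile(v, 1 - trim)
  }
  core <- in_mid(m, m_trim) & in_mid(a, a_trim)
  2^mean(m[core])                       # scaling factor from untrimmed genes
}

set.seed(1)
ref <- rexp(1000, rate = 1 / 100) + 1   # simulated positive, non-integer values
obs <- ref * 2                          # same composition, twice the depth
obs[1:50] <- obs[1:50] * 20             # a few strongly up-regulated genes
f <- tmm_sketch(obs, ref)
```

Because the 50 up-regulated genes fall in the trimmed tail, the factor is driven only by the unchanged genes, and nothing in the calculation requires integer counts.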

9 months ago, swbarnes2 (United States) wrote:

For DGE, use raw counts, like the software demands. Other normalizations can be used for things like visualizations.
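
A rough sketch of that split, assuming edgeR is installed: raw counts go into the DGE model, and length-corrected values are computed separately for plotting. The count matrix, gene lengths, and group labels are simulated stand-ins, not data from the question.

```r
library(edgeR)

set.seed(42)
# Simulated stand-ins (assumptions): 200 genes, 6 samples, gene lengths in bp
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
gene_length <- sample(500:5000, 200, replace = TRUE)
group <- factor(rep(c("ctrl", "trt"), each = 3))

# DGE on raw counts: TMM factors adjust the library sizes, counts stay untouched
y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)                 # TMM is the default method
design <- model.matrix(~ group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
res <- glmQLFTest(fit, coef = 2)        # test trt vs ctrl
top <- topTags(res)

# Length-corrected values, for visualization only
rpkm_vals <- rpkm(y, gene.length = gene_length)
```

The normalization factors only rescale the effective library sizes used by the model, so the counts that the statistical test sees remain integers.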
