Comparison of ratio of expression between two genes across samples
Entering edit mode
10 months ago
guillepalou4 ▴ 20

Hi guys,

I have been reading a lot about within-sample and between-sample RNA-seq normalization but I still haven't found an answer for my question.

My data consists of an RNA-seq matrix with Transcripts-Per-Million (TPM) values, so it is within-sample normalized. I want to compare the expression of two genes within a sample (expression gene 1 / expression gene 2), and then compare the resulting ratio between samples (from the same group, I do not have different conditions or experiments, this is not my purpose). My guess is that, as I am comparing relative abundances between genes within a sample, I can compare these ratios between samples without a between-sample normalization (like DESeq2 or TMM). But I am not 100% sure. Do I need this normalization or I can compare samples directly with TPM in this particular case?

Thank you!


RNA-Seq TPM normalization relative abundance • 558 views
Entering edit mode
10 weeks ago

Hi, I faced a similar issue. I think you can use DESeq2 as a starting point, but then need to do some manual adjustments. I posted to DESeq2 support at but so far got no response whether the authors agree it is a sensible approach, so some caution is warranted.

Assuming a design of counts ~ condition the null hypothesis is that for each gene A and a fixed reference B: log(mu_A_condition1) - log(mu_B_condition1) >= log(mu_A_condition2) - log(mu_B_condition2).

My understanding is that this null hypothesis can actually be tested with DESeq2 with a little tweak, because with dummy coding it translates to beta_A_condition2 - beta_B_condition2 <= 0, so I can get the maximum likelihood estimate by just subtracting the two beta values given by DESeq2, get the SE by taking sqrt(beta_A_SE^2 + beta_B_SE_^2) (as the betas should have 0 covariance, right?) which is all I need to compute the Wald statistic and then I can once again carry on the same way DESeq2 does (p-adjustment, shrinkage, ...)

Here's my R code that performs this - you give the value of results(dds) for the coefficient you are trying to compare and the name of the reference gene. It then computes estimates for all genes against the reference, performs p-value adjustments and shrinkage with ashr

change_from_reference <- function(results, reference, altHypothesis = "greater" , threshold = 0) {

  reference_beta <- results[reference, "log2FoldChange"]
  reference_beta_SE <- results[reference, "lfcSE"]
  res_others <- results[rownames(results) != reference,]
  LFC <- res_others$log2FoldChange - reference_beta
  SE <- sqrt(res_others$lfcSE ^ 2 + reference_beta_SE ^ 2)
  T <- threshold

  #Taken from on
  pfunc <- function(q) pnorm(q, lower.tail=FALSE) 

  if (altHypothesis == "greaterAbs") {
      newStat <- sign(LFC) * pmax((abs(LFC) - T)/SE, 0)
      newPvalue <- pmin(1, 2 * pfunc((abs(LFC) - T)/SE))
    } else if (altHypothesis == "lessAbs") {
      newStatAbove <- pmax((T - LFC)/SE, 0)
      pvalueAbove <- pfunc((T - LFC)/SE)
      newStatBelow <- pmax((LFC + T)/SE, 0)
      pvalueBelow <- pfunc((LFC + T)/SE)
      newStat <- pmin(newStatAbove, newStatBelow)
      newPvalue <- pmax(pvalueAbove, pvalueBelow)
    } else if (altHypothesis == "greater") {
      newStat <- pmax((LFC - T)/SE, 0)
      newPvalue <- pfunc((LFC - T)/SE)
    } else if (altHypothesis == "less") {
      newStat <- pmin((LFC + T)/SE, 0)
      newPvalue <- pfunc((-T - LFC)/SE)
    } else {
      stop("Invalid altHypothesis")

  ashrfit <- ashr::ash(LFC, SE,
                     mixcompdist="normal", method="shrink")
  data.frame(gene = rownames(res_others), 
             estimate = LFC, shrunkEst = ashrfit$result$PosteriorMean, 
             shrunkSE = ashrfit$result$PosteriorSD, p.adj = p.adjust(newPvalue, method = "BH")) 

Login before adding your answer.

Traffic: 2281 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6