Question: p-value calculation when comparing ratios of means (not means of ratios)
gravatar for wildtype
2.8 years ago by
wildtype40 wrote:

Hello, I apologize in advance for the dumb question but I am no expert in statistics, my course only covered how to do student's t.test pretty much. I could not find information in different forums on a similar problem specifically. I am not sure what to do in the following situation. Suppose we measure the expression of geneX by RT-qPCR in two different cell lines, each of which is treated with some drug or not treated. They are independent, so it is not the same cells before an after treatment; it is different cells that are either treated or not. Suppose we have 3 biological replicates for each condition, and the following values

Cell line A, no treatment: c(0.9,1,1.1) => mean expression 1 +/- 0.1 standard deviation (let's call this A1) Cell line A, treatment: c(9,10,11) => mean expression 10 +/- 1 sd (A2) Cell line B: no treatment: c(2.9,3,3.1) => mean expression 3 +-/ 0.1 sd (B1) Cell line B: treatment: c(290,300,310) => mean expression 300 +-/ 10 sd (B2)

Obviously, the drug causes an increase in both cell lines, but in line A there is a 10-fold increase (A2/A1=10/1) while in line B there is a 100 fold increase (B2/B1=300/3). I want to claim that the effect in B is much stronger.

What statistical test should we use to tell whether the RATIOS (gene expression upon treatment normalized to basal gene expression without treatment) are significantly different i.e. A2/A1 +/- propagated error != B2/B1 +/- error

Thanks in advance

statistics • 1.1k views
ADD COMMENTlink modified 2.7 years ago by ejm32440 • written 2.8 years ago by wildtype40
gravatar for pbpanigrahi
2.7 years ago by
pbpanigrahi190 wrote:


Since you are interested in asking the question I want to claim that the effect in B is much stronger, your

  • Null hypothesis can be: Fold change in cell line A is <= B
  • Alternate hypothesis can be: Fold change in cell line A is > B

Now you will do the testing. Since there are 3 replicates, n=3, you will have to go for t-test with 2 degree of freedom (mark that you have less power). Also since before and after are not paired, you should go for unpaired t test.

Before doing you have to calculate fold change e.g. A1 with A2 and B1 with B2, so you will have 3 fold change values

a1 = c(0.9,1,1.1)
a2 = c(9,10,11)
b1 = c(2.9,3,3.1)
b2 = c(290,300,310)

a_diff = a2/a1
b_diff = b2/b1

t.test(a_diff, b_diff, alternative = "greater")

In the example above, there is 10 fold change in every comparison, so t test would give error. Why it will give error, refer this link

Hope this helps

ADD COMMENTlink written 2.7 years ago by pbpanigrahi190

The problems with this approach is that you are pairing a1 to a2 and b1 to b2, when OP specifically states that before and after treatment are independent samples. Equally valid would be a_diff <- rev(a1)/a2 and all other permutations

ADD REPLYlink written 2.7 years ago by ejm32440
gravatar for ejm32
2.7 years ago by
Boston, MA
ejm32440 wrote:

In this case I think the best thing to is make use of linear models, which will allow you to perform contrasts based on the comparisons you care about. Have a look at the limma manual for a bit more information.

a1 <-  c(0.9,1,1.1)
a2 <-  c(9,10,11)
b1 <-  c(2.9,3,3.1)
b2 <-  c(290,300,310)

m <- data.frame(a1,a2,b1,b2, stringsAsFactors = F)
m <- reshape2::melt(m)

mm <- model.matrix( ~ 0 + m$variable)
colnames(mm) <- levels(m$variable)
res <- lmFit(t(log2(m[2])), mm) #If your values are already log transformed remove the log2 call

cm <- makeContrasts(A=a2-a1,
                    Diff = (b2-b1)-(a2-a1), levels = levels(m$variable))

res2 <-, cm)
res3 <- eBayes(res2)
topTable(res3, coef = 3 ,number = 1)

#              logFC  AveExpr           t      P.Value    adj.P.Val         B
# value   3.32192809 3.281243 27.82837605 6.974347e-10 2.789739e-09 13.450186
ADD COMMENTlink modified 2.7 years ago • written 2.7 years ago by ejm32440
gravatar for Chirag Parsania
2.8 years ago by
Chirag Parsania1.9k
University of Macau
Chirag Parsania1.9k wrote:

Hi, Here I showed the crude way to calculate the probability from normal distribution using given information. I am not very much familiar with various statistical tests. There must be sophisticated ways to get p-values than what i mentioned here.

## generate populations from given information
a1 <- rnorm(n =1000 , mean = 1, 0.1)
a2 <- rnorm(n =1000 , mean = 10, 1)
b1 <- rnorm(n =1000 , mean = 3, 0.1)
b2 <- rnorm(n =1000 , mean = 300, 10)

## get distribution of ratios. You can change number of iterations 
n_iter <- 1000
ratios <- sapply(seq(n_iter), function(i){

        ## select 3 random from each populations  

        samp_a1  <- mean(sample(a1, 3))
        samp_a2  <- mean(sample(a1, 3))
        samp_b1  <- mean(sample(b1, 3))
        samp_b2 <- mean(sample(b2, 3))

        r1 <- sampa2 /sampa1  ## A2/A1

        r2 <- sampb2 /sampb1 ## B2/B1 

        r_ratios <- r2/r1  ## Treatment / WO Treatment


## distribution of ratios 

## zscore 
zscore <-  (1 -  mean(ratios)) / sd(ratios)

## probability of normal dist   
ADD COMMENTlink modified 2.8 years ago • written 2.8 years ago by Chirag Parsania1.9k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1309 users visited in the last hour