Question: p-value calculation when comparing ratios of means (not means of ratios)
1
gravatar for wildtype
15 months ago by
wildtype40
wildtype40 wrote:

Hello, I apologize in advance for the dumb question but I am no expert in statistics, my course only covered how to do student's t.test pretty much. I could not find information in different forums on a similar problem specifically. I am not sure what to do in the following situation. Suppose we measure the expression of geneX by RT-qPCR in two different cell lines, each of which is treated with some drug or not treated. They are independent, so it is not the same cells before an after treatment; it is different cells that are either treated or not. Suppose we have 3 biological replicates for each condition, and the following values

Cell line A, no treatment: c(0.9,1,1.1) => mean expression 1 +/- 0.1 standard deviation (let's call this A1) Cell line A, treatment: c(9,10,11) => mean expression 10 +/- 1 sd (A2) Cell line B: no treatment: c(2.9,3,3.1) => mean expression 3 +-/ 0.1 sd (B1) Cell line B: treatment: c(290,300,310) => mean expression 300 +-/ 10 sd (B2)

Obviously, the drug causes an increase in both cell lines, but in line A there is a 10-fold increase (A2/A1=10/1) while in line B there is a 100 fold increase (B2/B1=300/3). I want to claim that the effect in B is much stronger.

What statistical test should we use to tell whether the RATIOS (gene expression upon treatment normalized to basal gene expression without treatment) are significantly different i.e. A2/A1 +/- propagated error != B2/B1 +/- error

Thanks in advance

statistics • 600 views
ADD COMMENTlink modified 13 months ago by ejm32440 • written 15 months ago by wildtype40
1
gravatar for pbpanigrahi
13 months ago by
pbpanigrahi180
pbpanigrahi180 wrote:

Hi

Since you are interested in asking the question I want to claim that the effect in B is much stronger, your

  • Null hypothesis can be: Fold change in cell line A is <= B
  • Alternate hypothesis can be: Fold change in cell line A is > B

Now you will do the testing. Since there are 3 replicates, n=3, you will have to go for t-test with 2 degree of freedom (mark that you have less power). Also since before and after are not paired, you should go for unpaired t test.

Before doing you have to calculate fold change e.g. A1 with A2 and B1 with B2, so you will have 3 fold change values

a1 = c(0.9,1,1.1)
a2 = c(9,10,11)
b1 = c(2.9,3,3.1)
b2 = c(290,300,310)

a_diff = a2/a1
b_diff = b2/b1

t.test(a_diff, b_diff, alternative = "greater")

In the example above, there is 10 fold change in every comparison, so t test would give error. Why it will give error, refer this link

Hope this helps

ADD COMMENTlink written 13 months ago by pbpanigrahi180

The problems with this approach is that you are pairing a1 to a2 and b1 to b2, when OP specifically states that before and after treatment are independent samples. Equally valid would be a_diff <- rev(a1)/a2 and all other permutations

ADD REPLYlink written 13 months ago by ejm32440
1
gravatar for ejm32
13 months ago by
ejm32440
Boston, MA
ejm32440 wrote:

In this case I think the best thing to is make use of linear models, which will allow you to perform contrasts based on the comparisons you care about. Have a look at the limma manual for a bit more information.

library(limma)
a1 <-  c(0.9,1,1.1)
a2 <-  c(9,10,11)
b1 <-  c(2.9,3,3.1)
b2 <-  c(290,300,310)

m <- data.frame(a1,a2,b1,b2, stringsAsFactors = F)
m <- reshape2::melt(m)

mm <- model.matrix( ~ 0 + m$variable)
colnames(mm) <- levels(m$variable)
res <- lmFit(t(log2(m[2])), mm) #If your values are already log transformed remove the log2 call

cm <- makeContrasts(A=a2-a1,
                    B=b2-b1,
                    Diff = (b2-b1)-(a2-a1), levels = levels(m$variable))

res2 <- contrasts.fit(res, cm)
res3 <- eBayes(res2)
topTable(res3, coef = 3 ,number = 1)

#              logFC  AveExpr           t      P.Value    adj.P.Val         B
# value   3.32192809 3.281243 27.82837605 6.974347e-10 2.789739e-09 13.450186
ADD COMMENTlink modified 13 months ago • written 13 months ago by ejm32440
0
gravatar for Chirag Parsania
15 months ago by
Chirag Parsania1.5k
University of Macau
Chirag Parsania1.5k wrote:

Hi, Here I showed the crude way to calculate the probability from normal distribution using given information. I am not very much familiar with various statistical tests. There must be sophisticated ways to get p-values than what i mentioned here.

## generate populations from given information
a1 <- rnorm(n =1000 , mean = 1, 0.1)
a2 <- rnorm(n =1000 , mean = 10, 1)
b1 <- rnorm(n =1000 , mean = 3, 0.1)
b2 <- rnorm(n =1000 , mean = 300, 10)

## get distribution of ratios. You can change number of iterations 
n_iter <- 1000
ratios <- sapply(seq(n_iter), function(i){

        ## select 3 random from each populations  

        samp_a1  <- mean(sample(a1, 3))
        samp_a2  <- mean(sample(a1, 3))
        samp_b1  <- mean(sample(b1, 3))
        samp_b2 <- mean(sample(b2, 3))


        r1 <- sampa2 /sampa1  ## A2/A1

        r2 <- sampb2 /sampb1 ## B2/B1 

        r_ratios <- r2/r1  ## Treatment / WO Treatment

        return(r_diff)        
})

## distribution of ratios 
plot(density(ratios))

## zscore 
zscore <-  (1 -  mean(ratios)) / sd(ratios)

## probability of normal dist   
pnorm(zscore)
ADD COMMENTlink modified 15 months ago • written 15 months ago by Chirag Parsania1.5k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1445 users visited in the last hour