Question

Performing one-sample t-tests with limma

3.6 years ago

digestize ▴ 20

I am new to bioinformatics and come from a data analytics background. I have a set of normalized ratio measurements from an experiment for each protein under consideration. There are exactly 4 measurements for each protein. I ran a one-sample t-test for each protein and identified the statistically significant ones. I am wondering if I can do better by using limma. Is it possible to use limma here to get a p-value for each protein? Also, the measurements are ratios and normalized between 0 and 1. Do I need to do any pre-processing if limma is applicable here? Could you also help identify the design matrix for this case, if it is applicable? Thank you very much.

limma one t-test p-value proteomics sample • 1.3k views

updated 3.6 years ago by dariober 15k • written 3.6 years ago by digestize ▴ 20

score 0 · Answer 1 · 2021-12-08

the measurements are ratios and normalized between 0 and 1. Do I need to do any pre-processing if limma is applicable here?

I would probably transform the values with, e.g., the logit or arcsine function. limma fits a linear model that assumes roughly normal, unbounded values, and ratios confined to the 0-to-1 range fit that assumption poorly.
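A minimal sketch of the two transformations, in base R. The helper names `safe_logit` and `asin_sqrt` and the clamping offset `eps` are assumptions for illustration, not part of limma. The clamping is there because the plain logit is infinite for ratios that are exactly 0 or 1; for the arcsine I have assumed the usual arcsine square-root form.

```r
# Hypothetical helper: logit with clamping so that exact 0s and 1s stay finite.
# eps is an assumed offset; choose a value that suits your data.
safe_logit <- function(p, eps = 1e-6) {
    p <- pmin(pmax(p, eps), 1 - eps)
    log(p / (1 - p))
}

# Hypothetical helper: arcsine square-root transform for proportions
asin_sqrt <- function(p) {
    asin(sqrt(p))
}

p <- c(0, 0.1, 0.5, 0.67, 1)
safe_logit(p)
asin_sqrt(p)
```

Whichever transformation you pick, apply the same one to the value you want to test against.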

If you want to test whether each protein is above or below a given value, I would transform that value in the same way as the data, subtract it from the transformed data matrix, then fit an intercept-only model with limma and test whether the intercept is different from zero.
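To see why an intercept-only model amounts to a one-sample test, here is a small base-R sketch for a single protein. The four values in `y` are made-up numbers standing in for transformed, shifted replicates; nothing here uses limma.

```r
# Four hypothetical replicate values for one protein, already transformed and
# shifted so that 0 is the null value (made-up numbers for illustration)
y <- c(0.42, 0.77, 0.31, 0.58)

# Classic one-sample t-test against 0
tt <- t.test(y, mu = 0)

# Intercept-only linear model: the intercept estimate is the mean of y
fit1 <- lm(y ~ 1)
coefs <- summary(fit1)$coefficients

# Both approaches give the same t-statistic and p-value
c(t.test = unname(tt$statistic), lm = unname(coefs[1, "t value"]))
c(t.test = tt$p.value, lm = unname(coefs[1, "Pr(>|t|)"]))
```

The difference with limma is that `eBayes()` moderates each protein's variance by borrowing information across all proteins, which tends to help when there are only a few replicates per protein.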

Here's an example with simulated data:

library(limma)

logit <- function(p) {
    log(p/(1-p))
}

# A matrix of 100 proteins and 4 replicates. Data is between 0 and 1. Average ratio is ~0.67. 
# Make first two proteins different from 0.67: 
set.seed(1234)
X <- matrix(data= rbeta(n= 400, 4, 2), ncol= 4)
X[1,] <- c(0.1, 0.11, 0.12, 0.13)
X[2,] <- c(0.91, 0.92, 0.95, 0.99)
rownames(X) <- paste0('P', 1:nrow(X))

Transform the ratios to the unbounded logit scale, then subtract the critical value you want to test each protein against (here 0.67), transformed in the same way:

Y <- logit(X)
Y <- Y - logit(0.67)

Fit model with limma and test:

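# Intercept-only design: a single column of 1s, one row per replicate
design <- matrix(1, nrow= ncol(Y))
fit <- lmFit(Y, design= design)
# Moderated t-statistics for the intercept of each protein
fit <- eBayes(fit)
# Top 10 proteins by default
topTable(fit)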

# As expected, the first two proteins are detected as different from logit(0.67):
     logFC AveExpr     t  P.Value adj.P.Val      B
P1  -2.754  -2.754 -6.81 6.60e-06   0.00066  4.087
P2   2.366   2.366  5.06 1.52e-04   0.00759  0.953
P57 -1.614  -1.614 -3.85 1.63e-03   0.05432 -1.407
P66  1.558   1.558  2.81 1.35e-02   0.32918 -3.464
P58  1.757   1.757  2.71 1.65e-02   0.32918 -3.656
P94  0.981   0.981  2.28 3.80e-02   0.55366 -4.439
P96  0.961   0.961  2.27 3.88e-02   0.55366 -4.458
P17  1.076   1.076  2.19 4.50e-02   0.56269 -4.595
P9   0.854   0.854  2.08 5.53e-02   0.56675 -4.782
P15 -0.854  -0.854 -2.07 5.67e-02   0.56675 -4.805
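For comparison with the ordinary one-sample t-tests mentioned in the question, here is a sketch that runs an unmoderated test per protein on the same simulated data, using base R only. The object name `ordinary` and its column names are my own choices, picked to mirror the `topTable()` columns.

```r
logit <- function(p) {
    log(p / (1 - p))
}

# Recreate the simulated matrix from the example above
set.seed(1234)
X <- matrix(data = rbeta(n = 400, 4, 2), ncol = 4)
X[1, ] <- c(0.1, 0.11, 0.12, 0.13)
X[2, ] <- c(0.91, 0.92, 0.95, 0.99)
rownames(X) <- paste0('P', 1:nrow(X))
Y <- logit(X) - logit(0.67)

# Ordinary (unmoderated) one-sample t-test per protein, testing mean == 0
ordinary <- t(apply(Y, 1, function(y) {
    tt <- t.test(y, mu = 0)
    c(mean = unname(tt$estimate), t = unname(tt$statistic), P.Value = tt$p.value)
}))
ordinary <- as.data.frame(ordinary)

# Benjamini-Hochberg adjustment, the method topTable() uses by default
ordinary$adj.P.Val <- p.adjust(ordinary$P.Value, method = 'BH')
head(ordinary[order(ordinary$P.Value), ], 10)
```

With only 4 replicates, the ordinary test relies on a variance estimated from 4 points per protein, so its ranking can differ from limma's moderated one.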

Please check that this is correct. You might also have a look at the minfi package, since (I think) it works with 0-to-1 ratios (methylation beta values) using limma under the hood.