Dear Community,
I am new to limma and, since there are limited resources online on how to apply it to metabolomics data, I'm not sure whether I implemented it correctly, especially because in this specific experiment some limma p-values seem quite unrealistic.
As you can see, grp23 has an outlier for this metabolite.
Consequently, a regular Welch's t-test gives p = 0.489.
limma, on the other hand, reports the metabolite as significant: p = 0.0287.
Could you please advise whether you think the results are trustworthy and where I may have gone wrong?
Data preparation for matrix m (following "Could I apply limma to metabolon concentrations?"):
1) Normalized raw peak intensities (LC-MS) to internal standard and protein
2) Removed metabolites with > 10% missingness across test samples
3) log2-transformed the normalized intensities
4) Replaced the resulting -Inf values with 0
5) Blinded the experiment, because the data are unpublished
data: https://pastebin.com/L1mF6hAu
metadata: https://pastebin.com/JavEQ1Mx
My code:

library(limma)
library(dplyr)

m <- as.matrix(read.csv('m.csv', row.names = 1))
metadata <- read.csv('metadata.csv')
groups <- factor(metadata$group)
comparison <- c('grp23-grp18')  # character vector, as makeContrasts expects

design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)
cont_m <- makeContrasts(contrasts = comparison, levels = design)

fit <- lmFit(m, design)
fit <- contrasts.fit(fit, cont_m)
fit <- eBayes(fit, trend = TRUE)
res <- topTable(fit, coef = 1, n = Inf)

# limma result for the metabolite in question
res %>%
  filter(row.names(res) == 'metabolite_75')

# Welch's t-test on the same metabolite
x <- c(-17.72762381, -9.338114467, -8.636741925)  # grp23
y <- c(-10.13976212, -8.9665991, -9.218714982)    # grp18
t.test(x, y)
Dear Gordon,
Thank you very much for taking the time to write your answer.
Your explanation for the data "reparation" makes perfect sense.
I am of course aware that I need to interpret the adjusted p-values; I used the unadjusted p-values only to get a feeling for whether the limma analysis is plausible.
Also, thank you for suggesting the ANOVA approach. I suppose I blinded the experiment a bit too much, since some of the groups are time-course data; I'm not interested in comparing all of the groups.
Thank you for the suggestion to use a started log-transformation. I will do this from now on.
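If I understand correctly, the started log would replace steps 3) and 4) of my preparation with something like this (a sketch of my understanding, not code from the answer; `raw` is a hypothetical name for the normalized intensity matrix before any log-transformation):

```r
# started log2: add a small positive constant before logging, so zero
# intensities no longer produce -Inf and no replacement step is needed
c <- 1e-5
m <- log2(raw + c)
```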
However, I'm wondering how you decided on 1e-5 as the constant (c).
I did some testing, and the p-value of e.g. metabolite_75 strongly depends on this constant: it decreases with decreasing c and converges to the p-value I got from my original "broken" data set, where the outliers dominated.
c = 1e-4: 0.8110358
c = 1e-5: 0.3721194
c = 1e-6: 0.1362911
c = 1e-7: 0.05532647
c = 1e-8: 0.03213364
c = 1e-9: 0.02850181
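Something along these lines reproduces the sweep above (a sketch, not my exact code; `raw` is a hypothetical name for the normalized, un-logged intensity matrix, and `design` and `cont_m` are the objects from my code above):

```r
# re-fit the limma model for each offset and extract the p-value
# of metabolite_75 from the grp23-grp18 contrast
for (c in 10^(-(4:9))) {
  fit <- lmFit(log2(raw + c), design)
  fit <- contrasts.fit(fit, cont_m)
  fit <- eBayes(fit, trend = TRUE)
  p <- topTable(fit, coef = 1, n = Inf)['metabolite_75', 'P.Value']
  cat('c =', c, ': p =', p, '\n')
}
```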
Here are the distributions of the transformed data for these values of c:
Visually, the histogram with c = 1e-5 still seems slightly skewed.
Since the p-value strongly depends on this value, how did you decide on 1e-5?
Again, thank you very much for your time!
Edit:
I just realized you reversed my log2-transformation with exp(m), but it should be 2^m, shouldn't it?
That would change the p-value with c = 1e-5 to p = 0.04999265.
Interesting that it's still barely significant when unadjusted.
Edit2: Moved this from a reply to a comment.
Sorry, yes, it should be 2^m instead of exp(m). I had read your question too quickly. I've now updated my answer to use base 2.
No, using 2^m instead of exp(m) does not decrease the p-value.
The offset is chosen to mitigate outliers and to stabilize the variance, i.e., to get a flat line from plotSA(). An offset comparable to the smallest positive intensities usually works best. The analysis is not particularly sensitive to the choice of offset, but you will obviously see differences if you change the offset by orders of magnitude.
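To make that check concrete, something like this would do (a sketch; `raw` stands for the normalized, un-logged intensities and `design` for the design matrix):

```r
# fit with a candidate offset and inspect the mean-variance trend
offset <- 1e-5
fit <- lmFit(log2(raw + offset), design)
fit <- eBayes(fit, trend = TRUE)
plotSA(fit)  # look for a roughly flat (horizontal) trend line
```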
I disagree with your assertion that the p-value "depends strongly" on the offset value. The fact is that you changed the offset by four orders of magnitude from my value and still failed to get a p-value that would show up as significant in a limma analysis. Anyway, the offset is not chosen to get a particular p-value that you might want or not want.
limma makes no assumption about the marginal distribution of the log-intensities, so the histograms and qqplots are not relevant. The marginal distribution can in principle be any shape at all.
Dear Gordon,
Thank you very much for clarifying your answer.
I played around a bit more, trying to understand the influence of the offset value and found that indeed, when staying within the same order of magnitude, the p-value is quite stable.
I also found that changing the offset by 10% (from 0.001 to 0.0011 or 0.0009) changes the logFC by almost 10% (from -0.2515 to -0.2321 and -0.2743).
I think this will not change the result of my experiment, but I am worried about choosing the wrong offset in my next experiment.
Could you please briefly elaborate on what exactly you are looking at, or trying to optimize, when "experimenting" with different values?
I'm still not sure how to decide on 0.001 instead of 0.005, for example.
You mentioned the plotSA(fit). I plotted it for 3 different offsets:
Left to right: 0.001 (your chosen value), 0.000444 (10% quantile), 0.0001 (10% quantile rounded down)
Did you simply estimate visually which offset produces the flattest horizontal curve? I wonder whether this should be done analytically, or whether estimating which round number to choose is enough.
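For reference, I generated the three panels roughly like this (a sketch; `raw` is a hypothetical name for the normalized, un-logged intensity matrix and `design` is the design matrix from my code above):

```r
# compare the mean-variance trend for three candidate offsets side by side
par(mfrow = c(1, 3))
for (offset in c(0.001, 0.000444, 0.0001)) {
  fit <- eBayes(lmFit(log2(raw + offset), design), trend = TRUE)
  plotSA(fit, main = paste('offset =', offset))
}
```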
Edit:
In a previous version of your answer you wrote:
Metabolites were quantified via LCMS. Most likely, metabolites that were detected in some, but not all samples, are close to the detection limit and are therefore not missing randomly. Interestingly, 0.001 is approximately our detection limit (normalized to internal standard):
i.e. intensity of internal standard: ~ 1e7,
lowest detectable intensities: ~ 1e4
-> lowest detectable intensity normalized to internal standard: 1e4/1e7 = 0.001.
We also normalize to protein, where we roughly divide the value by 100 again. But I'm not sure if this should go into the offset.
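As a quick sanity check on that arithmetic (the extra factor of ~100 for the protein normalization is the rough value mentioned above):

```r
1e4 / 1e7        # detection limit / internal standard = 0.001
1e4 / 1e7 / 100  # additionally normalized to protein  = 1e-05
```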
Thank you again very much for your input and taking your time to answer my questions.
In your correction you write:
I think the 3rd line should be "fit <- eBayes(fit, trend=TRUE)".
You're right, an offset of 0.0001 is better, with a flatter curve. But I don't think it will matter; all of these offsets will probably give the same overall results.
Yes, thanks, the 3rd code line was another typo from me. Now fixed.