Dear Community,
I am new to limma and, since there are limited resources online on how to apply it to metabolomics data, I'm not sure whether I implemented it correctly, especially because in this specific experiment some limma p-values seem quite unrealistic.
As you can see, grp23 has an outlier for this metabolite.
Consequently, a regular Welch's t-test gives p = 0.489.
limma, on the other hand, reports the metabolite as significant: p = 0.0287.
Could you please advise whether you think the results are trustworthy and where I may have gone wrong?
Data preparation for matrix m (following "Could I apply limma to metabolon concentrations?"):
1) Normalized raw peak intensities (LC-MS) to internal standard and protein
2) Removed metabolites with > 10% missingness across test samples
3) log2-transformed the normalized intensities
4) Replaced the resulting -Inf values with 0
5) Blinded the experiment, because the data are unpublished
data: https://pastebin.com/L1mF6hAu
metadata: https://pastebin.com/JavEQ1Mx
My code:

library(limma)
library(dplyr)

m <- as.matrix(read.csv('m.csv', row.names = 1))
metadata <- read.csv('metadata.csv')
groups <- factor(metadata$group)
comparison <- c('grp23-grp18')  # character vector, as makeContrasts expects

design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)
cont_m <- makeContrasts(contrasts = comparison, levels = design)

fit <- lmFit(m, design)
fit <- contrasts.fit(fit, cont_m)
fit <- eBayes(fit, trend = TRUE)
res <- topTable(fit, coef = 1, n = Inf)

# limma result for the metabolite in question
res %>%
  filter(row.names(res) == 'metabolite_75')

# Welch's t-test on the same metabolite
x <- c(-17.72762381, -9.338114467, -8.636741925)  # grp23
y <- c(-10.13976212, -8.9665991, -9.218714982)    # grp18
t.test(x, y)
Dear Gordon,
Thank you very much for taking the time to write your answer.
Your explanation for the data "reparation" makes perfect sense.
I am of course aware that I need to interpret the adjusted p-values; I used the unadjusted p-values only to get a feeling for whether the limma analysis is plausible.
Also, thank you for suggesting the ANOVA approach. I suppose I blinded the experiment a bit too much, since some of the groups are time-course data; I'm not interested in comparing all of the groups.
Thank you for the suggestion to use a started log-transformation. I will do this from now on.
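If I understand correctly, the started log would replace steps 3) and 4) of my preparation with something like this (a sketch of my understanding, not code from the answer; `raw` is a hypothetical name for the normalized intensity matrix before any log-transformation):

```r
# started log2: add a small positive constant before logging, so zero
# intensities no longer produce -Inf and no replacement step is needed
c <- 1e-5
m <- log2(raw + c)
```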
However, I'm wondering how you decided on 1e-5 as the constant (c).
I did some testing, and the p-value of e.g. metabolite_75 strongly depends on this constant: it decreases with decreasing c and converges to the p-value I got from my original "broken" data set, where the outliers dominated.
c = 1e-4: 0.8110358
c = 1e-5: 0.3721194
c = 1e-6: 0.1362911
c = 1e-7: 0.05532647
c = 1e-8: 0.03213364
c = 1e-9: 0.02850181
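Something along these lines reproduces the sweep above (a sketch, not my exact code; `raw` is a hypothetical name for the normalized, un-logged intensity matrix, and `design` and `cont_m` are the objects from my code above):

```r
# re-fit the limma model for each offset and extract the p-value
# of metabolite_75 from the grp23-grp18 contrast
for (c in 10^(-(4:9))) {
  fit <- lmFit(log2(raw + c), design)
  fit <- contrasts.fit(fit, cont_m)
  fit <- eBayes(fit, trend = TRUE)
  p <- topTable(fit, coef = 1, n = Inf)['metabolite_75', 'P.Value']
  cat('c =', c, ': p =', p, '\n')
}
```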
Here are the distributions of the transformed data for these values of c:
Visually, the histogram with c = 1e-5 still seems slightly skewed.
Since the p-value strongly depends on this value, how did you decide on 1e-5?
Again, thank you very much for your time!
Edit:
I just realized you reversed my log2-transformation with exp(m), but it should be 2^m, shouldn't it?
That would change the p-value with c = 1e-5 to p = 0.04999265.
Interesting that it's still barely significant when unadjusted.
Edit2: Moved this from a reply to a comment.
Sorry, yes, it should be 2^m instead of exp(m). I had read your question too quickly. I've now updated my answer to use base 2.
No, using 2^m instead of exp(m) does not decrease the p-value.
The offset is chosen to mitigate outliers and to stabilize the variance, i.e., to get a flat line from plotSA(). An offset comparable to the smallest positive intensities usually works best. The analysis is not particularly sensitive to the choice of offset, but you will obviously see differences if you change the offset by orders of magnitude.
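To make that check concrete, something like this would do (a sketch; `raw` stands for the normalized, un-logged intensities and `design` for the design matrix):

```r
# fit with a candidate offset and inspect the mean-variance trend
offset <- 1e-5
fit <- lmFit(log2(raw + offset), design)
fit <- eBayes(fit, trend = TRUE)
plotSA(fit)  # look for a roughly flat (horizontal) trend line
```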
I disagree with your assertion that the p-value "depends strongly" on the offset value. The fact is that you changed the offset by four orders of magnitude from my value and still failed to get a p-value that would show up as significant in a limma analysis. Anyway, the offset is not chosen to get a particular p-value that you might want or not want.
limma makes no assumption about the marginal distribution of the log-intensities, so the histograms and qqplots are not relevant. The marginal distribution can in principle be any shape at all.
Dear Gordon,
Thank you very much for clarifying your answer.
I played around a bit more, trying to understand the influence of the offset value and found that indeed, when staying within the same order of magnitude, the p-value is quite stable.
I also found that changing the offset by 10% (from 0.001 to 0.0011 or 0.0009) changes the logFC by almost 10% (from -0.2515 to -0.2321 and -0.2743).
I think this will not change the result of my experiment, but I am worried about choosing the wrong offset in my next experiment.
Could you please briefly elaborate on what exactly you are looking at, or trying to optimize, when "experimenting" with different values?
I'm still not sure how to decide on 0.001 instead of 0.005, for example.
You mentioned the plotSA(fit). I plotted it for 3 different offsets:
Left to right: 0.001 (your chosen value), 0.000444 (10% quantile), 0.0001 (10% quantile rounded down)
Did you simply estimate visually which offset produces the flattest horizontal curve? I wonder whether this should be done analytically, or whether estimating which round number to choose is enough.
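For reference, I generated the three panels roughly like this (a sketch; `raw` is a hypothetical name for the normalized, un-logged intensity matrix and `design` is the design matrix from my code above):

```r
# compare the mean-variance trend for three candidate offsets side by side
par(mfrow = c(1, 3))
for (offset in c(0.001, 0.000444, 0.0001)) {
  fit <- eBayes(lmFit(log2(raw + offset), design), trend = TRUE)
  plotSA(fit, main = paste('offset =', offset))
}
```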
Edit:
In a previous version of your answer you wrote:
Metabolites were quantified via LCMS. Most likely, metabolites that were detected in some, but not all samples, are close to the detection limit and are therefore not missing randomly. Interestingly, 0.001 is approximately our detection limit (normalized to internal standard):
i.e. intensity of internal standard: ~ 1e7,
lowest detectable intensities: ~ 1e4
-> lowest detectable intensity normalized to internal standard: 1e4/1e7 = 0.001.
We also normalize to protein, where we roughly divide the value by 100 again. But I'm not sure if this should go into the offset.
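As a quick sanity check on that arithmetic (the extra factor of ~100 for the protein normalization is the rough value mentioned above):

```r
1e4 / 1e7        # detection limit / internal standard = 0.001
1e4 / 1e7 / 100  # additionally normalized to protein  = 1e-05
```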
Thank you again very much for your input and taking your time to answer my questions.
In your correction you write:
I think the 3rd line should be "fit <- eBayes(fit, trend=TRUE)".
You're right, an offset of 0.0001 is better, with a flatter curve. But I don't think it will matter; all of these offsets will probably give the same overall results.
Yes, thanks, the 3rd code line was another typo from me. Now fixed.