Why do I get a big log fold change but small mean change in b value when plotting differential methylation?
2.7 years ago
Christine • 0

Hello, I am doing differential methylation analysis using limma. I use the m values for testing and b values for plotting. I made a volcano plot of the p value against the effect size, but I noticed something that I can't explain. Using the p value and the log fold change as reported by topTable() in limma, I get this volcano plot, which looks much as expected:

Volcano plot by adjusted p value and log fold change

But then I thought it would make more biological sense to plot the significant probes by mean difference in b value. However, the significant probes seem to have really small changes in mean b value, while the nonsignificant ones have bigger changes in mean b value.

Volcano plot by adjusted p value and mean difference in b value (the axis label is wrong)

Could you help me decipher how to interpret this? I am afraid I am doing something wrong in my analysis, but I also suspect this has something to do with the conversion of b values to m values and maybe the variance. It also seems really strange that there is an inverse relationship between the p value and the mean difference in b value.

Sincerely, Christine

differential volcano methylation • 1.7k views
ADD COMMENT
2.7 years ago
Gordon Smyth ★ 7.0k

There's a reason why M-values are used for the statistical analysis instead of B-values! The latter are highly heteroscedastic and nonlinear and unsuitable for a linear modelling analysis.

I don't think it is appropriate to compute mean differences of B-values but, if you do, you will tend to highlight probes with low intensities because they tend to have extreme B values. By contrast, statistical significance tends to highlight probes with higher intensities because they give more reliable results. Hence the inverse relationship between B-value differences and significance.
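
A minimal simulation sketch of why low-intensity probes give noisier B-values. It assumes B = Meth / (Meth + Unmeth) with independent additive Gaussian noise of fixed size on each channel. This noise model, the function name `simulate_betas`, and all parameter values are illustrative assumptions, not taken from limma or from any array-processing package.

```python
import random
import statistics


def simulate_betas(total_intensity, true_beta=0.5, noise_sd=50.0, n=2000, seed=0):
    """Simulate B-values for one probe under an ASSUMED additive Gaussian noise model."""
    rng = random.Random(seed)
    betas = []
    for _ in range(n):
        # Noisy methylated and unmethylated signals, floored at 1 so both stay positive.
        meth = max(true_beta * total_intensity + rng.gauss(0.0, noise_sd), 1.0)
        unmeth = max((1.0 - true_beta) * total_intensity + rng.gauss(0.0, noise_sd), 1.0)
        betas.append(meth / (meth + unmeth))
    return betas


# Same true methylation level (0.5), very different total signal intensity.
low = simulate_betas(total_intensity=500.0)
high = simulate_betas(total_intensity=10000.0)

sd_low = statistics.stdev(low)
sd_high = statistics.stdev(high)
print(f"SD of B at low intensity:  {sd_low:.4f}")
print(f"SD of B at high intensity: {sd_high:.4f}")
```

Under these assumptions the B-value of the low-intensity probe varies far more than that of the high-intensity probe even though the true methylation level is identical, so mean B-value differences between small groups will tend to be largest for low-intensity probes.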

ADD COMMENT

Thank you for your answers; the results make much more sense in light of what you write. I was aware of the nature of B and M values, but not that it would be that important in the analysis (important lesson learned, thank you).

I am wondering how to get the most biologically meaningful probes, though. The statistically significant sites have a low SE, but also a small absolute change. If I use the fold change instead, I get probes that started with a low intensity and show a really high fold change even though they did not change that much in absolute value, so I would prefer not to use fold change.

I am assuming that maybe a change from e.g. 0.01 to 0.03 in methylation is not that biologically important as a change from e.g. 0.3 to 0.6. Maybe it could be meaningful to use a less stringent p-value adjustment and instead apply a cutoff on the mean change in M values, to get the probes that also have a big mean change in methylation, since the mean difference in M value may be a more valid measure?

ADD REPLY

"The statistically significant sites have a low SE, but also a small absolute change"

That isn't true. The first volcano plot in your question shows that the significant sites have large fold changes.

"I am assuming that maybe a change from e.g. 0.01 to 0.03 in methylation is not that biologically important as a change from e.g. 0.3 to 0.6"

I am not aware of any biological principles that would allow you to assert that.

I would repeat Devon Ryan's comments on this forum from 5 years ago (Interpretation of Beta values : Methylation data): "There are already a number of Bioconductor packages for handling methylation data. I would strongly encourage you to use one of them and not try to come up with your own methods."

Using the most statistically significant sites is generally the best way to identify biologically significant changes. That's the whole purpose of the statistical method. You do not explain which software package you are using to process the data, compute M-values, etc., but I would suggest that you follow the documentation for whichever software package you are using.

ADD REPLY
2.7 years ago
Lemire ▴ 940

These are probes with mean methylation values very close to 0, and very small SE. The difference between, say, the mean methylation values 0.01 and 0.03 is small but the fold-change of 3 is large. The difference would be significant if the SE is also very small.
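
The arithmetic can be checked with a short Python sketch, assuming the usual logit definition M = log2(B / (1 - B)) without any offset. The helper names `beta_to_m` and `compare` are hypothetical. The second pair of values (0.3 and 0.6) is the comparison raised earlier in the thread.

```python
import math


def beta_to_m(beta):
    """Base-2 logit transform from a B-value in (0, 1) to an M-value (assumed definition)."""
    return math.log2(beta / (1.0 - beta))


def compare(b1, b2):
    """Return (difference in B, ratio of B, difference in M) for two methylation levels."""
    return b2 - b1, b2 / b1, beta_to_m(b2) - beta_to_m(b1)


near_zero = compare(0.01, 0.03)
mid_range = compare(0.30, 0.60)

for label, (d_beta, ratio, d_m) in [("0.01 -> 0.03", near_zero),
                                    ("0.30 -> 0.60", mid_range)]:
    print(f"{label}: delta B = {d_beta:.2f}, ratio = {ratio:.1f}, delta M = {d_m:.2f}")
# 0.01 -> 0.03: delta B = 0.02, ratio = 3.0, delta M = 1.61
# 0.30 -> 0.60: delta B = 0.30, ratio = 2.0, delta M = 1.81
```

Under this definition the two changes are of similar size on the M scale (about 1.6 versus 1.8) even though the B-value differences differ 15-fold, which is why a ranking by B-value difference and a ranking by log fold change on M-values can disagree so strongly.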

ADD COMMENT
