Hello, I am doing differential methylation analysis using limma. I use the M values for testing and the b values for plotting. I made a volcano plot to visualize the p value and effect size, but I noticed something that I can't explain. Using the p value and the log fold change as reported by topTable() in limma, I get this volcano plot, which looks much as expected:
[Figure: Volcano plot by adjusted p value and log fold change]
But then I thought it would make more biological sense to plot the probes by mean difference in b value. In that plot the significant probes seem to have really small changes in mean b value, while the nonsignificant ones have bigger changes in mean b value.
[Figure: Volcano plot by adjusted p value and mean difference in b value (the axis label is wrong)]
Could you help me decipher how to interpret this? I am afraid I am doing something wrong in my analysis, but I also suspect this has something to do with the conversion of b values to M values, and maybe the variance. It also seems really strange that there is an inverse relationship between the p value and the mean difference in b value.
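As a rough illustration of how the two scales relate, here is a minimal sketch in Python. It assumes the standard logit definition M = log2(b / (1 - b)); the function names are hypothetical and are not part of limma. It shows that the same shift in M value corresponds to very different shifts in b value depending on the baseline:

```python
import math

def beta_to_m(beta):
    """Logit transform (assumed definition): b value in (0, 1) -> M value."""
    return math.log2(beta / (1.0 - beta))

def m_to_beta(m):
    """Inverse transform: M value -> b value."""
    return 2.0 ** m / (2.0 ** m + 1.0)

def beta_shift_for_m_shift(base_beta, m_shift=1.0):
    """Change in b value produced by adding m_shift to the M value at base_beta."""
    return m_to_beta(beta_to_m(base_beta) + m_shift) - base_beta

# Apply the same +1 shift on the M scale at several baseline b values.
for base_beta in (0.01, 0.10, 0.50, 0.90, 0.99):
    delta = beta_shift_for_m_shift(base_beta)
    print(f"baseline b = {base_beta:.2f}: +1 in M -> {delta:+.4f} in b")
```

Under this definition, a one-unit shift in M moves b by about 0.17 at a baseline of 0.5 but by only about 0.01 at a baseline of 0.01, so probes near 0 or 1 can show a large change in M alongside a small change in b.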
Thank you for your answers; the results make much more sense in light of what you write. I was aware of the nature of B and M values, but not that it would be that important in the analysis (important lesson learned, thank you).
I am wondering how to get the most biologically meaningful probes, though. The statistically significant sites have a low SE, but also a small absolute change. If I rank by fold change instead, I get probes that started at a low intensity and so show a really high fold change even though they did not change much in absolute value, so I would prefer not to use fold change.
I am assuming that a change from e.g. 0.01 to 0.03 in methylation is not as biologically important as a change from e.g. 0.3 to 0.6. Maybe it could be meaningful to use a less stringent p-value adjustment, and add a cutoff on mean change in M value to get the probes that also have a big mean change in methylation, since mean difference in M value may be a more valid measure?
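To put those two example changes on the M scale, here is a small illustrative sketch in Python. It assumes the usual definition M = log2(b / (1 - b)) and uses single b values rather than group means; the helper names are hypothetical:

```python
import math

def beta_to_m(beta):
    """Assumed definition of the M value: log2 odds of the b value."""
    return math.log2(beta / (1.0 - beta))

def m_difference(beta_before, beta_after):
    """Difference on the M scale between two b values."""
    return beta_to_m(beta_after) - beta_to_m(beta_before)

# The two example changes in methylation mentioned above.
examples = [(0.01, 0.03), (0.30, 0.60)]
for before, after in examples:
    delta_b = after - before
    delta_m = m_difference(before, after)
    print(f"b: {before:.2f} -> {after:.2f}  delta b = {delta_b:.2f}  delta M = {delta_m:.2f}")
```

Under that assumption the two changes are of similar size on the M scale (roughly 1.6 and 1.8) even though they differ 15-fold on the b scale (0.02 versus 0.30), so a cutoff on M difference alone would not separate them.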
It isn't true that the statistically significant sites have a small absolute change. The first volcano plot in your question shows that the significant sites have large fold changes.
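I am not aware of any biological principles that would allow you to assert that a change from 0.01 to 0.03 in methylation is less biologically important than a change from 0.3 to 0.6.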
I would repeat Devon Ryan's comments on this forum from 5 years ago (Interpretation of Beta values : Methylation data): "There are already a number of Bioconductor packages for handling methylation data. I would strongly encourage you to use one of them and not try to come up with your own methods."
Using the most statistically significant sites is generally the best way to identify biologically significant changes. That's the whole purpose of the statistical method. You do not explain which software package you are using to process the data, compute M-values, etc., but I would suggest that you follow the documentation for whichever package that is.