Probabilistic modeling of CpG β-values
17 months ago
lotus28 ▴ 70

Has anyone tried reproducing these distributions in pymc3?

"A statistical model for the analysis of beta values in DNA methylation studies"

I have just tried, and although I can reproduce the PDF with plain numpy, I have failed to implement it as a pymc3 distribution for Bayesian inference.


To give you more context, I have recently read this preprint: Profiling epigenetic age in single cells

The authors use a probabilistic approach to predict the age of single cells based on their DNAm profiles. Aging clocks are nothing new for bulk samples, but this is the first time I have seen something like this done on single cells.

So I had a natural question: can CpG β-values be modeled with a normal distribution? The authors do not say much about the assumptions they make for this to work, but imo it is implied that DNAm at a site is treated as a Bernoulli trial, and that these trials are then aggregated into an approximately normally distributed β-value for the CpG site.
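To make that implied assumption concrete, here is a minimal sketch using only the Python standard library. It is my own reading of the setup, not the preprint's method: I assume each site is covered by `n_reads` independent reads, each methylated with probability `p`, and that the β-value is the methylated fraction. The names `simulate_beta_values`, `p` and `n_reads` are hypothetical. Under these assumptions the β-value has mean `p` and variance `p * (1 - p) / n_reads`, which is what a normal approximation would use.

```python
import random


def simulate_beta_values(p, n_reads, n_sites, seed=0):
    """Simulate beta-values as the fraction of methylated reads per site.

    Assumption (not from the preprint): every read is an independent
    Bernoulli(p) trial and all sites share the same p and read depth.
    """
    rng = random.Random(seed)
    betas = []
    for _ in range(n_sites):
        methylated = sum(1 for _ in range(n_reads) if rng.random() < p)
        betas.append(methylated / n_reads)
    return betas


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


p, n_reads = 0.3, 50
betas = simulate_beta_values(p, n_reads, n_sites=20000)
emp_mean, emp_var = mean_and_variance(betas)

# Moments that a normal approximation to the aggregated trials would use.
normal_mean = p
normal_var = p * (1 - p) / n_reads

print("empirical mean/variance:", emp_mean, emp_var)
print("normal approx mean/variance:", normal_mean, normal_var)

# With a single read per site the beta-value can only be 0 or 1,
# so a normal approximation is not meaningful at that depth.
single_read = set(simulate_beta_values(p, n_reads=1, n_sites=1000))
print("distinct beta-values at depth 1:", sorted(single_read))
```

The last lines show where the approximation would be weakest: at a depth of one read per site the simulated β-values are binary, so any normality has to come from aggregation over many reads or many sites.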

Then it turned out that there is a specific PDF associated with β-values. Naturally, I was curious to implement it, and eventually failed.

I am currently using truncated normals, as implemented in pymc3, to work with β-values. I wonder whether switching to the distribution from the first article would give significantly different results.
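For reference, this is the density I mean by a truncated normal on [0, 1], written as a plain-Python sketch rather than the pymc3 implementation. The function names and the example values of `mu` and `sigma` are hypothetical. It is the usual normal log-density renormalized by the probability mass that falls inside the bounds, and it could serve as a baseline to compare against the log-density from the first article.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def truncated_normal_logpdf(x, mu, sigma, lower=0.0, upper=1.0):
    """Log-density of a normal(mu, sigma) truncated to [lower, upper]."""
    if not (lower <= x <= upper):
        return -math.inf  # zero density outside the bounds
    z = (x - mu) / sigma
    log_phi = -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
    # Probability mass of the untruncated normal inside the bounds.
    mass = normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)
    return log_phi - math.log(mass)


# Example values (assumed): a highly methylated site with a wide spread,
# so a noticeable part of the untruncated normal lies above 1.
mu, sigma = 0.9, 0.2

# Midpoint-rule check that the truncated density integrates to about 1.
n_steps = 10000
step = 1.0 / n_steps
total = sum(
    math.exp(truncated_normal_logpdf((i + 0.5) * step, mu, sigma)) * step
    for i in range(n_steps)
)
print("integral over [0, 1]:", total)
print("log-density at beta = 0.95:", truncated_normal_logpdf(0.95, mu, sigma))
```

The same integration check is a cheap way to validate any candidate β-value PDF before trying to wrap it as a custom distribution.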

statistics epigenetics methylation • 288 views
