I have about 75 methylation profiles from diseased subjects who differ in several variables, e.g. a continuous variable that indicates disease severity. There are no classical discrete groups. The data come from Illumina 450k arrays. I have done QC, normalization etc. with minfi and ended up with a matrix of beta values.
I have looked into different ways to assess differential methylation in relation to these variables, but I am unsure what the most appropriate way to tackle this problem would be.
I am concerned about the distribution of the beta values, which, from what I remember, makes this problem unsuitable for ordinary linear methods, so I cannot use limma or simply fit a large number of linear models. Or am I mistaken?
beta ~ variable1 + variable2 + (1|subject) (450k times, possibly inappropriate?)
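For reference, one transformation I have seen used for bounded beta values is the logit (M-value) transform, M = log2(beta / (1 - beta)), which maps values between 0 and 1 onto an unbounded scale. Below is a minimal sketch in Python. The function name `beta_to_m` and the clamping constant `eps` are my own illustrative choices and are not taken from minfi or limma. Whether the transformed values are then suitable for linear models is exactly what I am unsure about.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Logit-transform a beta value (between 0 and 1) to an M-value on the log2 scale."""
    # Clamp away from exactly 0 and 1 so the logarithm stays finite.
    # eps is an arbitrary illustrative choice, not a recommended value.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

# Hypothetical beta values for a single CpG site across five samples.
betas = [0.02, 0.25, 0.5, 0.75, 0.98]
m_values = [beta_to_m(b) for b in betas]
```

A beta of 0.5 maps to an M-value of 0, and values near 0 or 1 are stretched out towards large negative or positive M-values.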
An alternative would be to dichotomize the data, e.g. by calling every beta below 0.5 'unmethylated' and everything above 'methylated'. I could then use (a lot of) logistic regressions to check which variables are related to methylation. However, this approach would lose quite a lot of detail.
Methylated(0,1) ~ variable1 + variable2 + (1|subject) (450k times, takes a long time)
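To make the loss of detail concrete, here is a minimal sketch of the thresholding in Python. The function name `dichotomize`, the example beta values and the handling of a beta of exactly 0.5 (called unmethylated here) are assumptions for illustration only.

```python
def dichotomize(betas, cutoff=0.5):
    """Call each beta value methylated (1) if above the cutoff, else unmethylated (0)."""
    # Assumption: a beta exactly equal to the cutoff is called unmethylated.
    return [1 if b > cutoff else 0 for b in betas]

# Hypothetical beta values for a single CpG site across six samples.
betas = [0.51, 0.55, 0.70, 0.95, 0.49, 0.10]
calls = dichotomize(betas)
```

After thresholding, 0.51 and 0.95 fall into the same class, while 0.49 and 0.51 fall into different classes even though they are almost identical.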
What do you think?