Skewed beta-distribution from Methylation EPIC array data
2.4 years ago

Dear all, I'm a rookie in the field of bioinformatics. We recently started to work with Methylation EPIC array data from Illumina.

For our last 4 chips we got odd (skewed) beta-value density distributions, even after within-array normalization with the ChAMP package. Has anyone seen something similar, and do you have any advice?

Your help is very much appreciated!

Kind regards, Erwin

methylation EPIC Illumina ChAMP

I'd expect a small peak in the middle for human samples - these are imprinted genes. Or is the question about something else?


I imagine the question is "why is the normalization making the signals vastly less comparable?", which I have no good answer to. Honestly, the before-normalization curves look more reasonable than what the normalization produced.


Just a guess: maybe these samples were fundamentally different? A different cell type, or cancer? The author didn't mention whether there are such differences...


That could be, though the post-normalization beta distributions look nothing like either normal or diseased mammalian samples.


Then I'd advise the author to normalize with an alternative method (e.g. RnBeads, it is impossible to make a mistake there) and compare the results. I agree, these plots do not look right.

2.4 years ago

I think the comments pretty much answer the question, but I would recommend using a different normalization. Sometimes a popular normalization method can cause problems that are obvious on visual inspection of the density distributions (like the ones you have shown). So, you should try to figure out what works best for your specific dataset.
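To make the visual check a bit more concrete, you can also summarize each sample's beta values numerically before and after normalization. The following is only a minimal Python sketch, not part of ChAMP or minfi: the function names and the 0.2 / 0.8 cut-offs are assumptions chosen for illustration. Array data typically has most probes near 0 or 1, so a sample whose normalized values pile up in the middle range should stand out.

```python
from typing import Dict, List


def beta_summary(betas: List[float], low: float = 0.2, high: float = 0.8) -> Dict[str, float]:
    """Return the fraction of probes in the low, middle and high beta ranges.

    The cut-offs are illustrative assumptions, not a published standard.
    """
    # Keep only values inside [0, 1]; None and NaN are dropped.
    values = [b for b in betas if b is not None and 0.0 <= b <= 1.0]
    if not values:
        raise ValueError("no valid beta values")
    n = len(values)
    n_low = sum(1 for b in values if b < low)
    n_high = sum(1 for b in values if b > high)
    return {
        "low": n_low / n,
        "mid": (n - n_low - n_high) / n,
        "high": n_high / n,
    }


def beta_histogram(betas: List[float], bins: int = 20) -> List[int]:
    """Count beta values in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for b in betas:
        if b is not None and 0.0 <= b <= 1.0:
            # A value of exactly 1.0 goes into the last bin.
            idx = min(int(b * bins), bins - 1)
            counts[idx] += 1
    return counts
```

Running both functions on the same sample before and after normalization gives a quick side-by-side comparison: a large jump in the "mid" fraction after normalization would match the kind of distortion shown in the question.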

I often use the most basic normalization, from either GenomeStudio (where you can potentially filter more probes using the detection p-values) or minfi (which includes multiple pre-processing methods, including preprocessIllumina(), although I think its probe filtering may be slightly different from, or worse than, exported and re-formatted GenomeStudio beta values for certain datasets).
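For reference, the calculation behind such basic beta values, combined with a detection p-value filter, can be sketched in a few lines. This is an illustrative Python sketch, not GenomeStudio or minfi code. It assumes the commonly cited Illumina-style definition beta = M / (M + U + 100), where M and U are the methylated and unmethylated intensities, and a detection p-value cut-off of 0.01, which is a common choice but an assumption here. The probe IDs and function names are hypothetical.

```python
from typing import Dict, Tuple


def beta_value(meth: float, unmeth: float, offset: float = 100.0) -> float:
    """Illumina-style beta value: M / (M + U + offset).

    Negative intensities are clipped to zero; the offset of 100 is the
    commonly cited default and is treated as an assumption here.
    """
    m = max(meth, 0.0)
    u = max(unmeth, 0.0)
    return m / (m + u + offset)


def filter_and_convert(
    probes: Dict[str, Tuple[float, float, float]],
    max_detection_p: float = 0.01,
) -> Dict[str, float]:
    """Drop probes whose detection p-value exceeds the cut-off, then compute betas.

    `probes` maps a probe ID to (methylated, unmethylated, detection_p).
    """
    kept = {}
    for probe_id, (meth, unmeth, det_p) in probes.items():
        if det_p <= max_detection_p:
            kept[probe_id] = beta_value(meth, unmeth)
    return kept


# Hypothetical probes: the second one fails the detection p-value filter.
example = {
    "probe_a": (900.0, 0.0, 0.001),
    "probe_b": (50.0, 50.0, 0.2),
}
print(filter_and_convert(example))
```

Tightening or loosening `max_detection_p` changes how many probes survive, which is one reason beta values exported from different tools may not match exactly for the same dataset.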