Skewed beta-distribution from Methylation EPIC array data
4.2 years ago

Dear all, I'm a rookie in the field of bioinformatics. We recently started to work with Methylation EPIC array data from Illumina.

For our last 4 chips we got odd (skewed) beta-value density distributions, even after within-array normalization using the ChAMP package. Has anyone seen something similar, and do you have any advice?

Your help is very much appreciated!

Kind regards, Erwin

[Image: beta-value density plots for the samples, before and after normalization]

methylation EPIC Illumina ChAMP

I'd expect a small peak in the middle for human samples; those are imprinted genes. Or is the question about something else?

I imagine the question is "why is the normalization making the signals vastly less comparable?", which I have no good answer to. Honestly, the before-normalization curves look more reasonable than what the normalization produced.

Just a guess: maybe these samples were fundamentally different? A different cell type, or cancer? The author didn't mention any such differences...

That could be, though the post-normalization beta distributions look nothing like either normal or diseased mammalian samples.

Then I'd advise the author to normalize with an alternative method (e.g. RnBeads, where it is hard to make a mistake) and compare the results. I agree, these plots do not look right.
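
One rough way to put a number on "compare the results" is to summarize where the beta values sit before and after each normalization. The sketch below is plain Python on lists of beta values, not the ChAMP or RnBeads API; the function names, the 0.2 / 0.8 cut-offs, and the toy values are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def beta_shape_summary(
    betas: Sequence[float], low: float = 0.2, high: float = 0.8
) -> Dict[str, float]:
    """Fraction of probes in the low, middle, and high beta ranges.

    The 0.2 / 0.8 cut-offs are illustrative assumptions, not a standard.
    Values outside [0, 1] (including NaN) are ignored.
    """
    values = [b for b in betas if b is not None and 0.0 <= b <= 1.0]
    if not values:
        raise ValueError("no valid beta values")
    n = len(values)
    n_low = sum(1 for b in values if b < low)
    n_high = sum(1 for b in values if b > high)
    return {
        "low": n_low / n,
        "mid": (n - n_low - n_high) / n,
        "high": n_high / n,
    }


def max_shift(before: Dict[str, float], after: Dict[str, float]) -> float:
    """Largest absolute change in any of the three fractions."""
    return max(abs(before[k] - after[k]) for k in before)


# Toy data (made up): the raw betas are bimodal, while the "normalized"
# betas have been pushed toward the middle.
raw = [0.05, 0.1, 0.08, 0.5, 0.9, 0.95, 0.92, 0.88]
normalized = [0.3, 0.35, 0.4, 0.5, 0.6, 0.65, 0.7, 0.55]

raw_summary = beta_shape_summary(raw)
norm_summary = beta_shape_summary(normalized)
print(raw_summary)
print(norm_summary)
print(max_shift(raw_summary, norm_summary))
```

A large shift between the raw and normalized summaries does not prove the normalization is wrong, but it is a hint to look at the density plots for that method more closely.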

4.2 years ago

I think the comments pretty much answer the question, but I would recommend trying a different normalization. Sometimes a popular normalization method causes problems that are obvious on visual inspection of the density distributions (like the ones you have shown). So you should try to figure out what works best for your specific dataset.

I often use the most basic normalization, from either GenomeStudio (where you can potentially filter out more probes using the detection p-values) or minfi (which includes multiple preprocessing methods, including preprocessIllumina()). That said, I think the probe filtering in minfi may be slightly different from, or worse than, what you get from exported and re-formatted GenomeStudio beta values for certain datasets.
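
To make the detection p-value filtering concrete, here is a minimal sketch of the idea on plain Python dictionaries. It is not how GenomeStudio or minfi implement it; the function name, the probe IDs, the 0.01 threshold, and the "must pass in every sample" rule are assumptions chosen for illustration.

```python
from typing import Dict, List


def filter_probes_by_detection_p(
    betas: Dict[str, List[float]],
    det_p: Dict[str, List[float]],
    threshold: float = 0.01,
) -> Dict[str, List[float]]:
    """Keep probes whose detection p-value is below `threshold` in every sample.

    `betas` and `det_p` map a probe ID to one value per sample.
    The threshold and the all-samples rule are illustrative assumptions.
    """
    kept: Dict[str, List[float]] = {}
    for probe, values in betas.items():
        pvals = det_p.get(probe)
        if pvals is None or len(pvals) != len(values):
            continue  # drop probes without matching p-values
        if all(p < threshold for p in pvals):
            kept[probe] = values
    return kept


# Toy data with hypothetical probe IDs and two samples per probe.
toy_betas = {
    "probe_a": [0.91, 0.88],
    "probe_b": [0.47, 0.52],
    "probe_c": [0.05, 0.07],
}
toy_det_p = {
    "probe_a": [0.0001, 0.002],
    "probe_b": [0.0003, 0.2],   # fails in the second sample
    "probe_c": [0.001, 0.004],
}

filtered = filter_probes_by_detection_p(toy_betas, toy_det_p)
print(sorted(filtered))
```

Plotting the beta densities again after a filter like this is a quick way to see whether poorly detected probes were contributing to the odd shape.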
