methylation beta distribution (minfi generated)
8 weeks ago
anikb ▴ 40


I am analyzing EPIC methylation array data and have done the necessary filtering: cross-reactive probes, probes overlapping common SNPs, and probes on the X and Y chromosomes are excluded. About 10% of my samples cluster separately from the rest (I am calling them "outliers" for now). Since all samples were collected from human brain with similar pathological conditions, I did not expect such differences. I checked other quality measures and nothing seems off (good detection p-values, bisulfite conversion rate, etc.). I also checked whether any of my other phenotypic data correlate with these samples; they don't. I looked at the overall beta distribution of all samples (top: raw; bottom: quantile normalized), but the outliers more or less overlap with the other samples at the lower side, so I guess their distribution is not off from the rest.

[image: beta distribution of all samples, raw]
[image: beta distribution of all samples, quantile normalized]

I took a subset of the outlier samples, age- and sex-matched them with samples from the other cluster, and looked at their distributions; see below for raw (top) and quantile normalized (bottom). For the outlier samples, I see higher bumps for both the unmethylated and the methylated signals.

[image: beta distribution of matched subset, raw]
[image: beta distribution of matched subset, quantile normalized]

This is where my confusion is: what technical issues could lead to such patterns in the data? Can you suggest anything to check to find out what's going on here?

There are two possibilities: 1. it is all due to some technical error, or 2. there is some biological meaning. Testing #1 should be easier, but I am kind of stuck on what to look for and where. And I don't think it's worth diving into #2 until I have excluded all testable reasons for #1. Any help is appreciated!

array methylation • 243 views

Looking at the plot with all of the samples, the "outlier" samples do not seem too different from the overall distribution. When you say they cluster separately, do you mean by PCA? Looking at the normalized plot, I'm surprised the distributions are so variable even after quantile normalization. Did you do background correction (for example with Noob) prior to the normalization? I imagine you already took a look at the QC probes on the array? (You can easily generate a report with minfi::qcReport.) And did you check ethnicity too?
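To illustrate why I'd expect normalized distributions to look alike, here is a minimal sketch of plain quantile normalization in pure Python. This is not minfi's preprocessQuantile, which is a more involved procedure; the function name and the toy values are made up for illustration only.

```python
def quantile_normalize(samples):
    """Plain quantile normalization of equal-length lists, one list per sample.

    Each value is replaced by the across-sample mean of the values that share
    its rank. Ties are broken by position, which is a simplification.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of values")

    # Reference distribution: mean of the sorted values at each rank.
    sorted_samples = [sorted(s) for s in samples]
    reference = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]

    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices from lowest to highest value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy beta values for two hypothetical samples.
normalized = quantile_normalize([[0.1, 0.9, 0.5], [0.2, 0.6, 0.8]])
```

After this kind of normalization every sample has exactly the same set of values, only assigned to different probes, so per-sample density curves would coincide. That is the intuition behind my surprise; a stratified or intensity-level method will not necessarily behave this way.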

I would be careful about focusing too much on the last plots, because you have far fewer samples there and part of the effect could be coincidence. It is nonetheless interesting. Because a density plot is relative (each curve integrates to one), higher bumps at both low and high methylation values indicate that those samples have fewer intermediate methylation levels. And the outlier samples may have higher methylation levels overall.
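One way to put a number on "fewer intermediate values" instead of eyeballing the density curves is to compute, per sample, the share of probes with a beta in some middle range. A minimal sketch in plain Python follows; the 0.2 and 0.8 cutoffs and the toy values are my own assumptions, not anything from your data.

```python
def intermediate_fraction(betas, low=0.2, high=0.8):
    """Return the share of beta values strictly between low and high."""
    if not betas:
        raise ValueError("betas must not be empty")
    inside = sum(1 for b in betas if low < b < high)
    return inside / len(betas)


# Toy beta values for two hypothetical samples (illustration only).
typical = [0.05, 0.10, 0.45, 0.55, 0.60, 0.90, 0.95, 0.50]
outlier = [0.03, 0.05, 0.08, 0.50, 0.92, 0.95, 0.97, 0.96]

typical_share = intermediate_fraction(typical)
outlier_share = intermediate_fraction(outlier)
```

Comparing this share between the two clusters across all samples would show whether the taller bumps in the subset plot hold up beyond the handful of matched samples.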

A biological guess: because you have brain tissue, could this be related to the purity or cell composition of the tissue? In my experience, brain tissue dissections can be very variable in region. An even wilder biological guess: because the methylation percentage at a site is really a summary over a population of cells, the purer the cell types, the more the values tend toward the extremes, and maybe you're seeing a gradient.
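The purity guess can be made concrete with a toy simulation. It assumes just two cell types, each fully methylated or fully unmethylated at every site, with the measured beta being the proportion-weighted average. All of that is a deliberate oversimplification, and the function names are hypothetical.

```python
import random


def mixture_betas(purity, n_sites=10000, seed=0):
    """Simulate bulk beta values for a mix of two cell types.

    purity is the proportion of cell type A; the rest is cell type B.
    Each cell type is assumed to be either 0 or 1 at every site.
    """
    rng = random.Random(seed)
    betas = []
    for _ in range(n_sites):
        a = rng.choice((0.0, 1.0))  # methylation state in cell type A
        b = rng.choice((0.0, 1.0))  # methylation state in cell type B
        betas.append(purity * a + (1.0 - purity) * b)
    return betas


def share_intermediate(betas, low=0.2, high=0.8):
    """Share of beta values strictly between low and high."""
    return sum(1 for x in betas if low < x < high) / len(betas)


pure_share = share_intermediate(mixture_betas(0.95))   # nearly pure sample
mixed_share = share_intermediate(mixture_betas(0.60))  # heavily mixed sample
```

In this toy setup the nearly pure sample has no betas in the middle range, while the mixed sample has roughly half of its sites there (every site where the two cell types disagree). Real tissue has many cell types and partially methylated sites, so treat this only as an intuition pump.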


Thank you very much! After rechecking everything, we found that these samples came from a different brain tissue than the rest.

