Question: How To Compare Gene Expression And Methylation Level Of A Gene
6.8 years ago by
Chip110 wrote:

I am analyzing data (from the TCGA project) of patients affected by Glioblastoma Multiforme and, specifically, I want to compare Gene Expression values with Methylation levels.

Methylation levels were obtained with the Illumina Infinium HumanMethylation27 BeadChip, which reports methylation levels for ~27k CpG sites. I downloaded the product support file* for this array.

Here is the issue: for many genes there are several probes (hence, several CpG sites) that may regulate the same gene. I was wondering what the best way would be to treat them as a single entity, so as to obtain one methylation level per gene.

I was thinking of taking the average of all the probes associated with a given gene, but this assumes that all CpGs are equally important as regulators of gene expression, and I don't know if I can justify that.



gene expression methylation dna • 5.2k views
modified 6.8 years ago by B. Arman Aksoy1.2k • written 6.8 years ago by Chip110

This is an excellent question. How to summarise methylation probes to gene level is an issue that is routinely ignored or glossed over in publications on this topic. I call it the 'genes x samples' problem, because statistics papers always talk about "matrices of genes x samples" with no indication of how they were derived.

modified 6.8 years ago • written 6.8 years ago by Neilfws49k

Thanks. Although it is not a definitive answer, it provides very useful insight. I will proceed by taking one probe per gene.

written 6.8 years ago by Chip110

Hey there! Did you find a method to do this job? I recently ran into the same problem. Thanks a lot! Wayne

written 2.4 years ago by Wayne Lee10

As suggested in Neilfws's comment, I decided to choose the probe with the highest variance.
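
For anyone wanting to try the same thing, here is a minimal sketch of picking the most variable probe per gene. It uses only the Python standard library; the probe IDs, gene names, beta values, and the function name `most_variable_probe_per_gene` are all made up for illustration, and in practice the probe-to-gene mapping would come from the array annotation file.

```python
from statistics import variance

# Hypothetical toy data: beta values per probe across four samples.
beta_values = {
    "cg_probe_1": [0.10, 0.12, 0.11, 0.13],
    "cg_probe_2": [0.20, 0.80, 0.35, 0.65],
    "cg_probe_3": [0.50, 0.55, 0.45, 0.60],
}

# Hypothetical probe-to-gene annotation (assumed to come from the array manifest).
probe_to_gene = {
    "cg_probe_1": "GENE_A",
    "cg_probe_2": "GENE_A",
    "cg_probe_3": "GENE_B",
}


def most_variable_probe_per_gene(beta_values, probe_to_gene):
    """Return {gene: probe}, keeping the probe with the highest sample variance."""
    best = {}  # gene -> (variance, probe)
    for probe, values in beta_values.items():
        gene = probe_to_gene[probe]
        var = variance(values)  # sample variance across patients
        if gene not in best or var > best[gene][0]:
            best[gene] = (var, probe)
    return {gene: probe for gene, (_, probe) in best.items()}


selected = most_variable_probe_per_gene(beta_values, probe_to_gene)
print(selected)
```

Note that this choice ignores gene expression entirely: it only favours probes whose methylation differs most between patients.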

written 2.4 years ago by Chip110
6.8 years ago by
B. Arman Aksoy1.2k
New York, NY
B. Arman Aksoy1.2k wrote:

Selection of the probes is a hard problem. The way TCGA assigns a methylation score to each gene is to correlate the values of all probes in the proximity of the gene with its expression, and pick the probe with the strongest negative correlation. This of course presupposes a phenotype of interest, i.e. it works only if you want to find methylation probes that help explain gene expression levels across many patients. If you have another phenotype in mind, I think you can apply the same approach, but correlate the probe values with a different measure instead of gene expression.
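
The idea can be sketched roughly as follows. This is only an illustration of "pick the most anti-correlated probe" for a single gene, not TCGA's actual pipeline: the use of Pearson correlation, the function names, and the toy values are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def most_anticorrelated_probe(probe_values, expression):
    """Return (probe, r) for the probe whose correlation with expression is lowest."""
    scores = {probe: pearson(values, expression)
              for probe, values in probe_values.items()}
    best = min(scores, key=scores.get)  # most negative correlation
    return best, scores[best]


# Hypothetical data: two probes near one gene, measured in the same four patients.
gene_a_probes = {
    "cg_probe_1": [0.10, 0.12, 0.11, 0.13],
    "cg_probe_2": [0.80, 0.60, 0.40, 0.20],
}
gene_a_expression = [1.0, 2.0, 3.0, 4.0]

best_probe, r = most_anticorrelated_probe(gene_a_probes, gene_a_expression)
print(best_probe, round(r, 3))
```

For a different phenotype, the `expression` argument would simply be replaced by the other per-patient measure. A rank-based correlation may be a reasonable alternative if the relationship is not linear.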

Have a look at this question and my reply to it; I think you will find it useful: A: Interpreting Fractional Methylation Data

Also you can learn more about the TCGA way from this web site:

written 6.8 years ago by B. Arman Aksoy1.2k