Question: For GEO data, how to calculate the average methylation level of the gene promoter region using R.
a511512345 (Nanning, Guangxi, China) wrote, 8 months ago:

Hello, everyone. I am currently learning to use the TCGA-Assembler package to process methylation array (chip) data from TCGA. TCGA-Assembler provides a function (CalculateSingleValueMethylationData) to calculate the average methylation level of a particular region of a gene.

However, for GEO data, how can I calculate the average methylation level of the gene promoter region using R? I look forward to your reply. Thank you very much.

Kevin Blighe wrote, 8 months ago:

There is no standard package for this, of course. It will take some effort on your part.

Some pointers:

  1. Download the normalised β (beta) methylation values from GEO. You can recognise β values because they are bounded between 0.0 and 1.0. Most methylation data in GEO series matrix files should already be normalised. There is usually an automated R script that you can use, too: from the main accession page, click on the blue Analyze with GEO2R button.
  2. Download promoter regions as a BED file. You will have to define what counts as a promoter in your study. There is no single agreed definition, but histone marks such as H3K27ac, H3K4me1, and H3K27me3 are observed at promoters (and enhancers). You can download information for these from the ChromHMM study (do a search). My preference, however, would be to take the data from FANTOM5, a study from Japan whose aim was to define promoter regions.
  3. Summarise methylation by taking the mean across the probes that fall in each promoter region. For this in R, you can use GenomicRanges.
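
As a rough illustration of steps 1 and 3, here is a minimal sketch using GenomicRanges. Everything in it is toy data: the probe IDs, coordinates, gene names and β values are made up, and in practice the probe positions would come from the array annotation and the promoter ranges from your BED file. It assumes the β matrix has probe IDs as row names and samples as columns.

```r
library(GenomicRanges)

# Hypothetical probe positions (one base per CpG probe).
probes <- GRanges(
  seqnames = c("chr1", "chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(100, 150, 900, 50), width = 1),
  probe_id = c("cg01", "cg02", "cg03", "cg04")
)

# Hypothetical promoter regions, e.g. as read from a BED file.
promoters <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges   = IRanges(start = c(50, 1), end = c(300, 200)),
  gene     = c("GENE_A", "GENE_B")
)

# Toy beta matrix: rows are probes, columns are samples.
beta <- matrix(
  c(0.1, 0.3, 0.9, 0.5,   # sample S1
    0.2, 0.4, 0.8, 0.7),  # sample S2
  nrow = 4,
  dimnames = list(c("cg01", "cg02", "cg03", "cg04"), c("S1", "S2"))
)

# Sanity check for step 1: beta values must lie in [0, 1].
stopifnot(all(beta >= 0 & beta <= 1, na.rm = TRUE))

# Step 3: find which probes fall inside which promoter.
hits      <- findOverlaps(probes, promoters)
probe_ids <- probes$probe_id[queryHits(hits)]
genes     <- promoters$gene[subjectHits(hits)]

# Mean beta per promoter and sample (genes in rows, samples in columns).
promoter_mean <- do.call(
  rbind,
  lapply(split(probe_ids, genes), function(ids) {
    colMeans(beta[ids, , drop = FALSE], na.rm = TRUE)
  })
)

print(promoter_mean)
```

Probes that overlap no promoter (cg03 here) are simply dropped, and promoters without any probe do not appear in the result.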

Kevin

Edit based on noorpratap's comment: it is highly likely that the methylation array already has many probes that target promoter regions. Thus, why not just use these? Check the array platform and then try to obtain the annotation / metadata associated with it.
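
A minimal base R sketch of that idea, under some assumptions: the annotation is taken to look like the Illumina 450k manifest, with semicolon-delimited `UCSC_RefGene_Name` and `UCSC_RefGene_Group` columns, and "promoter" is taken here to mean the TSS200 and TSS1500 groups, which is a choice you may want to change. The probe IDs, gene names and β values are invented.

```r
# Toy annotation in the style of the Illumina manifest (values are made up).
annot <- data.frame(
  probe              = c("cg01", "cg02", "cg03", "cg04"),
  UCSC_RefGene_Name  = c("GENE_A;GENE_A", "GENE_A", "GENE_B", "GENE_B;GENE_C"),
  UCSC_RefGene_Group = c("TSS200;TSS1500", "Body", "TSS1500", "Body;TSS200"),
  stringsAsFactors   = FALSE
)

# Toy beta matrix: rows are probes, columns are samples.
beta <- matrix(
  c(0.10, 0.80, 0.40, 0.60,   # sample S1
    0.20, 0.90, 0.50, 0.70),  # sample S2
  nrow = 4,
  dimnames = list(annot$probe, c("S1", "S2"))
)

# Expand the semicolon-delimited fields to one row per probe/gene/group.
genes  <- strsplit(annot$UCSC_RefGene_Name, ";", fixed = TRUE)
groups <- strsplit(annot$UCSC_RefGene_Group, ";", fixed = TRUE)
stopifnot(identical(lengths(genes), lengths(groups)))
long <- data.frame(
  probe = rep(annot$probe, lengths(genes)),
  gene  = unlist(genes),
  group = unlist(groups),
  stringsAsFactors = FALSE
)

# Assumption: these two groups are what we call "promoter".
promoter_groups <- c("TSS200", "TSS1500")
prom <- unique(long[long$group %in% promoter_groups, c("probe", "gene")])

# Mean beta over promoter probes, per gene and sample.
promoter_mean <- do.call(
  rbind,
  lapply(split(prom$probe, prom$gene), function(ids) {
    colMeans(beta[ids, , drop = FALSE], na.rm = TRUE)
  })
)

print(promoter_mean)
```

The `unique()` call keeps a probe from being counted twice for the same gene when it is annotated to several transcripts of that gene.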


Thank you for your help, I will try it.

Reply written 8 months ago by a511512345

I am not familiar with GEO, but the data should contain probe-level information. The genomic locations of the probes can be extracted from the Illumina 450K annotation (the Illumina Manifest), provided the data were generated on that platform. Once you have that, the paper outlines a method for assigning a beta value to a gene: if probes are present within TSS200, the mean of all those probes is used; otherwise the mean of the probes in the 1st exon (1stExon) is taken; and if there are no 1stExon probes either, the mean of the probes in TSS1500 is used.
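
The fallback rule described above can be sketched in base R roughly as follows. This is only an illustration for a single sample: the annotation table is a simplified, hypothetical one with one row per probe/gene pair (the real manifest needs its semicolon-delimited fields expanded first), and the helper name `gene_beta` is made up.

```r
# Toy annotation: one row per probe/gene pair (hypothetical values).
annot <- data.frame(
  probe = c("cg01", "cg02", "cg03", "cg04", "cg05", "cg06"),
  gene  = c("GENE_A", "GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_C"),
  group = c("TSS200", "TSS200", "TSS1500", "1stExon", "TSS1500", "TSS1500"),
  stringsAsFactors = FALSE
)

# Toy beta values for one sample, named by probe ID.
beta <- c(cg01 = 0.10, cg02 = 0.30, cg03 = 0.90,
          cg04 = 0.60, cg05 = 0.20, cg06 = 0.40)

# Order in which the regions are tried.
priority <- c("TSS200", "1stExon", "TSS1500")

# Hypothetical helper: mean beta of the first region that has probes.
gene_beta <- function(gene_annot, beta) {
  for (grp in priority) {
    ids <- intersect(gene_annot$probe[gene_annot$group == grp], names(beta))
    if (length(ids) > 0) {
      return(mean(beta[ids], na.rm = TRUE))
    }
  }
  NA_real_  # no probe in any of the three regions
}

# One value per gene.
gene_means <- vapply(split(annot, annot$gene), gene_beta, numeric(1), beta = beta)

print(gene_means)
```

In this toy example GENE_A uses its two TSS200 probes and ignores the TSS1500 probe, GENE_B falls back to 1stExon, and GENE_C falls back to TSS1500.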

Reply written 8 months ago by noorpratap.singh

Thanks for the additional information, noorpratap. It reminds me that, in fact, the Illumina 450k methylation metadata indicates whether or not a probe is in a promoter region ('promoter' as defined by Illumina).

a511512345, you may simply want to check whether you already have information on the probes overlapping the promoter regions. Take a look at the Illumina Manifest to which noorpratap refers.

Reply written 8 months ago by Kevin Blighe