Question: How to combine methylation information from different platforms?
6.6 years ago by
United States
liu4gre210 wrote:

Hi, guys,

I have a set of methylation data from the 450K platform, and I want to see how it compares to another set, which unfortunately is from RRBS. I have the RRBS data as a BED file, which gives the CpG sites and their methylation levels. I am wondering how to combine the 450K and RRBS data, especially since they don't cover the same CpG sites.


rrbs methylation 450k • 3.7k views
ADD COMMENTlink modified 5 months ago by SimH0 • written 6.6 years ago by liu4gre210
6.6 years ago by
Charles Warden8.0k
Duarte, CA
Charles Warden8.0k wrote:

You can only compare signal that is commonly represented on both platforms.

If comparing 450k versus BS-Seq, you would obviously be ignoring a lot of information from the whole-genome BS-Seq. In your case, my understanding is that you may also have to work with differences in RRBS coverage between samples (even between technical replicates).

If it helps, there is a comparison between 450k signal and targeted BS-Seq data in Figure 4B of this paper (although that particular figure focuses on signal between islands rather than sites):

ADD COMMENTlink modified 12 months ago by _r_am32k • written 6.6 years ago by Charles Warden8.0k

Thanks for the info.

It is true that 450K gives less information than BS-Seq approaches such as RRBS. Actually, I am only interested in the sites detected by 450K, as my sample was run on this platform. The problem is that my reference samples were run on the RRBS platform. I checked the CpG sites provided for the two sets of samples: one is based on probe coordinates and the other comes from the BED file. They don't match each other. So I am wondering what I should do?


ADD REPLYlink written 6.6 years ago by liu4gre210

You'll have to compare the files with a custom script, but the necessary information (chromosome, position, and beta / percentage methylation) is available in both cases. I believe the 450k annotation file (the .bpm file) provides both hg18 and hg19 coordinates (where the hg19 information is in CHR and MAPINFO). The 450k beta values are between 0 and 1, and the percentage methylation values in the .bed files (I assume from Bismark) will probably be between 0 and 100, so you need to put them on the same scale, for example by dividing the percentages by 100 (unless you are looking at differential methylation - then the chromosome and position are all you need to make something like the Venn diagram in the figure that I was mentioning).

You can download that 450k annotation file from the Illumina website, and there is also a copy in the standalone version of COHCAP
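
As a rough illustration of the kind of custom script described above, here is a minimal Python sketch. It assumes you have already read the annotation into a dictionary of probe ID to (CHR, MAPINFO) and the RRBS .bed file into (chromosome, start, percent methylation) tuples. The function names, the probe IDs, and the `bed_offset` default are hypothetical: whether the RRBS start is 0-based depends on the tool that wrote the file, so check that before trusting the offset.

```python
def normalise_chrom(chrom):
    """Return the chromosome name without a leading 'chr' ('chr1' -> '1')."""
    chrom = str(chrom)
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def merge_450k_rrbs(probe_positions, betas, rrbs_records, bed_offset=1):
    """Pair 450k beta values with RRBS methylation at shared positions.

    probe_positions: {probe_id: (CHR, MAPINFO)} from the 450k annotation
    betas:           {probe_id: beta value between 0 and 1}
    rrbs_records:    iterable of (chrom, start, percent_methylation)
    bed_offset:      added to each RRBS start to match MAPINFO; the default
                     of 1 ASSUMES a 0-based start and must be verified
    """
    rrbs = {}
    for chrom, start, pct in rrbs_records:
        key = (normalise_chrom(chrom), int(start) + bed_offset)
        rrbs[key] = float(pct) / 100.0  # rescale 0-100 to 0-1

    merged = {}
    for probe_id, (chrom, pos) in probe_positions.items():
        key = (normalise_chrom(chrom), int(pos))
        if probe_id in betas and key in rrbs:
            merged[probe_id] = (betas[probe_id], rrbs[key])
    return merged


# Toy data with made-up probe IDs and positions
probe_positions = {"cg0001": ("1", 1000), "cg0002": ("2", 500)}
betas = {"cg0001": 0.82, "cg0002": 0.10}
rrbs_records = [("chr1", 999, 80.0), ("chr3", 42, 55.0)]

merged = merge_450k_rrbs(probe_positions, betas, rrbs_records)
print(merged)
```

Only the probe whose position is also present in the RRBS records survives the merge; everything else is dropped, which is the "commonly represented" signal mentioned earlier.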

ADD REPLYlink modified 15 months ago by _r_am32k • written 6.6 years ago by Charles Warden8.0k

Thank you very much for your help. Actually, that is exactly what I am asking. The coordinates from 450K (MAPINFO) are different from the coordinates provided in the RRBS file. How could I merge them? Or are they simply different CpG sites, meaning only the overlapping CpG sites can be extracted and compared (Venn diagram)?

Thanks again.

ADD REPLYlink written 6.6 years ago by liu4gre210

Yes - you should only look for the overlap. My guess is that the RRBS just happens not to show enrichment in some areas covered by the 450k array. You can visualize your alignment for a few cases to confirm that this is in fact what is happening.
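
Before concluding that the sites really are different, it can be worth checking for a systematic coordinate shift. This is a hedged Python sketch, not a tested pipeline: it assumes both site lists are sets of (chromosome, position) tuples with matching chromosome names, and the function names and toy positions are made up. It counts the overlap after shifting the RRBS positions by -1, 0, and +1, then reports Venn-style counts for the best shift.

```python
def overlap_by_offset(array_sites, rrbs_sites, offsets=(-1, 0, 1)):
    """Count array sites found in the RRBS set after shifting RRBS positions."""
    counts = {}
    for offset in offsets:
        shifted = {(chrom, pos + offset) for chrom, pos in rrbs_sites}
        counts[offset] = len(array_sites & shifted)
    return counts


def venn_counts(array_sites, rrbs_sites):
    """Return the three region sizes of a two-set Venn diagram."""
    shared = array_sites & rrbs_sites
    return {
        "array_only": len(array_sites - shared),
        "shared": len(shared),
        "rrbs_only": len(rrbs_sites - shared),
    }


# Toy data: RRBS positions are one base lower than the array positions
array_sites = {("1", 1000), ("1", 2000), ("2", 500)}
rrbs_sites = {("1", 999), ("1", 1999), ("2", 700)}

counts = overlap_by_offset(array_sites, rrbs_sites)
best_offset = max(counts, key=counts.get)
shifted_rrbs = {(chrom, pos + best_offset) for chrom, pos in rrbs_sites}
venn = venn_counts(array_sites, shifted_rrbs)
print(counts, best_offset, venn)
```

If one offset gives a much larger overlap than the others, the mismatch is probably a coordinate convention rather than genuinely different CpG sites.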

ADD REPLYlink modified 12 months ago by _r_am32k • written 6.6 years ago by Charles Warden8.0k

Great. Thank you very much for the help.

ADD REPLYlink written 6.6 years ago by liu4gre210
3.7 years ago by
United States
ibphuangchen10 wrote:

Try this R package:


ADD COMMENTlink written 3.7 years ago by ibphuangchen10
5 months ago by
SimH0 wrote:

This might be an old post, but I am trying to achieve the same thing and only got ~8-10% of my 450K methylation probe 'cg numbers' mapped to the RRBS dataset. Is this normal? I would have expected most of the 450K sites to be covered.

ADD COMMENTlink written 5 months ago by SimH0

There are CpG sites outside of CpG islands. So, 100% of the sites shouldn't be covered by RRBS.

However, that percentage does sound low. Unless you have n samples and you are asking how many RRBS sites are covered in 100% of your n samples (and n>20 or n>100), I think it should be higher.

I am not sure what is exactly the most comparable number, but this paper has some comparisons:

Also, I think the annotations are for hg19. So, if you used hg38 for RRBS, that would probably cause some discordance. I don't know if that is what happened, but it is a troubleshooting idea.
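
To see how much the "covered in 100% of n samples" requirement changes the number, here is a small Python sketch. It assumes the array sites and each sample's RRBS sites are sets of (chromosome, position) tuples on the same genome build and coordinate convention; the function name, sample names, and positions are hypothetical.

```python
def coverage_summary(array_sites, rrbs_by_sample):
    """Fraction of array sites covered per sample, in any sample, and in all samples."""
    if not array_sites:
        raise ValueError("array_sites must not be empty")
    total = len(array_sites)

    per_sample = {
        name: len(array_sites & sites) / total
        for name, sites in rrbs_by_sample.items()
    }

    # Sites seen in at least one sample
    covered_any = set().union(*rrbs_by_sample.values()) & array_sites

    # Sites seen in every sample: this intersection shrinks as samples are added
    covered_all = set(array_sites)
    for sites in rrbs_by_sample.values():
        covered_all &= sites

    return {
        "per_sample": per_sample,
        "any_sample": len(covered_any) / total,
        "all_samples": len(covered_all) / total,
    }


# Toy data: four array sites and two RRBS samples with partial coverage
array_sites = {("1", 100), ("1", 200), ("1", 300), ("1", 400)}
rrbs_by_sample = {
    "sample_A": {("1", 100), ("1", 200), ("1", 300)},
    "sample_B": {("1", 200), ("1", 300), ("1", 400), ("2", 50)},
}

summary = coverage_summary(array_sites, rrbs_by_sample)
print(summary)
```

Comparing the per-sample fractions with the all-samples fraction shows whether a low percentage comes from the intersection across samples or from each individual sample, in which case the genome build or coordinate convention is a more likely cause.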

ADD REPLYlink modified 5 months ago • written 5 months ago by Charles Warden8.0k