Question: All CpG sites in human genome RRBS
2.4 years ago by
Pin.Bioinf290 wrote:


I was asked to find which CpGs are unique to the EPIC 850K methylation array, i.e. covered by the array but not observed by RRBS (reduced representation bisulfite sequencing). I have the EPIC 850K manifest, but is there a public site where I can download the coordinates of all CpGs detected by RRBS?

Do you think there will be CpGs that EPIC 850K covers that RRBS does not?

Thanks a lot for your help.

human rrbs genome bsseq • 1.1k views
modified 24 months ago by Illinu • written 2.4 years ago by Pin.Bioinf
2.4 years ago by
Devon Ryan98k
Freiburg, Germany
Devon Ryan98k wrote:

There won't be an exact list of CpGs covered by RRBS because it depends on the exact enzymes used and how tightly you perform size selection. I would propose that you perform the following procedure:

  1. Use Biopython to determine all possible fragments generated by the restriction enzymes you'll be using (the Bio.Restriction module has some convenient functions for performing restriction digests on sequences).
  2. Determine a rough size range of sequenceable fragments, which will likely be something like 75-500 bases.
  3. Choose a read length (N), because the results of all of this will be length-dependent.
  4. For each of the fragments you selected in step 2, write the regions corresponding to the first/last N bases to a file in BED format.
  5. Load the BED file from step 4 into an interval tree (there might be something in Biopython for this; in the worst case you can use deeptoolsintervals from deepTools).
  6. Use Biopython to iterate over the CpGs in the genome and query each one for overlaps with the interval tree from step 5.
  7. Write the results to output files as appropriate.
  8. Compare them to what the EPIC 850K covers.
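
To make the steps above concrete, here is a minimal sketch on a toy sequence. It is only an illustration under several assumptions: the enzyme is taken to be MspI (C^CGG), the digest is done with a plain regular expression rather than Biopython's restriction functions, a sorted list with `bisect` stands in for the interval tree, and all function names (`digest`, `sequenced_regions`, `covered_cpgs`, `to_bed`) are made up for this example. Coordinates are 0-based and half-open, as in BED. For a real genome you would run this per chromosome and use your actual size-selection range and read length.

```python
import bisect
import re


def digest(seq, site="CCGG", cut_offset=1):
    """Step 1: cut at every occurrence of `site` (assumed MspI, C^CGG).

    Returns half-open (start, end) fragment coordinates.
    """
    cuts = [m.start() + cut_offset
            for m in re.finditer(f"(?={site})", seq.upper())]
    bounds = [0] + cuts + [len(seq)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def merge(intervals):
    """Merge overlapping or touching intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def sequenced_regions(fragments, min_len=75, max_len=500, read_len=50):
    """Steps 2-4: keep size-selected fragments, then take the first and
    last `read_len` bases of each one."""
    regions = []
    for start, end in fragments:
        if min_len <= end - start <= max_len:
            n = min(read_len, end - start)
            regions.append((start, start + n))
            regions.append((end - n, end))
    return merge(regions)


def to_bed(chrom, regions):
    """Step 4: format regions as BED lines (chrom, start, end)."""
    return "".join(f"{chrom}\t{s}\t{e}\n" for s, e in regions)


def covered_cpgs(seq, regions):
    """Steps 5-6: return the position of the C of every CpG whose C and G
    both fall inside one of the merged, sorted regions."""
    starts = [s for s, _ in regions]
    hits = []
    for m in re.finditer("(?=CG)", seq.upper()):
        pos = m.start()
        i = bisect.bisect_right(starts, pos) - 1
        if i >= 0 and pos + 2 <= regions[i][1]:
            hits.append(pos)
    return hits


# Toy example: small numbers so the result is easy to check by hand.
seq = "A" * 10 + "CCGG" + "ACGT" * 5 + "CCGG" + "A" * 10
regions = sequenced_regions(digest(seq), min_len=20, max_len=30, read_len=5)
rrbs = covered_cpgs(seq, regions)

# Step 8: compare with the array. These positions are invented; in practice
# they would come from the EPIC 850K manifest.
array_cpgs = {11, 19, 31}
array_only = sorted(array_cpgs - set(rrbs))
print(to_bed("toy", regions), end="")
print("RRBS-covered CpGs:", rrbs)
print("Array-only CpGs:", array_only)
```

Note that changing `read_len` or the size-selection range changes which CpGs count as covered, which is exactly why no single fixed list exists.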

Note that the sales materials for the EPIC 850K may give a ballpark estimate of all of this. I wouldn't be surprised if the EPIC 850K covers some CpGs that RRBS doesn't.

written 2.4 years ago by Devon Ryan

I agree that RRBS coverage will vary (but I think the number of shared sites at 10X coverage is a useful QC metric).

So, even if you find the conclusions are a little different for your own experiment, perhaps it is worth taking a look at this Carmona et al. 2017 paper?

modified 24 months ago • written 24 months ago by Charles Warden
24 months ago by
Illinu90 wrote:

Hi Pin.Bioinf, if you are still interested, you can email and they can give you a list of all CpGs detected in human samples with the Diagenode Premium RRBS Kit. Best, Sol

written 24 months ago by Illinu