Question: All CpG sites in human genome RRBS
Pin.Bioinf (Malaga) wrote, 17 months ago:

Hello,

I was asked to find which CpGs are unique to the EPIC 850K methylation array, i.e. covered by the array but not observed by RRBS (reduced representation bisulfite sequencing). I have the EPIC 850K manifest, but is there a public site where I can download the coordinates of all CpGs detected by RRBS?

Do you think there will be CpGs that EPIC 850K covers that RRBS does not?

Thanks a lot for your help.

Devon Ryan (Freiburg, Germany) wrote, 17 months ago:

There won't be an exact list of CpGs covered by RRBS because it depends on the exact enzymes used and how tightly you perform size selection. I would propose that you perform the following procedure:

  1. Use Biopython to determine all possible fragments generated by the restriction enzymes you'll be using (that package has some convenient functions for performing restriction digests on sequences).
  2. Determine a rough size range of sequenceable fragments, which will likely be something like 75-500 bases.
  3. Choose a read length (N), because the results of all of this will be length-dependent.
  4. For each of the fragments you selected from step 2, write the regions corresponding to the first/last N bases to a file in BED format.
  5. Load the BED file from step 4 into an interval tree (there might be something in Biopython for this; in the worst case you can use deeptoolsintervals from deepTools).
  6. Use Biopython to iterate over the CpGs in the genome and query each one for overlaps with the interval tree from step 5.
  7. Write the overlapping CpGs to output files.
  8. Compare them to what the EPIC 850K covers.
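
The steps above can be sketched in a few lines of Python. This is a minimal illustration and not a tested pipeline, and it makes several assumptions that are not in the answer: it uses plain regular expressions instead of Biopython's restriction tools, a sorted list of merged intervals with `bisect` instead of an interval tree, MspI (C^CGG) as the enzyme, and it counts a CpG as covered when its forward-strand C falls inside a read region. All function names are hypothetical.

```python
import re
from bisect import bisect_right


def digest(seq, site="CCGG", cut_offset=1):
    """Step 1: cut seq at every site; return half-open (start, end) fragments.

    The defaults model MspI (C^CGG); the enzyme choice is an assumption.
    """
    pattern = "(?=" + re.escape(site) + ")"  # lookahead also finds overlapping sites
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq.upper())]
    bounds = [0] + cuts + [len(seq)]
    # Terminal fragments (bounded by the sequence end) are kept for simplicity.
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def merge(intervals):
    """Merge overlapping or touching half-open intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def read_regions(fragments, read_len, min_len=75, max_len=500):
    """Steps 2-4: keep size-selected fragments, take the first/last read_len bases."""
    regions = []
    for start, end in fragments:
        if min_len <= end - start <= max_len:
            regions.append((start, min(start + read_len, end)))
            regions.append((max(end - read_len, start), end))
    return merge(regions)


def to_bed(chrom, regions):
    """Step 4: format regions as BED lines (0-based, half-open)."""
    return [f"{chrom}\t{start}\t{end}" for start, end in regions]


def covered_cpgs(seq, regions):
    """Steps 5-6: 0-based positions of CpG cytosines inside any merged region."""
    starts = [start for start, _ in regions]
    hits = []
    for m in re.finditer("CG", seq.upper()):
        pos = m.start()
        i = bisect_right(starts, pos) - 1  # last region starting at or before pos
        if i >= 0 and pos < regions[i][1]:
            hits.append(pos)
    return hits


def array_only(array_positions, rrbs_positions):
    """Step 8: positions on the array that the in-silico RRBS does not cover."""
    return sorted(set(array_positions) - set(rrbs_positions))
```

For a real genome you would run this per chromosome and convert coordinates before comparing, since the positions here are 0-based and the array manifest may use a different convention. The size range and read length are parameters because, as noted above, the result depends on both.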

Note that Illumina's sales materials for the EPIC 850K may give a ballpark estimate of all of this. I wouldn't be surprised if the EPIC 850K covers some CpGs that RRBS doesn't.


I agree that RRBS coverage will vary (but I think the number of shared sites at 10X coverage is a useful QC metric).
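
As a rough illustration of that QC metric, the sketch below counts the sites that reach a minimum read depth in every sample. The input layout (one dict per sample mapping `(chrom, pos)` to read depth) and the function name are assumptions made for the example, not the output format of any particular tool.

```python
def shared_sites(samples, min_cov=10):
    """Return the set of sites with depth >= min_cov in every sample.

    samples: dict of sample name -> dict {(chrom, pos): read depth}
    (a hypothetical layout chosen for this example).
    """
    passing = [
        {site for site, depth in coverage.items() if depth >= min_cov}
        for coverage in samples.values()
    ]
    return set.intersection(*passing) if passing else set()


samples = {
    "sample_a": {("chr1", 100): 12, ("chr1", 200): 9, ("chr1", 300): 30},
    "sample_b": {("chr1", 100): 10, ("chr1", 200): 15, ("chr1", 300): 4},
}
n_shared = len(shared_sites(samples))
```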

So, even if the conclusions turn out a little different for your own experiment, it may be worth taking a look at the Carmona et al. 2017 paper.

Reply written 11 months ago by Charles Warden.
Illinu (Belgium) wrote, 11 months ago:

Hi Pin.Bioinf, if you are still interested, you can email services@diagenode.com and they can give you a list of all CpGs detected in human samples with the Diagenode Premium RRBS Kit. Best, Sol
