How to generate a CpG regions file to run DMR analysis with methylKit?
2.3 years ago
maria2019 ▴ 210

I have RRBS data that I have pre-processed with Bismark. I want to run a DMR analysis with methylKit. However, I only have the per-CpG calls from the Bismark methylation caller and no CpG regions. Which package is best for generating CpG regions so that I can run the DMR analysis with methylKit?

Note: I have already done the DMC analysis with methylKit, which is why I would prefer to do the DMR analysis with the same package.

DMR methylkit CpG-regions RRBS • 1.8k views
2.3 years ago

If this is human, you could consider the UCSC CpG Islands (and you can get some gene mappings through Bioconductor).

RRBS coverage may also tend to cluster in certain blocks, so you could also define regions by distance to the gene. However, I think that knowing the CpG Islands (or more specific boundaries for regions of higher coverage) may be helpful.

Either way, I am assuming you are asking how to create the genomic ranges object ("refGR" in the example regionCounts(meth, regions=refGR), where "meth" is the result of running myobj = methRead() followed by meth = unite(myobj))?

You can use anything with the relevant information, but I think this will work for a BED file (the names assume that you have your own set of "promoter" regions, although an "island" table would work just as well):

library(GenomicRanges)
refGR = GRanges(seqnames=Rle(promoter.table$V1), ranges=IRanges(start=promoter.table$V2, end=promoter.table$V3),
strand=Rle(strand(promoter.table$V6)), Names=promoter.table$V4)
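
To make the table layout concrete, here is a minimal sketch with a made-up three-row "promoter.table" (the coordinates and region names are invented for illustration). It assumes the table follows the BED convention of 0-based starts, so it adds 1 to the start column, because GRanges coordinates are 1-based. If your table is already 1-based, drop the "+ 1L".

```r
library(GenomicRanges)

# Hypothetical BED-style table; V1..V6 are the default column names
# that read.table() assigns to a 6-column BED file without a header.
promoter.table <- data.frame(
  V1 = c("chr1", "chr1", "chr2"),             # chromosome
  V2 = c(999L, 4999L, 199L),                  # start (0-based, BED convention)
  V3 = c(2000L, 6000L, 1200L),                # end
  V4 = c("regionA", "regionB", "regionC"),    # region name
  V5 = c(0, 0, 0),                            # score (unused here)
  V6 = c("+", "-", "+"),                      # strand
  stringsAsFactors = FALSE
)

# Build the GRanges object; "name" is stored as a metadata column.
refGR <- GRanges(
  seqnames = promoter.table$V1,
  ranges   = IRanges(start = promoter.table$V2 + 1L, end = promoter.table$V3),
  strand   = promoter.table$V6,
  name     = promoter.table$V4
)

print(refGR)

# With "meth" created as described above (methRead() followed by unite()),
# the per-region counts and the region-level test would then be:
# region.meth <- regionCounts(meth, regions = refGR)
# region.diff <- calculateDiffMeth(region.meth)
```

In practice you would replace the toy data frame with something like read.table("your_regions.bed"), where the file name is a placeholder for your own file.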


For just the CpG island part, I think you want to download "CpG Islands" under "Regulation" from the Table Browser:

https://genome.ucsc.edu/cgi-bin/hgTables

Thank you very much for your response. This solves most of the problem. In more detail, I am working with rat. I read about DMAP, which also calculates DMRs. It chooses sliding windows of 40-220 bp to identify CpG regions, with specific coverage thresholds. I am trying to see whether I can extract those regions from DMAP and use them as a GRanges object in methylKit.
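
For reference, methylKit also has its own window-based route, tileMethylCounts(), which does not need an external regions file. The sketch below is only an illustration and is not equivalent to DMAP's approach. It uses the example files bundled with methylKit (human, labelled "hg18"), and the window size, step size, and cov.bases values are the ones from the methylKit vignette, not DMAP's 40-220 bp scheme. The difference and q-value cutoffs are likewise just example values.

```r
library(methylKit)

# Example CpG call files shipped with methylKit (stand-ins for your own data).
file.list <- list(
  system.file("extdata", "test1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "test2.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control2.myCpG.txt", package = "methylKit")
)

myobj <- methRead(
  file.list,
  sample.id = list("test1", "test2", "ctrl1", "ctrl2"),
  assembly  = "hg18",          # free-text label; use your own assembly name
  treatment = c(1, 1, 0, 0),
  context   = "CpG"
)

# Summarise CpG counts over tiling windows; cov.bases is the minimum
# number of covered CpGs a window needs in order to be kept.
tiles <- tileMethylCounts(myobj, win.size = 1000, step.size = 1000, cov.bases = 10)

# Keep windows covered in all samples, then test each window.
meth.tiles <- unite(tiles)
diff.tiles <- calculateDiffMeth(meth.tiles)

# Windows passing example cutoffs (25% difference, q-value 0.01).
dmr <- getMethylDiff(diff.tiles, difference = 25, qvalue = 0.01)
```

Smaller windows with a lower cov.bases would be closer in scale to the DMAP fragments, at the cost of fewer CpGs per window.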