Question: DMRcate results: How to find out which CpG sites constitute each DMR
c.ryder3 wrote (24 months ago):

Hello. The output of DMRcate is the genomic coordinates of regions identified as differentially methylated. DMRcate also tells how many CpG sites are in each DMR. The output looks something like this:

        coord                   no.cpgs    minfdr        Stouffer maxbetafc meanbetafc
39721    chr7:96641456-96657023    86   1.786098e-195        0   0.5638876  0.2507447
11267 chr12:115130855-115136308    71   0.000000e+00         0   0.5665583  0.3240000
29891    chr3:62353312-62365402    62   5.477056e-142        0   0.5326552  0.3088480
30739  chr3:147122315-147131860    62   0.000000e+00         0   0.5841839  0.3162800
6859  chr10:134594987-134602530    60   0.000000e+00         0   0.6188469  0.3357113
41367    chr8:25897201-25909599    57   3.400620e-184        0   0.5738376  0.3226581

The coord column gives the chromosomal coordinates of the differentially methylated region and the no.cpgs column gives the number of CpG sites that constitute this region. I would like to know the ID (e.g. cg08899471) of the CpG sites within these regions. How can I get this information?

Thank you


Can you elaborate on what you mean by identity? You can identify genes in the vicinity or overlap with a CpG island etc.

(comment by Satyajeet Khare, 24 months ago)

Hello. I have updated my question with some more details.

(comment by c.ryder3, 24 months ago)

I have the same problem right now. Did you ever find a solution?

Thanks, Alex

(comment by alex.v.nesta, 18 months ago)
halo22 (Indianapolis, IN) wrote (24 months ago):

I am not sure whether there is a tool that retrieves the CpG info from the DMRcate output, or whether DMRcate can be tweaked into giving you that info. But this is how I would do it:

Take the coordinates "chr7:96641456-96657023" and see which CpGs actually lie in this region, perhaps through the UCSC browser. Then overlap that list with the list of CpGs (with beta or M-values) that you used as input to DMRcate.
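The overlap described above can be sketched in base R. This is only a minimal sketch, not something DMRcate provides: the `probes` table is a made-up stand-in for your array annotation (probe ID, chromosome, position), the probe IDs and positions are invented, and `cpgs_in_region` is a hypothetical helper name.

```r
# Toy probe annotation; in practice this would come from your array
# annotation. IDs and positions below are invented for illustration.
probes <- data.frame(
  ID  = c("probe_1", "probe_2", "probe_3", "probe_4"),
  CHR = c("chr7", "chr7", "chr7", "chr3"),
  pos = c(96641456, 96650000, 96700000, 62360000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the probe IDs that fall inside a
# "chr:start-end" string such as the DMRcate coord column.
cpgs_in_region <- function(coord, probes) {
  parts <- strsplit(coord, "[:-]")[[1]]   # split on ":" and "-"
  chr   <- parts[1]
  start <- as.numeric(parts[2])
  end   <- as.numeric(parts[3])
  # Keep probes on the same chromosome within the (inclusive) interval
  probes$ID[probes$CHR == chr & probes$pos >= start & probes$pos <= end]
}

cpgs_in_region("chr7:96641456-96657023", probes)
```

With a real annotation table in place of `probes`, the number of IDs returned for a region should match its no.cpgs value if the same probes were used as DMRcate input.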

574233829 wrote (13 months ago):

Hello. Have you solved the problem? I have a similar question: I want to find the genes associated with these coordinates, but I don't know how to do it. Can you help me? For example, I want to know which gene is associated with the coordinate chr7:96641456-96657023.

574233829 wrote (13 months ago):

Hello, I have used the DMRcate package to analyse 450k data, and I want to find the genes associated with the DMRs. I ran into a problem. With the old version, the output included gene_assoc, group, hg19coord, no.probes, minpval, meanpval and maxbetafc. After I updated the package, the output became coord, no.cpgs, minfdr, Stouffer, maxbetafc and meanbetafc, so there is no longer a "gene_assoc" column. I want to find the gene names associated with each "coord". Can you please tell me how to get the associated genes using the newest DMRcate package?

The output of the newest DMRcate follows:

                       coord no.cpgs minfdr Stouffer  maxbetafc  meanbetafc
63999 chr6:33156164-33181870     265      0        0 -0.5008031 -0.02648790
63997 chr6:33128825-33149777     150      0        0  0.4176126  0.08611966
63917 chr6:32144195-32161004     128      0        0 -0.2574513 -0.03184096
63914 chr6:32114490-32123701     124      0        0 -0.4377015 -0.06195576
63889 chr6:31935801-31940855     101      0        0 -0.1555205 -0.02401999
12564 chr11:31817810-31841980    100      0        0 -0.4611059 -0.17113506

ATpoint (Germany) wrote (13 months ago):

As you're already in R, use this code snippet to get a GRanges with all CpG coordinates. It takes a BSgenome as input, e.g. hg38:

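#### Find CpG coordinates in a given BSgenome:

###################################################################################
library(Biostrings)
library(GenomicRanges)
library(BSgenome)
library(parallel)
###################################################################################

Find_CpG <- function(Genome, Cores = 1){

  # is() also accepts subclasses, unlike comparing class() to a string
  if (!is(Genome, "BSgenome")) stop("Genome must be a BSgenome!")

  # Start position of every "CG" dinucleotide, one vector per sequence
  CpG <- mclapply(seqlevels(Genome), function(x) start(matchPattern("CG", Genome[[x]])), mc.cores = Cores)

  # One GRanges per sequence (each CpG is 2 bp wide), combined into a single object
  return(
    suppressWarnings(
      do.call(c, mclapply(seq_along(CpG), function(x) GRanges(seqlevels(Genome)[x],
                                                              IRanges(CpG[[x]], width = 2)
      ), mc.cores = Cores))
    )
  )
}

## Example (needs the BSgenome.Hsapiens.UCSC.hg38 package; mc.cores > 1 is not supported on Windows):
library(BSgenome.Hsapiens.UCSC.hg38)
CpG.gr  <- Find_CpG(Genome = BSgenome.Hsapiens.UCSC.hg38, Cores = 8)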

Once you have this, I would assign metadata to this GRanges object, like:

CpG.gr$ID <- paste0("CpG", seq_along(CpG.gr))

You could then intersect this with your DMRcate output. The GenomicRanges package has efficient overlap functions (e.g. findOverlaps) that you might want to check out. Note that the IDs assigned here are arbitrary labels, not Illumina probe IDs such as cg08899471.
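As a rough illustration of that intersection step, here is a minimal sketch using findOverlaps from GenomicRanges. The two DMR coordinates are taken from the question; the probe IDs and positions in `probe.gr` are invented toy data, and the object names (`dmrs`, `probe.gr`, `ids_by_dmr`) are assumptions for this example only.

```r
library(GenomicRanges)

# Two DMR coordinate strings as printed by DMRcate (from the question)
dmrs <- as(c("chr7:96641456-96657023", "chr3:62353312-62365402"), "GRanges")

# Toy CpG/probe positions; IDs and positions are invented for illustration
probe.gr <- GRanges(
  seqnames = c("chr7", "chr7", "chr3", "chr3"),
  ranges   = IRanges(start = c(96641456, 96700000, 62360000, 62365402), width = 1),
  ID       = c("probe_1", "probe_2", "probe_3", "probe_4")
)

# Pairs of (DMR index, probe index) that overlap
hits <- findOverlaps(dmrs, probe.gr)

# One character vector of IDs per DMR; DMRs without hits get an empty vector
ids_by_dmr <- split(probe.gr$ID[subjectHits(hits)],
                    factor(queryHits(hits), levels = seq_along(dmrs)))
names(ids_by_dmr) <- as.character(dmrs)
ids_by_dmr
```

The same pattern should work with the full CpG.gr object, or with a GRanges built from the array annotation if you want real probe IDs rather than arbitrary labels.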

pierrefransquet wrote (6 days ago):

Hi all, I know this is a late reply, but I had the same issue, so I wrote some code to help anyone who needs it in the future. It might be a bit crude but it works well!

First take your final DMRcate output (mine was named "results.ranges", i.e. the genomic ranges built from the "dmrcoutput" object), turn it into a data frame, and give each DMR an identifier:

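RR <- as.data.frame(results.ranges)
RR$DMRID <- rownames(RR)                     # remember the original row names
row.names(RR) <- NULL                        # reset row names to 1..n
RR$DMRNO <- rownames(RR)                     # use the row number as the DMR number
row.names(RR) <- RR$DMRID                    # restore the original row names
RR$DMRID <- NULL
RR <- RR[order(RR$minfdr), , drop = FALSE]   # most significant DMRs first
View(RR)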

Your new results.ranges (i.e. RR) should now have a DMR number associated with each DMR.

Now you need to pull the CpG info from the DMRcate output that was used to make the results ranges:

cgID <- as.data.frame(dmrcoutput$input)

Look at your RR table, choose a DMR that looks good, and take note of its DMRNO. Then run:

DMRNUM <- readline(prompt = "What is your DMR Number: ")

Enter the number into the console and hit enter, then run these lines and it should spit out a table listing the probes as well as other useful info:

dmr <- subset(RR, DMRNO == DMRNUM)           # the chosen DMR (one row of RR)
assign(paste0("DMR_", DMRNUM), dmr)
# Probes on the same chromosome whose position lies within the DMR (inclusive)
assign(paste0("DMR_", DMRNUM, "_probelist"),
       cgID[cgID$CHR == as.character(dmr$seqnames) & cgID$pos >= dmr$start & cgID$pos <= dmr$end, ])

You should now have a data frame named "DMR_XXX_probelist", where XXX is the DMR number you entered.

Kind Regards, Pete
