Question: MeDip-seq data interpretation + analysis / MEDIPS, MACS2
0
gravatar for florian.noack
4.6 years ago by
Germany
florian.noack0 wrote:

Hello everybody,

since three weeks i try to make sense out of my (h)medip-seq data (iam an absolute beginner in deep sequencing data analysis and bioinformatics in general) and i hope you guys can give me some tips.

Sample: in vivo material of three linage related cell types

Method: comperative (h) MeDip-seq ( Li Tan et al. 2013) with 300 ng of each of the three celltypes (in total 900ng DNA) using the Diagnode MeDip and hMeDip kits.

First of all I don’t know if my data look alright since I have really narrow peaks (reads are extended to 250 bp) which align quite often only to on strand (hemimethylated DNA or non-CpG methylation), ([IMG]http://i.imgur.com/1TiUPTQ.png[/IMG]). I tried already to download  published raw medip-seq data to compare them with my data but without success. Most times the data were already processed and therefor i couldnt compare them with my data. Is this narrow peak pattern normal or should I see more broader peaks (MeDip QC and Seqeuncing QC seem all okay) ?

 

 

The second problem is to analyse the data and find differential methylated peaks/regions). I used already the MEDIPS package (see code below) but these gives me around 10 peaks without duplicates whereas I get around 500 peaks with duplicates (adj=T). 

library("MEDIPS")
library("BSgenome.Mmusculus.UCSC.mm10")
library("edgeR")

bcv <- 0.01
counts <- matrix( rnbinom(40,size=1/bcv^2,mu=10), 20,2)
y <- DGEList(counts=counts, group=1:2)
et <- exactTest(y, dispersion=bcv^2)
medipoN="/home/crtd_c...MeDip_Output.bam"
medipoR="/home/crtd_c ...2MeDip_Output.bam"
BSgenome="BSgenome.Mmusculus.UCSC.mm10"
uniq=FALSE
extend=100
shift=0
ws=100
chr.select=NULL

N_Output= MEDIPS.createSet(file = medipoN, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)
R_Output= MEDIPS.createSet(file = medipoR, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)

 

mr.edgeR.MeDip = MEDIPS.meth(MSet1 = N_Output, MSet2 = R_Output,  p.adj = "hochberg",diff.method = "edgeR", prob.method = "poisson", MeDIP = F, CNV = F, type = "rpkm", minRowSum = 1)

mr.edgeR.Medip.s = MEDIPS.selectSig(results = mr.edgeR.MeDip, p.value = 0.2, adj = T, ratio = NULL, bg.counts = 30, CNV = F)

However if I visualize the called regions in seqmonk they don’t look really interesting whereas I find easily other regions which look more differential methylated. So I don’t really trust the package but may used some wrong settings ? Can i use adj=F (but i think then i will get to much false positiv peaks) ?

I proceed then with MACS2 to call broad peaks in all three samples, merged them and count the reads of the peaks using bedtools. 

macs2 callpeak -t /home/crtd_c ... MeDip_Output.bam -c /home/crtd_c ... MeDip_Input.bam --broad -f BAM -g mm --bw 250 -n PP_MeDip_MACS2  -> for each of the three celltypes

merge all three with "cat" function 

bedtools merge -i /home/crtd_csortedmacs2peaks.bed -n -scores sum  -d 100 > mergedMACS2peaks.bed

bedtools multicov -bams /home/crtd_c...PP_MeDip_Output.bam /home/crtd_c...DP_MeDip_Output.bam /home/crtd_c...N_MeDip_Output.bam -bed /home/crtd_c...mergedMACS2peaks.bed > MeDippeaks.bed -D -q 30


I sorted all peaks out which have no more than 20 reads in one of the three samples ( sorted <-subset(mydata, ((V6>20)|(V7>20)|(V8>20))) )and calculated the log2 ratio between the three celltypes (so at the end I had a excelsheet with the genomic coordinates and the three log2 ratios). However now is the question can I do it like this and if yes, how can I define the significant changes of the log2 ratios (simply +1/-1 fold change ?).

However Iam happy about any other suggestions regarding the data analysis. May there is a tool out there which I didn’t try so far.

 

Thanks a lot !

sequencing chip-seq • 5.0k views
ADD COMMENTlink modified 4.6 years ago • written 4.6 years ago by florian.noack0
1
gravatar for Ming Tang
4.6 years ago by
Ming Tang2.4k
Houston/MD Anderson Cancer Center
Ming Tang2.4k wrote:

I have post related to MeDIP-seq http://crazyhottommy.blogspot.com/search/label/MeDIP-seq

you may find it helpful.

 

ADD COMMENTlink written 4.6 years ago by Ming Tang2.4k
0
gravatar for florian.noack
4.6 years ago by
Germany
florian.noack0 wrote:

Thanks a lot, I will try diffreps as well as the MACS2 bdgcmp option.

ADD COMMENTlink written 4.6 years ago by florian.noack0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 762 users visited in the last hour