Question: MeDip-seq data interpretation + analysis / MEDIPS, MACS2
0
gravatar for florian.noack
4.9 years ago by
Germany
florian.noack0 wrote:

Hello everybody,

since three weeks i try to make sense out of my (h)medip-seq data (iam an absolute beginner in deep sequencing data analysis and bioinformatics in general) and i hope you guys can give me some tips.

Sample: in vivo material of three linage related cell types

Method: comperative (h) MeDip-seq ( Li Tan et al. 2013) with 300 ng of each of the three celltypes (in total 900ng DNA) using the Diagnode MeDip and hMeDip kits.

First of all I don’t know if my data look alright since I have really narrow peaks (reads are extended to 250 bp) which align quite often only to on strand (hemimethylated DNA or non-CpG methylation), ([IMG]http://i.imgur.com/1TiUPTQ.png[/IMG]). I tried already to download  published raw medip-seq data to compare them with my data but without success. Most times the data were already processed and therefor i couldnt compare them with my data. Is this narrow peak pattern normal or should I see more broader peaks (MeDip QC and Seqeuncing QC seem all okay) ?

 

 

The second problem is to analyse the data and find differential methylated peaks/regions). I used already the MEDIPS package (see code below) but these gives me around 10 peaks without duplicates whereas I get around 500 peaks with duplicates (adj=T). 

library("MEDIPS")
library("BSgenome.Mmusculus.UCSC.mm10")
library("edgeR")

bcv <- 0.01
counts <- matrix( rnbinom(40,size=1/bcv^2,mu=10), 20,2)
y <- DGEList(counts=counts, group=1:2)
et <- exactTest(y, dispersion=bcv^2)
medipoN="/home/crtd_c...MeDip_Output.bam"
medipoR="/home/crtd_c ...2MeDip_Output.bam"
BSgenome="BSgenome.Mmusculus.UCSC.mm10"
uniq=FALSE
extend=100
shift=0
ws=100
chr.select=NULL

N_Output= MEDIPS.createSet(file = medipoN, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)
R_Output= MEDIPS.createSet(file = medipoR, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)

 

mr.edgeR.MeDip = MEDIPS.meth(MSet1 = N_Output, MSet2 = R_Output,  p.adj = "hochberg",diff.method = "edgeR", prob.method = "poisson", MeDIP = F, CNV = F, type = "rpkm", minRowSum = 1)

mr.edgeR.Medip.s = MEDIPS.selectSig(results = mr.edgeR.MeDip, p.value = 0.2, adj = T, ratio = NULL, bg.counts = 30, CNV = F)

However if I visualize the called regions in seqmonk they don’t look really interesting whereas I find easily other regions which look more differential methylated. So I don’t really trust the package but may used some wrong settings ? Can i use adj=F (but i think then i will get to much false positiv peaks) ?

I proceed then with MACS2 to call broad peaks in all three samples, merged them and count the reads of the peaks using bedtools. 

macs2 callpeak -t /home/crtd_c ... MeDip_Output.bam -c /home/crtd_c ... MeDip_Input.bam --broad -f BAM -g mm --bw 250 -n PP_MeDip_MACS2  -> for each of the three celltypes

merge all three with "cat" function 

bedtools merge -i /home/crtd_csortedmacs2peaks.bed -n -scores sum  -d 100 > mergedMACS2peaks.bed

bedtools multicov -bams /home/crtd_c...PP_MeDip_Output.bam /home/crtd_c...DP_MeDip_Output.bam /home/crtd_c...N_MeDip_Output.bam -bed /home/crtd_c...mergedMACS2peaks.bed > MeDippeaks.bed -D -q 30


I sorted all peaks out which have no more than 20 reads in one of the three samples ( sorted <-subset(mydata, ((V6>20)|(V7>20)|(V8>20))) )and calculated the log2 ratio between the three celltypes (so at the end I had a excelsheet with the genomic coordinates and the three log2 ratios). However now is the question can I do it like this and if yes, how can I define the significant changes of the log2 ratios (simply +1/-1 fold change ?).

However Iam happy about any other suggestions regarding the data analysis. May there is a tool out there which I didn’t try so far.

 

Thanks a lot !

sequencing chip-seq • 5.3k views
ADD COMMENTlink modified 4.9 years ago • written 4.9 years ago by florian.noack0
1
gravatar for Ming Tang
4.9 years ago by
Ming Tang2.5k
Houston/MD Anderson Cancer Center
Ming Tang2.5k wrote:

I have post related to MeDIP-seq http://crazyhottommy.blogspot.com/search/label/MeDIP-seq

you may find it helpful.

 

ADD COMMENTlink written 4.9 years ago by Ming Tang2.5k
0
gravatar for florian.noack
4.9 years ago by
Germany
florian.noack0 wrote:

Thanks a lot, I will try diffreps as well as the MACS2 bdgcmp option.

ADD COMMENTlink written 4.9 years ago by florian.noack0

Hi Florian,

How was your experience with the MACS2 bdgcmp option? I was trying to compare a Tumor and Normal sample ( both having CONTROL Sample) and after running MACS2 bdgcmp option, I am only getting 1-2 DMR peaks ? Is that anything obvious that I am missing?

Thanks

ADD REPLYlink written 28 days ago by always_learning980
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 565 users visited in the last hour