Dear all, I came across the following situation: I have three WGBS methylation data from the epigenomics roadmap:
E016_WGBS_FractionalMethylation_allCHROM.bedgraph E013_WGBS_FractionalMethylation_allCHROM.bedgraph E004_WGBS_FractionalMethylation_allCHROM.bedgraph
The values of methylation states on specific sites are:
grep "chrY 5926917" E016_WGBS_FractionalMethylation_allCHROM.bedgraph
chrY 59269171 59269175 0.98 grep "chrY 5926917"
E004_WGBS_FractionalMethylation_allCHROM.bedgraph
chrY 59269171 59269173 0.93 chrY 59269173 59269175 1 grep "chrY
5926917" E013_WGBS_FractionalMethylation_allCHROM.bedgraph
chrY 59269171 59269173 0.91 chrY 59269173 59269175 0.9
Notice that the site E016 state has a uniformed methylation state so the nucleotides at chrY 59269171 59269173
and chrY 59269173 59269175
are merged into one site which is not true for the rest of the sites.
So if I perform multiIntersect:
grep 5926917 E016_E004_E013_WGBS_FractionalMethylation_allCHROM2_multiIntersect.bedgraph
chrY 59269171 59269175 3 1,2,3 1 1 1
chrY 59269175 59269209 0 none 0 0 0
All sites are merged into one If I do subsequent intersects
grep 59269173 E016_E004_E013_WGBS_FractionalMethylation_allCHROM2.bedgraph
chrY 59269171 59269173 0.98 chrY 59269171 59269175 0.93 chrY 59269171 59269173 0.91 2 2
chrY 59269171 59269173 0.98 chrY 59269171 59269175 1 chrY 59269173 59269175 0.9 2 2 chrY 59269173 59269175 0.98 chrY 59269171 59269175 0.93 chrY 59269171 59269173 0.91 2 2 chrY 59269173 59269175 0.98 chrY 59269171 59269175 1 chrY 59269173 59269175 0.9 2 2
The multiintersect does not include the appropriate methylation states (not its function) The subsequent IntersectBed keep the full file but somehow assign the values wrongly, what I would have expect would be:
chrY 59269171 59269173 0.98 chrY 59269171 59269175 0.93 chrY 59269171 59269173 0.91 2 2
chrY 59269173 59269175 0.98 chrY 59269171 59269175 1 chrY 59269173 59269175 0.9 2 2
Is there a way to get the correct intersect output? Thank you.