I am currently analysing ChIP-seq data for 4 different proteins in order to build up some idea of the correlations between them across the C. elegans genome. Essentially, I want to see where each protein's binding sites overlap with those of the others.
So far I have called peaks on all of my data sets (which include biological and technical replicates). I am now browsing the data before I start comparing peak sets to find correlations (overlaps, intersections etc.).
Some of my data is quite noisy, and in order to get the best out of it I have run MACS2 with a relatively lenient p-value threshold (5e-2) and then kept only peaks which are confirmed across technical and biological replicates, hoping to filter out noise and wrongly called peaks at this step. Empirically it seems to have worked, and I am seeing sensible results. However, this is my first solo bioinformatics project and I just wanted to check whether this is a sensible method.
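For concreteness, the replicate-confirmation step described above could be sketched roughly as follows. This is only a minimal illustration, not the exact pipeline used: the function names `overlaps` and `confirmed_peaks`, the representation of peaks as (chromosome, start, end) tuples, the toy coordinates, and the rule that a peak counts as "confirmed" when it overlaps a peak in every other replicate by at least 1 bp are all assumptions.

```python
from typing import List, Tuple

# A peak as (chromosome, start, end), using BED-style half-open coordinates.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks are on the same chromosome and share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def confirmed_peaks(replicates: List[List[Peak]]) -> List[Peak]:
    """Keep peaks from the first replicate that overlap a peak in every other replicate.

    Hypothetical helper: real peak files would be parsed from the
    narrowPeak/BED output of the peak caller rather than typed in by hand.
    """
    first, others = replicates[0], replicates[1:]
    return [
        peak
        for peak in first
        if all(any(overlaps(peak, other) for other in rep) for rep in others)
    ]


# Toy data (made up for illustration only).
rep1 = [("chrI", 100, 200), ("chrI", 500, 600), ("chrII", 50, 150)]
rep2 = [("chrI", 150, 250), ("chrII", 400, 500)]
rep3 = [("chrI", 180, 220), ("chrI", 550, 650)]

kept = confirmed_peaks([rep1, rep2, rep3])
print(kept)  # only the chrI 100-200 peak is supported by all three replicates
```

The quadratic scan is fine for a toy example; for genome-scale peak sets a sorted sweep or a dedicated interval tool would be the more practical choice.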
Is anyone able to recommend a better method? Is my MACS2 cutoff too lenient? Can anyone point me to papers which detail methods for this sort of thing? I bow to the greater knowledge and wisdom of this community. Many thanks.