Question: Are MACS2's summit or DiffBind's summit options recommended for histone marks like H3K4me3, H3K27me3, H3K27ac and H3K4me1?
Researcher60 wrote:

Hi all, I used MACS2 for peak calling at the default settings, without the summit option, and then used DiffBind to call differential peaks. I am worried because I could not identify any differential peaks around the TSS; the ones I found lie elsewhere, in the gene bodies. Is this because I didn't use the summit option when calling peaks with MACS2, or when calling differential peaks with DiffBind? Will peak calling with the summit option improve my calls? Any helpful suggestion is welcome.


The MACS2 --call-summits option won't change the boundaries of the peaks called. What it does is add multiple lines to the BED (narrowPeak) file for some peaks; these lines have the same start and end coordinates but different values in the last 4 columns, which are the signal, p-value, q-value, and peak summit. This lets you detect multiple binding events within a peak region, which I think is more useful for transcription factor ChIP-seq than for histone modifications.
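To make the file layout concrete, here is a minimal Python sketch that groups narrowPeak lines by their shared region and lists the summits found in each. The peak names and all numeric values are made up for illustration, and the helper name summits_by_region is hypothetical; only the 10-column narrowPeak layout is taken from the format itself.

```python
from collections import defaultdict

# Hypothetical narrowPeak lines (made-up values), tab-separated, 10 columns:
# chrom, start, end, name, score, strand, signalValue, pValue, qValue, summit offset
EXAMPLE = "\n".join([
    "chr1\t1000\t2000\tpeak_1a\t250\t.\t8.5\t30.1\t27.4\t200",
    "chr1\t1000\t2000\tpeak_1b\t180\t.\t6.2\t21.0\t18.6\t750",
    "chr1\t5000\t5400\tpeak_2\t90\t.\t4.1\t11.3\t9.2\t150",
])


def summits_by_region(narrowpeak_text):
    """Group (summit position, signal) pairs by the region they share."""
    regions = defaultdict(list)
    for line in narrowpeak_text.splitlines():
        # Skip blank lines and optional header/comment lines.
        if not line.strip() or line.startswith(("#", "track")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        signal = float(fields[6])
        offset = int(fields[9])  # summit offset is relative to the region start
        regions[(chrom, start, end)].append((start + offset, signal))
    return dict(regions)


regions = summits_by_region(EXAMPLE)
for (chrom, start, end), summits in regions.items():
    print(f"{chrom}:{start}-{end} has {len(summits)} summit(s): {summits}")
```

In this toy input the first region appears on two lines with identical boundaries but different summits, which is the situation described above.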

Hi @colin, thanks for explaining it so well. But then, what do you think about DiffBind's summits option: will that help? Hi @Rory, do you have any suggestions on this?

Impossible to say without more details. How many replicates? Are there even peaks at TSS regions? Is the quality of the ChIP good?

I have 4 samples in each group, which I treated as replicates when calling differential peaks. I can see peaks when I load the tdf files in IGV, but I am still unable to pick out those that are differential between the two groups.

Check the fold changes and FDR values over the regions you are interested in. Log fold changes close to zero indicate that there is indeed no change in the TSS regions. Larger fold changes combined with large p-values/FDRs indicate a lack of statistical power for the given setup. Is this cell line material or primary specimens?
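As a rough illustration of that check, the sketch below sorts regions into the three cases. The thresholds (an absolute log2 fold change of 1 and an FDR of 0.05), the region names, and the values are all assumptions chosen for the example, not recommendations; the function name triage is hypothetical.

```python
def triage(log2_fold_change, fdr, min_abs_lfc=1.0, max_fdr=0.05):
    """Classify one region from its log2 fold change and FDR.

    Both thresholds are illustrative assumptions, not recommended cutoffs.
    """
    if fdr <= max_fdr:
        return "differential"
    if abs(log2_fold_change) < min_abs_lfc:
        return "no apparent change"
    # Sizeable fold change but not significant: the setup may lack power.
    return "possible lack of power"


# Hypothetical TSS regions with made-up (log2 fold change, FDR) values.
tss_regions = {
    "geneA_TSS": (0.1, 0.90),
    "geneB_TSS": (1.8, 0.30),
    "geneC_TSS": (2.2, 0.01),
}

calls = {name: triage(lfc, fdr) for name, (lfc, fdr) in tss_regions.items()}
for name, call in calls.items():
    print(name, call)
```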

Rory Stark wrote:

Yes, I would use the summits option in DiffBind to analyze these data.

This is straightforward for the H3K4me3 mark, which is generally narrow. For the other marks, which tend to be enriched over wider intervals, the important thing is generally to accurately detect changes in enrichment in the vicinity of certain genomic features (enhancer, TSS, etc.). The summits parameter works by finding a point of optimal enrichment among all the samples where enrichment was identified, then creating a consistent window around that point, which is used to detect differential enrichment. It is usually more important that the window being tested is a representative region that is truly enriched in at least one sample group than that it covers the entire enriched region. Even if it is a subset of the full enriched region, if it is changing significantly between sample groups, that tells you what you need to know about enrichment in/near that feature. If you try to cover the entire region of enrichment, you will include more "background" bases that are not truly enriched, and this noise dilutes the confidence statistics.
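The idea can be sketched in a few lines of Python. This is a simplified illustration, not DiffBind's actual implementation: it assumes the point of optimal enrichment is the base with the highest pileup summed across samples, it uses a hypothetical helper name (recenter), and the pileup values are made up.

```python
def recenter(region_start, pileups, half_width):
    """Return (summit, window) for one consensus region.

    pileups: one per-base read pileup list per sample, all covering the
    same region, which begins at region_start. The window spans
    summit +/- half_width, with both ends inclusive.
    """
    # Simplifying assumption: sum pileup across samples, take the maximum.
    combined = [sum(column) for column in zip(*pileups)]
    summit_offset = max(range(len(combined)), key=combined.__getitem__)
    summit = region_start + summit_offset
    return summit, (summit - half_width, summit + half_width)


# Made-up pileups for two samples over a 10 bp region starting at 1000.
sample_a = [2, 2, 3, 9, 14, 9, 3, 2, 2, 2]
sample_b = [2, 2, 2, 6, 10, 7, 3, 2, 2, 2]

summit, window = recenter(1000, [sample_a, sample_b], half_width=1)

# Compare mean combined signal over the whole region vs. the summit window.
combined = [a + b for a, b in zip(sample_a, sample_b)]
full_mean = sum(combined) / len(combined)
lo, hi = window[0] - 1000, window[1] - 1000
window_mean = sum(combined[lo:hi + 1]) / (hi - lo + 1)

print(summit, window, round(full_mean, 2), round(window_mean, 2))
```

In this toy case the mean signal inside the summit window is well above the mean over the full region, because the flanking bases sit at background level. That is the dilution effect described above, on a deliberately tiny scale.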

Ultimately it depends on exactly what your experiment is designed for; there are some cases where knowing the precise boundaries of enrichment is more important, in which case you may lose some information by re-centering the consensus peaks.

