My supervisor has given me vague instructions to check our ChIP-seq analysis pipeline on an example data set and see whether we should be "re-centering" peaks. Unfortunately, I'm struggling to find consensus guidelines, so I'd like to get some opinions and guidance here from more experienced hands.
MACS2 was used to call peaks on the sample ChIP-seq data set, with a fragment length of 217. Please see the Peak Model and Cross-Correlation plots. In the model you can see the characteristic bimodal distribution of reads on the forward and reverse strands. Am I right in assuming that centering means adjusting the tag positions so that these two peaks overlap? Also, I am not sure how to interpret the Cross-Correlation image.
What I have found so far:
Tutorial: Use half the fragment length as the centering distance for jointly analyzing 5’ and 3’ tags. Centering means shifting the positions of tags mapping to the + or − strand of the chromosome by a fixed distance downstream or upstream, respectively. Centering increases the resolution of the ChIP-Seq data.
MACS2 shift option: When NOMODEL is set, MACS will use this value to move cutting ends (5') towards 5'->3' direction then apply EXTSIZE (fragment length) to extend them.... recommended to keep it as default 0 for ChIP-Seq datasets.
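To make the tutorial's definition concrete for myself, here is a minimal Python sketch of tag centering. The function name `center_tag`, the toy coordinates, and the binding site at position 1000 are my own illustration and are not taken from the tutorial or from MACS2.

```python
def center_tag(pos, strand, fragment_length):
    """Shift a tag's 5' position by half the fragment length.

    + strand tags move downstream (higher coordinate),
    - strand tags move upstream (lower coordinate),
    as described in the tutorial quoted above.
    """
    half = fragment_length // 2  # 217 // 2 == 108
    if strand == "+":
        return pos + half
    if strand == "-":
        return pos - half
    raise ValueError(f"unknown strand: {strand!r}")


# Toy example (assumed values): a binding site at 1000 with fragment length 217
# gives + strand tags near 892 and - strand tags near 1108.
print(center_tag(892, "+", 217))   # 1000
print(center_tag(1108, "-", 217))  # 1000
```

In this toy case both strand peaks land on the same position after centering, which is how I understand "making the two spikes overlap".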
Does anyone know of any papers that do this re-centering? Is there any real need to do it?
EDIT: Thanks to vchris. I'm reading that paper now. Already I can see that:
- the true binding site lies between the two peaks of the bimodal distribution
- MACS2 centering of peaks is done using shift and extsize options
So to center the peaks with a fragment length of 217, I think I must do one of the following:
- --shift 108 --extsize 108
- --shift 108 --extsize 217
- --shift -108 --extsize 217 (basic intuition is leading me toward this choice)
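To compare these options, here is a rough Python sketch of how I read the quoted description of the shift option: move the 5' end by `shift` in the 5'->3' direction, then extend by `extsize` in the same direction. The function names, the toy tag positions (892 on the + strand, 1108 on the − strand), and the binding site at 1000 are my own assumptions; this is a toy model of the description, not MACS2's actual code.

```python
def shift_extend(five_prime, strand, shift, extsize):
    """Return (start, end) of the window built from one tag.

    Assumed behavior, based on the quoted option description:
    shift the 5' end along the read's 5'->3' direction, then
    extend by extsize in that same direction.
    """
    if strand == "+":
        start = five_prime + shift
        end = start + extsize
    elif strand == "-":
        end = five_prime - shift
        start = end - extsize
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return start, end


def midpoint(interval):
    """Center of a (start, end) interval."""
    return (interval[0] + interval[1]) / 2


# Hypothetical tags flanking an assumed binding site at 1000.
tags = [(892, "+"), (1108, "-")]
options = [(0, 217), (108, 108), (108, 217), (-108, 217)]

for shift, extsize in options:
    mids = [midpoint(shift_extend(p, s, shift, extsize)) for p, s in tags]
    print(f"--shift {shift} --extsize {extsize}: window midpoints {mids}")
```

In this toy model, only the default --shift 0 --extsize 217 puts both midpoints within a base of the assumed binding site, while --shift -108 --extsize 217 centers each window on the tag's own 5' end. If my reading of the option is right, that would fit the recommendation to leave shift at 0 for ChIP-Seq, but I have not verified this against MACS2 itself.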
Thank you all, Kenneth