I am trying to call peaks in ATAC-seq data. Not surprisingly, MACS is a popular option. According to the MACS documentation:
To find enriched cutting sites such as some DNAse-Seq datasets. In this case, all 5' ends of sequenced reads should be extended in both direction to smooth the pileup signals. If the wanted smoothing window is 200bps, then use '--nomodel --shift -100 --extsize 200'.
For certain nucleosome-seq data, we need to pileup the centers of nucleosomes using a half-nucleosome size for wavelet analysis (e.g. NPS algorithm). Since the DNA wrapped on nucleosome is about 147bps, this option can be used: '--nomodel --shift 37 --extsize 73'.
Based on a brief literature review, people use both
--shift -100 --extsize 200 and
--shift 37 --extsize 73 for ATAC-seq. Is one option more appropriate? Are there maybe different sub-types of ATAC-seq where one is better than the other?