I am reading a paper in which ChIP-Seq was used to identify the binding sites of a target TF in mice. Before de novo motif discovery, the authors filtered out peaks located in false-positive regions. (The authors claim that these false-positive peaks arise from low-complexity sequence.)
Since the study also included ChIP-Seq on mice with the target TF knocked out, I guess the authors called peaks in the knock-out (KO) strain against the input control. Because the TF is absent there, all peaks from the KO mice should be false positives. It also seems that these regions are quite consistent across other ChIP-Seq experiments.
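To make my guess concrete, here is a minimal sketch of how I imagine such a filter could work. This is not the authors' code: the function name `filter_peaks`, the tuple layout `(chrom, start, end)`, and the use of 0-based half-open coordinates (as in BED files) are all my own assumptions, and the coordinates in the example are made up.

```python
from collections import defaultdict


def filter_peaks(peaks, false_positive_regions):
    """Return the peaks that do not overlap any false-positive region.

    Both arguments are lists of (chrom, start, end) tuples with
    0-based, half-open coordinates (an assumption, following BED).
    """
    # Group the false-positive regions (e.g. peaks called in the KO sample)
    # by chromosome so each peak is only compared within its chromosome.
    regions_by_chrom = defaultdict(list)
    for chrom, start, end in false_positive_regions:
        regions_by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in peaks:
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(
            start < region_end and region_start < end
            for region_start, region_end in regions_by_chrom.get(chrom, ())
        )
        if not overlaps:
            kept.append((chrom, start, end))
    return kept


# Hypothetical coordinates, for illustration only.
wild_type_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
knock_out_peaks = [("chr1", 150, 250), ("chr2", 200, 300)]
print(filter_peaks(wild_type_peaks, knock_out_peaks))
```

In this sketch a peak is dropped if it overlaps a KO peak by even a single base; a real pipeline might instead require a minimum overlap, or use an interval tree for speed.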
So I am wondering: in a general ChIP-Seq experiment without a knock-out of the target TF, how should the false-positive peaks be removed? And how do these false-positive peaks arise? Is it because the low-complexity regions are present in multiple copies in the genome? But if that were the case, the false-positive peaks should be largely eliminated by using control non-ChIP DNA.
Thank you very much in advance for sharing your ideas.