I'm trying to identify the binding motif of a transcription factor from ChIP-seq peaks called with MACS2.
I have tried several tools (including HOMER, MEME-ChIP, STEME and RSAT), but I'm not sure about the validity and reproducibility of my results, so I have a few questions:
Do you select a subset of the peaks (for example, the top 10-20% ranked by fold change or p-value), or do you give the full list as input? What's the best practice?
Do you use your own background sequences, or do you let the software generate its own background?
If you use a custom background, what do you use? For example, I tried randomly picked regions identified by DNase-seq as the background, but this completely abolished the motif enrichment that I get when I don't supply a custom background.
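To make the peak-selection question concrete, here is a minimal sketch of what taking the top fraction of peaks could look like. It assumes the standard MACS2 narrowPeak layout (tab-separated, with column 7 holding the fold enrichment, column 8 the -log10 p-value and column 9 the -log10 q-value). The function name `select_top_peaks` and the 20% default are hypothetical choices for illustration, not a recommended setting.

```python
import math

# 0-based column indices in a MACS2 narrowPeak file (assumed standard layout):
# 6 = signalValue (fold enrichment), 7 = -log10(p-value), 8 = -log10(q-value)
SIGNAL_COL = 6


def select_top_peaks(lines, fraction=0.2, rank_col=SIGNAL_COL):
    """Return the top `fraction` of narrowPeak records, highest `rank_col` first.

    `lines` is any iterable of tab-separated narrowPeak lines (for example an
    open file handle). Each returned record is a list of column strings.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Skip blank lines and optional header/track lines.
    records = [
        line.rstrip("\n").split("\t")
        for line in lines
        if line.strip() and not line.startswith(("#", "track"))
    ]
    if not records:
        return []

    # Larger values mean stronger peaks for all three ranking columns,
    # because the p- and q-values are stored as -log10.
    records.sort(key=lambda rec: float(rec[rank_col]), reverse=True)

    # Always keep at least one peak, rounding the cut-off up.
    keep = max(1, math.ceil(len(records) * fraction))
    return records[:keep]


# Example usage (the file name is a placeholder):
# with open("peaks.narrowPeak") as handle:
#     for rec in select_top_peaks(handle, fraction=0.2):
#         print("\t".join(rec))
```

Passing `rank_col=7` would rank by p-value instead of fold enrichment. Comparing the motifs found with the subset against those found with the full list is one way to check how sensitive the result is to this choice.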
Thanks for your help!