I have just 12 peaks (500 - 1000 bp) from an ATAC-seq experiment, all at promoter regions, which I want to check for TF binding sites.
I used HOMER findMotifsGenome.pl with default parameters and masking (mm10 -size 200 -mask).
knownResults: three hits which are biologically plausible (the motif's TF fits the gene at the peak region), but with a p-value of only 1e-2
homerResults: all hits are marked as potential false positives
My questions are:
- If I use only a few target sequences, is it expected that the p-value in knownResults is relatively high (and not, e.g., 1e-50) because of the ratio of target sequence hits vs. background hits?
- Do I assume correctly that the de novo homerResults will fail because there are so few input target regions?
- How could I validate whether the hits I found in knownResults are true hits?
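
To illustrate what I mean in the first question, here is a minimal sketch of an enrichment p-value for 12 target sequences. It uses a plain hypergeometric tail as a stand-in and is not HOMER's exact calculation. The background size (10,000 sequences) and the 5% background motif rate are numbers I made up for illustration, and `hypergeom_tail` is just my own helper name.

```python
from math import comb


def hypergeom_tail(k, n_target, K, N):
    """P(X >= k): chance that at least k of n_target sequences carry the motif
    when n_target sequences are drawn from N total, K of which carry the motif."""
    total = comb(N, n_target)
    upper = min(n_target, K)
    hits = sum(comb(K, i) * comb(N - K, n_target - i) for i in range(k, upper + 1))
    return hits / total


n_target = 12         # my 12 peaks
n_background = 10000  # assumed background size (illustrative only)
bg_rate = 0.05        # assumed fraction of background sequences with the motif

N = n_target + n_background
for k in (3, 6, 12):
    # Motif-carrying sequences overall: k in the targets plus the background hits.
    K = k + int(bg_rate * n_background)
    print(f"{k:2d} of {n_target} targets with motif: p = {hypergeom_tail(k, n_target, K, N):.2e}")
```

Under these assumed numbers, even the best case (all 12 targets carrying the motif) stays many orders of magnitude above 1e-50, which is why I suspect the small target set limits how low the p-value can go.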
Many thanks for your help and kind regards :)
Florian