floriandeckert wrote:

I have just 12 peaks (500-1000 bp) from an ATAC-seq experiment at promoter regions, which I want to check for TF binding sites.

I used HOMER with default parameters and masking (mm10 -size 200 -mask).

knownResults: three hits which are biologically plausible (the motif's TF fits the gene at the peak region), but with a p-value of only 1e-2
homerResults: all hits are marked as potential false positives

My questions are:

  1. If I use only a few target sequences, is it expected that the p-value for knownResults is relatively high (e.g. 1e-2 rather than 1e-50) because of the ratio of target sequence hits to background hits?
  2. Do I assume correctly that the de novo homerResults will fail because there are so few input target regions?
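
To make question 1 concrete, here is a minimal sketch of the arithmetic I have in mind. It assumes a plain hypergeometric enrichment test, which may not be exactly what HOMER computes, and all counts except the 12 target peaks and the three hits are hypothetical (50,000 background sequences, 5% of which contain the motif).

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance that at least k of the n target sequences carry the
    motif, when K of all N sequences (targets + background) carry it."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical numbers, chosen only for illustration.
n_target = 12            # target peaks (from my experiment)
n_background = 50_000    # assumed number of background sequences
bg_with_motif = 2_500    # assumed: 5% of background sequences contain the motif
N = n_target + n_background

# Case 1: the motif is found in 3 of the 12 target peaks.
p_three = hypergeom_sf(3, N, bg_with_motif + 3, n_target)

# Case 2: best case, the motif is found in all 12 target peaks.
p_all = hypergeom_sf(12, N, bg_with_motif + 12, n_target)

print(f"3 of 12 targets with motif:  p = {p_three:.2e}")
print(f"12 of 12 targets with motif: p = {p_all:.2e}")
```

Under these assumptions, three hits out of 12 give a p-value on the order of 1e-2, and even a motif present in every target peak stays many orders of magnitude above the very small p-values one sees with thousands of peaks. So the small number of target sequences alone seems to put a floor on the achievable p-value.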

How could I validate whether the hits I found in knownResults are true hits?

Many thanks for your help and kind regards :) Florian

