I have two sets of BED files from ChIP-seq, A and B, and a certain number of peaks overlap between them. I want to put some kind of enrichment score or p-value on that overlap: if only 2 peaks overlap and A and B both contain 10,000 peaks, it's not really significant, whereas if 90 peaks overlap when A has 10,000 peaks and B has 100, it's much more significant.

The classical way to do this would be a hypergeometric test. All the parameters of the test are easy to fill in (number of draws, number of successes, number of 'failures') except one: the total population size. In this scenario it would be something like 'the total number of peaks one could potentially draw from the genome', which is impossible to estimate and could be biased (regions that don't ChIP well, unmappable regions, etc.).
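To make the dependence on the population size concrete, here is a minimal sketch of the hypergeometric upper tail using only the Python standard library. The function names and the `population` values are placeholders chosen for illustration; `population` is exactly the quantity described above as impossible to estimate, so the numbers only show how strongly the result depends on it.

```python
from math import lgamma, exp

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def hypergeom_upper_tail(overlap, n_a, n_b, population):
    """P(X >= overlap) when n_b regions are drawn without replacement from
    `population` candidate regions, n_a of which are peaks of A.
    `population` is an assumed value, not something measured."""
    lo = max(overlap, 0, n_a + n_b - population)  # smallest feasible count
    hi = min(n_a, n_b)                            # largest feasible count
    log_denom = log_choose(population, n_b)
    total = 0.0
    for k in range(lo, hi + 1):
        total += exp(log_choose(n_a, k)
                     + log_choose(population - n_a, n_b - k)
                     - log_denom)
    return min(total, 1.0)

# The two scenarios from the question, with an ASSUMED population of 100,000.
p_weak = hypergeom_upper_tail(overlap=2, n_a=10_000, n_b=10_000, population=100_000)
p_strong = hypergeom_upper_tail(overlap=90, n_a=10_000, n_b=100, population=100_000)

# Same 90-peak overlap, but with a much smaller ASSUMED population.
p_strong_small_pop = hypergeom_upper_tail(overlap=90, n_a=10_000, n_b=100, population=11_000)

print(p_weak, p_strong, p_strong_small_pop)
```

With the larger assumed population the 90-peak overlap comes out as extremely unlikely by chance, while with the smaller one the same overlap is roughly what chance alone would produce. That sensitivity is the problem with picking the population size arbitrarily.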

What's the best way to put a 'score' on an overlap between A and B then? The ultimate goal is to display some kind of heatmap, as I'm doing the comparison between many A's and many B's.

Love the idea of the per-base Fisher test, I'll probably give that a try. I'm already using IntervalStats to calculate the per-peak p-value between A and B. My problem is that in their paper they suggest a summary statistic of 'number of significant peaks in A versus B' divided by 'total number of peaks in A'. That is fine when you compare one set of BED files against itself, but I'm doing a pairwise comparison between all BED files in 'set A' and all BED files in 'set B', and depending on the order (A vs B or B vs A) you can get wildly different summary statistics. For example, if you have 50 significant peaks between A and B, but A has 5000 peaks while B has 100, the summary statistic changes from 0.01 to 0.5 depending on the order. I was trying to find a way to quantify the relationship between the two in a single value.
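The arithmetic behind that example can be sketched as follows. This does not call IntervalStats; it only reproduces the summary statistic as described above, and it simplifies by assuming the same count of 50 significant peaks in both directions. The geometric mean at the end is just one hypothetical way to collapse the two directions into a single number, not something taken from the IntervalStats paper.

```python
from math import sqrt

def fraction_significant(n_significant, n_query_peaks):
    """Summary statistic as described above: significant peaks in the
    query set divided by the total number of peaks in the query set."""
    return n_significant / n_query_peaks

n_significant = 50   # assumed identical in both directions for simplicity
peaks_a = 5000
peaks_b = 100

a_vs_b = fraction_significant(n_significant, peaks_a)
b_vs_a = fraction_significant(n_significant, peaks_b)

# Hypothetical symmetric summary: geometric mean of the two directions.
symmetric = sqrt(a_vs_b * b_vs_a)

print(a_vs_b, b_vs_a, symmetric)
```

Any such collapse throws away the direction of the relationship, so keeping both values (for example as two heatmaps, or as the two triangles of one matrix) may be preferable.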

The asymmetry in the IntervalStats algorithm can be useful for biological interpretation. In the example you give, you might imagine A to be something like PolII (found at many places) and B to be an activating transcription factor (found at fewer places but typically with A). Maybe the single metric you're looking for can be found in comparing the rows/columns of the matrix created from all pairwise comparisons?