I am studying the distribution of breakpoints among different human genomes, looking for hotspots in the sample genomes that are enriched in breakpoints. To do so, I have divided each chromosome into bins of 10 kb and counted how many breaks are present in each bin. I have done the same for some control datasets and for randomly generated datasets. At this point, what is the best statistical test I could use to determine a p-value for each bin?
The data I have looks like this:
Breaks_Bin1 10 3
Breaks_bin2 15 6
Breaks_bin3 5 3
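
A minimal sketch of the binning step described above, in Python, in case it helps to see how the counts are produced. The function name `count_breaks_per_bin`, the (chromosome, position) input layout with 0-based positions, and the example coordinates are all assumptions made up for illustration, not the actual pipeline or data.

```python
from collections import Counter

BIN_SIZE = 10_000  # 10 kb bins, as described in the question


def count_breaks_per_bin(breakpoints, bin_size=BIN_SIZE):
    """Count breakpoints per (chromosome, bin index).

    `breakpoints` is an iterable of (chromosome, position) pairs with
    0-based positions; this input layout is an assumption.
    """
    counts = Counter()
    for chrom, pos in breakpoints:
        # Integer division maps a position to the index of its 10 kb bin.
        counts[(chrom, pos // bin_size)] += 1
    return counts


# Hypothetical breakpoints, for illustration only.
example = [("chr1", 1_200), ("chr1", 9_999), ("chr1", 10_000), ("chr2", 25_000)]
counts = count_breaks_per_bin(example)
for (chrom, bin_index), n in sorted(counts.items()):
    print(chrom, bin_index, n)
```

Running the same function on the sample, control, and random datasets would give one count per bin per dataset, which is the kind of table shown above.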