I think this may be a duplicate question, but I didn't find a satisfying answer.
I have sequencing data on point mutations from multiple cell samples (with duplicates) and have two questions:
I want to know where the 'hotspots' are (for example, especially for transversions). To do this, I intersected genomic windows with the mutations, and I want to get all genomic ranges with significantly more observed mutations than expected. However, the occurrences are not normally distributed and the sample size is small.

I tried calculating the mean over all windows in a duplicate (the expected value) and comparing it to a specific window, but I can't find the right statistical approach.

Example of the distribution, as n(occurrences): n(windows of genome)
0: 17,752,010
1: 8,200,279
2: 3,397,173
3: 1,135,377
4: 308,352
5: 69,607
6: 14,728
7: 3,637
8: 1,113
9: 371
10: 63
11: 1
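
To make the comparison concrete, here is a minimal Python sketch of what I mean by "mean of all windows versus a specific window", using the distribution above. It assumes a Poisson null model with the genome-wide mean as the rate, which is only an assumption and may well be the wrong model; the variance-to-mean ratio is printed as a rough check on that. The function name `poisson_upper_tail` and the Bonferroni cutoff are illustrative choices, not an established method for this data.

```python
import math

# Tabulated distribution from the post: mutations per window -> number of windows
hist = {
    0: 17_752_010, 1: 8_200_279, 2: 3_397_173, 3: 1_135_377,
    4: 308_352, 5: 69_607, 6: 14_728, 7: 3_637,
    8: 1_113, 9: 371, 10: 63, 11: 1,
}

n_windows = sum(hist.values())
n_mutations = sum(k * n for k, n in hist.items())

# "Expected" count per window: the genome-wide mean
mean = n_mutations / n_windows
# Population variance of the per-window counts
var = sum(n * (k - mean) ** 2 for k, n in hist.items()) / n_windows
# For a Poisson distribution this ratio would be close to 1
dispersion = var / mean


def poisson_upper_tail(k, lam, terms=200):
    """P(X >= k) for X ~ Poisson(lam), summed directly over the upper tail.

    The sum is truncated after `terms` terms, which is adequate when lam
    is small compared to k + terms (assumed here).
    """
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, k + terms):
        # Poisson pmf computed in log space to avoid overflow
        total += math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
    return min(1.0, total)


# Bonferroni-corrected cutoff across all windows (a conservative choice)
alpha = 0.05
cutoff = alpha / n_windows

print(f"windows={n_windows}, mean={mean:.4f}, var/mean={dispersion:.3f}")
for k in sorted(hist):
    p = poisson_upper_tail(k, mean)
    print(f"count>={k}: p={p:.3e}, below cutoff: {p < cutoff}")
```

If the variance-to-mean ratio is clearly above 1, the counts are overdispersed relative to Poisson, and a negative binomial null might be the more appropriate comparison.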
Thanks in advance!