1) I am confused about how to test my data for enrichment. Specifically, I have tried phyper, but I don't think I understand how to use it when the overlap is between three lists rather than two. My data look like this:
list a - 5,627 #list of unique peaks for a transcription factor
list b - 2,533 #list of unique peaks for a transcription factor
list c - 3,989 #list of unique peaks for a transcription factor
The number of peaks common to all three lists (i.e. the center of a Venn diagram) is 2,329.
The total number of unique peaks possible for all three transcription factors (if they perfectly overlapped the reference data) is 8,669.
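To make the question concrete, here is a minimal sketch of the kind of calculation I think I am after. It assumes the 8,669 peaks are the correct background ("universe") and that the three lists are drawn independently from it, and neither assumption is confirmed. The function name hypergeom_upper_tail is my own, and it only covers an overlap between two lists, which is exactly where I get stuck with three.

```python
from math import lgamma, exp

def log_choose(n, k):
    # log of the binomial coefficient C(n, k), via log-gamma to avoid huge integers
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def hypergeom_upper_tail(k, N, m, n):
    """P(X >= k) when n items are drawn without replacement from N items,
    m of which are 'successes'. This is the same quantity as
    phyper(k - 1, m, N - m, n, lower.tail = FALSE) in R."""
    lo = max(k, 0, n - (N - m))   # smallest overlap that is actually possible
    hi = min(m, n)                # largest overlap that is actually possible
    if lo > hi:
        return 0.0
    denom = log_choose(N, n)
    logs = [log_choose(m, i) + log_choose(N - m, n - i) - denom
            for i in range(lo, hi + 1)]
    peak = max(logs)              # log-sum-exp keeps the sum numerically stable
    return min(1.0, exp(peak) * sum(exp(x - peak) for x in logs))

# Numbers from the post; treating 8,669 as the universe is an ASSUMPTION.
N = 8669
a, b, c = 5627, 2533, 3989
observed_triple = 2329

# Expected size of the three-way overlap if the lists were independent.
expected_triple = N * (a / N) * (b / N) * (c / N)
fold_enrichment = observed_triple / expected_triple

# Toy two-list example (not my data): exact value is 26/252.
toy_p = hypergeom_upper_tail(4, 10, 5, 5)
```

The expected three-way overlap only gives a fold enrichment, not a p-value, so what I am really asking is what the three-list equivalent of the two-list tail probability would be.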
2) These data are for intergenic regions. I would also like to test whether these transcription factors are more enriched in intergenic regions than in intragenic regions, assuming I have similar data for the intragenic regions.
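For the intergenic versus intragenic comparison, I imagine something like a 2x2 table (region by shared/not shared) with a one-sided Fisher exact test. The sketch below is only an illustration of that idea: the intragenic counts are hypothetical placeholders, the function name fisher_one_sided is my own, and I am not sure a 2x2 table is the right way to frame this.

```python
from math import lgamma, exp

def log_choose(n, k):
    # log of the binomial coefficient C(n, k)
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the table [[a, b], [c, d]]:
    probability of a top-left count of a or more with all margins fixed."""
    row1, col1, total = a + b, a + c, a + b + c + d
    lo = max(a, 0, row1 - (total - col1))
    hi = min(row1, col1)
    if lo > hi:
        return 0.0
    denom = log_choose(total, row1)
    logs = [log_choose(col1, i) + log_choose(total - col1, row1 - i) - denom
            for i in range(lo, hi + 1)]
    peak = max(logs)
    return min(1.0, exp(peak) * sum(exp(x - peak) for x in logs))

# Intergenic counts come from the numbers in this post.
inter_shared, inter_other = 2329, 8669 - 2329
# HYPOTHETICAL placeholders: replace with the real intragenic counts.
intra_shared, intra_other = 1500, 8500

odds_ratio = (inter_shared * intra_other) / (inter_other * intra_shared)
p_value = fisher_one_sided(inter_shared, inter_other, intra_shared, intra_other)
```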
I may simply be misunderstanding phyper and need something completely different. Essentially, I want to know whether these three transcription factors are enriched in the reference region, expressed as a p-value or something similar.
I've also started experimenting with the fisher tool in bedtools. Could it be used to achieve what I'm looking for? If so, how?