I am analysing a capture Hi-C dataset that looks at captured promoters in relation to a predefined set of enhancers. The reads were preprocessed with HiCUP, and the set of trusted interactions was subsequently identified using the R/Bioconductor package diffHic.
I see a pattern where the regions with an even number of interactions greatly outnumber the regions with an odd number of interactions.
This is obvious when the frequency of each number of interactions is plotted:
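    library(InteractionSet)  # provides anchors(); loads GenomicRanges for countOverlaps()
    library(magrittr)        # provides the %>% pipe
    # For each interaction, count how many interactions share the same first anchor
    countOverlaps(anchors(viewdata, type = "first"),
                  anchors(viewdata, type = "first"),
                  type = "equal") %>% table %>% plot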
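For reference, here is a small base-R sketch of what this call tabulates, using made-up anchor labels rather than the real data (the vector `first_anchor` is hypothetical). It relies on the fact that `countOverlaps(x, x, type = "equal")` returns one value per interaction, so an anchor with k interactions contributes the value k to the table k times. Counting each distinct anchor once gives a different tally, which may be worth comparing against the plot.

```r
# Toy stand-in for anchors(viewdata, type = "first"): one label per interaction.
# These labels are invented for illustration only.
first_anchor <- c("p1", "p1", "p2", "p3", "p3", "p3")

# Per-interaction counts: for each row, how many rows share its first anchor.
# This mirrors countOverlaps(x, x, type = "equal") when ranges are identical.
per_row <- as.integer(table(first_anchor)[first_anchor])

# Per-region counts: each distinct anchor is counted exactly once.
per_region <- as.integer(table(first_anchor))

print(table(per_row))     # the value k appears k times per anchor
print(table(per_region))  # one entry per distinct anchor
```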
I count about twice as many even interaction counts as odd ones.
What could be causing this? Am I missing some simple bias, or an error in my analysis? It seems unlikely that there would be a biological preference for an even number of interactions.
Any insight is appreciated.
I am not familiar with this, but could the odd numbers come from missing one side of an interaction? If interactions are detected by sequencing both partners, you would expect even numbers, because each interaction should be represented from both sides.
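A quick way to test this idea is to check whether the same pair of regions occurs more than once in the interaction list, either as an exact repeat or with the two anchors swapped. The following is a minimal base-R sketch on invented data. The `pairs` data frame and its labels are hypothetical, and with real data the anchors would first need to be converted to comparable labels or indices.

```r
# Invented interaction list: each row is one reported interaction.
pairs <- data.frame(
  first  = c("p1", "e1", "p1", "p2", "e3"),
  second = c("e1", "p1", "e2", "e3", "p2"),
  stringsAsFactors = FALSE
)

# Orientation-independent key: order the two labels within each row.
key <- paste(pmin(pairs$first, pairs$second),
             pmax(pairs$first, pairs$second), sep = "|")

# Rows whose pair was already seen, in either orientation.
is_repeat <- duplicated(key)

# How many times each unordered pair is reported.
times_reported <- table(key)

print(sum(is_repeat))
print(table(times_reported))
```

If many pairs turn out to be reported twice, that would be worth ruling out before looking for a biological explanation.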