Question

Hi-C interactions have a bias towards even numbers


7.2 years ago

ingerslev • 0

The data being analysed is a capture Hi-C dataset looking at the captured promoters in relation to a predefined set of enhancers. The reads were preprocessed with HiCUP, and the set of trusted interactions was subsequently identified using the R/Bioconductor package diffHic.

I see a pattern where regions with an even number of interactions greatly outnumber regions with an odd number of interactions.

This is obvious when the frequency of each number of interactions is plotted:

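library(InteractionSet)  # provides anchors()
library(magrittr)        # provides %>%
# For each interaction, count the interactions that share its first anchor,
# then plot how often each count occurs.
countOverlaps(anchors(viewdata, type = "first"),
              anchors(viewdata, type = "first"),
              type = "equal"
              ) %>% table %>% plot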

[Figure: line plot of the number of interactions]

I count about twice as many even interaction counts as odd ones.

What could be causing this? Am I missing some simple bias, or an error in my analysis? It seems unlikely that there would be a biological preference for an even number of interactions.
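To put a number on the imbalance rather than reading it off the plot, something like the following base R sketch could be used. The vector `n_interactions` is a hypothetical stand-in with made-up values; in the real analysis it would hold the per-region counts derived from the data.

```r
# Hypothetical per-region interaction counts (illustrative values only)
n_interactions <- c(2, 4, 4, 1, 6, 2, 3, 8, 2)

# Label each region as even or odd and tabulate
parity <- ifelse(n_interactions %% 2 == 0, "even", "odd")
parity_table <- table(parity)
print(parity_table)

# Ratio of even to odd regions
even_to_odd <- parity_table[["even"]] / parity_table[["odd"]]
print(even_to_odd)
```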

Any insight is appreciated.

next-gen sequencing diffhic • 1.2k views



I am not familiar with this, but could it be that the odd numbers come from missing one side of the interaction? If interactions are detected by sequencing both partners, you would expect even numbers, because each interaction should be represented from both sides.

Comment • 7.2 years ago by Jean-Karim Heriche • 27k

score 1 · Accepted Answer · 2017-02-24


7.2 years ago

ingerslev • 0

I figured out the answer: the protocol uses RNA baits to capture promoter regions, and for some reason 3/4 of all promoters have two baits, while the remaining 1/4 have only one. Merging the two baits prior to selecting interactions removes the bias.
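For anyone hitting the same thing, here is a minimal base R sketch of how two baits per promoter can produce even counts and how merging removes them. It is a toy model, not the actual diffHic workflow: the `promoters` table, its values, and the assumption that every interaction is recorded once per bait are all hypothetical and only meant to illustrate the effect.

```r
# Toy model (all values hypothetical): 8 promoters, 3/4 of them with two baits
promoters <- data.frame(
  promoter = paste0("P", 1:8),
  n_baits  = c(2, 2, 2, 2, 2, 2, 1, 1),
  n_true   = c(3, 1, 4, 2, 5, 3, 3, 2),  # true interactions per promoter
  stringsAsFactors = FALSE
)

# Assumption: every true interaction is recorded once per bait
records <- do.call(rbind, lapply(seq_len(nrow(promoters)), function(i) {
  p <- promoters[i, ]
  expand.grid(
    promoter = p$promoter,
    bait     = seq_len(p$n_baits),
    target   = paste0("E", seq_len(p$n_true)),
    stringsAsFactors = FALSE
  )
}))

# Without merging baits: counts for two-bait promoters are doubled
unmerged <- table(records$promoter)

# Merging baits first: keep one record per promoter/target pair
merged_records <- unique(records[, c("promoter", "target")])
merged <- table(merged_records$promoter)

print(rbind(unmerged = unmerged, merged = merged))
```

In this toy model every two-bait promoter ends up with an even count before merging, while the merged counts match the true ones. The real data will not be this clean, so the exact even-to-odd ratio will differ.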
