Question: How to read the Hi-C results from Bing Ren's paper
3.7 years ago
United States
jesselee516 wrote:

Hi all, I am reading the paper 'A high-resolution map of the three-dimensional chromatin interactome in human cells'. I downloaded their Hi-C data for the IMR90 cell line from GEO (GSM1154024); the data describe interactions between genomic regions. I downloaded the .txt version. The lines in this file look as follows:

HWI-ST216_0375:2:1206:9502:51416#0 chr1 148 - chr19 63789745 -

HWI-ST216_0375:2:1215:19504:87026#0 chr1 6071 - chr13 50507738 -


Reading these lines, I get confused. I know that two regions are interacting, like

(chr1 148 -)<->(chr19 63789745 -), so these two regions should be interacting. But I expected two regions rather than two positions. I do not know what 148 and 63789745 mean in this dataset. The format I am familiar with looks like (148-500)<->(63789745-63789999), one region mapped to another region. Could anyone help me understand how to read the results from this paper? Thanks.

3.6 years ago
Göttingen, Germany
Gjain wrote:

Hi Jesse, 

In order to understand this paper, you first need to understand the 3C-based technologies and how the experiment is performed. Please have a look at this review, which will help you understand these technologies. You specifically want to focus on the 3C part and then move on to the Hi-C part.

A decade of 3C technologies: insights into nuclear organization

You also want to read the original Hi-C paper to understand what a looping interaction means:

Comprehensive mapping of long range interactions reveals folding principles of the human genome


Overview of Hi-C.

(A) Cells are cross-linked with formaldehyde, resulting in covalent links between spatially adjacent chromatin segments (DNA fragments: dark blue, red; Proteins, which can mediate such interactions, are shown in light blue and cyan). Chromatin is digested with a restriction enzyme (here, HindIII; restriction site: dashed line, see inset) and the resulting sticky ends are filled in with nucleotides, one of which is biotinylated (purple dot). Ligation is performed under extremely dilute conditions to create chimeric molecules; the HindIII site is lost and a NheI site is created (inset). DNA is purified and sheared. Biotinylated junctions are isolated with streptavidin beads and identified by paired-end sequencing.

(B) Hi-C produces a genome-wide contact matrix. The submatrix shown here corresponds to intrachromosomal interactions on chromosome 14. Each pixel represents all interactions between a 1Mb locus and another 1Mb locus; intensity corresponds to the total number of reads (0-50). Tick marks appear every 10Mb.

(C, D) We compared the original experiment to a biological repeat using the same restriction enzyme (C, range: 0-50 reads) and to results with a different restriction enzyme (D, range: 0-100 reads, NcoI).

Topological domains come later: they are basically regions of the genome such that looping interactions tend to occur between elements within the same domain.

I hope this helps a bit.

Hi Gjain, Thanks a lot. It did help me a lot.

I am happy to help.

3.6 years ago
Czech Republic, Brno, CEITEC
mikhail.shugay wrote:

I suspect those are the coordinates of the corresponding HindIII sites that are interacting. Also note that the raw reads need further processing: they are usually binned into 500 kb regions, and the resulting interaction matrix is normalized for biases and smoothed. See the Tanay group's web page for the pipeline and details.
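
To make the binning step concrete, here is a minimal Python sketch that turns lines like the two quoted in the question into per-bin contact counts. It assumes the seven whitespace-separated columns seen in those two lines (read ID, then chromosome, position and strand for each end). The names `parse_pair` and `bin_contacts` are made up for illustration. It only counts read pairs per pair of 500 kb bins and does none of the bias normalization or smoothing, so it is not the pipeline used in the paper.

```python
from collections import Counter

BIN_SIZE = 500_000  # 500 kb, the bin size mentioned above


def parse_pair(line):
    """Split one line into (chrom1, pos1, strand1, chrom2, pos2, strand2).

    Assumes seven whitespace-separated columns, as in the example lines.
    """
    _read_id, chrom1, pos1, strand1, chrom2, pos2, strand2 = line.split()
    return chrom1, int(pos1), strand1, chrom2, int(pos2), strand2


def bin_contacts(lines, bin_size=BIN_SIZE):
    """Count read pairs per unordered pair of (chromosome, bin index)."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom1, pos1, _, chrom2, pos2, _ = parse_pair(line)
        a = (chrom1, pos1 // bin_size)
        b = (chrom2, pos2 // bin_size)
        # Sort the two ends so that A<->B and B<->A share one key.
        counts[tuple(sorted((a, b)))] += 1
    return counts


example = [
    "HWI-ST216_0375:2:1206:9502:51416#0 chr1 148 - chr19 63789745 -",
    "HWI-ST216_0375:2:1215:19504:87026#0 chr1 6071 - chr13 50507738 -",
]
contacts = bin_contacts(example)
for (a, b), n in sorted(contacts.items()):
    print(a, b, n)
```

Each key is a pair of (chromosome, bin index) tuples, so the position 63789745 on chr19 ends up in bin 127, which covers 63,500,000 to 63,999,999. That gives the region-to-region view asked about in the question, at the chosen bin size.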

