The resolution of a Hi-C dataset (a) depends heavily on the details of the protocol and (b) is actually a poorly defined, though real, measure.
(a) The number of reads is not the only variable that determines the quality of a Hi-C dataset.
First, if the total number of unique Hi-C molecules in your sample is low (as they say, "the library has low complexity"), nothing good can come out of extra sequencing: additional reads mostly resample molecules you have already seen and end up as duplicates.
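To get a feel for how quickly extra sequencing stops paying off, here is a minimal sketch. It assumes that every unique molecule in the library is equally likely to be sequenced (simple sampling with replacement), which real libraries only approximate; the function name and the library size of 50 million molecules are made up for illustration.

```python
import math


def expected_unique(library_size, reads_sequenced):
    """Expected number of distinct molecules observed after sequencing
    `reads_sequenced` reads from a library containing `library_size`
    unique molecules, assuming uniform sampling with replacement."""
    return library_size * (1.0 - math.exp(-reads_sequenced / library_size))


# Hypothetical library complexity, chosen only for illustration.
library_size = 50_000_000

for reads in (25_000_000, 50_000_000, 100_000_000, 200_000_000, 400_000_000):
    unique = expected_unique(library_size, reads)
    # Fraction of the sequenced reads that are not duplicates.
    print(f"{reads:>12,d} reads -> {unique:>12,.0f} unique "
          f"({unique / reads:.0%} of reads)")
```

Under this assumption the number of unique molecules saturates at the library size, so past a certain depth almost every new read is a duplicate.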
Second, informative Hi-C molecules (i.e. molecules formed by ligation between two loci that are spatially close in a nucleus) represent only a fraction of any Hi-C library; every library also contains molecules formed by random ligations in solution, fragments of unligated DNA and pieces of self-circularized DNA. Thus, depending on the protocol and on how well it was executed, two sequenced libraries with the same number of reads may contain different numbers of informative interactions. Even worse, randomly ligated molecules cannot be computationally removed from a library and, if present in large numbers, may completely mask true interactions, especially at large distances or in trans.
Third, no Hi-C experiment can achieve a resolution finer than a few average restriction fragments. In particular, the commonly used HindIII is a 6bp-cutter and on average cuts every ~4kb (4^6 = 4096 bp in a random sequence); thus, any experiment using HindIII has a maximal resolution of ~10kb.
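The arithmetic behind this limit can be sketched as follows. The calculation assumes a random sequence with equal base frequencies (real genomes deviate from this), and the choice of 3 fragments per bin is only my reading of "a few"; the function names are made up for illustration. MboI (GATC) is included purely as an example of a 4bp-cutter for comparison.

```python
def expected_fragment_length(site_length, alphabet_size=4):
    """Mean distance in bp between cut sites for an enzyme with a
    `site_length`-bp recognition site, assuming a random sequence
    with equal base frequencies."""
    return alphabet_size ** site_length


def resolution_floor(site_length, fragments_per_bin=3):
    """Rough lower bound on bin size in bp, taking 'a few' fragments
    per bin to mean `fragments_per_bin` (an assumption)."""
    return fragments_per_bin * expected_fragment_length(site_length)


for name, site_length in (("HindIII (AAGCTT)", 6), ("MboI (GATC)", 4)):
    frag = expected_fragment_length(site_length)
    floor = resolution_floor(site_length)
    print(f"{name}: ~{frag} bp per fragment, resolution floor ~{floor} bp")
```

For a 6bp-cutter this gives 4096 bp per fragment and a floor of roughly 12 kb, in line with the ~10kb figure; a 4bp-cutter brings the floor down to under 1 kb, at least in principle.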
See a more detailed discussion in this review by Dekker's lab.
(b) The issue is complicated by the fact that, AFAIK, there is no published, widely accepted definition of the resolution of a Hi-C dataset/protocol. This is not to say that all datasets are created equal - the original Hi-C dataset from 2009 could not really distinguish TADs, and it took a high read depth dataset produced using a modified protocol by Lieberman-Aiden's lab in late 2014 to clearly distinguish individual loop interactions at corners of TADs. Now, the classic definition of resolution is the minimal size of a feature that the method can distinguish. For Hi-C, it's not entirely clear what specific fine features are expected to be seen in any specific sample; without knowing such features one cannot tell the resolution of a chosen dataset. Obviously, it is not an unsolvable problem, but officially it has not been solved yet. There are some standardization efforts currently undertaken by the freshly formed 4D Nucleome consortium, so hopefully in two to three years the definition of resolution in Hi-C will be clearer.