8.2 years ago by
Athens, GA, USA
The meaning of DNA sequence complexity has shifted from its original usage, which is part of the reason you are having trouble finding a clear definition.
The original usage of complexity comes from studies that used reassociation kinetics (Cot curve analysis) to measure how repetitive or unique a genome was. In short, these studies measured how fast denatured DNA reassociated, with faster reassociation implying increased repetitiveness. The "complexity" of a genome was then measured by the Cot value (initial DNA concentration multiplied by time) at which half of the DNA had reassociated.
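To make the Cot idea a bit more concrete, here is a minimal sketch assuming ideal second-order reassociation kinetics, where the fraction of DNA still single-stranded is C/C0 = 1 / (1 + k * C0t). The function names and the rate constants are made up for illustration and are not taken from any particular study or software package.

```python
def fraction_single_stranded(cot, k):
    """Fraction of DNA still single-stranded at a given Cot value,
    assuming ideal second-order kinetics: C/C0 = 1 / (1 + k * Cot)."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of the DNA has reassociated (Cot1/2 = 1/k)."""
    return 1.0 / k


# Hypothetical rate constants: a repetitive component reassociates
# quickly (large k), a unique component slowly (small k).
components = {"highly repetitive": 100.0, "unique": 0.01}

for name, k in components.items():
    half = cot_half(k)
    remaining = fraction_single_stranded(half, k)
    print(f"{name}: Cot1/2 = {half}, fraction single-stranded there = {remaining}")
```

Under this model a larger Cot1/2 means slower reassociation, which is what was read as higher complexity (more unique sequence).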
The more modern usage takes this original term and tries to apply it (loosely) to actual DNA sequences, with the same notion that more complex sequences have a higher degree of uniqueness, and vice versa. I am not sure that there is a widely accepted definition of this modern usage applied to the "complexity" of DNA sequences. My suspicion is that it is operationally defined with respect to the algorithm used (simple sequence repeat detection, compression, etc.). A little googling found this definition, which seems reasonable (but no reference is provided):
"The complexity of a sequence is defined as the longest non-repetitive sequence that can be derived from a sequence", e.g.