I posted to reddit/askscience/. I posted to Quora.
Then I realized there is a website where someone might actually KNOW what's going on!
So I present to you the best explanation I've found so far, from Wikipedia:
Current methods can directly sequence only relatively short (300–1000 nucleotides long) DNA fragments in a single reaction. The main obstacle to sequencing DNA fragments above this size limit is insufficient power of separation for resolving large DNA fragments that differ in length by only one nucleotide. In all cases the use of a primer with a free 3' end is essential.
I'm interpreting "insufficient power of separation for resolving large DNA fragments that differ in length by only one nucleotide" to mean that the input needs to be DNA of exactly, say, 523 bp in length (or whatever the machine specifies).
To ME (with my limited knowledge of the subject matter) this seems trivial. If what is needed is to "resolve" DNA lengths with more precision, why can't we just lower the viscosity of the gel, make the bath longer, and run the electrophoresis for an extended period of time?
... but this still doesn't make sense to me.
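To put rough numbers on the "differ in length by only one nucleotide" part of the quote, here is a minimal Python sketch. It assumes (my simplification, not something the Wikipedia passage states) that what the separation has to distinguish is the fractional length difference between a fragment of n nucleotides and one of n + 1; real gel or capillary mobility is more complicated than that. The function name is made up for illustration.

```python
def relative_length_difference(n):
    """Fractional length difference between fragments of n and n + 1 nucleotides.

    Assumption for illustration only: resolution is treated as depending on
    this fraction, which ignores the real physics of electrophoresis.
    """
    if n <= 0:
        raise ValueError("fragment length must be positive")
    return 1.0 / n


if __name__ == "__main__":
    # Compare short, typical, and long fragments.
    for n in (50, 300, 1000, 10000):
        frac = relative_length_difference(n)
        print(f"{n:>6} nt vs {n + 1:>6} nt: {frac:.4%} longer")
```

The fraction shrinks as 1/n, so neighboring fragments become relatively harder to tell apart the longer they get.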
In the process of prepping a sample for DNA sequencing, the researcher will run an electrophoresis and remove a specific chunk of DNA corresponding to a particular sequence length from the gel.
The question is: Why are we selecting particular lengths of DNA instead of just using the longest possible lengths we've got in the DNA solution? Note: As each sequencing technology is different, feel free to specify which you are most familiar with (I'm interested in all their limitations).
Just a note that surprised me when I first discovered it: on an Illumina machine, the enzymes are actually "climbing down" the DNA (not up). Sequencing happens from the "top" of the read down towards the flowcell.
I would imagine secondary structure also becomes a problem for longer strands of DNA.
In my mind, the enzymes are little monkeys planting colored flags on the trunk of a coconut tree while a helicopter takes pictures from above. No way is a monkey going to plant a flag and then take the flags out one at a time while climbing down. The monkey is just going to scramble down and forget the flags. So it has to be up.
I work with computers and large text files and scripts. Enzymes, monkeys, mostly the same thing, right?
So I know that in the case of PacBio the fluorescent component is a custom nucleotide. Is this the case with Illumina? I don't understand why that would chemically fade. Wouldn't a solution then be to constantly wash a fresh chemical solution over them?
For PacBio the degradation is apparently very binary: the polymerase in the well fails due to light exposure after some period (according to their marketing at least). The mechanism is a bit different, and as a result it's possible, although unlikely, to get extremely long reads in the >10 kb range with that technology.