To vet the genomic coordinate input going into my application, I'm trying to write sensible limits that will be agnostic about organism (hopefully remaining so, even as more organisms are sequenced).
As an example of what I'm after, a coordinate pair should always be between 0 (or 1, depending on index) and some (hopefully sensible) upper bound, e.g. 10^10 bases.
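To make the kind of check concrete, here is a minimal sketch in Python. The names `MAX_COORD` and `is_valid_interval` are made up for illustration, and the bound is only a placeholder: the real value is exactly what I'm asking about.

```python
# Placeholder upper bound (an assumption, not a known biological limit).
MAX_COORD = 10**10


def is_valid_interval(start, end, max_coord=MAX_COORD, one_based=False):
    """Return True if (start, end) is a plausible coordinate pair.

    Both values must be integers, lie between the lower bound
    (0 or 1, depending on the indexing convention) and max_coord,
    and satisfy start <= end.
    """
    lower = 1 if one_based else 0
    for value in (start, end):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            return False
    return lower <= start <= end <= max_coord


if __name__ == "__main__":
    print(is_valid_interval(0, 1000))                  # zero-based pair
    print(is_valid_interval(0, 1000, one_based=True))  # 0 is out of range here
    print(is_valid_interval(5, MAX_COORD + 1))         # exceeds the bound
```

One practical consequence: a bound of this size does not fit in a 32-bit integer (2^32 is roughly 4.3 billion), so in a language with fixed-width integers the coordinates would need a 64-bit type.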
I've read that Protopterus aethiopicus has the largest genome at 133 billion base pairs. But what are the sizes of its chromosomes?
Alternatively, are there organisms with smaller genomes, but very large chromosomes?
Essentially, I'm looking for the largest chromosome irrespective of organism, so that I can multiply by a safety factor and be done. Does anyone know what this value might be, given those genomes that have been sequenced or otherwise had their genomic sizes approximated?
(I'm not so much interested in alternatives to setting an upper bound, such as using dynamically allocated strings of arbitrary length to hold the numerical values of the coordinates. I'm just interested in setting a 'safe' discrete value for this.)