Question: (Closed) Organism With The Chromosome With The Largest Number Of Bases?
5.2 years ago by
Alex Reynolds21k
Seattle, WA USA
Alex Reynolds21k wrote:

To vet the genomic coordinate input going into my application, I'm trying to write sensible limits that will be agnostic about organism (hopefully remaining so, even as more organisms are sequenced).

As an example of what I'm after, a coordinate pair should always be between 0 (or 1, depending on index) and some (hopefully sensible) upper bound, e.g. 10^10 bases.

I've read that Protopterus aethiopicus has the largest genome at 133 billion base pairs. But what are the sizes of its chromosomes?

Alternatively, are there organisms with smaller genomes, but very large chromosomes?

Essentially, I'm looking for the largest chromosome irrespective of organism, so that I can multiply by a safety factor and be done. Does anyone know what this value might be, given those genomes that have been sequenced or otherwise had their genomic sizes approximated?

(I'm not so much interested in alternatives to setting an upper bound, such as using dynamically allocated strings of arbitrary length to hold the numerical values of the coordinates. I'm just interested in setting a 'safe' discrete value for this.)

chromosome • 3.2k views
modified 5.2 years ago by a.zielezinski7.4k • written 5.2 years ago by Alex Reynolds21k

exact duplicate of

The longest chromosome > sizeof(int32)

modified 5.2 years ago • written 5.2 years ago by Pierre Lindenbaum102k

I think the first issue to decide is whether the representation should use 32-bit or 64-bit integers. After that I would use common sizes (magnitudes that accommodate human genome sizes) and only allow longer chromosomes if the user specifically overrides that setting. That would head off a large number of situations where people accidentally compute grossly incorrect start/end coordinates and thus proceed to nuke their systems.
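A minimal sketch of that "sensible default plus explicit override" idea, in Python. The names `DEFAULT_MAX_COORD` and `validate_interval` are hypothetical, and the default of 10^9 is only an assumption chosen to sit comfortably above the largest human chromosome (roughly 250 Mb); it is not a value taken from this thread.

```python
# Hypothetical default: well above human chr1 (~250 Mb), well below absurd values.
DEFAULT_MAX_COORD = 1_000_000_000


def validate_interval(start, end, max_coord=DEFAULT_MAX_COORD, zero_based=True):
    """Return (start, end) if the pair looks sane, otherwise raise ValueError.

    max_coord is the upper bound; callers working with unusually large
    chromosomes must raise it explicitly rather than getting it silently.
    """
    lower = 0 if zero_based else 1
    if start < lower:
        raise ValueError(f"start {start} is below the lower bound {lower}")
    if end < start:
        raise ValueError(f"end {end} is smaller than start {start}")
    if end > max_coord:
        raise ValueError(
            f"end {end} exceeds the limit {max_coord}; "
            "pass a larger max_coord to override"
        )
    return start, end
```

With the default, `validate_interval(0, 5_000_000_000)` is rejected, while `validate_interval(0, 5_000_000_000, max_coord=10**10)` is accepted because the caller asked for the larger limit.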

written 5.2 years ago by Istvan Albert ♦♦ 75k

I don't understand why finding the max size of a chromosomal outlier is a worthwhile ("sensible") upper limit.

written 5.2 years ago by AGS230

It's sensible because some software won't handle lengths greater than 2^31 - 1. For example, the Picard library uses a signed 32-bit integer to store the position (max value = 2,147,483,647). Adding one base to this value would wrap the integer around to a negative value.
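The wraparound can be reproduced with a short sketch. Picard itself is Java; the snippet below only emulates a signed 32-bit integer using Python's standard `ctypes` module, and the helper names `as_int32` and `fits_int32` are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest signed 32-bit value


def as_int32(value):
    """Reinterpret a Python int as a C signed 32-bit integer.

    ctypes does no overflow checking, so out-of-range values wrap around.
    """
    return ctypes.c_int32(value).value


def fits_int32(position):
    """True if a non-negative position can be stored in a signed 32-bit int."""
    return 0 <= position <= INT32_MAX


print(as_int32(INT32_MAX))      # still the maximum value
print(as_int32(INT32_MAX + 1))  # one base more wraps to a negative number
```

A check like `fits_int32` is one way to reject a coordinate before it reaches a tool that stores positions in 32 bits.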

written 5.2 years ago by Pierre Lindenbaum102k

In that case, you have your answer as to "max" chromosome size, no?

written 5.2 years ago by AGS230

We're doing error checking, so the goal is to be reasonably agnostic about genome. Thanks again for your help.

written 5.2 years ago by Alex Reynolds21k

In C, C++ or Java you wouldn't use a signed 32-bit integer to store the size of the human genome.

written 5.2 years ago by Pierre Lindenbaum102k
5.2 years ago by
a.zielezinski7.4k
a.zielezinski7.4k wrote:

Paris japonica has about 150 billion bases over 40 chromosomes and most likely has the largest chromosomes.

See: The largest eukaryotic genome of them all?
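A rough back-of-the-envelope check of what those figures imply for integer width, as a Python sketch. The individual chromosome sizes are not given here, so the mean is used only as a lower bound on the largest chromosome; `SAFETY_FACTOR` and the choice of the whole genome as a worst case are assumptions for illustration.

```python
GENOME_BASES = 150_000_000_000  # approximate genome size quoted above
CHROMOSOMES = 40                # chromosome count quoted above

INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1
SAFETY_FACTOR = 10              # hypothetical margin, not from the thread

# The largest chromosome is at least as long as the mean chromosome.
mean_chromosome = GENOME_BASES // CHROMOSOMES

# Conservative cap: pretend a single chromosome could hold the whole genome,
# then multiply by the safety factor.
upper_bound = GENOME_BASES * SAFETY_FACTOR

print(mean_chromosome)              # 3750000000
print(mean_chromosome > INT32_MAX)  # True: signed 32-bit is already too small
print(upper_bound <= INT64_MAX)     # True: signed 64-bit has ample room
```

Under these assumptions even the average chromosome overflows a signed 32-bit coordinate, so a 64-bit type with a cap on the order of 10^12 would cover the case with a wide margin.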

written 5.2 years ago by a.zielezinski7.4k
The thread is closed. No new answers may be added.