Question: GenBank: GI Numbers vs. Accession Numbers?
9.1 years ago
Bio_X2Y wrote:

Some documentation I found suggests that a GenBank GI number will change each time the sequence changes - even if only one base is affected. The Accession number, on the other hand, remains the same.

However, the accession number is usually qualified with a version number suffix, e.g. "GL000191.1". As far as I know, this version number also increments each time the sequence changes.
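To make the Accession.Version format concrete, here is a minimal Python sketch that splits an identifier such as "GL000191.1" into its two parts. The function name and the regular expression are my own assumptions about the general shape of these identifiers (an accession followed by a dot and an integer version), not an official NCBI specification.

```python
import re

# Assumed pattern: an accession made of letters, digits and underscores,
# then a dot, then an integer version. This is an illustration only.
_ACC_VER = re.compile(r"^([A-Za-z0-9_]+)\.(\d+)$")


def split_accession_version(identifier):
    """Return (accession, version) for a string like 'GL000191.1'."""
    match = _ACC_VER.match(identifier.strip())
    if match is None:
        raise ValueError(f"not an accession.version identifier: {identifier!r}")
    return match.group(1), int(match.group(2))


print(split_accession_version("GL000191.1"))  # ('GL000191', 1)
```

An identifier without a version suffix (plain "GL000191") is rejected on purpose, since without the version it does not pin down one specific sequence.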

Does this mean that the relationship between GI number and the Accession.Version pair is one-to-one, and so either would be equally suitable as a unique identifier for a sequence?


Yes, that's right. Accession.Version is probably preferred by humans, and GI number by machines.

@Pierre, hmmmm, while I can see the funny side, I had read that page, but I still wasn't 100% clear whether the relationship was always one-to-one... :) For example, I once thought that the version number would increment if the sequence changed OR the meta-information of the sequence changed, e.g. the gene symbol. I now know that isn't the case, but I thought maybe the GI would increment on a meta change....

9.1 years ago
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum wrote:

Yes, you're right: see

The two systems of identifiers run in parallel to each other. That is, when any change is made to a sequence, it receives a new GI number AND an increase to its version number.
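The rule quoted above can be sketched as a toy model in Python. Everything here is hypothetical (the class, its fields, and the local GI counter); real GI numbers are assigned by NCBI. It only illustrates the behaviour described in this thread: a sequence change gives a new GI and a higher version, while an annotation change touches neither.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical local counter standing in for NCBI's GI assignment.
_next_gi = count(1000)


@dataclass
class SequenceRecord:
    accession: str
    sequence: str
    annotation: str = ""
    version: int = 1
    gi: int = 0

    def __post_init__(self):
        # Assign a fresh GI when none was supplied.
        if self.gi == 0:
            self.gi = next(_next_gi)

    def update_sequence(self, new_sequence):
        # Any real change to the sequence bumps the version AND the GI.
        if new_sequence != self.sequence:
            self.sequence = new_sequence
            self.version += 1
            self.gi = next(_next_gi)

    def update_annotation(self, new_annotation):
        # Annotation changes leave both identifiers untouched.
        self.annotation = new_annotation

    @property
    def accession_version(self):
        return f"{self.accession}.{self.version}"


rec = SequenceRecord("GL000191", "ACGT")
before = (rec.accession_version, rec.gi)
rec.update_annotation("gene symbol changed")
assert (rec.accession_version, rec.gi) == before
rec.update_sequence("ACGA")
assert rec.accession_version == "GL000191.2" and rec.gi != before[1]
```

In this model the GI and the Accession.Version pair always change together, which is the one-to-one relationship asked about in the question.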

9.1 years ago
São Paulo, Brazil
Jarretinha wrote:

Your observation is true. They run in parallel, but the GI system is older. So, there are a lot of sequences tagged with version 1 but with many GIs in their history. For example, check the history of L42023 and its subsequences at the NCBI Sequence Revision History. You'll see many GI changes without a version number associated with them.

Besides that, GIs and versions change only when the sequence itself changes, not the annotations. Changes in annotations can be traced only by modification date. So, for tracking annotation changes, GIs and versions are mostly useless.

