Forum: UniProtKB Accession Number Format To Be Extended To 10 Characters
10.4 years ago

UniProtKB accession numbers currently consist of 6 alphanumerical characters. With our projected growth of UniProtKB, we expect to use up all accession numbers of this format in 2014. We will therefore extend the format to 10 alphanumerical characters.

Read more here: and contact the UniProt helpdesk with any comments you might have.

uniprot web-service • 3.0k views
10.4 years ago
Michael 54k

Reminds me of the transition from IPv4 to IPv6 ;) That would give 2.611467e+13 new IDs following the new scheme. If we assume there are 10 million species on earth and each contributes on average 20,000 proteins (total 2e+11), then these numbers should be sufficient.
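
As a rough check of those numbers, here is a minimal Python sketch. It assumes the 10-character accessions follow the pattern `[A-NR-Z][0-9][A-Z][A-Z0-9]{2}[0-9][A-Z][A-Z0-9]{2}[0-9]`, which is taken from UniProt's published accession format and is not quoted in this thread. The species and proteins-per-species figures are the guesses from this post, and all variable names are made up for illustration.

```python
# Count the 10-character accessions allowed by the ASSUMED pattern
# [A-NR-Z][0-9][A-Z][A-Z0-9]{2}[0-9][A-Z][A-Z0-9]{2}[0-9]
FIRST_LETTER = 23   # A-N and R-Z (O, P and Q are excluded)
DIGIT = 10          # 0-9
LETTER = 26         # A-Z
ALNUM = 36          # A-Z plus 0-9

# One repeated block: [A-Z][A-Z0-9]{2}[0-9]
block = LETTER * ALNUM**2 * DIGIT

# First letter, one digit, then two blocks
new_ids = FIRST_LETTER * DIGIT * block**2

# Back-of-the-envelope demand: 10 million species x 20,000 proteins each
demand = 10_000_000 * 20_000

print(f"new 10-character IDs: {new_ids:.6e}")
print(f"estimated demand:     {demand:.0e}")
print(f"headroom factor:      {new_ids / demand:.0f}x")
```

Under these assumptions the ID space exceeds the estimated demand by roughly two orders of magnitude.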

I'd expect a question on BioStar like: "How can I map from old to new UniProtKB accession numbers?" But if I understand correctly, short and long versions will co-exist: already assigned ANs should not change, and new ANs will only be assigned to new proteins? Further, is it a problem that, according to your definition, some new IDs can have valid or existing old ANs as prefixes?
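
To make the prefix concern concrete, here is a small Python sketch. It assumes the combined pattern from UniProt's published accession format (not quoted in this thread). The string `A0A022YWF9` is used only because it fits the 10-character shape, and `OLD_ONLY` and `is_accession` are hypothetical names for illustration.

```python
import re

# ASSUMED combined pattern for 6- and 10-character accessions
ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

# A legacy pattern that only knows the 6-character format
OLD_ONLY = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9][A-Z][A-Z0-9]{2}[0-9]"
)


def is_accession(text: str) -> bool:
    """Return True only if the whole string is a well-formed accession."""
    return ACCESSION.fullmatch(text) is not None


long_ac = "A0A022YWF9"        # illustrative 10-character string
short_prefix = long_ac[:6]    # "A0A022", itself a well-formed 6-character AC

# Both the long form and its 6-character prefix are valid shapes
print(is_accession(long_ac), is_accession(short_prefix))

# Unanchored legacy matching silently truncates the long accession
print(OLD_ONLY.match(long_ac).group())
```

The practical takeaway is that code validating or extracting accessions should match the whole token (for example with `fullmatch` or explicit anchors) rather than accept the first six characters.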


Regarding the mapping: yes, you are correct, short and long versions will co-exist. New ACs are assigned to new entries, and already assigned ACs usually do not change. If they do need to change, this will be handled like with the current AC scheme: .

