Is UniProt part of the browser release agreement?
22 months ago

I am reviewing a protein's entry in UniProt and how aspects of that entry differ from its entries in various other databases, and I am getting confused about where each database gets its data. UniProt, Ensembl, etc. say they are all under the umbrella of EMBL-EBI, so why would their data be different? I found out about the Browser Release Agreement, put in place so that databases share and analyse their data to stop errors and inconsistencies, but I can't figure out whether UniProt is part of that agreement. Please help!

UniProt genome browsers • 363 views
22 months ago

UniProt is maintained by the UniProt Consortium, which consists of the SIB Swiss Institute of Bioinformatics, EMBL-EBI and PIR (Protein Information Resource); see https://www.uniprot.org/help/about.

UniProt has been released every 4 weeks for several years, but will switch to releases every 8 weeks as of 2020.

Translations of nucleotide sequences submitted to one of the members of the International Nucleotide Sequence Database Collaboration (INSDC), i.e. the European Nucleotide Archive (ENA), GenBank and the DNA Data Bank of Japan (DDBJ), are included in UniProtKB/TrEMBL at every release, and most cross-references are also updated at every release.
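If you want to compare what an entry currently points to in other databases, a minimal sketch like the one below may help. It assumes the current UniProt REST endpoint pattern (`https://rest.uniprot.org/uniprotkb/<accession>.json`) and its `uniProtKBCrossReferences` field, which may differ from what was available when this answer was written. The helper names, the accession `X00000` and the identifiers in `sample_entry` are made up for illustration and are not real records.

```python
import json
from urllib.request import urlopen

# Assumed URL pattern for the UniProt REST API; check the UniProt help pages.
UNIPROT_ENTRY_URL = "https://rest.uniprot.org/uniprotkb/{accession}.json"


def fetch_entry(accession):
    """Download one UniProtKB entry as a dict (needs network access)."""
    with urlopen(UNIPROT_ENTRY_URL.format(accession=accession)) as response:
        return json.load(response)


def cross_references(entry, database):
    """Return the cross-reference IDs an entry lists for one database."""
    return [
        xref["id"]
        for xref in entry.get("uniProtKBCrossReferences", [])
        if xref.get("database") == database
    ]


# Hand-written stand-in for a downloaded entry (hypothetical identifiers),
# so the helper can be tried without touching the network.
sample_entry = {
    "primaryAccession": "X00000",
    "uniProtKBCrossReferences": [
        {"database": "EMBL", "id": "AB000001"},
        {"database": "Ensembl", "id": "ENST00000000001"},
        {"database": "EMBL", "id": "AB000002"},
    ],
}

print(cross_references(sample_entry, "EMBL"))
```

For a real entry you would call `cross_references(fetch_entry(accession), "Ensembl")` with an accession of your choice and compare the result with what the other database shows, keeping in mind that the two may be out of step between releases.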

There can be synchronization delays of 5 to 8 weeks between the update of a related database and the appearance of the relevant data in the public version of UniProtKB. These delays will become longer when the release cycle switches from 4 to 8 weeks.

As UniProt provides neither a genome browser nor genomic data, it is not part of the Browser Genome Release Agreement: https://www.ensembl.org/info/about/legal/browser_agreement.html

22 months ago
GenoMax 107k

The UniProt database is described in a publication in the special NAR (Nucleic Acids Research) database issue. That paper is freely available, so you should be able to read it. There are two parts to the UniProt database. The Swiss-Prot section is human-curated and is the gold standard for protein sequence information. TrEMBL, on the other hand, is larger but is not human-curated, so it is likely to contain some erroneous information.

Ensembl is a larger project that "is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms".

If you have not come across it yet, the annual database issue of Nucleic Acids Research is a good place to get answers. The 2020 edition should be out next month.