Question

Mapping Gene and Protein names between UniProt, Swiss-Prot, and Entrez

4.1 years ago

whayes • 0

Ultimate goal: cross-species network alignment for functional prediction. Current problem: mapping BioGRID (ENTREZ_GENE) IDs to/from GO term databases.

I’m trying to produce a GAF or gene2go type file for historical releases of the GO database. On the Gene Ontology Archive (http://archive.geneontology.org/full/), I can’t find any GAF or gene2go files, only SQL databases which are huge and apparently require both SQL and Perl to regenerate the GAF files. That is too much work! So, first question: do GAF or gene2go files already exist for historical releases?

Then I found EBI’s releases (ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/old), and they use UniProt/Swiss-Prot names. I have figured out how to automatically map between various naming conventions using the IDENTIFIERS files that are released with BioGRID. However, I often find that a single name maps to multiple, very different IDs, so I can’t figure out which BioGRID gene/protein is annotated with which GO terms. Here’s a very specific example:

From the 21 April, 2010 release at ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/old/UNIPROT/goa_uniprot_gcrp.gpi.174.gz there is the following line:

9606     B5MCF5  GO:0005634      IDA     0       Putative uncharacterized protein STON1-GTF2A1L  C

From BioGRID’s IDENTIFIERS files, I find the following mappings for B5MCF5:

130414  B5MCF5  UNIPROT-ACCESSION
116226  B5MCF5  UNIPROT-ACCESSION

(I also can’t figure out how to make this editor break the lines rather than paragraphing them, sorry)

Unfortunately, those two BioGRID IDs on the left map to two different ENTREZ_GENE IDs:

116226  11037   ENTREZ_GENE
130414  286749  ENTREZ_GENE

So, should the above annotation be applied to Entrez gene 11037, or 286749?
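
To make the ambiguity concrete, here is a minimal Python sketch that chains the two lookups (UniProt accession to BioGRID ID, then BioGRID ID to Entrez gene) and lists every accession that ends up with more than one Entrez gene. It assumes tab-separated rows whose first three fields are BioGRID ID, identifier value, and identifier type, as in the lines quoted above; the real IDENTIFIERS files may carry a header or extra columns. The function names `parse_rows` and `uniprot_to_entrez` are made up for this illustration.

```python
from collections import defaultdict

# Rows copied from the post: BioGRID ID, identifier value, identifier type.
# Assumption: fields are tab-separated; real files may have more columns.
IDENTIFIER_ROWS = (
    "130414\tB5MCF5\tUNIPROT-ACCESSION\n"
    "116226\tB5MCF5\tUNIPROT-ACCESSION\n"
    "116226\t11037\tENTREZ_GENE\n"
    "130414\t286749\tENTREZ_GENE\n"
)


def parse_rows(text):
    """Yield (biogrid_id, value, id_type) from tab-separated lines."""
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        yield fields[0], fields[1], fields[2]


def uniprot_to_entrez(rows):
    """Map each UniProt accession to the set of Entrez gene IDs reachable
    through any BioGRID ID that lists that accession."""
    biogrid_by_uniprot = defaultdict(set)
    entrez_by_biogrid = defaultdict(set)
    for biogrid_id, value, id_type in rows:
        if id_type == "UNIPROT-ACCESSION":
            biogrid_by_uniprot[value].add(biogrid_id)
        elif id_type == "ENTREZ_GENE":
            entrez_by_biogrid[biogrid_id].add(value)

    mapping = {}
    for accession, biogrid_ids in biogrid_by_uniprot.items():
        genes = set()
        for biogrid_id in biogrid_ids:
            genes |= entrez_by_biogrid.get(biogrid_id, set())
        mapping[accession] = genes
    return mapping


mapping = uniprot_to_entrez(parse_rows(IDENTIFIER_ROWS))

# Accessions that cannot be assigned to a single Entrez gene.
ambiguous = {acc: sorted(genes) for acc, genes in mapping.items() if len(genes) > 1}
print(ambiguous)  # {'B5MCF5': ['11037', '286749']}
```

This only detects the one-to-many cases; it does not decide which Entrez gene the GO annotation belongs to, which is the open question here.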

Gene Ontology Protein Names • 840 views

updated 4.1 years ago by GenoMax 141k • written 4.1 years ago by whayes • 0

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!

4.1 years ago by GenoMax 141k