Uniprot id to GO terms
21 months ago
Shweta • 0

I have UniProt IDs for 5000 prokaryotic proteins. Could anyone suggest how to retrieve Gene Ontology annotations for these proteins, given that they belong to different taxa?

UniProt GO • 1.4k views

Your previous questions were answered: VCF file evaluation; T2T human genome; Conversion of Kegg id to uniprot id

Please accept the answers so the question is marked solved on the website. To do that, click on the green check mark on the left side of the answer.

21 months ago

You can upload your list of UniProtKB identifiers to the UniProt Batch retrieval service at https://www.uniprot.org/id-mapping. Select the option to map from UniProtKB to UniProtKB. Once you have your results, click on "Download", select the TSV (tab-separated) format, and choose the columns you would like to see in your table. There are options for GO terms, GO IDs, or GO terms separated by ontology (molecular function, biological process, cellular component).

Please don't hesitate to contact the UniProt helpdesk if you have any questions on how to use this service.
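Once the TSV is downloaded, it is often handier to have one row per protein/GO ID pair. The following is only a rough sketch: it assumes the first column is the accession and the second column holds GO IDs separated by "; " (check the header of your own download, since the column order depends on what you selected). The file name, the `go_long` helper, and the sample rows are made up for illustration.

```shell
# Made-up sample standing in for the downloaded TSV (not real annotations).
printf 'Entry\tGene Ontology IDs\nP00001\tGO:0003677; GO:0005524\nP00002\t\n' > uniprot_go_sample.tsv

# Hypothetical helper: print "accession<TAB>GO ID", one line per annotation.
# Assumes column 1 = accession and column 2 = "; "-separated GO IDs.
go_long() {
  awk -F'\t' 'NR > 1 && $2 != "" {
    n = split($2, ids, /; */)          # split the GO ID list on "; "
    for (i = 1; i <= n; i++) print $1 "\t" ids[i]
  }' "$1"
}

go_long uniprot_go_sample.tsv
```

Entries without any GO ID (the second sample row) are simply skipped, so you may want to count them separately.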

21 months ago

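Something like:

# Join your accession list to the GOA GAF on the UniProt accession (GAF column 2).
# LC_ALL=C keeps both sort orders consistent with what join expects.
export LC_ALL=C
join -t $'\t' -1 1 -2 2 \
     <(sort your.list.of.uniprot.ids) \
     <(wget -O - "http://current.geneontology.org/annotations/goa_uniprot_all.gaf.gz" | gunzip -c | grep -v '^!' | sort -T . -t $'\t' -k2,2)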

The above is correct, but just to make this clear, as it's easily misunderstood: goa_uniprot_all.gaf does not contain all GO annotations, just the ones created or maintained by GOA, a (fantastic) team of curators at EBI. However, since the original list contains only prokaryotes, and the taxa in that list are rather less likely to have a dedicated Model Organism Database (like MGI, RGD, or SGD) with a specific GAF, the GOA file is indeed probably the best source for this particular user. goa_uniprot_all.gaf contains electronic annotations (IEAs), which I suspect would make up the bulk of annotations for most prokaryotes. The "all" in the filename is to contrast with goa_uniprot_all_noiea.gaf, which contains no IEAs, only manual annotations (from GOA).
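To see how much of your result is electronic versus manual, you could tally the evidence codes. This is a minimal sketch assuming standard GAF layout, where the evidence code is column 7; if the `join` output from the earlier answer is used with a single-column ID list, only columns 1 and 2 should swap, so column 7 should still be the evidence code, but do verify on your own output. The `gaf_sample` and `count_evidence` names and the sample lines are invented for illustration.

```shell
# Made-up GAF-style lines (17 tab-separated columns), not real annotations.
gaf_sample() {
  printf '!gaf-version: 2.2\n'
  printf 'UniProtKB\tP00001\tgeneA\tenables\tGO:0003677\tGO_REF:0000043\tIEA\t\tF\t\t\tprotein\ttaxon:83333\t20240101\tUniProt\t\t\n'
  printf 'UniProtKB\tP00001\tgeneA\tenables\tGO:0005524\tGO_REF:0000043\tIEA\t\tF\t\t\tprotein\ttaxon:83333\t20240101\tUniProt\t\t\n'
  printf 'UniProtKB\tP00002\tgeneB\tinvolved_in\tGO:0006281\tPMID:0000000\tIDA\t\tP\t\t\tprotein\ttaxon:83333\t20240101\tUniProt\t\t\n'
}

# Hypothetical helper: count annotations per evidence code (column 7),
# skipping "!" header lines. Reads GAF lines on stdin.
count_evidence() {
  awk -F'\t' '!/^!/ { n[$7]++ } END { for (e in n) print e "\t" n[e] }' | sort
}

gaf_sample | count_evidence
```

On real data you would pipe in the joined output (or `gunzip -c` of the GAF) instead of `gaf_sample`.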

If you can divide by taxon, there may be additional annotations in other files; for example, ecocyc.gaf.gz would be a good source for taxon:83333 (E. coli).
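Dividing by taxon can be sketched with the taxon column of the GAF. This assumes standard GAF layout, where column 13 holds `taxon:<id>` (optionally followed by `|` and a second taxon for multi-organism annotations, in which case the first entry is the organism encoding the gene product). The `filter_taxon` and `gaf_two_taxa` names and the sample lines are invented for illustration.

```shell
# Made-up GAF-style lines from two taxa (not real annotations).
gaf_two_taxa() {
  printf 'UniProtKB\tP00001\tgeneA\tenables\tGO:0003677\tGO_REF:0000043\tIEA\t\tF\t\t\tprotein\ttaxon:83333\t20240101\tUniProt\t\t\n'
  printf 'UniProtKB\tP00003\tgeneC\tenables\tGO:0005524\tGO_REF:0000043\tIEA\t\tF\t\t\tprotein\ttaxon:224308\t20240101\tUniProt\t\t\n'
}

# Hypothetical helper: keep GAF lines whose gene product belongs to the
# given NCBI taxon ID (first entry of column 13). Reads GAF lines on stdin.
filter_taxon() {
  awk -F'\t' -v want="taxon:$1" '!/^!/ {
    split($13, t, "|")                 # first entry = organism of the gene product
    if (t[1] == want) print
  }'
}

gaf_two_taxa | filter_taxon 83333
```

The taxon-specific subset can then be compared against, or supplemented with, a file such as ecocyc.gaf.gz.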

GO does have plans to make the annotation files specific to taxon(s), instead of the currently available files that are sorted by assigning group, but this is a complicated issue and we have no delivery date on it.
