Question: How to convert UniProt IDs in bulk to GO terms/IDs?
15 months ago by
mirzaei86.vahid wrote:


I have worked on a transcriptome and obtained UniProt IDs from blastx output (nearly 20K UniProt accessions). For my project I need to do GO analysis and pathway analysis on them, and I cannot use Trinotate because I did the analysis with different software.

How can I extract GO IDs/terms from a bulk list of UniProt accessions, and then run an enrichment analysis on them?


rna-seq uniprot assembly • 2.3k views
14 months ago by
Elisabeth Gasteiger wrote:

To extract GO terms for a list of UniProtKB identifiers, use the UniProt batch retrieve tool as suggested above, but instead of mapping UniProtKB IDs to an external database, map from UniProtKB to UniProtKB.

Once you have your result, you can click on "Columns" and customize your result table layout.

The customization interface contains a section "Gene Ontology", where you can choose to see a complete list, separate columns for the three ontologies (molecular function, biological process and cellular component), or a list of identifiers only.

You can remove any columns that are not of interest in this context, and then download the results in tab-delimited format.
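
Once the tab-delimited file is downloaded, it can be turned into an accession-to-GO mapping with a few lines of Python. This is only a minimal sketch: the column names "Entry" and "Gene Ontology IDs" and the semicolon separator are assumptions about the export, so adjust them to the header row of your own file. The example rows are made up for illustration and are not real annotations.

```python
import csv
import io


def read_go_ids(tsv_text, id_column="Entry", go_column="Gene Ontology IDs"):
    """Map each accession to the list of GO IDs in a tab-delimited export.

    The default column names are assumptions; change them to match
    the header row of the file you actually downloaded.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mapping = {}
    for row in reader:
        raw = row.get(go_column) or ""
        # GO IDs are assumed to be separated by semicolons within one cell.
        mapping[row[id_column]] = [g.strip() for g in raw.split(";") if g.strip()]
    return mapping


# Made-up rows for illustration only, not real UniProt annotations.
example = (
    "Entry\tGene Ontology IDs\n"
    "ACC0001\tGO:0000001; GO:0000002\n"
    "ACC0002\t\n"
)
go_by_accession = read_go_ids(example)
print(go_by_accession)
```

For a real file, read its contents with `open(path, encoding="utf-8").read()` and pass the text to `read_go_ids`.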

Or you can access the UniProt website programmatically, with one query per accession number. For a given UniProtKB identifier, e.g. Q9ZUA2, you can use this URL
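
The URL from the original post is not reproduced here. As a hedged sketch of what one query per accession could look like today, the snippet below builds a request URL, assuming the current REST endpoint `https://rest.uniprot.org/uniprotkb/search` and the return field names `accession` and `go_id`. Both are assumptions that are not taken from this thread, so check them against the UniProt API documentation before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: current UniProt REST search endpoint (not given in the thread).
BASE_URL = "https://rest.uniprot.org/uniprotkb/search"


def go_query_url(accession, fields=("accession", "go_id")):
    """Build a tab-separated query URL for a single accession.

    The field names are assumptions about the API's return fields.
    """
    params = {
        "query": f"accession:{accession}",
        "fields": ",".join(fields),
        "format": "tsv",
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_text(url, timeout=30):
    """Download the response body as text (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = go_query_url("Q9ZUA2")
print(url)
```

Calling `fetch_text(url)` would perform the actual request. For around 20K accessions, the batch ID mapping route is likely to be friendlier to the server than 20K single requests.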

Please don't hesitate to contact the UniProt helpdesk if you have any additional questions.


@Elisabeth Nice description. As I understand it, apart from annotation, 'mirzaei86.vahid' also wants to perform enrichment analysis.

Reply written 14 months ago by EagleEye

Hi Elisabeth, thanks for your help.

Reply written 14 months ago by mirzaei86.vahid

Hello Elisabeth Gasteiger, that was a very helpful explanation. Could you please help me too by letting me know how I could get the complete GO terms using UniProt IDs? The whole list is given below. It reports each query (UniProt ID) as TRUE/FALSE under the following headings:

GOBP_Biological regulation, GOBP_Cellular process, GOBP_Developmental process, GOBP_Growth, GOBP_Immune system process, GOBP_Interaction with cells and organisms, GOBP_Localization, GOBP_Metabolic process, GOBP_Regulation, GOBP_Reproduction, GOBP_Response to stimulus, GOBP_Other

GOCC_Endosome, GOCC_Chromosome, GOCC_Ribosome, GOCC_Golgi, GOCC_ER, GOCC_Mitochondria, GOCC_Nucleus, GOCC_Peroxisome/microbody, GOCC_Cytoskeleton, GOCC_Plasma membrane, GOCC_Cell surface, GOCC_Extracellular, GOCC_Other intracellular organelles, GOCC_Other cytoplasmic vesicle, GOCC_Macromolecular complex, GOCC_Cytoplasm, GOCC_Other

GOMF_Antioxidant Activity, GOMF_Binding, GOMF_Catalytic Activity, GOMF_Enzyme regulator activity, GOMF_Molecular transducer activity, GOMF_Structural molecule activity, GOMF_Transcription regulator activity, GOMF_Translation regulator activity, GOMF_Transporter activity, GOMF_Chaperone activity, GOMF_Motor activity, GOMF_Other

Reply written 3 months ago by vipulbatra.pu

I'm afraid I don't quite understand what you are trying to do. What is your input? Please don't hesitate to contact the UniProt helpdesk with your question (or open a new thread in BioStars).

Reply written 3 months ago by Elisabeth Gasteiger

I have a list of proteins. Someone helped me with GO results in the format shown in the figures. I want to arrange other lists of proteins in the same format. The two pictures continue one another. [Figures: part1, part2] I am not sure what tool they used.

My input is UniProt IDs. When I map them using UniProt, I get only the three GO components, not a complete list of TRUE/FALSE values as shown in the image I attached.

Reply modified 3 months ago • written 3 months ago by vipulbatra.pu
15 months ago by
EagleEye wrote:

1) Convert your UniProt IDs to gene names (HGNC symbols) or gene IDs (Entrez IDs) using the UniProt ID mapping.

2) Use the Entrez IDs or gene names (symbols) in GeneSCF for enrichment analysis (KEGG and GO) or annotation.
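
To make the enrichment step more concrete, here is a generic over-representation test based on the hypergeometric distribution. It is a sketch of the general idea only and is not GeneSCF's method or code. The function names, the gene labels and the GO labels are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enrich(study, background, go_by_gene):
    """Return {go_id: p-value} for every term annotated in the study set."""
    background = set(background)
    study = set(study) & background
    N, n = len(background), len(study)
    terms = {t for g in study for t in go_by_gene.get(g, ())}
    results = {}
    for term in terms:
        K = sum(1 for g in background if term in go_by_gene.get(g, ()))
        k = sum(1 for g in study if term in go_by_gene.get(g, ()))
        results[term] = hypergeom_pvalue(k, n, K, N)
    return results


# Toy data for illustration only: 10 background genes, 3 in the study set.
background = [f"g{i}" for i in range(1, 11)]
go_by_gene = {
    "g1": {"GO:A"}, "g2": {"GO:A"}, "g3": {"GO:A"},
    "g4": {"GO:B"}, "g5": {"GO:B"},
}
pvalues = enrich(["g1", "g2", "g4"], background, go_by_gene)
for term, p in sorted(pvalues.items()):
    print(term, round(p, 4))
```

With thousands of terms tested, the raw p-values should be adjusted (for example with a Benjamini-Hochberg correction) before interpretation. Dedicated tools handle this along with the GO hierarchy.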

14 months ago by
Pallab Bhowmick wrote:

You can also use the EBI QuickGO tools to fetch GO terms/IDs programmatically.
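
As a rough sketch of the QuickGO route, the snippet below builds an annotation search URL and pulls the GO IDs out of a decoded JSON response. The endpoint path, the `geneProductId` and `limit` parameters, and the `results`/`goId` response keys are assumptions based on the QuickGO REST service and should be checked against its documentation. The sample payload is hand-written for illustration.

```python
from urllib.parse import urlencode

# Assumption: QuickGO annotation search endpoint (not given in the thread).
QUICKGO_SEARCH = "https://www.ebi.ac.uk/QuickGO/services/annotation/search"


def quickgo_url(accession, limit=100):
    """Build an annotation search URL for one UniProtKB accession."""
    return f"{QUICKGO_SEARCH}?{urlencode({'geneProductId': accession, 'limit': limit})}"


def go_ids_from_payload(payload):
    """Collect unique GO IDs, in order, from a decoded JSON response.

    Assumes the response looks like {"results": [{"goId": ...}, ...]}.
    """
    seen = []
    for hit in payload.get("results", []):
        go_id = hit.get("goId")
        if go_id and go_id not in seen:
            seen.append(go_id)
    return seen


# Hand-written payload for illustration only, not a real QuickGO response.
sample = {
    "results": [
        {"geneProductId": "UniProtKB:Q9ZUA2", "goId": "GO:0000001"},
        {"geneProductId": "UniProtKB:Q9ZUA2", "goId": "GO:0000002"},
        {"geneProductId": "UniProtKB:Q9ZUA2", "goId": "GO:0000001"},
    ]
}
print(quickgo_url("Q9ZUA2"))
print(go_ids_from_payload(sample))
```

The real response would be obtained by requesting the URL with an `Accept: application/json` header and decoding the body with `json.loads`. Results are paginated, so long annotation lists need more than one request.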

8 months ago by
chr.ebeling wrote:

Dear mirzaei86.vahid,

you can use the query functions of the Python library pyuniprot.

Install it (with pip or git clone) and update its local database. Find out which taxonomy identifiers fit your organisms; the example here uses human, mouse and rat. Don't run a full update for all organisms, as it takes very long.

Python code:

import pyuniprot
pyuniprot.update(taxids=[9606, 10090, 10116])  # human, mouse, rat

Then use the following Python code for your problem, assuming 1433E_HUMAN and A4_HUMAN are the identifiers you are looking for:

Python code:

import pyuniprot

query = pyuniprot.query()
names = ('1433E_HUMAN', 'A4_HUMAN')
# Look up the entries by their UniProt entry names
entries = query.entry(name=names)
# First accession listed for each entry
first_accessions = [entry.accessions[0] for entry in entries]
# All GO cross-references attached to those entries
gos = query.db_reference(entry_name=names, type_='GO')
go_ids = [x.identifier for x in gos]

Best regards
