Question: How to convert bulk UniProt IDs to GO terms/IDs?
0
8 months ago by
mirzaei86.vahid10 wrote:

Hello

I have worked on a transcriptome and obtained UniProt accessions from blastx output (about 20,000 accessions). For my project I need to run GO analysis and pathway analysis on them, but I cannot use Trinotate because I did the earlier analysis with different software.

How can I extract GO IDs/terms for a bulk list of UniProt accessions, and then run enrichment analysis on them?

Thanks

rna-seq uniprot assembly • 965 views
3
8 months ago by
Geneva
Elisabeth Gasteiger1.2k wrote:

To extract GO terms for a list of UniProtKB identifiers, use the UniProt batch retrieve tool (http://www.uniprot.org/uploadlists) as suggested above, but instead of mapping UniProtKB IDs to an external database, map from UniProtKB to UniProtKB.

Once you have your result, you can click on "Columns" and customize your result table layout, as described in http://www.uniprot.org/help/customize or http://insideuniprot.blogspot.ch/2015_03_01_archive.html.

The customization interface contains a section "Gene Ontology", where you can choose a complete list, separate columns for the three ontologies (molecular function, biological process and cellular component), or a list of identifiers only.

You can remove any columns you do not need here, and then download the results in tab-delimited format.
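As a rough sketch of what to do with the downloaded table: the snippet below reads a tab-delimited export and builds a mapping from accession to GO IDs. The column headers ("Entry", "Gene ontology IDs"), the semicolon separator inside the GO column, and the sample rows are assumptions for illustration, not taken from a real download, so check them against the header line of your own file.

```python
import csv
import io


def parse_go_table(tsv_text, id_column="Entry", go_column="Gene ontology IDs"):
    """Map each accession to its list of GO IDs.

    Assumes a header row and GO IDs separated by ';' within one cell.
    The default column names are assumptions; adjust them to your export.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mapping = {}
    for row in reader:
        raw = row.get(go_column) or ""
        mapping[row[id_column]] = [t.strip() for t in raw.split(";") if t.strip()]
    return mapping


# Made-up placeholder rows, only to show the expected shape of the input.
sample = (
    "Entry\tGene ontology IDs\n"
    "P00001\tGO:0000001; GO:0000002\n"
    "P00002\t\n"
)
go_by_accession = parse_go_table(sample)
```

For a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `parse_go_table`. Entries without any GO annotation end up with an empty list.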

Or you can access the UniProt website programmatically, with one query per accession number (http://www.uniprot.org/help/programmatic_access): for a given UniProtKB identifier, e.g. Q9ZUA2, you can use this URL http://www.uniprot.org/uniprot/?query=Q9ZUA2&format=tab&columns=id%2Cgo
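A minimal Python sketch of the per-accession query, using only the standard library. It rebuilds exactly the URL quoted above; the function names are made up for illustration, and the endpoint and parameters are taken from this answer as written, so they may have changed since it was posted.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as quoted in the answer; treat it as an assumption that it is still valid.
BASE_URL = "http://www.uniprot.org/uniprot/"


def build_go_query_url(accession):
    """Build the tab-format query URL returning the id and GO columns."""
    params = {"query": accession, "format": "tab", "columns": "id,go"}
    return BASE_URL + "?" + urlencode(params)


def fetch_go_table(accession, timeout=30):
    """Download the tab-delimited result for one accession (needs network access)."""
    with urlopen(build_go_query_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the URL; call fetch_go_table() to actually query the site.
    print(build_go_query_url("Q9ZUA2"))
```

With around 20K accessions, one request per accession is slow, so the batch retrieve tool described above is probably the better route for the full list.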

Please don't hesitate to contact the UniProt helpdesk if you have any additional questions.


@Elisabeth Nice description. As I understand it, apart from annotation, mirzaei86.vahid also wants to perform enrichment analysis.

written 8 months ago by EagleEye4.8k

Hi Elisabeth, thanks for your help.

written 8 months ago by mirzaei86.vahid10
1
8 months ago by
EagleEye4.8k
Sweden
EagleEye4.8k wrote:

1) Convert your UniProt IDs to gene names/HGNC symbols or Gene IDs (Entrez IDs) using UniProt ID mapping.

2) Use the Entrez IDs or gene names (symbols) in GeneSCF for enrichment analysis (KEGG and GO) or annotation.
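To make "enrichment analysis" a bit more concrete, here is a small sketch of the generic over-representation test (one-sided hypergeometric test) that many enrichment tools are built around. This is not GeneSCF's code and not necessarily the exact statistic it uses; the function and parameter names are made up for illustration.

```python
from math import comb


def enrichment_p_value(study_hits, study_size, population_hits, population_size):
    """One-sided hypergeometric p-value for a single GO term.

    Probability of seeing at least `study_hits` annotated genes in a study set
    of `study_size` genes, when `population_hits` of the `population_size`
    background genes carry the term.
    """
    total = comb(population_size, study_size)
    upper = min(study_size, population_hits)
    tail = sum(
        comb(population_hits, i)
        * comb(population_size - population_hits, study_size - i)
        for i in range(study_hits, upper + 1)
    )
    return tail / total


# Toy numbers: 5 of 10 background genes carry the term, and all 5 study genes do.
p = enrichment_p_value(study_hits=5, study_size=5, population_hits=5, population_size=10)
```

In a real analysis this is computed for every GO term, and the resulting p-values need a multiple-testing correction before interpretation.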

0
8 months ago by
pb20
Canada
pb20 wrote:

You can also use the EBI QuickGO tools to fetch GO terms/IDs programmatically.

0
11 weeks ago by
chr.ebeling0 wrote:

Dear mirzaei86.vahid,

You can use the query functions of the Python library pyuniprot.

Install it (with pip or git clone) and run an update. First find out which taxonomy identifiers match your organisms; the example here uses human, mouse and rat. Don't run a full update for all organisms, as it takes very long.

Python code:

import pyuniprot
pyuniprot.update(taxids=[9606, 10090, 10116])  # human, mouse, rat

Then use the following Python code for your problem, assuming 1433E_HUMAN and A4_HUMAN are the entry names you are looking for:

Python code:

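import pyuniprot

query = pyuniprot.query()
# Look up the entries by their UniProt entry names
entries = query.entry(name=('1433E_HUMAN', 'A4_HUMAN'))
# First accession listed for each entry
first_accessions = [entry.accessions[0] for entry in entries]
# GO cross-references for the same entry names
gos = query.db_reference(entry_name=('1433E_HUMAN', 'A4_HUMAN'), type_='GO')
go_ids = [x.identifier for x in gos]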

Best regards
