How to convert a list of Uniprot IDs to Entrez IDs from different species?
7.3 years ago
ypll ▴ 10

Hi, I have an annotation file for a non-model species of Aspergillus that was generated from the best BLAST hit (against UniProt) for each entry. The list of UniProt IDs is therefore not restricted to one organism but spans several (e.g. ATG12_ASPCL, UBC2_MEDSA, UBE2Z_HUMAN). I want to do pathway and gene set enrichment analysis, and for that I need all the transcripts identified by Entrez IDs from one model organism. So far I have tried stripping the species code from the UniProt ID, leaving the gene name alone (i.e. ATG12, UBC2, UBE2Z). I then used the bitr function from clusterProfiler to convert the IDs using org.Sc.sgd.db for S. cerevisiae or org.Hs.eg.db for human, but 72% and 86% (respectively) of the annotated genes were not mapped.
Could anyone suggest a tool or a strategy to solve this issue? I hope I have explained my question properly. Thanks!
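
For reference, the stripping step described above can be sketched as follows. This is only a minimal illustration in Python, not the code actually used; the helper name `split_entry_name` is made up for this example. Note that the part before the underscore is UniProt's mnemonic for the protein, which is not guaranteed to match an official gene symbol in any given organism, and that may be one reason so many IDs fail to map.

```python
def split_entry_name(entry_name):
    """Split a UniProt entry name such as 'ATG12_ASPCL' into (prefix, species_code).

    Hypothetical helper for illustration only.
    """
    prefix, sep, species = entry_name.strip().partition("_")
    if not sep or not prefix or not species:
        raise ValueError(f"not a UniProt entry name: {entry_name!r}")
    return prefix, species


# The three entry names quoted in the question.
entry_names = ["ATG12_ASPCL", "UBC2_MEDSA", "UBE2Z_HUMAN"]

# Group the stripped prefixes by species code to see how mixed the list is.
by_species = {}
for name in entry_names:
    prefix, species = split_entry_name(name)
    by_species.setdefault(species, []).append(prefix)

print(by_species)
```

Grouping by species code first makes it easy to see how many hits come from each organism before deciding which model organism to map everything onto.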

EDIT: UniProt's Retrieve/ID mapping tool can convert the IDs to Entrez IDs, but the problem of having many (non-model) species for pathway analysis remains.

RNA-Seq Annotation • 4.4k views

By using the UniProt ID converter. Some IDs may still be unmappable.


Hi, I was just adding that quote, because I already tried it. Thanks!

7.3 years ago
EagleEye 7.5k

You will find all types of IDs for different organisms at NCBI. Try to make use of these files (some basic scripting will be needed).

ftp://ftp.ncbi.nlm.nih.gov/gene/DATA

Information about all the files:

ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/README
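
As a rough idea of the scripting involved, here is a minimal Python sketch that builds a symbol-to-GeneID lookup for one organism. It assumes a tab-separated file laid out like `gene_info`, with `tax_id`, `GeneID` and `Symbol` as the first three columns; check the README for the exact layout of whichever file you pick. The rows, symbols and IDs below are toy placeholders, not real records, and the function name `symbol_to_gene_id` is made up for this example.

```python
import io


def symbol_to_gene_id(handle, tax_id):
    """Return {SYMBOL: GeneID} for one taxon from a gene_info-style table.

    Assumes columns 0, 1, 2 are tax_id, GeneID, Symbol (an assumption to
    verify against the README). Lines starting with '#' are skipped.
    """
    wanted = str(tax_id)
    mapping = {}
    for line in handle:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != wanted:
            continue
        # Keep the first GeneID seen for a symbol; upper-case for matching.
        mapping.setdefault(fields[2].upper(), fields[1])
    return mapping


# Toy placeholder rows (NOT real NCBI records), same column order as assumed.
toy = io.StringIO(
    "#tax_id\tGeneID\tSymbol\n"
    "9606\t1001\tGENEA\n"   # 9606 = Homo sapiens
    "9606\t1002\tGENEB\n"
    "4932\t2001\tGENEA\n"   # 4932 = Saccharomyces cerevisiae
)

human = symbol_to_gene_id(toy, 9606)

# Look up a list of query symbols and report the ones that did not map.
queries = ["GENEA", "GENEC"]
mapped = {q: human.get(q) for q in queries}
unmapped = [q for q, gene_id in mapped.items() if gene_id is None]
print(mapped, unmapped)
```

For a real download you would replace the toy `io.StringIO` with something like `gzip.open(path, "rt")` on the compressed file. Matching on symbols alone will still miss genes whose names differ between species, so treat the unmapped list as something to inspect rather than discard.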
