Map The Protein Sequences Of Unknown Genes That Have Uniprot Names To Ensembl Genes.
10.9 years ago
Hmm ▴ 500

Is there a way to map the protein sequences of unknown genes that have UniProt names to Ensembl genes? Command line, API, or any other method would be fine.

Uniprot names are as follows:

YJ015_HUMAN
ZNAS2_HUMAN
CJ052_HUMAN
YJ004_HUMAN
CEAS1_HUMAN
YJ008_HUMAN
CTLFB_HUMAN
YJ018_HUMAN
ZN487_HUMAN
FRP2L_HUMAN
SRCRM_HUMAN
HUG1_HUMAN
TI23B_HUMAN
ADAS1_HUMAN
YJ001_HUMAN
DMBTL_HUMAN
YJ017_HUMAN
YJ016_HUMAN
CSC2A_HUMAN
ASA2C_HUMAN
ANTRL_HUMAN
AGA10_HUMAN
CJ085_HUMAN
CJ136_HUMAN
CJ108_HUMAN
CJ040_HUMAN
F25DE_HUMAN
YJ012_HUMAN
CJ115_HUMAN
YJ006_HUMAN
DAS1_HUMAN
CJ112_HUMAN
AKCL1_HUMAN
FA22B_HUMAN
CJ126_HUMAN
YJ013_HUMAN
CCYL2_HUMAN
genes ensembl gene ncbi map • 2.3k views
10.9 years ago

Hope this helps: UniProt ID Mapping. Download the mapping text file from that site and process it however you like (for example, with a small program). The data is also available per organism.
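As a rough sketch of the "via a program" route: the code below assumes the downloaded file is the three-column, tab-separated form (accession, ID type, ID value) with `UniProtKB-ID` and `Ensembl` among the ID types. That layout, and the function name `ensembl_genes_by_name`, are assumptions to check against the file you actually download. The two sample rows reuse the Q8N8Z3 / ENSG00000180525 link mentioned elsewhere in this thread.

```python
from collections import defaultdict


def ensembl_genes_by_name(lines, wanted_names):
    """Map UniProt entry names to Ensembl gene IDs.

    Assumes each line is 'accession<TAB>id_type<TAB>value' (an assumption
    about the downloaded mapping file; verify before relying on it).
    """
    wanted = set(wanted_names)
    name_by_acc = {}                  # accession -> entry name
    genes_by_acc = defaultdict(list)  # accession -> Ensembl gene IDs

    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip anything that is not a 3-column row
        acc, id_type, value = parts
        if id_type == "UniProtKB-ID":
            name_by_acc[acc] = value
        elif id_type == "Ensembl":
            genes_by_acc[acc].append(value)

    # Names with no row in the file end up with an empty list.
    result = {name: [] for name in wanted}
    for acc, name in name_by_acc.items():
        if name in wanted:
            result[name].extend(genes_by_acc.get(acc, []))
    return result


# Illustrative rows only; a real run would iterate over the opened file.
sample_rows = [
    "Q8N8Z3\tUniProtKB-ID\tPRR26_HUMAN",
    "Q8N8Z3\tEnsembl\tENSG00000180525",
]
mapping = ensembl_genes_by_name(sample_rows, ["PRR26_HUMAN", "YJ015_HUMAN"])
print(mapping)
```

Names that come back with an empty list are the ones worth checking by hand, since they may be renamed, retired, or simply not cross-referenced.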

The UniProt.org Identifier Mapping service also provides a REST web service; see http://www.uniprot.org/faq/28#id_mapping_examples for details.

Equivalent functionality is also available via the EMBL-EBI's Protein Identifier Cross-Reference (PICR) service which provides both REST and SOAP based Web Services for automation.
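For automation, a minimal sketch of calling the UniProt mapping service from Python follows. The endpoint (`https://rest.uniprot.org/idmapping`), the form fields (`from`, `to`, `ids`), the database labels (`UniProtKB_AC-ID`, `Ensembl`) and the JSON keys (`jobId`, `jobStatus`, `results`) are assumptions based on the service as it exists at the time of writing, not on the FAQ page linked above, so check them against the current UniProt documentation. The helper names are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed base URL of the current UniProt ID mapping REST service.
API = "https://rest.uniprot.org/idmapping"


def build_run_request(ids, source_db="UniProtKB_AC-ID", target_db="Ensembl"):
    """Build (but do not send) the POST request that submits a mapping job."""
    data = urllib.parse.urlencode(
        {"from": source_db, "to": target_db, "ids": ",".join(ids)}
    ).encode()
    return urllib.request.Request(f"{API}/run", data=data, method="POST")


def map_ids(ids, poll_seconds=3):
    """Submit a job, wait for it, and return the parsed JSON results.

    Performs network calls; response field names are assumptions.
    """
    with urllib.request.urlopen(build_run_request(ids)) as resp:
        job_id = json.load(resp)["jobId"]

    while True:
        with urllib.request.urlopen(f"{API}/status/{job_id}") as resp:
            status = json.load(resp)
        state = status.get("jobStatus")
        if state in ("NEW", "RUNNING"):
            time.sleep(poll_seconds)  # job still in progress
            continue
        if state not in (None, "FINISHED"):
            raise RuntimeError(f"mapping job ended with status {state}")
        break

    with urllib.request.urlopen(f"{API}/results/{job_id}") as resp:
        return json.load(resp)


# Building the request is safe offline; map_ids() needs network access.
request = build_run_request(["PRR26_HUMAN", "YJ015_HUMAN"])
print(request.full_url, request.data.decode())
```

Calling `map_ids([...])` with the names from the question should return the mapped pairs along with any identifiers the service could not map, assuming the response shape described above.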

10.9 years ago
Andy Yates ▴ 120

Hi,

I work very closely with Ensembl's xref system. Ensembl's UniProt xrefs come from two sources. The first is direct association, where a UniProt annotator has asserted a link between a UniProt entry and an Ensembl one. The second is an alignment using Exonerate. I haven't investigated your list in much detail, but the random selection I took gave the following results:

  • CJ108_HUMAN redirects to http://www.uniprot.org/uniprot/Q8N8Z3, so it seems to be a retired identifier. Q8N8Z3 does link to Ensembl (ENSG00000180525).
  • YJ015_HUMAN has a note saying "Product of a dubious CDS prediction". Protein evidence level 5 (uncertain, which is not good).
  • CJ052_HUMAN has a note saying "Product of a dubious CDS prediction. Probable non-coding RNA". Protein evidence level 5 (uncertain, which is not good).

I would recommend pre-filtering this list through UniProt and trying to reduce it to the set of proteins you think are really missing from Ensembl. Filter out those that are retired, and consider whether you really want to look for PE level 5 proteins. After that, the best action I can suggest is to use Ensembl's BLAST service to find the mappings. If you have any problems, please contact our helpdesk and I'm sure we can help out a little more.
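The pre-filtering step could be sketched roughly as below. The `Entry` record and the `triage` function are hypothetical, the three sample entries carry only what is stated in the examples above, and the fields would have to be filled in from UniProt by whatever means you prefer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical record holding what was looked up for one name."""
    name: str
    ensembl_gene: Optional[str] = None    # existing Ensembl xref, if any
    evidence_level: Optional[int] = None  # UniProt PE level, if known


def triage(entries, max_evidence_level=4):
    """Split entries into already mapped, low evidence, and BLAST candidates."""
    buckets = {"already_mapped": [], "low_evidence": [], "needs_blast": []}
    for entry in entries:
        if entry.ensembl_gene is not None:
            # UniProt already links this entry to Ensembl; nothing to do.
            buckets["already_mapped"].append(entry.name)
        elif (entry.evidence_level is not None
              and entry.evidence_level > max_evidence_level):
            # PE level 5 entries: decide whether they are worth pursuing.
            buckets["low_evidence"].append(entry.name)
        else:
            # Remaining entries are candidates for a BLAST search.
            buckets["needs_blast"].append(entry.name)
    return buckets


entries = [
    Entry("CJ108_HUMAN", ensembl_gene="ENSG00000180525"),
    Entry("YJ015_HUMAN", evidence_level=5),
    Entry("CJ052_HUMAN", evidence_level=5),
]
result = triage(entries)
print(result)
```

Only the names left in `needs_blast` would then go to the BLAST step; whether to include the `low_evidence` ones is a judgment call.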

UniProtKB names are subject to change. In the case of CJ108_HUMAN the name has changed to PRR26_HUMAN, but the primary accession (Q8N8Z3) has remained the same (see http://www.uniprot.org/uniprot/Q8N8Z3?version=* for a summary of the changes to this entry). Unsurprisingly, UniProt recommends using accession numbers rather than names, although names are slightly more human-friendly.
