Question: Converting a Set of Protein IDs to Gene IDs
0
lin.barnum wrote (6.1 years ago):

I wanted to convert a set of protein IDs to gene IDs. I have a list of protein IDs such as

ADF1_DROME
A1Z8K2_DROME
A0AQF9_DROME
A0AQG1_DROME
B4J066_DROGR
A0AQG9_DROSI
B3NHM7_DROER
B3NB45_DROER
B3P0U4_DROER
B3NJP0_DROER

and my final aim is to identify the genes that they originate from since many of them are different transcripts from the same gene. How could I go about it? I tried BioMart but could not figure out how this could be done there. All of my proteins are from dipterans.

3
Arnaud Ceol (Milan, Italy) wrote (6.1 years ago):

You are looking to convert from UniProt IDs, so the most straightforward way is to use the mapping tool from UniProt: go to http://www.uniprot.org/ and then to the ID mapping tab. There, paste your protein IDs (or load them from a file), choose UniProtKB AC/ID as input and GeneID as output, and click on MAP.
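
For lists too long to paste by hand, the same mapping can probably be scripted. The sketch below assumes the UniProt REST ID mapping service at https://rest.uniprot.org/idmapping with the database names `UniProtKB_AC-ID` and `GeneID`; these endpoint paths and names are assumptions and not part of the answer above, so check them against the current UniProt documentation. The function names are hypothetical, and `names.txt` is the input file used elsewhere in this thread.

```shell
#!/bin/sh
# Sketch only: the endpoint paths and database names below are assumptions
# about the UniProt REST service, not something stated in this thread.
UNIPROT_API="https://rest.uniprot.org/idmapping"

# Join the IDs in a file (one per line, blank lines ignored) with commas.
join_ids() {
    grep -v '^[[:space:]]*$' "$1" | paste -sd, -
}

# Submit a mapping job for the IDs in file $1; prints the JSON reply,
# which is expected to contain a job identifier.
submit_job() {
    curl -s --form "ids=$(join_ids "$1")" \
         --form 'from=UniProtKB_AC-ID' \
         --form 'to=GeneID' \
         "$UNIPROT_API/run"
}

# Ask whether job $1 has finished.
job_status() {
    curl -s "$UNIPROT_API/status/$1"
}

# Download the results of finished job $1 as JSON.
fetch_results() {
    curl -s "$UNIPROT_API/results/$1"
}
```

A session would call `submit_job names.txt`, read the job identifier from the reply, poll `job_status` until the job is done, and then call `fetch_results` with the same identifier.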

1
brentp (Salt Lake City, UT) wrote (6.1 years ago):

You can start with this mapping from the UCSC database, given your inputs in a file names.txt:

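 # Strip the species suffix (e.g. "_DROME") from each ID, then look up each
 # remaining prefix in the gene table of UCSC's public uniProt database.
 awk -F_ '{ print $1 }' names.txt \
    | xargs -I{} mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A uniProt -N -e \
      "select * from gene where acc = '{}'"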
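
One caveat worth checking before running the query: the prefix left after stripping the species suffix is only an accession for some entry names. Unreviewed (TrEMBL) entry names such as A1Z8K2_DROME start with the accession, while reviewed (Swiss-Prot) names such as ADF1_DROME start with a protein mnemonic, which a lookup on `acc` is unlikely to match. The sketch below labels each ID accordingly, using the accession pattern published in the UniProt help pages; the function name `classify_ids` is hypothetical.

```shell
#!/bin/sh
# UniProtKB accession pattern as given in the UniProt help pages.
ACC_RE='^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$'

# For each ID in file $1, print the ID, a tab, and either "accession"
# (the prefix looks like a UniProtKB accession) or "mnemonic" (it does not).
classify_ids() {
    while IFS= read -r id; do
        [ -n "$id" ] || continue          # skip blank lines
        prefix=${id%%_*}                  # text before the first underscore
        if printf '%s\n' "$prefix" | grep -Eq "$ACC_RE"; then
            printf '%s\taccession\n' "$id"
        else
            printf '%s\tmnemonic\n' "$id"
        fi
    done < "$1"
}
```

IDs labelled `mnemonic` would need to go through an ID mapping service first, rather than straight into the `acc` lookup.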

Thanks, this worked beautifully.

(reply by lin.barnum, 6.1 years ago)
0
Bill Pearson wrote (6.1 years ago):

As you noticed, your problem is that you are using UniProtKB IDs, and there is no guaranteed mapping between UniProtKB IDs and genes (even in UniProt). I suggest you find matches between your UniProtKB IDs and NCBI RefSeq Protein IDs, which you can do on the UniProt web site using the ID mapping option. Be aware that the mapped proteins are sometimes not identical.

Once you have NCBI RefSeq Protein IDs, it is easy to get NCBI Gene IDs and RefSeq mRNA IDs.
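
Whichever route produces the gene IDs, the original goal of finding proteins that come from the same gene is then a short grouping step. The sketch below assumes the mapping has been saved as a two-column, tab-separated file (protein ID, then gene ID); that layout and the function name `group_by_gene` are assumptions made for illustration.

```shell
#!/bin/sh
# Input ($1): tab-separated lines of "protein<TAB>gene" (assumed layout).
# Output: one line per gene, in first-seen order, followed by a tab and
# the comma-separated list of proteins that map to it.
group_by_gene() {
    awk -F'\t' '
        NF >= 2 {
            if (!($2 in members)) order[++n] = $2      # remember new genes
            members[$2] = (members[$2] == "" ? $1 : members[$2] "," $1)
        }
        END {
            for (i = 1; i <= n; i++) print order[i] "\t" members[order[i]]
        }
    ' "$1"
}
```

Genes that appear with more than one protein in the output are the ones for which the list contains several products of the same gene.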
