I've got a huge list of UniProt IDs and I want to get the matching gene names.
Do you know how to do that in Python? (I'm currently looking into Biopython...)
See related post here.
You can do this using the Retrieve function at www.uniprot.org (4th tab in the top bar).
Upload your list of IDs.
Look for the small blue link reading "UniProtKB (number of retrieved) entries" and click it.
Then use the customize display option to select only gene names.
Then click download as tab.
OK, not very Pythonic, but a few HTTP calls from Python would work.
Perhaps something like this:
from Bio import SwissProt

with open('uniprot_sprot.dat') as handle:
    for record in SwissProt.parse(handle):
        accessions = record.accessions  # list of accession IDs for this entry
        gene_name = record.gene_name
Well, my code snippet was meant more as an inspiration for how to access the UniProt ID and the corresponding gene name from Swiss-Prot. Once you have that mapping (e.g. as a dictionary), it should be easy to apply it to your own problem.
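To make the dictionary idea concrete, here is a minimal sketch. The function names are made up for illustration, and the gene_name value is stored exactly as Biopython returns it, since its format may differ between Biopython versions.

```python
def build_accession_to_gene(records):
    """Map every accession of every record to that record's gene_name.

    `records` is any iterable of objects with `.accessions` (a list of IDs)
    and `.gene_name`, such as the records yielded by SwissProt.parse.
    """
    mapping = {}
    for record in records:
        for accession in record.accessions:
            mapping[accession] = record.gene_name
    return mapping


def load_swissprot_mapping(path):
    """Build the mapping from a Swiss-Prot flat file (hypothetical helper)."""
    # Imported here so build_accession_to_gene works without Biopython.
    from Bio import SwissProt

    with open(path) as handle:
        return build_accession_to_gene(SwissProt.parse(handle))
```

Something like load_swissprot_mapping('uniprot_sprot.dat') would then give a dictionary you can query with your own IDs. Note that this reads the whole file, which can take a while for the full Swiss-Prot release.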
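Yep. I got what I wanted with this:
import urllib
import urllib2  # Python 2 only

url = 'http://www.uniprot.org/mapping/'
# params is a dict of query parameters for the mapping service (defined elsewhere)
data = urllib.urlencode(params)
request = urllib2.Request(url, data)
response = urllib2.urlopen(request)
page = response.read(200000)  # read at most 200000 bytes of the response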
Then I've got a homemade dictionary id_Ensembl <-> geneName.
Thanks a lot for your answers, guys!
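For anyone wondering how to get from the downloaded page to a dictionary, here is a rough sketch. It assumes the response is tab-separated text with one header row and the source ID and target value in the first two columns; that layout and the function name are assumptions, so check them against what the service actually returns.

```python
def parse_mapping_table(text, has_header=True):
    """Turn a two-column tab-separated table into {source_id: [targets]}."""
    mapping = {}
    lines = text.splitlines()
    if has_header:
        lines = lines[1:]  # skip the column titles
    for line in lines:
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        source_id, target = fields[0].strip(), fields[1].strip()
        # One ID can map to several targets, so collect them in a list.
        mapping.setdefault(source_id, []).append(target)
    return mapping
```

Values are kept as lists because one ID may map to more than one target; if you are sure the mapping is one-to-one, taking the first element is enough.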
It's a good start, I think. But my input file looks like this: "Uniprot_ID--->position", one entry per line.
So I just want to translate Uniprot_ID to Gene_name in the output. Does anything like that exist already?
Thanks a lot!
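Once you have an ID-to-gene-name dictionary, translating such a file takes only a few lines. This is just a sketch: it assumes the "--->" in your description stands for a tab (pass sep='--->' if it is literal), and the function name and the 'NA' placeholder for unknown IDs are my own choices.

```python
def translate_lines(lines, id_to_gene, sep="\t", missing="NA"):
    """Yield each 'ID<sep>position' line with the ID replaced by its gene name."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore empty lines
        # Split on the first separator only; the rest is kept as the position.
        uniprot_id, _, position = line.partition(sep)
        gene = id_to_gene.get(uniprot_id.strip(), missing)
        yield gene + sep + position
```

You can pass an open file object as lines and write each yielded string, followed by a newline, to your output file.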
Oops, I don't know how to paste code here... if an admin can edit it, thanks.