It's a good start, I think. But my input file looks like this: "Uniprot_ID--->position", one per line. So I just want to translate each Uniprot_ID to its Gene_name in the output. Does anything like that exist already? Thanks a lot!
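Perhaps something like this:
from Bio import SwissProt

for record in SwissProt.parse(open('uniprot_sprot.dat')):
    accessions = record.accessions  # list of UniProt accessions for this entry
    gene_name = record.gene_name    # contents of the entry's GN line(s)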
Chris
Well, my code snippet was meant more as inspiration for how to access the UniProt ID and the corresponding gene name from Swiss-Prot. Once you have that mapping (e.g. as a dictionary), it should be easy to apply it to your input file.
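A minimal sketch of that last step, assuming the accession-to-gene-name dictionary has already been built and that each input line uses the literal separator `--->` (change `sep` if the file actually uses a tab). The names `translate_lines` and `acc_to_gene`, and the sample lines, are made up for illustration:

```python
def translate_lines(lines, acc_to_gene, sep="--->"):
    """Replace the UniProt ID at the start of each line with its gene name.

    IDs missing from the mapping are kept unchanged so that no line is lost.
    """
    translated = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        uniprot_id, found, position = line.partition(sep)
        if not found:
            translated.append(line)  # no separator: pass the line through
            continue
        gene = acc_to_gene.get(uniprot_id, uniprot_id)
        translated.append(f"{gene}{sep}{position}")
    return translated


# Hypothetical mapping and input lines, for illustration only.
acc_to_gene = {"P04637": "TP53", "P38398": "BRCA1"}
lines = ["P04637--->175", "P38398--->1699", "Q00000--->12"]
result = translate_lines(lines, acc_to_gene)
print("\n".join(result))
```

Falling back to the original ID for unmapped accessions is just one possible choice; dropping or flagging such lines would work equally well, depending on what the output is used for.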
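Yep. I got what I wanted with this:
import urllib
import urllib2

# Python 2: POST the query to the UniProt ID mapping service.
url = 'http://www.uniprot.org/mapping/'
query = uniprot_id
params = {'from': 'ACC', 'to': 'ENSEMBL_ID', 'format': 'tab', 'query': query}
data = urllib.urlencode(params)
request = urllib2.Request(url, data)
response = urllib2.urlopen(request)
page = response.read(200000)  # read at most 200000 bytes of the tab-separated result
Then I use a homemade dictionary, id_Ensembl <-> geneName, to get from the Ensembl ID to the gene name.
Thanks a lot for your answers, guys!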
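For anyone reading along, here is a rough sketch of how the two lookups described above could be chained once `page` has been downloaded. It is written for Python 3 and makes assumptions that are not confirmed in this thread: that the response is two-column, tab-separated text with a `From`/`To` header row, and that the homemade dictionary is keyed by Ensembl ID. The function names and the sample data are hypothetical:

```python
def parse_mapping_table(page):
    """Parse two-column, tab-separated text into a {from_id: [to_id, ...]} dict."""
    mapping = {}
    for line in page.splitlines():
        fields = line.split("\t")
        if len(fields) != 2 or fields[0] == "From":
            continue  # skip the assumed header row and malformed lines
        source_id, target_id = fields
        mapping.setdefault(source_id, []).append(target_id)
    return mapping


def gene_names_for(uniprot_id, acc_to_ensembl, ensembl_to_gene):
    """Chain the two lookups: UniProt accession -> Ensembl IDs -> gene names."""
    names = []
    for ensembl_id in acc_to_ensembl.get(uniprot_id, []):
        name = ensembl_to_gene.get(ensembl_id)
        if name is not None and name not in names:
            names.append(name)  # keep each gene name once, in first-seen order
    return names


# Hypothetical response text and dictionary, for illustration only.
page = "From\tTo\nP04637\tENSG00000141510\n"
ensembl_to_gene = {"ENSG00000141510": "TP53"}
acc_to_ensembl = parse_mapping_table(page)
genes = gene_names_for("P04637", acc_to_ensembl, ensembl_to_gene)
print(genes)
```

A list is used on the Ensembl side because one accession may map to more than one Ensembl ID.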
Yo.
You can do this using the retrieve function at www.uniprot.org (4th tab element in the top bar).
Upload your list of IDs.
Look for the small blue "UniProtKB (number of retrieved entries)" link and click it. Then use the customize display option to select only gene names, and click download as tab.
OK, not very Pythonic, but a few HTTP calls from Python would work.
See related post here: http://biostar.stackexchange.com/questions/22/gene-id-conversion-tool/8107#8107