Batch convert gene numbers (ordered locus names) to protein IDs
7.8 years ago
Qroid ▴ 40

I asked the same question on Stack Exchange.

I have a list of >1000 "ordered locus names" from the STRING database (also called gene numbers, ORF numbers, or CDS numbers by UniProt). They're in the form of 'b0014', 'b0015', etc. I'd like to convert them all to protein IDs.

I know how to do it in a non-batch way, since searching UniProt for 'b0014' returns what (I think) is a protein ID. However, I'd like an automated solution, since I might need to repeat this in the future and manually searching for even 1000 names would take forever.

protein ID gene number string database • 4.7k views

Please do not cross-post.

7.8 years ago
GenoMax 141k

The UniProt ID converter tool is generally the best way to map names in bulk. Are the b* numbers STRING IDs? If I select STRING as the source, they do not map to UniProt.


Thanks, the UniProt ID converter looks good. As far as I know, 'b0014' etc. are 'ordered locus names', 'ORF numbers', 'CDS numbers' or 'gene numbers' (http://www.uniprot.org/help/gene_name). If I choose "gene name" in the first drop-down, everything appears to work! So thank you!
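For anyone who wants to script the same "gene name" mapping rather than use the web form, here is a minimal Python sketch. It is not taken from this thread: the base URL, the `run`/`status`/`details` endpoints, the `Gene_Name` and `UniProtKB` database names, the `taxId` parameter, and the JSON field names are assumptions based on my understanding of the current UniProt REST API, so check them against the UniProt documentation before relying on it. The helper names (`build_job_form`, `parse_results`, `get_json`, `map_locus_names`) are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed base URL of the UniProt ID mapping REST service.
API = "https://rest.uniprot.org/idmapping"


def build_job_form(locus_names, tax_id=None):
    """Form fields for a 'gene name' -> UniProtKB mapping job (assumed field names)."""
    form = {"from": "Gene_Name", "to": "UniProtKB", "ids": ",".join(locus_names)}
    if tax_id is not None:
        # Restricting by organism avoids hits for the same name in other species.
        form["taxId"] = str(tax_id)
    return form


def parse_results(payload):
    """Collect {locus_name: [accession, ...]} from a results payload."""
    mapping = {}
    for hit in payload.get("results", []):
        target = hit["to"]
        # 'to' is assumed to be either a full entry (dict) or a bare accession string.
        accession = target["primaryAccession"] if isinstance(target, dict) else target
        mapping.setdefault(hit["from"], []).append(accession)
    return mapping


def get_json(url, data=None):
    """GET (or POST, when data is given) a URL and decode the JSON response."""
    with urllib.request.urlopen(url, data=data) as resp:
        return json.load(resp)


def map_locus_names(locus_names, tax_id=None, poll_seconds=3):
    """Submit a mapping job, wait for it to finish, and return the parsed mapping."""
    form = urllib.parse.urlencode(build_job_form(locus_names, tax_id)).encode()
    job_id = get_json(f"{API}/run", data=form)["jobId"]

    while True:
        state = get_json(f"{API}/status/{job_id}").get("jobStatus")
        if state in ("NEW", "RUNNING"):
            time.sleep(poll_seconds)
        elif state in (None, "FINISHED"):
            # No jobStatus is assumed to mean we were redirected to finished results.
            break
        else:
            raise RuntimeError(f"ID mapping job ended with status {state}")

    results_url = get_json(f"{API}/details/{job_id}")["redirectURL"]
    if "/stream/" not in results_url:
        # Assumed: the stream variant returns all results without pagination.
        results_url = results_url.replace("/results/", "/results/stream/")
    return parse_results(get_json(results_url))
```

Calling `map_locus_names(["b0014", "b0015"], tax_id=83333)` should then return a dict keyed by locus name. The taxonomy ID 83333 (E. coli K-12) is my assumption about where the b-numbers come from; leave `tax_id` out to search all organisms. A name can map to more than one accession, which is why each value is a list.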


Indeed. Glad that worked for you.
