Question: Batch convert gene numbers (ordered locus names) to protein IDs
Qroid wrote (3.0 years ago):

I asked the same question on Stack Exchange.

I have a list of >1000 "ordered locus names" from the STRING database (also called gene numbers, ORF numbers, or CDS numbers by UniProt). They're in the form 'b0014', 'b0015', etc. I'd like to convert them all to protein IDs.

I know how to do it one at a time, since searching UniProt for 'b0014' returns what (I think) is a protein ID. However, I'd like an automated solution, since I might need to repeat this in the future and manually searching for even 1000 names would take forever.


RamRS commented (3.0 years ago): Please do not cross-post.

genomax wrote (3.0 years ago):

The UniProt ID converter tool is generally the best way to map names in bulk. Are the b* numbers STRING IDs? If I select STRING as the source, they do not map to UniProt.


Qroid replied (3.0 years ago): Thanks, the UniProt ID converter looks good. As far as I know, 'b0014' etc. are 'ordered locus names', 'ORF numbers', 'CDS numbers' or 'gene numbers' (http://www.uniprot.org/help/gene_name). If I choose "gene name" in the first drop-down, everything appears to work! So thank you!

genomax replied (3.0 years ago): Indeed. Glad that worked for you.
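
For anyone who wants to script the "gene name" mapping described above rather than paste IDs into the web form, here is a rough Python sketch. It is not something posted in this thread, and it rests on several assumptions that should be checked against UniProt's current documentation: that the ID mapping service is reachable under `https://rest.uniprot.org/idmapping` with `run`, `status` and `results` endpoints, that the "gene name" source is called `Gene_Name`, that a `taxId` filter is accepted, and that the b-numbers belong to E. coli K-12 (taxon 83333). The helper names (`chunks`, `parse_results`, `get_json`, `map_batch`, `map_locus_names`) are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed base URL of the UniProt ID mapping service; verify before use.
API = "https://rest.uniprot.org/idmapping"


def chunks(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def parse_results(payload):
    """Turn one results payload into {locus_name: [accession, ...]}."""
    mapping = {}
    for row in payload.get("results", []):
        target = row["to"]
        # Assumed shape: "to" is either a plain accession string or a full
        # entry object that carries a "primaryAccession" field.
        accession = target["primaryAccession"] if isinstance(target, dict) else target
        mapping.setdefault(row["from"], []).append(accession)
    return mapping


def get_json(url, data=None):
    """GET a URL (or POST, when form data is given) and decode the JSON body."""
    body = urllib.parse.urlencode(data).encode() if data else None
    with urllib.request.urlopen(url, data=body, timeout=60) as response:
        return json.load(response)


def map_batch(ids, tax_id="83333", poll_seconds=3):
    """Submit one mapping job, wait for it, and parse the first results page."""
    job = get_json(f"{API}/run", {
        "from": "Gene_Name",  # assumed name of the "gene name" source
        "to": "UniProtKB",
        "ids": ",".join(ids),
        "taxId": tax_id,      # assumption: the b-numbers are E. coli K-12
    })
    job_id = job["jobId"]
    while True:
        state = get_json(f"{API}/status/{job_id}").get("jobStatus")
        if state in ("NEW", "RUNNING"):
            time.sleep(poll_seconds)
            continue
        if state not in (None, "FINISHED"):
            raise RuntimeError(f"mapping job {job_id} ended with status {state}")
        break
    # Only the first page is read (assumed page-size limit of 500), so keep
    # batches small enough that all hits fit on one page.
    return parse_results(get_json(f"{API}/results/{job_id}?size=500"))


def map_locus_names(ids, batch_size=100):
    """Map a long list of locus names in small batches."""
    mapping = {}
    for batch in chunks(ids, batch_size):
        mapping.update(map_batch(batch))
    return mapping
```

Calling `map_locus_names(["b0014", "b0015"])` would then return a dictionary keyed by locus name. Each value is a list because a gene name may match more than one entry. Names that fail to map are simply absent from the dictionary, so it is worth comparing its keys against the input list.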