Question

Matching protein IDs starting with WP_ to protein IDs starting with YP_

6 months ago

bio_student • 0

Hi all,

I have annotated genomes where all CDS have protein IDs starting with YP (e.g., YP_005225157.1). Is there a way to automatically convert protein IDs that start with YP to those starting with WP, without the need to search for them manually in the database?

Thank you.

ID protein refseq • 496 views

What is "those starting with WP"?

6 months ago by Pierre Lindenbaum 161k

For example, I have the protein ID YP_005229578.1, and the identical protein also has the ID WP_004151534.1 (see: https://www.ncbi.nlm.nih.gov/ipg/YP_005229578.1). I would like to match identical proteins that carry both a YP and a WP protein ID.

6 months ago by bio_student • 0

score 1 · Answer 1 · 2023-10-31

6 months ago

GenoMax 142k

Using Entrez Direct (EDirect) to query the Identical Protein Groups (ipg) database:

$ esearch -db ipg -query "YP_005225157" | esummary | xtract -pattern DocumentSummary -element Accession
WP_002888811.1

$ esearch -db ipg -query "YP_005229578" | esummary | xtract -pattern DocumentSummary -element Accession
WP_004151534.1
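
For a whole genome's worth of IDs, the same pipeline could be wrapped in a loop. The following is only a sketch, not something tested against a full annotation: the function names `yp_to_wp` and `map_ids` are made up for illustration, and it assumes EDirect (`esearch`, `esummary`, `xtract`) is installed and on your PATH. It prints the query ID and whatever accession the IPG summary reports, or `NA` when nothing comes back.

```shell
#!/usr/bin/env bash
# Sketch: look up the IPG summary accession for each input protein ID.
# Assumes EDirect (esearch, esummary, xtract) is installed and on PATH.

# Look up one ID; prints "<query><TAB><accession>", with NA if nothing is returned.
yp_to_wp() {
    local yp="$1" wp
    wp=$(esearch -db ipg -query "$yp" \
        | esummary \
        | xtract -pattern DocumentSummary -element Accession \
        | head -n 1)
    printf '%s\t%s\n' "$yp" "${wp:-NA}"
}

# Read IDs (one per line) from stdin and emit a two-column table.
map_ids() {
    local id
    while IFS= read -r id; do
        [ -z "$id" ] && continue     # skip blank lines
        # Redirect stdin so the lookup cannot swallow the rest of the ID list.
        yp_to_wp "$id" < /dev/null
    done
}
```

Usage might look like `map_ids < yp_ids.txt > yp_to_wp.tsv`, where `yp_ids.txt` is a hypothetical file with one YP_ ID per line. It is worth spot-checking a few rows against the IPG page, since the script simply trusts the first accession in the summary.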