I am trying to automate a query for non-canonical bacterial protein IDs in NCBI.
The IDs are not standard RefSeq accessions (e.g. NP_*) but instead start with HDK (e.g. GenBank: HDK9254199.1).
They exist in NCBI (https://www.ncbi.nlm.nih.gov/protein/HDK9254199) but I have not yet been able to recover them programmatically.
In R,
library(rentrez)
# Protein database
protein_search <- entrez_search(db = "protein", term = "HDK9254199")
protein_summary <- entrez_summary(db = "protein", id = "HDK9254199")
# Nucleotide database
nuccore_search <- entrez_search(db = "nuccore", term = "HDK9254199")
nuccore_summary <- entrez_summary(db = "nuccore", id = "HDK9254199")
None of these work: the entrez_summary() calls error out and the searches return no hits.
In a perfect world my colleague would have used the reference genome. However, they have already ordered an expensive library based on these accessions.
Is there any way to map these HDK accessions back to canonical RefSeq accessions?