I have a set of drug target "Biological Entity" (BE) IDs from DrugBank in the following format:
BE0000048 BE0000767 BE0001529 ...
I am having difficulty finding an automated way to translate these IDs to Entrez IDs. Each BE identifier corresponds to a drug target on the DrugBank site, and each target's page lists an associated UniProt ID (e.g. https://www.drugbank.ca/biodb/bio_entities/BE0000048 ). That is very helpful, because I can then translate from UniProt to Entrez (e.g. https://support.bioconductor.org/p/71702/ ). However, I have not yet found an automated way to translate the BE identifiers to UniProt IDs, which is the step I need before converting to the desired Entrez format.
I have looked at a number of resources and publications that survey tools for converting between ID formats, but those tools seem to convert only between, e.g., one form of gene ID and another, and none of the platforms I have checked supports conversion from BE identifiers to UniProt IDs. For example, when I specify DrugBank as the input type in the UniProt site's tool ( http://www.uniprot.org/uploadlists/ ), I think "DB"-formatted input is expected, because my "BE" inputs return the message that no results were found.
One possibility is to scrape the UniProt ID from the DrugBank page for each BE identifier, but if an existing platform already does this conversion, so that web scraping isn't necessary, that would be very helpful.
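For reference, the scraping fallback I have in mind would look roughly like the sketch below. It rests on assumptions I have not verified: that every entity page lives at the URL pattern of the example link above, that BE identifiers are always "BE" followed by seven digits (as in my examples), and that the page links to uniprot.org with the accession in the link URL. The function names are my own, and the accession pattern is the one UniProt documents for its accession numbers.

```python
import re
import urllib.request

# Assumption: URL pattern inferred from the single example link in this post.
BASE_URL = "https://www.drugbank.ca/biodb/bio_entities/{}"

# Assumption: BE identifiers are "BE" followed by exactly seven digits.
BE_ID = re.compile(r"BE\d{7}")

# Assumption: the page contains a link to uniprot.org whose URL includes the
# accession. The captured group is UniProt's documented accession format.
UNIPROT_LINK = re.compile(
    r"uniprot\.org/[^\"'\s<>]*?"
    r"\b([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b"
)


def extract_uniprot_ids(html):
    """Return unique UniProt accessions found in uniprot.org links, in page order."""
    found = []
    for accession in UNIPROT_LINK.findall(html):
        if accession not in found:
            found.append(accession)
    return found


def fetch_uniprot_ids(be_id, timeout=30):
    """Download the page for one BE identifier and extract UniProt accessions."""
    if not BE_ID.fullmatch(be_id):
        raise ValueError(f"not a BE identifier: {be_id!r}")
    request = urllib.request.Request(
        BASE_URL.format(be_id),
        headers={"User-Agent": "be-to-uniprot-sketch"},  # hypothetical agent string
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_uniprot_ids(html)


# Usage (requires network access, so it is left commented out):
# mapping = {be: fetch_uniprot_ids(be) for be in ["BE0000048", "BE0000767"]}
```

The extraction step returns a list rather than a single ID because I do not know whether a page can link to more than one UniProt entry. An empty list would flag entities that need manual checking.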
I would greatly appreciate any advice on how to automate the conversion from BE identifiers to UniProt IDs. Thanks in advance.