Question: How to obtain DrugBank Drug Name and SMILES Data from DrugBank ID?
10 weeks ago, EverInEarnest wrote:

I have downloaded the complete set of target data from the DrugBank site (available here: https://www.drugbank.ca/releases/latest#protein-identifiers). Each row contains data for one target; the final column of each row lists N DrugBank IDs for the N drugs associated with that target.
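As a rough sketch of how the per-target ID lists could be pulled out of that file: the snippet below assumes the download is a CSV with a header row, that the first column identifies the target, and that DrugBank IDs have the form "DB" plus five digits (as in DB01030). The sample rows and column names are made up for illustration.

```python
import csv
import io
import re

# DrugBank IDs are assumed to look like "DB" followed by five digits.
DRUGBANK_ID = re.compile(r"DB\d{5}")

def drug_ids_per_target(csv_file):
    """Map each row's first-column value to the DrugBank IDs in its last column."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row (assumed to be present)
    mapping = {}
    for row in reader:
        if row:
            # findall() works regardless of how the IDs are delimited in the cell.
            mapping[row[0]] = DRUGBANK_ID.findall(row[-1])
    return mapping

# Made-up rows for illustration only; the real file has more columns.
sample = io.StringIO(
    "ID,Name,Drug IDs\n"
    'T1,Hypothetical target A,"DB00001; DB01030"\n'
    "T2,Hypothetical target B,DB00002\n"
)
targets = drug_ids_per_target(sample)
```

Matching the IDs with a regular expression avoids having to know the exact delimiter used inside the final column.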

For each DrugBank ID, I need to find the associated Name and SMILES data. I have searched BioStars and elsewhere online, but so far I haven't found an automated way to do this. One idea is to mine the 600 MB DrugBank XML database file and extract the drug Name and SMILES data associated with each DrugBank ID. However, if there is a simpler way to obtain the Name and SMILES data without having to deal with that huge file, any recommendations would be much appreciated. Thanks in advance for any advice you can provide.
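For reference, the XML-mining idea could be sketched roughly as follows, streaming the file so it never has to fit in memory. The namespace and the element names (`drug`, `drugbank-id` with `primary="true"`, `name`, `calculated-properties/property` with `kind` and `value`) are assumptions about the DrugBank XML layout and should be checked against the actual file; the sample record is made up.

```python
import io
import xml.etree.ElementTree as ET

NS = "{http://www.drugbank.ca}"  # assumed namespace of the DrugBank XML

def iter_drugs(xml_source):
    """Yield (drugbank_id, name, smiles) for each top-level <drug>, streaming."""
    context = ET.iterparse(xml_source, events=("start", "end"))
    _, root = next(context)  # consume the start event of the root element
    depth = 0                # nesting depth below the root
    for event, elem in context:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        # Only handle <drug> elements that are direct children of the root.
        if depth == 0 and elem.tag == NS + "drug":
            drug_id = None
            for id_elem in elem.findall(NS + "drugbank-id"):
                if id_elem.get("primary") == "true":
                    drug_id = id_elem.text
            name = elem.findtext(NS + "name")
            smiles = None
            for prop in elem.findall(f"{NS}calculated-properties/{NS}property"):
                if prop.findtext(NS + "kind") == "SMILES":
                    smiles = prop.findtext(NS + "value")
            yield drug_id, name, smiles
            root.clear()  # drop processed records to keep memory use flat

# Made-up record for illustration; pass the real file path instead.
sample = io.BytesIO(b"""<?xml version="1.0"?>
<drugbank xmlns="http://www.drugbank.ca">
  <drug>
    <drugbank-id primary="true">DB00000</drugbank-id>
    <name>Hypothetical drug</name>
    <calculated-properties>
      <property><kind>SMILES</kind><value>CCO</value></property>
    </calculated-properties>
  </drug>
</drugbank>""")
lookup = {drug_id: (name, smiles) for drug_id, name, smiles in iter_drugs(sample)}
```

The resulting dictionary can then be queried with the IDs from the target file.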

drugbank R • 252 views
10 weeks ago, Björn (Germany) wrote:

Mining the XML is one way, but probably the most complicated one. Instead, you can download the DrugBank SDF or SMILES file and map your IDs against it, for example to filter out the entries you need.
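A minimal sketch of that mapping, assuming the structures download is a standard SDF file whose data items include fields named `DRUGBANK_ID`, `GENERIC_NAME` and `SMILES` (these field names are assumptions and should be checked against the downloaded file). The record below is made up for illustration.

```python
import io

def iter_sdf_fields(handle):
    """Yield one {field: value} dict per SDF record, ignoring the atom/bond block."""
    fields, current = {}, None
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.startswith("$$$$"):               # end-of-record marker
            yield fields
            fields, current = {}, None
        elif line.startswith(">") and "<" in line:
            # Data header such as "> <SMILES>": keep the text between < and >.
            current = line[line.index("<") + 1 : line.rindex(">")]
            fields[current] = ""
        elif current is not None:
            if line.strip():
                fields[current] = (fields[current] + "\n" + line).strip()
            else:
                current = None                     # a blank line ends the data item

# Made-up record for illustration; open the real .sdf file instead.
sample = io.StringIO(
    "hypothetical molecule\n\n\n"
    "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n"
    "> <DRUGBANK_ID>\nDB00000\n\n"
    "> <GENERIC_NAME>\nHypothetical drug\n\n"
    "> <SMILES>\nCCO\n\n"
    "$$$$\n"
)

wanted = {"DB00000", "DB99999"}  # IDs collected from the target file
lookup = {}
for record in iter_sdf_fields(sample):
    drug_id = record.get("DRUGBANK_ID")
    if drug_id in wanted:
        lookup[drug_id] = (record.get("GENERIC_NAME"), record.get("SMILES"))
```

IDs in `wanted` that never show up in `lookup` have no structure entry in the file and would need a different source.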


Thanks for the suggestion, Björn; I will try this method.

Reply written 10 weeks ago by EverInEarnest

Could you clarify where the SMILES file and other similar DrugBank files are available? I have looked through the DrugBank site and can see the full database available for download, as well as this page: https://www.drugbank.ca/releases/latest#protein-identifiers, which offers the target, enzyme, carrier and transporter datasets, but I don't see drug names or SMILES data included in those files.

Reply written 9 weeks ago by EverInEarnest
9 weeks ago, cannin (United States) wrote:

Try the PubChem PUG REST API: https://pubchem.ncbi.nlm.nih.gov/pug_rest/PUG_REST.html

Using Topotecan (DB01030) as an example below:

Get data by DrugBank ID: https://pubchem.ncbi.nlm.nih.gov/rest/pug/substance/sourceid/drugbank/DB01030/XML

This XML contains a PubChem Compound ID (CID); look for: <PC-CompoundType_id_cid>60700</PC-CompoundType_id_cid>

Then get the names (the name PubChem displays is listed first): https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/60700/synonyms/XML

and SMILES (Canonical and Isomeric): https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/60700/XML
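The steps above could be scripted roughly as follows. This is only a sketch: the helper names are made up, and the `Synonym`, `PC-InfoData`, `PC-Urn_label`, `PC-Urn_name` and `PC-InfoData_value_sval` tag names are assumptions about the returned XML that should be checked against a real response. Only `lookup()` touches the network; the parsing helpers are shown on a made-up fragment.

```python
import urllib.request
import xml.etree.ElementTree as ET

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

def substance_url(drugbank_id):
    return f"{BASE}/substance/sourceid/drugbank/{drugbank_id}/XML"

def synonyms_url(cid):
    return f"{BASE}/compound/cid/{cid}/synonyms/XML"

def compound_url(cid):
    return f"{BASE}/compound/cid/{cid}/XML"

def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]

def first_text(xml_bytes, wanted_tag):
    """Return the text of the first element named wanted_tag, or None."""
    for elem in ET.fromstring(xml_bytes).iter():
        if local(elem.tag) == wanted_tag and elem.text:
            return elem.text.strip()
    return None

def smiles_entries(xml_bytes):
    """Return {variant name: SMILES} for every SMILES property in a compound record."""
    found = {}
    for info in ET.fromstring(xml_bytes).iter():
        if local(info.tag) != "PC-InfoData":
            continue
        texts = {local(e.tag): (e.text or "").strip() for e in info.iter()}
        if texts.get("PC-Urn_label") == "SMILES":
            found[texts.get("PC-Urn_name", "")] = texts.get("PC-InfoData_value_sval", "")
    return found

def fetch(url):
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()

def lookup(drugbank_id):
    """Return (cid, first listed name, SMILES dict), or None if no CID is found."""
    cid = first_text(fetch(substance_url(drugbank_id)), "PC-CompoundType_id_cid")
    if cid is None:
        return None
    name = first_text(fetch(synonyms_url(cid)), "Synonym")
    return cid, name, smiles_entries(fetch(compound_url(cid)))

# Offline check of the CID extraction on a made-up fragment.
fragment = (
    b'<PC-Substances xmlns="http://www.ncbi.nlm.nih.gov">'
    b"<PC-CompoundType_id_cid>60700</PC-CompoundType_id_cid>"
    b"</PC-Substances>"
)
cid = first_text(fragment, "PC-CompoundType_id_cid")

# lookup("DB01030")  # needs network access; pause between calls when looping over many IDs
```

Matching on local tag names sidesteps the XML namespace, so the helpers keep working whether or not the response declares one.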


Thank you, cannin!! I will try your strategy.

Reply written 9 weeks ago by EverInEarnest