Question: How to obtain DrugBank Drug Name and SMILES Data from DrugBank ID?
4 months ago by
EverInEarnest30 wrote:

I have downloaded the complete set of target data from the DrugBank site (available here: https://www.drugbank.ca/releases/latest#protein-identifiers). Each row contains data for one target; the final column of each row lists N DrugBank IDs for the N drugs associated with that target.

For each DrugBank ID, I need to look up the associated Name and SMILES string. I have searched BioStars and elsewhere online, but so far have not found an automated way to do this. One idea is to parse the 600 MB DrugBank XML database file and extract the Name and SMILES for each DrugBank drug ID. However, if there is a simpler way to obtain this data without processing that huge file, any recommendations would be much appreciated. Thanks in advance for any advice you can provide.

4 months ago by
Björn630 wrote:

Mining the XML is one way, but probably the most complicated one. Instead, you can download the DrugBank SDF or SMILES file and filter it by mapping your IDs against its entries.
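The mapping step could be sketched roughly as follows in Python. This is only an illustration under assumptions: it supposes the downloaded structure file is a CSV with columns named `DrugBank ID`, `Name` and `SMILES`, and that the last column of the target file holds IDs separated by semicolons. Check both against your actual files and adjust the column names and separator. The helper names are made up for this example.

```python
import csv


def load_drug_table(lines, id_col="DrugBank ID", name_col="Name", smiles_col="SMILES"):
    """Build {DrugBank ID: (name, SMILES)} from CSV lines.

    `lines` can be an open file or any iterable of text lines.
    The column names are assumptions; change them to match your download.
    """
    table = {}
    for row in csv.DictReader(lines):
        table[row[id_col]] = (row[name_col], row[smiles_col])
    return table


def split_ids(cell, sep=";"):
    """Split the drug ID column of one target row (separator is an assumption)."""
    return [part.strip() for part in cell.split(sep) if part.strip()]


def annotate(ids, table):
    """Return (id, name, smiles) tuples; unknown IDs get (id, None, None)."""
    return [(drug_id, *table.get(drug_id, (None, None))) for drug_id in ids]
```

With a real file you would call something like `load_drug_table(open("structure_links.csv", newline="", encoding="utf-8"))` (the file name here is a placeholder), then run `annotate(split_ids(cell), table)` on the last column of each target row.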


Thanks for your suggestion, Björn; I will try that approach.

written 4 months ago by EverInEarnest30

Could you clarify where the SMILES file and similar DrugBank files are available? I have looked through the DrugBank site and found the full database download, as well as this page: https://www.drugbank.ca/releases/latest#protein-identifiers, which offers the target, enzyme, carrier and transporter datasets, but I don't see drug names or SMILES data included in those files...

written 4 months ago by EverInEarnest30
4 months ago by
cannin230 wrote:

Try the PubChem PUG REST API: https://pubchem.ncbi.nlm.nih.gov/pug_rest/PUG_REST.html

Using Topotecan (DB01030) as an example below:

Get data by DrugBank ID: https://pubchem.ncbi.nlm.nih.gov/rest/pug/substance/sourceid/drugbank/DB01030/XML

This XML contains a PubChem Compound ID (CID); look for: <PC-CompoundType_id_cid>60700</PC-CompoundType_id_cid>

Then get the names (the name PubChem displays is the first synonym): https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/60700/synonyms/XML

and the SMILES (canonical and isomeric), which are part of the full compound record: https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/60700/XML
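To automate these steps for many IDs, something along these lines might work. It is a sketch, not a tested client: the URL patterns and the `PC-CompoundType_id_cid` tag are taken from the links above, while the tag names used to pick out the SMILES entries (`PC-InfoData`, `PC-Urn_label`, `PC-Urn_name`, `PC-InfoData_value_sval`) are my assumption about the compound XML layout and should be checked against a real response. The function names are made up for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def substance_url(drugbank_id):
    """URL of the substance record deposited under a DrugBank ID."""
    return f"{BASE}/substance/sourceid/drugbank/{drugbank_id}/XML"


def synonyms_url(cid):
    """URL of the synonym (name) list for a PubChem CID."""
    return f"{BASE}/compound/cid/{cid}/synonyms/XML"


def compound_url(cid):
    """URL of the full compound record, which includes the SMILES strings."""
    return f"{BASE}/compound/cid/{cid}/XML"


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_cids(substance_xml):
    """Return the distinct CIDs found in a substance record, in document order."""
    cids = []
    for el in ET.fromstring(substance_xml).iter():
        if local_name(el.tag) == "PC-CompoundType_id_cid" and el.text:
            cid = int(el.text)
            if cid not in cids:
                cids.append(cid)
    return cids


def extract_smiles(compound_xml):
    """Return {kind: SMILES} from a compound record (tag names are assumed)."""
    smiles = {}
    for info in ET.fromstring(compound_xml).iter():
        if local_name(info.tag) != "PC-InfoData":
            continue
        # Flatten one PC-InfoData block into {tag name: text}.
        fields = {local_name(el.tag): (el.text or "").strip() for el in info.iter()}
        if fields.get("PC-Urn_label") == "SMILES":
            smiles[fields.get("PC-Urn_name", "")] = fields.get("PC-InfoData_value_sval", "")
    return smiles


def fetch(url):
    """Download a URL and return the raw bytes (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()
```

For one drug you would chain them like `cids = extract_cids(fetch(substance_url("DB01030")))` and then `extract_smiles(fetch(compound_url(cids[0])))`. If you loop over many IDs, add a short pause between requests so you don't hammer the PubChem servers.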


Thank you, cannin!! I will try your strategy.

written 4 months ago by EverInEarnest30