Question: How to obtain DrugBank Drug Name and SMILES Data from DrugBank ID?
EverInEarnest wrote (13 days ago):

I have downloaded the complete set of target data from the DrugBank site. Each row contains data for one target; the final column of each row lists the N DrugBank IDs of the N drugs associated with that target.

For each DrugBank ID, I need to find the associated Name and SMILES data. I have searched BioStars and elsewhere online, but so far have not found an automated way to do this. One idea is to mine the 600 MB DrugBank XML database file and extract the Name and SMILES data associated with each DrugBank ID. However, if there is a simpler way to obtain the Name and SMILES data without having to deal with that huge file, any recommendations would be much appreciated. Thanks in advance for any advice you can provide.

Björn wrote (13 days ago):

Mining the XML is one way, but probably the most complicated one. Instead, you can download the DrugBank SDF or SMILES file and map your IDs against it, keeping only the records you need.
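
A minimal sketch of this mapping approach in Python. It assumes the downloaded SDF stores the ID, name and SMILES as data items named DRUGBANK_ID, GENERIC_NAME and SMILES; those field names, and the helper names iter_sdf_records and lookup, are assumptions for illustration and should be checked against the actual file.

```python
from typing import Dict, Iterable, Iterator, Tuple


def iter_sdf_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one {field: value} dict per SDF record (data items only)."""
    record: Dict[str, str] = {}
    field = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":
            # "$$$$" terminates a record in the SDF format.
            yield record
            record, field = {}, None
        elif line.startswith(">") and "<" in line:
            # A data header looks like "> <FIELD_NAME>".
            field = line[line.index("<") + 1 : line.rindex(">")]
            record[field] = ""
        elif field is not None:
            if line == "":
                field = None  # a blank line ends the current data item
            elif record[field]:
                record[field] += "\n" + line
            else:
                record[field] = line


def lookup(
    ids: Iterable[str],
    sdf_lines: Iterable[str],
    id_field: str = "DRUGBANK_ID",     # assumed field name
    name_field: str = "GENERIC_NAME",  # assumed field name
    smiles_field: str = "SMILES",      # assumed field name
) -> Dict[str, Tuple[str, str]]:
    """Return {drugbank_id: (name, smiles)} for the requested IDs only."""
    wanted = set(ids)
    found: Dict[str, Tuple[str, str]] = {}
    for rec in iter_sdf_records(sdf_lines):
        drug_id = rec.get(id_field)
        if drug_id in wanted:
            found[drug_id] = (rec.get(name_field, ""), rec.get(smiles_field, ""))
    return found
```

Because the file is read line by line, it never has to fit in memory. With a hypothetical file name, usage would look like `with open("structures.sdf") as fh: hits = lookup(my_ids, fh)`.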


Thanks for your suggestion, Bjorn; I will pursue your suggested method.

(reply by EverInEarnest, 12 days ago)

Could you clarify where the SMILES file and other similar DrugBank files are available? I have reviewed the DrugBank site's contents and see the full database available for download, as well as a page that allows download of the target, enzyme, carrier and transporter datasets, but I haven't seen that the drug names or SMILES data are included in these files...

(reply by EverInEarnest, 11 days ago)
cannin (United States) wrote (10 days ago):

Try the PUG PubChem API:

Using Topotecan (DB01030) as an example below:

Get data by DrugBank ID:

The returned XML contains a PubChem Compound ID (CID); look for: <PC-CompoundType_id_cid>60700</PC-CompoundType_id_cid>
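
A rough sketch of this lookup step in Python. The URL pattern is an assumption based on the PUG REST layout (substance records addressed by source name and source ID), since the original link is not shown here; the function names are hypothetical. Only the element name PC-CompoundType_id_cid comes from the answer itself.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import List, Union

# Assumed base URL of the PubChem PUG REST service.
PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def substance_url(drugbank_id: str) -> str:
    """Build the (assumed) substance-by-source-ID URL for a DrugBank ID."""
    return f"{PUG}/substance/sourceid/DrugBank/{drugbank_id}/XML"


def extract_cids(xml_data: Union[str, bytes]) -> List[int]:
    """Collect the distinct CIDs found in PC-CompoundType_id_cid elements."""
    root = ET.fromstring(xml_data)
    cids: List[int] = []
    for el in root.iter():
        # Drop the "{namespace}" prefix so the match works with or without one.
        tag = el.tag.rsplit("}", 1)[-1]
        if tag == "PC-CompoundType_id_cid" and el.text:
            cid = int(el.text)
            if cid not in cids:
                cids.append(cid)
    return cids


def fetch_cids(drugbank_id: str) -> List[int]:
    """Download the substance record and return its CIDs (needs network)."""
    with urllib.request.urlopen(substance_url(drugbank_id), timeout=30) as resp:
        return extract_cids(resp.read())
```

Calling `fetch_cids("DB01030")` would issue the request for the Topotecan example; the parsing is kept in a separate function so it can be checked offline against a saved response.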

Then get names (the name PubChem lists is first):

and SMILES (Canonical and Isomeric):
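
A hedged sketch of these last two requests in Python, once the CID is known. The URL patterns and the JSON layout (InformationList/Information/Synonym for names, PropertyTable/Properties for SMILES) are assumptions based on the PUG REST conventions, and the helper names are hypothetical. The property names CanonicalSMILES and IsomericSMILES follow the wording above; the service may return differently named keys, so the parser simply keeps whatever it gets back.

```python
import json
import urllib.request
from typing import Dict

# Assumed base URL of the PubChem PUG REST service.
PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def synonyms_url(cid: int) -> str:
    """Assumed URL for the list of names (synonyms) of a compound."""
    return f"{PUG}/compound/cid/{cid}/synonyms/JSON"


def smiles_url(cid: int) -> str:
    """Assumed URL for the canonical and isomeric SMILES of a compound."""
    return f"{PUG}/compound/cid/{cid}/property/CanonicalSMILES,IsomericSMILES/JSON"


def first_name(payload: dict) -> str:
    """Return the first synonym, which is the name PubChem lists first."""
    info = payload["InformationList"]["Information"][0]
    return info["Synonym"][0]


def smiles_fields(payload: dict) -> Dict[str, str]:
    """Return every returned property except the CID itself."""
    props = payload["PropertyTable"]["Properties"][0]
    return {key: value for key, value in props.items() if key != "CID"}


def get_json(url: str) -> dict:
    """Fetch a URL and decode its JSON body (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)
```

For the Topotecan example, `first_name(get_json(synonyms_url(60700)))` and `smiles_fields(get_json(smiles_url(60700)))` would give the name and the SMILES strings. When looping over many DrugBank IDs, it is worth pausing between requests to stay within PubChem's usage limits.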


Thank you, cannin!! I will try your strategy.

(reply by EverInEarnest, 10 days ago)