Question: Parse Drugbank XML in R
2.3 years ago by
kakukeshi50 wrote:

Hi guys,

Does anybody know how to parse the complete DrugBank database, which is in XML, into R? I need the drug mechanism of action, so I can't use the other downloadable files from DrugBank.


R drug xml gene • 2.0k views
2.3 years ago by
EverInEarnest30 wrote:

I successfully used the xmlEventParse() function in R (from the XML package) to extract selected fields from the DrugBank database. (After experimenting with loading the full 600+ MB database into memory and finding that it did not work, I ended up using this SAX parsing method.)

I've included a subset of my code to give you a feel for what this looks like:

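library(gdata)
library(XML)   # xmlEventParse(), toString.XMLNode()
library(xml2)  # read_xml(), xml_find_first(), xml_text()

# Vector that collects the extracted values (drug_names is a placeholder name)
drug_names <- array(dim = 0)

# Define function to extract necessary data from each drug (= each main node)
getDrug <- function(x, ...) {

  # name the current drug for easy reference
  current_drug <- read_xml(toString.XMLNode(x))

  # extract properties related to drug and append them to the results
  drug_names <<- c(drug_names, xml_text(xml_find_first(current_drug, './name')))

  # remove the current node from memory when finished with it
  rm(current_drug)
}

# Use event-driven SAX parser to process the XML without requiring the full tree structure to be loaded into memory
# Call the function defined above; filename must hold the path to the DrugBank XML file
xmlEventParse(file = filename, handlers = NULL, trim = FALSE, branches = list(drug = getDrug))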
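For a self-contained illustration of the branches mechanism, here is a minimal sketch that runs against a tiny made-up XML file rather than the real DrugBank download. It assumes the element names drug, name and mechanism-of-action match the DrugBank schema; get_drug, results and the two sample records are hypothetical and only there for illustration. It uses the XML package alone, reading child elements directly from the node passed to the branch handler.

```r
library(XML)

# Tiny stand-in for the DrugBank file (made-up records, not real DrugBank content)
sample_xml <- '<drugbank>
  <drug><name>DrugA</name><mechanism-of-action>Blocks receptor X.</mechanism-of-action></drug>
  <drug><name>DrugB</name><mechanism-of-action>Inhibits enzyme Y.</mechanism-of-action></drug>
</drugbank>'
tmp <- tempfile(fileext = ".xml")
writeLines(sample_xml, tmp)

# One small data frame per <drug> element is collected here
results <- list()

# Branch handler: called once for every completed <drug> subtree
get_drug <- function(node, ...) {
  results[[length(results) + 1]] <<- data.frame(
    name      = xmlValue(node[["name"]]),
    mechanism = xmlValue(node[["mechanism-of-action"]]),
    stringsAsFactors = FALSE
  )
}

# SAX-style parse: only one <drug> subtree is built in memory at a time
invisible(xmlEventParse(tmp, handlers = NULL, branches = list(drug = get_drug)))

# Combine the per-drug rows into a single data frame
drugs <- do.call(rbind, results)
print(drugs)
```

On the real file you would pass the path of the downloaded XML instead of tmp. Growing a list inside the handler is fine for a sketch, but for a database of this size it may be worth writing rows out to disk as you go.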

Hope this helps.

Hello, I am trying this but I am not getting any results, and it throws an end-of-line error. Can someone help me get this working? I tried the script above but had no luck. Any help would be much appreciated. Thank you.

8 months ago by
mohfcis20 wrote:

I know it is an old post, but for anyone who might have the same question: there is a new package called dbparser that parses the DrugBank database into several R datasets.

