Text-Mining ClinicalTrials.gov for Drug-Gene Interactions
11.8 years ago

I have little direct experience with text-mining tools. Can anyone suggest a good tool or approach for text-mining drug-gene relationships from the clinical trials available at clinicaltrials.gov? They provide an XML file for each clinical trial record, but unfortunately gene information is not a standard field (although genes are often mentioned in the free-form descriptive fields). I would have a list of genes and a list of drugs, and I want to know when they co-occur in a clinical trial record. However, it would be nice to get more than just simple co-occurrence. Is anyone aware of a tool that could rank co-occurrences in some reasonable way based on term incidence, proximity, natural language processing concepts, etc.? Here is an example record to give some context.
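For context, the co-occurrence baseline described above could be sketched roughly as follows in Python. This is only a minimal illustration under stated assumptions, not an existing tool: the function names (`record_text`, `term_positions`, `score_pairs`), the tag names in the sample record, and the scoring rule (smallest token distance first, total mention count as a tie-breaker) are all made up for the example. It also matches single-word terms only and ignores synonyms and sentence boundaries.

```python
import re
import xml.etree.ElementTree as ET
from itertools import product


def record_text(xml_string):
    """Concatenate all text nodes of one trial record, ignoring tag structure."""
    root = ET.fromstring(xml_string)
    return " ".join(t.strip() for t in root.itertext() if t.strip())


def term_positions(tokens, terms):
    """Map each single-word term to the token indices where it occurs.

    Matching is case-insensitive; the term is reported as spelled in `terms`.
    """
    wanted = {t.lower(): t for t in terms}
    hits = {}
    for i, tok in enumerate(tokens):
        key = tok.lower()
        if key in wanted:
            hits.setdefault(wanted[key], []).append(i)
    return hits


def score_pairs(xml_string, genes, drugs):
    """Return (gene, drug, min_distance, n_mentions) tuples, closest pairs first."""
    # Alphanumeric tokens only, so "EGFR-mutant" splits into "EGFR" and "mutant".
    tokens = re.findall(r"[A-Za-z0-9]+", record_text(xml_string))
    gene_hits = term_positions(tokens, genes)
    drug_hits = term_positions(tokens, drugs)
    results = []
    for (gene, gpos), (drug, dpos) in product(gene_hits.items(), drug_hits.items()):
        min_dist = min(abs(g - d) for g in gpos for d in dpos)
        mentions = len(gpos) + len(dpos)
        results.append((gene, drug, min_dist, mentions))
    # Assumed ranking: smaller distance ranks higher, more mentions breaks ties.
    results.sort(key=lambda r: (r[2], -r[3]))
    return results


# Hypothetical record; the tag names are illustrative only.
example = """<clinical_study>
  <brief_title>Erlotinib in EGFR-mutant lung cancer</brief_title>
  <detailed_description>Patients with EGFR mutations receive erlotinib daily until progression. KRAS status is recorded at baseline.</detailed_description>
</clinical_study>"""

for row in score_pairs(example, genes=["EGFR", "KRAS"], drugs=["erlotinib"]):
    print(row)
```

Token distance is a crude proxy for a real relationship: a gene and a drug in adjacent sentences can still score well, which is why something with sentence splitting or NLP-based relation extraction would be preferable.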

drug gene • 3.4k views
11.8 years ago
Arun 2.4k

There was a recent blog post on homolog.us that mentions some of the best resources available for text mining. Does this help at all, at least to get you started?

11.8 years ago
Mary 11k

What you are trying to do reminds me of the XplorMed tool: http://www.ogic.ca/projects/xplormed/

It used to have more features, but it might still work for you with your input data. It used to be able to start with a keyword, ID, or PubMed query and look for co-occurrence of terms, with ranking. Currently it asks for abstracts, but you might be able to fake it out by feeding it the clinical trial XML records instead. At least it's worth a try.

Even if it doesn't work, it would probably help to read their publications about how they did it, and you might be able to get the software and tweak it yourself if the abstract trick fails.
