Question: Has Anyone Used Pubchemdb? Any Similar Api?
7.5 years ago, Aleadam wrote:

I posted this question on StackOverflow and am reposting it here since I just learned about this site. I am a cell biologist (postdoc) and an amateur programmer, looking to build an app for our students. As such, my knowledge of bioinformatics tools is limited.


Original question:

I'm building a database of chemical compounds. I need all the synonyms (IUPAC and common names) as well as safety data for each.
I'll be using the freely available data at PubChem (http://pubchem.ncbi.nlm.nih.gov/)

There's an easy way of querying each compound with simple HTTP GET requests. For example, to obtain glycerol data, the URL is:

http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=753

And the following URL would return an easy to parse format:

http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=753&disopt=DisplaySDF

but it returns only very basic information: it lacks safety data and includes only a few common names.
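For readers who want to try the two URLs above from a script, here is a minimal Python sketch. It only builds the URLs quoted in the question and pulls the `> <TAG>` data items out of an SDF record that has already been downloaded. The function names are hypothetical, the URLs may have changed since this was posted, and the parser assumes the usual SDF layout (data items introduced by `> <TAG>`, closed by a blank line, record ended by `$$$$`).

```python
from urllib.parse import urlencode

# Base URL copied from the question; it may have changed since the post was written.
SUMMARY_URL = "http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi"


def summary_url(cid, sdf=False):
    """Build the summary URL for a compound ID, optionally asking for SDF output."""
    params = {"cid": cid}
    if sdf:
        params["disopt"] = "DisplaySDF"
    return SUMMARY_URL + "?" + urlencode(params)


def parse_sdf_data_items(sdf_text):
    """Collect the '> <TAG>' data items of the first record in an SDF string."""
    items = {}
    name = None
    for line in sdf_text.splitlines():
        if line.startswith("$$$$"):
            # '$$$$' terminates a record; only the first record is read here.
            break
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            # Header line such as '> <SOME_TAG>': start a new data item.
            name = line[line.index("<") + 1 : line.rindex(">")]
            items[name] = []
        elif name is not None:
            if line.strip() == "":
                name = None  # a blank line closes the current data item
            else:
                items[name].append(line)
    return {key: "\n".join(value) for key, value in items.items()}
```

`summary_url(753, sdf=True)` reproduces the second URL above; the text fetched from it (for example with `urllib.request.urlopen`) can then be passed to `parse_sdf_data_items`.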

There is one public domain API for Java that seems very complete, developed by a group at Scripps (citation). The code is here.

Unfortunately, this API is not very well documented, and it's quite difficult to follow due to the complexity of the data involved. From what I gathered, pubchemdb uses the PubChem Power User Gateway (PUG) XML API.

Has anyone used this API (or any other one available)? I would appreciate a short description or tutorial on how to start with it.

Tags: ncbi, java, webservice

You can also post this question at this Q&A website: http://blueobelisk.shapado.com/ (PubChem developers tend to hang out there...)

Comment written 7.5 years ago by Egon Willighagen

@Egon thanks for the link. I had no idea about that site.

Comment written 7.5 years ago by Aleadam
7.5 years ago, Rich Apodaca (La Jolla, CA) wrote:

I haven't used the software you mention. Based on the requirements you outline and preferred solutions, I do have some ideas:

  1. Write your own PUG interface. You can use the PubChem web interface to develop your query once, and then just submit the same XML file(s) over and over again. The Fiehn Lab offers yet another Java API.
  2. Create your own PubChem mirror. Then write any interface you want. This approach could offer significant performance benefits for users of your database compared to calling the PUG interface with each request. You can even use a simple Java library to treat all of the download files as one continuous stream, pulling out only what you need.
  3. Use the chemical identifier resolver, which has a much simpler API. gChem offers an easy way to test out the service using Google Spreadsheets.
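To illustrate the "one continuous stream" idea from option 2, here is a rough Python sketch. It is not the Java library mentioned above. It assumes the mirror consists of gzip-compressed SDF files in which each record ends with a `$$$$` line; the function names and the tag passed in are hypothetical.

```python
import gzip


def iter_lines(paths):
    """Yield text lines from several gzipped SDF files as one continuous stream."""
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            yield from handle


def iter_records(lines):
    """Group a stream of SDF lines into records, using '$$$$' as the terminator."""
    record = []
    for line in lines:
        if line.startswith("$$$$"):
            yield "".join(record)
            record = []
        else:
            record.append(line)


def records_with_tag(paths, tag):
    """Yield only the records that contain the given data-item tag."""
    marker = "<" + tag + ">"
    for record in iter_records(iter_lines(paths)):
        if marker in record:
            yield record
```

Because everything is a generator, only one record is held in memory at a time, which matters when the downloaded files are large.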

What kind of safety data are you interested in? MSDS sheets? Something else?

The last time I checked, PubChem didn't have it. If that's still the case, you'll need a way to link chemical structures with CAS numbers, and then one approach might be to work with your university safety department (and other departments) to create a database that links CAS numbers with MSDS sheets. Both the chemical identifier resolver and PubChem can be helpful here.
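As a sketch of how the resolver could be used for the name-to-CAS step, the Python below builds resolver URLs. The base URL, the `<identifier>/<representation>` path layout, and representation names such as `cas` and `names` are my assumptions about the service, not something confirmed in this thread, so check the resolver's own documentation before relying on them. The function names are hypothetical.

```python
from urllib.parse import quote

# Assumed base URL of the chemical identifier resolver; verify before use.
RESOLVER_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def resolver_url(identifier, representation):
    """Build a URL of the assumed form <base>/<identifier>/<representation>."""
    return "/".join(
        [RESOLVER_BASE, quote(identifier, safe=""), quote(representation, safe="")]
    )


def split_values(response_text):
    """Split a plain-text response into values, assuming one value per line."""
    return [line.strip() for line in response_text.splitlines() if line.strip()]
```

Under those assumptions, `resolver_url("glycerol", "cas")` would be the address to fetch for CAS numbers and `resolver_url("glycerol", "names")` the one for synonyms, with `split_values` applied to the returned text.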


Thanks for the link to the Fiehn Lab site. It somehow eluded all my searches. Option 2, I believe, exceeds my abilities for now, but I will definitely look into the NCI resolver page. Safety data would be a bonus (just basic flammability, reactivity, and health hazard ratings), but not a must for now. I did not find anything in PubChem either, but I was not sure whether it was there or not.

Comment written 7.5 years ago by Aleadam
7.5 years ago, Michael Schubert (Cambridge, UK) wrote:

If using ChEBI instead of PubChem is ok for you, they provide database dumps in a number of formats that are well documented.

The advantage would be that you also have ontology terms that link the different entries together if you can make use of them (citation).


Looks like a great alternative. I'm OK with any database actually, and SQL is something I can start using faster than the XML interface from PubChem. I will definitely try this.

Comment written 7.5 years ago by Aleadam
7.1 years ago, Anon wrote:

Use eUtils and PUG/SOAP... google them... they are well documented.
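For anyone landing here later, a minimal Python sketch of what an eUtils search for a compound name might look like. The base URL, the `esearch.fcgi` endpoint, the `pccompound` database name, and the `IdList/Id` layout of the XML reply are stated from memory of the NCBI documentation and should be treated as assumptions; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed E-utilities base URL; confirm against the NCBI documentation.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="pccompound", retmax=20):
    """Build an ESearch URL that looks up a term in the given Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax}
    return EUTILS_BASE + "/esearch.fcgi?" + urlencode(params)


def parse_ids(xml_text):
    """Extract the ID strings from an ESearch XML reply (assumed IdList/Id layout)."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./IdList/Id")]
```

The IDs returned for the compound database should be PubChem CIDs, which can then be fed into the summary URLs from the question or into a PUG query.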
