Has Anyone Used pubchemdb? Any Similar API?
13.0 years ago
Aleadam ▴ 50

I posted this question on Stack Overflow and am reposting it here, since I just learned about this site. I am a cell biologist (postdoc) and an amateur programmer, looking to build an app for our students. As such, my knowledge of bioinformatics tools is limited.


Original question:

I'm building a database of chemical compounds. I need all the synonyms (IUPAC and common names) as well as safety data for each.
I'll be using the freely available data at PubChem (http://pubchem.ncbi.nlm.nih.gov/)

There's an easy way to query each compound with simple HTTP GET requests. For example, to obtain glycerol data, the URL is:

http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=753

And the following URL returns an easy-to-parse format:

http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=753&disopt=DisplaySDF

but it returns only very basic information: it lacks safety data and includes only a few common names.
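For readers who want to try the HTTP GET approach described above, here is a minimal sketch in Java. The URL pattern is copied from the post and may no longer be served by PubChem. The class and method names (`PubChemSdfSketch`, `sdfUrl`, `fetch`, `parseDataItems`) are made up for illustration. The parser assumes the usual SDF convention of a `> <FIELD_NAME>` header followed by value lines and a blank line, and it has not been checked against everything PubChem emits.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PubChemSdfSketch {

    // URL pattern taken from the post (assumption: it is still valid).
    static String sdfUrl(int cid) {
        return "http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid="
                + cid + "&disopt=DisplaySDF";
    }

    // Plain HTTP GET that returns the response body as text (needs Java 9+).
    static String fetch(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Collects SDF data items: each "> <NAME>" header maps to the
    // non-blank lines that follow it, up to the next blank line or "$$$$".
    static Map<String, List<String>> parseDataItems(String sdfText) {
        Map<String, List<String>> items = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : sdfText.split("\\R")) {
            if (line.startsWith(">")) {
                int open = line.indexOf('<');
                int close = line.indexOf('>', open + 1);
                if (open >= 0 && close > open) {
                    String name = line.substring(open + 1, close);
                    current = items.computeIfAbsent(name, k -> new ArrayList<>());
                } else {
                    current = null; // malformed header, skip its values
                }
            } else if (line.trim().isEmpty() || line.startsWith("$$$$")) {
                current = null;     // end of the current data item or record
            } else if (current != null) {
                current.add(line);
            }
        }
        return items;
    }
}
```

Calling `PubChemSdfSketch.parseDataItems(PubChemSdfSketch.fetch(PubChemSdfSketch.sdfUrl(753)))` would then give a map from field name to values, provided the endpoint still responds with SDF text.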

There is one public-domain Java API that seems very complete, developed by a group at Scripps (citation). The code is here.

Unfortunately, this API is not very well documented, and it is quite difficult to follow due to the complexity of the data involved. From what I gathered, pubchemdb uses the PubChem Power User Gateway (PUG) XML API.

Has anyone used this API (or any other one available)? I would appreciate a short description or tutorial on how to start with it.

ncbi java webservice • 4.0k views

You can also post this question at this Q&A website: http://blueobelisk.shapado.com/ (PubChem developers tend to hang out there...)


@Egon thanks for the link. I had no idea about that site.

13.0 years ago
Rich Apodaca ▴ 170

I haven't used the software you mention. Based on the requirements you outline and preferred solutions, I do have some ideas:

  1. Write your own PUG interface. You can use the PubChem web interface to develop your query once, and then just submit the same XML file(s) over and over again. The Fiehn Lab offers yet another Java API.
  2. Create your own PubChem mirror. Then write any interface you want. This approach could offer significant performance benefits for users of your database compared to calling the PUG interface with each request. You can even use a simple Java library to treat all of the download files as one continuous stream, pulling out only what you need.
  3. Use the chemical identifier resolver, which has a much simpler API. gChem offers an easy way to test out the service using Google Spreadsheets.
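As a rough illustration of option 2, the sketch below treats several SDF files as one continuous stream and keeps only the records of interest. It uses the JDK's `SequenceInputStream`, not the library mentioned above. The class and method names (`SdfStreamSketch`, `concat`, `selectRecords`) are hypothetical. It assumes each file ends with a newline after its last `$$$$` line. For gzip-compressed downloads, each part could be wrapped in `java.util.zip.GZIPInputStream` before concatenation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

final class SdfStreamSketch {

    // Presents several input streams as a single stream, read in list order.
    static InputStream concat(List<? extends InputStream> parts) {
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    // Reads SDF records (terminated by a "$$$$" line) one at a time and
    // returns only those accepted by the predicate. Text after the last
    // "$$$$" line is ignored.
    static List<String> selectRecords(InputStream in, Predicate<String> keep) {
        List<String> kept = new ArrayList<>();
        StringBuilder record = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("$$$$")) {
                    String text = record.toString();
                    if (keep.test(text)) {
                        kept.add(text);
                    }
                    record.setLength(0); // start the next record
                } else {
                    record.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return kept;
    }
}
```

Only one record is held in memory at a time (plus whatever is kept), so this should scale to large downloads, although a real mirror would probably load the selected fields into a database rather than a list.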

What kind of safety data are you interested in? MSDS sheets? Something else?

The last time I checked, PubChem didn't have it. If that's still the case, you'll need a way to link chemical structures with CAS number, and then one approach might be to work with your university safety department (and other departments) to create a database that links CAS numbers with MSDS sheets. Both the chemical identifier resolver and PubChem can be helpful here.
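To make the resolver suggestion concrete, here is a small sketch that builds lookup URLs for the chemical identifier resolver. The base URL and the `/identifier/representation` layout, including the `names` and `cas` representations, are written from memory of the NCI/CADD service and should be treated as assumptions to check against its documentation. `ResolverUrlSketch` and `resolverUrl` are made-up names.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ResolverUrlSketch {

    // Assumed base URL of the NCI/CADD Chemical Identifier Resolver.
    static final String BASE = "https://cactus.nci.nih.gov/chemical/structure/";

    // Builds a URL such as .../structure/glycerol/cas.
    // URLEncoder targets form encoding, so "+" is rewritten to "%20"
    // to make spaces safe inside a URL path (needs Java 10+).
    static String resolverUrl(String identifier, String representation) {
        String encoded = URLEncoder.encode(identifier, StandardCharsets.UTF_8)
                .replace("+", "%20");
        return BASE + encoded + "/" + representation;
    }
}
```

For example, `ResolverUrlSketch.resolverUrl("glycerol", "cas")` produces a URL that, if the service behaves as assumed, returns the CAS number(s) as plain text, which could then serve as the key into a local CAS-to-MSDS table.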


Thanks for the link to the Fiehn Lab site. It somehow eluded all my searches. Option 2, I believe, exceeds my abilities for now, but I will definitely look into the NCI resolver page. Safety data would be a bonus (just basic flammability, reactivity, and health hazard ratings), but it is not a must for now. I did not find anything in PubChem either, but I was not sure whether it was there.

13.0 years ago

If using ChEBI instead of PubChem is OK for you, they provide well-documented database dumps in a number of formats.

The advantage would be that you also have ontology terms that link the different entries together if you can make use of them (citation).


Looks like a great alternative. I'm OK with any database actually, and SQL is something I can start using faster than the XML interface from PubChem. I will definitely try this.

12.6 years ago
Anon ▴ 10

Use eUtils and PUG/SOAP... google them... they are well documented.
