Get all chemical compounds that interact with a target
4.1 years ago

I want to get all PubChem compounds (CIDs) that interact with a given UniProt target. I need this to study the relationship between chemical compounds and targets.

Thanks in advance

cid uniprot interaction • 2.8k views

What have you tried?


I tried the PubChem PUG REST API, but I can't reach such a service.


What do you mean by can't reach such service? What location are you based out of? Is the website blocked there?


Why are you interested in PubChem, which contains compounds linked to < 1,000 targets? Have you thought of exploring ChEMBL, which has a much larger proportion of active compounds identified using dose–response assays and linked to > 4,000 targets (see Gaulton et al. 2012 for the stats and more details on PubChem and ChEMBL)?

Because some of the PubChem data is available in ChEMBL, you could perhaps give the Open Targets Platform a go and get the (ChEMBL) drug compounds (with known mechanism of action) that modulate a target (as UniProt IDs or Ensembl gene IDs).

Drug information is also available via the Open Targets batch search or the REST API. Check this short animation to see how easy it is to run (and interpret) the batch search too. Or read the post on Open Targets and programmatic access. If REST is the way for you, this endpoint is one example of how to get the evidence used to link ENSG00000145335 and diseases when filtering for drug information (from ChEMBL).

This is how one can visualise the associated diseases in the user interface.

If you are not interested in associations with these diseases, I'd recommend exploring the ChEMBL web services API documentation.
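
As a rough illustration of that route, here is a minimal sketch that goes from a UniProt accession to ChEMBL targets and then to the ChEMBL molecule IDs with recorded activities, using the public ChEMBL REST endpoints with the standard library only. The helper names (`build_url`, `fetch_json`, `target_ids`, `molecule_ids`, `compounds_for_accession`) are made up for this example, and only the first page of each response is read, so treat it as a starting point rather than a complete harvest.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CHEMBL_API = "https://www.ebi.ac.uk/chembl/api/data"


def build_url(resource, **filters):
    """Build a ChEMBL REST URL that returns JSON for the given resource."""
    return f"{CHEMBL_API}/{resource}.json?{urlencode(filters)}"


def fetch_json(url):
    """Download and decode one JSON page (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def target_ids(payload):
    """Extract target ChEMBL IDs from one page of a `target` response."""
    return [t["target_chembl_id"] for t in payload.get("targets", [])]


def molecule_ids(payload):
    """Extract the unique molecule ChEMBL IDs from one `activity` page."""
    return sorted({a["molecule_chembl_id"] for a in payload.get("activities", [])})


def compounds_for_accession(accession, limit=100):
    """Return molecule ChEMBL IDs with activities against a UniProt accession.

    Only the first page of each request is read; a full harvest would have
    to follow the pagination links reported in `page_meta`.
    """
    found = set()
    targets = fetch_json(build_url("target", target_components__accession=accession))
    for target_id in target_ids(targets):
        page = fetch_json(build_url("activity", target_chembl_id=target_id, limit=limit))
        found.update(molecule_ids(page))
    return sorted(found)
```

Calling `compounds_for_accession("P00533")` would query the live service, so the result depends on the current ChEMBL release. The function returns ChEMBL molecule IDs, not PubChem CIDs.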


Thanks a lot for your reply. I need the PubChem CID for each chemical compound. If I use ChEMBL, is there any way to get the PubChem CID from the ChEMBL ID?


There is the PubChem Identifier Exchange Service. If you have a list of ChEMBL IDs, select the option "synonyms" as input and CIDs as output (or vice versa). I've tried CHEMBL1000, aka CETIRIZINE, which gives me PubChem CID 2678.
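
If you would rather script the conversion, the sketch below does the same lookup through the PubChem PUG REST name search. It assumes the ChEMBL ID is registered in PubChem as a synonym of the compound, which is the same assumption the "synonyms" option relies on. The helper names (`cid_lookup_url`, `parse_cids`, `chembl_to_cids`) are invented for this example.

```python
import json
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(chembl_id):
    """Build the PUG REST URL that searches compounds by name/synonym."""
    return f"{PUG_REST}/compound/name/{quote(chembl_id)}/cids/JSON"


def parse_cids(payload):
    """Pull the list of CIDs out of a PUG REST identifier-list response."""
    return payload.get("IdentifierList", {}).get("CID", [])


def chembl_to_cids(chembl_id):
    """Return the PubChem CIDs matching a ChEMBL ID (needs network access).

    Assumption: the ChEMBL ID is stored as a PubChem synonym. An HTTP 404
    from the service is treated as "no match".
    """
    try:
        with urlopen(cid_lookup_url(chembl_id), timeout=30) as response:
            return parse_cids(json.load(response))
    except HTTPError as error:
        if error.code == 404:
            return []
        raise
```

For a long list of IDs, pause between calls to `chembl_to_cids` so you stay within PubChem's request limits.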

Check more on the theme of Converting between drug identifier formats here on Biostars, as another tool (Cactvs Cheminformatics Toolkit) has been mentioned as well.

I will check with ChEMBL whether they have a web interface toolkit (or plan to release one), since they already have everything cross-referenced anyway. Perhaps their web services could also do the conversion.

And last but not least, I also recommend checking Pierre Lindenbaum's comments ;-)

24 months ago
bhavya.v • 0


I have a similar problem too. I had PubChem CIDs and wanted to extract ChEMBL targets for them, so I used the PubChem Identifier Exchange Service to convert the PubChem CIDs to ChEMBL IDs and then look up the ChEMBL targets.

The problem is that I can't find the API call to get compound-related targets from ChEMBL. I don't know which endpoint extracts targets from a compound input. Please help me with this.

I used the following code to get targets. I get most of the targets, but I am not able to extract the UniProt ID and gene symbol with it. Could you please help me?

Thank you in advance

import pandas as pd
from chembl_webresource_client.new_client import new_client

path = 'chembl.txt'
emblout_df = pd.DataFrame(columns=['Compound ID', 'Compound name', 'Target ID', 'Target name'])
row = 0

# Read one ChEMBL compound ID per line, skipping blank lines
with open(path, 'r') as f:
    chembl_ids = [x.strip() for x in f if x.strip()]

# Collect the human targets reported in the activity records of each compound
for chembl_id in chembl_ids:
    res = new_client.activity.filter(
        molecule_chembl_id=chembl_id,
        target_organism='Homo sapiens',
    ).only(['molecule_chembl_id', 'molecule_pref_name',
            'target_chembl_id', 'target_pref_name'])

    for activity in res:
        emblout_df.loc[row, 'Compound ID'] = activity['molecule_chembl_id']
        emblout_df.loc[row, 'Target ID'] = activity['target_chembl_id']
        emblout_df.loc[row, 'Compound name'] = activity['molecule_pref_name']
        emblout_df.loc[row, 'Target name'] = activity['target_pref_name']
        # UniProt accessions are not read here; they are looked up per target below
        row += 1

emblout_df = emblout_df.drop_duplicates(["Compound ID", "Target ID"])
emblout_df.reset_index(drop=True, inplace=True)

# Look up the UniProt accession and gene symbol of each single-protein target
target = new_client.target
for row, chembl in enumerate(emblout_df['Target ID']):
    a = target.get(chembl)
    if not a:
        continue

    b = a['target_components']
    if len(b) == 1:
        c = b[0]
        emblout_df.loc[row, 'Uniprot accession'] = c.get('accession')

        for symbol in c['target_component_synonyms']:
            if symbol['syn_type'] == 'GENE_SYMBOL':
                emblout_df.loc[row, 'Gene symbol'] = symbol['component_synonym']

# Write the table only after the UniProt and gene symbol columns are filled
emblout_df.to_csv('boswellia_targets.tsv', sep='\t', index=False)
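
The UniProt and gene symbol lookup can also be pulled out into a small function that works on a target record that has already been fetched. This is only a sketch: the function name `uniprot_and_genes` is invented here, and the record layout is assumed to match the fields used in the code above (`target_components`, `accession`, `target_component_synonyms`, `syn_type`, `component_synonym`). Unlike the loop above, it also reports targets with more than one component, such as protein complexes.

```python
def uniprot_and_genes(target_record):
    """Return one (accession, gene_symbols) pair per component of a target.

    `target_record` is a dict shaped like a ChEMBL target record; missing or
    empty fields yield an empty result instead of raising.
    """
    pairs = []
    for component in target_record.get("target_components") or []:
        symbols = [
            synonym["component_synonym"]
            for synonym in component.get("target_component_synonyms") or []
            if synonym.get("syn_type") == "GENE_SYMBOL"
        ]
        pairs.append((component.get("accession"), symbols))
    return pairs


# Hand-written record for illustration only, not real ChEMBL output
example = {
    "target_components": [
        {
            "accession": "X00001",
            "target_component_synonyms": [
                {"syn_type": "GENE_SYMBOL", "component_synonym": "GENE1"},
                {"syn_type": "UNIPROT", "component_synonym": "Example protein"},
            ],
        }
    ]
}
print(uniprot_and_genes(example))
```

Keeping the parsing separate from the download makes it easy to check on a small hand-written record before running it over every target ID.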


Please open a new question and add a link to this question as a related post. Adding an answer does not make sense here as you're not exactly answering the top level question. Also, please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

