I am trying to manually map the 1309 Connectivity Map (cmap) small molecules to PubChem CIDs; the resulting files are attached.

However, some of the mappings may be wrong, and there are 80 molecules for which I cannot find a mapping.

I hope others can correct my mappings and help identify the 80 molecules that I could not find in the PubChem database.

If someone can provide the SMILES for these 1309 molecules, that would be great.

Here is the download link for the mapping.



Check out LigDig, which can help you map cmap output to PubChem CIDs. You may have better luck with some of the compounds you could not map manually (worth a try anyway). For obtaining SMILES, try the R/Bioconductor package ChemmineR. Of course, it can't help you with the compounds that did not map to PubChem. Lastly, see this forum post about cmap using non-standard compound names. This might be part of your problem, and I don't know if there is a solution.
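If you would rather script the lookup than do it by hand, here is a minimal Python sketch that goes through PubChem's PUG REST web service instead of ChemmineR. It is only an illustration under a few assumptions: the URL layout is PUG REST as I understand it, the property name `CanonicalSMILES` may need adjusting (check the current PubChem documentation), and the helper names (`name_to_cids_url`, `cids_to_property_url`, `parse_property_csv`, `fetch`) are made up for this example. Non-standard cmap names will still fail the name lookup, so this does not solve the 80 unmapped compounds by itself.

```python
import csv
import io
import urllib.parse
import urllib.request

# Base URL of PubChem's PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def name_to_cids_url(name):
    """Build the URL that asks PubChem for the CIDs matching a compound name."""
    quoted = urllib.parse.quote(name, safe="")
    return f"{PUG_REST}/compound/name/{quoted}/cids/TXT"


def cids_to_property_url(cids, prop="CanonicalSMILES"):
    """Build the URL that requests one property (assumed name) for a list of CIDs as CSV."""
    cid_list = ",".join(str(c) for c in cids)
    return f"{PUG_REST}/compound/cid/{cid_list}/property/{prop}/CSV"


def parse_property_csv(text):
    """Parse the CSV reply into {cid: value}, using the first column that is not CID."""
    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        value = next(v for k, v in row.items() if k != "CID")
        result[int(row["CID"])] = value
    return result


def fetch(url):
    """Download a URL and return its body as text (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")
```

Usage would look something like `parse_property_csv(fetch(cids_to_property_url(my_cids)))`, where `my_cids` is the list of CIDs from your mapping file. For 1309 compounds, send the CIDs in batches and pause between requests so you stay within PubChem's usage limits. A name that PubChem does not recognise makes `fetch` raise an HTTP error, which is a convenient way to flag the compounds that still need manual curation.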

Thanks for your help.

I will check that.

Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized.

