Question

Cluster Ligand Database And Extract Fragments

3

12.3 years ago

Flow ★ 1.6k

We have a database of 5000 mol2 ligand structures from the ZINC database. We wonder what the best approach would be for the following:

a) process the database and cluster the ligands by similarity, and also after flexible alignment

b) after a), extract the mol2 files of the most common fragments

Thanks

clustering alignment small • 3.9k views

updated 10.8 years ago by Biostar 20 • written 12.3 years ago by Flow ★ 1.6k

score 2 · Answer 1 · 2012-02-27

2

12.3 years ago

dimkal ▴ 730

Both points that you're describing are very easy to do with Canvas (by Schrödinger). I'm not sure if you have access to it, but perhaps you can get an evaluation copy. Here's a link to the Canvas product page.

12.3 years ago by dimkal ▴ 730

1

If you want, I have access to Canvas. Send it to me and I'll do it for you.

12.3 years ago by dimkal ▴ 730

0

I had a quick look, but they do not seem to be free.

12.3 years ago by Flow ★ 1.6k

0

Thanks a lot for your offer, but I need to do this on a regular basis.

12.3 years ago by Flow ★ 1.6k

score 2 · Answer 2 · 2012-03-02

There are a number of free tools for clustering and fragment-based analysis of small molecules. You could use the CDK to compute numerical descriptors or binary fingerprints, and then use that data to perform the clustering in R, WEKA, Matlab, Python, etc. Other options include RDKit or OpenBabel.
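To make the fingerprint-then-cluster idea concrete, here is a minimal sketch in plain Python. It assumes the binary fingerprints have already been computed by one of the toolkits above and exported as sets of on-bit indices; the function names, the toy fingerprints and the 0.55 cutoff are made up for illustration. The clustering is a simplified greedy, Butina-style sphere exclusion, not the exact algorithm of any particular package.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 1.0  # convention chosen here: two empty fingerprints are identical
    return len(a & b) / union


def greedy_clusters(fps, cutoff=0.55):
    """Greedy sphere-exclusion clustering (simplified, Butina-style).

    fps: dict mapping a ligand name to its set of on-bit indices.
    Returns a list of clusters; each cluster is a list of names,
    with the cluster centre first.
    """
    names = sorted(fps)
    # For every ligand, collect the other ligands within the similarity cutoff.
    neighbours = {
        n: {m for m in names if m != n and tanimoto(fps[n], fps[m]) >= cutoff}
        for n in names
    }
    unassigned = set(names)
    clusters = []
    while unassigned:
        # Centre = ligand with the most still-unassigned neighbours
        # (ties broken alphabetically so the result is deterministic).
        centre = max(sorted(unassigned),
                     key=lambda n: len(neighbours[n] & unassigned))
        members = [centre] + sorted(neighbours[centre] & unassigned)
        clusters.append(members)
        unassigned -= set(members)
    return clusters


if __name__ == "__main__":
    # Toy fingerprints (hypothetical bit indices, not real ZINC ligands).
    toy = {
        "lig_A": {1, 2, 3, 4},
        "lig_B": {1, 2, 3, 5},
        "lig_C": {1, 2, 3, 4, 5},
        "lig_D": {10, 11, 12},
        "lig_E": {10, 11, 12, 13},
    }
    for cluster in greedy_clusters(toy):
        print(cluster)
```

The all-pairs comparison is quadratic in the number of ligands, which should still be manageable for about 5000 molecules; for much larger sets you would probably want the toolkit's own bulk similarity routines instead.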

For fragmentation, you could also use the CDK, RDKit or a tool from NCGC to generate molecular fragments. R is particularly handy for fragment-based analysis and works directly with the CDK (see slides 158-169, though the API has since been updated to be easier to use).
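Once a toolkit has produced the fragments, finding the most common ones is mostly bookkeeping. A rough sketch in plain Python, assuming the fragments for each ligand have been written out as canonical SMILES strings (the input dictionary, the function name and the toy data below are hypothetical):

```python
from collections import Counter


def most_common_fragments(fragments_by_mol, top_n=3):
    """Rank fragments by the number of molecules they occur in.

    fragments_by_mol: dict mapping a ligand name to a list of fragment
    SMILES strings (as produced by whichever fragmenter you use).
    Returns a list of (fragment, molecule_count) tuples.
    """
    counts = Counter()
    for frags in fragments_by_mol.values():
        # set() so a fragment repeated within one molecule is counted once
        counts.update(set(frags))
    # Sort by descending count, then by SMILES string for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    # Toy input (made-up fragment lists, not real output from any tool).
    toy = {
        "lig_A": ["c1ccccc1", "C(=O)N"],
        "lig_B": ["c1ccccc1", "c1ccccc1", "C1CCNCC1"],
        "lig_C": ["c1ccccc1", "C(=O)N"],
    }
    for fragment, n_mols in most_common_fragments(toy):
        print(fragment, n_mols)
```

This only works if the same fragment always gets the same string, so the SMILES need to be canonicalised by the toolkit first. Converting the top fragments back to mol2 files would then be a job for the toolkit or for OpenBabel.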

And given that you're working with just 5000 molecules, the whole thing could be quite straightforward within R (assuming a decent amount of RAM), but of course other environments such as Pipeline Pilot and KNIME are also pretty straightforward.