Question: Cluster ligand database and extract fragments
 
 
 

We have a database of 5000 mol2 ligand structures from the ZINC database. What would be the best approach for the following?

a) Process the database and cluster the ligands by similarity, and also cluster them after flexible alignment.

b) After a), extract the mol2 files of the most common fragments.

Thanks


2 answers

 
 
 
 

There are a number of free tools for clustering and fragment-based analysis of small molecules. You could use the CDK to compute numerical descriptors or binary fingerprints and then cluster on that data in the environment of your choice (R, WEKA, MATLAB, Python, ...). Other options include RDKit or Open Babel.
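As a rough illustration of the fingerprint-then-cluster idea, here is a minimal pure-Python sketch. It assumes each fingerprint has already been computed elsewhere (for example with the CDK or RDKit) and is available as a set of on-bit indices. The function names, the 0.7 cutoff and the toy fingerprints are assumptions made up for the example, and the greedy scheme is a Butina-style variant rather than any particular tool's implementation.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def butina_cluster(fingerprints, cutoff=0.7):
    """Greedy Butina-style clustering.

    Returns a list of clusters; each cluster is a list of molecule
    indices with the chosen centroid first.
    """
    n = len(fingerprints)
    # Neighbour lists: pairs whose similarity reaches the cutoff.
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(fingerprints[i], fingerprints[j]) >= cutoff:
                neighbors[i].add(j)
                neighbors[j].add(i)

    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # Centroid = unassigned molecule with the most unassigned neighbours
        # (ties go to the lowest index because we scan in sorted order).
        centroid = max(sorted(unassigned),
                       key=lambda i: len(neighbors[i] & unassigned))
        members = [centroid] + sorted(neighbors[centroid] & unassigned)
        clusters.append(members)
        unassigned -= set(members)
    return clusters


# Toy fingerprints (made up, not real ZINC data).
example_fps = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {1, 2, 3, 4, 5},
    {10, 11, 12},
    {10, 11, 13},
]
clusters = butina_cluster(example_fps, cutoff=0.7)
print(clusters)
```

The all-pairs loop is quadratic, which should be fine for about 5000 molecules (roughly 12.5 million comparisons) but would need a smarter neighbour search for much larger sets.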

For fragmentation, you could also use the CDK, RDKit or a tool from NCGC to generate molecular fragments. R is particularly handy for fragment-based analysis and works directly with the CDK (see slides 158-169, though the API has since been updated to be easier to use).
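Once a fragmenter has produced fragments for every ligand, finding the most common ones is a counting exercise. The sketch below assumes the fragments are already available as SMILES strings per molecule; the molecule names, the fragment lists and the function name are hypothetical, and the fragmentation step itself (CDK, RDKit or similar) is not shown.

```python
from collections import Counter


def most_common_fragments(fragments_by_molecule, top_n=3):
    """Rank fragments by the number of molecules they occur in.

    fragments_by_molecule maps a molecule name to a list of fragment
    SMILES. A fragment is counted once per molecule, even if it occurs
    several times in that molecule.
    """
    counts = Counter()
    for fragments in fragments_by_molecule.values():
        counts.update(set(fragments))
    # Sort by descending count, then by SMILES so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical fragmenter output for three ligands.
example_fragments = {
    "mol_a": ["c1ccccc1", "C(=O)N", "c1ccccc1"],
    "mol_b": ["c1ccccc1", "C1CCNCC1"],
    "mol_c": ["C(=O)N", "c1ccccc1"],
}
top = most_common_fragments(example_fragments, top_n=2)
print(top)
```

Comparing fragments as plain strings only works if the SMILES are canonical, so the fragmenter output should be canonicalised by the same toolkit before counting. Writing the selected fragments back out as mol2 files would then be a format conversion step in whichever toolkit is used.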

Given that you're working with just 5000 molecules, the whole thing could be quite straightforward within R (assuming a decent amount of RAM), but other environments such as Pipeline Pilot and KNIME are also pretty straightforward.

 
 
 
 

Both points that you're describing are very easy to do with Canvas (by Schrödinger). I'm not sure if you have access to it; perhaps you can get an evaluation copy. Here's a link to the Canvas product page.

 

I had a quick look, but it does not seem to be free.

written 14 months ago by Flow
 

If you want, I have access to Canvas. Send it to me and I'll do it for you.

written 14 months ago by dimkal
 

Thanks a lot for your offer, but I need to do this on a regular basis.

written 14 months ago by Flow
 