Tuesday, May 7, 2013

Arange:  a tool for mining docking poses

I frequently use this blog and others to promote interesting and useful chemical informatics / molecular modeling tools that I've come across in my consulting work.  Although I enjoy programming, I have to admit that the plethora of quality, free, open source tools ripe for the picking means that I rarely have to string code together anymore.  That said, though, I do feel the need to give back a little; not just advocating for the selfless developers whose programs I use every day, but maybe contributing a bit of my own methodology.

So here's the first step:  a Lushington in Silico SourceForge page on which I will be stashing some of my more robust tools as I find the time.  Gerald Lushington's first contribution to the repository is Arange, a tool for mining molecular docking poses.

Mining docking poses, you might ask?  Why?

The answer is that many people tend to approach molecular docking with a narrow and somewhat prejudiced mind.  Specifically, they (well, okay, me too) often scan through the docked poses looking for a preconceived notion of what the pharmacophore should be.  Perhaps we have prior NMR data that suggests some key interactions should be conserved.  Possibly, we home right in on the subset of poses that are conserved across a family of inhibitors.  With these and related mindsets, we often skip over lots of poses that probably do not represent a stable bound conformation.  Much of that conformational data, however, is still potentially indicative of metastable or transient states, which reflect aspects of the overall dynamic interaction and contribute to the entropic favorability of a ligand for the receptor.  Such transient interactions can be important because a ligand never jumps straight into its ultimate bound conformation:  it bounces from surface to surface, either gradually moving toward the best binding spot or sometimes being expelled in order to try again later.  Favorable transient interactions in the right places in the receptor can thus kinetically expedite ligand approach toward a final binding conformer.

These transient interactions can also be useful for assembling chimeric ligands that exploit more than one interaction surface.

So, what Arange does is examine all docking poses made by known active compounds and contrast their spatial distribution with that of poses made by inactive compounds.  For each unique atom type, a weight is assigned to each point of a 3D grid according to the following formula:

where in the above, most terms are self-explanatory except for IACT, which is a simple factor that is equal to 1.0 if the compound is a known active and equal to -1.0 for a known inactive.  This allows the method to pick out areas of the binding site that discriminate between active and inactive compounds.  Specifically, if both actives and inactives bind to a given region comparably well, that region will be accorded a score close to zero and will be considered pharmacophorically irrelevant.  If active ligands bind to a region favorably and inactive ones do not, then that region's interactions will be considered pharmacophorically favorable.  The reverse is true for regions that predominantly favor interaction with inactive compounds.
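To make the sign convention concrete, here is a minimal Python sketch of the accumulation idea.  It is not the Arange code, and it does not reproduce the formula above:  the per-pose contribution (pose_score), the grid spacing, the input layout and the function names are all assumptions chosen for illustration.  The only part taken from the description is the IACT factor of +1.0 for actives and -1.0 for inactives, applied separately for each atom type.

```python
from collections import defaultdict
from math import floor


def grid_index(xyz, spacing=1.0):
    """Map a Cartesian coordinate to an integer grid cell (assumed cubic grid)."""
    return tuple(floor(c / spacing) for c in xyz)


def accumulate_weights(poses, spacing=1.0):
    """Sum signed contributions onto one sparse 3D grid per element.

    poses: iterable of (is_active, pose_score, atoms), where atoms is a
    list of (element, (x, y, z)).  pose_score is a placeholder for whatever
    per-pose term the real formula uses (an assumption, not Arange's own).
    """
    grids = defaultdict(lambda: defaultdict(float))
    for is_active, pose_score, atoms in poses:
        iact = 1.0 if is_active else -1.0  # IACT as described in the post
        for element, xyz in atoms:
            grids[element][grid_index(xyz, spacing)] += iact * pose_score
    return grids


# Hypothetical toy input: one active and one inactive carbon in the same
# cell, plus an active carbon alone in another cell.
poses = [
    (True, 2.0, [("C", (0.2, 0.3, 0.1))]),
    (False, 2.0, [("C", (0.4, 0.1, 0.9))]),
    (True, 1.5, [("C", (3.2, 0.0, 0.0))]),
]
grids = accumulate_weights(poses)
```

In this toy case the cell shared by an active and an inactive pose cancels out, while the cell visited only by an active pose keeps a positive weight, which is the behavior described above.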

In its current form, Arange processes the outputs of a Surflex docking simulation and generates a spatial depiction of pharmacophorically favorable (green dots) and unfavorable (red dots) regions that can be loaded into PyMOL for plotting next to a receptor model, as per:

where in the above the size of the spheres is indicative of the significance (i.e., weight) of the pharmacophoric interaction at that specific grid point.  The above graphic reflects carbon atom interactions (i.e., lipophiles), but analogous plots are provided for N, O, F, P, S, Cl and Br.
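For readers who want to try something similar on their own grids, the following Python sketch writes weighted grid points as pseudo-atoms in PDB format, which PyMOL can load alongside a receptor.  This is only an illustration:  the post does not say what file format Arange itself emits, and the residue names FAV and UNF, the chain label X and the use of the B-factor column to carry the weight magnitude are all my own assumptions.

```python
def grid_points_to_pdb(points, element="C"):
    """Format weighted grid points as fixed-column PDB HETATM records.

    points: iterable of ((x, y, z), weight).  Positive weights are tagged
    with residue name FAV, negative ones with UNF (hypothetical labels), and
    the absolute weight goes into the B-factor column.  Intended for small
    examples (fewer than 10000 points, |weight| below 1000).
    """
    lines = []
    for serial, ((x, y, z), weight) in enumerate(points, start=1):
        resname = "FAV" if weight >= 0 else "UNF"
        lines.append(
            "HETATM{:5d} {:<4s} {:>3s} {:1s}{:4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2s}".format(
                serial, element, resname, "X", serial,
                x, y, z, 1.00, abs(weight), element,
            )
        )
    lines.append("END")
    return "\n".join(lines) + "\n"


# Hypothetical example: one favorable and one unfavorable carbon grid point.
pdb_text = grid_points_to_pdb([((1.0, 2.5, -3.25), 0.8),
                               ((0.0, 0.0, 0.0), -1.2)])
```

Once such a file is loaded, the two classes could be colored separately in PyMOL with selections such as resn FAV and resn UNF.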

Please don't hesitate to take a look at the code and provide comments.  If people are interested, the code can be readily extended to interface with other docking software, and can be made a bit smarter (e.g., so as to differentiate between different valence states of the various atom types).