Spatial chemical distance based on atomic property fields

Journal of Computer-Aided Molecular Design, Mar 2010

Similarity of compound chemical structures often leads to close pharmacological profiles, including binding to the same protein targets. The opposite, however, is not always true, as distinct chemical scaffolds can exhibit similar pharmacology as well. Therefore, relying on chemical similarity to known binders in the search for novel chemicals targeting the same protein artificially narrows down the results and makes lead hopping impossible. In this study we attempt to design a compound similarity/distance measure that better captures structural aspects of compound pharmacology and molecular interactions. The measure is based on our recently published method for compound spatial alignment with atomic property fields as a generalized 3D pharmacophoric potential. Using Partial Least Squares regression, we optimized the contributions of different atomic properties for better discrimination of compound pairs with the same pharmacology from those with different pharmacology. Our proposed similarity measure was then tested for its ability to discriminate pharmacologically similar pairs from decoys on a large, diverse dataset of 115 protein–ligand complexes. Compared to the 2D Tanimoto and Shape Tanimoto approaches, our new approach improved the area under the receiver operating characteristic curve (AUC) in 66% and 58% of domains, respectively. The improvement was particularly high for the previously problematic cases (weak performance of the 2D Tanimoto and Shape Tanimoto measures) with original AUC values below 0.8. For these cases we obtained improvement in 86% of domains compared to the 2D Tanimoto measure and 85% compared to the Shape Tanimoto measure. The proposed spatial chemical distance measure can be used in virtual ligand screening.

A. V. Grigoryan, I. Kufareva, M. Totrov, R. A. Abagyan

M. Totrov: Molsoft, LLC, 3366 N Torrey Pines Ct., Suite 300, La Jolla, CA 92037, USA
A. V. Grigoryan, I. Kufareva, R. A. Abagyan (corresponding author): Department of Molecular Biology, TPC28, The Scripps Research Institute, 10550 N Torrey Pines Rd., La Jolla, CA 92037, USA

Ligand-based approaches to protein family profiling have been widely studied and used for in silico pharmacology [1]. Similarity of compound chemical structures often leads to close pharmacological profiles, including binding to the same protein targets. For this reason, the chemical similarity criterion is widely used for identification of novel lead molecules in the development of pharmaceuticals. A variety of chemical similarity measures have been proposed. However, in many cases compounds with similar pharmacology escape correct recognition because they appear dissimilar by any existing measure. To navigate in ligand space, one needs to represent each compound using appropriate properties (descriptors) and then use a master equation to measure the distance between two compounds. Descriptors are usually classified according to their dimensionality, ranging from one-dimensional (1D) to three-dimensional (3D) properties [2, 3, 10]. 1D descriptors, which are easy and fast to compute, describe global properties that can be derived from the chemical formula and are used to classify compounds or ligands from various target families [3–5, 10]. For fast comparison, 1D linear representations of compounds are often used; the most popular string notation of this kind is the Simplified Molecular Input Line Entry System (SMILES) [3, 6, 10]. To improve discrimination, 2D topological descriptors are used. Graph-based methods, such as maximum common subgraph (MCS) [3, 7, 10], and fingerprint-based methods [3, 8, 10] are popular for clustering chemical compounds into subfamilies by substructure. Subgraph isomorphism is often too time-consuming to perform on the large numbers of structures in molecular databases, and for this reason substructure screening was developed as a rapid method of filtering out those molecules that definitely do not contain the substructure of interest [10, 46].
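The substructure screening idea mentioned above can be illustrated with a minimal sketch. It assumes a toy dictionary of five fragments and encodes each molecule as an integer bit mask; the fragment list, function names, and molecules are hypothetical and are not taken from the paper. The screen relies on one fact only: a molecule that lacks any fragment bit set in the query cannot contain the query substructure, so it can be discarded without running the expensive subgraph match.

```python
# Hypothetical fragment dictionary: bit i is set when fragment i occurs in the molecule.
FRAGMENTS = ["benzene ring", "carbonyl", "amine", "hydroxyl", "halogen"]


def fingerprint(present_fragments):
    """Encode a collection of fragment names as an integer bit mask."""
    bits = 0
    for name in present_fragments:
        bits |= 1 << FRAGMENTS.index(name)
    return bits


def passes_screen(query_fp, molecule_fp):
    """True if every bit set in the query is also set in the molecule.

    False proves the molecule cannot contain the query substructure;
    True only means the full subgraph match is still required.
    """
    return (query_fp & molecule_fp) == query_fp


def screen_database(query_fp, database):
    """Return the names of molecules that survive the fingerprint screen."""
    return [name for name, fp in database.items() if passes_screen(query_fp, fp)]


# Toy database of three made-up molecules.
database = {
    "mol_a": fingerprint(["benzene ring", "carbonyl", "amine"]),
    "mol_b": fingerprint(["hydroxyl", "halogen"]),
    "mol_c": fingerprint(["benzene ring", "amine", "hydroxyl"]),
}
query = fingerprint(["benzene ring", "amine"])
candidates = screen_database(query, database)
print(candidates)  # ['mol_a', 'mol_c']
```

Only the surviving candidates would then be passed to a graph-matching step, which is not shown here.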
The similarity between two molecules represented by 2D binary fingerprints is most frequently quantified using the Tanimoto coefficient, which measures the number of fragments the two molecules have in common [3, 9, 10]. It is well known that molecular recognition depends on the 3D structure and properties of a molecule rather than on the underlying substructure(s) [10]. 3D methods are computationally more expensive than 2D descriptor-based methods because they require consideration of the conformational space of the molecule. These methods can be divided into alignment-independent methods and methods that require the molecules to be aligned in 3D space before the similarity function is applied [10]. Some computationally expensive alignment-independent methods encode 3D geometrical descriptors in a binary fingerprint and then apply the Tanimoto coefficient exactly as for 2D fingerprints [10, 11]. Other methods are 3D equivalents of the MCS [10, 12, 13]. Many 3D approaches are based on distance matrices, in which the value of each element (i, j) equals the interatomic distance between atoms i and j [10, 14]. There are also approaches in which pharmacophore points are used for similarity comparisons [10, 15–17]. Alignment-dependent methods require consideration of the conformational flexibility of the molecules as well as their relative orientation [10]. These methods are devised to align the compared structures by maximizing the similarity function that is used [10, 45]. Many different ways have been developed to represent molecules and calculate similarity based on molecular shape and/or field [18–31, 45]. For reviews of molecular similarity methods, see refs [2, 10, 32–36]. The aim of this study is to design a spatial distance measure between two chemicals that optimizes recognition of their pharmacological similarity by using their 3D conformational ensembles and properties pertaining to molecular interactions.
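Two of the quantities described above are simple enough to sketch directly. The Tanimoto coefficient of two binary fingerprints with a and b bits set, of which c are shared, is c / (a + b - c). The distance matrix holds the Euclidean distance between atoms i and j at element (i, j). The fingerprints and coordinates below are made up for illustration, and returning 0.0 for two empty fingerprints is a convention chosen here, not something the paper specifies.

```python
import math


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)  # equals a + b - c
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def distance_matrix(coords):
    """Symmetric matrix whose element (i, j) is the distance between atoms i and j."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


# Made-up fingerprints: 3 shared bits out of 6 distinct bits.
fp_1 = {0, 2, 5, 7}
fp_2 = {2, 5, 7, 9, 11}
similarity = tanimoto(fp_1, fp_2)
print(similarity)  # 0.5

# Made-up atomic coordinates in angstroms.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
dmat = distance_matrix(coords)
print(dmat[0][1], dmat[0][2])  # 5.0 2.0
```

The distance matrix does not change when the molecule is rotated or translated, which is why it suits alignment-independent comparison.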
We recently introduced a novel spatial alignment method based on atomic property fields (APF) as a generalized 3D pharmacophoric potential [37]. APF represents the ligand by a multi-component (vector) 3D potential whose components correspond to various physico-chemical atomic properties. In the present study, the APF alignment is used to measure spatial chemical similarity/distance between ligands. A diverse benchmark of 99 proteins (see Supplementary Table 1 for details) and the ligands co-crystallized with them (6 ligands per protein on average) was used to train the APF parameters for better discrimination of pharmacologically similar pairs from dissimilar ones. We took all possible pairs of ligands from the same receptor and, for the ligands co-crystallized with each protein, all possible pairs with ligands co-crystallized with 20 other proteins chosen at random from the benchmark; the APF representation of the larger ligand was us (...truncated)
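The pair construction described above can be sketched as follows, under stated assumptions. Ligands are given as identifiers grouped by protein. Same-protein pairs are treated as pharmacologically similar, and pairs formed with ligands of up to 20 randomly chosen other proteins are treated as dissimilar. The function name, data layout, seed, and de-duplication of mirrored pairs are choices made for this illustration. The source text is truncated at this point, so how each pair was subsequently aligned and scored is not covered.

```python
import itertools
import random


def build_pairs(ligands_by_protein, n_other=20, seed=0):
    """Return (positives, negatives) as sorted lists of ligand-identifier pairs.

    positives: all pairs of ligands co-crystallized with the same protein.
    negatives: each ligand of a protein paired with every ligand of up to
    n_other other proteins chosen at random; mirrored duplicates are merged.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    proteins = sorted(ligands_by_protein)
    positives, negatives = set(), set()
    for protein in proteins:
        own = ligands_by_protein[protein]
        for a, b in itertools.combinations(own, 2):
            positives.add(tuple(sorted((a, b))))
        others = [p for p in proteins if p != protein]
        chosen = rng.sample(others, min(n_other, len(others)))
        for other in chosen:
            for a, b in itertools.product(own, ligands_by_protein[other]):
                negatives.add(tuple(sorted((a, b))))
    return sorted(positives), sorted(negatives)


# Toy benchmark with hypothetical protein and ligand identifiers.
toy = {
    "protA": ["a1", "a2", "a3"],
    "protB": ["b1", "b2"],
    "protC": ["c1"],
}
positives, negatives = build_pairs(toy)
print(len(positives), len(negatives))  # 4 11
```

With only three proteins in the toy data, every other protein is selected, so the result does not depend on the random seed; on a benchmark of 99 proteins the `n_other=20` limit would take effect.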


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007%2Fs10822-009-9316-x.pdf
Article home page: http://link.springer.com/article/10.1007/s10822-009-9316-x

A. V. Grigoryan, I. Kufareva, M. Totrov, R. A. Abagyan. Spatial chemical distance based on atomic property fields, Journal of Computer-Aided Molecular Design, 2010, pp. 173-182, Volume 24, Issue 3, DOI: 10.1007/s10822-009-9316-x