Can human experts predict solubility better than computers?

Boobier et al. J Cheminform (2017) 9:63
https://doi.org/10.1186/s13321-017-0250-y

RESEARCH ARTICLE, Open Access

Samuel Boobier1, Anne Osbourn2 and John B. O. Mitchell1*

Abstract

In this study, we design and carry out a survey, asking human experts to predict the aqueous solubility of druglike organic compounds. We investigate whether these experts, drawn largely from the pharmaceutical industry and academia, can match or exceed the predictive power of algorithms. Alongside this, we implement 10 typical machine learning algorithms on the same dataset. The best algorithm, a variety of neural network known as a multi-layer perceptron, gave an RMSE of 0.985 log S units and an R² of 0.706. We would not have predicted the relative success of this particular algorithm in advance. We found that the best individual human predictor generated an almost identical prediction quality, with an RMSE of 0.942 log S units and an R² of 0.723. The collection of algorithms contained a higher proportion of reasonably good predictors: nine out of ten, compared with around half of the humans. We found that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median generated excellent predictivity. While our consensus human predictor achieved very slightly better headline figures on various statistical measures, the difference between it and the consensus machine learning predictor was both small and statistically insignificant. We conclude that human experts can predict the aqueous solubility of druglike molecules essentially as well as machine learning algorithms. We find that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median is a powerful way of benefitting from the wisdom of crowds.

Background

Solubility is the property of a chemical solute dissolving in a solvent to form a homogeneous system [1].
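The consensus predictor and the two headline statistics from the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the toy log S values are hypothetical, and R² is assumed here to be the squared Pearson correlation, which the text above does not specify.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted log S values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    mean_o = statistics.mean(observed)
    mean_p = statistics.mean(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def median_consensus(predictions):
    """Per-compound median across predictors.

    `predictions` holds one list per predictor, each with one value
    per compound, all in the same compound order.
    """
    return [statistics.median(per_compound) for per_compound in zip(*predictions)]


# Hypothetical toy data: four compounds, three predictors (human or machine).
observed = [-2.0, -3.5, -5.0, -1.0]
predictors = [
    [-2.5, -3.0, -4.0, -1.5],
    [-1.0, -4.0, -5.5, -0.5],
    [-2.1, -3.6, -6.5, -1.2],
]
consensus_pred = median_consensus(predictors)

if __name__ == "__main__":
    for i, pred in enumerate(predictors, start=1):
        print(f"predictor {i}: RMSE={rmse(observed, pred):.3f}")
    print(f"consensus:   RMSE={rmse(observed, consensus_pred):.3f} "
          f"R2={r_squared(observed, consensus_pred):.3f}")
```

In this toy data the median consensus has a lower RMSE than any single predictor, which is the wisdom-of-crowds effect in miniature; real data need not behave so neatly.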
Solubility depends on the solvent used, as well as the pressure and temperature at which it was recorded. Water solubility is one of the key requirements of drugs, ensuring that they can be absorbed through the stomach lining and small intestine, eventually passing through the liver into the bloodstream. This means that low solubility is linked with poor bioavailability [2]. Another typical requirement of a drug is delivery in tablet form, for which adequate solubility is again needed. Tablets are strongly preferred to intravenous delivery of drugs, not least for patient compliance, ease of controlling the dose, and ease of self-administration. There are also toxicity problems associated with low-solubility drugs, for example crystalluria caused by the drug forming a crystalline solid in the body [3]. Moreover, poor pharmacokinetics and toxicity are major causes of late-stage failure in drug development. In fact, 40% of drug failures stem from poor pharmacokinetics [4]. Prediction of key pharmaceutical properties has become increasingly important with the use of high throughput screening (HTS). As HTS has gained popularity, drug candidates have tended towards higher molecular weight and lipophilicity, leading to lower solubility, which is considered the predominant problem [5]. It is vital that solubility can be understood and predicted, in order to reduce the number of late-stage failures due to poor bioavailability. Hence the need for ways to predict accurately both solubility and the essential properties often referred to as ADMET (absorption, distribution, metabolism, elimination and toxicity), in which solubility is a key factor.

*Correspondence: jbom@st-andrews.ac.uk. 1 Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St Andrews, St Andrews KY16 9ST, Scotland, UK. Full list of author information is available at the end of the article.
© The Author(s) 2017. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Lipinski’s popular “rule of five”, devised as a way to increase the success of developing effective medicines, was an empirical analysis of the attributes of successful drugs, giving guidelines on what makes a good pharmaceutical [2]. He found that effective drugs had molecular weight < 500, lipophilicity of log P < 5, and numbers of hydrogen bond donor and acceptor atoms that were < 5 and < 10 respectively. Increasingly, in silico approaches are being used to predict ADMET properties, in order to reduce the number of candidates coming through HTS. Solubility itself is difficult to measure. Typically log S, the base 10 logarithm of the solubility expressed in units of mol/dm³, is reported. There are many different definitions of solubility and various experimental ways of measuring it, which can lead to poor reproducibility of solubility measurements. Thus, with varied sources of data, especially when the exact details of the solubility methodology are not specified, assembling a high quality dataset for solubility prediction can be difficult. Thermodynamic solubility is the solubility measured under equilibrium conditions.
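The two quantitative definitions in this passage, log S and the rule-of-five thresholds, can be made concrete with a short sketch. The function names and example values are hypothetical; the thresholds are the strict inequalities given in the text, although the rule is often stated with "no more than" for the donor and acceptor counts.

```python
import math


def log_s(solubility_mol_per_dm3):
    """Base 10 logarithm of a solubility expressed in mol/dm^3."""
    return math.log10(solubility_mol_per_dm3)


def log_s_from_mass(solubility_g_per_dm3, molar_mass_g_per_mol):
    """Convert a mass-based solubility (g/dm^3) to log S via the molar mass."""
    return math.log10(solubility_g_per_dm3 / molar_mass_g_per_mol)


def rule_of_five_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of the four guidelines a compound fails.

    Thresholds follow the text as written (strict inequalities).
    """
    checks = [
        mol_weight < 500,
        log_p < 5,
        h_donors < 5,
        h_acceptors < 10,
    ]
    return sum(not ok for ok in checks)


if __name__ == "__main__":
    # Hypothetical compound: 1.8 g/dm^3 solubility, molar mass 180 g/mol.
    print(log_s_from_mass(1.8, 180.0))
    # Hypothetical descriptor values, not taken from any real molecule.
    print(rule_of_five_violations(350.0, 2.5, 2, 5))
    print(rule_of_five_violations(650.0, 6.1, 6, 12))
```

Because log S is logarithmic, an error of one log S unit corresponds to a tenfold error in the molar solubility, which is worth bearing in mind when reading the RMSE figures quoted in this article.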
Thermodynamic solubility can be determined with a shake flask approach, or by using a method like CheqSol [6], where equilibration is speeded up by shuttling between super- and subsaturated solutions via additions of small titres of acid or alkali. The Solubility Challenge [7, 8] used its own bespoke dataset, measuring intrinsic aqueous thermodynamic solubility with the CheqSol method. Its authors reported high reproducibility and claimed random errors of only 0.05 log S units. Despite this, the literature suggests that overall errors in reported intrinsic solubilities of drug-like molecules are around 0.6–0.7 log S units, as discussed by Palmer & Mitchell and previously by Jorgensen & Duffy [9, 10]. This means that the best computational predictions possible would have root mean squared errors (RMSE) similar to the experimental error in reported solubilities. The feasible prediction accuracy will be dataset-dependent. Using various machine learning (ML) methods similar to those used here, we obtained a best RMSE of 0.69 log S units for a test set of 330 druglike molecules, 0.90 for a different test set of 87 such compounds, 0.91 for the Solubility Challenge test set of 28 molecules, and in the same paper 1.11 log S units for a tenfold cross-validation of our DLS-100 set of 100 druglike compounds [11–15]. F (...truncated)