Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy (pdf)

Article PDF cannot be displayed. You can download it here:

http://www.biomedcentral.com/content/pdf/s12859-016-1452-4.pdf

Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy

Penas et al. BMC Bioinformatics (2017) 18:52 DOI 10.1186/s12859-016-1452-4 METHODOLOGY ARTICLE Open Access Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy David R. Penas1 , Patricia González2 , Jose A. Egea3 , Ramón Doallo2 and Julio R. Banga1* Abstract Background: The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problems but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies. Results: The performance and robustness of saCeSS is illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, bakerés yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state of the art methods (from days to minutes, in several cases) even when only a small number of processors is used. Conclusions: The new parallel cooperative method presented here allows the solution of medium and large scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models. Keywords: Dynamic models, Parameter estimation, Global optimization, Metaheuristics, Parallelization Background Computational simulation and optimization are key topics in systems biology and bioinformatics, playing a central role in mathematical approaches considering the reverse engineering of biological systems [1–9] and the handling of uncertainty in that context [10–14]. Due to the significant computational cost associated with the simulation, calibration and analysis of models of realistic size, several authors have considered different parallelization strategies in order to accelerate those tasks [15–18]. *Correspondence: BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, 36208 Vigo, Spain Full list of author information is available at the end of the article 1 Recent efforts have been focused on scaling-up the development of dynamic (kinetic) models [19–25], with the ultimate goal of obtaining whole-cell models [26, 27]. In this context, the problem of parameter estimation in dynamic models (also known as model calibration) has received great attention [28–30], particularly regarding the use of global optimization metaheuristics and hybrid methods [31–35]. It should be noted that the use of multistart local methods (i.e. repeated local searches started from different initial guesses inside a bounded domain) also enjoys great popularity, but it has been shown to be rather inefficient, even when exploiting high-quality gradient information [35]. Parallel global optimization © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Penas et al. BMC Bioinformatics (2017) 18:52 strategies have been considered in several system biology studies, including parallel variants of simulated annealing [36], evolution strategies [37–40], particle swarm optimization [41, 42] and differential evolution [43]. Scatter search is a promising metaheuristic that in sequential implementations has been shown to outperform other state of the art stochastic global optimization methods [35, 44–50]. Recently, a prototype of cooperative scatter search implementation using multiple processors was presented [51], showing good performance for the calibration of several large-scale models. However, this prototype used a simple synchronous strategy and small number of processors (due to inefficient communications). Thus, although it could reduce the computation times of sequential scatter search, it still required very significant efforts when dealing with large-scale applications. Here we significantly extend and improve this method by proposing a new parallel cooperative scheme, named self-adaptive cooperative enhanced scatter search (saCeSS) that incorporates the following novel strategies: • the combination of a coarse-grained distributed-memory parallelization paradigm and an underlying fine-grained parallelization of the individual tasks with a shared-memory model, in order to improve the scalability. • an improved cooperation scheme, including an information exchange mechanism driven by the quality of the solutions, an asynchronous communication protocol to handle inter-process information exchange, and a self-adaptive procedure to dynamically tune the settings of the parallel searches. We present below a detailed description of saCeSS, including the details of a high-performance implementation based on a hybrid message passing interface (MPI) and open multi-processing (OpenMP) combination. The excellent performance and scalability of this novel method are illustrated considering a set of very challenging parameter estimation problems in large-scale dynamic models of biological systems. These problems consider kinetic models of the bacterium E. coli, bakerés yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing a very significant reduction of computation times with respect to previous methods (from days to minutes, in several cases) even when only a small number of processors is used. Therefore, we believe tha (...truncated)