State and parameter estimation of the heat shock response system using Kalman and particle filters

Xin Liu and Mahesan Niranjan
School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK
Associate Editor: Trey Ideker

Motivation: Traditional models of systems biology describe dynamic biological phenomena as solutions to ordinary differential equations, which, when parameters in them are set to correct values, faithfully mimic observations. Often parameter values are tweaked by hand until desired results are achieved, or computed from biochemical experiments carried out in vitro. Of interest in this article is the use of probabilistic modelling tools with which parameters and unobserved variables, modelled as hidden states, can be estimated from limited noisy observations of parts of a dynamical system.

Results: Here we focus on sequential filtering methods and take a detailed look at the capabilities of three members of this family: (i) the extended Kalman filter (EKF), (ii) the unscented Kalman filter (UKF) and (iii) the particle filter, in estimating parameters and unobserved states of the cellular response to sudden temperature elevation in the bacterium Escherichia coli. While previous literature has studied this system with the EKF, we show that parameter estimation is only possible with this method when the initial guesses are sufficiently close to the true values. The same turns out to be true for the UKF. In this thorough empirical exploration, we show that the non-parametric method of particle filtering is able to reliably estimate parameters and states, converging from initial distributions relatively far away from the underlying true values.

Availability and implementation: Software implementation of the three filters on this problem can be freely downloaded from http://users.ecs.soton.ac.uk/mn/HeatShock

Contact:

Supplementary information: Supplementary data are available at Bioinformatics online.

© The Author 2012. Published by Oxford University Press. All rights reserved.
For Permissions, please email:

1 INTRODUCTION

In systems biology, sets of ordinary differential equations (ODEs) are often used to characterize biochemical reactions. The differential equations capture our knowledge of the underlying interactions in a quantitative way, and solutions to such systems of equations help explain behaviour at an overall systems level. Much of the work in the area deals with deterministic differential equations, often non-linear (e.g. Hill kinetics). Unknown parameter values in their specification, such as reaction rates, are obtained from biochemical experiments conducted in vitro, or are set to specific values by elaborate hand-tuning, so that a set of observations from the system under study is best explained. Sometimes, such parameters may not have direct biological interpretations and are used as approximations (Lillacci and Khammash, 2010). The yeast cell cycle model of Chen et al. (2000) is a good example of this.

The parameter estimation problem has often been posed as a search and optimization problem in the literature. For example, Mendes and Kell (1998) have explored a range of optimization methods, including steepest descent gradient search techniques and methods suitable for global optimization, such as simulated annealing and genetic programming, in the parameter estimation of metabolic systems. An alternative method, which uses spline approximation of the solution together with linear and non-linear programming, is described in a recent paper (Zhan and Yeung, 2011). These authors consider an enzyme kinetic system and a subset of a cell cycle model. Ashyraliyev et al. (2008) show how parameters of a developmental gap gene circuit model may be estimated by extensive search methods. Such approaches to model-based explanation of biological phenomena often do not take into account system and measurement noise in the modelling process. They also do not explicitly seek to infer parameter values or unobserved states from noisy measurements.
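To make the search-and-optimization view concrete, the following is a minimal sketch of fitting a single rate constant by exhaustive grid search. It is not taken from any of the cited works: the toy ODE dx/dt = a - k*x, the noise level, the grid and the names `simulate` and `fit_rate` are all assumptions chosen for illustration.

```python
import numpy as np

def simulate(k, x0=5.0, a=1.0, dt=0.01, steps=500):
    """Euler integration of the toy ODE dx/dt = a - k*x (an assumed model)."""
    x = np.empty(steps + 1)
    x[0] = x0
    for i in range(steps):
        x[i + 1] = x[i] + dt * (a - k * x[i])
    return x

def fit_rate(observed, candidates):
    """Return the candidate k with the smallest sum of squared errors."""
    errors = [np.sum((simulate(k) - observed) ** 2) for k in candidates]
    return float(candidates[int(np.argmin(errors))])

rng = np.random.default_rng(0)
true_k = 0.8                                    # assumed ground truth
observed = simulate(true_k) + rng.normal(0.0, 0.1, size=501)  # noisy data
candidates = np.linspace(0.1, 2.0, 191)         # grid with spacing 0.01
k_hat = fit_rate(observed, candidates)
print(f"true k = {true_k}, estimated k = {k_hat:.2f}")
```

The search returns a single best-fitting value and says nothing about how uncertain that value is, nor does it model the noise explicitly, which is the limitation the paragraph above points to.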
Techniques using Bayesian inference are alternatives to optimization-based approaches, with the added motivation that they can capture uncertainties in parameter and state estimates. Examples of work along these lines include Dewar et al. (2010) and Golightly and Wilkinson (2005), who consider stochastic systems. Barenco et al. (2006), Jayawardhana et al. (2008), Lawrence et al. (2006) and Vyshemirsky and Girolami (2008) address ODE-based systems for which parameter estimation is performed by Markov Chain Monte Carlo (MCMC) methods in a probabilistic inference framework.

Some authors have recently studied the role of sequential estimation methods for state and parameter estimation in systems biology models formulated as state-space models. Lillacci and Khammash (2010), Nakamura et al. (2009), Quach et al. (2007), Sun et al. (2008) and Yang et al. (2007) fall into this category. The first three of these use parametric methods based on Kalman filtering, whereas the latter two use the non-parametric approach of particle filtering.

The power of the tools we explore in this article has been demonstrated in other areas of application over many decades. These include the adaptive estimation of neural networks (Kadirkamanathan and Niranjan, 1993), target tracking from bearing-only measurements (Bar-Shalom et al., 2001), modelling futures contracts in computational finance (Niranjan, 1997) and finding global minima of artificial neural networks (de Freitas et al., 2000). In the context of biology, Dewar et al. (2010), Quach et al. (2007) and Wilkinson (2009) are examples of using probabilistic dynamical systems models to characterize biological systems and to estimate parameters and draw inferences from them. While most literature on the subject focuses on batch-based models, in which parameter estimation and inference are performed on all the available data, our attention is on sequential, or online, methods.
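As an illustration of what sequential estimation looks like in practice, here is a minimal sketch of a bootstrap particle filter that tracks a state and an unknown rate together by appending the rate to the state vector. It is not the authors' implementation and does not use the heat shock model: the toy dynamics dx/dt = a - theta*x, the priors, the noise levels, the particle count and the small random jitter added to the parameter (used here only to keep the particles diverse) are all assumptions.

```python
import numpy as np

def step(x, theta, dt=0.1, a=1.0):
    """One Euler step of the toy model dx/dt = a - theta*x (an assumed model)."""
    return x + dt * (a - theta * x)

def particle_filter(ys, n=2000, obs_sd=0.1, proc_sd=0.02, jitter_sd=0.01, seed=1):
    """Bootstrap particle filter over the augmented state (x, theta)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(5.0, 0.2, size=n)        # assumed prior on the initial state
    theta = rng.uniform(0.1, 2.0, size=n)   # broad assumed prior on the rate
    estimates = []
    for y in ys:
        # Predict: propagate the state; the parameter only gets a small jitter.
        x = step(x, theta) + rng.normal(0.0, proc_sd, size=n)
        theta = theta + rng.normal(0.0, jitter_sd, size=n)
        # Update: weight each particle by the Gaussian likelihood of y.
        logw = -0.5 * ((y - x) / obs_sd) ** 2
        w = np.exp(logw - logw.max())       # subtract the max to avoid underflow
        w /= w.sum()
        # Resample (multinomial) so particles concentrate on likely regions.
        idx = rng.choice(n, size=n, p=w)
        x, theta = x[idx], theta[idx]
        estimates.append((x.mean(), theta.mean()))
    return np.array(estimates)

# Synthetic observations from the same toy model (assumed true rate 0.8).
rng = np.random.default_rng(0)
true_theta, x_true, ys = 0.8, 5.0, []
for _ in range(200):
    x_true = step(x_true, true_theta) + rng.normal(0.0, 0.02)
    ys.append(x_true + rng.normal(0.0, 0.1))

estimates = particle_filter(np.array(ys))
x_hat, theta_hat = estimates[-1]
print(f"final state estimate {x_hat:.2f}, final rate estimate {theta_hat:.2f}")
```

Each observation is processed once, as it arrives, and the running estimates are refined without revisiting earlier data; this is the defining feature of sequential methods as opposed to batch ones.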
The Kalman filter and its variants, and the particle filter (PF), belong to this family. The formulation we pursue, in which states and parameters are estimated jointly through an extended state representation, is due to Sitz et al. (2002), who applied the unscented Kalman filter (UKF) to analyze two dynamical systems: the Lotka-Volterra system (Hofbauer and Sigmund, 1988) and the Lorenz system (Colin, 1982). Unknown parameters are treated as states with their time variation set to zero. To motivate non-parametric particle filtering, and arguing that computation is now cheap enough to allow it, Nakamura et al. (2009) used an exceedingly large number of Monte Carlo samples to cover the entire space of possible states and parameters of a complex system, the mammalian circadian genetic control model (Matsuno et al., 2005), which has 45 unknown parameters. While this is an ambitious attempt, motivated under the term peta-computing, we do not believe the authors make a persuasive case for flooding the space with particles. Instead, as we find in this work, a modest number of particles, (...truncated)