Equilibrium model selection: dTTP induced R1 dimerization (pdf)

Article PDF cannot be displayed. You can download it here:

https://bmcsystbiol.biomedcentral.com/track/pdf/10.1186/1752-0509-2-15

Equilibrium model selection: dTTP induced R1 dimerization

BMC Systems Biology BioMed Central Methodology article Open Access Equilibrium model selection: dTTP induced R1 dimerization Tomas Radivoyevitch Address: Department of Epidemiology and Biostatistics, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH 44106, USA Email: Tomas Radivoyevitch - Published: 4 February 2008 BMC Systems Biology 2008, 2:15 doi:10.1186/1752-0509-2-15 Received: 27 July 2007 Accepted: 4 February 2008 This article is available from: http://www.biomedcentral.com/1752-0509/2/15 © 2008 Radivoyevitch; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract Background: Biochemical equilibria are usually modeled iteratively: given one or a few fitted models, if there is a lack of fit or over fitting, a new model with additional or fewer parameters is then fitted, and the process is repeated. The problem with this approach is that different analysts can propose and select different models and thus extract different binding parameter estimates from the same data. An alternative is to first generate a comprehensive standardized list of plausible models, and to then fit them exhaustively, or semi-exhaustively. Results: A framework is presented in which equilibriums are modeled as pairs (g, h) where g = 0 maps total reactant concentrations (system inputs) into free reactant concentrations (system states) which h then maps into expected values of measurements (system outputs). By letting dissociation constants Kd be either freely estimated, infinity, zero, or equal to other Kd, and by letting undamaged protein fractions be either freely estimated or 1, many g models are formed. A standard space of g models for ligand-induced protein dimerization equilibria is given. Coupled to an h model, the resulting (g, h) were fitted to dTTP induced R1 dimerization data (R1 is the large subunit of ribonucleotide reductase). Models with the fewest parameters were fitted first. Thereafter, upon fitting a batch, the next batch of models (with one more parameter) was fitted only if the current batch yielded a model that was better (based on the Akaike Information Criterion) than the best model in the previous batch (with one less parameter). Within batches models were fitted in parallel. This semi-exhaustive approach yielded the same best models as an exhaustive model space fit, but in approximately one-fifth the time. Conclusion: Comprehensive model space based biochemical equilibrium model selection methods are realizable. Their significance to systems biology as mappings of data into mathematical models warrants their development. Background Ribonucleotide reductase (RNR) has a small subunit R2 that exists almost exclusively as a dimer, and a large subunit R1 that dimerizes when dTTP, dGTP, dATP, or ATP binds to its specificity site, and hexamerizes when dATP or ATP binds to its activity site [1-6]. Thus, R1 is the backbone of a biochemical equilibrium network that contains a large number of R1 complexes. This network has more dissociation constants (Kd) than can be estimated from currently available data, so assumptions must be made to reduce the number of independent Kd. These assumptions come in two forms: those that state that for the data at hand, a Kd is too large or small to be distinguished from infinity or zero, respectively, and those that state that the Page 1 of 13 (page number not for citation purposes) BMC Systems Biology 2008, 2:15 http://www.biomedcentral.com/1752-0509/2/15 data are too weak to rule out a null hypothesis of the form Kd = K ′d . Model parameters such as the fraction of R1 capable of forming dimers and hexamers, and the enzymatic activities of these R1 states, also come with plausible null hypotheses. In general, different null hypotheses define different models that yield different estimates of the freely estimated parameters. Unfortunately, as modelers traverse a path of reasonable hypotheses until they arrive at a model that provides both a good fit and Kd confidence interval limits that are not too wide, they often stop at different places, and thus report different Kd values. Such Kd estimate extraction differences could be reduced, if a systematic reproducible approach to biochemical equilibria model building was established. Progress toward this goal is described in this paper. where all of the h specific parameters (e.g. kcat's and protein masses) are contained in the vector L and, if the yn are vectors of measurements, the en are vectors of zero mean noise, potentially correlated within steady states, but uncorrelated between steady states; only scalar yn are considered hereafter. The model parameters K, p and L are not indexed by n because they are fitted jointly to the entire dataset, i.e. one set of estimates of these parameters describes all N steady states simultaneously as one (g, h) model of one underlying biochemical equilibrium network. System models The I equations of a system model g = 0 are g 1(Fn , Tn , K , p) = pTn1 − Fn1 − Results Model Consider a dataset comprised of N steady state non-covalent binding equilibriums indexed by n in which J different complexes can potentially form from a protein R of known total concentration Tn1 through interactions with itself and I - 1 other reactants (e.g. substrate, effectors and other proteins) of known total concentrations Tni (1 <i ≤ I). Suppose Wij copies of the ith reactant exist in the jth complex and that a particular R molecule is either undamaged with probability p, and thus capable of forming each of the plausible complexes, or damaged with probability 1 - p, and thus incapable of forming any complexes. Define Tn = (Tn1, Tn2, ...TnI), Fn = (Fn1, Fn2, ...FnI) as the corresponding free reactant concentrations, K = (K1, K2, ...KJ) as the dissociation constants (of complexes to free reactants), yn as the measurement(s) made at the nth steady state, and Zn = (Zn1, Zn2, ...ZnJ) as the concentrations of complexes predicted by W, K and Fn to be Wi′j ∏iI′ =1 Fni′ Z nj = . Kj (1) The relationship between the system inputs (Tn), states (Fn) and outputs (yn) is then modeled by I total concentration constraints g(Fn, Tn, K, p) = 0 that must be solved for the I free reactant concentrations Fn at each n (1 <n ≤ N) given the inputs Tn, and an output measurement model h that connects Fn to expected values of the outputs E(yn) yn = h(Fn, K, p, L) + εn Wi′j ∏iI′ =1 Fni′ J ∑W 1j Kj j =1 =0 g i (Fn , Tn , K ) = Tni − Fni J − ∑W ij (2) Wi′j ∏iI′ =1 Fni′ Kj j =1 = 0 (1 < i ≤ I) where pTn1 is the total concentration of undamaged R and Fn1 is the concentration of free R that is undamaged and thus capable of forming complexes. If all biologically plausible candidate complexes are present in these (...truncated)