GRiP: a computational tool to simulate transcription factor binding in prokaryotes
Copyedited by: TRJ
MANUSCRIPT CATEGORY: APPLICATIONS NOTE
BIOINFORMATICS APPLICATIONS NOTE
Systems biology
Vol. 28 no. 9 2012, pages 1287–1289
doi:10.1093/bioinformatics/bts132
Advance Access publication March 16, 2012
Nicolae Radu Zabet1,2,∗ and Boris Adryan1,2
1 Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR and
2 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
Associate Editor: Trey Ideker
Received on November 14, 2011; revised on February 2, 2012;
accepted on March 12, 2012
1 INTRODUCTION
It is now well established that transcription factors (TFs) find their
target sites through facilitated diffusion, a combination of a 1D
random walk on the DNA and 3D diffusion in the cytoplasm (Berg
et al., 1981; Elf et al., 2007). Once bound to the DNA, TFs perform
three main types of movement: (i) sliding, (ii) hopping and (iii)
jumping (Mirny et al., 2009). The first two mechanisms, sliding and
hopping, assume that the TF makes small movements on the DNA
without being released into the cytoplasm, whereas the third assumes
3D diffusion in the cytoplasm before rebinding.
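As a rough illustration of the three movement types, the following toy sketch moves a single bound molecule along a 1D lattice. It is not the GRiP model: the class name, the per-step probabilities and the hop range are all hypothetical values chosen only to make the distinction between sliding, hopping and jumping concrete.

```java
import java.util.Random;

class FacilitatedDiffusionToy {
    // Hypothetical per-step probabilities, for illustration only.
    static final double P_JUMP = 0.01; // release into the cytoplasm, rebind anywhere
    static final double P_HOP = 0.10;  // brief dissociation, rebind nearby
    static final int HOP_RANGE = 5;    // assumed maximum hop distance (bp)

    /** Returns {slides, hops, jumps, finalPosition} for one bound molecule. */
    static int[] simulate(int dnaLength, int steps, long seed) {
        Random rng = new Random(seed);
        int position = rng.nextInt(dnaLength);
        int slides = 0, hops = 0, jumps = 0;
        for (int i = 0; i < steps; i++) {
            double u = rng.nextDouble();
            if (u < P_JUMP) {
                // Jump: rebind at a uniformly random position on the DNA.
                position = rng.nextInt(dnaLength);
                jumps++;
            } else if (u < P_JUMP + P_HOP) {
                // Hop: rebind within HOP_RANGE bp (the shift may be zero).
                int shift = rng.nextInt(2 * HOP_RANGE + 1) - HOP_RANGE;
                position = clamp(position + shift, dnaLength);
                hops++;
            } else {
                // Slide: move one bp left or right without leaving the DNA.
                position = clamp(position + (rng.nextBoolean() ? 1 : -1), dnaLength);
                slides++;
            }
        }
        return new int[] { slides, hops, jumps, position };
    }

    /** Keeps the position inside the lattice [0, dnaLength - 1]. */
    static int clamp(int position, int dnaLength) {
        return Math.max(0, Math.min(dnaLength - 1, position));
    }

    public static void main(String[] args) {
        int[] r = simulate(1000, 100000, 42L);
        System.out.println("slides=" + r[0] + " hops=" + r[1] + " jumps=" + r[2]);
    }
}
```

A real model would replace the fixed probabilities with rates that depend on the local DNA sequence and on neighbouring molecules.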
With few exceptions, most theoretical efforts have been invested
in analytical solutions of the facilitated diffusion mechanism.
Considering real DNA sequences and dynamic crowding on the DNA
(mobile ‘roadblocks’), however, rules out analytical solutions.
Computational methods and, in particular, stochastic simulations
overcome these limitations and provide a more accurate mechanistic
representation of the underlying biological process. In particular,
this type of stochastic simulation can be used to answer questions
about how TFs perform the search process. For example, one could
investigate whether molecules prefer to hop or to slide, and how much
each of these two alternative movements on the DNA contributes to the
overall 1D random walk in a crowded environment.
∗ To whom correspondence should be addressed.
Building on the comprehensive model constructed in (N.R.Zabet
and B.Adryan, submitted for publication), we developed GRiP
(gene regulation in prokaryotes), a program that allows stochastic
simulation of the search process of TFs for their target sites on the
DNA.
The analyzed systems can be large. For example, Escherichia coli
K-12 has a 4.6 Mbp genome and there are ∼10^4 DNA-binding
proteins (agents). To produce results within a relatively short time,
previous software had either to rely on coarse-grained models
(Wunderlich and Mirny, 2008) or to consider small subsystems (Chu
et al., 2009). GRiP represents a new and efficient implementation
of the TF search process, which considers a highly detailed model
of 1D diffusion and, at the same time, runs at least ≈4
times faster than previous software (Barnes and Chu, 2010; Chu
et al., 2009). Consequently, by allowing genome-wide stochastic
simulations of a highly detailed model of facilitated diffusion, GRiP
can highlight possible biases in results obtained where the level of
detail was insufficient (coarse-grained models) or the size of the
analyzed system was too small.
A few studies, such as Das and Kolomeisky (2010), addressed the
problem of facilitated diffusion through simulations focusing on the
3D diffusion rather than the 1D case. Simulating the 3D diffusion is
time and resource consuming, especially at the genome level.
van Zon et al. (2006) showed that a model based on the
zero-dimensional Chemical Master Equation can reliably represent the
rate at which TFs associate non-specifically with the DNA, as long as
the model takes into account that once a molecule unbinds from the
DNA, it has a high probability of fast rebinding in close proximity.
This suggests that there is no need to simulate the 3D diffusion
explicitly; instead, it can be replaced by a simple arrival rate,
provided that the model incorporates the fast rebinding probability
in the unbinding rate, a strategy which we also adopt.
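One way to picture this strategy is sketched below. It rests on an assumption that is not stated in the text: that fast rebinding with probability pRebind can be folded in by scaling the microscopic unbinding rate kOff by (1 - pRebind), so that only true releases into the cytoplasm are counted. Arrivals from the cytoplasm are then drawn as exponential waiting times at a fixed rate. The class, method and parameter names are hypothetical.

```java
import java.util.Random;

class ArrivalRateSketch {
    /**
     * Assumed form of the effective unbinding rate: unbinding events that are
     * followed by fast rebinding do not count as a release into the cytoplasm.
     */
    static double effectiveUnbindingRate(double kOff, double pRebind) {
        return kOff * (1.0 - pRebind);
    }

    /** Draws an exponentially distributed waiting time for the given rate. */
    static double sampleWaitingTime(double rate, Random rng) {
        // 1 - nextDouble() lies in (0, 1], so the logarithm is always finite.
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    public static void main(String[] args) {
        Random rng = new Random(7L);
        double release = effectiveUnbindingRate(10.0, 0.8);
        System.out.println("effective unbinding rate: " + release);
        System.out.println("time to next arrival: " + sampleWaitingTime(2.0, rng));
    }
}
```

With this replacement, the 3D excursion never has to be tracked: a released molecule simply waits for its next arrival event.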
ABSTRACT
Motivation: Transcription factors (TFs) are proteins that regulate
gene activity by binding to specific sites on the DNA. Understanding
the way these molecules locate their target site is of great importance
in understanding gene regulation. We developed a comprehensive
computational model of this process and estimated the model
parameters in (N.R.Zabet and B.Adryan, submitted for publication).
Results: GRiP (gene regulation in prokaryotes) is a highly versatile
implementation of this model and simulates the search process
in a computationally efficient way. This program aims to provide
researchers in the field with a flexible and highly customizable
simulation framework. Its features include representation of
DNA sequence, TFs and the interaction between TFs and the
DNA (facilitated diffusion mechanism), or between various TFs
(cooperative behaviour). The software will record both information
on the dynamics associated with the search process (locations
of molecules) and also steady-state results (affinity landscape,
occupancy-bias and collision hotspots).
Availability: http://logic.sysbiol.cam.ac.uk/grip
Contact:
Supplementary information: Supplementary data are available at
Bioinformatics online.
2 DESCRIPTION
We implemented the target-finding process as a hybrid model that
combines agent-based methods with event-driven stochastic simulation
algorithms (Gillespie, 1977). The software is implemented in Java
1.6, which ensures high portability.
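For readers unfamiliar with event-driven stochastic simulation, the sketch below shows one step of Gillespie's direct method applied to a set of agents, each with its own event rate. It illustrates the general technique only and is not taken from the GRiP source code; the class and method names are hypothetical.

```java
import java.util.Random;

class GillespieDirectSketch {
    /**
     * One step of the direct method.
     * Returns {waitingTime, indexOfChosenAgent}.
     */
    static double[] nextEvent(double[] rates, Random rng) {
        double total = 0.0;
        for (double r : rates) {
            total += r;
        }
        // The time to the next event is exponential with the total rate.
        double tau = -Math.log(1.0 - rng.nextDouble()) / total;
        // The agent that fires is chosen with probability rate / total.
        double threshold = rng.nextDouble() * total;
        int chosen = rates.length - 1; // fallback against rounding at the upper end
        double cumulative = 0.0;
        for (int i = 0; i < rates.length; i++) {
            cumulative += rates[i];
            if (threshold < cumulative) {
                chosen = i;
                break;
            }
        }
        return new double[] { tau, chosen };
    }

    public static void main(String[] args) {
        Random rng = new Random(1L);
        double[] rates = { 0.5, 2.0, 1.5 }; // hypothetical per-agent rates
        double[] event = nextEvent(rates, rng);
        System.out.println("after " + event[0] + " s, agent " + (int) event[1] + " moves");
    }
}
```

In an agent-based setting, the chosen agent would then update its own state (for example its position on the DNA) and its rate before the next step is drawn.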
© The Author(s) 2012. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
with the case where they are not in contact); for more details see
(N.R.Zabet and B.Adryan, submitted for publication).
The simulation speed is sensitive to the number of agents in the
system. This is mainly because the event queue grows with the
number of molecules in the system and, consequently, larger queues
take longer to maintain.
For 10^6 TF molecules and the genome of E.coli K-12 (4.6 Mbp), we
can simulate ∼4 × 10^5 events per second on a Mac Pro with 2 × 2.26 GHz
quad-core Intel Xeon processors and 32 GB memory running Mac OS X 10.6.8.
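The dependence on queue size can be seen in a minimal sketch of an event queue, assuming one pending event per agent held in a binary heap (java.util.PriorityQueue). This data structure is an assumption made for illustration; the text does not say how GRiP stores its events. Each processed event costs one removal and one insertion, both of which take time logarithmic in the number of queued agents.

```java
import java.util.PriorityQueue;
import java.util.Random;

class EventQueueSketch {
    /** A pending event: the time at which a given agent acts next. */
    static final class Event implements Comparable<Event> {
        final double time;
        final int agentId;

        Event(double time, int agentId) {
            this.time = time;
            this.agentId = agentId;
        }

        public int compareTo(Event other) {
            return Double.compare(time, other.time);
        }
    }

    static double exponential(double rate, Random rng) {
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    /** Processes the given number of events; returns the final simulated time. */
    static double run(int agents, int events, double ratePerAgent, long seed) {
        Random rng = new Random(seed);
        PriorityQueue<Event> queue = new PriorityQueue<Event>();
        for (int id = 0; id < agents; id++) {
            queue.add(new Event(exponential(ratePerAgent, rng), id));
        }
        double now = 0.0;
        for (int i = 0; i < events; i++) {
            Event next = queue.poll();   // earliest event, O(log n)
            now = next.time;
            // Reschedule the same agent, again O(log n).
            queue.add(new Event(now + exponential(ratePerAgent, rng), next.agentId));
        }
        return now;
    }

    public static void main(String[] args) {
        System.out.println("simulated time: " + run(10000, 100000, 1.0, 42L));
    }
}
```

Because the queue always holds one entry per agent, doubling the number of agents adds a constant amount of work to every event rather than doubling it.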
3 DISCUSSION
GRiP is a highly versatile program that comes with both a
command-line interface and a graphical user interface. Furthermore,
being written in Java, the software can be run on any machine where
the Java Runtime Environment 1.6 (or higher) is installed.
The program takes as input a parameters file, which can specify,
(...truncated)