Predicting protein ligand binding motions with the conformation explorer
BMC Bioinformatics
Samuel C Flores 0
Mark B Gerstein 1 2
0 Department of Cell and Molecular Biology, Uppsala University , BMC Box 596, Uppsala, 75124 , Sweden
1 Department of Computer Science, Yale University , PO Box 208114 MBB, New Haven, CT, 06520 , USA
2 Department of Molecular Biophysics and Biochemistry, Yale University , PO Box 208114 MBB, New Haven, CT, 06520 , USA
Background: Knowledge of the structure of proteins bound to known or potential ligands is crucial for biological understanding and drug design. Often the 3D structure of the protein is available in some conformation, but binding the ligand of interest may involve a large scale conformational change which is difficult to predict with existing methods.
Results: We describe how to generate ligand binding conformations of proteins that move by hinge bending, the largest class of motions. First, we predict the location of the hinge between domains. Second, we apply an Euler rotation to one of the domains about the hinge point. Third, we compute a short-time dynamical trajectory using Molecular Dynamics to equilibrate the protein and ligand and correct unnatural atomic positions. Fourth, we score the generated structures using a novel fitness function which favors closed or holo structures. By iterating the second through fourth steps we systematically minimize the fitness function, thus predicting the conformational change required for small ligand binding for five well studied proteins.
Conclusions: We demonstrate that the method in most cases successfully predicts the holo conformation given only an apo structure.
Background
Conformational changes in proteins can take place in a
wide variety of ways, not all of which have been formally
classified. One important class of motions is shear, in
which stacked side chains of the protein can slide
without losing contact. In this work we focus on the largest
class, domain hinge bending, in which one structural
domain of the protein moves relative to another domain
about a hinge which connects the two [1,2]. Such
motions typically involve the slowest degrees of freedom
of the protein and so are difficult to predict with existing
methods.
Predicting the ligand binding motions of the
protein receptor has considerable potential application in
protein-protein and protein-ligand docking. Many
methods can predict the side chain rearrangements required
for docking [3,4], but these assume that the large scale
conformation is already nearly correct. Thus there is a
need for a method that places the receptor in the
correct large scale conformation, which can then serve as a
productive starting point for such methods [5].
Much work has been done in this area. Molecular
Dynamics (MD) [6-9] explicitly computes the dynamical
trajectory of molecules modeled as point masses
connected by linear and nonlinear springs and can be used
to predict conformational change, but usually only
small- or moderate-scale domain motions can be
reproduced [10], with many biologically relevant motions
remaining out of reach [11]. Accordingly, several
methods account for the fast fluctuations of
proteins in drug docking by first computing the protein
trajectory using MD [4,12,13]. One limitation of such
techniques is that they may not escape the vicinity of an
initial conformation, even in a time span experimentally
known to be sufficient for conformational change [14].
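As a minimal illustration of the point-mass-and-spring picture described above, the following sketch integrates two masses joined by a single harmonic spring using the velocity Verlet scheme. The function names, parameter values, and reduced units are illustrative assumptions and are not taken from any of the MD packages cited here; real force fields add angle, torsion, and non-bonded terms.

```python
import numpy as np

def spring_forces(pos, k, r0):
    """Forces on two point masses joined by one harmonic spring (toy model)."""
    d = pos[1] - pos[0]
    r = np.linalg.norm(d)
    f_on_0 = k * (r - r0) * d / r  # pulls mass 0 toward mass 1 when stretched
    return np.array([f_on_0, -f_on_0])

def total_energy(pos, vel, mass, k, r0):
    """Kinetic plus harmonic potential energy of the two-mass system."""
    r = np.linalg.norm(pos[1] - pos[0])
    kinetic = 0.5 * np.sum(mass[:, None] * vel**2)
    potential = 0.5 * k * (r - r0) ** 2
    return kinetic + potential

def velocity_verlet(pos, vel, mass, k, r0, dt, n_steps):
    """Propagate positions and velocities for n_steps of size dt."""
    pos, vel = pos.copy(), vel.copy()  # leave the caller's arrays untouched
    f = spring_forces(pos, k, r0)
    for _ in range(n_steps):
        vel += 0.5 * dt * f / mass[:, None]  # first half kick
        pos += dt * vel                      # drift
        f = spring_forces(pos, k, r0)        # forces at the new positions
        vel += 0.5 * dt * f / mass[:, None]  # second half kick
    return pos, vel

# Assumed toy values: unit masses, spring stretched by 0.5 from its rest length.
mass = np.array([1.0, 1.0])
pos0 = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
vel0 = np.zeros((2, 3))
pos1, vel1 = velocity_verlet(pos0, vel0, mass, k=1.0, r0=1.0, dt=0.01, n_steps=1000)
e0 = total_energy(pos0, vel0, mass, 1.0, 1.0)
e1 = total_energy(pos1, vel1, mass, 1.0, 1.0)
```

Comparing `e0` and `e1` gives a quick check that the time step is small enough for the integrator to conserve energy approximately. The step must resolve the fastest vibration in the system, which is one reason slow, large scale motions are expensive to reach by MD alone.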
Althaus et al. created a combinatorial tree of side-chain
rotamers, which they explored using a branch-and-cut
algorithm [15] without varying the backbone
conformation. Sandak et al. created a flexible-receptor docking
code which articulates the protein at a hinge point, but
leaves the two resulting domains rigid [16]. This method
suffered from the opposite problem: it could generate
large scale protein motions, but had no way of dealing
with even small side chain rearrangements, a weakness
leading to failure [15]. The methods described above
treat either side-chain flexibility or large
scale conformational changes well, but not both
simultaneously. Conformation Explorer uses Sandak et al.'s idea
of moving domains about a hinge point to generate
large scale conformational change, but also includes
equilibration steps which permit relaxation and
adjustment of all atoms.
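The idea of moving a domain about a hinge point can be sketched as a rigid-body rotation applied to the coordinates of one domain, with the hinge held fixed. The Euler angle convention used here (rotations about x, then y, then z), the function names, and the example coordinates are assumptions made for illustration; they are not the actual Conformation Explorer implementation.

```python
import numpy as np

def euler_rotation_matrix(alpha, beta, gamma):
    """Return R = Rz(gamma) @ Ry(beta) @ Rx(alpha); angles in radians.

    The x-then-y-then-z convention is an assumption made for this sketch.
    """
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return rz @ ry @ rx

def rotate_domain(coords, hinge, alpha, beta, gamma):
    """Rigidly rotate an (N, 3) array of atom positions about the hinge point."""
    coords = np.asarray(coords, dtype=float)
    hinge = np.asarray(hinge, dtype=float)
    rot = euler_rotation_matrix(alpha, beta, gamma)
    # Shift the hinge to the origin, rotate, then shift back.
    return (coords - hinge) @ rot.T + hinge

# Hypothetical two-atom "domain" rotated by 90 degrees about z through the hinge.
domain = np.array([[2.0, 1.0, 0.0], [3.0, 1.0, 0.0]])
hinge = np.array([1.0, 1.0, 0.0])
moved = rotate_domain(domain, hinge, 0.0, 0.0, np.pi / 2)
```

Because the transformation is rigid, all distances within the moved domain are preserved. Any clashes or strained geometry it creates at the hinge and the domain interface are what the subsequent equilibration steps are meant to relax.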
Normal modes have also been used by many authors
to predict the conformational changes of proteins [17].
Comparison of the atomic coordinates of homologous
pairs of proteins shows that the lowest order modes are
most involved in conformational change [18,19], but
also that multiple modes are needed to accurately
represent the motion [20]. It is possible to determine the
correct combination of normal modes that will reproduce a
desired motion, but this requires knowledge of at least a
few interatomic distance constraints for the final
structure [21].
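One common way to quantify how many modes a motion needs is to project the observed displacement onto each mode and accumulate the squared overlaps. The sketch below does this for a toy 3 x 3 Hessian; the matrix, the displacement vector, and the function name are illustrative assumptions. A real analysis would use the mass-weighted 3N x 3N Hessian of the protein and discard the six rigid-body modes.

```python
import numpy as np

def mode_overlaps(hessian, displacement):
    """Project a displacement onto the normal modes of a symmetric Hessian.

    Returns the eigenvalues in ascending order and the overlap (cosine)
    between the normalized displacement and each mode.
    """
    evals, modes = np.linalg.eigh(hessian)  # columns of `modes` are orthonormal
    unit = displacement / np.linalg.norm(displacement)
    return evals, modes.T @ unit

# Toy example: the displacement is spread over the two lowest modes.
hessian = np.diag([1.0, 2.0, 3.0])
displacement = np.array([3.0, 4.0, 0.0])
evals, overlaps = mode_overlaps(hessian, displacement)
cumulative = np.cumsum(overlaps**2)  # fraction of the motion captured so far
```

In this toy case the lowest mode alone captures only part of the displacement, and the cumulative squared overlap reaches one only after the second mode is included, mirroring the observation that multiple modes are needed.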
In a different approach, a docked protein-ligand
complex was displaced along the lowest-frequency normal
mode directions to minimize non-bonded energy terms
in an MD force field [22-24]. However, a normal mode
expansion assumes a quadratic potential and so is
accurate only for small fluctuations about an equilibrium
structure; therefore the method cannot be used to
predict larger scale conformational changes such as those we
treat in this work. The method of Lindahl et al. gains
improvements of 0.3 to 3.2 Å for several proteins [22];
our method recapitulates much larger conformational
changes, as we will show.
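The quadratic assumption behind a normal mode expansion can be written out explicitly. The notation below is ours, introduced only for illustration: $x_0$ is the equilibrium structure, $\Delta x = x - x_0$ the displacement from it, and $H$ the Hessian of the potential $V$ evaluated at $x_0$.

```latex
V(x) \approx V(x_0) + \tfrac{1}{2}\, \Delta x^{\mathsf{T}} H \, \Delta x ,
\qquad
H_{ij} = \left. \frac{\partial^2 V}{\partial x_i \, \partial x_j} \right|_{x = x_0}
```

The linear term is absent because the gradient of $V$ vanishes at equilibrium. The cubic and higher terms that are dropped grow with $|\Delta x|$, which is why the expansion is reliable only for small fluctuations about $x_0$.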
Maiorov and Abagyan [25] rigidified all protein bonds
except those in the interdomain linker and interface
using Internal Coordinate Modeling, and then used the
Biased Probability Monte Carlo protocol to generate
potential alternate conformations of the protein. The
method succeeded in generating a large number of
alternate conformations, and some of these were somewhat
similar to alternate conformations known
crystallographically. However, without referring to the known
alternate conformations, it was impossible to determine
which of the many predicted structures was
thermodynamically plausible. Further, many energy evaluations
and minimizations were expended on
generated conformers which were later discarded. Lastly, it
was not easy to know how long a thorough exploration
of conformation space would take, and there was no clear way to
restrict the search to a given region of interest. Our
method is similar in several ways to that of Maiorov and Abagyan, but
also addresses these limitations.
In more recent work, de Groot et al. [26] showed they
could find the holo conformations of several
ligand-binding proteins. The method relies on tCONCOORD [5],
which determines flexible regions by analyzing
hydrogen bonding networks. Once these are known, an
ensemble of plausible structures is generated. An
iterative process involving docking, MD refinement, and
filtering by radius of gyration then generates holo
structures. However, the radius of gyration must be
pro (...truncated)