Equivalent binding sites reveal convergently evolved interaction motifs
Andreas Henschel, Wan Kyu Kim, Michael Schroeder
Bioinformatics Group, Biotechnological Centre, TU Dresden, Germany
Motivation: Much research has been devoted to the characterization of interaction interfaces found in complexes with known structure. In this context, the interactions of non-homologous domains at equivalent binding sites are of particular interest, as they can reveal convergently evolved interface motifs. Such motifs are an important source of information for formulating rules of interaction specificity and for designing ligands based on the common features shared among diverse partners.
Results: We develop a novel method to identify non-homologous structural domains that bind at equivalent sites when interacting with a common partner. We systematically apply this method to all pairs of interactions with known structure and derive a comprehensive database of these interactions. Of all non-homologous domains that bind a common interaction partner, 4.2% use the same interface of that partner (excluding immunoglobulins and proteases). This rises to 16% if immunoglobulins and proteases are included. We demonstrate two applications of our database: first, the systematic screening for viral protein interfaces, which can mimic native interfaces and thus interfere with them; and second, the study of structural motifs in enzymes and their inhibitors. We highlight several cases of viral protein mimicry. The viral M3 protein interferes with a chemokine dimer interface: the virus has evolved the motif SVSPLP, which mimics the native SSDTTP motif. A second example is the regulatory factor Nef in HIV, which can mimic a kinase when interacting with SH3; among other features, the virus has evolved the kinase's PxxP motif. Further, we elucidate motif resemblances in Baculovirus p35 and HIV capsid proteins. Finally, we examine chymotrypsin with respect to its structural similarity to subtilisin and with respect to the similar recognition sites of its inhibitors.
Supplementary information: A database is online at scoppi.biotec.tu-dresden.de/abac/
© The Author 2005. Published by Oxford University Press.
All rights reserved.
INTRODUCTION
Protein interactions underlie all cellular processes and are important
to reveal function. The interactions from known three-dimensional
structures have been of particular interest in that they allow a
number of detailed analyses of interfaces in terms of physico-chemical
properties, shape and geometry [Jones and Thornton (1996); Conte
et al. (1999); Bashton and Chothia (2002); Chakrabarti and Janin
(2002); Nussinov et al. (1997); Ofran and Rost (2003)]. The rapid
growth of multichain and multidomain structures in the Protein
Data Bank (PDB) [Berman et al. (2000)] has enabled systematic analyses
of domain-domain interactions and interfaces [Park et al. (2001);
Bolser et al. (2003); Apic et al. (2001); Kim et al. (2004)], and
several databases dedicated to the collection of structural
domain-domain interactions are available [Finn et al. (2005); Stein et al.
(2005); Davis and Sali (2005)]. Much work has concentrated on
understanding under what circumstances homologous interactions
are conserved [Pazos and Valencia (2001); Aloy et al. (2003); Tsai
et al. (1996); Torrance et al. (2005)]. Aloy et al. (2003) performed an
extensive analysis of the relationship between sequence similarity
and binding orientation and showed that the geometry of interaction
tends to be conserved between highly similar pairs.
An alternative approach is to investigate how non-homologous
proteins bind at equivalent surfaces of homologous proteins [Tsai
et al. (1996)]. Such interactions do not necessarily compete in vivo,
but they reveal equivalent interaction sites. In some cases, the
interactions may be truly competitive and regulated temporally
by chemical modification or regulatory factors and spatially by
compartmentalization. Independent of competitive or
noncompetitive binding, the identification of equivalent interfaces is
a pointer to convergently evolved motifs. The motifs help to reveal
key features which are necessary for the interaction.
A well-known example of a convergently evolved motif is the
catalytic triad (Ser, His, Asp) found in both chymotrypsin and
subtilisin (e.g. Fig. 4a). The local features of the enzymes' catalytic
sites are conserved in other enzymes [Torrance et al. (2005)].
Chymotrypsin and subtilisin do not share any sequence or structure
similarity. Indeed, they belong to different classes, with
chymotrypsin consisting only of beta-sheets and subtilisin of beta-alpha-beta
units. Despite this different architecture, various inhibitors
inhibit both enzymes and use the same interface to do
so. Thus, despite the non-homology of the enzymes, equivalent binding
sites are used.
Consider Figure 1a. To elucidate such interfaces with
convergently evolved motifs, we screen the known structures in the PDB for
pairs of interactions A-B and A′-C, where B and C are from different
superfamilies and A and A′ are from the same family. If B and C bind to
equivalent sites of A and A′, respectively, we label B and C as
interfaces with convergently evolved motifs. To define the
equivalence of interfaces, we use sequence and structure alignments of
the shared domains A and A′. If there is sufficient overlap in the
sequence alignment of the interface residues of A and A′, and if the angle
between the interfaces of B and C after superimposition of A and A′ is
sufficiently small, B and C bind at equivalent sites.
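The two criteria just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: interface residues are assumed to be given as sets of alignment column indices of the shared domains, each partner interface is assumed to be summarized by a direction vector (for example, from the centre of the shared domain to the centre of the partner's interface atoms) taken after superimposition, and the function names and the thresholds `min_overlap` and `max_angle` are hypothetical placeholders, since the text above only requires "sufficient" overlap and a "sufficiently small" angle.

```python
import math


def interface_overlap(cols_a, cols_a_prime):
    """Fraction of the smaller interface lying on shared alignment columns.

    Assumption: interface residues of the two shared domains are already
    mapped onto column indices of their pairwise sequence alignment.
    """
    a, b = set(cols_a), set(cols_a_prime)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def angle_deg(u, v):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def binds_equivalent_site(cols_a, cols_a_prime, vec_b, vec_c,
                          min_overlap=0.5, max_angle=45.0):
    """Combine both criteria; the two thresholds are illustrative only."""
    return (interface_overlap(cols_a, cols_a_prime) >= min_overlap
            and angle_deg(vec_b, vec_c) <= max_angle)
```

For instance, `binds_equivalent_site({1, 2, 3, 4}, {2, 3, 4}, (1.0, 0.0, 0.0), (1.0, 0.1, 0.0))` passes both tests, whereas two interfaces with disjoint alignment columns fail regardless of the angle.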
In our analysis we use Structural Classification of Proteins
(SCOP) domains [Murzin et al. (1995)]. We require A and A′ to
be of the same family, since interfaces are known to be more
conserved in both sequence and structure within a family [Valdar and
Thornton (2001)], but not across the families of a superfamily
[Rekha et al. (2005)]. For B and C we require different
superfamilies, which ensures that they are not evolutionarily related.
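The family and superfamily constraints can be illustrated with a small Python sketch that enumerates candidate interaction pairs. It assumes each interaction is given as a tuple of an identifier, the SCOP sccs string of the shared domain and the sccs string of its partner (sccs strings have the form class.fold.superfamily.family); the function names, the input layout and the identifiers in the usage note are hypothetical and are not taken from the authors' database.

```python
from collections import defaultdict
from itertools import combinations


def superfamily(sccs):
    """Return the class.fold.superfamily prefix of a SCOP sccs string."""
    return ".".join(sccs.split(".")[:3])


def candidate_pairs(interactions):
    """List pairs of interactions that share a family but differ in partner.

    interactions: iterable of (interaction_id, shared_sccs, partner_sccs).
    Two interactions qualify when their shared domains have the same family
    (identical sccs) and their partners come from different superfamilies.
    """
    by_family = defaultdict(list)
    for interaction_id, shared_sccs, partner_sccs in interactions:
        by_family[shared_sccs].append((interaction_id, partner_sccs))

    pairs = []
    for members in by_family.values():
        for (id_1, partner_1), (id_2, partner_2) in combinations(members, 2):
            if superfamily(partner_1) != superfamily(partner_2):
                pairs.append((id_1, id_2))
    return pairs
```

With the made-up input `[("int1", "a.1.1.1", "b.2.1.1"), ("int2", "a.1.1.1", "c.3.1.1"), ("int3", "a.1.1.1", "b.2.1.5"), ("int4", "a.1.1.2", "d.4.1.1")]`, only the pairs (int1, int2) and (int2, int3) remain: int1 and int3 have partners from the same superfamily, and int4 has a shared domain from a different family. Only the surviving pairs would then need to be tested for equivalent binding sites.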
The method sketched above identifies all pairs of interfaces with
convergently evolved motifs. One application of such a resource is
the study of viral proteins, which mimic the interfaces of native
proteins and can thus interfere accordingly. We discuss two such
cases: the M3 protein, which mimics the chemokine homodimer
interface, and the regulatory factor Nef found in HIV, which mimics
a kinase interface when interacting with SH3.
To identify non-homologous domains binding at equivalent sites we proceed
as illustrated in Figure 1a: we consider all pairs of interactions A-B and
A′-C, where A and A′ belong to the same family and B and C to different
superfamilies. If B and C bind at equivalent interfaces, we screen them for
shared motifs. To define equivalent binding sites, we use a two-stage
procedure: first, we scan for a significant interface residue overlap on the
aligned sequences; second, the angle and the spatial overlap between the two
interfaces are used to further refine the sea (...truncated)