Molecular Evolutionary Analysis of Nematode Zona Pellucida (ZP) Modules Reveals Disulfide-Bond Reshuffling and Standalone ZP-C Domains
GBE
Cameron J. Weadick1,*
1Department of Biosciences, University of Exeter, United Kingdom
Accepted: 13 May 2020
Abstract
Zona pellucida (ZP) modules mediate extracellular protein–protein interactions and contribute to important biological processes
including syngamy and cellular morphogenesis. Although some biomedically relevant ZP modules are well studied, little is known
about the protein family’s broad-scale diversity and evolution. The increasing availability of sequenced genomes from “nonmodel”
systems provides a valuable opportunity to address this issue and to use comparative approaches to gain new insights into ZP module
biology. Here, through phylogenetic and structural exploration of ZP module diversity across the nematode phylum, I report evidence
that speaks to two important aspects of ZP module biology. First, I show that ZP-C domains—which in some modules act as
regulators of ZP-N domain-mediated polymerization activity, and which have never before been found in isolation—can indeed
be found as standalone domains. These standalone ZP-C domain proteins originated in independent (paralogous) lineages prior to
the diversification of extant nematodes, after which they evolved under strong stabilizing selection, suggesting the presence of ZP-N
domain-independent functionality. Second, I provide a much-needed phylogenetic perspective on disulfide bond variability, uncovering evidence for both convergent evolution and disulfide-bond reshuffling. This result has implications for our evolutionary understanding and classification of ZP module structural diversity and highlights the usefulness of phylogenetics and diverse sampling
for protein structural biology. All told, these findings set the stage for broad-scale (cross-phyla) evolutionary analysis of ZP modules
and position Caenorhabditis elegans and other nematodes as important experimental systems for exploring the evolution of ZP
modules and their constituent domains.
Key words: gene family evolution, supradomain, domain architecture, cysteine connectivity, nematode cuticle, cuticlin.
Introduction
Secreted proteins help cells withstand, react to, and shape
external conditions (Agrawal et al. 2010; Naba et al. 2016;
Cuesta-Astroz et al. 2017). The extracellular environment can
be variable and stressful, and in order to properly function
under such challenging conditions, secreted proteins often
employ specialized domains that can be repurposed to different ends by being recombined into different protein architectures (Bork et al. 1996; Martin et al. 1998). Obtaining an
appreciation of the structural diversity of secreted proteins is
key to understanding the many biological processes that extend beyond the cellular membrane. In many cases, however,
insights into the biology of secreted protein families derive
from restricted and potentially nonrepresentative sets of
model proteins (e.g., those linked to particular biomedical
conditions, those expressed in already established model systems, and those that can be collected at high levels). Taking a
broad, comparative view can uncover important but otherwise overlooked aspects of secreted protein structure and
function.
The zona pellucida (ZP) module is a key component of
many secreted proteins (Bork and Sander 1992; Plaza et al.
2010; Litscher and Wassarman 2015; Bokhove and Jovine
2018). Named after the mammalian egg coat (in which
the first family members were found), ZP modules mediate
extracellular protein–protein interactions. Through these
actions, ZP-module-bearing proteins (hereafter referred to
simply as “ZPD proteins,” following Litscher and
Wassarman [2015]) contribute to a variety of critical cellular
and developmental processes, including regulating
© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse,
distribution, and reproduction in any medium, provided the original work is properly cited.
Genome Biol. Evol. 0(0):pp. 1240–1255. doi:10.1093/gbe/evaa095 Advance Access publication 18 May 2020
*Corresponding author: E-mail:
are capable of folding independently in vitro (Lin et al. 2011;
Diestel et al. 2013; Bokhove et al. 2016) and they contribute
to protein–protein binding interfaces in some ZPD proteins
(Han et al. 2010; Lin et al. 2011; Diestel et al. 2013; Okumura
et al. 2015). These points combine to suggest that standalone
ZP-C domains could in theory prove functional and exist on
their own in nature.
ZP modules are characterized by the presence of multiple
intradomain disulfide bonds (Bork and Sander 1992).
However, the number of cysteine residues found per module
varies, and this has led to conflicting views about how the cysteines connect and whether this variation has any functional
effect (Jovine et al. 2005; Yonezawa 2014). ZP modules have
often been classified as either Type I or Type II based on the
number of cysteines found within the ZP-C domain; these two
groups were proposed to have nonnested connectivity patterns,
and to differ functionally, with Type II but not Type I modules
able to homopolymerize (Boja et al. 2003; Darie et al. 2004;
Kanai et al. 2008). However, in light of the solved structures of
a few ZP modules and isolated ZP-C domains, it was subsequently argued that there is no reliable distinction between
these groups, and that polymerization tendencies are unrelated to cysteine connectivity patterns (Bokhove and Jovine
2018). Rather, Bokhove and Jovine (2018) proposed that ZP-C domains
typically have a standard set of three disulfide bonds (Cys5–
Cys7, Cys6–Cys8, and CysA–CysB), with cysteine variation
among ZPD proteins resulting primarily from lineage-specific
gains and losses of disulfide pairs.
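The gain-and-loss view of cysteine variation can be made concrete with a small bookkeeping sketch. The Python below is only an illustration under stated assumptions, not a method used in this study. The helper names (`bond`, `label`, `compare_to_canonical`) are hypothetical, and the "repartnered" rule (a cysteine that appears in both a lost and a gained bond) is a simplifying assumption about how reshuffling might be flagged. The bond labels are the three standard ZP-C disulfides named in the text.

```python
# Illustrative sketch only; not the analysis pipeline used in the paper.
# Bond labels follow the Cys5-Cys7 / Cys6-Cys8 / CysA-CysB naming in the text.


def bond(a, b):
    """Return an unordered cysteine pair (a disulfide bond)."""
    return frozenset((a, b))


# Standard ZP-C disulfide set as described in the text.
CANONICAL_ZPC = frozenset({
    bond("Cys5", "Cys7"),
    bond("Cys6", "Cys8"),
    bond("CysA", "CysB"),
})


def label(pair):
    """Format a bond as a stable, human-readable string."""
    return "-".join(sorted(pair))


def compare_to_canonical(observed):
    """Summarize how an observed bond set differs from the standard set.

    'repartnered' lists cysteines found in both a lost and a gained bond,
    which is this sketch's (assumed) signature of reshuffling.
    """
    observed = frozenset(observed)
    lost = CANONICAL_ZPC - observed
    gained = observed - CANONICAL_ZPC
    lost_cys = set().union(*lost)
    gained_cys = set().union(*gained)
    return {
        "lost": sorted(label(b) for b in lost),
        "gained": sorted(label(b) for b in gained),
        "repartnered": sorted(lost_cys & gained_cys),
    }


# A module retaining only Cys5-Cys7 (a pure loss pattern).
loss_only = compare_to_canonical({bond("Cys5", "Cys7")})
print(loss_only)

# Hypothetical reshuffled pattern: same cysteines, different partners.
reshuffled = compare_to_canonical({
    bond("Cys5", "Cys6"),
    bond("Cys7", "Cys8"),
    bond("CysA", "CysB"),
})
print(reshuffled)
```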
For example, the ZP module component of the BMP coreceptor endoglin lacks the Cys6–Cys8 and CysA–CysB disulfides found in uromodulin (Saito et al. 2017), whereas
additional disulfides associated with lineage-specific insertions
have been found in some vertebrate egg-coat proteins (e.g.,
trout VEa/b and chicken ZP3; Darie et al. 2004; Han et al.
2010). The case of ZP3 is an interesting example, as this family
of egg-coat proteins possesses a ZP-C subdomain that introduces four additional cysteine residues that are closely situated both along the sequence and in 3D space. Through
protein crystallography of chicken ZP3, Han et al. (2010)
showed that disulfide bonds covalently link the ZP-C core to
its subdomain. By contrast, the results of earlier mass spectrometric analysis of other vertebrate ZP3 proteins (but not including chicken ZP3) indicated several cases where the
subdomain’s cysteines paired only among each other (Boja
et al (...truncated)