Adaptive evolution of centromere proteins in plants and animals
Journal of Biology (BioMed Central)
Open Access
Research article
Paul B Talbert, Terri D Bryson and Steven Henikoff
Address: Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue N, Seattle, WA 98109-1024, USA.
Correspondence: Steven Henikoff. E-mail:
Published: 31 August 2004
Received: 25 May 2004
Revised: 20 July 2004
Accepted: 22 July 2004
Journal of Biology 2004, 3:18
The electronic version of this article is the complete one and can be
found online at http://jbiol.com/content/3/4/18
© 2004 Talbert et al., licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution
License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the
original work is properly cited.
Abstract
Background: Centromeres represent the last frontiers of plant and animal genomics.
Although they perform a conserved function in chromosome segregation, centromeres are
typically composed of repetitive satellite sequences that are rapidly evolving. The
nucleosomes of centromeres are characterized by a special H3-like histone (CenH3), which
evolves rapidly and adaptively in Drosophila and Arabidopsis. Most plant, animal and fungal
centromeres also bind a large protein, centromere protein C (CENP-C), that is characterized
by a single 24 amino-acid motif (CENPC motif).
Results: Whereas we find no evidence that mammalian CenH3 (CENP-A) has been evolving
adaptively, mammalian CENP-C proteins contain adaptively evolving regions that overlap with
regions of DNA-binding activity. In plants we find that CENP-C proteins have complex
duplicated regions, with conserved amino and carboxyl termini that are dissimilar in sequence
to their counterparts in animals and fungi. Comparisons of Cenpc genes from Arabidopsis
species and from grasses revealed multiple regions that are under positive selection, including
duplicated exons in some grasses. In contrast to plants and animals, yeast CENP-C (Mif2p) is
under negative selection.
Conclusions: CENP-Cs in all plant and animal lineages examined have regions that are rapidly
and adaptively evolving. To explain these remarkable evolutionary features for a single-copy
gene that is needed at every mitosis, we propose that CENP-Cs, like some CenH3s, suppress
meiotic drive of centromeres during female meiosis. This process can account for the rapid
evolution and the complexity of centromeric DNA in plants and animals as compared to fungi.
Background
Centromeres are the chromosomal loci where kinetochores
assemble to serve as attachment sites for the spindle microtubules that direct chromosome segregation during mitosis
and meiosis. Despite this essential conserved function in all
eukaryotes, centromere structure is highly variable, ranging
from the simple short centromeres of budding yeast, which
have a consensus sequence of approximately 125 base
pairs (bp) on each chromosome, to holokinetic centromeres that span the entire length of a chromosome [1].
In plants and animals, centromeres are large and complex,
typically comprising megabase-sized arrays of tandemly
repeated satellite sequences that are rapidly evolving [2] and
may differ significantly between closely related species [3-5].
The failure of conventional cloning and sequencing assembly tools to adequately characterize rapidly evolving satellite
sequences at centromeres has made them the last regions of
most eukaryotic genomes to be well understood [1].
Although there is no discernible conservation of centromeric
DNA sequences in disparate eukaryotes, considerable
progress has been made in identifying common proteins that
form the kinetochore [6]. A universal protein component of
centromeric chromatin found in all eukaryotes that have
been examined is a centromere-specific variant of histone H3
(CenH3), which replaces canonical H3 in centromeric
nucleosomes [7,8]. CenH3s are essential kinetochore components yet, like centromeric DNA, they are rapidly evolving
[1]. In both Drosophila [9] and Arabidopsis [10], this rapid
evolution of CenH3s is associated with positive selection
(adaptive evolution), and involves regions of CenH3 that are
predicted to contact the centromeric DNA [9,11,12].
The finding of positive selection in a protein that is required
at every cell division is remarkable. Ancient proteins with
conserved function are expected to be under negative selection because they typically have achieved an optimal
sequence, so new mutations tend to produce deleterious
variants that are quickly eliminated from populations. The
canonical histones are extreme examples of this type of
protein. In contrast, recurrent positive selection generally
occurs as a consequence of genetic conflict, for example in
the ‘arms race’ between pathogen surface antigens and the
immune-cell proteins that recognize them. In this case, a
mutation in a surface antigen that allows the pathogen to
escape detection and proliferate will trigger selection for a
new immune receptor to fight the mutated pathogen, which
can then mutate again, and so on. The evidence for positive
selection of CenH3 proteins specifically in the regions that
contact DNA thus suggests a conflict between centromeric
DNA and a histone component of the nucleosome that
packages it. Is it commonplace for eukaryotes to have such
a conflict at their centromeres? Is the conflict unique to
centromere-specific histones, or are other proteins that bind
centromeres also involved in this conflict? Is conflict
responsible for centromere complexity? To answer these
questions, we investigated the evolution of a second
common DNA-binding kinetochore protein.
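The contrast between negative and positive selection drawn above is commonly made concrete by comparing nucleotide changes that alter the encoded amino acid (nonsynonymous) with those that do not (synonymous). The following is a minimal sketch of that idea, assuming two aligned, gap-free coding sequences and the standard genetic code. The function name and the toy sequences are hypothetical, and this raw codon count is only an illustration, not the statistical test applied in this study, which this section does not describe.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO_ACIDS
    )
}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Hypothetical helper: assumes both sequences are in frame, aligned
    without gaps, and contain only the letters A, C, G and T.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # identical codons carry no information
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1  # same amino acid encoded
        else:
            nonsynonymous += 1  # amino acid replaced
    return synonymous, nonsynonymous


# Toy example (invented sequences): GCT -> GCC keeps alanine,
# AAA -> AGA replaces lysine with arginine.
print(count_codon_differences("ATGGCTAAA", "ATGGCCAGA"))
```

An excess of amino-acid replacements relative to silent changes, after correcting for the number of sites of each kind, is the usual signature of positive selection; a deficit indicates negative selection. The sketch omits that per-site correction and any significance test.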
Of the handful of essential kinetochore proteins that are
widely distributed among eukaryotes, only one class other
than CenH3 has been shown to bind centromeric DNA:
centromere protein C (CENP-C), a conserved component of
the inner kinetochore in vertebrates [13-16]. Human CENP-C
binds DNA non-specifically in vitro [17-19] and binds centromeric alpha satellite DNA in vivo [20,21]. Vertebrate
CENP-C and the yeast centromere protein Mif2p [22,23]
share a 24 amino-acid motif (CENPC motif) that has also
been found in kinetochore proteins in nematodes [24] and
plants [25]. As expected for kinetochore proteins, disruption
or inactivation of genes encoding proteins containing a
CENPC motif (CENP-Cs) results in the failure of proper
chromosome segregation [16,23,24,26-28].
Other than the defining CENPC motif, these proteins are
dissimilar in sequence across disparate phyla. Such a small
stretch of sequence conservation, accounting for less than
5% of the length of these 549-943 amino-acid proteins, is
unexpected considering that CENP-Cs are encoded b (...truncated)