The planetary biology of cytochrome P450 aromatases
BMC Biology
Eric A Gaucher 2
Logan G Graddy 1
Tang Li 2
Rosalia CM Simmen 0
Frank A Simmen 0
David R Schreiber 2
David A Liberles 5
Christine M Janis 4
Steven A Benner 3
0 Department of Physiology & Biophysics, Medical Sciences & Children's Nutrition Center, University of Arkansas, 1120 Marshall Street, Little Rock, AR 72202, USA
1 Department of Psychiatry, Duke University Medical Center, Durham, NC 27708, USA
2 Foundation for Applied Molecular Evolution, 1115 NW 4th Street, Gainesville, FL 32601-4256, USA
3 Department of Chemistry, University of Florida, Gainesville, FL 32611-7200, USA
4 Ecology and Evolutionary Biology, Brown University, Providence, RI 02912, USA
5 Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, 5020 Bergen, Norway
Background: Joining a model for the molecular evolution of a protein family to the paleontological and geological records (geobiology), and then to the chemical structures of substrates, products, and protein folds, is emerging as a broad strategy for generating hypotheses concerning function in a post-genomic world. This strategy expands systems biology to a planetary context, necessary for a notion of fitness to underlie (as it must) any discussion of function within a biomolecular system.
Results: Here, we report an example of such an expansion, where tools from planetary biology were used to analyze three genes from the pig Sus scrofa that encode cytochrome P450 aromatases, enzymes that convert androgens into estrogens. The evolutionary history of the vertebrate aromatase gene family was reconstructed. Transition redundant exchange silent substitution metrics were used to interpolate dates for the divergence of family members, the paleontological record was consulted to identify changes in physiology that correlated in time with the change in molecular behavior, and new aromatase sequences from peccary were obtained. Metrics that detect changing function in proteins were then applied, including KA/KS values and those that exploit structural biology. These identified specific amino acid replacements that were associated with changing substrate and product specificity during the time of presumed adaptive change. The combined analysis suggests that aromatase paralogs arose in pigs as a result of selection for Suoidea with larger litters than their ancestors, and permitted the Suoidea to survive the global climatic trauma that began in the Eocene.
Conclusions: This combination of bioinformatics analysis, molecular evolution, paleontology, cladistics, global climatology, structural biology, and organic chemistry serves as a paradigm in planetary biology. As the geological, paleontological, and genomic records improve, this approach should become widely useful to make systems biology statements about high-level function for biomolecular systems.
Background
The emergence of complete genomes for many organisms,
including humans, has created the need for hypotheses
concerning the "function" of specific genes that encode
specific proteins. While "function" is interpreted by
different workers in different ways [1], Darwinian theory (by
axiom) requires that the term be connected to fitness;
natural selection is the only mechanism admitted by theory
to generate functional behavior in a living system, macro
or molecular. This, in turn, implies that the hypotheses
about function have a "systems" component, including
the interaction of the protein with other proteins, their
impact on the physiology (defined broadly) of the cell
and organism, and the consequences of physiology in a
changing ecosystem in a planetary context [2].
Systems hypotheses can be supported by information
from many areas. Geology, paleontology, and genomics,
for example, provide three records that capture the natural
history of past life on Earth. At the same time, structural
biology, genetics, and organic chemistry describe the
structures, behaviors and reactivities of proteins that allow
them to support present life. It has been appreciated that
a combination of these six types of analysis provides
insights into functional behavior of proteins that cannot
be provided by any of these alone [2]. Over the long term,
we expect that the histories of the geosphere, the
biosphere, and the genosphere will converge to give a
coherent picture showing the relationship between life and the
planet that supports it. This picture will be based,
however, on individual cases that serve as paradigms for
making the connection.
The aromatase family of proteins offers an interesting
system to illustrate the power of this combination as a way to
create hypotheses regarding protein function within a
system [3]. These hypotheses are not "proof", of course, but
hypotheses are now the limiting resource in genomics-inspired biological
experimentation, given that genomic data themselves are so abundant.
Aromatases are cytochrome P450-dependent enzymes
that use dioxygen to catalyze a multistep transformation
of an androgenic steroid (such as testosterone) to an
estrogenic steroid (such as estradiol) (Figure 1). The protein
plays a key role in normal vertebrate reproductive
biology, a role that appears to have arisen before fish and
tetrapods (land vertebrates, including mammals)
diverged some 375 million years ago [4]. Aromatase is
important in modern medicine as well, especially in
breast and other hormone-dependent cancers [5].
Different numbers of aromatase genes are found in
different vertebrates. Two aromatase genes are known in teleost
fish [6,7]. Only a single gene is known in the horse [8], rat
[9], and mouse [10]. Cattle have both a functional gene
and a pseudogene built from homologs of exons 2, 3, 5,
8, and 9 of their functional gene; these are interspersed
with a bovine repeat element [11,12]. In several
mammalian species, including humans and rabbits, a single gene
yields multiple forms of the mRNA for aromatase in
different tissues via alternative splicing [13-16].
A different situation again is observed in the pig (Sus
scrofa). Three different mRNA molecules had been
reported in different tissues from pig [17-21]. Compelling
evidence then emerged that the three variants of mRNA
identified in cDNA studies arose from three paralogous
genes [22], rather than from a single gene differentially
spliced [23]. This implies that the three aromatase
paralogs in pigs arose via gene duplications relatively recent in
geologic time.
Hypotheses relating to the function of the three aromatase
paralogs depend in part on when those duplications took
place. If they were very recent, the three genes might have
helped pigs adapt to domestication. If they pre-dated the
divergence of pigs and fish [6], they may have different
roles that are very fundamental to reproductive
endocrinology in vertebrates. We apply here a series of tools to
generate better hypotheses concerning the aromatase
family of paralogs in swine.
Resul (...truncated)