A general linear model-based approach for inferring selection to climate

BMC Genetics, Sep 2013

Many efforts have been made to detect signatures of positive selection in the human genome, especially those associated with expansion from Africa and subsequent colonization of all other continents. However, most approaches have not directly probed the relationship between the environment and patterns of variation among humans. We have designed a method to identify regions of the genome under selection based on Mantel tests conducted within a general linear model framework, which we call MAntel-GLM to Infer Clinal Selection (MAGICS). MAGICS explicitly incorporates population-specific and genome-wide patterns of background variation as well as information from environmental values to provide an improved picture of selection and its underlying causes in human populations. Our results significantly overlap with those obtained by other published methodologies, but MAGICS has several advantages. These include improvements that: limit false positives by reducing the number of independent tests conducted and by correcting for geographic distance, which we found to be a major contributor to selection signals; yield absolute rather than relative estimates of significance; identify specific geographic regions linked most strongly to particular signals of selection; and detect recent balancing as well as directional selection. We find evidence of selection associated with climate (P < 10⁻⁵) in 354 genes, and among these observe a highly significant enrichment for directional positive selection. Two of our strongest 'hits', however, ADRA2A and ADRA2C, implicated in vasoconstriction in response to cold and pain stimuli, show evidence of balancing selection. Our results clearly demonstrate evidence of climate-related signals of directional and balancing selection.

Raj et al. BMC Genetics 2013, 14:87
http://www.biomedcentral.com/1471-2156/14/87

METHODOLOGY ARTICLE, Open Access

Srilakshmi M Raj, Luca Pagani, Irene Gallego Romero, Toomas Kivisild and William Amos

Equal contributors: Srilakshmi M Raj, Luca Pagani and William Amos. Correspondence: William Amos, Department of Zoology, University of Cambridge, Cambridge, UK.

© 2013 Raj et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Keywords: Climate, Adaptation, Human evolution, Natural selection, Environmental adaptation, Population genetics

Background

Within the last 100,000 years humans dispersed from Africa to occupy most of the habitable space in the world. During this process our species has successfully combined cultural buffering, biological plasticity and adaptation to cope with the wide range of new ecosystems, pathogens and climates it encountered [1-3]. Climate, in particular, comprises many diverse elements such as temperature, humidity, precipitation and solar radiation, so it would be surprising if many different genes had not been influenced by natural selection. Indeed, many physiological traits exhibit geographic trends that correlate with climate [4-8]. However, without an explicit link to global patterns of genetic variation, the extent to which these trends reflect adaptation through natural selection remains unclear.

Many genetic studies on humans have attempted to identify genes and genomic regions associated with regional adaptation by looking for signatures of selection [2,9-15]. These studies have relied on a diverse range of approaches that mostly identify outliers in the empirical genome-wide data, including searches for markers exhibiting unusually high levels of geographic differentiation [2,9], for genomic regions with high linkage disequilibrium and derived allele frequency [10], and for markers where the loss of genetic variability that occurred when humans migrated out of Africa has been particularly high or low [11-14]. These approaches suggest that a substantial proportion of the human genome contains candidates for positive selection [15]. However, it can be difficult to ascribe environmental or biological factors to any particular signal. Furthermore, wherever signatures of selection are sought by considering patterns of genetic variation in isolation, i.e. without reference to a specific hypothesis, it can become difficult to separate genuine signals from those that arise from other sources including genotyping errors and other artifacts.

One way to increase statistical power when searching for signatures of selection is to study patterns of genomic variation across populations in relation to particular environmental characteristics. For example, physiological adaptations to temperature and solar radiation, as well as several other traits, have been shown to vary along a latitudinal cline [16-18], suggesting selection by climate. Even modest regional allele frequency differences can provide evidence of selection if they correlate strongly with one or more environmental variables, provided the environmental variables are accurately measured and also approximate the selective pressure over the time of evolution. Explored earlier by Prugnolle et al. (2005) [19], this approach has been pioneered by Hancock et al. [20-23], who use a Bayesian algorithm [24] to search for markers at which variations in allele frequency correlate more strongly than the genomic average with global variation in one or more climatic variables. In this approach, absolute significance is not determined. Instead, markers are ranked in terms of their degree of association.
On the one hand this makes the approach sensibly conservative, but on the other it precludes a meaningful estimate of the proportion of the genome actually influenced by selection.

Here we present a new approach for detecting signatures of selection based on the use of general linear models to analyze similarity matrices. This framework offers three important advantages. First, data from neighboring markers can be combined into a single genetic window, thereby greatly reducing the number of independent tests that need to be performed. Second, the method is flexible, allowing incorporation of possible cofactors such as geographic distance between populations and interactions between variables. In particular, by fitting genome-wide genetic relatedness we can control for variation in the level of shared ancestry between different pairs of individuals or populations. Third, statistical significance is determined through a form of Mantel test, based on repeated randomization (scrambling) of the data at one predictor variable, allowing absolute estimates of significance rather than empirical (...truncated)
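
The general idea described above, a linear model fitted to pairwise similarity matrices with significance obtained by scrambling one predictor, can be illustrated with a minimal Python sketch. This is not the authors' MAGICS implementation: the function name `mantel_glm_test`, the choice of the climate coefficient's t-statistic as the test statistic, the two-sided comparison, and the synthetic data are all assumptions made for illustration, since the preview text is truncated before the method is specified in detail.

```python
import numpy as np


def upper_triangle(matrix):
    """Return the strictly upper-triangular entries of a square matrix as a vector."""
    rows, cols = np.triu_indices_from(matrix, k=1)
    return matrix[rows, cols]


def mantel_glm_test(genetic, climate, covariates, n_perm=199, seed=0):
    """Illustrative Mantel-style test inside a linear model (hypothetical helper).

    genetic, climate: square pairwise-distance matrices over the same populations.
    covariates: list of further distance matrices (e.g. geographic distance,
    genome-wide relatedness) fitted alongside climate.
    Returns the observed t-statistic for climate and a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    n = genetic.shape[0]
    y = upper_triangle(genetic)
    cov_cols = [upper_triangle(c) for c in covariates]

    def climate_t(clim):
        # Design matrix: intercept, climate distance, then the covariates.
        X = np.column_stack([np.ones_like(y), upper_triangle(clim), *cov_cols])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (len(y) - X.shape[1])
        cov_beta = sigma2 * np.linalg.inv(X.T @ X)
        return beta[1] / np.sqrt(cov_beta[1, 1])

    observed = climate_t(climate)
    exceed = 0
    for _ in range(n_perm):
        # Scramble population labels of the climate matrix only
        # (rows and columns together), leaving other predictors intact.
        order = rng.permutation(n)
        if abs(climate_t(climate[np.ix_(order, order)])) >= abs(observed):
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic example (made-up data, not from the paper): genetic distance
# depends on both geographic distance and temperature difference.
data_rng = np.random.default_rng(1)
n_pops = 15
coords = data_rng.uniform(0, 100, size=(n_pops, 2))
temperature = data_rng.normal(15, 10, size=n_pops)
geo = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
clim = np.abs(temperature[:, None] - temperature[None, :])
noise = data_rng.normal(0, 0.5, size=(n_pops, n_pops))
noise = (noise + noise.T) / 2  # keep the matrix symmetric
genetic = 0.02 * geo + 0.3 * clim + noise
np.fill_diagonal(genetic, 0.0)

t_obs, p_val = mantel_glm_test(genetic, clim, [geo])
print(f"t = {t_obs:.2f}, permutation p = {p_val:.3f}")
```

Because the p-value comes from the permutation distribution rather than from a genome-wide ranking, it is an absolute estimate whose resolution is limited by the number of permutations: with `n_perm` scrambles the smallest attainable value is 1 / (`n_perm` + 1). Fitting geographic distance as a covariate reflects the correction for distance that the abstract highlights.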


This is a preview of a remote PDF: https://bmcgenet.biomedcentral.com/track/pdf/10.1186/1471-2156-14-87
Article home page: https://bmcgenet.biomedcentral.com/articles/10.1186/1471-2156-14-87

Srilakshmi M Raj, Luca Pagani, Irene Gallego Romero, Toomas Kivisild, William Amos. A general linear model-based approach for inferring selection to climate, BMC Genetics, 2013, pp. 87, Volume 14, Issue 1, DOI: 10.1186/1471-2156-14-87