Genome-Wide Population-Based Association Study of Extremely Overweight Young Adults – The GOYA Study
et al. (2011) Genome-Wide Population-Based Association Study of Extremely
Overweight Young Adults - The GOYA Study. PLoS ONE 6(9): e24303. doi:10.1371/journal.pone.0024303
Lavinia Paternoster
David M. Evans
Ellen Aagaard Nohr
Claus Holst
Valerie Gaborieau
Paul Brennan
Anette Prior Gjesing
Niels Grarup
Daniel R. Witte
Torben Jørgensen
Allan Linneberg
Torsten Lauritzen
Anelli Sandbaek
Torben Hansen
Oluf Pedersen
Katherine S. Elliott
John P. Kemp
Beate St. Pourcain
George McMahon
Diana Zelenika
Jörg Hager
Mark Lathrop
Nicholas J. Timpson
George Davey Smith
Thorkild I. A. Sørensen
Editor: Philip Awadalla, University of Montreal, Canada
Background: Thirty-two common variants associated with body mass index (BMI) have been identified in genome-wide association studies, explaining ~1.45% of BMI variation in general population cohorts. We performed a genome-wide association study in a sample of young adults enriched for extremely overweight individuals. We aimed to identify new loci associated with BMI and to ascertain whether using an extreme sampling design would identify the variants known to be associated with BMI in general populations.

Methodology/Principal Findings: From two large Danish cohorts we selected all extremely overweight young men and women (n = 2,633), and equal numbers of population-based controls (n = 2,740, drawn randomly from the same populations as the extremes, representing ~212,000 individuals). We followed up novel (at the time of the study) association signals (p<0.001) from the discovery cohort in a genome-wide study of 5,846 Europeans, before attempting to replicate the most strongly associated 28 SNPs in an independent sample of Danish individuals (n = 20,917) and a population-based cohort of 15-year-old British adolescents (n = 2,418). Our discovery analysis identified SNPs at three loci known to be associated with BMI with genome-wide confidence (P<5×10⁻⁸; FTO, MC4R and FAIM2). We also found strong evidence of association at the known TMEM18, GNPDA2, SEC16B, TFAP2B, SH2B1 and KCTD15 loci (p<0.001), and nominal association (p<0.05) at a further 8 loci known to be associated with BMI. However, meta-analyses of our discovery and replication cohorts identified no novel associations.

Significance: Our results indicate that the detectable genetic variation associated with extreme overweight is very similar to that previously found for general BMI. This suggests that population-based study designs with enriched sampling of individuals with the extreme phenotype may be an efficient method for identifying common variants that influence quantitative traits and a valid alternative to genotyping all individuals in large population-based studies, which may require tens of thousands of subjects to achieve similar power.
Genome-wide association (GWA) studies have successfully
identified genetic loci associated with body mass index (BMI)
[1–3]. Despite the very large sample sizes employed in these
studies (>120,000 in the most recent), most of the heritability has
yet to be explained, with the 32 confirmed loci accounting for only
~1.45% of the variance in BMI (2–4% of the genetic variance)
[3].
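The genetic-variance figure in parentheses above follows from dividing the variance explained by the heritability of BMI. A minimal sketch of that arithmetic, assuming a heritability between 0.4 and 0.7 (a commonly quoted range that is an assumption here, not a value stated in this text); the function name is hypothetical:

```python
def share_of_genetic_variance(variance_explained, heritability):
    """Fraction of the genetic variance captured by the known loci.

    variance_explained: fraction of total phenotypic variance explained.
    heritability: fraction of phenotypic variance that is genetic (assumed).
    """
    return variance_explained / heritability


# ~1.45% of BMI variance is explained by the 32 confirmed loci (from the text).
VARIANCE_EXPLAINED = 0.0145

# Assumed heritability bounds for BMI; illustrative, not taken from this text.
low_share = share_of_genetic_variance(VARIANCE_EXPLAINED, heritability=0.7)
high_share = share_of_genetic_variance(VARIANCE_EXPLAINED, heritability=0.4)

print(f"{low_share:.1%} to {high_share:.1%} of the genetic variance")
```

Under these assumed bounds the known loci capture roughly 2% to 4% of the genetic variance, so the large majority of the heritable component remains unexplained.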
Given the difficulty in identifying loci responsible for small
proportions of the phenotypic variance in BMI, a number of
strategies have been proposed to increase the power to detect
association. One suggestion has been to selectively genotype
individuals at either one or both ends of the distribution of BMI
scores (i.e. obese and/or extremely lean individuals) [4–7]. The
rationale is that individuals taken from the extreme ends of the
sample are more likely to be enriched for alleles of interest than
individuals sampled from the middle of the distribution.
Theoretically, under simple models of many variants of small
effect, this selection strategy should markedly increase power to
detect association relative to a similar size sample of unselected
individuals. Case individuals can be considered, under certain
assumptions, to reflect extreme scores on an underlying continuous
distribution of disease liability, which is the primary reason why
case-control studies are expected to have greater power to detect
loci than the same number of individuals selected randomly from a
population cohort.
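The enrichment argument above can be illustrated with a small simulation. This is a sketch under a deliberately simple model, a single additive variant acting on a standardised trait; the allele frequency (0.4), per-allele effect (0.1 SD), population size and 1% cut-off are illustrative assumptions, not values from this study, and the function names are hypothetical.

```python
import random


def simulate_population(n, maf, beta, rng):
    """Return (genotype, trait) pairs.

    genotype is the effect-allele count (0, 1 or 2); trait is
    beta * genotype plus standard normal noise.
    """
    people = []
    for _ in range(n):
        g = (rng.random() < maf) + (rng.random() < maf)
        people.append((g, beta * g + rng.gauss(0.0, 1.0)))
    return people


def allele_frequency(people):
    """Effect-allele frequency in a group of (genotype, trait) pairs."""
    return sum(g for g, _ in people) / (2 * len(people))


rng = random.Random(2011)  # fixed seed so the sketch is reproducible
population = simulate_population(n=200_000, maf=0.4, beta=0.1, rng=rng)

# Upper extreme: the top 1% of trait values.
ranked = sorted(population, key=lambda p: p[1], reverse=True)
extremes = ranked[: len(ranked) // 100]

# Comparison group: a random draw of the same size from the same population.
controls = rng.sample(population, len(extremes))

freq_extreme = allele_frequency(extremes)
freq_control = allele_frequency(controls)
print(f"extremes: {freq_extreme:.3f}  controls: {freq_control:.3f}")
```

Under this model the effect allele is noticeably more frequent among the extremes than among the random controls, which is the contrast a selective genotyping design exploits.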
Less well appreciated is that under certain scenarios, selecting
extreme individuals will not always increase power to detect
common genetic variants [8]. One reason is that some individuals
exhibiting extreme trait values may carry rare alleles of large
effect, rather than reflecting the normal variation from common
alleles at quantitative trait loci. Consequently, an extreme sample
may not be enriched for common alleles of interest. Whilst rare
alleles will also be of interest, it may be difficult to identify them via
genome-wide association, since commercial SNP chips have
limited ability to tag rare variants [9] and the power to detect
rare variants via genetic association is low in general. In addition,
individuals may exhibit extreme BMI because of non-genetic
factors or rare combinations of gene-gene or gene-environment
interactions, all of which may also decrease power to detect
common variants of small effect. Thus, if extreme BMI is driven by
risk factors distinct from those acting across the general
distribution of BMI, such a study design may not be useful for
identifying BMI risk alleles in the general population.
Three previous GWA studies have employed an extreme
sampling strategy to detect obesity loci. One study, which
compared gastric bypass surgery patients with population controls,
found no novel loci, but reported that 6 of the 12 BMI variants
known at the time were associated with risk of severe obesity,
suggesting that obesity generally represents the extreme of a
phenotypic spectrum, rather than a distinct condition [10].
A second study sampled early-onset obese children and morbidly
obese (BMI>40 kg/m²) adults, and compared them to normal
weight controls [11]. As well as identifying variants in FTO and
MC4R, the authors detected a further three loci associated with obesity
(NPC1, MAF and PTER). The third study aimed to identify variants
associated with early-onset extreme obesity [12]. It detected
the known FTO, MC4R and TMEM18 loci as well as two novel
loci (SDCCAG8 and TNKS/MSRA). However, the authors of these
three studies analysed the data using a dichotomous coding (i.e.
obese vs non-obese), effectively discarding information from the
underlying continuous distribution of BMI values. Since
case-control methods ignore the underlying BMI trait values, they do
not make complete use of the data, and are therefore inefficient
and likely to be less powerful than approaches that incorporate
quantitative information [13].
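The information lost by dichotomous coding can be sketched numerically. The example below is not the analysis used in any of the cited studies: it simulates an unselected sample under assumed parameters (allele frequency 0.4, per-allele effect 0.15 SD), splits the trait at the sample median, and compares the genotype correlation obtained with the continuous trait against the one obtained with the binary status. All names and values are illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(24303)  # fixed seed so the sketch is reproducible
N, MAF, BETA = 100_000, 0.4, 0.15  # illustrative assumptions

genotypes, traits = [], []
for _ in range(N):
    g = (rng.random() < MAF) + (rng.random() < MAF)  # effect-allele count
    genotypes.append(g)
    traits.append(BETA * g + rng.gauss(0.0, 1.0))

# Dichotomous coding: 1 if the trait is above the sample median, else 0.
median = sorted(traits)[N // 2]
status = [1 if t > median else 0 for t in traits]

r_quantitative = pearson(genotypes, traits)
r_dichotomous = pearson(genotypes, status)

# In large samples the association test statistic is roughly r * sqrt(n),
# so a smaller correlation translates directly into a weaker test.
z_quantitative = r_quantitative * math.sqrt(N)
z_dichotomous = r_dichotomous * math.sqrt(N)
print(f"quantitative z = {z_quantitative:.1f}, dichotomous z = {z_dichotomous:.1f}")
```

With the same individuals and genotypes, the binary coding yields the smaller correlation and therefore the weaker test statistic, which is the inefficiency described above.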
In this study we aimed to investigate the genetic profile at the
upper extreme of the BMI distribution using a population sample
enriched for these individuals. We employed a genome-wi (...truncated)