Using gene expression to investigate the genetic basis of complex disorders

Human Molecular Genetics, Oct 2008

The identification of complex disease susceptibility loci through genome-wide association studies (GWAS) has recently become possible and is now a method of choice for investigating the genetic basis of complex traits. The number of results from such studies is constantly increasing, but the challenge ahead is to identify the biological context in which these statistically significant candidate variants act. Regulatory variation plays an important role in shaping phenotypic differences among individuals and is thus very likely to influence disease susceptibility as well. As such, integrating gene expression data and other disease-relevant intermediate phenotypes with GWAS results could help prioritize fine-mapping efforts and provide a shortcut to disease biology. Combining these different levels of information in a meaningful way is, however, not trivial. In the present review, we outline the approaches that have been explored so far to this end and what they have achieved. We also discuss the limitations of the methods and how upcoming technological developments could help circumvent them. Overall, such efforts will be very helpful in understanding, initially, regulatory effects on disease and, more generally, disease etiology.



Alexandra C. Nica and Emmanouil T. Dermitzakis

The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge CB10 1HH, UK

The ability of genome-wide association studies (GWAS) to help understand the genetic basis of complex disorders has recently become apparent. Well-documented maps of common human genetic variation (e.g. the HapMap project) (1), large patient samples with accurately recorded phenotypic information, and appropriate statistical methods to assess significance (2) and account for potential biases have all contributed to the current surge of successful GWAS. Numerous susceptibility variants for a large number of complex diseases have been reported and effectively replicated.
A current catalog of published GWAS (http://www.genome.gov/26525384) includes single nucleotide polymorphisms (SNPs) associated not only with major common disorders [Crohn's disease (3), type 2 diabetes (4), lung cancer (5) etc.] but also with disease-relevant or anthropometric quantitative traits [e.g. body mass index (6) or height (7)]. What has not kept pace with the capacity to design and perform successful GWAS, however, is our ability to understand how variants discovered via this hypothesis-free approach influence complex traits. In fact, few association studies go beyond reporting the most statistically significant hits, and when they do, the suggested functionality is typically speculative, based on the available annotation of genes in the vicinity of the variants. Since many of the discovered susceptibility polymorphisms fall in non-coding regions, and with an increasing number of regulatory variants already implicated in a series of common disorders (8), one conventional approach has been to interrogate disease-associated SNPs for associations with differential gene expression. Moffatt et al. (9) found that the most significant SNPs associated with childhood asthma risk also explain 29.5% of the variance in ORMDL3 transcript levels, measured in lymphoblastoid cell lines. While this is an interesting observation, it still cannot be regarded as convincing evidence for a causal relationship between ORMDL3 and asthma onset. The concurrent progress towards uncovering the genetic basis of regulatory variation (10) has revealed an abundance of expression quantitative trait loci (eQTLs) in the human genome, making an accidental overlap between these and disease signals very likely. Thus, while gene expression is a very informative and immediate DNA phenotype, integrating expression data with disease genetics studies for an ultimate understanding of disease etiology is not straightforward.
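A statement such as "SNPs explain 29.5% of the variance in transcript levels" refers to a coefficient of determination. As a minimal sketch, assuming a simple linear regression of expression on allele dosage (0, 1 or 2 copies of one allele), that fraction is the squared correlation between the two. The function name and the toy values below are hypothetical and are not taken from Moffatt et al. or from any dataset in this review.

```python
from statistics import mean


def variance_explained(dosages, expression):
    """Return R^2 from a simple linear regression of expression on allele dosage.

    dosages: allele counts per individual (0, 1 or 2).
    expression: transcript level per individual, in the same order.
    """
    if len(dosages) != len(expression) or len(dosages) < 3:
        raise ValueError("need at least three matched observations")
    g_bar = mean(dosages)
    e_bar = mean(expression)
    # Sums of squares and cross-products around the means.
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(dosages, expression))
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    syy = sum((e - e_bar) ** 2 for e in expression)
    if sxx == 0 or syy == 0:
        raise ValueError("genotype and expression must both vary")
    # For one predictor, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy data: four individuals, two homozygous for each allele.
toy_dosages = [0, 0, 2, 2]
toy_expression = [1.0, 3.0, 3.0, 5.0]
toy_r2 = variance_explained(toy_dosages, toy_expression)
```

A high value computed this way shows only that genotype and expression covary in the sampled tissue. As the text notes, it does not by itself establish that the transcript mediates disease risk.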
ADVANCES AND CURRENT ISSUES IN EXPRESSION AND DISEASE STUDIES

Power of current eQTL studies

Natural variation in human gene expression has recently been quantified on a genome-wide scale using microarray technologies. Linkage and association studies coupling expression with genetic variability data have started to reveal the genetics underlying part of this variation, including complex allele-specific interactions (11) and its relatively high level of heritability (12-15). Most of the variants discovered with these approaches (a field also called Genetical Genomics) explain variance in transcript levels of nearby genes (so-called cis eQTLs), but a few distal-acting regulators have also been reported (trans associations). The sample sizes of genome-wide expression association studies have been fairly small though, meaning that the discoveries made so far generally represent large genetic effects [Stranger et al. (12) report an R2 coefficient of determination ranging from 0.27 to almost 1 for the SNP-gene associations detected in the 270 HapMap individuals]. The magnitude of the discovered effects drops when pooling populations together with appropriate corrections, a direct consequence of the increased statistical power due to the larger sample sizes. The importance of appropriate statistical power has been extensively demonstrated in complex disease GWAS, where samples of a few thousand paired cases and controls have become a prerequisite (16). The main reason for this requirement is the fact that the individual contribution of genetic variants towards complex trait determination is known to be small. In fact, all susceptibility alleles discovered so far explain only a small fraction of disease risk, with odds ratios typically in the range of 1.2-1.5 (16,17).
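Why odds ratios of this size call for thousands of samples can be illustrated with a small calculation. The sketch below computes an allelic odds ratio from a 2x2 table of allele counts, together with an approximate 95% confidence interval on the log scale (the Woolf method, a standard approximation that the review itself does not name). The function name and all counts are hypothetical and chosen only to give an odds ratio within the quoted range.

```python
import math


def allelic_odds_ratio(case_risk, case_other, control_risk, control_other, z=1.96):
    """Return (odds ratio, lower bound, upper bound) from allele counts.

    The interval is the Woolf approximation: exp(log(OR) +/- z * SE),
    with SE = sqrt(1/a + 1/b + 1/c + 1/d) over the four cell counts.
    """
    counts = (case_risk, case_other, control_risk, control_other)
    if min(counts) <= 0:
        raise ValueError("all four allele counts must be positive")
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts with identical allele frequencies at two sample sizes:
# 200 cases and 200 controls (400 alleles per group) ...
small_study = allelic_odds_ratio(120, 280, 100, 300)
# ... versus 2000 cases and 2000 controls (4000 alleles per group).
large_study = allelic_odds_ratio(1200, 2800, 1000, 3000)
```

With these made-up counts the point estimate is the same in both studies, but only the larger one yields an interval that excludes 1. The same logic applies to expression studies: effects much smaller than those reported so far would need correspondingly larger samples to be detected.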
Given the marked difference between the magnitudes of detected genetic effects on expression variation and on disease predisposition, it is not surprising that only a few instances of overlapping signals have been observed, even when expression in a disease-relevant tissue was considered. Small genetic effects on expression variation, or complex interactions between regulatory variants with moderate or large effects, could become decisive on a permissive environmental background. Current expression analyses are underpowered with respect to these kinds of discoveries; hence whole-genome expression association studies on larger samples would be very desirable. Such efforts are under way, including the quantification of expression levels in blood cells of 820 HapMap III individuals from eight populations (Barbara Stranger, Stephen Montgomery and Emmanouil Dermitzakis, personal communication). Combined with SNP genotyping data, this resource will give insight into the level of expression differences among populations and generate many additional eQTLs with more subtle effects, some of them potentially related to disease.

Tissue-specific phenotypes

Confined by the availability of human tissue samples, expression experiments have been initially performed in lympho (...truncated)


This is a preview of a remote PDF: https://hmg.oxfordjournals.org/content/17/R2/R129.full.pdf
Article home page: http://hmg.oxfordjournals.org/content/17/R2/R129.abstract

Alexandra C. Nica, Emmanouil T. Dermitzakis. Using gene expression to investigate the genetic basis of complex disorders, Human Molecular Genetics, 2008, pp. R129-R134, 17/R2, DOI: 10.1093/hmg/ddn285