Discovering genetic interactions bridging pathways in genome-wide association studies
ARTICLE
https://doi.org/10.1038/s41467-019-12131-7
OPEN
Gang Fang1,6, Wen Wang2,6, Vanja Paunic2, Hamed Heydari3, Michael Costanzo3, Xiaoye Liu2,
Xiaotong Liu2, Benjamin VanderSluis2, Benjamin Oately2, Michael Steinbach2, Brian Van Ness4,
Eric E. Schadt1, Nathan D. Pankratz5, Charles Boone3, Vipin Kumar2 & Chad L. Myers2
Genetic interactions have been reported to underlie phenotypes in a variety of systems, but
the extent to which they contribute to complex disease in humans remains unclear. In
principle, genome-wide association studies (GWAS) provide a platform for detecting genetic
interactions, but existing methods for identifying them from GWAS data tend to focus on
testing individual locus pairs, which undermines statistical power. Importantly, a global
genetic network mapped for a model eukaryotic organism revealed that genetic interactions
often connect genes between compensatory functional modules in a highly coherent manner.
Taking advantage of this expected structure, we developed a computational approach called
BridGE that identifies pathways connected by genetic interactions from GWAS data. Applying
BridGE broadly, we discover significant interactions in Parkinson’s disease, schizophrenia,
hypertension, prostate cancer, breast cancer, and type 2 diabetes. Our novel approach
provides a general framework for mapping complex genetic networks underlying human
disease from genome-wide genotype data.
1 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA. 2 Department of Computer Science
and Engineering, University of Minnesota, Minneapolis, MN 55455, USA. 3 Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada.
4 Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN 55455, USA. 5 Department of Laboratory Medicine and
Pathology, University of Minnesota, Minneapolis, MN 55455, USA. 6 These authors contributed equally: Gang Fang, Wen Wang. Correspondence and
requests for materials should be addressed to G.F., V.K., or C.L.M.
NATURE COMMUNICATIONS | (2019)10:4274 | https://doi.org/10.1038/s41467-019-12131-7 | www.nature.com/naturecommunications
Genome-wide association studies (GWAS) have been
increasingly successful at identifying single-nucleotide
polymorphisms (SNPs) with statistically significant association to a variety of diseases1,2 and gene sets significantly
enriched for SNPs with moderate association3. However, for most
diseases, there remains a substantial disparity between the disease
risk explained by the discovered loci and the estimated total
heritable disease risk based on familial aggregation4,5. While there
are a number of possible explanations for this “missing heritability”, including many loci with small effects or rare variants4,
genetic interactions between loci are one potential culprit5,6.
Genetic interactions generally refer to a combination of two or
more genes whose contribution to a phenotype cannot be completely explained by their independent effects5,7. One example of
an extreme genetic interaction is synthetic lethality where two
mutations, neither of which is lethal on its own, combine to
generate a lethal double mutant phenotype. Thus, genetic interactions may explain how relatively benign variation can combine
to generate more extreme phenotypes, including complex human
diseases4,5,8. Several studies have reported genetic interactions
between specific variants in various disease contexts7,9, and
scalable computational tools have been developed for searching
for interactions amongst SNPs7,10. However, systematic discovery
of statistically significant genetic interactions on a genome-scale
remains a major challenge. For example, a theoretical analysis
estimated that ~500,000 subjects would be needed to detect significant genetic interactions under reasonable assumptions5,
which remains beyond the cohort sizes available for a typical
GWAS or even for the large majority of meta-GWAS.
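To make the scale of an exhaustive pairwise search concrete, the following minimal sketch counts the locus pairs such a scan would test and the per-test significance threshold that a Bonferroni correction would imply. The 500,000-SNP panel size, the 0.05 family-wise error rate, and the function name are illustrative assumptions rather than values from this study; the sample-size estimate cited above comes from a separate theoretical analysis.

```python
from math import comb


def pairwise_test_burden(n_snps, alpha=0.05):
    """Return (number of unordered SNP pairs, Bonferroni per-test threshold)."""
    n_pairs = comb(n_snps, 2)  # n * (n - 1) / 2 unordered pairs
    return n_pairs, alpha / n_pairs


# Hypothetical genotyping panel of 500,000 SNPs (an assumption for illustration).
n_pairs, threshold = pairwise_test_burden(500_000)
print(f"{n_pairs:,} SNP pairs; per-test threshold {threshold:.1e}")
```

Under these assumed values the scan covers 124,999,750,000 pairs, so each individual test must clear a threshold on the order of 10^-13, which illustrates why testing individual locus pairs leaves little statistical power.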
Genome-wide, reverse genetic screens in model organisms
have produced rich insights into the prevalence and organization
of genetic interactions11,12. Specifically, the mapping and analysis
of the yeast genetic network revealed that genetic interactions are
numerous and tend to cluster into highly organized network
structures, connecting genes in two different but compensatory
functional modules (e.g., pathways or protein complexes) as
opposed to appearing as isolated instances11,13. For example,
nonessential genes belonging to the same pathway often exhibit
negative genetic interactions with the genes of a second nonessential pathway that impinges on the same essential function
(Fig. 1a). Owing to their functional redundancy, each of the two
pathways can compensate for the loss of the other, and thus, only
simultaneous perturbation of both pathways (e.g., A* and Y*)
(Fig. 1a) results in an extreme loss of function phenotype, which
could be associated with either increased or decreased disease
risk. Importantly, the same phenotypic outcome could be
achieved by several different combinations of genetic perturbations in both pathways (e.g., A-X, A-Z, B-X, B-Y, B-Z) (Fig. 1b).
This model for the local topology of genetic networks, called the
“between-pathway model” (BPM), has been widely observed in
yeast genetic interaction networks11,14. Indeed, as many as ~70%
of negative genetic interactions observed in yeast occur in BPM
structures, indicating that genetic interactions are highly organized and this type of local clustering is the rule rather than the
exception13. In addition to BPMs, combinations of mutations in
genes within the same pathway or protein complex also tend to
exhibit a high frequency of genetic interaction (Fig. 1b), a
network structure referred to as a “within-pathway model”
(WPM)11,14. Indeed, ~80% of essential protein complexes in yeast
exhibit a significantly elevated frequency of within-pathway
interactions15. In the context of human disease, a WPM may
reflect an individual who inherits two variants in the same
pathway, resulting in reduced flux or function of that
pathway and an increase or decrease in disease risk.
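The BPM and WPM structures described above can be illustrated with a toy calculation of interaction density between and within gene sets. This is only a sketch under stated assumptions: the gene labels and cross-pathway pairs follow the Fig. 1 example in the text, the A-B within-pathway pair and the function names are hypothetical, and the calculation is not the BridGE scoring procedure.

```python
from itertools import combinations


def between_pathway_density(pathway_1, pathway_2, interactions):
    """Fraction of cross-pathway gene pairs that interact (assumes disjoint pathways)."""
    pairs = [frozenset((a, b)) for a in pathway_1 for b in pathway_2]
    if not pairs:
        return 0.0
    return sum(pair in interactions for pair in pairs) / len(pairs)


def within_pathway_density(pathway, interactions):
    """Fraction of gene pairs inside one pathway that interact."""
    pairs = [frozenset(c) for c in combinations(sorted(pathway), 2)]
    if not pairs:
        return 0.0
    return sum(pair in interactions for pair in pairs) / len(pairs)


# Toy data: cross-pathway pairs taken from the example in the text;
# the A-B pair is a hypothetical within-pathway interaction.
pathway_1 = {"A", "B"}
pathway_2 = {"X", "Y", "Z"}
interactions = {
    frozenset(p)
    for p in [("A", "X"), ("A", "Z"), ("B", "X"), ("B", "Y"), ("B", "Z"), ("A", "B")]
}

bpm_density = between_pathway_density(pathway_1, pathway_2, interactions)
wpm_density_1 = within_pathway_density(pathway_1, interactions)
wpm_density_2 = within_pathway_density(pathway_2, interactions)
print(bpm_density, wpm_density_1, wpm_density_2)
```

A pathway pair whose cross-pathway density is high relative to the background interaction rate would resemble a BPM, while a single pathway with high internal density would resemble a WPM. Aggregating over many pairs in this way is the intuition behind testing at the pathway level rather than at the level of individual locus pairs.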
The prevalence of BPM and WPM structures observed in the
yeast global genetic network has important practical implications
that can be exploited to explore disease-associated genetic interactions in humans based on GWAS data. Although tests to
identify genetic interactions between specific SNP or gene pairs
are statistically under-powered, we may be able to detect genetic
interactions by leveraging the fact that pairwise interactions
between genome variants are likely to cluster into larger BPM and
WPM network structures similar to those observed in the yeast
global (...truncated)