Integrating population variation and protein structural analysis to improve clinical interpretation of missense variation: application to the WD40 domain
Human Molecular Genetics, 2016, Vol. 25, No. 5
927–935
doi: 10.1093/hmg/ddv625
Advance Access Publication Date: 5 January 2016
Original Article
1European Bioinformatics Institute (EMBL-EBI) and 2Wellcome Trust Sanger Institute, Wellcome Genome
Campus, Hinxton, Cambridge, UK, 3Sheffield Regional Genetics Services, Sheffield Children’s Hospital, Western
Bank, Sheffield S10 2TH, UK, 4West of Scotland Genetic Services, Level 1, Laboratory Medicine Building, South
Glasgow University Hospital, 1345 Govan Road, Glasgow G51 4TF, UK, 5Human Genetics, Ninewells Hospital,
Dundee DD1 9SY, UK, 6Northern Genetics Service, Newcastle upon Tyne Hospitals NHS Foundation Trust,
Institute of Genetic Medicine, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
and 7East Anglian Medical Genetics Service, Addenbrooke’s Treatment Centre, Addenbrooke’s Hospital,
Cambridge University Hospitals, Cambridge CB2 0QQ, UK
*To whom correspondence should be addressed. Tel: +44 1223834244; Fax: +44 1223494919; Email:
Abstract
We present a generic, multidisciplinary approach for improving our understanding of novel missense variants in recently
discovered disease genes exhibiting genetic heterogeneity, by combining clinical and population genetics with protein
structural analysis. Using six new de novo missense diagnoses in TBL1XR1 from the Deciphering Developmental Disorders study,
together with population variation data, we show that the β-propeller structure of the ubiquitous WD40 domain provides a
convincing way to discriminate between pathogenic and benign variation. Children with likely pathogenic mutations in this
gene have severely delayed language development, often accompanied by intellectual disability, autism, dysmorphology and
gastrointestinal problems. Amino acids affected by likely pathogenic missense mutations are either crucial for the stability
of the fold, forming part of a highly conserved symmetrically repeating hydrogen-bonded tetrad, or located at the top face of the
β-propeller, where ‘hotspot’ residues affect the binding of β-catenin to the TBLR1 protein. In contrast, those altered by
population variation are significantly less likely to be spatially clustered towards the top face or to be at buried or highly
conserved residues. This result is useful not only for interpreting benign and pathogenic missense variants in this gene, but also
in other WD40 domains, many of which are associated with disease.
†These authors contributed equally to this work.
Received: October 27, 2015. Revised and Accepted: December 22, 2015
© The Author 2016. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/),
which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Roman A. Laskowski1,†, Nidhi Tyagi1,†, Diana Johnson3, Shelagh Joss4,
Esther Kinning4, Catherine McWilliam5, Miranda Splitt6, Janet M. Thornton1,
Helen V. Firth7, the DDD Study2 and Caroline F. Wright2, *
Introduction
Results
Six children within the DDD study were found to have likely
pathogenic de novo mutations in TBL1XR1, including five single
nucleotide variants predicted to cause a missense change, and
one 1 bp frameshift insertion predicted to result in loss of function through truncation or nonsense-mediated decay (Table 1).
Two additional likely de novo missense mutations have also
been published in children affected by developmental disorders (25,28), as well as a de novo 1 bp frameshift deletion (25).
A number of whole gene deletions have also been described
(27,29).
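The distinction drawn above between missense changes and a 1 bp frameshift can be sketched in a few lines. This is a minimal illustration, not the DDD annotation pipeline: the function name `classify_indel` and the example alleles are hypothetical, and the sketch assumes the change lies wholly within the coding sequence.

```python
def classify_indel(ref: str, alt: str) -> str:
    """Classify a coding-sequence change by its net length change.

    Simplification (assumption): the change lies wholly inside the coding
    sequence; splice-site and start/stop codon effects are ignored.
    """
    net = len(alt) - len(ref)
    if net == 0:
        # Same length: a substitution (e.g. a single nucleotide variant).
        return "substitution"
    kind = "insertion" if net > 0 else "deletion"
    # Codons are 3 nt long, so a net change that is not a multiple of 3
    # shifts the reading frame downstream of the variant.
    if net % 3 != 0:
        return f"frameshift {kind}"
    return f"in-frame {kind}"


# Hypothetical alleles, for illustration only.
one_bp_insertion = classify_indel("A", "AT")
one_bp_deletion = classify_indel("AT", "A")
three_bp_insertion = classify_indel("A", "ACGT")
single_nucleotide_change = classify_indel("A", "G")
```

Whether a substitution is missense, synonymous or nonsense depends on the codon it falls in, which this sketch does not attempt to resolve.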
Children with likely pathogenic mutations in TBL1XR1 have
developmental delay often with autistic features (Table 1). All patients have marked expressive speech and language delay as the
most consistent feature, and most have special needs requiring
specialist educational assistance. In addition, most of the children identified via the DDD study have gastrointestinal disturbance or constipation. Although a number of patients have
dysmorphic features, a preliminary assessment of facial photographs does not suggest an identifiable facial gestalt and growth
parameters were typically within the normal range (Supplementary Material, Table S1). There are no apparent differences in either phenotype or severity between children with missense
mutations and those with truncating mutations or gene deletions, potentially suggesting a common loss-of-function
mechanism.
Although TBL1XR1 is a highly constrained gene [Exome Aggregation Consortium (ExAC), Cambridge, MA, USA; http://exac.broadinstitute.org/; accessed December 2015], we were able to
identify 64 unique germline missense variants in
TBL1XR1 in population controls, in which benign variants are expected to be relatively enriched and pathogenic variants relatively depleted for rare childhood-onset dominant disorders with
obvious phenotypes. These variants were identified using multiple databases: the ExAC (http://exac.broadinstitute.org/; accessed June 2015), dbSNP (http://www.ncbi.nlm.nih.gov/SNP/),
the Exome Variant Server [NHLBI GO Exome Sequencing Project
(ESP), Seattle, WA, USA; http://evs.gs.washington.edu/EVS/; accessed June 2015] and the European Variant Archive (http://www.ebi.ac.uk/eva/) (32).
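Collating unique variants from several databases amounts to keying each record on a shared identifier and taking the union. The sketch below shows one way this could be done; it is not the authors' procedure. The function `merge_variant_sources`, the (chromosome, position, ref, alt) key and every record in `toy_sources` are hypothetical placeholders, not real TBL1XR1 data.

```python
from collections import defaultdict


def merge_variant_sources(sources):
    """Return {variant_key: set of source names} across all sources.

    Assumption: each variant is already normalised to the same
    (chrom, pos, ref, alt) representation in every source.
    """
    merged = defaultdict(set)
    for source_name, variants in sources.items():
        for key in variants:
            merged[key].add(source_name)
    return dict(merged)


# Placeholder records only; the coordinates and alleles are invented.
toy_sources = {
    "ExAC": [("chrN", 1001, "C", "T"), ("chrN", 1050, "G", "A")],
    "dbSNP": [("chrN", 1001, "C", "T")],
    "ESP": [("chrN", 1050, "G", "A"), ("chrN", 1200, "A", "G")],
    "EVA": [("chrN", 1001, "C", "T")],
}

merged = merge_variant_sources(toy_sources)
unique_count = len(merged)
```

Keeping the set of contributing sources per variant, rather than a bare set of keys, makes it easy to see which variants are supported by more than one database.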
All five DDD missense mutations and one published likely
pathogenic mutation are located within the WD40 domain of
TBLR1, in addition to 33 of the population missense variants
(Table 2). Interestingly, we also identified 16 likely non-pathogenic missense variants in TBL1XR1 within the DDD cohort (where
the variant is in, or inherited from, an unaffected parent), all of
which either lie outside the WD40 domain or have already been
observed in the population.
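The counts quoted above can be arranged as a simple two-by-two tally: 6 of the 7 likely pathogenic missense mutations (five DDD plus two published) and 33 of the 64 population variants fall within the WD40 domain. The sketch below computes the two proportions and, purely as an illustration, a one-sided hypergeometric tail probability for that tally. This test is an assumption of the sketch and is not the analysis reported in the article, which compares position within the structure, burial and conservation, all finer criteria than domain membership.

```python
from math import comb

# Counts taken from the text above.
pathogenic_in, pathogenic_total = 6, 7      # likely pathogenic missense
population_in, population_total = 33, 64    # population missense variants


def hypergeom_upper_tail(k, n_draws, n_success, n_total):
    """P(X >= k) when n_draws items are drawn without replacement from
    n_total items, of which n_success carry the property of interest."""
    denom = comb(n_total, n_draws)
    upper = min(n_draws, n_success)
    numer = sum(
        comb(n_success, i) * comb(n_total - n_success, n_draws - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


frac_pathogenic = pathogenic_in / pathogenic_total
frac_population = population_in / population_total

# Illustrative only: probability of seeing at least 6 of the 7 likely
# pathogenic variants in the domain if domain membership were unrelated
# to pathogenicity.
p_one_sided = hypergeom_upper_tail(
    pathogenic_in,
    pathogenic_total,
    pathogenic_in + population_in,
    pathogenic_total + population_total,
)
```

With only seven likely pathogenic variants, any such tally should be read cautiously.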
The WD40 domain of TBLR1 has a β-propeller structure consisting of eight propeller ‘blades’, each formed by a four-stranded
antiparallel β-sheet, which are joined by β-hairpins. The blades
are arranged symmetrically about a central axis, like the staves
of a barrel, and β-catenin binds to the ‘top’ face of the propeller
to promote the transcription of Wnt target genes (33) (Fig. 1B).
A number of ‘hotspot residues’ have been identified previously
Understanding the impact of missense variants in known disease genes is a major challenge for the clinical application of
genomics (1,2). A handful of well-known disease genes [such
as CFTR (3) and TP53 (4)] have been extremely well studied over
several decades through both research and clinical genetic testing, and multiple known pathogenic missense variants have
been individually characterized in silico, in vitro and in vivo.
However, the rat (...truncated)