Integrating population variation and protein structural analysis to improve clinical interpretation of missense variation: application to the WD40 domain

Human Molecular Genetics, Feb 2016

We present a generic, multidisciplinary approach for improving our understanding of novel missense variants in recently discovered disease genes exhibiting genetic heterogeneity, by combining clinical and population genetics with protein structural analysis. Using six new de novo missense diagnoses in TBL1XR1 from the Deciphering Developmental Disorders study, together with population variation data, we show that the β-propeller structure of the ubiquitous WD40 domain provides a convincing way to discriminate between pathogenic and benign variation. Children with likely pathogenic mutations in this gene have severely delayed language development, often accompanied by intellectual disability, autism, dysmorphology and gastrointestinal problems. Amino acids affected by likely pathogenic missense mutations are either crucial for the stability of the fold, forming part of a highly conserved symmetrically repeating hydrogen-bonded tetrad, or located at the top face of the β-propeller, where ‘hotspot’ residues affect the binding of β-catenin to the TBLR1 protein. In contrast, those altered by population variation are significantly less likely to be spatially clustered towards the top face or to be at buried or highly conserved residues. This result is useful not only for interpreting benign and pathogenic missense variants in this gene, but also in other WD40 domains, many of which are associated with disease.

Human Molecular Genetics, 2016, Vol. 25, No. 5, 927–935. doi: 10.1093/hmg/ddv625. Advance Access Publication Date: 5 January 2016. Original Article.

Roman A. Laskowski1,†, Nidhi Tyagi1,†, Diana Johnson3, Shelagh Joss4, Esther Kinning4, Catherine McWilliam5, Miranda Splitt6, Janet M. Thornton1, Helen V. Firth7, the DDD Study2 and Caroline F. Wright2,*

1 European Bioinformatics Institute (EMBL-EBI) and 2 Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
3 Sheffield Regional Genetics Services, Sheffield Children’s Hospital, Western Bank, Sheffield S10 2TH, UK
4 West of Scotland Genetic Services, Level 1, Laboratory Medicine Building, South Glasgow University Hospital, 1345 Govan Road, Glasgow G51 4TF, UK
5 Human Genetics, Ninewells Hospital, Dundee DD1 9SY, UK
6 Northern Genetics Service, Newcastle upon Tyne Hospitals NHS Foundation Trust, Institute of Genetic Medicine, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
7 East Anglian Medical Genetics Service, Addenbrooke’s Treatment Centre, Addenbrooke’s Hospital, Cambridge University Hospitals, Cambridge CB2 0QQ, UK

* To whom correspondence should be addressed. Tel: +44 1223834244; Fax: +44 1223494919
† These authors contributed equally to this work.

Received: October 27, 2015. Revised and Accepted: December 22, 2015.

© The Author 2016. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Results

Six children within the DDD study were found to have likely pathogenic de novo mutations in TBL1XR1, including five single nucleotide variants predicted to cause a missense change, and one 1 bp frameshift insertion predicted to result in loss of function through truncation or nonsense-mediated decay (Table 1). Two additional likely de novo missense mutations have also been published in children affected by developmental disorders (25,28), as well as a de novo 1 bp frameshift deletion (25). A number of whole gene deletions have also been described (27,29).
Children with likely pathogenic mutations in TBL1XR1 have developmental delay, often with autistic features (Table 1). All patients have marked expressive speech and language delay as the most consistent feature, and most have special needs requiring specialist educational assistance. In addition, most of the children identified via the DDD study have gastrointestinal disturbance or constipation. Although a number of patients have dysmorphic features, a preliminary assessment of facial photographs does not suggest an identifiable facial gestalt, and growth parameters were typically within the normal range (Supplementary Material, Table S1). There are no apparent differences in either the phenotypes or severity of the children with missense mutations versus those with truncating mutations and gene deletions, potentially suggesting a common loss of function mechanism.

Although TBL1XR1 is a highly constrained gene [Exome Aggregation Consortium (ExAC), Cambridge, MA, USA; http://exac.broadinstitute.org/; accessed December 2015], we were able to identify 64 unique germline population missense variants in TBL1XR1 in population controls, in which benign variants are expected to be relatively enriched and pathogenic variants relatively depleted for rare childhood onset dominant disorders with obvious phenotypes. These variants were identified using multiple databases: the ExAC (http://exac.broadinstitute.org/; accessed June 2015), dbSNP (http://www.ncbi.nlm.nih.gov/SNP/), the Exome Variant Server [NHLBI GO Exome Sequencing Project (ESP), Seattle, WA, USA; http://evs.gs.washington.edu/EVS/; accessed June 2015] and the European Variant Archive (http://www.ebi.ac.uk/eva/) (32). All five DDD missense mutations and one published likely pathogenic mutation are located within the WD40 domain of TBLR1, in addition to 33 of the population missense variants (Table 2).
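As a rough illustration of the counts quoted above (this calculation is not taken from the article), the distribution of missense variants inside and outside the WD40 domain can be set out as a 2×2 table and examined with a one-sided Fisher's exact test. The sketch below assumes that the second published missense mutation lies outside the domain, so that 6 of 7 likely pathogenic and 33 of 64 population missense variants fall within it. The function name `fisher_one_sided` is hypothetical.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test on the table [[a, b], [c, d]].

    Returns the probability of seeing at least `a` items in the top-left
    cell when all row and column totals are held fixed.
    """
    row1 = a + b              # e.g. all likely pathogenic missense variants
    col1 = a + c              # e.g. all missense variants inside the domain
    total = a + b + c + d
    upper = min(row1, col1)   # largest possible value of the top-left cell
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, row1)


# Counts read from the text; the "outside" figures are derived by
# subtraction and are an assumption of this sketch.
pathogenic_inside, pathogenic_outside = 6, 1
population_inside, population_outside = 33, 31

p_value = fisher_one_sided(
    pathogenic_inside, pathogenic_outside,
    population_inside, population_outside,
)
print(f"one-sided p = {p_value:.3f}")
```

With so few likely pathogenic variants, domain membership alone carries limited statistical weight, which is in keeping with the article's reliance on finer structural features (top-face location, burial and conservation) to separate the two classes.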
Interestingly, we also identified 16 likely non-pathogenic missense variants in TBL1XR1 within the DDD cohort (where the variant is in, or inherited from, an unaffected parent), all of which either lie outside the WD40 domain or have already been observed in the population.

The WD40 domain of TBLR1 has a β-propeller structure consisting of eight propeller ‘blades’, each formed by a four-stranded antiparallel β-sheet, which are joined by β-hairpins. The blades are arranged symmetrically about a central axis, like the staves of a barrel, and β-catenin binds to the ‘top’ face of the propeller to promote the transcription of Wnt target genes (33) (Fig. 1B). A number of ‘hotspot residues’ have been identified previously

Introduction

Understanding the impact of missense variants in known disease genes is a major challenge for the clinical application of genomics (1,2). A handful of well-known disease genes [such as CFTR (3) and TP53 (4)] have been extremely well studied over several decades through both research and clinical genetic testing, and multiple known pathogenic missense variants have been individually characterized in silico, in vitro and in vivo. However, the rat (...truncated)


This is a preview of a remote PDF: https://hmg.oxfordjournals.org/content/25/5/927.full.pdf
Article home page: http://hmg.oxfordjournals.org/content/25/5/927.abstract

Roman A. Laskowski, Nidhi Tyagi, Diana Johnson, Shelagh Joss, Esther Kinning, Catherine McWilliam, Miranda Splitt, Janet M. Thornton, Helen V. Firth, the DDD Study, Caroline F. Wright. Integrating population variation and protein structural analysis to improve clinical interpretation of missense variation: application to the WD40 domain, Human Molecular Genetics, 2016, pp. 927-935, 25/5, DOI: 10.1093/hmg/ddv625