Kin-Driver: a database of driver mutations in protein kinases

Database: The Journal of Biological Databases and Curation, Jan 2014

Somatic mutations in protein kinases (PKs) are frequent driver events in many human tumors, while germline mutations are associated with hereditary diseases. Here we present Kin-Driver, the first database that compiles driver mutations in PKs with experimental evidence demonstrating their functional role. Kin-Driver is a manually curated expert database that pays special attention to activating mutations (AMs) and can serve as a validation set for developing new-generation tools that predict gain-of-function driver mutations. It also offers an easy and intuitive environment for visualizing and analysing mutations in PKs. Because all mutations are mapped onto a multiple sequence alignment, analogous positions between kinases can be identified, and tentative new mutations can be proposed for study by transferring annotation. Finally, our database can also be of use to clinical and translational laboratories, helping them to identify uncommon AMs that may correlate with response to new antitumor drugs. The website was developed using PHP and JavaScript, which are supported by all major browsers; the database was built using MySQL server. Kin-Driver is available at: http://kin-driver.leloir.org.ar/



Franco L. Simonetti (2), Cristian Tornador (1), Nuria Nabau-Moretó (0), Miguel A. Molina-Vila (3), Cristina Marino-Buslje (2)

0 Computational Genomics Laboratory, Genetics Department, Institut de Biologia Universitat de Barcelona (IBUB), Facultat de Biologia, Av Diagonal 645
1 Pompeu Fabra University (UPF), Dept. de Tecnologies de la Informació i les Comunicacions, Tanger 122-140, 08018, Barcelona, Spain
2 Fundación Instituto Leloir, Av. Patricias Argentinas 435, C1405BWE, Buenos Aires, Argentina
3 Breakthrough Cancer Research Unit, Dexeus University Hospital, Sabino Arana 5-19, Barcelona, Spain

© The Author(s) 2014. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction

Cancer arises due to somatic mutations that result in a growth advantage for the tumor cells. These mutations are known as drivers and can be divided into two groups: (i) loss-of-function mutations, which inactivate tumor suppressor genes (from here on, inactivating mutations) and (ii) activating or gain-of-function mutations, which transform proto-oncogenes into oncogenes. Somatic mutations in protein kinases (PKs) are frequent driver events in many human tumor types, and functionally relevant germline mutations are associated with hereditary disorders.

Clinical laboratories worldwide are analysing thousands of human tumor samples, looking for activating mutations (AMs) in certain PKs, such as EGFR, HER2 or BRAF, that correlate with good responses to new generations of antitumor drugs that are kinase inhibitors. Mutations that are either new or not functionally characterized are often found. In addition, whole-genome sequencing of human malignancies and other diseases is identifying thousands of changes in PKs, but most of them are likely to be passenger mutations or even polymorphisms.

Discriminating driver mutations in PKs is a significant challenge, hampered by the fact that there are no curated sets of true driver and passenger alterations. The extent of this challenge became evident when three state-of-the-art methods, namely MutationAssessor (1), TransFIC (2) and FATHMM (3), were fed with well-established, tumor-associated AMs of PKs and failed to predict them as high impact or disease related (4).
Therefore, it is uncertain whether the current tools, which are generally based on conservation calculations, can be trusted to screen whole-genome sequencing data in search of driver mutations in PKs. New methods need to be developed, and unambiguously assessed datasets of driver mutations are required to train and test them.

Mutation recruitment

The recruitment procedure is described by Molina-Vila et al. (4). Briefly, in the case of proto-oncogenic kinases, abstracts and titles of PubMed manuscripts were mined with the kinase name plus the words "activating", "gain of function" or "constitutive activation". For tumor suppressor kinases, the words "inactivating" and "loss of function" were used. Furthermore, all UniProt entries for human kinases were mined for the same keywords to identify new variants. The references were manually checked to confirm the status of each mutation. For each annotated mutation, all samples with that mutation were retrieved from COSMIC using the BioMart Perl API.

MSA construction

Human serine/threonine kinase (STK) and tyrosine kinase (TK) domains were obtained from Pfam families PF00069 and PF07714, respectively. To account for classification problems in Pfam families, some sequences incorrectly classified as TKs were moved from that alignment to the corresponding one and realigned with T-Coffee (5). For each MSA, a sequence logo was calculated using Seq2Logo (6).

Mutation relative frequency calculation

For every mutation in the 518 PKs of the COSMIC database release 70 (7), a relative frequency was calculated as the frequency of that mutation in COSMIC for the gene, multiplied by 1000 and divided by the total number of tumor samples sequenced for that gene. All mutations with a relative frequency above 2 (0.2%) were then searched in PubMed by mutation name (e.g. P267R) and added to the dataset if they were found to have functional effects.
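The relative frequency filter described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: it assumes that "frequency of mutation" means the number of COSMIC samples carrying the mutation, and the function names, the counts and every mutation name other than P267R are hypothetical.

```python
from typing import Dict, List

# Cut-off taken from the text: relative frequency above 2 per 1000 (0.2%).
THRESHOLD = 2.0


def relative_frequency(mutated_samples: int, total_samples: int) -> float:
    """Mutated samples per 1000 tumor samples sequenced for the gene."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    return mutated_samples * 1000 / total_samples


def candidates(counts: Dict[str, int], total_samples: int,
               threshold: float = THRESHOLD) -> List[str]:
    """Names of mutations whose relative frequency is strictly above threshold."""
    return sorted(
        name for name, n in counts.items()
        if relative_frequency(n, total_samples) > threshold
    )


# Made-up sample counts for one hypothetical gene (not real COSMIC data).
example_counts = {"P267R": 12, "A100T": 3, "G50D": 8}
example_total = 4000

selected = candidates(example_counts, example_total)
print(selected)
```

In this toy input only the mutation seen in 12 of 4000 samples (3 per 1000) passes the cut-off; a mutation at exactly 2 per 1000 is excluded because the text says "above 2". Selected names would then still need the manual PubMed check described above.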
EGFR mutations conferring a response rate to erlotinib higher than 50%, according to the EGFR somatic mutations database (http://www.somaticmutations-egfr.info/), were also added.

The Kin-Driver database offers a comprehensive set of 560 primary AMs in the kinase and juxtamembrane (JM) domains of 39 PKs and 83 inactivating mutations in 5 kinases, compiled by a two-step systematic search for each of the 518 PKs present in the complete kinase study of the COSMIC database (7) (release 70). Only primary mutations with experimental evidence demonstrating their activating or inactivating role were included. Kin-Driver is a MySQL relational database offering structural and sequence data cross-referenced with COSMIC and with our set of curated mutations. It also provides the frequencies of these mutations in actual tumor samples. The CosmicMart service is used to fetch the data, so frequencies for new mutations can easily be added and data are kept up to date with the periodic COSMIC releases.

Our database can be interrogated by protein name, gene name or keyword, amino acid position or specific mutation name (e.g. T790M). A range of positions or a specific mutation can also be used to look for driver mutations in other PKs at equivalent positions (see later). Finally, the database can be browsed by PK name, domain, tissue or type of histology, and these last two attributes obtained from the corresponding mutated samples are available (...truncated)
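The idea of looking up equivalent positions in other PKs through the multiple sequence alignment can be illustrated with a minimal Python sketch. It is not the database's implementation: the helper names, the toy aligned sequences and the domain start numbers are all invented for illustration, and a real lookup would use the Pfam-based alignments described above.

```python
from typing import Optional

GAP = "-"


def residue_to_column(aligned: str, position: int, start: int = 1) -> int:
    """0-based alignment column holding residue number `position`.

    `start` is the residue number of the first non-gap character.
    """
    current = start - 1
    for col, ch in enumerate(aligned):
        if ch != GAP:
            current += 1
            if current == position:
                return col
    raise ValueError(f"residue {position} is not covered by this aligned sequence")


def column_to_residue(aligned: str, column: int, start: int = 1) -> Optional[int]:
    """Residue number found at `column`, or None if the column is a gap."""
    if aligned[column] == GAP:
        return None
    residues_before = sum(1 for ch in aligned[:column] if ch != GAP)
    return start + residues_before


def equivalent_position(src: str, src_pos: int, dst: str,
                        src_start: int = 1, dst_start: int = 1) -> Optional[int]:
    """Map a residue of one aligned kinase to the aligned residue of another."""
    column = residue_to_column(src, src_pos, src_start)
    return column_to_residue(dst, column, dst_start)


# Toy aligned fragments (hypothetical, not real kinase sequences).
kinase_a = "MK-LVE"   # assumed to start at residue 10
kinase_b = "MKALV-"   # assumed to start at residue 100

print(equivalent_position(kinase_a, 12, kinase_b, 10, 100))
print(equivalent_position(kinase_a, 14, kinase_b, 10, 100))
```

Returning `None` for a gap keeps "no equivalent residue" distinct from a numbering error, which raises instead. An annotation transferred this way is only a hypothesis for further study, as the text notes.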


This is a preview of a remote PDF: https://database.oxfordjournals.org/content/2014/bau104.full.pdf
Article home page: http://database.oxfordjournals.org/content/2014/bau104.abstract

Franco L. Simonetti, Cristian Tornador, Nuria Nabau-Moretó, Miguel A. Molina-Vila, Cristina Marino-Buslje. Kin-Driver: a database of driver mutations in protein kinases, Database: The Journal of Biological Databases and Curation, 2014, 2014, DOI: 10.1093/database/bau104