Quantifying side-chain conformational variations in protein structure (pdf)

https://www.nature.com/articles/srep37024.pdf

www.nature.com/scientificreports

OPEN

Received: 29 March 2016. Accepted: 24 October 2016. Published: 15 November 2016.

Zhichao Miao (1,2,3) & Yang Cao (4)

Protein side-chain conformations are closely related to biological function. Side-chain prediction is a key step in protein design, protein docking and structure optimization. However, side-chain polymorphism is widespread in proteins, occurs in various types, and has long been overlooked in side-chain prediction. Such conformational variations have not been studied quantitatively, and the correlations between these variations and residue features remain vague. Here, we performed statistical analyses on large-scale data sets and found that side-chain conformational flexibility is closely related to solvent exposure, degrees of freedom and hydrophilicity. These analyses allowed us to quantify different types of side-chain variability in the PDB. The results underscore that protein side-chain conformation prediction is not a single-answer problem, leading us to reconsider how side-chain prediction programs are assessed.

Protein side-chain conformations have been shown to be closely related to protein mutations [1]. Protein interactions with other proteins, RNA/DNA or ligands are mainly mediated by side-chain contacts. During these functional steps, some critical side-chains may change their conformations to adapt to the shape and character of their interaction partners. The 'induced-fit' model [2] provides many examples that point to the importance of side-chain conformational change. Therefore, side-chain conformational changes, or side-chain polymorphism, could be closely related to protein function. Nevertheless, side-chain variations have not yet been quantitatively analyzed, and the understanding of side-chain conformational variation remains intuitive rather than systematic.
Side-chain conformation prediction, or side-chain packing, is a well-established problem in computational biology and is involved in diverse applications, such as protein folding [3], docking [4], design [5], engineering [6] and structure optimization [7]. Various new programs [8–12], which do not consider conformational change, have been proposed in recent years. Laleh et al. [13] tried to predict side-chain conformation with polymorphism, but covered only a simple type of polymorphism in small-scale data. Quantifying side-chain variations can directly contribute to our understanding of the structural nature of side-chain conformation and to improving side-chain packing programs.

A protein structure solved by X-ray crystallography or cryo-EM is normally considered a unique 3D conformation of the molecule under the defined conditions. Molecules deposited in the Protein Data Bank (PDB) [14] generally include only a unique set of coordinates to represent the 3D structure, which leads to a unique answer being treated as the 'gold standard' in structure prediction, such as protein structure prediction (CASP) [15] and side-chain packing [16]. Protein side-chain conformations are usually clustered into rotamers, which are rigid conformations represented by discrete side-chain dihedral angles. However, some clues have led us to consider the variation and polymorphism [17,18] of protein side-chains: (i) the temperature factor (or B factor) of an atom describes the attenuation of X-ray scattering caused by thermal motion, and thus can be taken to indicate the relative vibrational motion of the atom; (ii) the alternate location and atom occupancy columns in PDB files describe the probability of possible conformational states. Although a molecule keeps its global topology in a defined environment, the side-chain conformations are not constrained and could be flexible.
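The two clues mentioned above, the B factor and the alternate location and occupancy columns, can be read directly from the fixed-width ATOM records of a legacy-format PDB file. The following is a minimal sketch under that assumption; the function names `parse_atom_record` and `summarize_altlocs` are hypothetical helpers introduced for illustration and are not part of the authors' analysis pipeline.

```python
from collections import defaultdict


def parse_atom_record(line):
    """Read selected fields from one fixed-width ATOM record (legacy PDB format)."""
    return {
        "atom": line[12:16].strip(),      # columns 13-16: atom name
        "altloc": line[16].strip(),       # column 17: alternate location indicator
        "resname": line[17:20].strip(),   # columns 18-20: residue name
        "chain": line[21],                # column 22: chain identifier
        "resseq": int(line[22:26]),       # columns 23-26: residue sequence number
        "occupancy": float(line[54:60]),  # columns 55-60: occupancy
        "bfactor": float(line[60:66]),    # columns 61-66: temperature factor
    }


def summarize_altlocs(lines):
    """Return {(chain, resseq, resname): {altloc: (max occupancy, mean B factor)}}.

    Only atoms that carry an alternate location indicator are included.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        rec = parse_atom_record(line)
        if rec["altloc"]:
            key = (rec["chain"], rec["resseq"], rec["resname"])
            grouped[key][rec["altloc"]].append(rec)

    summary = {}
    for key, by_alt in grouped.items():
        summary[key] = {
            alt: (
                max(r["occupancy"] for r in recs),
                sum(r["bfactor"] for r in recs) / len(recs),
            )
            for alt, recs in by_alt.items()
        }
    return summary
```

Calling `summarize_altlocs(open(path))` on a PDB file (where `path` is whatever file the reader has at hand) lists each residue that was modeled with more than one conformation, together with the occupancy and mean B factor of each alternate conformation.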
1. Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de biologie moléculaire et cellulaire du CNRS, 67000 Strasbourg, France.
2. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
3. Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
4. Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences and State Key Laboratory of Biotherapy, Sichuan University, Chengdu, 610014, China.
Correspondence and requests for materials should be addressed to Z.M. (email: ) or Y.C. (email: )

Scientific Reports | 6:37024 | DOI: 10.1038/srep37024

With more and more crystal structures being solved, we can now find structural differences of the same protein either in the same crystal or in different crystals. Hence, side-chain conformational variations can be detected by comparing these structures of the same protein. The first step towards understanding side-chain conformation is to know its inherent conformational variability in the unbound state. In this work, we focus on understanding the conformational variation of protein side-chains that do not bind nucleic acid chains. First, we describe four different classes of side-chain conformations observed in crystal structures. Then, we carefully curated and analyzed several datasets of protein structures to quantify: (1) the reliability of the atom coordinates according to electron density; (2) the side-chain conformational variations defined by alternate locations; (3) side-chain conformational variations in either the same or different crystal structures; and (4) the influence of backbone structure deviations and sequence mutations on side-chain variations.
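Comparing the side-chain of the same residue across two structures usually comes down to comparing its dihedral (chi) angles, for example chi1, which is defined by the N, CA, CB and gamma atoms. The sketch below computes a signed dihedral angle from four atom positions and compares two sets of chi angles with wrap-around at ±180°. The 40° tolerance is a threshold commonly used in side-chain prediction benchmarks and is an assumption here, not the criterion stated in this paper; all function names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the two outer bonds onto the plane perpendicular to the axis.
    k0 = _dot(b0, axis)
    k2 = _dot(b2, axis)
    v = _sub(b0, (axis[0] * k0, axis[1] * k0, axis[2] * k0))
    w = _sub(b2, (axis[0] * k2, axis[1] * k2, axis[2] * k2))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


def chi_difference(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def same_conformation(chis_a, chis_b, tol=40.0):
    """True if every paired chi angle agrees within tol degrees (assumed cutoff)."""
    return all(chi_difference(a, b) <= tol for a, b in zip(chis_a, chis_b))
```

The wrap-around in `chi_difference` matters because chi angles of 170° and -170° are only 20° apart, even though their plain numerical difference is 340°.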
This work is the first quantitative analysis of side-chain conformational variations based on experimentally reliable data, providing useful knowledge of side-chain flexibility for side-chain prediction and its assessment. Until now, protein side-chain packing methods have been widely benchmarked, considering the crystal environment [19] or residue environments [20]. From the conclusions of this analysis, we realized that protein side-chain conformation prediction is not a single-answer problem and that there is an urgent need to reconsider the problem of side-chain packing and the assessment of side-chain prediction. Therefore, we propose novel approaches and large-scale datasets for benchmarking side-chain packing programs that consider different types of side-chain conformational variations. In perspective, this work can help us (1) to understand the flexibility of protein side-chains; (2) to optimize side-chain prediction programs and help optimize cryo-EM structures; and (3) to relate side-chain conformational changes to protein functions in future research.

Results

Side-chain conformation models. The coordinates of crystal structures are determined according to the electron density maps. However, some electron density maps tend to cover a larger region than a unique position. Side-chain conformations in proteins may adopt more than on (...truncated)