Evol and ProDy for bridging protein sequence evolution and structural dynamics

Bioinformatics, Sep 2014

Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. Availability and implementation: ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. Contact: bahar{at}pitt.edu



Ahmet Bakan†, Anindita Dutta†, Wenzhi Mao, Ying Liu, Chakra Chennubhotla, Timothy R. Lezon and Ivet Bahar*

Department of Computational and Systems Biology, and Clinical & Translational Science Institute, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA

Associate Editor: Anna Tramontano

© The Author 2014. Published by Oxford University Press. All rights reserved.

1 INTRODUCTION

The significance of protein dynamics in a wide range of biological functions, including cell signaling, regulation and machinery, is widely established (Bahar et al., 2010; Bhabha et al., 2011; Marsh et al., 2012). In many cases, sequence variability goes hand in hand with structural dynamics (Glembo et al., 2012; Liu and Bahar, 2012; Marks et al., 2011; Micheletti, 2012; Worth et al., 2009; Zheng et al., 2005). Structural dynamics correlates with evolvability (Tokuriki and Tawfik, 2009) or sequence and conformational diversity (Friedland et al., 2009) and enables adaptation to substrate binding while maintaining specificity (Liu et al., 2010).
To our knowledge, existing software usually relates evolutionary properties to static structures (Ashkenazy et al., 2010; Morgan et al., 2006; Wainreb et al., 2011), or is exclusively dedicated to either sequence analysis (Waterhouse et al., 2009) or structural dynamics (Eyal et al., 2006; Suhre and Sanejouand, 2004). There is a need for methods that allow combined analysis of sequence (co)evolution and structural dynamics. These would be particularly useful if they could be performed and visualized in a versatile, integrated computing environment.

*To whom correspondence should be addressed.
†The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.

Toward addressing this need, we introduce v1.5 of ProDy (Bakan et al., 2011) with Evol applications. Highlights of the new version are rich methods for coevolutionary analysis, and extensions for analyzing and interpreting structural dynamics. These follow the approach adopted in our recent comparative study of sequence conservation and coevolution patterns versus structure/dynamics properties for a representative set of protein families (Liu and Bahar, 2012), which has been validated in detailed case studies (e.g. General et al., 2014; Liu et al., 2010). A distinctive feature of ProDy is its capability to extract mechanistic information from principal component analysis (PCA) of ensembles of structures (e.g. drug targets) (Bakan and Bahar, 2009). The new release has several new modules and command line applications named ‘evol’ to evaluate sequence conservation and coevolution using information-theoretic and statistical approaches. To our knowledge, this is the only package that enables comparative analysis of protein dynamics with sequence evolution data extracted from multiple sequence alignments (MSAs) for protein families.
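To make the information-theoretic measures concrete, the following is a minimal, standard-library sketch of two quantities commonly used for this purpose: Shannon entropy of an alignment column (low entropy suggests conservation) and mutual information between two columns (high values suggest coevolution). It is an illustration under stated assumptions, not Evol's implementation: the toy alignment, the function names, the use of natural logarithms and the treatment of every character (including gaps) as an ordinary symbol are all choices made here for illustration.

```python
import math
from collections import Counter

# Hypothetical toy alignment: rows are sequences, columns are positions.
msa = [
    "ACDA",
    "ACEA",
    "AGDC",
    "AGEC",
]


def column(alignment, i):
    """Return the residues found at position i across all sequences."""
    return [seq[i] for seq in alignment]


def shannon_entropy(symbols):
    """Shannon entropy (in nats) of the symbol frequencies in one column."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())


def mutual_information(col_i, col_j):
    """Mutual information (in nats) between two columns of equal length."""
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        mi += p_ab * math.log(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


entropies = [shannon_entropy(column(msa, i)) for i in range(len(msa[0]))]
mi_coupled = mutual_information(column(msa, 1), column(msa, 3))
mi_independent = mutual_information(column(msa, 1), column(msa, 2))
```

In this toy alignment the first column is invariant, so its entropy is zero, while columns 2 and 4 always change together and therefore share the maximum mutual information possible for two equally frequent residues (ln 2). Real alignments usually need further care, such as gap handling and corrections for finite sample size, which this sketch omits.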
2 DESCRIPTION AND FUNCTIONALITY

2.1 Input for ProDy and Evol

The input for ProDy is a set of protein coordinates in PDB format, or simply the PDB ID or protein sequence. The speed of the PDB parser and AtomGroup classes has been increased in the current version, such that parsing coordinates is 4.5–40 times faster than with the Biopython PDB module (Hamelryck and Manderick, 2003), and atomic data storage occupies 10 times less memory. We implemented efficient and flexible features for handling MSAs. Notably, the new MSA parser can evaluate various formats at a rate of 700 MB/s (on a 3.6 GHz Intel Xeon CPU, 16 GB RAM and Samsung SSD) and is up to 80 times faster than the alignment parser of Biopython (Cock et al., 2009). Flexible classes store MSA data parsimoniously in memory and provide ways of subsampling. Sequences can be filtered based on their labels to retain those in certain categories (e.g. human) and sliced to retain specific regions or sequences (e.g. regions matching structurally resolved amino acids). Such refinements, performed in a fraction of a second, allow for real-time processing of large MSAs and systematic analyses (...truncated)
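The label-based filtering and column slicing described above can be illustrated with a short, standard-library sketch. It does not use or reproduce ProDy's MSA classes: the FASTA text, the sequence labels and all function names below are hypothetical, and the columns where a chosen reference sequence has no gap are used here only as a stand-in for "regions matching structurally resolved amino acids".

```python
import io

# Hypothetical three-sequence alignment in FASTA format.
FASTA = """>KIN1_HUMAN/1-8
MK-LVAGT
>KIN1_MOUSE/1-8
MKALVSGT
>KIN2_HUMAN/1-8
MR-LIAGS
"""


def parse_fasta(handle):
    """Read aligned sequences from a FASTA handle into labels and strings."""
    labels, parts = [], []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            labels.append(line[1:])
            parts.append([])
        else:
            parts[-1].append(line)
    return labels, ["".join(p) for p in parts]


def filter_by_label(labels, seqs, keyword):
    """Keep only the sequences whose label contains the given keyword."""
    kept = [(lab, seq) for lab, seq in zip(labels, seqs) if keyword in lab]
    return [lab for lab, _ in kept], [seq for _, seq in kept]


def ungapped_columns(reference):
    """Indices of the columns where the reference sequence has a residue."""
    return [i for i, ch in enumerate(reference) if ch != "-"]


def slice_columns(seqs, keep):
    """Retain only the selected columns in every sequence."""
    return ["".join(seq[i] for i in keep) for seq in seqs]


labels, seqs = parse_fasta(io.StringIO(FASTA))
human_labels, human_seqs = filter_by_label(labels, seqs, "HUMAN")
keep = ungapped_columns(human_seqs[0])
trimmed = slice_columns(human_seqs, keep)
```

Filtering first and slicing second keeps the column indices tied to a reference sequence that is guaranteed to remain in the refined alignment.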


This is a preview of a remote PDF: https://bioinformatics.oxfordjournals.org/content/30/18/2681.full.pdf

Ahmet Bakan, Anindita Dutta, Wenzhi Mao, Ying Liu, Chakra Chennubhotla, Timothy R. Lezon, Ivet Bahar. Evol and ProDy for bridging protein sequence evolution and structural dynamics, Bioinformatics, 2014, pp. 2681-2683, 30/18, DOI: 10.1093/bioinformatics/btu336