DNA conformations and their sequence preferences

Nucleic Acids Research, Jun 2008

The geometry of the phosphodiester backbone was analyzed for 7739 dinucleotides from 447 selected crystal structures of naked and complexed DNA. Ten torsion angles of a near-dinucleotide unit have been studied by combining Fourier averaging and clustering. Besides the known variants of the A-, B- and Z-DNA forms, we have also identified combined A + B backbone-deformed conformers, e.g. with α/γ switches, and a few conformers with a syn orientation of bases occurring e.g. in G-quadruplex structures. A plethora of A- and B-like conformers show a close relationship between the A- and B-form double helices. A comparison of the populations of the conformers occurring in naked and complexed DNA has revealed a significant broadening of the DNA conformational space in the complexes, but the conformers still remain within the limits defined by the A- and B-forms. Possible sequence preferences, important for sequence-dependent recognition, have been assessed for the main A and B conformers by means of statistical goodness-of-fit tests. The structural properties of the backbone in quadruplexes, junctions and histone-core particles are discussed in further detail.


Daniel Svozil (1), Jan Kalina (0), Marek Omelka (0), Bohdan Schneider (1)

(0) Jaroslav Hájek Center for Theoretical and Applied Statistics, Department of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, Charles University, Sokolovská 83, CZ-186 75 Prague, Czech Republic
(1) Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Biomolecules and Complex Molecular Systems, Flemingovo nám. 2, CZ-166 10 Prague

INTRODUCTION

The apparent simplicity of double-helical DNA, the icon of molecular biology, is deceiving. While the architecture of its antiparallel strands remains the same, subtle conformational variations suffice to guarantee its recognition by other molecules.
The structural variations are critical especially for reliable recognition between DNA and proteins, which is the conditio sine qua non in the essential processes of replication, transcription and DNA chromatin compaction. Local conformational changes induced by interactions with other molecules can either leave the DNA structure unaltered (i.e. in the form of a straight double helix) or introduce bends and kinks within the double helix, as in sequence-dependent CAP/DNA complexes (1) or in DNA coiled around histone-core proteins (2).

The necessity of understanding DNA variability has become more urgent as the sequence-specific protein/DNA recognition required e.g. by transcription factors seems less likely to follow simple and generally applicable rules analogous to the rules governing DNA self-recognition by the complementarity of the Watson-Crick (WC) paired bases (3). The idea of the general code of recognition between amino acids and nucleotides (4) has not been confirmed despite extensive efforts. The lack of simple rules for general protein/DNA recognition has been explained by the existence of too many structural degrees of freedom at the protein/DNA interface (5), and so far only limited rules of recognition have been formulated within narrower groups of transcription factors with certain binding motifs, such as zinc fingers or helix-turn-helix (6-9).

Ultimately, the variability and plasticity of the local DNA structure, and thus its ability to recognize other molecules and be recognized by them, can be attributed to the properties of the bases and to their sequence-dependent arrangement. Base-pair and base-step morphology (10,11) has been widely analyzed to describe sequence-dependent deformability as observed in the crystal structures of DNA complexes with sequence-specific proteins (12,13) as well as in noncomplexed DNA (14).
By combining descriptors of base morphology with constraints imposed by a simple model of the phosphodiester backbone, slide and shift have been suggested to describe the key sequence properties of dinucleotide steps (15). However, the backbone does not act as a passive link merely holding the bases at their positions; its inherent flexibility contributes to, and limits, the base placement, so that the local DNA structure results from the interplay between optimal base positions and preferred conformations of the sugar-phosphate backbone. An analysis of the conformational space populated by the DNA backbone and the correlation between its conformation and the DNA sequence are therefore important for fully understanding DNA recognition.

The structural alphabet of the DNA double-helical A-, B- and Z-forms has been described in detail earlier (16,17). Nevertheless, DNA is known to adopt other forms as well, such as triple (18) and quadruple helices (19), junction (cruciform) structures (20) and parallel helices (21). However unusual some of these DNA forms may be, their architecture is, in full analogy to the double-helical DNA, almost completely based on the self-assembly of two or more DNA strands and does not form complicated folds analogous to RNA. The availability of some of these unusual DNA structures in well-refined crystal structures, as well as the growing number and quality of more conventional DNA crystal structures, present a challenge to undertake an analysis of the DNA conformational space in much greater detail than was possible a few years ago (22).

This work presents a comprehensive analysis of the conformational space of the DNA backbone using a near-dinucleotide building block as a model. Dinucleotide conformations have been clustered as the local structural property without any consideration of the classification of the overall DNA architecture as, for instance, B- or A-type double helix.
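The backbone description above rests on torsion (dihedral) angles, each defined by four consecutive atoms; for example, the backbone torsion α is conventionally the angle O3'-P-O5'-C5'. The following is a minimal sketch, not the authors' code, of how one torsion can be computed from atomic coordinates and how a set of periodic angles can be averaged. The function names `torsion_deg` and `circular_mean_deg` and the coordinates are hypothetical, and the circular mean is a simple stand-in that shows why periodic angles need special handling; it is not the Fourier averaging procedure used in the paper.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_deg(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, mapped to [0, 360)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    # atan2 form keeps the sign of the angle and is stable near 0 and 180
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x)) % 360.0

def circular_mean_deg(angles):
    """Mean of periodic angles via the average of their unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0

# Hypothetical coordinates (not taken from any crystal structure)
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(torsion_deg(*atoms))
print(circular_mean_deg([170.0, 190.0]))
```

A plain arithmetic mean of 350° and 10° gives 180°, which points in the opposite direction from both values; averaging unit vectors avoids this wrap-around problem, which is the reason any averaging or clustering of torsions has to respect their periodicity.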
The study has been performed on almost 8000 dinucleotide units from 447 crystal structures of DNA, alone or in complexes with other molecules, and has made use of a slightly modified procedure developed earlier for an analysis of RNA conformations (23). To assess the nature of the broadening of the DNA conformational space upon interacting with other molecules (mainly with proteins), the classified conformers of naked DNA have been compared to those of complexed DNA molecules. In addition, the structural properties of the backbone have been discussed in selected unusual structures like quadruplexes and histone-core particles. Because the possible sequence preferences of various conformers are important for the sequence-dependent recognition, they have been assessed by means of rigorous nonparametric statistical testing within the group of naked B- and A-form double helices.

The selection of structures used for the analysis was limited to nucleic acid (NA) crystal structures containing only DNA (thus excluding hybrids with RNA) present in the Nucleic Acid Database (24) on 19 July 2005. Four hundred and fifteen structures with crystallographic resolution better than or equal to 1.9 Å were selected; this resolution had previously been identified as limiting t (...truncated)
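The goodness-of-fit idea behind the sequence-preference assessment mentioned in this section can be illustrated with a Pearson chi-square statistic: the observed counts of one conformer across sequence classes are compared with the counts expected if the conformer had no sequence preference. This is a sketch under stated assumptions only. The counts, the grouping of steps into purine (R) and pyrimidine (Y) classes, the uniform background, and the function name `chi_square_gof` are all hypothetical, and the source does not say which specific test the authors applied.

```python
def chi_square_gof(observed, expected_fractions):
    """Pearson chi-square statistic for observed counts vs. expected fractions."""
    total = sum(observed.values())
    stat = 0.0
    for key, obs in observed.items():
        exp = total * expected_fractions[key]  # count expected under no preference
        stat += (obs - exp) ** 2 / exp
    return stat

# Hypothetical counts of one conformer in four dinucleotide step classes
observed = {"RR": 30, "RY": 20, "YR": 40, "YY": 10}
# Hypothetical background: every class equally frequent in the data set
background = {"RR": 0.25, "RY": 0.25, "YR": 0.25, "YY": 0.25}

stat = chi_square_gof(observed, background)

# Critical value of the chi-square distribution, 3 degrees of freedom, alpha = 0.05
CRITICAL_DF3_ALPHA_005 = 7.815
prefers_some_sequences = stat > CRITICAL_DF3_ALPHA_005
print(stat, prefers_some_sequences)
```

In a real analysis the background fractions would come from the sequence composition of the structures analyzed rather than from a uniform assumption, and small expected counts would call for an exact or resampling-based test instead.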


Article home page: http://nar.oxfordjournals.org/content/36/11/3690.abstract

Daniel Svozil, Jan Kalina, Marek Omelka, Bohdan Schneider. DNA conformations and their sequence preferences, Nucleic Acids Research, 2008, pp. 3690-3706, 36/11, DOI: 10.1093/nar/gkn260