DNA conformations and their sequence preferences
3690–3706 Nucleic Acids Research, 2008, Vol. 36, No. 11
doi:10.1093/nar/gkn260
Published online 13 May 2008
Daniel Svozil1, Jan Kalina2, Marek Omelka2 and Bohdan Schneider1,*
1Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for
Biomolecules and Complex Molecular Systems, Flemingovo nám. 2, CZ-166 10 Prague and 2Jaroslav Hájek Center
for Theoretical and Applied Statistics, Department of Probability and Mathematical Statistics, Faculty of Mathematics
and Physics, Charles University, Sokolovská 83, CZ-186 75 Prague, Czech Republic
Received March 5, 2008; Revised April 17, 2008; Accepted April 18, 2008
ABSTRACT
The geometry of the phosphodiester backbone was
analyzed for 7739 dinucleotides from 447 selected
crystal structures of naked and complexed DNA.
Ten torsion angles of a near-dinucleotide unit have
been studied by combining Fourier averaging and
clustering. Besides the known variants of the A-, B- and Z-DNA forms, we have also identified combined
A + B backbone-deformed conformers, e.g. with α/γ
switches, and a few conformers with a syn orientation of bases occurring e.g. in G-quadruplex structures. A plethora of A- and B-like conformers show a
close relationship between the A- and B-form
double helices. A comparison of the populations of
the conformers occurring in naked and complexed
DNA has revealed a significant broadening of the
DNA conformational space in the complexes, but
the conformers still remain within the limits defined
by the A- and B-forms. Possible sequence preferences, important for sequence-dependent recognition, have been assessed for the main A and B
conformers by means of statistical goodness-of-fit
tests. The structural properties of the backbone in
quadruplexes, junctions and histone-core particles
are discussed in further detail.
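As an illustration of how a goodness-of-fit test can flag a sequence preference, the following minimal sketch computes a Pearson chi-square statistic. The abstract does not say which goodness-of-fit test was used, so the choice of Pearson's test is an assumption; the step classes (R = purine, Y = pyrimidine), the counts and the uniform expected fractions are hypothetical values chosen only for the example and are not data from this study.

```python
def chi_square_statistic(observed, expected_fractions):
    """Return (Pearson chi-square statistic, degrees of freedom).

    observed           -- dict mapping category -> observed count
    expected_fractions -- dict mapping category -> expected fraction (sums to 1)
    """
    total = sum(observed.values())
    statistic = 0.0
    for category, count in observed.items():
        expected = total * expected_fractions[category]
        statistic += (count - expected) ** 2 / expected
    return statistic, len(observed) - 1


# Hypothetical counts of dinucleotide step classes found in one conformer.
observed = {"RR": 30, "RY": 20, "YR": 10, "YY": 40}
# Hypothetical null model: all four classes equally frequent.
expected = {"RR": 0.25, "RY": 0.25, "YR": 0.25, "YY": 0.25}

statistic, dof = chi_square_statistic(observed, expected)
# With 3 degrees of freedom the 5% critical value is 7.815, so a statistic
# above that would reject the null model of no sequence preference.
print(statistic, dof)
```

In practice the expected fractions would come from the sequence composition of the whole data set rather than from a uniform model.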
INTRODUCTION
The apparent simplicity of double-helical DNA, the icon
of molecular biology, is deceiving. While the architecture
of its antiparallel strands remains the same, subtle
conformational variations suffice to guarantee its recognition by other molecules. The structural variations are
critical especially for reliable recognition between DNA
and proteins, which is the conditio sine qua non in the
essential processes of replication, transcription and DNA
chromatin compaction. Local conformational changes
induced by interactions with other molecules can either
leave the DNA structure unaltered (i.e. in the form of a
straight double helix) or introduce bends and kinks within
the double helix, as in sequence-dependent CAP/DNA
complexes (1) or in DNA coiled around histone-core
proteins (2).
The necessity of understanding DNA variability has
become more urgent as the sequence-specific protein/
DNA recognition required e.g. by transcription factors
seems less likely to follow simple and generally applicable rules analogous to the rules governing DNA self-recognition by the complementarity of the Watson–Crick
(W–C) paired bases (3). The idea of a general ‘code of
recognition’ between amino acids and nucleotides (4) has
not been confirmed despite extensive efforts. The lack of
simple rules for general protein/DNA recognition has
been explained by the existence of too many structural
degrees of freedom at the protein/DNA interface (5), and
so far only limited rules of recognition have been formulated within narrower groups of transcription factors
with certain binding motifs, such as zinc fingers or helix–
turn–helix (6–9).
Ultimately, the variability and plasticity of the local
DNA structure, and thus its ability to recognize other
molecules and be recognized by them, can be attributed to
the properties of the bases and to their sequence-dependent
arrangement. Base-pair and base-step morphology (10,11)
has been widely analyzed to describe sequence-dependent
deformability as observed in the crystal structures of DNA
complexes with sequence-specific proteins (12,13) as well as
in noncomplexed DNA (14). By combining descriptors of
base morphology with constraints imposed by a simple
model of the phosphodiester backbone, slide and shift have
been suggested to describe the key sequence properties of
dinucleotide steps (15). However, the backbone is not
a passive link that merely holds the bases in their
positions; its inherent flexibility contributes to, and
limits, the base placement, so that the local DNA structure
results from the interplay between optimal base positions
and preferred conformations of the sugar-phosphate
*To whom correspondence should be addressed. Tel: +420 728 303 566; Fax: +420 296 443 610; Email: ,
© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
backbone. An analysis of the conformational space
populated by the DNA backbone and the correlation
between its conformation and the DNA sequence are
therefore important for fully understanding DNA
recognition.
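The backbone conformations discussed above are described by torsion angles, each defined by four consecutively bonded atoms. The sketch below shows one common way to compute a signed torsion angle from Cartesian coordinates. It is an illustration only, not the authors' procedure; the function names and the test coordinates are hypothetical and are not taken from any crystal structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_deg(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, in the range -180..180.

    0 corresponds to a cis (eclipsed) arrangement, 180 to trans.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def to_0_360(angle_deg):
    """Map an angle to the 0..360 range often used for backbone torsions."""
    return angle_deg % 360.0


# Hypothetical coordinates: four points forming a +90 degree torsion.
print(torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Because torsion angles are periodic, values near 0° and 360° describe nearly the same geometry, which any averaging or clustering of such angles has to take into account.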
The structural alphabet of the DNA double-helical A-,
B- and Z-forms has been described in detail earlier (16,17).
Nevertheless, DNA is also known to adopt other forms,
such as triple (18) and quadruple helices (19), junction
(cruciform) structures (20) and parallel helices (21).
However unusual some of these DNA forms may be,
their architecture is, in full analogy to the double helical
DNA, almost completely based on the self-assembly of
two or more DNA strands and does not form complicated
folds analogous to RNA. The availability of some of these
unusual DNA structures in well-refined crystal structures
as well as the growing number and quality of more
conventional DNA crystal structures present a challenge
to undertake an analysis of the DNA conformational
space in much greater detail than was possible a few
years ago (22).
This work presents a comprehensive analysis of the
conformational space of the DNA backbone using a near-dinucleotide building block as a model. Dinucleotide
conformations have been clustered as the local structural
property without any consideration of the classification of
the overall DNA architecture as, for instance, B- or A-type double helix. The study has been performed on
almost 8000 dinucleotide units from 447 crystal structures
of DNA, alone or in complexes with other molecules, and
has made use of a slightly modified procedure developed
earlier for an analysis of RNA conformations (23). To
assess the nature of the broadening of the DNA
conformational space upon interacting with other molecules (mainly with proteins), the classified conformers of
naked DNA have been compared to those of complexed
DNA molecules. In addition, the structural properties of
the backbone have been dis (...truncated)