Transcriptional activation during cell reprogramming correlates with the formation of 3D open chromatin hubs
ARTICLE
https://doi.org/10.1038/s41467-020-16396-1
OPEN
Marco Di Stefano 1,2 ✉, Ralph Stadhouders 2,5, Irene Farabella 1,2, David Castillo 1,2, François Serra 1,2,6,
Thomas Graf 2 ✉ & Marc A. Marti-Renom 1,2,3,4 ✉
Chromosome structure is a crucial regulatory factor for a wide range of nuclear processes.
Chromosome conformation capture (3C)-based experiments combined with computational
modelling are pivotal for unveiling 3D chromosome structure. Here, we introduce TADdyn, a
tool that integrates time-course 3C data, restraint-based modelling, and molecular dynamics
to simulate the structural rearrangements of genomic loci in a completely data-driven way.
We apply TADdyn to in situ Hi-C time-course experiments studying the reprogramming of
murine B cells to pluripotent cells, and characterize the structural rearrangements that take
place upon changes in the transcriptional state of 21 genomic loci of diverse expression
dynamics. By measuring various structural and dynamical properties, we find that during gene
activation, the transcription starting site contacts open and active regions in 3D chromatin domains. We propose that these 3D hubs of open and active chromatin may constitute
a general feature to trigger and maintain gene transcription.
1 CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain.
2 Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Dr. Aiguader 88, 08003 Barcelona, Spain. 3 Universitat
Pompeu Fabra (UPF), 08002 Barcelona, Spain. 4 ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Spain. 5Present address: Department of Pulmonary
Medicine and Department of Cell Biology, Erasmus MC, Rotterdam, the Netherlands. 6Present address: Computational Biology Group—Barcelona
Supercomputing Center (BSC), 08034 Barcelona, Spain.
NATURE COMMUNICATIONS | (2020)11:2564 | https://doi.org/10.1038/s41467-020-16396-1 | www.nature.com/naturecommunications
The three-dimensional (3D) structure of the genome has
been shown to modulate transcriptional regulation1–3 and
to play a role in cancer and developmental abnormalities4.
In the effort to characterize 3D genome structures, chromosome conformation capture (3C)-based experiments5 capture a single snapshot of the genome conformation at a given
time. A plethora of theoretical approaches have been developed to
take advantage of 3C-based experimental data and model genome
spatial organization. Restraint-based modelling approaches6 take
3C-based contact frequencies as input and employ ad hoc conversions to spatial distances for determining 3D genome structure7–12. This approach has provided valuable insights into the
structural organization of chromosomal regions in various
organisms13. Complementarily, thermodynamics-based approaches14–22 use physics-based principles to test specific interactions
or interaction mechanisms to explain the molecular origins of the
contact patterns obtained in 3C-based experiments. Together,
these theoretical strategies provide insights into chromatin
conformation16,17,23,24 and the possible mechanisms that form
chromosome territories18, compartments19 and topologically
associating domains (TADs)20,22,25,26.
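As a rough illustration of the restraint-based idea described above, the following Python sketch maps contact frequencies to target distances with an inverse power law. The functional form, the exponent `alpha`, the scale `d_ref`, and the function names are assumptions chosen for illustration only; they are not the conversion used by any of the tools cited here.

```python
def contact_to_distance(freq, alpha=1.0, d_ref=1.0, min_freq=1e-6):
    """Hypothetical conversion: distance = d_ref * freq ** (-1 / alpha).

    Higher contact frequency gives a shorter target distance.
    min_freq guards against division by zero for empty bins.
    """
    f = max(freq, min_freq)
    return d_ref * f ** (-1.0 / alpha)


def distance_restraints(contact_matrix, alpha=1.0, d_ref=1.0):
    """Build a {(i, j): target_distance} map from a symmetric matrix.

    Only the upper triangle is read, and bins with zero contacts
    are left unrestrained (an assumption of this sketch).
    """
    n = len(contact_matrix)
    restraints = {}
    for i in range(n):
        for j in range(i + 1, n):
            f = contact_matrix[i][j]
            if f > 0:
                restraints[(i, j)] = contact_to_distance(f, alpha, d_ref)
    return restraints


# Toy normalized contact matrix for three bins (made-up values).
toy = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.5],
    [0.25, 0.5, 1.0],
]
restraints = distance_restraints(toy)
```

In this toy case, bins that contact each other less often receive a larger target distance, which is the qualitative behavior any such conversion is expected to have.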
Decreased sequencing costs, together with more refined
experimental protocols, have made it possible to perform 3C-based time-resolved experiments that monitor genome conformation dynamics
of biological processes at high resolution. For example, high-throughput chromosome conformation capture (Hi–C) experiments have been applied to study the dynamics of nuclear
organization during mitosis27,28 or meiosis29–31, during hormone
treatment32 and during induced neural or adipose cell
differentiation33,34 or cell reprogramming35. However, none of
the computational strategies developed so far can take full
advantage of these time-series datasets. Hence, approaches specifically designed for the simulation of time-dependent conformational changes (4D) are urgently needed.
To fill this gap, we introduce TADdyn, a computational
method for modelling 3D structural transitions of chromatin
using time-resolved Hi–C datasets. In TADdyn, we combine a
physics-based model of chromatin fiber18,36 with dynamic
restraint-based modelling. For any genomic locus, this integrated
strategy allows for simulating a plausible 4D trajectory that is
data-driven and at the same time satisfies basic physical properties of the chromatin fiber.
The potential of TADdyn to provide insights beyond the Hi–C
datasets is highlighted by the simulation of 21 loci of the mouse
genome during cell reprogramming of pre-B lymphocytes into
pluripotent stem cells (PSCs)35. By measuring structural and
dynamical properties from the simulations, we characterize the
interplay between 3D structure and gene transcription to an
extent unreachable from the experimental datasets alone. Interestingly, we find that the transcription starting sites (TSS) of simulated loci embed into a cage-like structure that favors contacts
with open and active regions located (even) several kilo-bases
(kb) away from the gene promoter. Hence, TADdyn simulations
are compatible with the formation of 3D hubs37 as a general
mechanism to modulate gene transcription.
Results
The TADdyn modelling strategy. TADdyn is based on the following methodological steps (“Methods” and Fig. 1): (i) collection
of experimental data, (ii) representation of selected chromatin
regions using a bead-spring polymer model, (iii) conversion of
experimental data into time-dependent restraints, (iv) application
of steered molecular dynamics to simulate the adaptation of
chromatin models to satisfy the imposed restraints, and (v)
analysis of the conformation dynamics. As discussed below, each
of these steps constitutes per se an extension of all existing
restraint-based strategies for chromosome modelling and, in
particular, of TADbit38, a modelling tool previously developed in
our lab.
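Steps (iii) and (iv) above can be pictured with a minimal Python sketch. It assumes a harmonic distance restraint whose target is interpolated linearly between the values derived at two consecutive time points, and it replaces steered molecular dynamics with a plain gradient-descent update on two beads. The linear interpolation, the harmonic form, the update rule, and all names and parameter values are illustrative assumptions, not TADdyn's implementation.

```python
import math


def interpolate_target(d_start, d_end, step, n_steps):
    """Move the restraint target linearly from d_start to d_end."""
    t = step / n_steps
    return (1.0 - t) * d_start + t * d_end


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def steer_pair(pos_i, pos_j, d_start, d_end, n_steps=1000, k=1.0, lr=0.05):
    """Relax two beads under a harmonic restraint with a moving target.

    Energy at each step: 0.5 * k * (r - d_target(step)) ** 2.
    Gradient descent stands in for molecular dynamics (assumption).
    """
    pi, pj = list(pos_i), list(pos_j)
    for step in range(1, n_steps + 1):
        target = interpolate_target(d_start, d_end, step, n_steps)
        r = distance(pi, pj)
        if r == 0.0:
            continue  # direction undefined; skip this update
        grad = k * (r - target)  # dE/dr
        for axis in range(3):
            u = (pi[axis] - pj[axis]) / r  # unit vector component
            pi[axis] -= lr * grad * u
            pj[axis] += lr * grad * u
    return pi, pj


# Two beads start 4 units apart; the target shrinks from 4 to 1.
bead_a, bead_b = steer_pair((0.0, 0.0, 0.0), (4.0, 0.0, 0.0),
                            d_start=4.0, d_end=1.0)
final_distance = distance(bead_a, bead_b)
```

Because the target changes gradually, the beads follow it with a small lag instead of jumping between the two end states, which is the intuition behind using time-dependent restraints to obtain a continuous trajectory.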
We applied TADdyn to a previously published in situ Hi–C
interaction time-series dataset (GEO accession number GSE96611).
We could apply restraint-based modelling simultaneously to seven distinct
time-points of in situ Hi–C experiments during C/EBPα priming
followed by Oct4, Sox2, Klf4 and Myc (OSKM)-induced
reprogramming of B cells to PSCs35. To collect statistics on
distinct expression dynamics, we focused on 20 different regions of ~2
mega-bases (Mb) of the mouse genome encompassing a
total of 21 different loci (Supplementary Data 1). The selected
genes are representative of different time-dependent patterns of
transcriptional activity (Supplementary Fig. 1 and “Methods”),
which allowed us to study how different transcription
dynamics interplay with changes in the 3 (...truncated)