Choice of binding sites for CTCFL compared to CTCF is driven by chromatin and by sequence preference

Nucleic Acids Research, Aug 2018

The two paralogous zinc finger factors CTCF and CTCFL differ in expression such that CTCF is ubiquitously expressed, whereas CTCFL is found during spermatogenesis and in some cancer types in addition to other cell types. Both factors share the highly conserved DNA binding domain and are bound to DNA sequences with an identical consensus. In contrast, both factors differ substantially in the number of bound sites in the genome. Here, we addressed the molecular features for this binding specificity. In contrast to CTCF we found CTCFL highly enriched at ‘open’ chromatin marked by H3K27 acetylation, H3K4 di- and trimethylation, H3K79 dimethylation and H3K9 acetylation plus the histone variant H2A.Z. CTCFL is enriched at transcriptional start sites and regions bound by transcription factors. Consequently, genes deregulated by CTCFL are highly cell specific. In addition to a chromatin-driven choice of binding sites, we determined nucleotide positions critical for DNA binding by CTCFL, but not by CTCF.

INTRODUCTION

In recent years, the multifunctional and highly conserved factor CTCF (CCCTC-binding factor) has been identified as a key player in 3D chromatin architecture and gene regulation (1–3). CTCF binds DNA through a combination of 11 zinc fingers from its central DNA binding domain (4). At its binding sites it can interact with a variety of co-factors, most importantly the cohesin complex, to mediate the formation of long-distance DNA interactions and DNA loops (5). Such looping events can then link three-dimensional genomic architecture to a functional output such as the regulation of genes through an enhancer or insulator (6). Using techniques like 3C (chromatin conformation capture) and its genome-wide derivatives such as Hi-C, topologically associated domains (TADs) were identified, and CTCF was found to be enriched in the border areas of such domains (7).
Disruption of CTCF binding and binding sites leads to changes in TAD patterns and affects proper gene expression programs (8). Taken together, CTCF is one of the central factors in bridging genome architecture to function.

In contrast to the established role of CTCF, the cellular role of the only known CTCF paralogue, CTCFL, remains to be solved. CTCFL was identified in 2002 (9) and is believed to result from a gene duplication event early in amniote evolution (10), with CTCF and CTCFL sharing a highly conserved 11 zinc finger (ZF) DNA binding domain. The N- and C-termini of the two proteins are different, with an amino acid similarity of <20% between mammalian versions (11). First reports described CTCFL expression as testis specific and mutually exclusive with CTCF expression. Later, more detailed analysis showed CTCFL to be transiently expressed during spermatogenesis, prior to the onset of meiosis, overlapping with CTCF expression (12). Some functional differences between the two proteins have been identified; for instance, it seems that only CTCF binds components of the cohesin complex such as Smc1 in mouse (12) or RAD21 in human (13). CTCFL also failed to substitute for a loss of CTCF in CTCF KO experiments (12). Further knockout experiments of CTCFL showed it to be important for proper testicular development. This is exemplified by the deregulation of important testis-specific genes, such as Gal3st1 and Prss50 (12,14,15). The tissue specificity of CTCFL expression has been questioned (16) by showing a more widespread expression in normal and in cancer cells. Aberrant expression of CTCFL was identified in some cancers (17–19). Research to identify CTCFL as a biomarker for specific cancer types (20) and for therapeutic approaches has been followed up (21).

With the advent of next-generation sequencing, many advances in the field of DNA binding factors have been made. For CTCFL, too, the genome-wide binding patterns have begun to be explored (12,13).
Sites of CTCFL binding strongly overlap with CTCF sites, and the identified DNA binding motifs of the two proteins are virtually identical (12,13,22). CTCFL seems to preferentially bind to genomic regions of active and open chromatin, showing for example an enrichment at transcriptional start sites compared to CTCF (12). CTCFL binding is strongly associated with the presence of active histone modifications like H3K4me3 or H3K27ac (12,13). Most recently, it was shown that CTCFL binds genomic sites characterized by the presence of two CTCF motifs in close proximity, allowing for simultaneous binding of CTCF and CTCFL (13). In mouse and human, the similarity between the ZF DNA binding domains of the two proteins is ∼70% at the amino acid level (11), which might also explain some degree of differential binding. Thus, epigenetic marks and dual binding motifs contribute to binding specificity. However, it has not been analysed whether epigenetic marks are solely responsible for the binding specificity, such that a site that is closed in one tissue and not bound by CTCFL will be bound in another tissue where the chromatin is open. Here, we find that this is exactly the case. CTCF binds irrespective of chromatin ‘openness’, whereas CTCFL binding is regulated by epigenetic marks characteristic of open chromatin. In addition, we find that not all CTCF sites can potentially be bound by CTCFL; rather, DNA-sequence specificity restricts CTCFL binding to a sub-set of CTCF sites.

MATERIALS AND METHODS

Cell culture and transfection

Murine NIH3T3 and P19 cells as well as human K562 cells were grown at 37°C with 5% CO2 in Dulbecco's modified Eagle's medium supplemented with 10% (v/v) serum and 1% PenStrep. Differentiation of P19 cells was achieved by supplementing cells grown on adherent dishes with 10 μM retinoic acid.
Transfections were performed on adherent cells using jetPEI reagent (Polyplus transfection) in accordance with the manufacturer's instructions. To generate stable clones, cells were transfected with pBI-EGFP-FLAG-mCtcfl and pTA-N, a Tet-off system (Clontech) that turns off the expression of CTCFL in the presence of Doxycycline (2 μg/ml). The transfected cells were selected for puromycin resistance starting 24 h after transfection. The clones were selected in 96-well plates, expanded and characterized by immunoblotting, RT-qPCR and immunofluorescence. CTCFL expression was achieved by growing the cells in medium lacking Doxycycline for 48 h.

ChIP-seq data analysis of K562 ENCODE data

The K562 pre-aligned ChIP-seq data (hg19) were downloaded from ENCODE (23) via the UCSC genome browser portal (Supplementary Table S1) (24). CTCF and CTCFL peaks were called by MACS2 with standard settings (25). Peaks overlapping the ENCODE blacklisted regions for hg19 were removed from the analysis. The set of peaks overlapping between both replicates was used for subsequent analyses. We defined five different categories: all CTCF sites, all CTCFL sites, CTCF/CTCFL co-bound sites as well as CTCF or CTCFL st (...truncated)
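
The peak filtering described in the ChIP-seq paragraph above (removal of blacklisted peaks, then retention of peaks shared between replicates) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy coordinates, the overlap criterion (at least one shared base pair) and the choice to report replicate 1 coordinates are all assumptions made for illustration, and a real analysis would more likely use a dedicated interval tool than this quadratic scan.

```python
from typing import List, Tuple

# A peak is (chromosome, start, end), half-open as in BED files.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """True if both intervals lie on the same chromosome and share >= 1 bp.

    The 1 bp threshold is an assumption; the text does not state one.
    """
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_blacklisted(peaks: List[Peak], blacklist: List[Peak]) -> List[Peak]:
    """Drop every peak that overlaps any blacklisted region."""
    return [p for p in peaks if not any(overlaps(p, b) for b in blacklist)]


def shared_between(rep1: List[Peak], rep2: List[Peak]) -> List[Peak]:
    """Keep peaks of rep1 that overlap at least one peak of rep2.

    Reporting rep1 coordinates is an illustrative choice (hypothetical).
    """
    return [p for p in rep1 if any(overlaps(p, q) for q in rep2)]


# Toy data (hypothetical coordinates, not taken from ENCODE).
rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
rep2 = [("chr1", 150, 250), ("chr2", 60, 120), ("chr2", 900, 1000)]
blacklist = [("chr2", 0, 100)]

clean1 = remove_blacklisted(rep1, blacklist)
clean2 = remove_blacklisted(rep2, blacklist)
reproducible = shared_between(clean1, clean2)
print(reproducible)
```

In this toy example the chr2 peaks that touch the blacklisted region are discarded in both replicates, and only the chr1 peak supported by both replicates is retained.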


Article PDF: https://academic.oup.com/nar/article-pdf/46/14/7097/25509574/gky483.pdf
Article home page: https://academic.oup.com/nar/article/46/14/7097/5025896

Bergmaier, Philipp; Weth, Oliver; Dienstbach, Sven; Boettger, Thomas; Galjart, Niels; Mernberger, Marco; Bartkuhn, Marek; Renkawitz, Rainer. Choice of binding sites for CTCFL compared to CTCF is driven by chromatin and by sequence preference. Nucleic Acids Research, 2018, Volume 46, Issue 14, pp. 7097-7107. DOI: 10.1093/nar/gky483