Comparative analysis of the genetic variability within the Q-type C2H2 zinc-finger transcription factors in the economically important cabbage, canola and Chinese cabbage genomes
Lawrence and Novak Hereditas (2018) 155:29
https://doi.org/10.1186/s41065-018-0065-5
RESEARCH
Open Access
Susan D. Lawrence*
and Nicole G. Novak
Abstract
Background: Brassica oleracea, B. rapa and B. napus encompass many economically important vegetable and oil
crops, such as cabbage, broccoli, canola and Chinese cabbage. The genome sequencing of these species allows for
gene discovery with an eye towards discerning the natural variability available for future breeding. The Q-type
C2H2 zinc-finger protein (ZFP) transcription factors contain zinc-finger motifs that include a conserved QALGGH
sequence, and they may play a critical role in the plant's response to stress. While they may contain from one to five
ZF domains (ZFDs), this work focuses on the ZFPs that contain two zinc fingers, which bind to the promoters of
genes and negatively regulate transcription via the EAR motif. B. oleracea and B. rapa are diploid and evolved into
distinct species about 3.7 million years ago. B. napus is polyploid and was formed by fusion of the diploids about
7500 years ago.
Results: This work identifies a total of 146 Q-type C2H2-ZFPs, with 37 in B. oleracea, 35 in B. rapa and 74 in B.
napus. The level of sequence similarity and the arrangement of these genes on their chromosomes have mostly
remained intact in B. napus when compared to the chromosomes inherited from either B. rapa or B. oleracea. In
contrast, the difference between the protein sequences of the orthologs of B. rapa and B. oleracea is greater, and
their organization on the chromosomes is much more divergent. In general, the 146 proteins are highly conserved,
especially within the known motifs. Differences within subgroups of ZFPs were identified. Considering that B. napus has
twice the number of these proteins in its genome, RNA-Seq data were mined and the expression of 68 of the 74 genes
was confirmed.
Conclusion: Alignment of these proteins gives a snapshot of the variability that may be available naturally in Brassica
species. The aim is to study how different ZFPs bind different genes or how dissimilar EAR motifs alter the negative
regulation of the genes bound by the ZFP. Results from such studies could be used to enhance stress tolerance in future
Brassica breeding programs.
Keywords: Brassica, Q-type C2H2 zinc finger transcription factors, Cabbage, Canola
* Correspondence:
Invasive Insect Biocontrol and Behavior Lab, USDA-ARS, 10300 Baltimore Ave.,
BARC-West Bldg 007, Rm 301, Beltsville, MD 20705, USA
© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Background
Q-type C2H2 zinc finger proteins (ZFP) are transcription factors. “Q-type” refers to the invariant QALGGH
sequence found in the zinc finger domains, and C2H2
characterizes the two cysteine and two histidine residues
found in each finger. These residues bind a zinc ion that
stabilizes the ZFP and allows it to bind specifically to a domain within the promoter of the gene it regulates. First
discovered in petunia by Takatsuji et al. [1], a total of 21
Q-type C2H2 ZFPs have been described in that species
[2]. Using in silico methods, Englbrecht et al. [3] described three groups of ZFPs in Arabidopsis (A, B and C),
with the C family divided into three additional groups
(C1, C2 and C3) depending on the number of amino acids between the invariant histidines. There are 64 members in
the C1 family, which contain either a single zinc finger domain (ZFD) or a cluster of
two to five ZFDs. The C1 family
has three amino acids between the histidines and contains many proteins responsive to environmental stress
[4, 5]. In Arabidopsis, the 18 two-fingered Q-type C2H2
ZFPs are members of the C1-2i group and will be
referred to here as ZFPs. The Arabidopsis proteins cluster
into five groups, named 2i-A to 2i-D, plus an outlier, X [3].
These ZFPs include a conserved domain containing the
amino acids DLN. It is similar to the first active repression motif described in plants [6], which was named the
ethylene-responsive element-binding factor (ERF)-associated amphiphilic repression (or EAR) domain. A role
for the EAR motif as an active repressor was also demonstrated in ZFPs of Arabidopsis [7]. Ectopic expression
of ZFPs can lead to an increase in tolerance to specific
stresses [5]. Subsequently, additional studies identifying
all forms of C2H2 ZFPs have been undertaken in, for example, rice, foxtail millet, poplar and crocus [8–11]. Several other studies focused specifically on the Q-type
C2H2 TFs, for example in poplar and bread wheat [12,
13]. Generally, these studies rely on a
published genome sequence; however, studies in bread
wheat (a hexaploid organism) and crocus relied on ESTs
from public databases [11, 13]. The work described in
this manuscript catalogs the Q-type ZFPs in three Brassica species. Naturally occurring variants
within these proteins can therefore be used in subsequent studies to identify how the altered sequences could affect
gene expression. ZFPs might be useful as tools for
breeding increased tolerance to biotic or abiotic stresses
encountered by cole crops.
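The motif-based definition used above (an invariant QALGGH in each finger, three amino acids between the invariant histidines in the C1 family, and a DLN-containing EAR-like domain) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes that the histidine of QALGGH is the first invariant histidine, it ignores the positions of the two cysteines, and the function name `summarize_zfp` and the toy sequence are hypothetical, made up purely for illustration.

```python
import re

# Assumption: the H of QALGGH is the first invariant histidine, and the second
# invariant histidine follows after exactly three residues (C1-family spacing).
Q_TYPE_FINGER = re.compile(r"QALGGH.{3}H")

# Core of the EAR-like repression domain mentioned in the text.
EAR_CORE = "DLN"


def summarize_zfp(seq: str) -> dict:
    """Count Q-type fingers and check for a DLN core in one protein sequence."""
    seq = seq.upper()
    # Start position (0-based) of every non-overlapping finger match.
    finger_starts = [m.start() for m in Q_TYPE_FINGER.finditer(seq)]
    return {
        "n_fingers": len(finger_starts),
        "finger_starts": finger_starts,
        "two_fingered": len(finger_starts) == 2,
        "has_dln": EAR_CORE in seq,
    }


# Toy sequence (invented, not a real Brassica protein): two fingers plus DLN.
toy = "MKRCSVCGKEFSSYQALGGHKASHRKPPAAAACSICHKSFPTGQALGGHKRCHYEGLDLNLPP"

if __name__ == "__main__":
    print(summarize_zfp(toy))
```

A sequence passing both checks would only be a candidate two-fingered ZFP. A real survey would also need to confirm the cysteine positions and the placement of the EAR motif, for example by alignment against known Arabidopsis ZFPs.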
In the current study, we identify ZFPs of economically important Brassica species and analyze their structure and their expression using RNA-Seq data from
previously published work. To cast a wide net for capturing genetically related ZFPs, ZFPs from three related species
were examined and compared to Arabidopsis ZFPs.
Arabidopsis is also a member of the Brassicaceae and
evolved from the same ancestral progenitor as the Brassica species. The Arabidopsis lineage split from this
ancestral progenitor approximately 20 million years ago
(mya), with a whole-genome duplication occurring in
the Brassica lineage ~16 mya, which is reflected in a
doubling in gene number when comparing Brassica to
Arabidopsis [14]. Comparative mapping of Arabidopsis
to several other similar species provides evidence for an
ancestral Arabidopsis genome that has been divided
into 24 blocks [15]. These blocks have been reshuffled in
present-day Arabidopsis, and the same syntenic blocks have
also been identified in modern Brassica species. Maps
showing the reorganization of these blocks in B. rapa
and B. oleracea have been published along with the
genome sequences of these species [16, 17] (...truncated)