Four divergent Arabidopsis ethylene-responsive
element-binding factor domains bind to a target DNA
motif with a universal CG step core recognition and
different flanking bases preference
Shuo Yang1, Shichen Wang1, Xiangguo Liu1, Ying Yu1, Lin Yue3, Xiaoping Wang1 and Dongyun Hao1,2
1 Key Laboratory for Molecular Enzymology and Engineering of the Ministry of Education, Jilin University, Changchun, China
2 Biotechnology Research Centre, Jilin Academy of Agricultural Sciences (JAAS), Changchun, China
3 School of Physical Education, Northeast Normal University, Changchun, China
Introduction
The ethylene-responsive element-binding factor (ERF)
gene family of transcriptional factors is one of the
largest transcriptional factor gene families in the plant
kingdom [1,2]. The ERF domain was first identified as
a conserved motif of 58–59 amino acids in four DNA-
binding proteins from tobacco and was shown to bind
specifically to a GCC box [3]. After the completion of
the sequencing of the Arabidopsis genome [4], 124
genes were predicted to encode proteins belonging to
by the ERF domain with the DRE motif, which is probably determined
by the highly conserved residues presented in the DNA contact surface
among the whole AtERF family members. The different preferences at
flanking bases of individual ERF domains, which appear to be attributed
to the subfamily- or subgroup-specific residues, may be essential
for the discrimination of the target binding motif from various similar sequences
by divergent AtERF domains.
Abbreviations
DBD, DNA binding domain; DRE, dehydration-responsive element; EMSA, electrophoretic mobility shift assay; ERE, ethylene-responsive
element; ERF, ethylene-responsive element-binding factor.
FEBS Journal 276 (2009) 7177–7186 © 2009 The Authors Journal compilation © 2009 FEBS 7177
The AtERF family is further divided into various
subgroups according to the homology of ERF
domains [5,6].
An ERF domain consists of a three-stranded antiparallel
β-sheet and an α-helix, packed approximately
parallel to the β-sheet, with the seven thoroughly conserved
amino acids (Arg6, Arg8, Trp10, Glu16, Arg18,
Arg26 and Trp28) in the β-sheet contacting uniquely
with the bases of the target DNA at the major groove
(see Fig. 1A) [7]. Phylogenetic analyses of the ERF
domains of all members within the AtERF family
show that the residues Arg6, Glu16 and Trp28 are
completely conserved among all 124 members, whereas
more than 95% of members contain the Arg8, Arg18,
Arg26 residues [6].
From the results of the few AtERFs studied,
however, the conserved ERF domains do not seem to
prefer identical DNA consensus sequences. For instance,
some AtERFs have been shown to bind in vitro
have previously demonstrated that various ERF
domains had divergences in their DNA recognition
modes [9], but, to date, additional supporting evidence
has been lacking. Indeed, little is known regarding
the ways in which these differences are important for
the functionalities of members in the AtERF family,
the majority of which have not yet been studied.
In the present study, we selected four representatives
from different functional subgroups of the AtERF
family and characterized the in vivo and in vitro bind-
ing specificities of the four ERF domains for a
sequence containing the DRE motif. In addition, we
used a random sequence selection method to identify
the core recognition motifs preferred by each of the
four domains. A universal binding characteristic was
revealed, in addition to the individual features of vari-
ous ERF domains involved in recognition of the DRE
Fig. 1. (A) Solution structure of AtERF1–GCC box complex (PDB
code: 1GCC) [7]. The DNA-binding domain is shown in the schematic;
DNA is represented by tubes. The β-sheet of the ERF
domain is light blue and the seven conserved amino acid residues
reported to contact DNA bases directly are red; other conserved
amino acid residues that do not directly contact DNA bases
are blue. (B) The DNA base sequence with position numbering
along the 16 bp fragment of DREwt. The bases in the core ACCGAC
are in bold and boxed in gray. (C) Sequence alignment of four
ERF domains of AtERF1, AtERF4, AtEBP and CBF1. The secondary
The equilibrium dissociation constants (Kd) of
AtERF1, AtERF4 and AtEBP for binding to the DRE
motif were within the range typical of monomeric
interactions, although the binding activities were in general
lower than those for binding to the GCC motif.
CBF1 appeared to bind to the DRE motif more
strongly than to the GCC motif, implying that CBF1 may
prefer the DRE motif over the GCC motif. To determine
whether these variations in binding affinity were
caused by binding instability as a result of nonspecific
interference, rather than by an alteration of the binding
site, we carried out a competition binding assay
using the nonspecific competitor poly[dA-dT]·poly[dA-dT]
in an electrophoretic mobility shift assay (EMSA).
Figure 2 shows that most of the AtERFs exhibited
similar stability in binding to either the GCC or the
DRE motif. The most remarkable feature arising from
the competition binding assay was the consistency of
the binding preference of the AtERFs with the EMSA
analysis. The three AtERFs, AtERF1, AtERF4 and
AtEBP, which have higher sequence similarity to each other
than to CBF1, shared similar binding preferences that
differed from that of CBF1.
Verification of the binding characteristics of the
selected AtERFs with the DRE motif
To verify the detailed binding characteristics of the
four different AtERFs to a given consensus sequence
DRE, EMSAs were carried out with the DRE motif
S. Yang et al. Arabidopsis ERFs recognize a common CG step core
base substitution at that position caused the greatest
decline in binding activity. AtEBP required C8, G9
and C11 most strictly, with a moderate
requirement for C7. As for CBF1, the prerequisite
bases appeared to be C8 and G9, whereas the other
bases within the binding motif were only moderately
required to varying extents. In all four reactions, bases
C8 and G9 in the DRE motif were absolutely
required by all AtERFs for specific binding, indicating
that recognition of the CG step is conserved
among the AtERFs and may be the universal binding
characteristic of different AtERFs in recognition of
the DRE motif. In addition, bases C7 and C11 within
the motif were required to different extents by AtERFs
from the divergent phylogenetic subgroups, implying
that recognition of these bases is an individual
feature of distinct AtERFs binding to the DRE motif.
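The effect of a base substitution on binding can be put on a free-energy scale. The sketch below uses the standard thermodynamic relation ΔΔG = RT ln(Kd,mutant / Kd,wild type); that this is the exact calculation behind the ΔΔG values reported here is an assumption, as are the temperature (298.15 K) and the Kd values, which are hypothetical numbers chosen only to illustrate the arithmetic.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal·mol^-1·K^-1


def ddg_kcal(kd_mutant, kd_wild_type, temperature_k=298.15):
    """Binding free-energy change (kcal/mol) of a substituted site vs. wild type.

    Uses ddG = R * T * ln(Kd_mutant / Kd_wild_type).
    Positive values mean the substitution weakens binding.
    """
    if kd_mutant <= 0 or kd_wild_type <= 0:
        raise ValueError("Kd values must be positive")
    return R_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


# Hypothetical Kd values (nM) for illustration only; NOT taken from the paper.
kd_wt = 10.0
example_kd = {"C7": 40.0, "C8": 2000.0, "G9": 1500.0, "C11": 80.0}

for position, kd in example_kd.items():
    print(position, round(ddg_kcal(kd, kd_wt), 2))
```

On this scale a tenfold loss of affinity corresponds to roughly 1.4 kcal·mol–1 at room temperature, so positions that are "absolutely required" would be expected to show markedly larger ΔΔG values than moderately required ones.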
In vivo DNA binding specificity of AtERFs by the
reporter–effector transient assay
To test whether the binding specificities of AtERFs
observed in vitro also govern
DRE-mediated transcription within plant tissue, reporter–effector
cotransformation assays were carried out. An
effector plasmid possessing the coding region of the
full-length AtERF1, AtEBP or CBF1 genes driven by
the CaMV 35S promoter, together with the luciferase
reporter gene containing four tandem copies of either
the DRE motif or its mutants at the upstream regula-
[Figure residue: panels AtERF1, AtERF4, AtEBP and CBF1; y-axis ΔΔG (kcal·mol–1), scale 0–5; x-axis 5–11.]
hexamer GCCGCC motif, which is consistent with the
results from previous studies [7,8]. Whereas
AtERF4 tolerated either G or A at position 2
of the hexamer G/aCCGCC, AtEBP selected the
hexamer GCCGCC as its binding motif. The selected
motif of CBF1, AA/cCGAC, appears to agree with a
previous report [14]. Although each ERF domain
showed different binding preferences, all of the binding
sites selected by the AtERFs from the four subgroups
possessed a common CG core in the centre and a conserved
C at the last position (position 7). Moderately
divergent bases at the other positions
within the binding motifs discriminate the members
of different subgroups.
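The comparison of the four selected motifs can be checked position by position. The following is a minimal sketch, not the authors' procedure: the motifs are transcribed from the text, the assignment of the first GCCGCC hexamer to AtERF1 is an assumption (its subject is not preserved in this section), and the numbering of the hexamer as positions 2 to 7 is inferred from the mentions of "position 2" and "position 7".

```python
# Selected motifs written position by position; each entry is the set of
# bases accepted at that position (e.g. "GA" means G or A).
# ASSUMPTION: the first GCCGCC hexamer belongs to AtERF1.
MOTIFS = {
    "AtERF1": ["G", "C", "C", "G", "C", "C"],
    "AtERF4": ["GA", "C", "C", "G", "C", "C"],
    "AtEBP": ["G", "C", "C", "G", "C", "C"],
    "CBF1": ["A", "AC", "C", "G", "A", "C"],
}
FIRST_POSITION = 2  # ASSUMPTION: hexamer spans positions 2..7


def invariant_positions(motifs, first_position=FIRST_POSITION):
    """Return {position: base} for positions where every motif accepts
    exactly one base and that base is the same in all motifs."""
    shared = {}
    for offset, column in enumerate(zip(*motifs.values())):
        allowed = [set(entry) for entry in column]
        single = all(len(s) == 1 for s in allowed)
        if single and len(set.union(*allowed)) == 1:
            shared[first_position + offset] = next(iter(allowed[0]))
    return shared


print(invariant_positions(MOTIFS))  # → {4: 'C', 5: 'G', 7: 'C'}
```

Under these assumptions the invariant positions are the central C and G (the CG step) and the final C, while positions 2, 3 and 6 vary between subgroups.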
The solution structure of the complex formed by the
ERF domain of AtERF1 with the GCC box (1GCC)
shows that two categories of residues within the domain
are considered to be important for specific DNA binding:
one consists of the residues in the β-sheet directly
contacting the DNA bases; the other is made up of
the numerous Ala residues in the α-helix and the hydrophobic
residues with larger side chains in the β-sheet (in
particular Phe13, Phe32, Val27 and Ile17), which
appear to determine the geometry of the α-helix relative
to the β-sheet [3–5,7–9,17]. A multiple alignment of
Arabidopsis ERF domains (Fig. 6) shows that a series
of residues (e.g. Gly4, Arg6, Arg8, Gly11, Glu16, Ile17,
Arg18, Arg26, Trp28, Leu29, Gly30, Ala38, Ala39,
Asp43 and Asn57) are almost absolutely conserved
among all members of the ERF family. Most of these
[Figure residue: bar-chart panels for the binding motifs DREwt, DREt1, DREt2 and DREt3 (axis scale 0–10); bars labelled Control, AtERF1, AtEBP and CBF1.]
a CG step core as the universal binding characteristic.
This may be the foundation of the formation of a sta-
ble ERF–DNA complex and the different flanking
position preferences by individual ERF domains may
be crucial for the precise regulation of their own target
genes by various ERFs.
Materials and methods
Preparation of ERF domain-containing proteins
The coding region of the ERF domain of CBF1
(UniProtKB: P93835) (amino acids 47–142), which contains 10
and 38 amino acids in the N- and C-terminal regions,
respectively, was prepared as described previously [9]. The
Fig. 5. AtERF4 suppresses the transcription of the luciferase reporter
gene driven by the DRE motif and its mutants. Multiple copies of
the GAL4 binding sequence were inserted into the DRE:luciferase
reporter next to the 4× DRE motif. An extra effector was constructed
carrying the coding sequences of the activation domain of
viral protein 16 and the yeast GAL4 DBD. The reporter and two
effectors in a ratio of 1 : 1 : 1 were cotransformed into plant tissue;
the remainder was the same as in Fig. 4.
Table 2. Selection of binding sites from a random oligonucleotide
pool by ERFs. Selections were performed using a 60 bp oligonucleotide
containing a randomized site of 10 bp. The selected
sequences were aligned computationally and the occurrence of each
base at each position in a motif is presented as a percentage
frequency of all four bases. A base with a frequency higher
than 50% (bold) was defined as the selected site. If the second
highest frequency was not less than half the highest
frequency (marked with an asterisk), that base was defined as the second
possible site and is presented in lower case.
Position  A (%)  C (%)  G (%)  T (%)  Selected
3         8.0    64.0   16.0   12.0   C
4         8.0    60.0   20.0   12.0   C
5         0.0    4.0    88.0   8.0    G
6         56.0   32.0   8.0    4.0    A/c
7         4.0    68.0   24.0   4.0    C
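The calling rule stated in the Table 2 legend can be sketched in a few lines. This is an illustration rather than the authors' alignment software. Two points are assumptions: the column order A, C, G, T (inferred from the called bases in the surviving rows) and the use of "N" when no base exceeds 50%, a case the legend does not cover.

```python
BASES = ("A", "C", "G", "T")  # ASSUMPTION: column order inferred from the rows


def call_position(freqs, bases=BASES):
    """Apply the Table 2 rule to one row of percentage frequencies.

    - a base with frequency > 50% is the selected base (upper case)
    - if the runner-up is at least half of the top frequency, it is
      appended in lower case as the second possible base
    - "N" when no base exceeds 50% is an ASSUMPTION, not from the legend
    """
    ranked = sorted(zip(freqs, bases), reverse=True)
    (top_freq, top_base), (second_freq, second_base) = ranked[0], ranked[1]
    if top_freq <= 50.0:
        return "N"
    if second_freq >= top_freq / 2:
        return f"{top_base}/{second_base.lower()}"
    return top_base


# Rows for positions 3-7 as they survive in this section.
ROWS = {
    3: (8.0, 64.0, 16.0, 12.0),
    4: (8.0, 60.0, 20.0, 12.0),
    5: (0.0, 4.0, 88.0, 8.0),
    6: (56.0, 32.0, 8.0, 4.0),
    7: (4.0, 68.0, 24.0, 4.0),
}

calls = {position: call_position(freqs) for position, freqs in ROWS.items()}
print(calls)  # → {3: 'C', 4: 'C', 5: 'G', 6: 'A/c', 7: 'C'}
```

The calls reproduce the last column of the surviving rows, including the A/c call at position 6, where the runner-up (32%) is at least half of the top frequency (56%).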
Fig. 6. Sequence alignment of ERF domains of members of the Arabidopsis ERF family. All ERF domain sequences were aligned and classi-
fied according to the results from the phylogenetic tree. The names of the ERF domains are represented by their gene locus numbers
except that the names of the four domains used in this study are represented by the transcriptional factor names. The secondary structure
indicated above the sequence and the seven conserved amino acid residues reported to contact DNA bases directly [7] are in red; other
conserved amino acid residues that do not directly contact DNA bases are in blue.
ERF domains of AtERF1 (UniProtKB: O80337), AtERF4
(UniProtKB: O80340) and AtEBP (UniProtKB: P42736)
with 10 and 8 amino acids in the terminal regions, respectively,
were prepared according to the previous work of Hao
et al. [8]. The PCR products were then cloned into the
pET16b plasmid (Novagen, Merck, Darmstadt, Germany)
and the corresponding proteins were expressed in
BL21(DE3) pLysS (Merck) Escherichia coli cells and puri-
fied using a His-Trap his-tagged protein purification kit
(Amersham Pharmacia Biotech, Uppsala, Sweden). The pro-
tein concentrations were determined using the bicinchoninic
acid protein assay kit (Pierce, Chester, UK) and further
described previously [9].
Selection of the DNA-binding site
A 60 bp single-stranded DNA, RDM10, with 10 randomized
nucleotides (N10) in the center, i.e.
CTGTCAGTGATGCATATGAACGAAT-N10-AATCAACGACATTAGGATCCTTAGC,
was synthesized. A 100 ng sample of RDM10
was radiolabelled during synthesis of double-stranded DNA
using [32P]dATP[αP] with the E. coli Klenow fragment
(New England Biolabs, Ipswich, MA, USA). The selections
were performed after incubation with the individual ERF
domains (25–100 ng) followed by EMSA. Briefly, each
binding reaction was carried out in 10 µL of binding buffer
[25 mM Hepes-KOH (pH 7.5), 40 mM KCl, 0.1 mM EDTA,
0.1 mg·mL–1 BSA, 10% glycerol and 1 µg double-stranded
poly(dI–dC)] with 25–100 ng of the individual ERF domain.
The bound oligonucleotides were gel purified, extracted with
phenol/chloroform and precipitated with ethanol. The purified
DNAs were radiolabelled during amplification by PCR
using 5′ and 3′ primers in the presence of [32P]dATP[αP].
This product was used for the next round of selection following
the same protocol. After seven cycles of selection, the
retarded DNA band of the final selection was excised, puri-
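The design of the RDM10 pool described above can be sanity-checked with a short sketch. The flanking sequences are copied from the text. The sampling model (each randomized position drawn uniformly from A, C, G and T) is an assumption about the synthesis, and the function names are hypothetical.

```python
import random

# Constant flanks of RDM10 as given in the text (25 nt each).
LEFT_FLANK = "CTGTCAGTGATGCATATGAACGAAT"
RIGHT_FLANK = "AATCAACGACATTAGGATCCTTAGC"
RANDOM_LENGTH = 10  # the N10 randomized site


def random_member(rng):
    """Draw one pool member, ASSUMING equal base frequencies at each
    randomized position."""
    core = "".join(rng.choice("ACGT") for _ in range(RANDOM_LENGTH))
    return LEFT_FLANK + core + RIGHT_FLANK


def pool_complexity(random_length=RANDOM_LENGTH):
    """Number of distinct sequences possible in the randomized site."""
    return 4 ** random_length


rng = random.Random(0)  # fixed seed so the sketch is reproducible
oligo = random_member(rng)
print(len(oligo))         # → 60
print(pool_complexity())  # → 1048576
```

The two 25 nt flanks plus the 10 nt randomized site give the stated 60 bp, and a 10 nt random site allows about 10^6 distinct sequences, so each round of selection samples from a pool of that theoretical complexity.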
References
1 Okamuro JK, Caster B, Villarroel R, Van Montagu M
& Jofuku KD (1997) The AP2 domain of APETALA2
defines a large new family of DNA binding proteins in
Arabidopsis. Proc Natl Acad Sci USA 94, 7076–7081.
2 Riechmann JL & Meyerowitz EM (1998) The AP2/EREBP
family of plant transcription factors. Biol Chem 379,
633–646.
3 Ohme-Takagi M & Shinshi H (1995) Ethylene-inducible
DNA binding proteins that interact with an ethylene-
responsive element. Plant Cell 7, 173–182.
4 The Arabidopsis Genome Initiative (2000) Analysis of
the genome sequence of the flowering plant Arabidopsis
thaliana. Nature 408, 796–815.
5 Sakuma Y, Liu Q, Dubouzet JG, Abe H, Shinozaki K
& Yamaguchi-Shinozaki K (2002) DNA-binding specificity
of the ERF/AP2 domain of Arabidopsis DREBs,
transcription factors involved in dehydration and
cold-inducible gene expression. Biochem Biophys Res
Commun 290, 998–1009.
6 Nakano T, Suzuki K, Fujimura T & Shinshi H (2006)
Genome-wide analysis of the ERF gene family in
Arabidopsis and rice. Plant Physiol 140, 411–432.
7 Allen MD, Yamasaki K, Ohme-Takagi M, Tateno M &
Suzuki M (1998) A novel mode of DNA recognition by
a beta-sheet revealed by the solution structure of the
GCC-box binding domain in complex with DNA.
EMBO J 17, 5484–5496.
gene expression. Plant Mol Biol 24, 701–713.
15 Prabakaran P, An J, Gromiha M, Selvaraj S, Uedaira
H, Kono H & Sarai A (2001) Thermodynamic database
for protein-nucleic acid interactions (ProNIT). Bioinfor-
matics 17, 1027–1034.
16 Triezenberg SJ, Kingsbury RC & McKnight SL (1988)
Functional dissection of VP16, the trans-activator of
herpes simplex virus immediate early gene expression.
Genes Dev 2, 718–729.
17 Liu Y, Zhao TJ, Liu JM, Liu WQ, Liu Q, Yan YB &
Zhou HM (2006) The conserved Ala37 in the ERF/AP2
domain is essential for binding with the DRE element
and the GCC box. FEBS Lett 580, 1303–1308.
18 Gill SC & von Hippel PH (1989) Calculation of protein
extinction coefficients from amino acid sequence data.
Anal Biochem 182, 319–326.
19 Liu Q, Kasuga M, Sakuma Y, Abe H, Setsuko M,
Yamaguchi-Shinozaki K & Shinozaki K (1998)
Two transcription factors, DREB1 and DREB2,
with an EREBP/AP2 DNA binding domain
separate two cellular signal transduction pathways
in drought- and low-temperature-responsive gene
expression, respectively, in Arabidopsis. Plant Cell 10,
1391–1406.
20 Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F
& Higgins DG (1997) The CLUSTAL_X windows
interface: flexible strategies for multiple sequence alignment
aided by quality analysis tools. Nucleic Acids Res
25, 4876–4882.
21 Guo A, He K, Liu D, Bai S, Gu X, Wei L & Luo J