the E-proteins, ID1 was shown to positively regulate the cell cycle by inhibition of an
E-protein transcribed gene, the cyclin dependent kinase (CDK) inhibitor, p21.
Downregulation of p21 caused a cascade of signaling events that ultimately led to the
activation of genes required for S phase progression (Prabhu, et al., 1997). In a
different experiment, stable transfection of ID2 in U2OS, a human osteosarcoma cell
line, resulted in an increase of cells in S phase detected by flow cytometry (Iavarone,
et al., 1994).
Constitutive expression of ID genes in immortalized fibroblast cells was shown to
cause cytoskeletal disorganization and loss of adhesion (Deed, et al., 1993). ID
genes had also been shown to immortalize primary mouse fibroblasts when co-
transfected with Bcl2 (Norton, et al., 1998) and in particular, ID1 was able to
immortalize primary human keratinocytes leading to the activation of telomerase and
inhibition of pRb, a known tumour suppressor (Alani, et al., 1999).
In breast cancer, the best-illustrated case, overexpression of ID1 caused mammary
epithelial cells to invade the basement membrane and was shown to be highly
associated with more aggressive tumours (Desprez, et al., 1998). Constitutive
expression of ID1 in a non-invasive breast cancer cell line produced uncontrolled
growth and increased invasion (Lin, et al., 2000). In addition, ID1 was shown to be
involved in the regulation of steroid-hormone-responsive growth in breast cancer
cells, a loss of which led to uncontrolled growth of breast cancer cells.
1.8 Properties and roles of ID2
ID2 was first cloned in 1991 and functioned to inhibit bHLH-domain containing
transcription factors in a similar capacity as the other IDs (Langlands, et al., 1997,
Sun, et al., 1991). Full-length monomeric ID2 has 134 residues and a calculated
molecular weight of 15 kDa. The HLH domain of ID2 predicted by Pfam centered
around residues 24-76. Expression of ID2 was prevalent in early development in
would be very similar. At a sequence level, the modeled ID3 and ID2 homodimers
shared conserved hydrogen bonds at Y44, L50, Y72, Q77 as well as core
hydrophobic residues M39, L46, L49, M62, I69, I72, L75. At a structural level, the
predicted ID-HLH topology was the same as other bHLH-containing proteins so it
was expected that they would bind to all bHLH-containing proteins of the same
structure.
However, studies showed that ID2 did not form heterodimers with all bHLH-
containing proteins; rather, it selectively interacted with the Group A HLH-containing
proteins E47 and E12 as well as MYOD1 but not the Group B USF1 (Sun, et al.,
1991) nor the bHLH-z structures like MYC and MAX (Figure 4). When ID2 was
cloned, the authors reported that it did not homodimerize well (Sun, et al., 1991). Others
reported the ID2 homodimer to be insoluble and prone to aggregation, especially at high
concentrations (Colombo, et al., 2006). This could be a reason for the sparse
structural information on ID2.
1.9 Aim and Scope of Project
From previous studies, it was clear that the bHLH-containing proteins played
crucial roles in early development, neurogenesis, myogenesis and cancer. The HLH
domain was found to be well conserved throughout evolution with ID2 having an
ortholog in Drosophila (emc gene). A special class of HLH-containing proteins, the
IDs, were especially interesting due to their lack of a basic domain along with their
presence in almost all eukaryotic cells. Members of this family have been known to
regulate other Group A bHLH-containing proteins such as E47 (TCF3) and MyoD
(MYOD1) but very little was known about why they were so specific in their
interactions given the structural similarities to each other and to all the different
groups of HLH-containing proteins. Compounded with this was the fact that IDs were
short-lived proteins, as they functioned to regulate cell fate and were required to
(Invitrogen) and plated on LB agar plates containing 100 µg/ml kanamycin. Single
colonies were used to inoculate 5 ml Luria Broth (LB) containing 50 µg/ml kanamycin
and allowed to grow shaking overnight at 37°C. The overnight culture was
centrifuged at 720 g in an Eppendorf 5804R with A-4-44 swing-bucket rotor and the
pellet used for plasmid isolation using QiaPrep Spin Plasmid Miniprep Kit. The entry
clone was subsequently subcloned via the Gateway LR reaction (Invitrogen)
according to the manufacturer’s protocols into several expression vectors containing
sequences for different affinity and solubility tags, namely pDest-17 (His6),
pETG-20A (His6-TrxA), pDest-565 (His6-GST), pDest-HisMBP (His6-MBP) and pETG-60A
(His6-NusA). The expression clones were transformed into BL21 (DE3) Competent E.
coli cells (Invitrogen) and plated on LB agar plates containing 100 µg/ml Ampicillin.
Single colonies were isolated and grown overnight in 5 ml LB + 100 µg/ml Ampicillin
at 37°C with shaking. The same protocol used to isolate the
entry clones was used for the expression plasmids. Inserts were confirmed by
sequencing (1st base). In addition, 2 ml glycerol stocks of
the expression clones were stored (1 ml 70% glycerol + 1 ml overnight culture) at
-80°C for future use.
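The final glycerol concentration of these stocks follows from a simple dilution. The short sketch below works it through; the function name is introduced here for illustration only, and the calculation assumes the volumes are additive.

```python
def final_percent(stock_percent: float, stock_ml: float, culture_ml: float) -> float:
    """Return the final % (v/v) after mixing a stock with culture.

    Assumes the two volumes are simply additive (C1 * V1 = C2 * V2).
    """
    total_ml = stock_ml + culture_ml
    return stock_percent * stock_ml / total_ml


# 1 ml of 70% glycerol mixed with 1 ml of overnight culture, as in the text.
glycerol = final_percent(stock_percent=70.0, stock_ml=1.0, culture_ml=1.0)
print(f"Final glycerol concentration: {glycerol:.0f}% (v/v)")
```

Under this assumption, mixing equal volumes halves the glycerol concentration of the 70% stock.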
Table 2: ID2 constructs and their theoretical biochemical properties estimated by ProtParam
(Wilkins, et al., 1999). Constructs described in detail (yellow highlight)

Construct    cDNA (bp)  AA start  AA end  Residues  pI   MW (kDa)  Ext. coeff.
                                  82      82        8.8  9.3       4470
N-HLH113     339        1         113     113       9.2  12.7      4470
HLH24-82-L   219        24        82      73        6.1  8.3       4470
N-HLH82-L    288        1         82      96        8.8  10.8      4470
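The cDNA lengths and residue ranges in Table 2 can be cross-checked against each other, since each residue is encoded by one codon of three bases. The sketch below does this for the three complete rows. It assumes that the listed cDNA length covers only coding codons (no stop codon), and the function names are illustrative.

```python
# Values copied from Table 2: (cDNA length in bp, first ID2 residue, last ID2 residue).
CONSTRUCTS = {
    "N-HLH113": (339, 1, 113),
    "HLH24-82-L": (219, 24, 82),
    "N-HLH82-L": (288, 1, 82),
}


def encoded_residues(cdna_bp: int) -> int:
    """Number of residues encoded, assuming the cDNA is an exact run of codons."""
    if cdna_bp % 3 != 0:
        raise ValueError(f"{cdna_bp} bp is not a whole number of codons")
    return cdna_bp // 3


def extra_residues(cdna_bp: int, aa_start: int, aa_end: int) -> int:
    """Residues encoded beyond the stated ID2 range (inclusive of both ends)."""
    return encoded_residues(cdna_bp) - (aa_end - aa_start + 1)


for name, (bp, start, end) in CONSTRUCTS.items():
    print(f"{name}: {encoded_residues(bp)} residues encoded, "
          f"{extra_residues(bp, start, end)} beyond ID2 {start}-{end}")
```

On these numbers, N-HLH113 encodes exactly its stated range, while the two "-L" constructs each encode 14 additional residues. This would be consistent with an appended polypeptide, although the table itself does not state what the extra residues are.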
primer sequences used for BP cloning.
Construct    Forward 5’                  Reverse 3’
Full Length  ATGAAAGCCTTCAGTCCCGTGAGG    TCAGCCACACAGTGCTTTGCTGTC
HLH24-82     CGGAGCAAAACCCCTGTGGACGAC    ATGCGAGTCCAGGGCGATCTGCA
N-HLH82      ATGAAAGCCTTCAGTCCCGTGAGG    ATGCGAGTCCAGGGCGATCTGCA
N-HLH113     ATGAAAGCCTTCAGTCCCGTGAGG    ACAGGATGCTGATATCCGTGTTGAG
HLH24-82-L   CGGAGCAAAACCCCTGTGGACGAC    GATGCGAGTCCAGGGCGATCTGCA
N-HLH82-L
ID2_Q76A_R
  Peptide: Y I L D L A I A L D S
  Forward: TAC ATC TTG GAC CTG GCG ATC GCC CTG GAC TCG
  Reverse: CGA GTC CAG GGC GAT CGC CAG GTC CAA GAT GTA

Residue 68-79: ID2_Y71A_Q76A_F, ID2_Y71A_Q76A_R
  Peptide: V I D A I L D L A I A L
  Forward: GTC ATC GAC GCC ATC TTG GAC CTG GCG ATC GCC CTG
  Reverse: CAG GGC GAT CGC CAG GTC CAA GAT GGC GTC GAT GAC

Residue 71-81: ID2_Q76D_F, ID2_Q76D_R
  Peptide: Y I L D L D I A L D S
  Forward: TAC ATC TTG GAC CTG GAT ATC GCC CTG GAC TCG
  Reverse: CAG CGA GTC CAG GGC GAT ATC CAG GTC CAA GAT

Residue 65-75: ID2_Y71F_F, ID2_Y71F_R
  Peptide: L Q H V I D F I L D L
  Forward: CTG CAG CAC GTC ATC GAC TTC ATC TTG GAC CTG
  Reverse: CAG GTC CAA GAT GAA GTC GAT GAC GTG CTG CAG
Residue 42-52: ID2_K47R_F, ID2_K47R_R
  Peptide: -C Y S K L R E L V P S-
  Forward: TGCTACTCCAAGCTCAGGGAGCTGGTGCCCAGC
  Reverse: GCTGGGCACCAGCTCCCTGAGCTTGGAGTAGCA

Residue 35-47: ID2_Y37D_D41G_F, ID2_Y37D_D41G_R
  Peptide: -L L D N M N G C Y S K L K-
  Forward: CTGCTAGACAACATGAACGGCTGCTACTCCAAGCTCAAG
  Reverse: CTTGAGCTTGGAGTAGCAGCCGTTCATGTTGTCTAGCAG

Residue 35-47: ID2_Y37D_D41H_F, ID2_Y37D_D41H_R
  Peptide: -L L D N M N H C Y S K L K-
  Forward: CTGCTAGACAACATGAACCACTGCTACTCCAAGCTCAAG
  Reverse: CTTGAGCTTGGAGTAGCAGTGGTTCATGTTGTCTAGCAG

ID2 loop region mutants

Residue 50-60: ID2_Q55A_F, ID2_Q55A_R
  Peptide: -V P S I P Q N K K V S-
  Forward: GTGCCCAGCATCCCCGCGAACAAGAAGGTGAGC
  Reverse: GCTCACCTTCTTGTTCGCGGGGATGCTGGGCAC

Residue 50-60: ID2_Q55R_F, ID2_Q55R_R
  Peptide: -V P S I P Q N K K V S-

ID3_R60Q_R
  Peptide: -V P G V P Q G T Q L S-
  Forward: GTACCCGGAGTCCCGCAAGGCACTCAGCTTAGC
  Reverse: GCTAAGCTGAGTGCCTTGCGGGACTCCGGGTAC

Residue 61-71: ID3_Q66A_F, ID3_Q66A_R
  Peptide: -G T Q L S A V E I L Q-
  Forward: GGCACTCAGCTTAGCGCGGTGGAAATCCTACAG
  Reverse: CTGTAGGATTTCCACCGCGCTAAGCTGAGTGCC

Residue 61-71: ID3_Q66K_F, ID3_Q66K_R
  Peptide: -G T Q L S K V E I L Q-
  Forward: GGCACTCAGCTTAGCAAGGTGGAAATCCTACAG
  Reverse: CTGTAGGATTTCCACCTTGCTAAGCTGAGTGCC
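A mutagenic primer pair of this kind can be sanity-checked computationally: the forward primer should translate to the intended peptide, and the reverse primer is usually its reverse complement. The sketch below applies both checks to the ID2_K47R pair listed above. The helper names are illustrative, the standard genetic code is assumed, and not every pair in the table need be an exact reverse complement.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate in frame 1, ignoring any incomplete trailing codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# ID2_K47R primer pair and target peptide, copied from the table above.
forward = "TGCTACTCCAAGCTCAGGGAGCTGGTGCCCAGC"
reverse = "GCTGGGCACCAGCTCCCTGAGCTTGGAGTAGCA"
peptide = "CYSKLRELVPS"

print("forward encodes peptide:", translate(forward) == peptide)
print("reverse is exact reverse complement:", reverse_complement(forward) == reverse)
```

For this pair both checks pass, with the AGG codon supplying the arginine at position 47.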
2.3 Protein expression optimization
To test for protein expression and solubility of the different expression clones,
factors known to affect protein expression were varied. Experiments were performed
in 5 ml small-scale experiments. Variable factors included media (Luria Broth (LB)
and Terrific Broth (TB)), induction temperatures and times (17°C for 18 hrs, 25°C for
5 hrs, 30°C for 3 hrs), IPTG concentrations (0.2 mM – 1 mM) and solubility tags.
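To give a sense of the size of this screen, the sketch below enumerates the combinations of the factors named above. It assumes, purely for illustration, that every combination was tested and that only the two end points of the IPTG range were used; neither is stated in the text. The tag list is taken from the expression vectors described in the cloning section.

```python
from itertools import product

media = ["LB", "TB"]
# (induction temperature in °C, induction time in hours), as listed in the text.
inductions = [(17, 18), (25, 5), (30, 3)]
# Assumption: only the end points of the stated 0.2-1 mM range are enumerated.
iptg_mm = [0.2, 1.0]
tags = ["His6", "His6-TrxA", "His6-GST", "His6-MBP", "His6-NusA"]

conditions = [
    {"medium": medium, "temp_c": temp, "hours": hours, "iptg_mm": iptg, "tag": tag}
    for medium, (temp, hours), iptg, tag in product(media, inductions, iptg_mm, tags)
]

print(f"{len(conditions)} conditions per construct")
print(conditions[0])
```

Even with only two IPTG levels, a full grid grows quickly, which is one reason such tests are usually run at small scale.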
Glycerol stock scrapes were used to inoculate 5 ml LB overnight at 37°C. 2%
overnight inoculums were added to 5 ml fresh LB or TB and grown shaking at 37°C
till an OD600
2HPO4-7H2O, 3.1 g/L KH2PO4, 0.5 g/L NaCl, 0.5 g/L MgSO4, 0.1 mM CaCl2, 5 g/L NH4Cl, 20% d-Glucose) minimal media and
centrifuged again at 720 g in an Eppendorf 5804R with A-4-44 swing-bucket rotor for
5 min. The pellet was resuspended in 2 ml M9 media and added to 150 ml M9 media
and allowed to grow overnight at 37°C in a shaker incubator. Overnight culture (5 ml)
was added to fresh M9 media at a ratio of 1:100 and grown at 37°C until OD600
reached 0.6 (~6 h).
Amino acid mix containing 100 mg K, F, and T; 50 mg I, L, and V; and 60 mg Se-Met
per liter was added and mixed for 10 min at 37°C. The culture was induced with
0.4 mM IPTG and allowed to grow at 18°C for 18 hours.
2.6 Cell Harvesting
Cells were harvested by centrifugation in Nalgene plastic 50 ml tubes at
11,952 g in a Sorvall SS-34 rotor for 10 min at 4°C. The pellets were resuspended in
cold lysis buffer (50 mM Tris-HCl pH 8.0, 300 mM NaCl, 30 mM Imidazole) and
protein from N-HLH82-L construct as this had the highest resolution dataset. Single
stranded forward and reverse 5’-Cy5-labelled probes for e-box-containing
(underlined) MCK promoter sequence 5’-GGATCCCCCCAACACCTGCTGCCTGA
and mutant e-box probe 5’-GGATCCCCCCAAACTGGTCTGCCTGA (Sigma, Proligo)
with their exact reverse complements were annealed in a BioRad thermal cycler.
Purified E47 (residues 545-606) was used alone and in combination with ID2-N-HLH
after serial dilution and incubated for 10 min at room temperature in binding buffer
(20 mM Tris-HCl pH 8.0, 50 mM KCl, 1 mM DTT, 1 mM EDTA, 10% glycerol, 0.1 mg/ml
BSA). 2 μM Cy5-labelled probe was added for an additional 15 min at room
temperature to a final reaction volume of 20 μL. Samples were electrophoresed on a
6% Tris-glycine native polyacrylamide gel in 1xTris-glycine (25 mM Tris pH 8.3, 192
mM Glycine) buffer at 4°C for 130 min at 300 V and imaged using a Typhoon
phosphor-imager (Amersham Biosciences).
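The difference between the two probes can be checked by scanning each for the E-box consensus CANNTG, where N is any base. The sketch below does this with a regular expression. The function name is illustrative, and only the strand written in the text is scanned, since CANNTG reads the same on the complementary strand.

```python
import re

# Probe sequences as given in the text (5' to 3').
WT_PROBE = "GGATCCCCCCAACACCTGCTGCCTGA"
MUTANT_PROBE = "GGATCCCCCCAAACTGGTCTGCCTGA"

# Lookahead so that overlapping candidate sites would also be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched hexamer) for every CANNTG site."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


for name, probe in [("wild-type", WT_PROBE), ("mutant", MUTANT_PROBE)]:
    sites = find_eboxes(probe)
    print(name, [motif for _, motif in sites] or "no E-box")
```

The wild-type probe contains a single CACCTG site, while the mutant probe contains no CANNTG match, as expected for a negative control.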
2.9 Crystallization
Initial screens were done by an automated robot liquid-dispenser (Innovadyne) in
a 96-well format via sitting-drop vapour diffusion by combining 200 nl protein with 200
nl precipitant solution equilibrated over a 50 μl reservoir of precipitant. Screening kits
from Qiagen and Hampton Research were used and the best crystals were found in
Qiagen’s Cation Suite for all constructs after 4-5 days at 18°C. Hits were found in
conditions 4.5 M Ammonium Acetate (grid ID: E8) for HLH24-82-L-Se-Met; 0.1 M
MES pH 6.5, 2.0 M Potassium Acetate (grid ID: G8) for N-HLH82-L; and 0.1 M MES
pH 6.5, 2.5 M Lithium Acetate (grid ID: C9) for HLH24-82-L. Conditions were
optimized manually by hanging drop vapour diffusion using 1 μl of protein solution
mixed with 1 μl precipitant solution and allowed to grow at room temperature, 18°C
and 4°C. Optimal manual setup temperature was found to be 18°C at the same
the Appendix.
CHAPTER 3: RESULTS and DISCUSSION
(Expression to X-ray Data Collection)
3.1 Cloning and Small-scale Protein Expression
The different ID2 constructs were motivated by reports that ID2
did not homodimerize well (Sun, et al., 1991) and that the homodimer was
insoluble and aggregated at high concentrations (Colombo, et al., 2006). Many
published in vitro experiments used ID proteins that retained solubility tags or were
made at low concentrations for biochemical studies. The challenge was to express
enough soluble ID2 that was stable at high concentrations in order to conduct
crystallization trials.
To alleviate the instability issues, constructs (Table 2) were created based on the
published properties of ID2. Combined, the domain prediction tools Pfam (protein family
database) (Finn, et al., 2010) and the Simple Modular Architecture Research Tool
(SMART) (Schultz, et al., 1998) predicted that the HLH region spanned residues
28-81 (Table 6). Previous experiments found that the HLH domain alone was enough
for dimerization (Ellenberger, et al., 1994, Ma, et al., 1994) but additional residues
surrounding it were required for stability (Liu, et al., 2000). Beyond the HLH region,
the sequence diverged except for small pockets of similarity. An example was the
canonical D-box (destruction box) motif (RxxLxxxN) located C-terminal of the HLH at
residues 100-107 in ID2 and conserved in ID1 and ID4 (Lasorella, et al., 2006). This
motif was shown to be a target for APC (anaphase promoting complex) to bind,
hence signaling the protein for degradation (Lasorella, et al., 2006). In addition,
mutation of the D-box increased the half-life of ID2 10-fold without compromising its
ability to dimerize (Meng, et al., 2009).
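The D-box consensus quoted above (RxxLxxxN, where x is any residue) maps directly onto a regular expression. The sketch below shows how such a motif could be located in a protein sequence and reported with 1-based residue numbering. The peptide used is a made-up toy sequence, not ID2, and the function name is illustrative.

```python
import re

# R, any two residues, L, any three residues, N; lookahead reports overlapping hits.
DBOX = re.compile(r"(?=(R..L...N))")


def find_dbox(protein: str) -> list[tuple[int, int, str]]:
    """Return (first residue, last residue, motif) using 1-based numbering."""
    hits = []
    for match in DBOX.finditer(protein):
        motif = match.group(1)
        first = match.start() + 1
        hits.append((first, first + len(motif) - 1, motif))
    return hits


# Hypothetical toy peptide for illustration only (NOT the ID2 sequence).
toy_peptide = "MSTAGRKPLSDANEQ"
print(find_dbox(toy_peptide))
```

Applied to a real sequence, a hit spanning eight residues would correspond to a range such as the 100-107 reported for ID2, although that result is taken from the cited literature and is not reproduced here.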
To avoid degradation via the D-box and increase stability while retaining at least
the HLH and some surrounding residues, a construct was created to start at residue
24 and end at residue 82 (HLH24-82) (Table 2). Additionally, the full-length protein
and a construct including the D-box (N-HLH113) were made based on
Superfamily (Reinke, et al., 1991) domain prediction (Table 6) to see how they
performed in comparison. As expected, the small-scale expression studies
showed that the full-length and N-HLH113 constructs either produced no
soluble protein or were not expressed at all, regardless of solubility tag, induction
temperature or medium (Figure 6).
Figure 6: Representative small-scale protein expression tests.
SDS-PAGE on 12% gels: Insoluble (P) and soluble (S) fractions were alternated with 30°C and
17°C induction temperatures. Each gel denotes a different expression vector (labeled at the
bottom of gel). All experiments used LB media and were induced with 0.2 mM IPTG. The order of
the samples was the same for all gels apart from the positions of the marker. Red boxes denote
where expected bands should be.
(A) before induction (lane U), marker (lane M), full-length insoluble (lane P) at 30°C, full-length
soluble (lane S) at 30°C, full-length insoluble (lane P) at 17°C, full-length soluble (lane S) at 17°C,
HLH24-82-L insoluble (lane P) at 30°C, HLH24-82-L soluble (lane S) at 30°C, HLH24-82-L insoluble
(lane P) at 17°C, HLH24-82-L soluble (lane S) at 17°C, N-HLH113 insoluble (lane P) at 30°C, N-
HLH113 soluble (lane S) at 30°C, N-HLH113 insoluble (lane P) at 17°C, N-HLH113 soluble (lane S)
at 17°C. Expressed only in insoluble fraction for His6 tag
(B) Expressed only in insoluble fraction for His6-Trx tag
(C) Expressed only in insoluble fraction for His6-MBP tag
(D) Red arrow shows soluble fraction of ID2 (HLH24-82-L) induced at 17°C for His6-GST tag
(E) No expression with His6-NusA tag
terminal polypeptide stabilizer were subsequently used for large-scale protein
expression and purification.
Figure 7: Stability of HLH24-82-L containing polypeptide stabilizer over 6 days at room
temperature (25°C). SDS-PAGE 12% gel: marker (lane M), Day 0 (lane 1), Day 1 (lane 2), Day 3
(lane 3), Day 6 (lane 4)
3.2 Protein Expression and Purification
Optimal expression conditions were found to be Luria broth with induction by 0.2 mM
IPTG for 18 h at 17°C, using the expression vector pDest-565, which carries an N-terminal
His6-GST-TEV tag, for each of the inserts HLH24-82-L and N-HLH82-L.
HLH24-82-L was used in a seleno-methionine replacement experiment for
anomalous dispersion. Typical yields ranged between 1.5 – 2 mg of pure ID2 per litre
of bacterial culture. The chromatography profiles of all 3 constructs were virtually
identical, so a representative set of profiles is shown in Figure 8 (A, C, E). Affinity
chromatography using the His6 tag to capture the ID2 fusion protein was
performed and the eluate was immediately desalted (Figure 8A). All fractions were pooled (Figure 8
B, lane 3; G, lane 2; J, lane 3) and the tag cleaved off with TEV (Figure 8 B, lane 4; G,
lane 3; J, lane 4). The resulting mixture was used as the starting material for ion
exchange chromatography (Figure 8C) to remove the tag from the protein of interest