MINIREVIEW
Deciphering enzymes
Genetic selection as a probe of structure and mechanism
Kenneth J. Woycechowsky and Donald Hilvert
Laboratorium für Organische Chemie, Swiss Federal Institute of Technology, ETH-Hönggerberg, Zürich, Switzerland
The efficient engineering of enzymes with novel activities
remains an ongoing challenge. Towards this end, genetic
selection techniques provide a method for finding rare
solutions to catalytic problems that requires only a limited
foreknowledge of structure–function relationships. We
have used genetic selections to extensively probe the
structure and mechanism of chorismate mutases. The
insights gained from these investigations will aid future
enzyme design efforts.
Keywords: chorismate mutase; functional selection; protein
engineering; protein folding.
Introduction
The incredible catalytic power of enzymes is well-documen-
ted [1,2], but its source remains elusive. Enzymes catalyze
a vast array of reactions with high specificity, under mild
conditions [3]. These properties make enzymes potentially
useful for organic synthesis [4,5]. Still, our current under-
standing of protein structure–function relationships remains
insufficient for the de novo design of enzymes with tailored
catalytic activities [6].
of their detection. In principle, any enzyme activity can be
selected for in vivo, provided that catalysis of the desired
reaction can be linked to cell growth.
One general strategy for in vivo enzyme selection is the
introduction of a metabolic requirement for the desired
activity. A genetic selection system for chorismate mutase
(CM) activity provides an example of this strategy (Fig. 1)
[10]. CMs catalyze the Claisen rearrangement of chorismate
to prephenate, which is the first committed step in the
biosynthesis of phenylalanine and tyrosine [11]. In this
system, a strain of Escherichia coli was engineered in which
the genes encoding the bifunctional CM–prephenate dehydratase
and CM–prephenate dehydrogenase protein complexes
were replaced by genes encoding monofunctional
versions of the dehydratase and the dehydrogenase. The
growth of this strain on minimal media lacking phenyl-
alanine and tyrosine requires an added source of CM
activity. This source can be provided by transformation
with a plasmid carrying a gene encoding the enzyme. This
selection system has been used to reveal structural and
mechanistic requirements for enzyme catalysis of this
reaction.
Selection for restructured enzymes
Catalysis requires the fulfilment of exacting structural
criteria; only properly folded proteins are active. Protein
folding is dictated by amino acid sequence [12]. In an
ensemble of proteins composed from the standard set of 20
amino acids with completely random sequences, however,
the chance of encountering a significantly structured
Correspondence to D. Hilvert, Laboratorium fu
opportunities and great challenges for enzyme engineering.
Genetic selections of Methanococcus jannaschii CM
(MjCM) [14] were used to assess the tolerance of a protein
fold to secondary structures of varying sequence [15].
MjCM belongs to the AroQ class of CMs whose members
adopt a homodimeric, α-helical bundle fold (Fig. 2) [16].
Each monomer consists of three α-helices and two turns.
Three libraries were constructed and subjected to in vivo
selection for CM activity (Fig. 3): first, the N-terminal helix
alone was randomized, secondly, the two C-terminal helices
were randomized simultaneously, and finally, positives from
the first two libraries were randomly combined to give
proteins whose sequences had been varied over all three
Fig. 1. Selection system for chorismate mutase activity in Escherichia coli. An E. coli strain (KA12) was engineered in which the genes encoding the
bifunctional enzymes chorismate mutase–prephenate dehydratase and chorismate mutase–prephenate dehydrogenase were deleted. Monofunctional
versions of the dehydratase and the dehydrogenase are provided by plasmid pKIMP-UAUC. Random gene libraries are introduced into this strain
and the ability of a cell harboring an individual library member to form a colony on minimal agar media lacking added phenylalanine and tyrosine
reports on the chorismate mutase activity of the encoded protein [10].
Fig. 2. Structure of AroQ chorismate mutases. E. coli chorismate
mutase, the prototypical AroQ chorismate mutase, is shown in a
ribbon diagram representation [16]. AroQ chorismate mutases form
homodimers of intimately entwined α-helices. The three helices of one
subunit are indicated. A transition state analog inhibitor is bound in
each active site and is represented in ball-and-stick form.
Fig. 3. Design of the three binary-patterned libraries of Methanococcus
jannaschii chorismate mutase. The residues within the secondary
structural elements of M. jannaschii chorismate mutase were changed
to a random distribution of only eight different amino acids: four polar
and four hydrophobic [15]. An individual sequence position was ran-
domised using the four amino acid set of similar polarity to the wild
complementation rate of library 3. In this library, the
sequences of helix 1 (alone) and helices 2 and 3 (together) were
each functional in a context where C- and N-terminal halves
of the protein, respectively, had the wild type sequence. The
successful packing of these preselected segments against each
other required an equally extensive search of sequence space
as did the selection for proper folding of the initially
randomized helices with the wild type template. A sequential
strategy of randomizing helix 1 first and then randomizing
helices 2 and 3 of an active variant from the first library (or
vice versa) might prove more efficient than the convergent
library approach outlined in Fig. 3.
In this study of MjCM, about 80% of the protein sequence
was subjected to randomization. Functional enzymes were
found with less than 50% sequence identity to the wild type.
While active catalysts were rare in these libraries, their
presence demonstrates the ability of this protein fold to
tolerate extensive substitutions. Harnessing this structural
plasticity should be advantageous for enzyme redesign.
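To make binary patterning and sequence identity concrete, the following minimal Python sketch randomizes chosen positions of a template, drawing each replacement from a four-residue set that matches the polarity of the template residue, and then reports the percent identity to the template. The residue sets, the polarity classification, the template string and all function names are illustrative assumptions; they are not the sets or sequences used in [15].

```python
import random

# Hypothetical four-residue sets; the sets actually used in [15] are not given here.
POLAR = "DENK"
NONPOLAR = "FILM"
# Residues treated as nonpolar when classifying the template (an assumption).
HYDROPHOBIC = set("AVILMFWC")


def binary_pattern_variant(template, positions, rng):
    """Return a copy of `template` in which each listed position is replaced by a
    random residue from the set whose polarity matches the template residue."""
    seq = list(template)
    for i in positions:
        pool = NONPOLAR if template[i] in HYDROPHOBIC else POLAR
        seq[i] = rng.choice(pool)
    return "".join(seq)


def percent_identity(a, b):
    """Percentage of positions with identical residues (equal lengths, no gaps)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    template = "MIEKLAEIRKKIDEIDNKILK"  # made-up helix-like sequence
    positions = range(2, len(template))  # leave the first two residues untouched
    variant = binary_pattern_variant(template, positions, rng)
    print(variant, round(percent_identity(template, variant), 1))
```

Because each randomized position can still regenerate the template residue by chance when that residue belongs to the chosen set, the identity of a variant to the template is usually well above zero even when most positions are randomized.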
Examination of the selected sequences revealed that some
positions are more important than others and thus show
stronger preferences for one particular residue. For example,
Ile14, Asn84 and Lys85 are all highly conserved in
the active variants. This lack of permissiveness is perhaps
unsurprising given that these residues probably contact the
substrate and transition state during catalysis; active site
sequences tend to be highly conserved. Additionally, Asp15
and Asp18 were relatively nonpermissive. While they
probably do not directly contact the substrate or transition
tripeptide sequences were functional. When Lys64, the
solvent-exposed C-terminal residue of helix 2, was included
in the randomization, the fraction of functional sequences
dropped to 50%, but all four residues showed similar, high
tolerances to substitution. Despite this high permissiveness,
and in contrast to a previous study on the sequence
requirements for a turn in cytochrome b₅₆₂ [25], a close
examination of the sequence data showed a subtle, but
strong, bias for hydrophilic amino acids in these positions.
This bias may have gone undetected in cytochrome b₅₆₂
because that study, which found a similar low stringency for
an interhelical turn sequence, relied on an assay for structure
that was probably less sensitive than functional selection.
The thermodynamic benefit resulting from minimizing the
water accessible surface area of hydrophobic residues placed
at these solvent-exposed positions may lead to aggregation
or to local conformational disruptions.
Fig. 4. The turn between helices 2 and 3 in E. coli chorismate mutase.
Random mutagenesis of Lys64, Ala65, His66 and His67 followed by
selection for chorismate mutase activity showed that these solvent
exposed positions are highly permissive. In contrast, a similar experi-
ment including Leu68, which is buried, instead of Lys64 produced
far fewer complementing sequences. Apparently, tertiary contacts
necessitate a hydrophobic amino acid at position 68, preferably one
with a branched aliphatic side chain [24].
1632 K. J. Woycechowsky and D. Hilvert (Eur. J. Biochem. 271) © FEBS 2004
A markedly different result was obtained when Leu68,
library into the CM selection system followed by screening of
the positives using size-exclusion chromatography allowed
the identification of a monomeric variant of MjCM that
retained nearly 30% of the wild type activity (Fig. 5) [27].
Statistical analysis indicated that < 0.05% of the sequences
produced well-behaved monomers, a surprisingly small
fraction given the broad sequence tolerance of interhelical
turn sequences noted above. The tertiary structural context
may impose considerable constraints on this turn sequence.
Genetic selection has proved useful in generating other
changes in CM quaternary structure. In a similar strategy to
that described above, a randomized sequence of four to
seven residues was inserted between Ala23 and Leu24 in the
N-terminal helix of the mesostable EcCM (Fig. 5) [28].
Selection of these libraries showed that functional turn
sequences were again rare, giving complementation rates of
< 0.5% in all cases.
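A complementation rate of this kind is simply the fraction of library members that support growth under selective conditions. The bookkeeping can be sketched as follows, assuming colony counts from equal platings of the same transformation on selective and nonselective media; the function name and the counts are hypothetical and are not data from [28].

```python
def complementation_rate(selective_colonies, nonselective_colonies):
    """Fraction of transformants that grow on selective medium.

    Assumes equal volumes of the same transformation were plated on
    selective and nonselective media (an illustrative simplification).
    """
    if nonselective_colonies <= 0:
        raise ValueError("need at least one colony on the nonselective plate")
    return selective_colonies / nonselective_colonies


if __name__ == "__main__":
    # Hypothetical counts: 12 colonies under selection, 4000 without.
    rate = complementation_rate(12, 4000)
    print(f"complementation rate: {rate:.2%}")
```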
While EcCM variants with four or seven amino acid
insertions gave unstable monomers that were prone to
precipitation, a five amino acid insertion surprisingly gave
a stable, well-behaved hexamer [28]. The sequence of the
insertion was nonpolar, suggesting that oligomerization
through hydrophobic interactions may be an easy way to
increase enzyme stability. This hexameric variant, however,
suffered a 200-fold decrease in catalytic efficiency. In
contrast, the unstable monomeric variants had near wild
type activity. Over the limited area of sequence space
covered by these libraries, there may be a trade-off between
protein stability and catalytic activity.
The AroQ CMs can retain function despite large changes
static interactions in catalysis by Bacillus subtilis CM
(BsCM).
BsCM is a member of the AroH class of CMs. This class
adopts a trimeric, pseudo-α/β barrel fold [38] (Fig. 6). AroH
and AroQ CMs share some common active site features.
For example, in the crystal structures with an oxabicyclic
transition state analog (TSA), both enzymes show multiple
cationic groups (Arg and Lys) interacting with the carb-
oxylates and the ether oxygen [16,38]. Additionally, both
Fig. 5. Topological rearrangement of dimeric AroQ chorismate mutase
into a monomer. Insertion of a flexible loop into helix 1, which spans
the dimer, allows the N-terminal portion of the helix to bend back on
itself and thus form a complete active site within a monomeric four-
helix bundle. The insertion site is indicated by a horizontal red line.
possess a Glu residue that hydrogen bonds to the hydroxyl
group of the TSA. Despite their different folds, both
enzymes are likely to utilize similar catalytic mechanisms.
The transition state for the chorismate mutase reaction is
highly polarized [39]. In the structure of BsCM complexed
with TSA [38], Arg90 seems poised to stabilize developing
negative charge during the C-O bond cleavage (Fig. 7).
An R90A variant exhibits a more than 10⁶-fold decrease in
activity [40].
To further assess the role of this residue in catalysis,
libraries were constructed in which both Arg90 alone and
Arg90 and Cys88 together were randomized [10]. Selection
revealed that, when the rest of the protein sequence is held
-fold less active
than wild type [40]. To examine its role in catalysis, Glu78
was randomized alone and together with Cys75 [42]. Unlike
the strict requirement for Arg90, several other residues were
able to directly replace Glu78, although the selection
produced a bias for residues capable of hydrogen bonding.
Interestingly, Asp was unable to substitute for Glu78,
providing a further indication of the subtle interactions that
dictate active site structure and function. When positions 75
and 78 were varied in tandem, however, an Asp at position
75 was able to substitute for Glu78, provided Ala, Ser, Met
or Val was present at position 78. As functional solutions
lacking an anion were found, the interaction of Glu78 with
the transition state carbocation is not clear-cut. The crucial
role of Glu78 may be to orient the substrate through a
hydrogen bond with the hydroxyl group of chorismate
[42a,42b].
Enzyme catalysis is a dynamic process. Yet, the import-
ance of highly mobile, crystallographically unresolved
residues is often overlooked. At the C-terminus of BsCM,
residues 111–115 adopt a 3₁₀ helix and the following
11 residues have poorly defined structure (Fig. 6). This
C-terminal tail lies close to the entrance of the substrate
binding pocket and therefore may be important for
catalysis. In the absence of structural information, however,
it is difficult to postulate functional roles for individual
residues. To help provide a functional definition for these
residues, libraries of BsCM variants were constructed using
cat, but significant increases in Kₘ, which precludes their direct participation in catalysis.
Instead, these residues probably contribute to catalytic
efficiency via uniform binding of the substrate and trans-
ition state.
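The kinetic reasoning can be made explicit with the standard Michaelis–Menten expressions. These are textbook relations stated under the usual steady-state assumptions, not a derivation taken from [43]; the factor $f$ and the subscripts "wt" and "var" are illustrative symbols, and the turnover number is assumed to be unchanged in the variant.

```latex
\begin{align}
  v &= \frac{k_{\mathrm{cat}}\,[E]_0\,[S]}{K_{\mathrm{m}} + [S]} \\
  v &\approx \frac{k_{\mathrm{cat}}}{K_{\mathrm{m}}}\,[E]_0\,[S]
      \qquad \text{for } [S] \ll K_{\mathrm{m}} \\
  \left(\frac{k_{\mathrm{cat}}}{K_{\mathrm{m}}}\right)_{\mathrm{var}}
    &= \frac{k_{\mathrm{cat}}}{f\,K_{\mathrm{m}}}
     = \frac{1}{f}\left(\frac{k_{\mathrm{cat}}}{K_{\mathrm{m}}}\right)_{\mathrm{wt}},
      \qquad f > 1
\end{align}
```

Under these assumptions, an $f$-fold increase in $K_{\mathrm{m}}$ at constant $k_{\mathrm{cat}}$ lowers the catalytic efficiency $k_{\mathrm{cat}}/K_{\mathrm{m}}$ by the same factor.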
The versatility of functional selection
So far, we have focused on genetic selection of CMs. Other
selection systems have also proved useful for investigating
enzyme structure and mechanism, and have been recently
reviewed elsewhere [8]. Expanding the lessons learned with
CMs, a few recent examples of selections with eight-stranded
β/α-barrel [or triose phosphate isomerase (TIM)
barrel] enzymes have examined the structural requirements
for this fold and the active site differences that separate
members of an enzyme superfamily.
The TIM barrel is the most frequently encountered
enzyme fold [44], and its natural catalytic versatility demon-
strates its enormous potential for enzyme engineering. The
robustness of triose phosphate isomerase (the prototypical
TIM barrel enzyme) to substitutions was examined by
combinatorial mutagenesis and selection for activity using a
TIM-deficient strain of E. coli [45]. In this experiment, 182
residues outside of the TIM active site were mutated to one of
seven amino acids (using binary polar/nonpolar patterning
similar to that described above for MjCM) and introduced
into the selection system. Analysis of complementing
sequences shows that, while most individual sequence
3
-fold
lower catalytic efficiency compared with E. coli OSBS.
Interestingly, this variant retains residual activity for the
MLE II reaction and may therefore resemble a catalytically
promiscuous intermediate of a natural divergent evolution-
ary process.
Both MLE II and OSBS are members of the TIM barrel-
containing enolase superfamily, and therefore both enzymes
catalyze a common chemical step during catalysis of their
respective reactions (Fig. 8) [49]. The gain of function seen
for the MLE II variants, which can be considered as an
extreme case of changing substrate specificity, still represents
the first successful interconversion of catalytic activities
within the well-characterized enolase superfamily. This
result extends prior work that used random mutagenesis
and selection to change substrate specificity without chan-
ging the overall reaction [50].
A rationally designed variant of L-Ala-D/L-Glu epimerase
(a third member of the enolase superfamily, Fig. 8),
containing a mutation (D297G) analogous to that of the
E223G MLE II, also exhibited measurable OSBS activity,
albeit 100-fold lower than that of the selected MLE II
variant [48]. The generality of this single mutation in
mutations were identified that led to a 150-fold increase in
aztreonam resistance. Two of these mutations matched
those found in clinical isolates. Such rapid laboratory
evolution may be useful to better anticipate the natural
evolution of bacterial antibiotic resistance. Hydrolysis of
aztreonam requires a change in the substrate specificity of
TEM-1 β-lactamase. The chance of finding the three
mutations that effected this change was estimated at one
in 10¹⁰. Genetic selection was used to beat these odds and
find a functional active site with altered structure.
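The scale of such odds can be illustrated with simple counting. The sketch below counts the distinct ways of introducing a given number of point substitutions into a gene. The gene length of 900 nucleotides is a round, hypothetical figure, and this counting is not necessarily how the estimate quoted above was derived.

```python
from math import comb


def count_point_mutants(gene_length, n_mutations, alternatives_per_site=3):
    """Number of distinct variants carrying exactly `n_mutations` substitutions.

    First choose which sites are mutated (a binomial coefficient), then choose
    one of the alternative bases at each mutated site (3 for nucleotides).
    """
    return comb(gene_length, n_mutations) * alternatives_per_site ** n_mutations


if __name__ == "__main__":
    # Hypothetical gene of 900 nucleotides with three simultaneous substitutions.
    print(f"{count_point_mutants(900, 3):.2e}")
```

With these assumed inputs the count runs to a few billion variants, which conveys why sampling multiply mutated sequences exhaustively becomes impractical so quickly.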
The power of selection is undeniable. Depending on the
application, however, screening can also be useful for
identifying enzymes with novel activities. High-throughput
technology is increasing the size of libraries that can be
thoroughly screened, providing an ever more appealing
alternative to selection [52,53].
Conclusions and outlook
Enzyme function is the product of multiple, subtle inter-
actions within the protein structure. Functional selection of
randomized libraries provides a general, sensitive and
efficient probe of these interactions. The use of selection
techniques with CMs has allowed us to explore the limits of
structure and function for these enzymes.
Although we have learned much from the studies
described above, questions remain. CM, TIM and OSBS
all catalyze reactions with fairly high background rates [2].
Would the complementation rates found in the studies with
these enzymes be lower if the uncatalyzed reactions were
and the power of enzymes as catalysts. Acc. Chem. Res. 34,
938–945.
3. Walsh, C. (2001) Enabling the chemistry of life. Nature 409,
226–231.
4. Koeller, K.M. & Wong, C.-H. (2001) Enzymes for chemical
synthesis. Nature 409, 232–240.
5. Schmid, A., Dordick, J.S., Hauer, B., Kiener, A., Wubbolts, M.
& Witholt, B. (2001) Industrial biocatalysts for today and
tomorrow. Nature 409, 258–268.
6. DeGrado, W.F., Summa, C.M., Pavone, V., Nastri, F. & Lom-
bardi, A. (1999) De novo design and structural characterization of
proteins and metalloproteins. Annu. Rev. Biochem. 68, 779–819.
7. Arnold, F.H. (1998) Design by directed evolution. Acc. Chem.
Res. 31, 125–131.
8. Taylor, S.V., Kast, P. & Hilvert, D. (2001) Investigating and
engineering enzymes by genetic selection. Angew. Chem. Int. Ed.
40, 3311–3335.
9. Kast, P. & Hilvert, D. (1997) 3D structural information as a guide
to protein engineering using genetic selection. Curr. Opin. Struct.
Biol. 7, 470–479.
10. Kast, P., Asif-Ullah, M., Jiang, N. & Hilvert, D. (1996) Exploring
the active site of chorismate mutase by combinatorial mutagenesis
and selection: the importance of electrostatic catalysis. Proc.
Natl Acad. Sci. USA 93, 5043–5048.
11. Haslam, E. (1993) Shikimic Acid: Metabolism and Metabolites.
Wiley, New York.
12. Anfinsen, C.B. (1973) Principles that govern the folding of pro-
tein chains. Science 181, 223–230.
13. Bowie, J.U., Reidhaar-Olson, J.F., Lim, W.A. & Sauer, R.T.
(1990) Deciphering the message in protein sequences: tolerance to
24. Macbeath, G., Kast, P. & Hilvert, D. (1998) Exploring sequence
constraints on an interhelical turn using in vivo selection for
catalytic activity. Protein Sci. 7, 325–335.
25. Brunet, A.P., Huang, E.S., Huffine, M.E., Loeb, J.E., Weltman,
R.J. & Hecht, M.H. (1993) The role of turns in the structure of an
α-helical protein. Nature 364, 355–358.
26. Bennett, M.J., Schlunegger, M.P. & Eisenberg, D. (1995) 3D
domain swapping: a mechanism for oligomer assembly. Protein
Sci. 4, 2455–2468.
27. Macbeath, G., Kast, P. & Hilvert, D. (1998) Redesigning enzyme
topology by directed evolution. Science 279, 1958–1961.
28. Macbeath, G., Kast, P. & Hilvert, D. (1998) Probing enzyme
quaternary structure by mutagenesis and selection. Protein Sci. 7,
1757–1767.
29. Addadi, L., Jaffe, E.K. & Knowles, J.R. (1983) Secondary tritium
isotope effects as probes of the enzymic and nonenzymic
conversion of chorismate to prephenate. Biochemistry 22, 4494–
4501.
30. Copley, S.D. & Knowles, J.R. (1985) The uncatalyzed Claisen
rearrangement of chorismate to prephenate prefers a transition
state of chairlike geometry. J. Am. Chem. Soc. 107, 5306–5308.
31. Guilford, W.J., Copley, S.D. & Knowles, J.R. (1987) On the
mechanism of the chorismate mutase reaction. J. Am. Chem. Soc.
109, 5013–5019.
32. Lyne, P.D., Mulholland, A.J. & Richards, W.G. (1995) Insights
into chorismate mutase catalysis from a combined QM/MM
simulation of the enzyme reaction. J. Am. Chem. Soc. 117, 11345–
11350.
33. Kienhö
41. Kast, P., Grisostomi, C., Chen, I.A., Li, S., Krengel, U., Xue, Y.
& Hilvert, D. (2000) A strategically positioned cation is crucial
for efficient catalysis by chorismate mutase. J. Biol. Chem. 275,
36832–36838.
42. Kast, P., Hartgerink, M., Asif-Ullah, M. & Hilvert, D. (1996)
Electrostatic catalysis of the Claisen rearrangement: probing the
role of Glu78 in Bacillus subtilis chorismate mutase by genetic
selection. J. Am. Chem. Soc. 118, 3069–3070.
42a. Worthington, S.E., Roitberg, A.E. & Krauss, M. (2001) An
MD/QM study of the chorismate mutase-catalyzed Claisen
rearrangement reaction. J. Phys. Chem. B 105, 7087–7095.
42b. Ranaghan, K.E., Ridder, L., Szefczyk, B., Sokalski, W.A.,
Hermann, J.C. & Mulholland, A.J. (2004) Transition state sta-
bilization and substrate strain in enzyme catalysis: ab initio
QM/MM modelling of the chorismate mutase reaction. Org.
Biomol. Chem. 2, 968–980.
43. Gamper, M., Hilvert, D. & Kast, P. (2000) Probing the role of the
C-terminus of Bacillus subtilis chorismate mutase by a novel
random protein termination strategy. Biochemistry 39, 14087–
14094.
44. Reardon, D. & Farber, G.K. (1995) The structure and evolution
of α/β barrel proteins. FASEB J. 9, 497–503.
45. Silverman, J.A., Balakrishnan, R. & Harbury, P.B. (2001)
Reverse engineering the (β/α)₈ barrel fold. Proc. Natl Acad. Sci.
USA 98, 3092–3097.
46. Höcker, B., Ju
error-prone DNA polymerase I. Proc. Natl Acad. Sci. USA 100,
9727–9732.
52. Olsen, M., Iverson, B. & Georgiou, G. (2000) High-throughput
screening of enzyme libraries. Curr. Opin. Biotechnol. 11, 331–
337.
53. Arnold, F.H. (2001) Combinatorial and computational chal-
lenges for biocatalyst design. Nature 409, 253–257.
54. Taylor, E.A., Palmer, D.R.J. & Gerlt, J.A. (2001) The lesser
‘burden borne’ by o-succinylbenzoate synthase: an ‘easy’ reaction
involving a carboxylate carbon acid. J. Am. Chem. Soc. 123,
5824–5825.
55. Looger, L.L., Dwyer, M.A., Smith, J.J. & Hellinga, H.W. (2003)
Computational design of receptor and sensor proteins with novel
functions. Nature 423, 185–190.