Directed Evolution of Selective Enzymes - Manfred T. Reetz - E-Book

Manfred T. Reetz

Description

Authored by one of the world's leading organic chemists, this authoritative reference provides an overview of basic strategies in directed evolution and introduces common gene mutagenesis, screening and selection methods.
Throughout the text, emphasis is placed on methodology development to maximize efficiency, reliability and speed of the experiments and to provide guidelines for efficient protein engineering. Professor Reetz highlights the application of directed evolution experiments to address limitations in the field of enzyme selectivity, substrate scope, activity and robustness. He critically reviews recent developments and case studies, takes a look at future applications in the field of organic synthesis, and concludes with lessons learned from previous experiments.

Page count: 600

Year of publication: 2016

Table of Contents

Cover

Title Page

Copyright

Preface

Chapter 1: Introduction to Directed Evolution

1.1 General Definition and Purpose of Directed Evolution of Enzymes

1.2 Brief Account of the History of Directed Evolution

1.3 Applications of Directed Evolution of Enzymes

References

Chapter 2: Selection versus Screening in Directed Evolution

2.1 Selection Systems

2.2 Screening Systems

2.3 Conclusions and Perspectives

References

Chapter 3: Gene Mutagenesis Methods

3.1 Introductory Remarks

3.2 Error-Prone Polymerase Chain Reaction (epPCR) and Other Whole-Gene Mutagenesis Techniques

3.3 Saturation Mutagenesis: Away from Blind Directed Evolution

3.4 Recombinant Gene Mutagenesis Methods

3.5 Circular Permutation and Other Domain Swapping Techniques

3.6 Solid-Phase Combinatorial Gene Synthesis for Library Creation

3.7 Computational Tools

References

Chapter 4: Strategies for Applying Gene Mutagenesis Methods

4.1 General Guidelines

4.2 Rare Cases of Comparative Studies

4.3 Choosing the Best Strategy when Applying Saturation Mutagenesis

4.4 Techno-Economical Analyses of Saturation Mutagenesis Strategies

4.5 Combinatorial Solid-Phase Gene Synthesis: An Alternative for the Future?

References

Chapter 5: Selected Examples of Directed Evolution of Enzymes with Emphasis on Stereo- and Regioselectivity, Substrate Scope, and/or Activity

5.1 Explanatory Remarks

5.2 Collection of Selected Examples from the Literature 2010 up to 2016

References

Chapter 6: Directed Evolution of Enzyme Robustness

6.1 Introduction

6.2 Application of epPCR and DNA Shuffling

6.3 B-FIT Approach

6.4 Iterative Saturation Mutagenesis (ISM) at Protein–Protein Interfacial Sites for Multimeric Enzymes

6.5 Ancestral and Consensus Approaches and their Structure-Guided Extensions

6.6 Computationally Guided Methods

References

Chapter 7: Directed Evolution of Promiscuity: Artificial Enzymes as Catalysts in Organic Chemistry

7.1 Introductory Background Information

7.2 Tuning the Catalytic Profile of Promiscuous Enzymes by Directed Evolution

7.3 Conclusions and Perspectives

References

Chapter 8: Learning from Directed Evolution

8.1 Background Information

8.2 Case Studies Featuring Mechanistic, Structural, and/or Computational Analyses of the Source of Evolved Stereo- and/or Regioselectivity

8.3 Additive versus Non-additive Mutational Effects in Fitness Landscapes

References

Index

End User License Agreement

List of Illustrations

Chapter 1: Introduction to Directed Evolution

Scheme 1.1 The basic steps in directed evolution of enzymes. The rectangles represent 96-well microtiter plates that contain enzyme variants, the red dots symbolizing hits.

Scheme 1.2 Pedigree of ebgA alleles in evolved strains [13b]. Strain 1B1 carries the wild type allele, ebgAO. Strains on line one have a single mutation in the ebgA gene; those in line two have two mutations in ebgA; those in line three have three mutations in ebgA. All strains are ebgR. Strains enclosed in rectangles were selected for growth on lactose; those enclosed in diamonds were selected for growth on lactulose; those in circles were selected for growth on lactobionate. This pedigree shows only the descent of the ebgA gene; that is, strains SJ-17, A2, 5A2, and D2 were not derived directly from 1B1, but their ebgA alleles were derived directly from the ebgA allele carried in 1B1.

Scheme 1.3 Logic of Darwinian evolution in the laboratory according to Eigen and Gardiner [17].

Scheme 1.4 Early example of directed evolution of thermostability with kanamycin nucleotidyltransferase (KNT) serving as the enzyme and a mutator strain as the random mutagenesis technique in an iterative manner [22].

Scheme 1.5 Mixed oligonucleotide mutagenesis of the gene MATa1 from Saccharomyces cerevisiae [25].

Figure 1.1 Catalytic efficiency of WT subtilisin E and variant PC3 as catalysts in the hydrolytic cleavage of N-succinyl-L-Ala-L-Ala-L-Pro-L-Met-p-nitroanilide [41].

Scheme 1.6 Steps in the recombinant technique of splicing by overlap extension (SOE), illustrated here using two different genes [43].

Scheme 1.7 DNA shuffling starting from a single gene encoding a given enzyme.

Scheme 1.8 Concept of directed evolution of stereoselective enzymes with (R)- or (S)-selective mutants being accessible on an optional basis [69].

Scheme 1.9 Hydrolytic kinetic resolution of rac-1 catalyzed by the lipase from Pseudomonas aeruginosa (PAL) [69a].

Scheme 1.10 First example of directed evolution of a stereoselective enzyme [69a]. The model reaction involves the hydrolytic kinetic resolution of rac-1 catalyzed by the lipase PAL, four rounds of epPCR being used as the gene mutagenesis method.

Chapter 2: Selection versus Screening in Directed Evolution

Scheme 2.1 Screening versus selection in directed evolution [2].

Scheme 2.2 Chemistry involved in the directed evolution of N-acyl amino acid racemase (NAAAR) with the aim of increasing its activity for dynamic kinetic resolution of amino acids [9].

Figure 2.1 Agar plate harboring 96 E. coli colonies in the presence of 8 mM of an epoxide after 8 days of incubation [11]. The four spots (colonies) indicate the presence of active epoxide hydrolase mutants.

Scheme 2.3 Genetic selection in the directed evolution of enantioselective enzymes [2].

Scheme 2.4 Concept of growth-based selection method employing pro-antibiotic substrates [12].

Scheme 2.5 A genetic selection system for directed evolution of enantioselective enzymes in kinetic resolution [13].

Scheme 2.6 Genetic selection system utilizing a pseudo-racemate (S)-1/(R)-4 in the CALB-catalyzed hydrolytic kinetic resolution [13]. (Note that the designation of absolute configuration upon going from (S)-1 to (R)-2 or from (R)-4 to (S)-2 switches according to the priority rules of the CIP convention.)

Scheme 2.7 LipA-catalyzed hydrolytic kinetic resolution of rac-6 [14]. (Notice the switch in the designation of absolute configuration due to a change in priority of the substituents in accord with the Cahn–Ingold–Prelog (CIP) nomenclature.)

Scheme 2.8 Compounds used in devising a selection system for the evolution of S-selective LipA variants as catalysts in the hydrolytic kinetic resolution of rac-9 [14].

Scheme 2.9 Dual selection system based on the use of phosphonate inhibitors immobilized on the solid carrier SIRAN [17b].

Scheme 2.10 HRP-catalyzed radical polymerization of L- or D-tyrosine and Alexa-488 derivatives [18].

Scheme 2.11 Model hydrolytic kinetic resolution catalyzed by the esterase EstA and used in FACS-based assessment of enantioselectivity [19].

Scheme 2.12 Schematic representation of coupling reactions ensuring covalent attachment of tyramide species on the surface of E. coli cells [19]; E, esterase; P, peroxidase.

Scheme 2.13 Differentially labeled enantiomers of 2-MDA tyramide ester substrates (S)-14 and (R)-15 [19].

Figure 2.2 FACS-based high-throughput analysis of enantioselectivity [19]. (a) Overlay of flow-cytometry analyses of esterase-displaying cells that were incubated for 60 min with either the (S)- or the (R)-enantiomer of tyramide ester. (b) EstA library sort. The green window indicates the sorting gate. (c, d) FACS histogram of WT EstA (c) and clone 2-R-43 (d) after 5 min incubation with a 1 : 1 mixture of both enantiomeric substrates and fluorescence staining. The inset shows the percentage of cells within the respective green and red gate.

Scheme 2.14 Hydrolytic kinetic resolution utilizing the pseudo-racemate (R)-16/(S)-18 catalyzed by esterase PFE in which an energy source (17) and a cell poison (dibromide 19) are generated [21].

Scheme 2.15 Illustration of gene selection in compartmentalized oil-in-droplet emulsions [25].

Figure 2.3 Ultrahigh-throughput sequence-function mapping based on deep mutational scanning [27g]. (a) General overview of mapping protocol; (b) droplet-based microfluidic screening leading to the recovery of functional sequences from the initial random mutagenesis library; (c) frequency of 3083 amino acid exchanges in the unsorted and sorted glycosidase libraries; and (d) reproducibility of the sequence-function mapping protocol with two independent replicates showing good agreement in amino acid frequencies.

Figure 2.4 (a) Time course of the lipase-catalyzed hydrolysis of two enantiomeric p-nitrophenyl esters (S)- and (R)-11 (Scheme 2.11) separately using WT enzyme with poor stereoselectivity. (b) Time course of the lipase-catalyzed hydrolysis of the two enantiomeric p-nitrophenyl esters 11 using an enzyme variant with enhanced (S)-enantioselectivity [20a].

Figure 2.5 On-plate pre-test for lipase activity based on halos that form upon hydrolysis of tributyrin [31]. White dots represent bacterial colonies harboring active lipases; those having no (clear) black background contain inactive mutants.

Scheme 2.16 Periodate-coupled fluorogenic assay designed for assessing the enantioselectivity of hydrolases [33]. (Badalassi et al. [33]. Reproduced with permission of John Wiley & Sons.)

Scheme 2.17 Enzyme-coupled assay for assessing the activity of lipases or esterases, measurement of the apparent enantioselectivity also being possible when using (R)- and (S)-substrates separately [34a].

Scheme 2.18 Utilization of isotopically labeled pseudo-enantiomers in high-throughput screening of mutant libraries generated by directed evolution [36]. (a) Asymmetric transformation of a mixture of pseudo-enantiomers involving cleavage of the functional groups FG and labeled functional groups FG*. (b) Asymmetric transformation of a mixture of pseudo-enantiomers involving either cleavage or bond formation at the functional group FG; isotopic labeling at R2 is indicated by the asterisk. (c) Asymmetric transformation of a pseudo-meso substrate involving cleavage of the functional groups FG and labeled functional groups FG*. (d) Asymmetric transformation of a pseudo-prochiral substrate involving cleavage of the functional group FG and labeled functional group FG*.

Scheme 2.19 Medium-throughput unit containing two GC instruments and one PC used in screening PAL-mutants generated by directed evolution for enhanced enantioselectivity [40].

Scheme 2.20 General protocol for screening by pooling defined-cell cultures overexpressing enzyme variants [43b]. (a) Pick and inoculate individual colonies. (b) Induce expression of variants. (c) Recover by centrifuging individual cell pellets and combine all eight cell pellets belonging to the same column. (d) Lyse cells and incubate 12 biotransformations per plate by adding the appropriate reagents. (e) Extract product with organic solvent and analyze organic layer by GC; hits will be identified in this step by setting an appropriate threshold. (f) Using as a reference the plate from step (e) return to master plate in step (a) and re-examine the columns of interest, if there is any, by adding reagents to each well separately.
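The pooling logic in the Scheme 2.20 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pooled_screen`, the two thresholds, and the idea that a pooled readout equals the sum of the eight well activities in a column are all hypothetical and are not taken from the book.

```python
def pooled_screen(plate, pool_threshold, well_threshold):
    """Column-pooling screen for one microtiter plate.

    plate: list of rows (8 for a 96-well plate), each a list of
    per-well activities (12 per row). Returns (hits, assays).
    """
    n_rows, n_cols = len(plate), len(plate[0])

    # Steps (c)-(e): one pooled assay per column. The pooled signal is
    # modeled as the sum of the wells in that column (an assumption).
    pooled = [sum(plate[r][c] for r in range(n_rows)) for c in range(n_cols)]
    hit_columns = [c for c, signal in enumerate(pooled) if signal >= pool_threshold]

    # Step (f): go back to the master plate and assay every well of each
    # column of interest separately.
    hits = [(r, c) for c in hit_columns for r in range(n_rows)
            if plate[r][c] >= well_threshold]

    # Total assays: one per column, plus one per well in each hit column.
    assays = n_cols + n_rows * len(hit_columns)
    return hits, assays
```

With a single active well on a 96-well plate, this sketch needs 12 pooled assays plus 8 follow-up assays instead of 96 individual ones.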

Scheme 2.21 Merging efficient mutagenesis strategies for generating smaller but higher quality mutant libraries with increased ee-assay capacity on the basis of multiplexing GC and/or HPLC [47].

Chapter 3: Gene Mutagenesis Methods

Scheme 3.1 Illustration of epPCR [5].

Scheme 3.2 Mutational bias of ep-PCR in the case of the lipase from Bacillus subtilis [11]. The substitution of one nucleotide per codon results in nine new triplets which may encode four to seven different amino acids depending on the type of codon. (a) The example shows that the mutation of the codon AAC coding for asparagine can yield a maximum of seven different amino acids, whereas the mutation of the codon CGA coding for arginine can yield a maximum of four different amino acids. (b) Low frequencies of transversions G → T, C → A, G → C, and C → G result in a further decrease of diversity: for codon AAC, six different amino acid exchanges may occur, and for the GC-rich codon CGA just a single new amino acid exchange is expected. Background color coding: white shows codons that encode new amino acids, gray indicates silent mutations, or the formation of stop codons, and black shows codons that would require the formation of an unfavored basepair exchange (G → T, C → A, G → C, or C → G). The bold letters indicate nucleotides exchanged by ep-PCR.
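The counts quoted in the Scheme 3.2 caption can be reproduced with a short Python sketch based on the standard genetic code. The function name `new_amino_acids` and the `exclude_unfavored` flag are illustrative choices, not terms from the book; the set of unfavored exchanges is the one listed in the caption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Base exchanges described in the caption as occurring at low frequency.
UNFAVORED = {("G", "T"), ("C", "A"), ("G", "C"), ("C", "G")}


def new_amino_acids(codon, exclude_unfavored=False):
    """Amino acids reachable from `codon` by one nucleotide substitution.

    Silent mutations and stop codons are not counted as new amino acids.
    """
    wild_type = CODON_TABLE[codon]
    found = set()
    for pos, old in enumerate(codon):
        for new in BASES:
            if new == old:
                continue
            if exclude_unfavored and (old, new) in UNFAVORED:
                continue
            aa = CODON_TABLE[codon[:pos] + new + codon[pos + 1:]]
            if aa != wild_type and aa != "*":
                found.add(aa)
    return found


for codon in ("AAC", "CGA"):
    print(codon,
          len(new_amino_acids(codon)),
          len(new_amino_acids(codon, exclude_unfavored=True)))
```

Under these assumptions AAC gives seven new amino acids (six once the unfavored exchanges are excluded) and CGA gives four (one with the bias applied), in line with the caption.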

Scheme 3.3 The four basic stages of SeSaM [32a]. Step 1: Generation of a pool of DNA fragments characterized by a random size distribution; step 2: Enzymatic elongation of DNA fragments using the universal base deoxyinosine; step 3: PCR-based full-length gene synthesis using a single-stranded template and a reverse primer which amplifies the new strand; and step 4: Replacement of deoxyinosine by one of the four standard nucleotides by PCR.

Scheme 3.4 Illustration of random insertion/deletion (RID) mutagenesis for the construction of a library of mutant genes [38]. Step 1: (1) The fragment obtained by digesting the original gene with EcoRI and HindIII is ligated to a linker and (2) the product is then digested with HindIII to make a linear dsDNA with a nick in the antisense chain. Step 2: The gene fragment is cyclized with T4 DNA ligase to make a circular dsDNA with a nick in the antisense chain. Step 3: The circular dsDNA is treated with T4 DNA polymerase to produce a circular ssDNA. Step 4: The circular ssDNA is randomly cleaved at single positions by treating with Ce(IV)–EDTA complex. Step 5: The linear ssDNAs, which have unknown sequences at both ends, are ligated to the 5′-anchor and the 3′-anchor, respectively. Step 6: The DNAs that are linked to the two anchors at both ends are amplified by PCR. Step 7: The PCR products are treated with BciVI, leaving several bases from the 5′-anchor, at the 5′-end. The BciVI treatment also deletes a specific number of bases at the 3′-end. Step 8: The digested products are treated with Klenow fragment to make blunt ends and cyclized again with T4 DNA ligase. The products are treated with EcoRI and HindIII, and the fragments are cloned into an EcoRI-HindIII site of modified pUC18 (pUM).

Scheme 3.5 Illustration of saturation mutagenesis based on the QuikChange™ (Stratagene/Agilent) protocol [52].

Scheme 3.6 General illustration of megaprimer PCR [55].

Scheme 3.7 Steps in site-directed mutagenesis by overlap extension PCR which can also be used for randomization at single residues or sites composed of more than one amino acid position [54]. Lines with arrows represent the dsDNA and synthetic oligonucleotides with the arrows indicating the 5′ to 3′ orientation. Small black rectangles denote the site of mutagenesis. Lower-case letters refer to oligos while the PCR products are indicated by pairs of upper-case letters corresponding to the oligo primers which are employed to generate the product. The box represents the intermediate steps at which the denatured fragments anneal at the overlap and are extended 3′ by the DNA polymerase (dotted line). Further PCR amplification occurs by additional primers “a + d.”

Scheme 3.8 Efficient method for saturation mutagenesis useful for cases of difficult-to-amplify templates [56], the scheme showing variation of the antiprimer position. The gene is represented in blue, the vector backbone in gray, and the formed megaprimer in black. In the first stage of the PCR, both the mutagenic primer (positions randomized represented by a red square) and the antiprimer (or another mutagenic primer, shown to the right) anneal to the template and the amplified sequence is used as a megaprimer in the second stage. Finally, the template plasmids are digested using DpnI, and the resulting library is transformed in bacteria. The scheme to the left illustrates the three possible options in the choice of the megaprimer size for a single site randomization experiment. The scheme to the right represents an experiment with two sites being simultaneously randomized.

Scheme 3.9 Illustration of (a) CAST sites comprising randomization sites A, B, C, and so on [66]. (Reetz et al. [66]. Reproduced with permission of John Wiley & Sons.) (b) ISM scheme for 2-, 3-, and 4-site systems involving 2, 6, and 24 upward pathways, respectively [67, 68].
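The pathway counts in the Scheme 3.9 caption (2, 6, and 24) match the number of orders in which n sites can each be visited once, that is n!. A minimal Python sketch, assuming that every ISM pathway visits each randomization site exactly once; the function name `ism_pathways` is illustrative:

```python
from itertools import permutations


def ism_pathways(sites):
    """All orders in which the given randomization sites can be visited once each."""
    return list(permutations(sites))


for sites in ("AB", "ABC", "ABCD"):
    pathways = ism_pathways(sites)
    # Print the count and the first pathway as an example, e.g. A -> B -> C.
    print(len(sites), "sites:", len(pathways), "pathways, e.g.", " -> ".join(pathways[0]))
```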

Figure 3.1 Library coverage calculated for NNK codon degeneracy at sites consisting of one, two, three, four, and five amino acid positions (aas, amino acids) [69a].

Figure 3.2 Library coverage calculated for NDT degeneracy at sites consisting of one, two, three, four, or five amino acid positions [69a].

Figure 3.3 Correlation between oversampling factor Of and percent library coverage [69a].

Figure 3.4 Probabilities of “full coverage” and of discovering at least one of the top k protein variants in variant space as a function of the library size when randomizing sites comprising one, two, three, and four amino acid positions in the case of NNK codon degeneracy [23e].

Figure 3.5 Patrick-Firth versus Nov statistical metrics [73]. Relationship between the expected completeness (i.e., library coverage) algorithm by Patrick and Firth as computed by GLUE-IT, and the concept of finding at least one of the nth-best variants (with a 95% probability) by Nov as computed by TopLib. This mathematical relationship is independent of the number of positions randomized and of the randomization scheme.

Figure 3.6 Screening effort required for different randomization schemes regarding sites composed of two or three amino acid residues [75b]. The choice of codon degeneracy dictates the sampling size for a desired statistical coverage of the library. For a 95% library coverage targeting two amino acid residues (red lines), 3068 samples have to be screened in the case of NNK/S, whereas only 1450 are necessary when applying the 22c-trick (53% lower screening effort). However, if the assumed capacity of medium-throughput systems is limited to 5000 samples, the library coverage drops to 71% when using NNN degeneracy. Similarly, when targeting three amino acid residues (blue lines) and limiting the sample size to 5000 colonies or transformants, the library coverage changes drastically to 38, 14, and 2% in the case of the 22c-trick, NNK/S, and NNN, respectively.
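The screening numbers in the Figure 3.6 caption can be approximated with the usual exponential coverage estimate, coverage = 1 - exp(-T/V), where T is the number of transformants screened and V the number of codon combinations. The sketch below assumes that every codon combination is equally likely and that this approximation is the one behind the caption; the names `library_size`, `coverage`, and `samples_needed` are illustrative.

```python
import math

# Number of distinct codons per randomized position for each scheme.
CODONS_PER_POSITION = {"NNN": 64, "NNK": 32, "22c": 22, "NDT": 12}


def library_size(scheme, positions):
    """Number of codon combinations V for a site with `positions` residues."""
    return CODONS_PER_POSITION[scheme] ** positions


def coverage(scheme, positions, samples):
    """Expected fraction of the library sampled after screening `samples` clones."""
    return 1.0 - math.exp(-samples / library_size(scheme, positions))


def samples_needed(scheme, positions, target=0.95):
    """Clones to screen for the target coverage; T/V is the oversampling factor."""
    return math.ceil(-library_size(scheme, positions) * math.log(1.0 - target))


print(samples_needed("NNK", 2), samples_needed("22c", 2))
for scheme in ("22c", "NNK", "NNN"):
    print(scheme, f"{100 * coverage(scheme, 3, 5000):.1f}%")
```

Under these assumptions the two-residue case reproduces the 3068 and 1450 samples given in the caption, and the coverages at 5000 samples agree with the caption's percentages to within about one percentage point.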

Figure 3.7 Distribution of nucleotide bases in the randomized residue Leu426 of CHMO [75b]. The percentage distribution of nucleotides is shown in pie diagrams for each of the three randomized bases using the 22c-trick (left) and NNK (right) degeneracies. (a) Theoretical expected distribution. (b) Experimental distribution calculated from the sequencing of 89 and 130 individual clones from the 22c-trick and NNK libraries, respectively. (c) Experimental quick quality control from colony pooling. The nucleotide base guanine (G) is depicted in black, adenine (A) in green, thymine (T) in red, and cytosine (C) in blue.

Figure 3.8 Searching sequence space by single-gene shuffling versus family shuffling [78b].

Scheme 3.10 A schematic example of biased mutation-assembling, assuming a basis set of three mutations [94]. The circle, triangle, and square each represent one mutation. A block represents a portion of the gene containing one mutation and represents a recombination unit. The double-headed arrows represent overlapping sequences between adjacent blocks and these overlapping sequences hybridize during PCR recombination.

Scheme 3.11 Illustration of ISOR; the use of biotinylated DNA and purification by capture onto streptavidin-coated beads is optional [96].

Scheme 3.12 General concept of ADO [99] with two strategies for the linking of fragments being possible (cases I and II). In case I the two genes A and B to be virtually shuffled are aligned; the different colored stars refer to information that encoded different amino acids, while oligonucleotide fragments with both colored stars in the same position of the parent gene denote the synthetic oligonucleotide fragment with degenerate nucleotides. The gray blocks denote conserved regions of sequence that can be used as the linking part with homologous recombination. Case II shows no homology between flanking oligos, which can be assembled by ligation between ssDNA with an unknown terminal sequence.

Scheme 3.13 Schematic overview of CALB engineering by circular permutation and subsequent incremental truncation of the newly created surface loop in cp283, the most active variant among the lipase permutants [105e].

Figure 3.9 Model reaction and library design for comparing traditional PCR-based saturation mutagenesis libraries with Slone libraries [73]. (a) Testosterone hydroxylation by P450BM3 mutants. (b) Active site of P450BM3 mutant F87A. The three CAST sites and the F87A residue are highlighted. The structure was modeled by docking computations using the Schrödinger software and the picture was created with PyMol. (c) Diversity design of the combinatorial P450BM3-F87A libraries used in this study. Library A consists of three simultaneously randomized positions, whereas library B and C consist of two. PCR-based libraries use either the nonredundant NDC codon (library A + C) or the redundant NNK codon (library B). (d) Sloning-based libraries encode the same set of amino acids using the displayed codon usages. Gray codons are present in both designs.

Figure 3.10 Screening results comparing PCR with Sloning libraries [73]. Total testosterone conversion (%HPLC) of the six combinatorial libraries is shown as a function of either 15β-OHT or 2β-OHT regioselectivity. Colored entries show the data of the Sloning libraries, while gray entries represent the PCR library results. The green circle highlights a cluster corresponding to parental transformants in PCR libraries.

Scheme 3.14 SCHEMA disruption based upon a contact matrix representing interactions between amino acids in the three-dimensional structure of a protein (illustrated here with a simplified model) [140a]. (a) Disruptions in a simplified model and (b) contact matrix to be adjusted for the sequence identity of the parent enzymes.

Scheme 3.15 Formal representation of ProSAR [142, 143].

Scheme 3.16 Steps when applying ASRA to directed evolution [145].

Figure 3.11 Optimal reordering of the E-value enantioselectivity landscapes with 60 min reaction time [145]. (a) Color heat map for the enantioselectivity landscape (E-values) of 95 randomly sampled mutants plotted with a random amino acid ordering. Each color square represents one mutant with red indicating a high E-value and blue corresponding to a low E-value (see color bar on the far right). White squares are unsampled proteins. (b) E-value landscape of the 95 mutants using the ASRA-identified optimal amino acid ordering. The result predicts that proteins with high E-values are most likely located in the lower right corner. The mutant at position 16/20 (circled in red in both (a) and (b)) of the reordered landscape turned out to be the same as the mutant at position 20/19; the wrong protein was accidentally placed in this position in the experiment. (c) E-value landscape for 45 newly sampled mutants, guided by the ordering in (b). (d) E-value distribution for the 95 initial random mutants. (e) Reordered E-value landscape for the 94 mutants (excluding the erroneous mutant at position 16/20 in (b)). (f) E-value landscape for the 45 newly sampled mutants, based on the ordering in the enantioselectivity factor E.

Chapter 4: Strategies for Applying Gene Mutagenesis Methods

Scheme 4.1 Two choices when attempting to optimize thermostability and activity of an enzyme. (a) Engineer thermostability and then activity. (b) Engineer both thermostability and activity simultaneously.

Scheme 4.2 Preferred approach for the simultaneous optimization of two catalyst properties A and B [3]. Black star indicates the desired variant; blue and green dashed lines, stringent thresholds; blue and green rectangles, relaxed thresholds; blue and green filled circles, best mutant for property A and B, respectively, which are not used in further mutagenesis; red-crossed blue and green circles, variants with improved property A or B; red-crossed black circles, mutants with improved A and B property. Black dashed arrows, second round of mutagenesis.

Scheme 4.3 The strategy of in vitro coevolution (substrate walking) for engineering novel protein functions [15, 16]. The wild-type (WT) protein function and the novel protein function are separated by an inactive region of sequence space, which may be filled by two intermediate functions (I1 and I2) that are amenable to conventional directed evolution. The arrows illustrate a potential evolutionary path leading to the novel protein function.

Scheme 4.4 Model reactions in the directed evolution of a fucosidase from a galactosidase [17, 18].

Scheme 4.5 DNA shuffling process used in the directed evolution of a fucosidase starting from a galactosidase [17].

Figure 4.1 Structure of BGAL active site [19] used as a guide in designing saturation mutagenesis at amino acid positions 201, 540, and 604 [18].

Figure 4.2 Selected BGAL variants resulting from saturation mutagenesis at a site composed of amino acid positions 201, 540, and 604 [18].

Scheme 4.6 Model reaction used in the directed evolution of PAL [13, 20, 22–25].

Scheme 4.7 Alternating saturation mutagenesis at different positions with epPCR in the quest to enhance the enantioselectivity of PAL in the model reaction rac-6 → (S)-7 + 3 [23].

Figure 4.3 Binding pocket of PAL [27] for the acid part of rac-6 (green) showing the geometric position of amino acids 160–163 (blue), which were randomized simultaneously by saturation mutagenesis to enhance enantioselectivity [20]. Ser82 (red), as part of the catalytic triad Asp/His/Ser, attacks the carbonyl function nucleophilically with rate- and stereoselectivity-determining formation of a short-lived oxyanion.

Scheme 4.8 Extended CMCM in the evolution of an (S)-selective variant X in the hydrolytic kinetic resolution of rac-6 [20]. Green star, position 20; purple star, position 161; yellow star, position 234; red circle, position 53; orange circle, position 180; and blue circle, position 272.

Scheme 4.9 Summary of all comparative studies of PAL as a catalyst in the hydrolytic kinetic resolution of rac-6, including the result of the final study based on ISM (far right) [13].

Figure 4.4 Schematic representation of amino acid residues considered for saturation mutagenesis [13], based on the X-ray structure of WT-PAL [27]: sites A (Met16/Leu17, green), B (Leu159/Leu162, blue), and C (Leu231/Val232, yellow) around the active site Ser82 (stick representation in red) in the acid-binding pocket (purple circle). The red circle marks the alcohol-binding pocket, in the case at hand harboring the p-nitrophenyl moiety of rac-6. At the top of the picture, helix and loop in wheat (right, Asp113-Leu156) and light pink (left, Pro203-Asn228) represent lid 1 and lid 2, respectively.

Scheme 4.10 Biocatalytic route to sitagliptin phosphate using a transaminase evolved by applying ISM, epPCR, and DNA shuffling [16].

Scheme 4.11 Model compound (11) used in substrate walking based on in vitro coevolution [16].

Scheme 4.12 Model hydrolytic kinetic resolution of the glycidyl ether rac-15 catalyzed by ANEH [8].

Figure 4.5 CAST sites A–E [8] of the epoxide hydrolase from Aspergillus niger (ANEH) chosen on the basis of the X-ray crystal structure of the WT [36]. (a) Defined randomization sites A (orange), B (blue), C (gray), D (green), and E (yellow). (b) Top view of tunnel-like ANEH binding pocket showing sites A–E (blue) and the catalytically active Asp192 (red).

Figure 4.6 Complete experimental exploration of a 24-pathway ISM system involving the ANEH-catalyzed hydrolytic kinetic resolution of rac-15 (Scheme 4.12). (a) Portion of the 24-pathway ISM scheme showing the 12 best pathways leading to ANEH variants displaying E > 78 (S) and (b) portion of the 24-pathway ISM scheme showing the 12 least productive pathways leading to ANEH variants with E = 28–78 (S) [31].

Figure 4.7 Fitness pathway landscape featuring the 24 trajectories leading from WT ANEH to the respective final variants with enhanced enantioselectivity at the end of each pathway as specified by the respective G values [31]. Solid line: typical pathway in which each mutant library contains at least one variant displaying enhanced enantioselectivity; dotted line: typical pathway in which at least one library is devoid of an improved variant, in which case an inferior mutant was employed in the subsequent ISM step, thereby escaping from the local minimum.

Figure 4.8 Free energy profiles of the 24 ISM pathways in the directed evolution of ANEH as pictured in a front view of the fitness-pathway landscape [31]. In the green pathways all relevant saturation mutagenesis libraries contain improved variants (enhanced enantioselectivity) in the model reaction (Scheme 4.12); the eight red pathways denote those in which at least one library in the step evolutionary process is devoid of any improved variants (local minimum).

Figure 4.9 First derivative of G at every stage of each of the 24 ISM pathways in the directed evolution of ANEH (see Figures 4.7 and 4.8). (a) View from top of fitness pathway landscape and (b) view from the side [31].

Figure 4.10 Toward universal blood [42]. (a) Carbohydrate antigenic determinants of A-, B-, and H-antigens. The H-antigen is present on glycans of the O blood-group, and is typically nonantigenic except in rare cases. (b) Site of cleavage of A- and B-antigens by GH98 EABase enzymes from type 2 chains of erythrocytes. (c) Various chain types on which A-antigens are present on erythrocytes and other cell types. (d) Structure of the fluorogenic substrate MUType1Apenta. (e) First- and second-sphere randomization sites chosen for iterative saturation mutagenesis (ISM), guided by the X-ray structure of Sp3GH98. First sphere: Tyr560 and Trp561; second sphere: Tyr530, Asn559, Ile562, Asn592, and Lys624.

Figure 4.11 Evolutionary pathways of Sp3GH98 based on iterative saturation mutagenesis (ISM) and one final round of epPCR (upper right) [42].

Scheme 4.13 Two different approaches to the use of reduced amino acid alphabets in saturation mutagenesis, if necessary followed by ISM [1b, 10, 12, 34].

Scheme 4.14 Oxidative kinetic resolution catalyzed by PAMO mutants [34a].

Scheme 4.15 Sequence alignment of BVMOs (441–444 loop in gray box) [34a].

Scheme 4.16 PAMO-catalyzed oxidative kinetic resolution of 2-alkyl substituted cyclohexanone derivatives [51].

Scheme 4.17 Substrates investigated in the saturation mutagenesis based directed evolution of CALA using mutant F149Y/I150N/F233G as template [34b].

Figure 4.12 Binding pocket of CALA showing tetrahedral intermediate with substrate 21 and nine residues for potential saturation mutagenesis [34b]. The original WT residues are underlined.

Scheme 4.18 Hydrolytic desymmetrization of meso-epoxides catalyzed by LEH and mutants thereof [54, 55a, 56a,b].

Figure 4.13 Large randomization site defined by 10 amino acid positions (green) chosen on the basis of the crystal structure of LEH [55c] with the catalytic residues being shown in pink [56a,b].

Scheme 4.19 Primer design and library construction using valine as the sole building block and the 10 randomization positions in LEH according to Figure 4.13 [56a].

Figure 4.14 Best hits discovered in a mutant library created by a single saturation mutagenesis experiment using valine as the sole building block at a 10-residue randomization site in LEH serving as the catalyst in the hydrolytic desymmetrization of epoxide 25b (Scheme 4.18) [56a].

Scheme 4.20 Application of best variants of alcohol dehydrogenase TbSADH as catalysts in the asymmetric reduction of difficult-to-reduce ketones, evolved by application of triple code saturation mutagenesis (TCSM) [56c].

Scheme 4.21 Flow diagram of structure-based directed evolution via ISM [59].

Figure 4.15 Total cost as a function of screening cost, when randomizing a single position using five randomization schemes. Primer cost is cprimer = 1 [60].

Figure 4.16 Cost space partitioned into regions according to the optimal randomization scheme (a single randomized position, assuming 100% yield, and no WT bias) [60].

Chapter 6: Directed Evolution of Enzyme Robustness

Figure 6.1 The six stabilizing mutations evolved in the insect α-carboxylesterase from Lucilia cuprina [14].

Figure 6.2 Catalytic performance of a variant of the feruloyl esterase from Aspergillus niger in the degradation of steam-exploded corn stalk as biomass [16a].

Figure 6.3 Workflow in the directed evolution of a highly improved xylanase [20].

Figure 6.4 NMR spectra recorded for native and thermally treated ¹⁵N-labeled Lipase A mutant XI. (a) 1D ¹H spectra. (b–d) 2D [¹⁵N,¹H]-HSQC spectra of mutant XI Lipase A: (b) native; (c) recovered after 60 °C treatment; (d) recovered after 80 °C treatment [28].

Figure 6.5 Results of limited ISM exploration starting from the best mutant, GUY-003 (site B), and the worst mutant, GUY-007, in the initial round of saturation mutagenesis at sites A–F. In all cases, NDT codon degeneracy was used except when performing saturation mutagenesis at site D, in which case NNG codon degeneracy was applied [30].

Figure 6.6 B-FIT based thermostabilization of endoglucanase I from Trichoderma reesei [32]. Disulfide bonds are shown in blue. N-glycosylation sites are shown in magenta. Mutagenesis sites are shown in red and are labeled as follows: A (aa 284–287), B (aa 301–302), C (aa 113, 115), D (aa 238), E (aa 230), F (aa 323), and G (aa 291). Mutations at site C and site E resulted in improved TrEGI enzyme variants. PDB code 1EG1.

Figure 6.7 Thermostabilization of PcDTE following application of ISM [42]. (a) Thermostability, expressed as the T₅₀²⁰ value, of all variants involved in this study: PcDTE wild-type (red bar), hits obtained in the first SM round (black bars), variants 2–8 obtained by ISM (blue bars), and variant 8C obtained by combination of the eight mutations from the first round (green bar). Mutation D164E was excluded in combinations as no improved variant could be identified during ISM. (b) Residual activity curves of WT PcDTE, variants 1–8, and variant 8C, fitted to a second-order sigmoidal function [42].

Figure 6.8 SCHEMA-based noncontiguous recombination library design [63]. (a) A graph view of the blue block and neighboring residues, with nodes representing residues, and edges representing residue – residue contacts. Colored, dashed lines define the graph partitions for each block. Contacts to residues from other blocks (highlighted) are broken upon recombination. (b) The 12-block design displayed on the structure of P2 (1Q9H.pdb). Each block (labeled A–L) is represented by a different color, and conserved residues are in gray. (c) The 12-block design displayed on the numbered sequence alignment of the catalytic domains of the three parental enzymes.

Figure 6.9 FRESCO strategy for protein thermostabilization [64].

Figure 6.10 Positions of 12 stabilizing mutations as revealed by the crystal structure of the LEH-F1b and P dimers [66]. Mutations introducing surface-located positively charged residues are indicated in blue, surface-located negative charges are shown in red, and buried hydrophobic residues in black. Proline residues in loops are in purple and disulfide bonds in yellow. Mutations are indicated once per dimer.

Figure 6.11 Workflow of the FireProt method. Individual steps involved in the energy- and evolution-based approaches [68].

Figure 6.12 Schematic representation of VisualCNA [74]. (a) Illustration of the technique's iterative work flow for optimization of protein thermostability. (b) PyMOL window showing the 3D protein structure at the melting point. Rigid clusters are shown as uniformly colored semi-transparent bodies. Constraints due to hydrogen bonds, salt bridges, and hydrophobic contacts are shown as red, magenta, and green sticks, respectively. A mutation is shown in yellow stick representation. Flexible regions are shown in gray. (c) The VisualCNA Analyze panel shows a comparison of multiple graphs from wild-type (black) and mutant (red) analyses. (1) Global indices with transition points are indicated as vertical lines. (2) Local index with a red circle indicates the mutation and a horizontal red line shows the unfolding state. (3) Difference stability map between wild-type and mutant. (4) Likelihood of a residue of being a structural weak spot with the mutant is shown in red.

Figure 6.13 Divide and combine approach to protein thermostabilization featuring unfolding equilibria of a three-state protein [81]. (a) Ribbon cartoons represent the conformation of apoflavodoxin in the three states populated in its thermal unfolding equilibrium. The native state is represented by the crystal structure of the WT protein (pdb id: 1ftg); the intermediate state by the solution structure of the F98N variant (pdb id: 2kqu), and the unfolded state by one of the 2000 conformations calculated for the unfolded ensemble using the ProtSA server. The low temperature transition (T₁) signals the unfolding of the less stable region leading to an equilibrium intermediate. The higher temperature transition (T₂) represents the unfolding of the intermediate, leading to the unfolded state. The free energy difference between the native and the intermediate conformation (ΔG_NI) is termed relevant stability of the protein while that between the intermediate and the fully unfolded conformation (ΔG_IU) is termed residual stability of the protein. (b) Simplified scheme depicting a protein with two structural regions of different stability (less stable region in cyan, and more stable one in pink) and the likely effects of mutations on T₁ and T₂. Type 1 mutations, those introduced in the unstable region or at its interface with the more stable one, will mainly modify the relevant stability of the protein. Type 2 mutations, those introduced in the more stable region, will only modify the residual stability of the protein.

Chapter 7: Directed Evolution of Promiscuity: Artificial Enzymes as Catalysts in Organic Chemistry

Scheme 7.1 Whitesides system comprising a biotinylated achiral diphosphine/Rh-complex noncovalently bound to avidin, which was used as the catalyst in the asymmetric olefin-hydrogenation of N-acyl acrylic acid [20]. Later streptavidin was employed as host [21, 22].

Scheme 7.2 Systematization for generating artificial metalloenzymes as hybrid catalysts [21b,c, 25]. L, synthetic ligand; M, transition metal; D, donor atoms of side-chains of appropriate amino acids such as aspartate or cysteine which bind transition metals M directly.

Figure 7.1 Water-soluble Cu(II)-phthalocyanine used in bioconjugation to serum albumins [29].

Figure 7.2 Model of 1-HSA [29] based on the crystal structure of human serum albumin (HSA) harboring Fe-protoporphyrin dimethyl ester [30].

Scheme 7.3 Diels–Alder reaction of azachalcones 2 with 3 leading to endoproducts 4 [29].

Scheme 7.4 Schematic representation for the generation of a myoglobin-based chromium-salen hybrid catalyst for the asymmetric sulfoxidation of thioanisole [31].

Figure 7.3 Manganese Schiff base complex introduced into apo-myoglobin [32].

Scheme 7.5 Chemical modification of tHisF mutant Cys9Ala/Asp11Cys by means of Michael additions that lead to bioconjugates A, and SN2 reactions that provide bioconjugates B [25].

Scheme 7.6 Model Diels–Alder cycloaddition used in Rosetta-design [37].

Scheme 7.7 Model reaction used in the directed evolution of the Whitesides system [21].

Figure 7.4 Model of the biotinylated diphosphine-Rh-complex in streptavidin [21].

Scheme 7.8 Directed evolution of stereoselectivity of a promiscuous enzyme based on the Whitesides system, iterative saturation mutagenesis (ISM) being employed as the genetic tool and the Rh-catalyzed hydrogenation of substrate 8 with formation of 9 serving as the model reaction [21].

Scheme 7.9 Fingerprint display of the results for the chemogenetic optimization of the reduction of ketones 1 and 3 in the presence of biotin-sepharose-immobilized artificial metalloenzymes [η⁶-(arene)RuH(Biot-p-L)] ⊂ Sav mutant. Catalytic runs which could not be performed (insufficient soluble protein expression) are represented by white triangles [43]. Substrates, reduction products, and operating conditions used for the designed evolution of artificial transfer hydrogenases. η⁶-arene = benzene, p-cymene; Sav mutant: K121X, L124X, S112A K121X, S112K K121X, S112A L124X, S112K K124X. The catalytic runs were performed at 55 °C for 64 h using the mixed buffer NaO₂CH (0.48 M), B(OH)₃ (0.41 M), and 3-(N-morpholino)propanesulfonic acid (MOPS, 0.16 M) at initial pH 6.25. Ru/substrate/formate ratio 1 : 100 : 4000.

Scheme 7.10 X-ray crystal structure of [η⁶-(benzene)RuCl(Biot-p-L)] ⊂ S112K Sav. (a) Close-up view (only monomer B (blue) occupied by the biotinylated catalyst (ball-and-stick representation); monomers A (green), C (orange), and D (yellow)). (b) Highlight of amino acid sidechain residues displaying short contacts with Ru. The absolute configuration at ruthenium is S. (c) Superimposition of the structure of [η⁶-(benzene)RuCl(Biot-p-L)] ⊂ S112K Sav with the structure of biotin ⊂ core streptavidin (PDB reference code 1STP, only monomers A and B displayed for clarity; biotin: white stick, core streptavidin: white tube). (d) Ru–Ca distances extracted from the X-ray structure of [η⁶-(benzene)RuCl(Biot-p-L)] ⊂ S112K Sav; monomers: A, black; B, blue; C, green; and D, red [43].

Scheme 7.11 Mechanistically essential amino acid residues in Agrobacterium radiobacter epoxide hydrolase (EchA). (a) Formation and liberation of the alkyl enzyme intermediate derived from styrene oxide as substrate and Pseudomonas fluorescens esterase (PFE). (b) Formation and liberation of the acetyl enzyme intermediate derived from phenyl acetate as substrate [8].

Scheme 7.12 Protein-catalyzed Kemp elimination 10 → 11.

Figure 7.5 (a) The KE07 design, showing the TIM barrel scaffold of HisF (PDB accession code 1THF), the modeled 5-nitrobenzisoxazole substrate (red), and the 13 residues that were replaced to create the designed Kemp eliminase active site (green). (b) Details of the active site of the designed KE07. Shown are the 5-nitrobenzisoxazole substrate (cyan), the catalytic base (Glu101), the general acid/H-bond donor (Lys222), and the stacking residue (Trp50) [45b].

Figure 7.6 Allosteric regulation of AlleyCat7. Experimental conditions: initial concentrations: 130 nM protein, 100 mM NaCl, 20 mM HEPES buffer, pH 7.0, 0.1 mM CaCl₂, 0.1 mM substrate. At 300 s EDTA (ethylenediaminetetraacetic acid) was added to the final concentration of 0.2 mM, followed by addition of CaCl₂ at 540 s to the final concentration of 0.3 mM and, again, EDTA at 840 s to the final concentration of 0.5 mM.

Scheme 7.13 P450-catalyzed insertion of nitrenes into nonactivated C—H bonds [53].

Scheme 7.14 Promiscuous reactivity of P450-BM3. Top: Envisioned catalytic cyclopropanation; below: Intermediate (Compound I) known to be the active species in the catalytic P450-BM3 catalyzed epoxidation of olefins, serving as a conceptual guide in devising Fe-catalyzed cyclopropanation [56].

Figure 7.7 An artificial olefin metathesase based on anchoring a Grubbs–Hoveyda Ru-catalyst covalently to the protein nitrobindin [64a].

Chapter 8: Learning from Directed Evolution

Scheme 8.1 Hydrolytic kinetic resolution of rac-1 catalyzed by ANEH [4, 5].

Scheme 8.2 Mechanism of ANEH [4, 6, 7].

Figure 8.1 Kinetic analysis of variant LW202 as catalyst in separate reactions of (R)- and (S)-1, where v_R and v_S are the initial rates of hydrolysis of (R)- and (S)-1 at different substrate concentrations [S_R] or [S_S] [4].

Figure 8.2 Definition of the distance d in the rate-determining step of the ANEH-catalyzed reaction of rac-1 [4].

Figure 8.3 Interpretation of crystal structures of WT ANEH and evolved variants by manually docking (R)- and (S)-1 into binding pockets, A, B, C, D, E, and F representing the originally designed randomization sites in the ISM process. (a) Favored (S)-1 in WT ANEH binding pocket; (b) disfavored (R)-1 in binding pocket of WT ANEH; (c) favored (S)-1 in variant LW202; and (d) disfavored (R)-1 in variant LW201 [4].

Scheme 8.3 Binding modes in the active site of ene-reductases. (left) Traditional (normal) binding mode and (right) flipped binding mode.

Figure 8.4 Location of substrate 2-hydroxymethyl cyclopentenone in OYE1 mutant Trp116Ile within the observed electron density (0.4σ contour level). (a) Attempted poor fit by a single substrate orientation. Red and green arrows indicate regions of negative and positive electron density peaks, respectively, in the difference map (not shown). (b) Successful fit by two substrate populations. C-atoms in binding mode 1 are pictured in green, those in binding mode 2 are shown in light blue [16a].

Figure 8.5 Schematic representation of the role of OYE1 variants characterized by mutations at position 116 [16b].

Figure 8.6 Superposition of CmOYE structures in the absence (green) and presence (magenta) of p-HBA in the catalytic pockets. The structures shown in green and magenta represent open and closed forms of CmOYE (loop 6), respectively. Amino acid residues in the catalytic sites, FMN (yellow), and p-HBA (gray), are shown as stick models [20].

Scheme 8.4 Two-step biocatalytic conversion of ketoisophorone to (4R,6R)-actinol. Biocatalytic synthesis of (4R,6R)-actinol from ketoisophorone is performed by CmOYE (or ScOYE2) and LVR. CmOYE and ScOYE2 show less catalytic activity in the reduction of (4S)-phorenol than in the other reactions [20].

Figure 8.7 Computed transition state geometries of the lowest energy pathways for hydride transfer: (a) in the normal pose and (b) in the flipped pose [21].

Scheme 8.5 Model hydrolytic kinetic resolution of rac-3 catalyzed by PFE [23].

Figure 8.8 3D homology model of PFE. The catalytic triad is shown in gray (Ser94, His251, Asp222), mutation sites are highlighted in black (Val76, Ala98, and Ala175). The model was created by using PyMOL and amino-acid exchanges were introduced with the “Wizard/Mutagenesis” feature [23].

Figure 8.9 Alignment of WT PFE (light gray, Gly98 labeled) and variant V2A (dark gray, Ala98 labeled). The extended loop of the helix is highlighted [23].

Scheme 8.6 Compound I as the catalytically active high-spin intermediate in CYP-catalyzed oxidative hydroxylation.

Scheme 8.7 Ideal pose of a substrate for smooth oxidative hydroxylation initiated by H-atom abstraction and formation of an intermediate short-lived radical R that undergoes rapid C—O bond formation. The ideal O—H—C angle has been computed to be about 130° [26].

Figure 8.10 The substrate-binding cavity of P450 BM3 F87A/A328V (a) and P450 BM3 wild type (b) in complex with cyclododecane after 3 ns of unrestrained MD simulation. The mutated positions are depicted in red. Positions 87 (left) and 328 (right) stabilize the substrate in the active site cavity. The activated oxygen of the heme is shown in orange [33].

Scheme 8.8 P450-BM3 catalyzed oxidative hydroxylation of testosterone [30a].

Figure 8.11 Computed pose of testosterone (5) explaining 2β-selectivity (mutant R47I/T49I/F87A) and the respective pose leading to 15β-selectivity (mutant R47Y/T49F/V78L/A82M/F87A) [29a, 30a].

Figure 8.12 Changes in transition-state stabilization energies for the multiple mutant versus the sum of the component mutants [36c]. The data represent mutants from subtilisin, tyrosyl-tRNA synthetase, trypsin, DHFR, and glutathione reductase. The dashed line has a slope of 1 representing perfect additivity, and the solid line corresponds to the best fit of the data.

Scheme 8.9 Model reaction catalyzed by the lipase PAL [38, 39].

Scheme 8.10 Best ISM pathway B → A leading to the triple mutant 1B2 (Leu162Asn/Met16Ala/Leu17Phe) displaying a selectivity factor of E = 594 in the hydrolytic kinetic resolution of rac-8 with preferential formation of (S)-9 [38].

Scheme 8.11 Mechanism of lipase-catalyzed hydrolysis of esters.

Figure 8.13 Comparison of the oxyanions with bound (S)-substrate at the catalytically active Ser82 of WT PAL (a) versus best variant 1B2 (b) [38].

Scheme 8.12 Systematization of additive and non-additive mutational effects in protein engineering, in this scheme using two sets of mutations A and B, illustrated by employing enantioselectivity as the catalytic parameter. (a) Classical additive mutational effect; (b) non-additive mutational effect in which set B shows lower than expected enantioselectivity but in the same direction; and (c) non-additive effect in which mutational set B shows reversed enantioselectivity.

Figure 8.14 Fitness pathway landscape showing the 24 pathways leading from WT PAMO (bottom) to best (R)-selective variant ZGZ-2 in asymmetric sulfoxidation, a typical trajectory lacking local minima (green pathway) and one having local minima (red) being featured [40a].

Figure 8.15 Fitness pathway landscape in the frontal view of Figure 8.14 of all 24 trajectories leading from WT PAMO to variant ZGZ-2 characterized by four point mutations [40a]. Green notations indicate energetically favored pathways, whereas red notations represent disfavored trajectories having local minima. Letters in red in the dendrogram denote a local minimum after the introduction of this mutation.

List of Tables

Chapter 3: Gene Mutagenesis Methods

Table 3.1 Theoretical number of variants in a library obtained for a protein consisting of 181 amino acids (lipase A from Bacillus subtilis) with one to five amino acid exchanges per enzyme molecule [11]

Table 3.2 Codon usage (left panel), and theoretical and actual numbers of enzyme variants to be obtained upon ep-PCR mutagenesis of B. subtilis lipase LipA (right panel) [11]

Table 3.3 Oversampling necessary for 95% library coverage as a function of NNK versus NDT codon degeneracy and the number of amino acid positions in a randomization site [68]

Chapter 4: Strategies for Applying Gene Mutagenesis Methods

Table 4.1 Statistical consequences as a function of grouping single CAST residues into randomization sites [13]

Table 4.2 Choice of codon degeneracies at each position in the 441–444 loop of PAMO

Table 4.3 Combinatorial use of amino acids as building blocks employed in saturation mutagenesis at the nine-residue randomization site of CALA (Figure 4.12) [34b]

Table 4.4 Quick quality control and Q-values

Table 4.5 Summary of P450-BM3 sequencing results obtained from 96 single colonies formed on agar plates per library [60]

Chapter 5: Selected Examples of Directed Evolution of Enzymes with Emphasis on Stereo- and Regioselectivity, Substrate Scope, and/or Activity

Table 5.1 Typical directed evolution studies of enzymes for enhanced stereo- and/or regioselectivity, activity, shifted substrate scope, selected from the literature 2010 up to 2016

Chapter 6: Directed Evolution of Enzyme Robustness

Table 6.1 Specific activity, T₅₀, and 1-propanol stability of WT and mutant BPO-A1 haloperoxidases [18]

Table 6.2 Thermal stability results in comparison to computational predictions [51]

Chapter 7: Directed Evolution of Promiscuity: Artificial Enzymes as Catalysts in Organic Chemistry

Table 7.1 Optimization of the Diels–Alder reaction of 2a with 3 in the presence of Cu(II)-salts and BSA in water

Table 7.2 Summary of the directed evolution of the Kemp eliminase KE70 [45b]

Chapter 8: Learning from Directed Evolution

Table 8.1 Results of MD calculations [4]

Table 8.2 Stereoselective reduction of Baylis–Hillman adducts catalyzed by variants of OYE1 produced by saturation mutagenesis at position 116 [16a]

Table 8.3 Computed activation barriers and reaction energies for rate-determining hydride transfer from FMNH to 2-cyclohexenone in kcal mol⁻¹ [21]

Table 8.4 Catalytic profiles of WT PFE, variant V2A generated by epPCR and variants resulting from deconvolution of the latter

Manfred T. Reetz

Directed Evolution of Selective Enzymes

Catalysts for Organic Chemistry and Biotechnology

 

 

 

 

Author

Manfred T. Reetz
MPI für Kohlenforschung
Kaiser-Wilhelm-Platz 1
45470 Mülheim
Germany

and

Philipps-Universität Marburg
Fachbereich Chemie
Hans-Meerwein-Straße 4
35032 Marburg
Germany

 

Cover

Enzyme structure - http://dx.doi.org/10.2210/pdb3g02/pdb

All books published by Wiley-VCH are carefully produced. Nevertheless, authors, editors, and publisher do not warrant the information contained in these books, including this book, to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate.

 

Library of Congress Card No.: applied for

 

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.

 

Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at <http://dnb.d-nb.de>.

 

© 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Boschstr. 12, 69469 Weinheim, Germany

 

All rights reserved (including those of translation into other languages). No part of this book may be reproduced in any form – by photoprinting, microfilm, or any other means – nor transmitted or translated into a machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law.

 

Print ISBN: 978-3-527-31660-1

ePDF ISBN: 978-3-527-65549-6

ePub ISBN: 978-3-527-65548-9

Mobi ISBN: 978-3-527-65547-2

oBook ISBN: 978-3-527-65546-5

 

Cover Design Schulz Grafik-Design, Fußgönheim, Germany

Preface

Directed evolution is a term that is used in two distinctly different research areas: (i) the genetic manipulation of functional RNAs, a discipline initiated by S. Spiegelman half a century ago and extending to the present day in the laboratories of J. W. Szostak, G. F. Joyce, and others and (ii) the genetic manipulation of genes (DNA) with the aim of engineering the catalytic profiles of enzymes as catalysts in organic chemistry and biotechnology, especially stereoselectivity. This monograph focuses on the latter field. It begins with an introductory chapter that features the basic principles of directed evolution, and is followed by a chapter on screening and selection methods. Critical analyses of recent developments constitute the heart of the monograph. Rather than being comprehensive, emphasis is placed on methodology development in the quest to maximize efficiency, reliability, and speed when performing this type of protein engineering. The primary applications concern the synthesis of chiral pharmaceuticals, fragrances, and plant protecting agents.

The directed evolution methods and strategies featured in this book can also be used when engineering metabolic pathways, developing vaccines, engineering antibodies, creating genetically modified yeasts for the food industry, engineering proteins for pollution control, developing photosynthetic CO2 fixation, genetically modifying plants for agricultural and medicinal purposes, engineering CRISPR-Cas9 nucleases for genome editing, and modifying DNA polymerases for forensic purposes and for accepting non-natural nucleotides. A few studies of these applications are included here.

This monograph is intended not only for those who are interested in learning the basics of directed evolution of enzymes, but also for advanced researchers in academia and industry who seek guidelines for performing protein engineering efficiently.

I wish to thank Dr Zhoutong Sun for reading Chapters 3 and 4 and discussing some of the issues related to molecular biology. Thanks also go to Dr Gheorghe-Doru Roiban and Dr Adriana Ilie for editing all the chapters and constructing some of the figures. Any errors that may remain are the responsibility of the author.

Manfred T. Reetz

Marburg, January 2016

Chapter 1: Introduction to Directed Evolution

1.1 General Definition and Purpose of Directed Evolution of Enzymes

Enzymes have been used as catalysts in organic chemistry for more than a century [1a], but the general use of biocatalysis in academia and, particularly, in industry has suffered from the following often encountered limitations [1b–d]:

Limited substrate scope

Insufficient activity

Insufficient or wrong stereoselectivity

Insufficient or wrong regioselectivity

Insufficient robustness under operating conditions.

Sometimes, product inhibition also limits the use of enzymes. All of these problems can be addressed and generally solved by applying directed evolution (or laboratory evolution as it is sometimes called) [2]. It mimics Darwinian evolution as it occurs in Nature, but it does not constitute real natural evolution. The process consists of several steps, beginning with mutagenesis of the gene encoding the enzyme of interest. The library of mutated genes is then inserted into a bacterial or yeast host such as Escherichia coli or Pichia pastoris, respectively, which is plated out on agar plates. After a growth period, single colonies appear, each originating from a single cell, which now begin to express the respective protein variants. Multiple copies of transformants as well as wild-type (WT) appear, which unfortunately decrease the quality of libraries and increase the screening effort. Colony harvesting must be performed carefully, because cross-contamination leads to the formation of inseparable mixtures of mutants with concomitant misinterpretations. The colonies are picked by a robotic colony picker (or manually using toothpicks), and placed individually in the wells of 96- or 384-format microtiter plates that contain nutrient broth. Portions of each well-content are then placed in the respective wells of another microtiter plate where the screening for a given catalytic property ensues. In some (fortunate) cases, an improved variant (hit) is identified in such an initial library, which fulfills all the requirements for practical application as defined by the experimenter. If this does not happen, which generally proves to be the case, then the gene of the best variant is extracted and used as a template in the next cycle of mutagenesis/expression/screening (Scheme 1.1). This mimics “evolutionary pressure,” which is the heart of directed evolution.

Scheme 1.1 The basic steps in directed evolution of enzymes. The rectangles represent 96-well microtiter plates that contain enzyme variants, the red dots symbolizing hits.

In most directed evolution studies further cycles are necessary for obtaining the optimal catalyst, each time relying on the Darwinian character of the overall process. A crucial feature necessary for successful directed evolution is the linkage between phenotype and genotype. If a library in a recursive mode fails to harbor an improved mutant/variant, the Darwinian process ends abruptly in a local minimum on the fitness landscape. Fortunately, researchers have developed ways to escape from such local minima (“dead ends”) (see Section 4.3).
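The recursive cycle of mutagenesis, expression, and screening described above can be illustrated with a small simulation. The following Python sketch is only a toy model under stated assumptions, not a laboratory protocol: the function names (`mutate`, `toy_fitness`, `directed_evolution`), the single-exchange mutagenesis, and the scoring against an arbitrary target sequence are all hypothetical stand-ins for real mutagenesis and a real screening assay. It shows the best variant of each library serving as the template for the next round, and the process stopping when a library contains no improved variant.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical amino acids


def mutate(parent, rng):
    """Toy mutagenesis: exchange one randomly chosen position of the parent."""
    pos = rng.randrange(len(parent))
    choices = [aa for aa in AMINO_ACIDS if aa != parent[pos]]
    return parent[:pos] + rng.choice(choices) + parent[pos + 1:]


def toy_fitness(seq, target):
    """Hypothetical screen: count positions identical to an arbitrary target.

    A real screen would measure activity or selectivity; this score is an
    assumption made purely for illustration.
    """
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(wild_type, target, library_size=96, max_rounds=10, seed=0):
    """Run recursive cycles and return the list of (template, score) per round."""
    rng = random.Random(seed)
    parent = wild_type
    history = [(parent, toy_fitness(parent, target))]
    for _ in range(max_rounds):
        # Mutagenesis: one library, sized like a 96-well microtiter plate.
        library = [mutate(parent, rng) for _ in range(library_size)]
        # Screening: identify the best variant (the "hit") in the library.
        best = max(library, key=lambda s: toy_fitness(s, target))
        if toy_fitness(best, target) <= toy_fitness(parent, target):
            break  # no improved variant found: the cycle ends here
        parent = best  # the hit becomes the template of the next cycle
        history.append((parent, toy_fitness(parent, target)))
    return history


if __name__ == "__main__":
    for round_no, (seq, score) in enumerate(directed_evolution("AAAAAAAA", "ACDEFGHI")):
        print(round_no, seq, score)
```

Because the toy score is strictly additive, the loop stops early only when a library happens to miss every beneficial exchange. Real fitness landscapes are not additive, which is why genuine local minima can arise in practice.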