Maintaining Mission Critical Systems in a 24/7 Environment - Peter M. Curtis - E-Book

Maintaining Mission Critical Systems in a 24/7 Environment E-Book

Peter M. Curtis

Description

The new edition of the leading single-volume resource on designing, operating, and managing mission critical infrastructure

Maintaining Mission Critical Systems in a 24/7 Environment provides in-depth coverage of operating, managing, and maintaining power quality and emergency power systems in mission critical facilities. This extensively revised third edition provides invaluable insight into the mission critical environment, helping professionals and students alike understand how to sustain continuous functionality, minimize the occurrence of costly unexpected downtime, and guard against power disturbances that can damage any organization's daily operations. Bridging engineering, operations, technology, and training, this comprehensive volume covers each component of specialized systems used in mission critical infrastructures worldwide. Throughout the text, readers are provided the up-to-date information necessary to design and analyze mission critical systems, reduce risk, comply with current policies and regulations, and maintain an appropriate level of reliability based on a facility's risk tolerance. Topics include safety, fire protection, energy security, and the myriad challenges and issues facing industry engineers today.

Emphasizing business resiliency, data center efficiency, cyber security, and green power technology, this important volume:

* Features new and updated content throughout, including new chapters on energy security and on integrating cleaner and more efficient energy into mission critical applications
* Defines power quality terminology and explains the causes and effects of power disturbances
* Provides in-depth explanations of each component of mission critical systems, including standby generators, raised access floors, automatic transfer switches, uninterruptible power supplies, and data center cooling and fuel systems
* Contains in-depth discussion of the evolution and future of the mission critical facilities industry
* Includes PowerPoint presentations with voiceovers and a digital/video library of information relevant to the mission critical industry

Maintaining Mission Critical Systems in a 24/7 Environment is a must-read reference and training guide for architects, property managers, building engineers, IT professionals, data center personnel, electrical & mechanical technicians, students, and others involved with all types of mission critical equipment.


Page count: 1063

Year of publication: 2020




Table of Contents

Cover

Title Page

Copyright Page

Foreword

Preface

Acknowledgements

1 An Overview of Reliability and Resiliency in Today’s Mission Critical Environment

1.1 Introduction

1.2 Risk Assessment

1.3 Capital Costs versus Operation Costs

1.4 Critical Environment Workflow and Change Management

1.5 Testing and Commissioning

1.6 Documentation and Human Factor

1.7 Education and Training

1.8 Corporate Knowledge Transfer – the Means to Securing Tomorrow’s Critical Infrastructure

1.9 Operation and Maintenance

1.10 Employee Certification

1.11 Standards and Benchmarking

1.12 What is a Mission Critical Engineer

1.13 Conclusion

1.14 An Overview of Reliability and Resiliency in Today’s Mission Critical Environment ‐ Questions to Consider

2 Energy and Cyber Security and its Effect on Business Resiliency

2.1 Introduction

2.2 Risks Related to Information Security

2.3 Electro Magnetic Pulse and Solar Flares

2.4 How Risks Are Addressed

2.5 Use of Distributed Energy Resources and Generation

2.6 Documentation and Its Relation to Information Security

2.7 Smart Grid

2.8 Conclusion

2.9 Energy Security and Its Effect on Business Resiliency ‐ Questions to Consider

3 Mission Critical Engineering with an Overview of Green Technologies

3.1 Introduction

3.2 Companies’ Expectations: Risk Tolerance and Reliability

3.3 Identifying the Appropriate Redundancy in a Mission Critical Facility

3.4 Improving Reliability, Maintainability, and Proactive Preventative Maintenance

3.5 The Mission Critical Facilities Manager and the Importance of the Boardroom

3.6 Quantifying Reliability and Availability

3.7 Design Considerations for the Mission Critical Data Center

3.8 The Evolution of Mission Critical Facility Design

3.9 Human Factors and the Commissioning Process

3.10 Short Circuit & Coordination Studies

3.11 Introduction to Direct Current in the Data Center

3.12 Containerized Systems Overview

3.13 Mission Critical Engineering with an Overview of Green Technologies ‐ Questions to Consider

4 Mission Critical Electrical System Maintenance & Safety

4.1 Introduction

4.2 The History of the Maintenance Supervisor and the Evolution of the Mission Critical Facilities Engineer

4.3 Internal Building Deficiencies and Analysis

4.4 Evaluating Your System

4.5 Choosing a Maintenance Approach

4.6 Safe Electrical Maintenance

4.7 Maintenance of Typical Electrical Distribution Equipment

4.8 Being Proactive in Evaluating the Test Reports

4.9 Designing for Safety and Reliability

4.10 Conclusion

5 Standby Generators

5.1 Introduction

5.2 The Necessity for Standby Power

5.3 Emergency, Legally Required, and Optional Systems

5.4 Standby Systems That Are Legally Required

5.5 Optional Standby Systems

5.6 Understanding Your Power Requirements

5.7 Management Commitment and Training

5.8 Standby Generator Systems Maintenance Procedures

5.9 Documentation Plan

5.10 Emergency Procedures

5.11 Cold Start

5.12 Non‐Linear Load Concerns

5.13 Conclusion

6 Fuel Systems Design and Maintenance

6.1 Introduction

6.2 Brief Discussion on Diesel Engines

6.3 Bulk Storage Tank Selection

6.4 Codes and Standards

6.5 Recommended Practices for all Tanks

6.6 Fuel Distribution System Configuration

6.7 Day Tank Control System

6.8 Diesel Fuel and a Fuel Quality Assurance Program

6.9 Conclusion

7 Power Transfer Switch Technology, Applications, and Maintenance

7.1 Introduction

7.2 Transfer Switch Technology and Applications

7.3 Types of Power Transfer Switches

7.4 Control Devices

7.5 Design Features

7.6 Additional Characteristics and Ratings of ATS

7.7 Installation & Commissioning, Maintenance, and Safety

7.8 General Recommendations

7.9 Conclusion

8 The Static Transfer Switch

8.1 Introduction

8.2 Overview

8.3 Typical Static Switch One Line

8.4 STS Technology and Application

8.5 Testing

8.6 Conclusion

9 The Fundamentals of Power Quality

9.1 Introduction

9.2 Electricity Basics

9.3 Transmission of Power

9.4 Understanding Power Problems

9.5 Tolerances of Critical Loads

9.6 Power Monitoring

9.7 The Impact of Alternative Energy Generation

9.8 Conclusion

10 UPS Systems: Applications and Maintenance with an Overview of Green Technologies

10.1 Introduction

10.2 Purpose of UPS Systems

10.3 General Description of UPS Systems

10.4 Components of a Static UPS System

10.5 Online ‐ Line Interactive UPS Systems

10.6 Offline (Standby)

10.7 The Evolution of Static UPS Technology

10.8 Rotary UPS Systems

10.9 Redundancy, Configurations, and Topology

10.10 Energy Storage Devices

10.11 UPS Maintenance & Testing

10.12 Static UPS and Maintenance

10.13 UPS Management

10.14 Conclusion

11 Data Center Cooling Systems

11.1 Introduction

11.2 Background Information

11.3 Cooling within Datacom Rooms

11.4 Cooling Process

11.5 Cooling Final Dissipation

11.6 The Refrigeration Process

11.7 Components Inside Datacom Room

11.8 Conclusion

12 Data Center Cooling Efficiency, Concepts, & Technologies

12.1 Introduction

12.2 Heat Transfer Inside Data Centers

12.3 Cooling and Other Airflow Topics

12.4 Design Approaches for Data Center Cooling

12.5 Additional Considerations

12.6 Hardware & Associated Efficiencies

12.7 Best Practices

12.8 Efficiency Problem Solving

12.9 Conclusion

12.10 Conversions, Formulas, Guidelines

13 Raised Access Floors

13.1 Introduction

13.2 Design Considerations

13.3 Safety Concerns

13.4 Panel Cutting (For all Steel Panels or Cement Filled Panels that do not Contain an Aggregate)

13.5 Access Floor Maintenance

13.6 Troubleshooting

13.7 Additional Design Considerations

13.8 Conclusion

14 Fire Protection in Mission Critical Infrastructures

14.1 Introduction

14.2 Hazard Analysis

14.3 Alarm and Notification

14.4 Early Warning Detection

14.5 Fire Suppression

14.6 Systems Design

14.7 Fire Detection

14.8 Fire Suppression Systems

14.9 Conclusion

15 Managing Through Pandemics

15.1 Executive Summary: COVID‐19’s Impact on Critical Infrastructure Globally

15.2 Architectural Solutions and Air Purification Systems

15.3 Building Equipment Solutions and Technology

15.4 Operations, Maintenance and Training

15.5 Site Protection: Safeguarding the Staff and Visitors

15.6 The Workforce of Tomorrow

15.7 Assessment Tasks ‐ HVAC and Air Handling Units Filter Upgrades

15.8 Managing Through Pandemics ‐ Questions to Consider

15.9 Conclusion

Appendix A: Policies and Regulations

A.1 Introduction

A.2 Industry Policies & Regulations

A.3 Data Protection

A.4 Encryption

A.5 Business Continuity Plan (BCP)

A.6 Conclusion

Appendix B: Consolidated List of Key Questions

Appendix C: Airflow Management (A System Approach)

C.1 Introduction

C.2 Control is the Key

C.3 Obtaining Control

C.4 Air Management Technologies

C.5 Conclusion

Glossary

References

Index

Books in the IEEE Press Series on Power Engineering

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Levels of Risk Impact to Facilities

Table 1.2 Critical Areas

Table 1.3 Critical Systems

Table 1.4 Managing Loss of Critical Personnel

Table 1.5 Documentation Issues

Table 1.6 Managing During Critical Events

Chapter 2

Table 2.1 Recent Significant Power Outages (Various Sources – Google Alert – M...

Table 2.2 Major Solar Flare Events

Chapter 3

Table 3.1 Law of Nines

Table 3.2 The Cost of Downtime

Table 3.3 Uptime Tiers

Chapter 4

Table 4.1 Types of Personal Protection

Table 4.2 Typical Personal Protective Equipment Examples

Table 4.3 Glove and Boot Classification and the Corresponding Voltage.

Table 4.4 Equipment Testing

Chapter 9

Table 9.1 Causes and effects of power problems

Table 9.2 The frequency and phase sequence of whole number harmonics.

Table 9.3 Power line disturbances and characteristics

Chapter 10

Table 10.1 Bandgap Energy Levels

Table 10.2 SiC UPS Heat Loss Energy Savings (@500KVA)

Table 10.3 Advantages and Limitations of Lead‐acid Batteries

Table 10.4 Advantages and Disadvantages of a Flywheel

Table 10.5 Procedures and details used for a semi‐annual checks and services for...

Chapter 11

Table 11.1 Differences between Precision and Comfort Cooling

Table 11.2 Issues surrounding the use of Airside Economizers

Chapter 12

Table 12.1 Table of PUE and DCiE efficiencies.

Table 12.2 Data Center Cooling Best Practices

Table 12.3 Scenario 1 – Data center overheating but has adequate cooling capacit...

Table 12.4 Scenario 2 – Data center overcooled but still has hotspots

Chapter 13

Table 13.1 Panel Load Chart

Table 13.2 Load calculation table for electronic pallet jack

Chapter 14

Table 14.1 Most Common Solutions Used in Li‐Ion Cells

Table 14.2 NFPA standards that govern applications of fire protection technologi...

Table 14.3 Heat Detector Spacing Reduction Based on Ceiling Height

Table 14.4 Advantages and Disadvantages of Sprinkler Systems

Table 14.5 Advantages and disadvantages of double‐interlock pre‐action water ...

Table 14.6 Advantages and Disadvantages of IG‐541

Table 14.7 Advantages and Disadvantages of IG‐55

Table 14.8 Advantages and Disadvantages of HFC‐227ea

Table 14.9 Advantages and Disadvantages of HFC‐125

Table 14.10 Advantages and Disadvantages of FK‐5‐1‐12

Table 14.11 Advantages and Disadvantages of HFC‐23

Appendix A

Table A.1 Key Points of Sarbanes‐Oxley Act (SOX)

Table A.2 CIP‐002 through CIP‐009 Cybersecurity framework

Table A.3 Basel II Compliance Benefits

Appendix C

Table C.1 Average maximum inlet temperature per load

Table C.2 Temperatures about the mean value per load

Table C.3 Temperature spread based on the Standard deviation

List of Illustrations

Chapter 1

Figure 1.1 Hidden Costs of Operations

Figure 1.2 Typical screenshot of SmartWALK™ dashboard

Figure 1.3 SmartWALK™ mobile screenshot

Figure 1.4 Screenshot of SmartTEAM® Learning Management System

Chapter 2

Figure 2.1 US Primary Energy Sources.

Figure 2.2 Number of Breaches and records exposed from 2005 to 2019

Figure 2.3 Fuel Sources for Electricity Generation in the U.S. in 2018

Figure 2.4 Potential Causes of Load Interruption or Downtime

Figure 2.5 The Tiers of the Electric Grid from Generation to Chip. Derived f...

Figure 2.6 Solar Flare.

Figure 2.7 EMP Waveform – MIL‐STD‐461G Test Method RS105

Figure 2.8 RS105 Transient Generator and Transmission Line

Figure 2.9 Damped Sinusoidal Transient – MIL‐STD‐461G Test Method CS116

Figure 2.10 SmartWALK™ mobile device

Figure 2.11 The Smart Grid Network and its features.

Chapter 3

Figure 3.1 “Seven steps” is a continuous cycle of evaluation, implementation...

Figure 3.2 Sample SCS Screenshot

Figure 3.3 Sample TCC Curve Analysis

Figure 3.4 Traditional AC Distribution

Figure 3.5 DC Distribution

Figure 3.6 Electronic Ballast

Figure 3.7 Absorption Chiller

Figure 3.8 Typical Fuel Cell

Figure 3.9 Microturbine CCHP System

Figure 3.10 DC Monitoring Equipment

Figure 3.11 SmartTEAM™ mobile screenshot

Figure 3.12 Open rear door of containerized Data Center

Chapter 4

Figure 4.1 Theory of Predictive Maintenance

Figure 4.2 Hazard Risk Category (HRC) Arc Rating (Reference of Chicago Prote...

Figure 4.3 Arc Flash Boundaries

Figure 4.4 A small thermographic camera and a typical installation

Figure 4.5 Sample IR scanning tracking

Chapter 5

Figure 5.1 Generator

Figure 5.2 Load Bank Testing

Figure 5.3 Generator Control Cabinet

Chapter 6

Figure 6.1 Basic installation practices for all tanks, whether aboveground o...

Figure 6.2 Typical fuel storage and distribution system flow diagram.

Figure 6.3 Poorly arranged system.

Figure 6.4 System using the same components as in Figure 6.3 but arranging t...

Figure 6.5 System using individual components to prevent a single point of f...

Figure 6.6 System that further increases reliability by adding redundancy fo...

Chapter 7

Figure 7.1 Break Before Make ATS

Figure 7.2 Basic ATS Enclosure

Figure 7.3 Break Before Make ATS with Isolation Bypass

Figure 7.4 ATS Enclosure Equipped with Isolation Bypass

Figure 7.5 Closed Transition ATS

Figure 7.6 Closed Transition ATS with Isolation Bypass

Figure 7.7 Closed Transition ATS Enclosure

Figure 7.8 Delayed Transition ATS

Figure 7.9 Delayed Transition ATS with Isolation Bypass

Figure 7.10 Delayed Transition Transfer Switch Enclosure

Figure 7.11 Soft Load Power Transfer Switch

Chapter 8

Figure 8.1 Typical Static Switch One Line

Chapter 9

Figure 9.1 Cost of downtime

Figure 9.2 Power Factor Beer Mug Analogy

Figure 9.3 Power Factor Waveforms

Figure 9.4 Typical electric system

Figure 9.5 This undistorted sine wave is also known as the fundamental wavef...

Figure 9.6 Three‐phase power is produced from the rotating windings of a gen...

Figure 9.7 Types and relative frequency of power quality disturbances.

Figure 9.8 Example part of an IEC 61000‐4‐30 Class A Edition 3 compliance ce...

Figure 9.9 Transients shown in waveform

Figure 9.10 Waveforms of RMS variations

Figure 9.11 Motor‐start waveform signature.

Figure 9.12 Voltage swell timeline.

Figure 9.13 Unbalance timeline

Figure 9.14 Notching waveform

Figure 9.15 The fundamental and the 5th harmonic.

Figure 9.16 The harmonic spectrum indicating a problem.

Figure 9.17 The additive effect of the triplen harmonics.

Figure 9.18 Harmonic distortion waveform.

Figure 9.19 Interruption timeline.

Figure 9.20 CBEMA curve

Figure 9.21 ITIC curve

Figure 9.22 A portion of data obtained from the Distribution Power Quality M...

Figure 9.23 Load problems – Power Quality Troubleshooting Checklist

Figure 9.24 Building distribution problems – Power Quality Troubleshooting C...

Figure 9.25 Facility transformer and main service equipment problem – Power ...

Chapter 10

Figure 10.1 Double Conversion: In normal operation

Figure 10.2 Double Conversion: On Battery

Figure 10.3 Double Conversion: Static Bypass

Figure 10.4 Double Conversion: Internal Bypass

Figure 10.5 Typical 3‐phase Rectifier Schematic utilizing SCRs as the power ...

Figure 10.6 Conventional Delta Conversion Online UPS

Figure 10.7 Conventional Two‐Level inverter topology and Variable Width Puls...

Figure 10.8 Three‐Level inverter topology and Variable Width Pulse Train

Figure 10.9 Silicon Valence Band

Figure 10.10 SiC 73% Reduction Power Losses

Figure 10.11 SiC Heat Loss Energy Savings

Figure 10.12 Diesel UPS

Figure 10.13 N UPS Configuration

Figure 10.14 N+1 UPS Configuration

Figure 10.15 Isolated Redundant UPS Configuration

Figure 10.16 N+2 UPS Configuration

Figure 10.17 2N UPS Configuration

Figure 10.18 2n+1 Configuration

Figure 10.19 Distributed Redundant UPS Configuration

Figure 10.20 Typical Wet Cell Battery

Figure 10.21 Typical VRLA Battery

Figure 10.22 Flooded Cell Battery Room

Figure 10.23 Cutaway of a flywheel

Figure 10.24 An integrated 300 kVA flywheel/UPS

Chapter 11

Figure 11.1 2011 ASHRAE Environmental Guidelines

Figure 11.2 Heat transfer in a simple graphical format.

Figure 11.3 CRAC Unit – Illustrating the Compressors in Place within the Uni...

Figure 11.4 Illustration of CRAH unit with the Chilled Water Valve

Figure 11.5 Heat transfer for air‐cooled heat rejection equipment.

Figure 11.6 Heat transfer for a cooling tower.

Figure 11.7 Generic chiller diagram.

Figure 11.8 Air‐cooled chiller diagram.

Figure 11.9 Typical packaged air‐cooled chiller.

Figure 11.10 Illustration of the CRAC – DX refrigeration loop

Figure 11.11 Water‐cooled chiller diagram.

Figure 11.12 Typical water‐cooled chiller.

Figure 11.13 Schematic overview of a generic cooling tower flow.

Figure 11.14 Direct cooling towers on an elevated platform.

Figure 11.15 Direct or open circuit cooling tower schematic flow diagram....

Figure 11.16 Indirect cooling tower schematic flow diagram.

Figure 11.17 (a) Typical Ice Storage System (b) Typical Chilled Water Storag...

Figure 11.18 Schematic overview of a generic air‐cooled condenser.

Figure 11.19 Side inlet and side outlet, air‐cooled condenser.

Figure 11.20 Bottom inlet and top outlet, air‐cooled condenser.

Figure 11.21 Typical self‐contained condensing unit.

Figure 11.22 Air Side Economizer Alternatives Mild Temperature Operation

Figure 11.23 Air Side Economizer Alternatives Cold Day Operation

Figure 11.24 Air Side Economizer Alternatives Summer/Hot Day Operation

Figure 11.25 Heat wheel for data center application.

Figure 11.26 Simple overview of the heat exchanger process.

Figure 11.27 Shell and tube heat exchanger.

Figure 11.28 Exploded isometric showing plate and frame heat exchanger opera...

Figure 11.29 Installed plate and frame heat exchangers.

Figure 11.30 Typical Computer Room Air Conditioning Unit.

Figure 11.31 CRAC unit located outside the electronic equipment room.

Figure 11.32 Downflow CRAC unit airflow path.

Figure 11.33 Upflow CRAC unit airflow path.

Figure 11.34 Ducted Upflow CRAC unit airflow path.

Figure 11.35 Typical Hot Aisle / Cold Aisle Layout

Figure 11.36 Return Air Ceiling

Figure 11.37 Sealing the Hot Aisle

Figure 11.38 Sealing the Cold Aisle

Figure 11.39 Chimney Cabinets with ducting to the dropped ceiling

Figure 11.40 In Row Cooling Units and the Isolation Configuration Assembly

Figure 11.41 Rear Door Heat Exchanger

Chapter 12

Figure 12.1 Comparison of IT rack inlet temperatures for an inefficiently co...

Figure 12.2 Typical CRAC cooling efficiency increase (normalized at 75°F/23....

Figure 12.3 PUE calculation

Figure 12.4 DCiE Calculation

Figure 12.5 Air cooling of a data center equipment rack.

Figure 12.6 Poor equipment layout causes mixing within the data center.

Figure 12.7 CRAC cooling behavior for various heat loads.

Figure 12.8 Example of a psychrometric chart

Figure 12.9 Active Airflow Management

Figure 12.10 Under floor air mover designed for data center application.

Chapter 13

Figure 13.1 Typical raised floor.

Figure 13.2 A design or working load is a single load applied on a small are...

Figure 13.3 Rolling loads are applied by wheeled vehicles carrying loads acr...

Figure 13.4 Uniform loads are applied uniformly over the entire surface of t...

Figure 13.5 The ultimate load capacity of a floor panel is reached when it h...

Figure 13.6 Impact loads occur when objects are accidentally dropped on the ...

Figure 13.7 Typical bolted stringer understructure and cable tray.

Figure 13.8 Tate Access Floor’s Perf 1000 with airflow chart.

Figure 13.9 Tate Access Floor’s GrateAire™ with an airflow chart.

Figure 13.10 For areas where cables pass through to the plenum, grommets or ...

Figure 13.11 Suction cup lifter.

Figure 13.12 Perf panel lifter.

Figure 13.13 Typical grounding method.

Figure 13.14 Using a consolidation point.

Figure 13.15 Additional support pedestal locations.

Figure 13.16 Interior cutout procedure.

Figure 13.17 Cutout protection.

Figure 13.18 Tilted pedestal.

Figure 13.19 Typical Energy Efficient Data Center Design.

Figure 13.20 Slab floor data center, random rack orientation with up‐flow CR...

Figure 13.21 Slab floor data center, Rack temperatures, and airflow at 6 fee...

Figure 13.22 Raised‐floor data center, random rack orientation with downflow...

Figure 13.23 Raised‐floor data center, Rack temperatures and airflow at 6 fe...

Figure 13.24 Raised‐floor data center, Hot Aisle/ Cold Aisle with downflow C...

Figure 13.25 Raised‐floor data center, Rack temperatures and airflow at 6 fe...

Figure 13.26 Hot Aisle/Cold Aisle with ducted ceiling plenum.

Figure 13.27 Hot Aisle/ Cold Aisle with ducted ceiling plenum. (6 feet).

Figure 13.28 Hot Aisle/Cold Aisle with ducted ceiling plenum. Temperature an...

Figure 13.29 Close coupled liquid cooling for four 11KW racks

Figure 13.30 Close coupled liquid cooling airflow.

Figure 13.31 Comparison of rack intake temperature and return air setpoints ...

Chapter 14

Figure 14.1 Conventional Fire Alarm Control Panel.

Figure 14.2 Vortex Hybrid Fire Suppression System.

Figure 14.3 Detector density increases (spacing is decreased) along with hig...

Figure 14.4 High‐low layouts of smoke detectors include some detectors 3 ft ...

Figure 14.5 Air‐sampling Smoke detection.

Figure 14.6 Typical heat detectors.

Figure 14.7 Typical flame detectors. (Left: Triple I.R., Right: UV/IR combin...

Figure 14.8 Pre‐action/dry‐pipe, or double‐interlock systems are the best te...

Figure 14.9 Galvanized pipe damaged by corrosion.

Figure 14.10 Typical water mist system.

Figure 14.11 Class I water mist system.

Figure 14.12 Typical IG‐541 system.

Figure 14.13 Typical HFC‐227ea system

Chapter 15

Figure 15.1 The UV Electromagnetic Spectrum.

Figure 15.2 UV Disinfection in Hospitals.

Figure 15.3 IR Scans.

Figure 15.4 SmartWALK® Robot

Figure 15.5 Temperature Monitoring Ring

Figure 15.6 SmartWALK™ Dashboard

Appendix A

Figure 1 Business Continuity Disciplines

Figure 2 Laws and Regulations that must be adhered to include

Figure 3 Infrastructure Dashboard with Asset Management

Figure 4 NIST Cyber Security Framework

Figure 5 Production Development, Acceptance, Operations, Support, and Recove...

Figure 6 ISO 27000 Information Security Management System Guidelines

Figure 7 NIPP Risk Management Framework

Figure 8 Security Operations Center overview

Figure 9 Safeguarding data through encryption

Figure 10 Overview of Data Sensitivity and Vital Records Management

Figure 11 Business Continuity Management disciplines and integration

Figure 12 Contingency Recovery Planning

Figure 13 Enterprise Resilience and Corporate Certification

Figure 14 Emergency Operations Center


IEEE Press
445 Hoes Lane
Piscataway, NJ 08854

IEEE Press Editorial Board
Ekram Hossain, Editor in Chief

Jón Atli Benediktsson

David Alan Grier

Elya B. Joffe

Xiaoou Li

Peter Lian

Andreas Molisch

Saeid Nahavandi

Jeffrey Reed

Diomidis Spinellis

Sarah Spurgeon

Ahmet Murat Tekalp

Maintaining Mission Critical Systems in a 24/7 Environment

Third Edition

Peter M. Curtis

Copyright © 2021 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging‐in‐Publication Data:

Names: Curtis, Peter M., author. | John Wiley & Sons, Inc., publisher.
Title: Maintaining mission critical systems in a 24/7 environment / Peter M. Curtis.
Other titles: IEEE Press series on power engineering.
Description: Third edition. | Hoboken, New Jersey : Wiley‐IEEE Press, [2021] | Series: IEEE Press series on power engineering | Includes bibliographical references and index.
Identifiers: LCCN 2020038739 (print) | LCCN 2020038740 (ebook) | ISBN 9781119506119 (cloth) | ISBN 9781119506126 (adobe pdf) | ISBN 9781119506140 (epub)
Subjects: LCSH: Reliability (Engineering).
Classification: LCC TA169 .C87 2021 (print) | LCC TA169 (ebook) | DDC 620/.00452–dc23
LC record available at https://lccn.loc.gov/2020038739
LC ebook record available at https://lccn.loc.gov/2020038740

Cover Design: Wiley
Cover Images: © Sam Robinson/Getty Images, courtesy of Peter M. Curtis

Foreword

Our lives, livelihoods, and way of life are increasingly dependent on computers and data communication, and this dependence increasingly relies on data centers, where servers, mainframes, storage devices, and communication gear are brought together. In short, we are becoming a datacentric or data center society.

We are all witnessing the extraordinary expansion of the Internet, from social media to search engines, games, content distribution, and e‐commerce. The advent of cloud computing, artificial intelligence and machine learning, virtual and augmented reality, blockchain, and the Internet of Things will further amplify the importance of the data center as a hub for the new wired world. As we enter the Fourth Industrial Revolution, every aspect of civilization, in every region of the globe, will see an acceleration of our society’s Digital Transformation. The COVID‐19 pandemic of 2020 has further reinforced the extent to which our world relies on technology. Consequently, there is an ever‐increasing demand on our information infrastructure, especially our data centers, changing the way we design, build, use, and maintain these facilities. However, industry experts have been slow to document and communicate the vital processes, tools, and techniques needed to do this.

Not only is ours a dynamic environment, but it also is complex and requires an understanding of electrical, mechanical, fire protection, and security systems, reliability concepts, operating processes, and much more. I realized the great benefit Peter Curtis' book will bring to our mission critical community soon after I started reviewing the manuscript. I believe this is the first attempt to provide a comprehensive overview of all the interrelated systems, components, and processes that define the data center space, and the results are remarkable.

Data center facilities are shaped by a paradox: critical infrastructure support systems and the facilities housing them are designed to last 15 years or more, whereas the IT equipment typically has a life of about three years. Thus, every few years, we are faced with major IT changes that dramatically alter the computer technology and invariably impact the demand for power, heat dissipation, and the physical characteristics of the facility design and operation. In addition, the last few years have seen a growing focus on energy efficiency and sustainability, reflecting society's effort to reduce its carbon footprint and reverse global warming. Data centers are particularly targeted by these efforts because they are such huge users of power and because they are scrutinized by the public more than most other sectors.

It is no secret that one of the most difficult challenges facing our industry is our ability to objectively assess risk and critical facility robustness. In general, we lack the metrics needed to quantify reliability and availability, as well as the ability to identify and align the function or business mission of each building with its performance expectation.

Other industries, particularly aircraft maintenance and nuclear power plants, have spent years developing analytical tools to assess systems resiliency, and the work has yielded substantial performance improvements. In addition, the concept of reliability is sometimes misunderstood by professionals serving the data center industry. Curtis' efforts to define and explain reliability concepts will help improve performance in the mission critical space.

Further, the process of integrating all of the interrelated components (programming space allocation, design, redundancy level planning, engineered systems quality, construction, commissioning, operation, documentation, contingency planning, personnel training, and so on) to achieve reliability objectives is clear and well‐reasoned. The book plainly demonstrates how and why each element must be addressed to achieve reliability goals. Although this concept appears obvious, it often is not fully understood or accepted, especially in our changing IT world, where we are facing the challenge of finding the right balance between complexity, automation, and human intervention.

The comprehensive review of the essential electrical and mechanical systems populating these facilities, from uninterruptible power supplies to cooling systems, from generators and transfer switches to fire protection systems, yields great benefits, not only from a functional standpoint, but also because it provides the maintenance and testing data needed for effective system operation. And, maybe most importantly, Curtis recognizes and deals with the vital human factor, “… perhaps the most poorly understood aspect of process safety and reliability management.”

This third edition covers important new developments in data center design and operation, including but not limited to modern cooling solutions, advanced UPS technologies, additional considerations on reliability, resiliency, cyber and physical security, green power, and electrical systems maintenance and safety.

As the present generation of professionals who have dedicated their careers to designing, operating, and maintaining critical facilities starts to retire, the next group replacing them needs training, mentoring, and, more importantly, education. This volume will go a long way towards this goal.

I am confident that time will validate the approach and ideas covered here. Meanwhile, we will all profit from this admirable effort to bring a better understanding to this complex and fast‐changing environment.

August 2020
PETER GROSS

Preface

The evolution of our digital society has sped up our lives so significantly that a critical event now unfolds with extreme rapidity. With the exponential growth in technology and constant societal transformation, our critical infrastructure is more susceptible and vulnerable to catastrophic and escalating failures. Consequently, we now need to incorporate a new mindset and intelligent decision‐making structures, processes, and tools to manage everything from a normal day to unpredictable events such as power outages, natural disasters, virus outbreaks such as COVID‐19, or man‐made incidents. This book helps bridge the gap between a critical event, a mission critical engineer, and the critical infrastructure that needs to be in place, so that the incident can be managed with safety, confidence, and situational awareness.

This text intends to provide the foundations of mission critical infrastructure from an engineering and operations perspective. It should be noted that this book is a work in progress and does not include every detail. It does, however, provide foundational topics as well as more advanced subjects of critical infrastructure that are relevant to society today. Topics such as Reliability, Resiliency, Energy Security, UPS Systems, Standby Generators, Automatic and Static Transfer Switches, Power Quality, Data Center Cooling, Efficiency, Air Flow Management, Fire Systems, Raised Floors, Mission Critical Engineering, Safety, Cyber Security, Electromagnetic Pulse, IoT, Machine Learning, Analytics, and Green Practices are discussed in layman’s terms.

It is imperative that anyone wishing to enter into this mission critical industry has the necessary foundation to operate critical systems with the goal of reducing operational risk, improving safety, and decreasing greenhouse gases. Not having the proper standardized training in place will lead to lost lives and every type of cascading failure due to poor operational decisions and a lack of situational awareness. The purpose of this book is to transition today’s facilities and IT engineers into the growing mission critical industry and properly equip them with resources to deal with the hazards and challenges of a field that is a lynchpin for the reliability and resiliency of our ever‐evolving digital society. In the next 7–10 years, half of our workforce will retire, while the industry will at least double in size. It is imperative that an appropriate knowledge transfer occurs for the next generation engineer who will take over the reins.

I have dedicated my career to providing the means for effective knowledge transfer and an appropriate learning environment through a software platform named SmartWALK®. In addition, I have also developed a customizable learning management system known as SmartTEAM™. SmartTEAM™ creates the foundation of corporate and institutionalized customized training, knowledge transfer, awareness, and methodology through various immersive and collaborative content. This learning environment promotes continuous learning and improvement for professionals in the industry throughout their careers.

To learn more about the accompanying study tools offered, go to https://pmcgroupone.com/ and head over to the Technology tab to find SmartTEAM™. If you would like to make a comment regarding new material for the next edition, please feel free to contact me at the below‐mentioned addresses.

August 2020
Peter M. Curtis
Founder | CEO
PMC Group One, [email protected]

Acknowledgements

Creating this book would not have been possible through the effort of only one person. I have attended various conferences throughout my career, including AFE, AFCOM, BOMI, Data Center Dynamics, IFMA, and 7x24 Exchange, and harvested insight offered by many Mission Critical professionals from all walks of the industry. I am grateful for the professional relationships that were built at these conferences, seminars, and courses that I have taught and contributed to. These relationships allowed the sharing of knowledge, know‐how, information, and experiences upon which this book is based. I am grateful to IEEE/Wiley for taking on this project almost 20 years ago. What began as material for their first online educational course grew into an entire manuscript.

Professionals in the Mission Critical field have witnessed the evolution from a fledgling 40‐hour‐a‐week operation into the 24/7 environment that our digital society demands today. The people responsible for the growth and maintenance of the industry have amassed an invaluable cache of knowledge and experience along the way. Compiling this information into a book provides a way for those new to this industry to tap into the years of experience that have accumulated since the industry's humble beginnings decades ago.

This book's intended audience includes every business that understands the consequences of downtime and seeks to improve its own operational efficiency, business resiliency, and safety. Reviewed by members of senior management, technicians, vendors, manufacturers, and contractors alike, this book gives a comprehensive, 360‐degree perspective on the Mission Critical Industry as it stands today. Its importance lies in its use as a foundation for a seamless transition to the next stages of education and training for the mission critical industry as it at least doubles in size and loses half its workforce over the next 7–10 years.

I am thankful to the people and organizations whose help, support, and contributions have enabled this information to be shared with the next generation of mission critical engineers and business continuity professionals. The goal of this book is to provide the best tools and technology to succeed and keep society safe and secure when the unexpected occurs. This happens only with proper training, the outcome of which is situational awareness and confidence when a critical event unfolds.

I am grateful to my wife Belinda‐Leigh for her unwavering support and dedication to my passion and purpose, which is to keep our digital society “Always On” and all people around the globe safe and secure. She has been at my side during the most important and transitionary period of my life, allowing me to intensely focus on Engineering, Technology, Analytics, Research, and Education.

Chapter Contributors

Don Beaty, P.E., DLB Associates (Chapter 11 – Data Center Cooling Systems)

Charles Berry, PMC Group One, LLC (Chapter 10 – UPS Systems & Chapter 12 – Data Center Cooling Efficiency)

Tom Bronack, CBCP (Appendix A – Policies and Regulations)

Dan Catalfu, Tate Access Floors (Chapter 13 – Raised Access Floors)

Howard L. Chesneau, Fuel Quality Services, Inc. (Chapter 6 – Fuel Systems and Design)

George E. Ello, Long Island Power Authority (Chapter 2 – Energy Security)

Edward English III, Fuel Quality Services, Inc. (Chapter 6 – Fuel Systems and Design)

Brian K. Fabel, P.E., ORR Protection Systems (Chapter 14 – Fire Protection)

James P. Fulton, PhD., Suffolk Community College (Appendix C – Airflow Management)

John Golde, P.E., Golde Engineering, PC (Chapter 3 – Mission Critical Electrical System Maintenance and Safety)

Walter Phelps, Degree Controls, Inc. (Chapter 12 – Data Center Cooling Efficiency)

Dean Richards, Mitsubishi Electric Power Products (Chapter 10 – UPS Systems)

Ron Ritorto, P.E., Mission Critical Fuel Systems (Chapter 6 – Fuel Systems and Design)

Technical Reviewers and Editors

Scott Alwine, Tate Access Floors, Inc. (Chapter 13 – Raised Access Floors)

Bill Campbell, Emerson Network Power (Chapter 10 – UPS Systems)

Greg Caronia (Appendix A – Policies and Regulations)

Steve Carter, Orr Corporation (Chapter 14 – Fire Protection)

Thomas Corona, Jones Lang LaSalle (Chapter 4 – Mission Critical Electrical System Maintenance and Safety & Chapter 12 – Data Center Cooling Efficiency)

Peter Davie, P.E., PSEG (Chapter 2 – Energy and Cyber Security and its Effect on Business Resiliency & Chapter 4 – Maintenance and Safety)

John C. Day, PDI Corp. (Chapter 8 – Static Transfer Switches)

John DeAngelo, Power Service Concepts, Inc. (Chapter 10 – UPS Systems)

John Diamond, DAS Associates (Chapter 5 – Standby Generators & Chapter 10 – An Overview of UPS Systems)

Doug Dethmers, East Penn Manufacturing Company (Chapter 10 – UPS Systems)

George E. Ello, Long Island Power Authority (Chapter 2 – Energy and Cyber Security and its Effect on Business Resiliency & Chapter 4 – Maintenance and Safety)

Aisha Farooque, PMC Group One, LLC (Chapter 15 – Managing Through Pandemics)

Michael Fluegeman, P.E., PlanNet Consulting (Chapter 7 – Power Transfer Switch Technology & Chapter 10 – UPS Systems)

Steve Guzzardo, HP (Chapter 1 – Reliability and Resiliency)

Richard Greco, P.E., California Data Center Design Group (Chapter 3 – Mission Critical Engineering)

Patrick Herrley (Chapter 5 – Standby Generators)

Ross M. Ignall, Dranetz (Chapter 9 – Fundamentals of Power Quality)

David Krenzer, Victaulic (Chapter 14 – Fire Protection in Mission Critical Infrastructures)

Ellen Leinfuss, Dranetz‐BMI (Chapter 9 – Fundamentals of Power Quality)

Teresa Lindsey, BITS – Chapter Questions

Wai‐Lin Litzke, Brookhaven National Labs (Appendix A – Policies and Regulations)

Michael Mallia, AFCO Systems (Appendix C – Air Flow Management)

Kevin McCarthy, EDG2 Inc. (Chapter 10 – UPS Systems)

Joseph McPartland III, American Power Conversion (Chapter 10 – UPS Systems)

John Mezic, PMC Group One, LLC (Chapter 3 – Mission Critical Engineering, Chapter 10 – UPS Systems, Appendix B – Mission Critical Questions & Appendix C – Airflow Management – A Systems Approach)

Stefan Miesbach, SIEMENS (Chapter 2 – Energy and Cyber Security and its Effect on Business Resiliency)

Mark Mills, Digital Power Group

Samuel Morales Garcia, PMC Group One, LLC (Chapter 1 – Reliability and Resiliency)

David P. Mulholland, PDI (Chapter 8 – Static Transfer Switches)

Gary Olsen, P.E., Cummins (Chapter 5 – Standby Generators)

Ted Pappas, Keyspan Engineering (Chapter 3 – Mission Critical Engineering)

Anthony Pinkey, Layer Zero Power Systems, Inc. (Chapter 8 – Static Transfer Switches)

Anthony Pinkey, Mitsubishi Power (Chapter 10 – UPS Systems)

Walter Poggi, Retlif Testing Laboratories (Chapter 2 – Energy and Cyber Security and its Effect on Business Resiliency)

Martin Robinson, IRISS (Chapter 4 – Mission Critical Electrical Systems Maintenance and Safety)

Richard Rotanz, Applied Science Foundation for Homeland Security (Appendix A – Policies and Regulations)

Dan Sabino, PMC Group One, LLC (Chapter 7 – Power Transfer Switch Technology)

Douglas H. Sandberg, GHI Group (Chapter 7 – Power Transfer Switch Technology)

Ron Shapiro, P.E., Cosentini Mission Critical (Chapter 4 – Mission Critical Electrical Systems Maintenance and Safety)

Terri Sinski, Strategic Planning Partners (Appendix A – Policies and Regulations)

Robert Sullivan (Chapter 11 – Data Center Cooling Systems)

David Taylor, Victaulic (Chapter 14 – Fire Protection in Mission Critical Infrastructures)

Kenneth Uhlman, P.E., Eaton/Cutler Hammer – Technical Discussions

Steve Vechy, Enersys (Chapter 10 – UPS Systems)

Thank you to Dr. Robert Amundsen, Director of the Energy Management Graduate Program at New York Institute of Technology, who gave me my first teaching opportunity in 1994. It has allowed me to continually develop professionally, learn, and share the information presented in this book with many groups.

I'd like to thank two early pioneers of this industry for defining what Mission Critical really means to me and the industry. I am grateful for the knowledge they have imparted to me: Borio Gatto, for sharing his engineering wisdom, guidance, and advice with me, and Peter Gross, P.E., for his special message in contributing all of the book's Forewords, as well as for expanding my views of the Mission Critical world.

I'd also like to recognize Joseph F. McPartland for his body of work with regard to educating industry professionals on the National Electrical Code.

Thank you to my good friends and colleagues for their continued support, technical dialogue, feedback, and advice over the years: Vecas Gray, P.E., Herb Tracy, Ron Schindel, Steve Davies, Mark Keller, Esq., Abramson & Keller, Joseph Cappiello, PMC Group One, LLC.

I'd like to thank my TAB board of directors and investors for their advice and guidance during the time I was building the framework for all the new technologies.

I would like to express gratitude to all the contributors, students, mentors, interns, and organizations that I have been working with and learning from over the years for their assistance, guidance, advice, research, and organization of the information presented in this book: John Altherr, P.E., Nada Anid, PhD, Elijan “Al” Avdic, Tala Awad, Anna Benson, Charles Berry, William Callan, Harry Cannon, Nancy Camacho, Joseph Cappiello, Ralph Ciardulli, P.E., Charles Cottitta, Guy Davi, Kenneth Davis, Brad Dennis, David Dunatov, Andres Fortino, PhD, Ralph Gunther, P.E., Kevin Heslin, Patrick Hoehn, Lois Hutchinson, Al Law, John Nadon, P.E., John Mezic, P.E., John Montana, Jay O'Neill, Rey Parma, Shawn Paul, Arnold Peterson, P.E., Victoria Pierre‐Louis, Richard Realmuto, P.E., Michael Recher, P.E., Adam Redler, Edward Rosavitch, P.E., Christie Rotanz, Mike Sciroppo, Brad Weingast, Jack Willis, P.E., Anthony Wilson, Thomas Weingarten, P.E., Stephen Worn, Paul Yetman, and my special friends at 7x24 Exchange, AFE Region 4, Data Center Dynamics, Long Island Forum for Technology (www.lift.org), and Mission Critical Magazine.

Dedicated to Brian K. Fabel, Al Baker, Bill Mann, Kenneth Morrelly, Terri Sinski, and Vecas Gray: six inspirational people who contributed vastly to the Mission Critical and Homeland Security Industries and enriched all of our lives. You will be missed.

Lastly, my deepest apologies to anyone I have forgotten.

1 An Overview of Reliability and Resiliency in Today’s Mission Critical Environment

“The best way to predict your future is to create it.”

Abraham Lincoln

1.1 Introduction

Continuous, clean, and uninterrupted power and cooling is the lifeblood of any data center, especially one that operates 24 hours a day, 7 days a week. Critical enterprise power is the power without which an organization would quickly be unable to achieve its business objectives. Today, more than ever, enterprises of all types and sizes are demanding 24‐hour system availability. This means enterprises must have 24‐hour power and cooling day after day, year after year. One such example is the banking and financial services industry. Business practices mandate continuous uptime for all computer and network equipment to facilitate round‐the‐clock trading and banking processes anywhere, and everywhere, from any device in the world. Banking and financial service firms are completely intolerant of unscheduled downtime, given the guaranteed loss of business that invariably results. However, providing the best equipment is not enough to ensure 24‐hour operation throughout the year. The goal is to achieve reliable 24‐hour power, cooling, and processing at all times, regardless of the technological sophistication of the equipment or the demands placed upon that equipment by the end‐user, be it business or municipality.

Today most industries are constantly expanding to meet the needs of the growing global digital economy. The industry as a whole has been innovative in the design and use of the latest technologies, driving its businesses to become increasingly digitized in this highly competitive business environment. The industry is progressively more dependent on the continuous operation of its data centers in reaction to the competitive realities of a global economy. Achieving optimum reliability when the supply and availability of power are becoming less certain is challenging, to say the least. The data center of the past required only the installation of standalone protective electrical and mechanical equipment. Data centers today operate on a much larger scale, 24/7. The proliferation of distributed systems, in which thousands of desktop PCs and workstations connected through LANs, WANs, WLANs, SANs, VPNs, and so on simultaneously use dozens of software business applications and reporting tools, makes each building a “computer room.” These computer rooms are also known as Intermediate Distribution Frame (IDF) and Main Distribution Frame (MDF) critical spaces. When we add up the total number of locations used by each bank all over the world and connected through the internet, we realize the necessity of a critical infrastructure and the associated benefits of uptime and reliability.

The reputation of Corporate America was severely harmed almost two decades ago by a number of historically significant events: the collapse of the dot.com bubble and the high‐profile corporate scandals. These events took a significant toll on financial markets and deflated the faith and confidence of investors. In response, governments and other global organizations enacted new laws, policies, and regulations or revised existing ones. In the United States, laws such as the Sarbanes‐Oxley Act of 2002 (SOX) and the USA PATRIOT Act were enacted, and the international Basel II banking accord was adopted. In addition to management accountability, another embedded component of SOX makes it imperative that companies not risk losing data or even risk downtime that could jeopardize accessing information in a timely fashion. These laws can actually improve business productivity and processes.

Many companies, due to lack of awareness, a misunderstanding of reliability concepts, or other factors, fail to install backup equipment or to design their systems with levels of redundancy commensurate with their risk profile. Then, when a major power outage occurs, these same companies suddenly discover that they will take a huge hit operationally and financially. Only then do they learn that the hit could have been avoided entirely or reduced in magnitude had they taken appropriate action beforehand. During the months following the Northeast Blackout of 2003, for example, there was a marked increase in the installation of UPS systems and standby generators. Small and large businesses alike learned how susceptible they are to power disturbances and the associated costs of not being prepared. Some businesses that were not typically considered mission critical learned that they could not afford to be unprotected during a power outage. The Northeast Blackout of 2003 emphasized the interdependencies across the critical infrastructure and the cascading impacts that occur when one component falters. Most ATMs in the affected areas stopped working, although several had backup systems that enabled them to function for a short period. Soon after the power went out, the Comptroller of the Currency signed an order authorizing national banks to close at their discretion. Governors in a number of affected states made similar proclamations for state‐chartered depository institutions. The end result was a loss of revenue and profits and a threat to confidence in our financial system. More prudent planning and the proper level of investment in mission critical infrastructure for electric, water, and telecommunications utilities, coupled with proactive building infrastructure preparation and operations, could have saved the banking and financial services industry millions.

At the present time, the risks associated with cascading power supply interruptions from the public electrical grid in the United States have increased due to the ever‐increasing reliance on computer and related technologies. This has occurred while investments in the reliability and security of the grid have not kept pace with the levels recommended by industry experts. Today there are trillions of devices and billions of people connected to the World Wide Web. As the number of computers and related technologies continues to multiply in this increasingly digital world, the demand for reliability increases as well. Businesses are not only competing in the marketplace to deliver whatever goods and services are produced for consumption; they must also compete to hire, from a dwindling pool of talent, the best engineers who can design the infrastructures needed to obtain and deliver reliable power and cooling. This keeps the mission critical manufacturing and technology centers up and running with the ability to produce the very goods and services that sustain them. The idea that businesses today must compete for the best talent to obtain reliable power is not new, nor are the consequences of failing to meet this challenge. Without reliable power, there are no goods and services for sale, no revenues, and no profits, only losses when power is not available. Hiring and keeping the best‐trained engineers, who employ the very best analyses, make the best strategic choices, and follow the best operational plans to keep ahead of the power supply curve, is essential for any technologically sophisticated business to thrive and prosper. A key to success is to provide proper training and educational resources to engineers so they may increase their knowledge and keep current on the latest mission critical technologies available all over the world, which is one of the purposes of this content. In addition, companies need to pool their efforts toward improving educational opportunities and certification programs for young mission critical engineers to help address the shrinking workforce needed to sustain the growing mission critical industry.

It is also essential for critical industries to constantly and systematically evaluate their mission critical systems, assess and reassess their level of risk tolerance versus the cost of downtime, and plan for future upgrades in equipment and services that are designed to meet business needs and ensure uninterrupted power and cooling in the years ahead. Simply put, minimizing unplanned downtime reduces risk. Unfortunately, the most common approach is reactive, that is, spending time and resources to repair a failed piece of equipment after the fact as opposed to identifying when the equipment is likely to fail and repairing or replacing it without interruption. If the utility goes down, install a generator. If a ground fault trips critical loads, redesign the distribution system. If a lightning strike burns out power supplies, install a new lightning protection system. Such measures certainly make sense, as they address real risks associated with the critical infrastructure; however, they are always performed after the harm has occurred. Often, such efforts proceed in haste without enough consideration of how the short‐term fix fits into the larger picture of how the facility’s systems should operate in an integrated manner. This can result in the introduction of new vulnerabilities. Strategic planning, on the other hand, can identify internal risks and provide a prioritized plan for reliability improvements that identifies the root causes of failures before they occur.

In the world of high‐powered business, owners of real estate have come to learn that they, too, must meet the demands for reliable power supply to their tenants. As more and more buildings are required to deliver service guarantees, management must decide what performance is required from each facility in the building. Availability levels of 99.999% (roughly 5.26 minutes of downtime per year) allow virtually no facility downtime for maintenance or other planned or unplanned events. Moving toward high reliability is imperative. Moreover, the work of avoiding the landmines that can cause outages and unscheduled downtime never ends. Event planning and impact assessments are tasks that are never truly completed; they should be viewed afresh at least once every budget cycle.
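
The downtime figure quoted for 99.999% availability follows from simple arithmetic: the unavailable fraction of the year multiplied by the number of minutes in a year. The short Python sketch below illustrates that conversion. It is only an illustration under stated assumptions: the function name `annual_downtime_minutes` is hypothetical, and a 365‐day year is assumed (leap years would shift the results slightly).

```python
# Illustrative sketch: convert an availability percentage into the
# downtime it permits per year. Assumes a 365-day (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, allowed at a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    # Fraction of the year during which the facility may be unavailable.
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare common availability targets ("two nines" through "five nines").
    for level in (99.0, 99.9, 99.99, 99.999):
        minutes = annual_downtime_minutes(level)
        print(f"{level}% availability -> {minutes:.2f} minutes of downtime per year")
```

For 99.999% the function returns approximately 5.256 minutes, while 99.9% already permits nearly nine hours per year, which is one way to see why each additional "nine" demands a markedly different level of redundancy and maintenance planning.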