This book provides a comprehensive treatment of investigating chemical process incidents. It presents on-the-job information, techniques, and examples that support successful investigations. The third edition addresses identification and classification of incidents (including near misses), notifications and initial response, assignment of an investigation team, preservation and control of an incident scene, collecting and documenting evidence, interviewing witnesses, determining what happened, identifying root causes, developing recommendations, effectively implementing recommendations, communicating investigation findings, and improving the investigation process. While the focus of the book is investigating process safety incidents, the methodologies, tools, and techniques described can also be applied when investigating other types of events, such as reliability, quality, and occupational health and safety incidents.
Page count: 606
Year of publication: 2019
PUBLICATIONS AVAILABLE FROM THE CENTER FOR CHEMICAL PROCESS SAFETY of the AMERICAN INSTITUTE OF CHEMICAL ENGINEERS
This book is one in a series of process safety guidelines and concept books published by the Center for Chemical Process Safety (CCPS). Please go to www.wiley.com/go/ccps for a full list of titles in this series.
This edition first published 2019
© 2019 the American Institute of Chemical Engineers
A Joint Publication of the American Institute of Chemical Engineers and John Wiley & Sons, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The rights of CCPS to be identified as the author of the editorial material in this work have been asserted in accordance with law.
Registered Office: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
Editorial Office: 111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data is available.
Hardback ISBN: 9781119529071
Cover Images: Silhouette, oil refinery © manyx31/iStockphoto; Stainless steel © Creativ Studio Heinemann/Getty Images, Inc.; Dow Chemical Operations, Stade, Germany/Courtesy of The Dow Chemical Company
It is sincerely hoped that the information presented in this document will lead to an even more impressive safety record for the entire industry. However, the American Institute of Chemical Engineers, its consultants, the CCPS Technical Steering Committee and Subcommittee members, their employers, their employers’ officers and directors, and Baker Engineering and Risk Consultants, Inc.®, and its employees do not warrant or represent, expressly or by implication, the correctness or accuracy of the content of the information presented in this document. As between (1) American Institute of Chemical Engineers, its consultants, CCPS Technical Steering Committee and Subcommittee members, their employers, their employers’ officers and directors, and Baker Engineering and Risk Consultants, Inc.®, and its employees and (2) the user of this document, the user accepts any legal liability or responsibility whatsoever for the consequences of its use or misuse.
Cover
PREFACE
ACKNOWLEDGMENTS
ACRONYMS AND ABBREVIATIONS
1 INTRODUCTION
1.1 BUILDING ON THE PAST
1.2 INVESTIGATION BASICS
1.3 WHO SHOULD READ THIS BOOK?
1.4 THE GUIDELINE’S OBJECTIVES
1.5 THE GUIDELINE’S CONTENT AND ORGANIZATION
1.6 THE CONTINUING EVOLUTION OF INCIDENT INVESTIGATION
2 OVERVIEW OF CHEMICAL PROCESS INCIDENT CAUSATION
2.1 STAGES OF A PROCESS-RELATED INCIDENT
2.2 KEY CAUSATION CONCEPTS
2.3 SUMMARY
3 AN OVERVIEW OF INVESTIGATION METHODOLOGIES
3.1 HISTORY OF INVESTIGATION METHODOLOGIES AND TOOLS
3.2 TOOLS FOR USE IN PREPARATION FOR ROOT CAUSE ANALYSIS
3.3 STRUCTURED ROOT CAUSE ANALYSIS METHODOLOGIES
3.4 SELECTING AN APPROPRIATE METHODOLOGY
4 DESIGNING AN INCIDENT INVESTIGATION MANAGEMENT SYSTEM
4.1 SYSTEM CONSIDERATIONS
4.2 TYPICAL MANAGEMENT SYSTEM TOPICS
4.3 MANAGEMENT SYSTEM
5 INITIAL NOTIFICATION, CLASSIFICATION AND INVESTIGATION OF PROCESS SAFETY INCIDENTS
5.1 INTERNAL REPORTING
5.2 INCIDENT CLASSIFICATION
5.3 INCIDENT NOTIFICATION
5.4 TYPE OF INVESTIGATION
5.5 SUMMARY
6 BUILDING AND LEADING AN INCIDENT INVESTIGATION TEAM
6.1 TEAM APPROACH
6.2 ADVANTAGES OF THE TEAM APPROACH
6.3 LEADING A PROCESS SAFETY INCIDENT INVESTIGATION TEAM
6.4 POTENTIAL TEAM COMPOSITION
6.5 BUILDING A TEAM FOR A SPECIFIC INCIDENT
6.6 TEAM ACTIVITIES
6.7 SUMMARY
7 WITNESS MANAGEMENT
7.1 OVERVIEW
7.2 IDENTIFYING WITNESSES
7.3 WITNESS INTERVIEWS
7.4 CONDUCTING FOLLOW-UP ACTIVITIES
7.5 CONDUCTING FOLLOW-UP INTERVIEWS
7.6 RELIABILITY OF WITNESS STATEMENTS
7.7 SUMMARY
8 EVIDENCE IDENTIFICATION, COLLECTION AND MANAGEMENT
8.1 OVERVIEW
8.2 SOURCES OF EVIDENCE
8.3 EVIDENCE GATHERING
8.4 TIMELINES AND SEQUENCE DIAGRAMS
8.5 SUMMARY
9 EVIDENCE ANALYSIS AND CAUSAL FACTOR DETERMINATION
9.1 SCIENTIFIC METHOD
9.2 CONFIRMATION BIAS
9.3 EVIDENCE ANALYSIS
9.4 HYPOTHESIS FORMULATION
9.5 HYPOTHESIS TESTING
9.6 SELECT THE FINAL HYPOTHESIS
9.7 SUMMARY
10 DETERMINING ROOT CAUSES—STRUCTURED APPROACHES
10.1 CONCEPT OF ROOT CAUSE ANALYSIS
10.2 CASE HISTORIES
10.3 METHODOLOGIES FOR ROOT CAUSE ANALYSIS
10.4 ROOT CAUSE DETERMINATION USING LOGIC TREES
10.5 BUILDING A LOGIC TREE
10.6 EXAMPLE APPLICATIONS
10.7 ROOT CAUSE DETERMINATION USING PREDEFINED TREES
10.8 USING PREDEFINED TREES
10.9 CHECKLISTS
10.10 HUMAN FACTORS APPLICATIONS
10.11 SUMMARY
11 THE IMPACT OF HUMAN FACTORS
11.1 HUMAN FACTORS CONCEPTS
11.2 INCORPORATING HUMAN FACTORS INTO THE INCIDENT INVESTIGATION PROCESS
11.3 OTHER REFERENCES
11.4 SUMMARY
12 DEVELOPING EFFECTIVE RECOMMENDATIONS
12.1 KEY CONCEPTS
12.2 DEVELOPING EFFECTIVE RECOMMENDATIONS
12.3 TYPES OF RECOMMENDATIONS
12.4 THE RECOMMENDATION PROCESS
12.5 SUMMARY
13 PREPARING THE FINAL REPORT
13.1 REPORT SCOPE
13.2 INTERIM REPORTS
13.3 WRITING THE REPORT
13.4 SAMPLE REPORT FORMAT
13.5 REPORT REVIEW AND QUALITY ASSURANCE
13.6 INVESTIGATION DOCUMENT AND EVIDENCE RETENTION
13.7 SUMMARY
14 IMPLEMENTING RECOMMENDATIONS
14.1 ACTIVITIES RELATED TO RECOMMENDATION IMPLEMENTATION
14.2 VALIDATION OF EFFECTIVENESS – CASE STUDIES
14.3 PRACTICAL SUGGESTIONS FOR SUCCESSFUL RECOMMENDATION IMPLEMENTATION
15 CONTINUOUS IMPROVEMENT FOR THE INCIDENT INVESTIGATION SYSTEM
15.1 REGULATORY COMPLIANCE REVIEW
15.2 INVESTIGATION QUALITY ASSESSMENT
15.3 CAUSAL CATEGORY ANALYSIS
15.4 REVIEW OF NEAR-MISS EVENTS
15.5 RECOMMENDATIONS REVIEW
15.6 INVESTIGATION FOLLOW-UP REVIEW
15.7 KEY PERFORMANCE INDICATORS
15.8 SUMMARY
16 LESSONS LEARNED
16.1 VARIOUS SOURCES OF LEARNING FROM INCIDENTS
16.2 IDENTIFYING LEARNING OPPORTUNITIES
16.3 SHARING AND INSTITUTIONALIZING LESSONS LEARNED
16.4 SENIOR MANAGEMENT – INCIDENT SHARING AND COMMITMENT
16.5 EXAMPLES OF SHARING LESSONS LEARNED
16.6 SUMMARY
APPENDIX A. PHOTOGRAPHY GUIDELINES FOR MAXIMUM RESULTS
APPENDIX B. EXAMPLE PROTOCOL – CHECKING POSITION OF A CHAIN VALVE
APPENDIX C. PROCESS SAFETY EVENTS LEVELING CRITERIA
APPENDIX D. EXAMPLE CASE STUDY
APPENDIX E. QUICK CHECKLIST FOR INVESTIGATORS
APPENDIX F. EVIDENCE PRESERVATION CHECKLIST – PRIOR TO ARRIVAL OF THE INVESTIGATION TEAM
APPENDIX G. GUIDANCE ON CLASSIFYING POTENTIAL SEVERITY OF A LOSS OF PRIMARY CONTAINMENT
GLOSSARY
REFERENCES
INDEX
WILEY END USER LICENSE AGREEMENT
Chapter 2
Figure 2.1 Event Tree for a Process-related Incident
Figure 2.2 Swiss Cheese Model
Figure 2.3 Latent (hidden) Failure
Figure 2.4 Incident Prevention Strategies
Figure 2.5 Universal Concept for Controlling Risk (Kletz)
Chapter 3
Figure 3.1 Overview of Investigation Tools
Figure 3.2 Schematic of an MES display (Benner, 2000)
Figure 3.3 Top Portion of the Generic MORT Tree
Figure 3.4 Common Features of Investigation Methodologies
Chapter 4
Figure 4.1 Management System for Process Safety Investigation
Figure 4.2 Checklist for Developing an Incident Investigation Plan
Chapter 5
Figure 5.1 Logic Tree for Determining Incident Classification
Figure 5.2 Example Risk Matrix for Determining Incident Classification
Chapter 6
Figure 6.1 Investigation Team Collaboration
Chapter 7
Figure 7.1 Iteration between Witness and Physical Evidence Collection and Analysis
Figure 7.2 List of Potential Witnesses
Figure 7.3 Illustration of Human Observation Limitations
Figure 7.4 Overview of Interview Process
Chapter 8
Figure 8.1 Iteration between Data Analysis and Data Gathering
Figure 8.2 Forms of Data Fragility
Figure 8.3 As-found Position of Valves—Example Photo
Figure 8.4 Initial Site Visit—Example Photo
Figure 8.5 Timeline Example Based on Precise Data
Figure 8.6 Timeline Example Based on Approximate Data
Figure 8.7 Timeline Example Based on a Combination of Precise and Approximate Data
Figure 8.8 Timeline Tips
Figure 8.9 Sequence Diagram for Tank Overflow Example
Chapter 9
Figure 9.1 Scientific Method Process
Figure 9.2 Basic Steps in Failure Analysis
Figure 9.3 Rules for Causal Factor Charting
Figure 9.4 Example of a Causal Factor Chart
Chapter 10
Figure 10.1 Example of 5 Whys Root Cause Analysis
Figure 10.2 Example of Ishikawa Fishbone Diagram
Figure 10.3 Structured Root Cause Methods Described in This Chapter
Figure 10.4 Flowchart for Root Cause Determination Using Logic Trees
Figure 10.5 Generic Logic Tree Displaying the AND-Gate
Figure 10.6 Generic Logic Tree for a Fire
Figure 10.7 Generic Logic Tree Displaying the OR-Gate
Figure 10.8 Logic Tree using the OR-Gate to establish an Ignition Source
Figure 10.9 Other Symbols Used in Logic Trees
Figure 10.10 Logic Tree Tips
Figure 10.11 Example Top of the Logic Tree, Employee Slip
Figure 10.12 Example Logic Tree Branch Level, Oil Spill
Figure 10.13 Example Logic Tree, Hand-carried Containers
Figure 10.14 Logic Tree, Slip/Trip/Fall Incident
Figure 10.15 Logic Tree Top, Employee Burn
Figure 10.16 Logic Tree Branch, Acid Spray
Figure 10.17 Expanded Logic Tree Sample, Employee Burn
Figure 10.18 Operator Fatality Branch
Figure 10.19 Fire Branch
Figure 10.20 Fact/Hypothesis Matrix for the Kettle Exit Piping Failure
Figure 10.21 Exit Piping Crack Branch
Figure 10.22 Flowchart for Root Cause Determination: Predefined Tree/Checklist
Figure 10.23 Example of Root Causes Arranged Hierarchically within a Section of a Predefined Tree
Figure 10.24 Incident Sequence
Figure 10.25 Complete Causal Factor Chart for Fish Kill Incident
Figure 10.26 Top of the Predefined Tree
Figure 10.27 First Question of the Human Performance Difficulty Category
Figure 10.28 Human Engineering Branch of the Tree
Figure 10.29 Analysis of the Human Engineering Branch
Chapter 11
Figure 11.1 Common Human Factors Model (CCPS, 2007)
Figure 11.2 Example of Poor Pump and Switch Arrangement
Figure 11.3 Incident Causation Model (EI, 2016)
Chapter 12
Figure 12.1 Incident Investigation Recommendation Flowchart
Figure 12.2 Layers of Safety (Foord, 2004)
Figure 12.3 Bow-Tie Barrier Method
Figure 12.4 Example Recommendations and Assessment Strategies (ABS, 2001)
Chapter 14
Figure 14.1 Flowchart for Implementation and Follow-up
Chapter 16
Figure 16.1 Example Safety Alert
Figure 16.2 CCPS Process Safety Beacon
Figure 16.3 ICI Safety Newsletter No. 96/1 & 2
Figure 16.4 ICI Safety Newsletter No. 96/7
Figure 16.5 Learning Event Report Example
Figure 16.6 Process Safety Bulletin Example
Table 2.1 Attributes of a Management System
Table 3.1 Some Characteristics of Selected Public Methodologies
Table 4.1 Suggested Training for Effective Implementation
Table 5.1 Common Classification Schemes
Table 5.2 Tier 1 Process Safety Event Severity Categories
Table 5.3 Example of Likelihood Levels for Determining Incident Classification
Table 5.4 Examples of the Impacts of a 1000-lb Cyclohexane Release
Table 7.1 Example Questions for Witnesses and Emergency Responders
Table 8.1 Scene Activities and Typical Responsibilities
Table 8.2 Examples of Paper Evidence
Table 8.3 Examples of Electronic Data
Table 8.4 Examples of Position Data
Table 8.5 Example Data Collection Form for Recording Physical Evidence
Table 9.1 Example Fact/Hypothesis Matrix – Chemical Reduction Explosion
Table 10.1 Strengths and Weaknesses of the 5 Whys Technique
Table 10.2 Strengths and Weaknesses of Logic Trees
Table 10.3 Strengths and Weaknesses of Predefined Trees
Table 11.1 Human Factors Issues
Table 13.1 Sample Sections of an Incident Investigation Report
Table 13.2 Findings, Causal Factors, Root Causes and Recommendations
Table 13.3 Example Checklist for Written Reports
Table 15.1 Requirement Compliance Checklist
Table 15.2 Investigation Key Element Audit Checklist
Table 15.3 Example Categories for Incident Investigation Findings
Table 15.4 Recommendations Review Checklist
Table 15.5 Example Follow-Up Checklist
Table 16.1 Questions for Identifying Learning Opportunities
The American Institute of Chemical Engineers (AIChE) has helped chemical plants, petrochemical plants, and refineries address the issues of process safety and loss control for over 30 years. Through its ties with process designers, plant constructors, facility operators, safety professionals, and academia, the AIChE has enhanced communication and fostered improvement in the high safety standards of the industry. AIChE’s publications and symposia have become an information resource for the chemical engineering profession on the causes of incidents and the means of prevention.
The Center for Chemical Process Safety (CCPS), a directorate of AIChE, was established in 1985 to develop and disseminate technical information for use in the prevention of major chemical accidents. CCPS is supported by a diverse group of industrial sponsors in the chemical process industry and related industries who provide the necessary funding and professional guidance for its projects. The CCPS Technical Steering Committee and the technical subcommittees oversee individual projects selected by the CCPS. Professional representatives from sponsoring companies staff the subcommittees and a member of the CCPS staff coordinates their activities.
Since its founding, CCPS has published many volumes in its “Guidelines” series and in smaller “Concept” texts. Although most CCPS books are written for engineers in plant design and operations and address scientific techniques and engineering practices, several guidelines cover subjects related to chemical process safety management. A successful process safety program relies upon committed managers at all levels of a company, who view process safety as an integral part of overall business management and act accordingly.
Incident investigation is an essential element of every process safety management program. This book presents underlying principles, management system considerations, investigation tools, and specific methodologies for investigating incidents in a way that will support implementation of a rigorous process safety program at any facility. The principles and suggested practices contained in this expanded third edition are not limited to chemical and petroleum process incidents. The basic concepts and provided examples are equally applicable to mining, pharmaceutical, manufacturing, mail order fulfillment, and numerous other hazardous industries.
A team of incident investigation experts from the petroleum, chemical, and consulting industries, as well as a regulatory agency representative, drafted the chapters for this guideline and provided real-world examples to illustrate some of the tools and methods used in their profession. The subcommittee members reviewed the content extensively and industry peers evaluated this book to help ensure it represents a factual accounting of industry best practices. This third edition of the guideline provides updated information on many facets of the investigative process as well as additional details on important considerations such as human factors, forensics, and legalities surrounding incident investigations.
The American Institute of Chemical Engineers wishes to thank the Center for Chemical Process Safety (CCPS) and those involved in its operation, including its many sponsors whose funding made this project possible; the members of its Technical Steering Committee who conceived of and supported this Guidelines project; and the members of its Incident Investigation Subcommittee. The Incident Investigation Subcommittee of the Center for Chemical Process Safety authored this third edition of the Guidelines for Investigating Process Safety Incidents.
The members of the CCPS Incident Investigation Subcommittee were:
Michael Broadribb, Baker Engineering and Risk Consultants, Inc.
Laurie Brown, Eastman Chemical Company
Chonai Cheung, Contra Costa County
Eddie Dalton, BASF
Carolina Del Din, PSRG
Jerry Forest, Celanese, Subcommittee Chair
Scott Guinn, Chevron Corporation
Christopher Headen, Cargill
Kathleen Kas, Dow Chemical Company
Mark Paradies, System Improvements, Inc.
Nestor Paraliticci, Andeavor
Muddassir Penkar, Evonik Canada Inc.
Morgan Reed, Exponent
Meg Reese, Occidental Chemical Corp.
Marc Rothschild, DuPont
Joy Shah, Reliance Industries Ltd
Dan Sliva, CCPS Staff Advisor
Robert (Bob) Stankovich, Eli Lilly
Lee Vanden Heuvel, ABS Consulting
Terry Waldrop, AIG
Scott Wallace, Olin
Della Wong, Canadian Natural Resources
The third edition was authored by Baker Engineering and Risk Consultants, Inc. The authors at BakerRisk were:
Quentin A. Baker
Michael P. Broadribb
Cheryl A. Grounds
Thomas V. Rodante
Roger C. Stokes
Dan Sliva was the CCPS staff liaison and was responsible for overall administration of the project.
CCPS also gratefully acknowledges the comments and suggestions received from the following peer reviewers:
Amy Breathat, NOVA Chemicals Corporation
Steven D. Emerson, Emerson Analysis
Patrick Fortune, Suncor Energy
Walter L. Frank, Frank Risk Solutions, Inc.
Barry Guillory, Louisiana State University
Jerry L. Jones, CFEI, SBC Global
Gerald A. King, Armstrong Teasdale LLP
Susan M. Lee, Andeavor
William (Bill) D. Mosier, Syngenta Crop Protection, LLC
Mike Munsil, PSRG
Pamela Nelson, Solvay Group
Katherine Pearson, BP Americas
S. Gill Sigmon, AdvanSix
Their insights, comments, and suggestions helped ensure a balanced perspective to this Guideline.
The efforts of the document editor at BakerRisk are gratefully acknowledged for contributions in editing, layout, and assembly of the book. The document editor was Phyllis Whiteaker.
The members of the CCPS Incident Investigation Subcommittee wish to thank their employers for allowing them to participate in this project. Lastly, we wish to thank Anil Gokhale of the CCPS staff for his support and guidance.
ACC: American Chemistry Council
AIChE: American Institute of Chemical Engineers
ALARP: As Low as Reasonably Practicable
ANSI: American National Standards Institute
API: American Petroleum Institute
ARIP: Accidental Release Information Program
ARIA: Analysis, Research and Information on Accidents
ASME: American Society of Mechanical Engineers
BARPI: Bureau for Analysis of Industrial Risks and Pollutions
BP: Boiling Point
BI: Business Interruption
BLEVE: Boiling Liquid Expanding Vapor Explosion
BPCS: Basic Process Control System
C: Consequence factor, related to magnitude of severity
CCF: Common Cause Failure
CCPS: Center for Chemical Process Safety
CE/A: Change Evaluation/Analysis
CEFIC: (European) Chemical Industry Council
CEI: Dow Chemical Exposure Index
CELD: Cause and Effect Logic Diagram
CFD: Computational Fluid Dynamics
CIRC: Chemical Incidents Report Center
CLC: Comprehensive List of Causes
COMAH: Control of Major Accident Hazards
CPQRA: Chemical Process Quantitative Risk Assessment
CSB: Chemical Safety and Hazard Investigation Board (US)
CTM: Causal Tree Method
CW: Cooling Water
D: Number of times a component or system is challenged (hr⁻¹ or year⁻¹)
DCS: Distributed Control System
DIERS: Design Institute for Emergency Relief Systems
DMAIC: Define, Measure, Analyze, Improve, Control
DOT: Department of Transportation
E&CF: Events & Causal Factor Charting
EBV: Emergency Block Valve
EHS: Environmental, Health & Safety
EI: Energy Institute
EPA: United States Environmental Protection Agency
eMARS: European Commission Major Accident Reporting System
EPSC: European Process Safety Centre
ERPG: Emergency Response Planning Guideline
ETA: Event Tree Analysis
F: Failure Rate (hr⁻¹ or year⁻¹)
f: Frequency (hr⁻¹ or year⁻¹)
F&EI: Dow Fire and Explosion Index
F/N: Fatality Frequency versus Cumulative Number
FCE: Final Control Element
FEA: Finite Element Analysis
FMEA: Failure Modes and Effect Analysis
FTA: Fault Tree Analysis
HAZMAT: Hazardous Materials
HAZOP: Hazard and Operability Study
HAZWOPER: Hazardous Waste Operations and Emergency Response
HBTA: Hazard–Barrier–Target Analysis
HE: Hazard Evaluation
HIRA: Hazard Identification and Risk Analysis
HMI: Human Machine Interface
HSE: (UK) Health and Safety Executive
HRA: Human Reliability Analysis
ICCA: International Council of Chemical Associations
IChemE: Institution of Chemical Engineers
IEC: International Electrotechnical Commission
IEEE: Institute of Electrical and Electronics Engineers
IOGP: International Association of Oil & Gas Producers
IPL: Independent Protection Layer
ISA: The Instrumentation, Systems, and Automation Society (formerly, Instrument Society of America)
ISBL: Inside Battery Limits
ISD: Inherently Safer Design
ISO: International Organization for Standardization
JSA: Job Safety Analysis
KPI: Key Performance Indicators
LAH: Level Alarm - High
LAL: Level Alarm - Low
LEL: Lower Explosive Limit
LFL: Lower Flammability Limit
LI: Level Indicator
LIC: Level Indicator - Control
LNG: Liquefied Natural Gas
LOPA: Layer of Protection Analysis
LOPC: Loss of Primary Containment
LOTO: Lockout/Tagout
LSHH: Level Sensor High High
LT: Level Transmitter
MARS: Major Accident Reporting System
MAWP: Maximum Allowable Working Pressure
MCSOII: Multiple-Cause, Systems-Oriented Incident Investigation
MES: Multilinear Event Sequencing
MHIDAS: Major Hazard Incident Data System
MI: Mechanical Integrity
MIC: Methyl isocyanate
MM: Million
MOC: Management of Change
MOM: Singapore's regulatory standard for incident investigation
MORT: Management Oversight Risk Tree
MSDS: Material Safety Data Sheet
NAICS: North American Industry Classification System
NFPA: National Fire Protection Association
N2: Nitrogen
NOM: Mexico's regulatory standard for incident investigations
NTSB: National Transportation Safety Board
OREDA: The Offshore Reliability Data project
ORPS: Occurrence Reporting and Processing System
OSBL: Outside Battery Limits
OSHA: United States Occupational Safety and Health Administration
P_fatality: Probability of Fatality
P_ignition: Probability of Ignition
P_person present: Probability of Person Present
P: Probability
P&ID: Piping and Instrumentation Diagram
PCB: Polychlorinated Biphenyl
PFD: Probability of Failure on Demand
PHA: Process Hazard Analysis
PI: Pressure Indicator
PIF: Performance Influencing Factor
PL: Protection Layer
PLC: Programmable Logic Controller
PM: Preventive Maintenance
PPE: Personal Protective Equipment
PSHH: Pressure Sensor High High
PSI: Process Safety Information
PSID: Process Safety Incident Database
PSM: Process Safety Management; also Canada's (non-regulatory) standard, individualized by district
PSV: Pressure Safety Valve (Relief Valve)
R: Risk
RCA: Root Cause Analysis
RIDDOR: Reporting of Injuries, Diseases and Dangerous Occurrences Regulations
RMP: Risk Management Program (US)
RQ: Release Quantity
RV: Relief Valve
SAWS: China's regulatory guideline for incident investigations
SCAT: Systematic Cause Analysis Technique
SCE: Safety Critical Equipment
SDS: Safety Data Sheets
SEMS: Safety and Environmental Management System
SHE: Safety Health & Environment
SIF: Safety Instrumented Function
SIS: Safety Instrumented System
SMART: Specific, Measurable, Agreed/Attainable, and Realistic/Relevant, with Timescales
SOL: Safe Operating Limit
SOP: Standard Operating Procedure
SOURCE: Seeking Out the Underlying Root Causes of Events
SRK: Skills, Rules, Knowledge
SSDC: System Safety Development Center
STEP: Sequentially Timed Events Plot
T: Test Interval for the Component or System (hours or years)
T_0: starting time
T_n: ending time
TNO: Nederlandse Organisatie voor Toegepast Natuurwetenschappelijk Onderzoek (English: Netherlands Organization for Applied Scientific Research)
UEL: Upper Explosive Limit
UFL: Upper Flammable Limit
VCE: Vapor Cloud Explosion
VLE: Vapor Liquid Equilibrium
XV: Remote Activated/Controlled Valve
Flixborough, Bhopal, Piper Alpha, Deepwater Horizon, Buncefield: all are now synonyms for catastrophe. These names are inextricably linked with images of death, suffering, environmental damage and disastrous loss tied to the production of chemicals, fuels, or oils. An objective review of the world’s industrial history reveals a story punctuated with infrequent yet similarly tragic incidents. Invariably, in the wake of such tragedy, companies, industries, and governments work together to learn the causes. Their ultimate goal is to implement the knowledge acquired through diligent investigation, which in turn can help prevent recurrence or mitigate consequences.
Investigations into catastrophic events have revealed something of major significance: the key to preventing disaster lies first in recognizing leading indicators rather than lagging indicators. Leading indicators exist, and therefore can be uncovered, in incidents that are much less than catastrophic. They can even be seen in so-called near-misses that may have no discernible impact on routine operation. By examining abnormal/upset operations, near-misses, and lower-consequence higher-frequency occurrences, companies may identify deficiencies that, if left uncorrected, could eventually result in serious or even catastrophic events.
The two most significant roles incident investigations can play in comprehensive process safety programs are:
Preventing disasters by consistently examining and learning from near-misses (inclusive of abnormal operations, minor events, etc.); and
Preventing disasters by consistently examining and learning from more serious accidents.
The Center for Chemical Process Safety (CCPS) of the American Institute of Chemical Engineers (AIChE) recognized the role of incident investigation when it published the original Guidelines for Investigating Chemical Process Incidents in 1992.
The first edition provided a timely treatment of incident investigation including:
a detailed examination of the role of incident investigation in a process safety management system,
guidance on implementing an incident investigation system, and
in-depth information on conducting incident investigations, including the tools and techniques most useful in understanding the underlying causes.
The second edition, released in 2003, built on the first text’s solid foundation. The goal was to retain the knowledge base provided in the original book while simultaneously updating and expanding upon it to reflect the latest thinking. That edition presented techniques used by the world’s leading practitioners in the science of process safety incident investigation.
This third edition is a further enhancement of the second edition. Specific emphasis has been placed on updating investigation techniques and analytical methodologies, and applying them to example case studies where possible. Expanded topics include scientific validation of hypotheses, rigorous physical evidence documentation and examination, scientific analysis, hypothesis rejection and substantiation, learnings from repeat incidents, and means to institutionalize learnings within an organization.
Successful investigations are dependent on preplanning, documented procedures, appropriate investigator training and experience, appropriate support from leadership, and necessary resources (personnel, time, and materials), to conduct a thorough investigation. It is imperative that operating organizations conduct careful and comprehensive investigations that are factual and defensible. Developing and following written procedures allows organizations to consistently respond promptly and effectively, establishes the basis for continuous improvement, and helps preserve a company’s “license to operate”.
1.2.1 The First Step in conducting a successful incident investigation is to recognize when an incident has occurred so that an Incident Management System (Chapter 4) can be activated. Linked with incident recognition are Initial Notification, Classification, and Investigation (Chapter 5).
It is important to use standard terminology when referring to incident investigation so that those investigating an occurrence all share a common language that efficiently and accurately supports their investigation objectives. Some investigators may define the terms presented below slightly differently or use other descriptive terms that have the same meaning. Some organizations may desire to further sub-divide these terms into different levels. Within the scope of this book, the following definitions for key terms will apply throughout:
1.2.1.1 Incident—an unusual, unplanned, or unexpected occurrence that either resulted in, or had the potential to result in harm to people, damage to the environment, or asset/business losses, or loss of public trust or stakeholder confidence in a company’s reputation. Some examples are:
process upset with potential process excursions beyond operating limits,
release of energy or materials,
challenges to a protective barrier,
loss of product quality control,
etc.
1.2.1.1(a) Accident—an incident that results in a significant consequence involving:
human impact,
detrimental impact on the community or environment,
property damage, material loss,
disruption of a company’s ability to continue doing business or achieve its business goals (e.g., loss of operating license, operational interruption, product contamination, etc.).
1.2.1.1(b) Near-miss—an incident in which an adverse consequence could potentially have resulted if circumstances (weather conditions, process safeguard response, adherence to procedure, etc.) had been slightly different.
For most occurrences, protective barriers prevent a resultant adverse consequence. Such occurrences are often referred to as near-hits, near-misses, or close calls. For every incident labeled a near-miss, more subtle precursors exist that, if investigated and understood, could provide valuable insights into factors that could be applied to mitigating or preventing other incidents.
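The three definitions above can be sketched as a small data structure. This is a minimal illustration only: the class names, field names, and the single boolean used to separate accidents from near-misses are assumptions made for this example, not part of the book's terminology.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    ACCIDENT = "accident"
    NEAR_MISS = "near-miss"


@dataclass
class Incident:
    """An unplanned occurrence with actual or potential harm (hypothetical record)."""
    description: str
    # True when a significant consequence actually resulted (human impact,
    # community/environmental impact, property damage, or business disruption).
    significant_consequence: bool


def classify(incident: Incident) -> Classification:
    # Every record is already an incident; the split depends only on
    # whether a significant consequence actually occurred.
    if incident.significant_consequence:
        return Classification.ACCIDENT
    return Classification.NEAR_MISS
```

An organization that sub-divides these terms into further levels would replace the boolean with its own severity scale.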
1.2.2 The Second Step in conducting a thorough investigation is to assemble a qualified team (Chapter 6) that will determine and analyze the facts of the incident. This team’s charter is to apply appropriate investigation tools and methodologies (Chapter 3) that will lead to the identification of the latent causes and application of remedies that could have prevented the incident or mitigated its consequence.
1.2.3 The Third Step in incident investigation is to gather information, separate facts from suppositions, analyze data, and determine what happened. Before conducting a cause analysis, a comprehensive and accurate understanding of what happened must first be completed. Witness management (Chapter 7), evidence management (Chapter 8), and evidence analysis and hypothesis testing (Chapter 9) are key concepts to be employed during the investigation process.
1.2.4 The Fourth Step in incident investigation is to determine root causes for the failure(s) that initiated or failed to prevent the incident. Note that root cause is being used in this book in the traditional sense, i.e.:
Root Cause - A fundamental, underlying, system-related reason why an incident occurred that identifies a correctable failure(s) in management systems.
By this definition, a root cause is the most fundamental level in the cause determination, and there is no more fundamental level. Recommendations developed for root causes can prevent, or reduce the likelihood and/or consequence of, the same and similar incidents. Causal factors, by contrast, are invariably contributory in nature and, for the purposes of this book, are defined as:
Causal Factor - A major unplanned, unintended contributor to an incident (a negative event or undesirable condition), that if eliminated would have either prevented the occurrence of the incident, or reduced its severity or frequency.
This definition implies that, if recommendations are based on causal factors, they would only prevent the same incident but not similar incidents from occurring. Therefore, recommendations should be based on root causes.
Once the most likely hypothesis is validated, determining root causes via a structured approach (Chapter 10) will help the investigation team determine all relevant factors. Understanding the impact of human factors is key to identifying root causes and is discussed in detail in Chapter 11. Once root causes have been identified, effective recommendations can be developed (Chapter 12).
1.2.5 The Fifth Step in incident investigation is preparing the investigation report (Chapter 13) which details the facts, findings, and recommendations prepared by the investigation team. Typically, recommendations are written to prevent incident recurrence by:
improving the process technology,
upgrading the operating or maintenance procedures or practices,
improving compliance with existing organizational systems (operational discipline), and
upgrading the management systems (often the most critical area).
1.2.6 The Sixth Step in incident investigation is to implement and communicate the team’s conclusions. After the investigation is completed and the findings and recommendations are issued in the report, a system is needed to implement and audit those recommendations (Chapter 14). This is not part of the investigation itself, but rather the follow-up related to it. Once a technological, procedural, or administrative corrective action is enacted, it is monitored periodically for effectiveness and, where appropriate, modified to meet the intent of the original recommendation. Learnings from an investigation can also be institutionalized and shared throughout the company and industry, particularly with those most affected by the incidents.
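The follow-up system described above can be sketched as a simple tracking record. This is a minimal sketch under stated assumptions: the field names, the `overdue` helper, and the idea of a single due date per recommendation are hypothetical and are not taken from the book.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Recommendation:
    """One investigation recommendation being tracked to closure (hypothetical)."""
    text: str
    owner: str
    due: date
    completed: Optional[date] = None   # set when the corrective action is enacted
    verified_effective: bool = False   # set after a periodic effectiveness check


def overdue(recs: List[Recommendation], today: date) -> List[Recommendation]:
    """Return recommendations that are still open past their due date."""
    return [r for r in recs if r.completed is None and r.due < today]


def awaiting_verification(recs: List[Recommendation]) -> List[Recommendation]:
    """Return completed recommendations not yet checked for effectiveness."""
    return [r for r in recs if r.completed is not None and not r.verified_effective]
```

Keeping completion and effectiveness as separate fields reflects the point that an enacted action is still monitored and, where appropriate, modified.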
These six steps will result in the greatest positive effect when they are performed in an atmosphere of openness and trust. Management demonstrates, by both word and deed, that the primary objective is not to assign blame, but to implement system fixes and share learnings for the sake of preventing future incidents. This book helps organizations define and refine their incident investigation systems to achieve positive results effectively and efficiently.
This book assists three target groups:
Incident investigation team leaders
Incident investigation team members
Corporate and site process safety managers and coordinators
This book provides a valuable reference tool for anyone directly involved in leading or participating on incident investigation teams. It presents knowledge, techniques, and examples to support successful investigations. This book offers a model for success in building or upgrading an incident investigation program.
Like previous editions, the book remains focused primarily on investigating process-related incidents. Most organizations find that integrating process safety with other types of investigations provides an opportunity to enhance any investigation. Readers will find that the methodologies, tools, and techniques described in the following chapters may be successfully applied when investigating other types of occurrences such as operational reliability, product quality, and occupational health and safety incidents.
Readers should be able to achieve the following objectives.
Describe the basic principles behind successful incident investigations.
Identify the essential features of a management system designed to foster and support high quality incident investigations.
List detailed steps for planning and conducting incident investigations, including investigative tools, techniques, and methodologies for determining causes.
Use the findings of an investigation to make effective recommendations that can reduce the likelihood of recurrence or mitigate the consequences of similar incidents (or even dissimilar incidents with common root causes).
Plan an effective system for documenting, communicating, and resolving investigation findings and recommendations, including a method to track resolution of incident recommendations.
Effectively share the learnings of investigations and institutionalize learnings to prevent the lessons from being lost over time.
The summaries below provide an overview of the content and organization of the book chapter-by-chapter to assist in quickly locating a particular area of interest.
This chapter discusses the basics of determining incident causation, general types of incidents, and the linkage between causation theories, root causes, and management systems. Understanding incident sequence models, barrier analysis, and failure modes can greatly assist investigators in dissecting the anatomy of process incidents.
This chapter provides an overview of investigation methodologies, associated tools, and techniques that come together to form a modern structured investigation approach. An overview of the historical transition is provided along with a description of the methodologies and tools most commonly used by CCPS members.
This chapter provides an overview of a management system for investigating process safety incidents. It opens with a review of responsibilities from management through the workforce and presents the important features that a management system can address to be effective. It examines systematic approaches that start with notification, team structure, functional and agency integration, document control, team objectives, etc. The learning objective is to define a management system that supports incident investigation teams, root cause determinations, effective recommendation implementation, follow-up, and continuous improvement.
Timely reporting of incidents enables management to take prompt preventative or corrective measures to mitigate consequences. Many major process safety incidents were preceded by precursor occurrences (typically referred to as near-misses) that might have gone unrecognized or ignored because “nothing bad” actually happened. The lessons learned from any incident can be extremely valuable. However, this benefit is only realized when incidents are recognized, reported, and investigated. This chapter describes important considerations for internal reporting of incidents, the process of classifying incidents into categories, and means for determining appropriate levels of investigation to be conducted.
Personnel with proper training, skills, and experience are critical to the successful outcome of an incident investigation. This chapter describes team composition as a function of incident type, complexity, and severity, and includes suggested training topics. It also provides team leaders with a high-level overview of the basic team activities typically required in the course of conducting an investigation.
This chapter discusses techniques for identifying witnesses and effective interviewing techniques designed to obtain reliable information from them. Witnesses often hold the most intimate knowledge of conditions at the time of the incident, actions taken pre-incident and post-incident, process design and operations, etc. Effective management of witnesses is a crucial element of the investigation process. Issues related to witness interactions and interviewing techniques are covered in detail.
Facts are the fuel that an investigation needs to reach a successful conclusion. This chapter addresses the methods and practical considerations of data-gathering and archiving activities. It describes plan development; priority establishment; different types and sources of data; data-gathering tools, techniques, and preservation; documentation requirements; photography and video techniques; suggested supplies; etc.
This chapter provides practical guidelines for analyzing evidence, proving/disproving hypotheses, and developing causal factors. The use of a scientific methodology to sort out facts from collected data is explained, and techniques are offered for use during this iterative and overlapping process. Identifying causal factors is an intermediate step towards determining root causes, and implementing recommendations based on root causes should inherently address the causal factors as well.
This chapter addresses methods and tools used successfully to identify multiple root causes. Process safety incidents are almost always the result of more than one root cause. This chapter provides a structured approach for determining root causes. It details some powerful, widely used and proven tools and techniques available to incident investigation teams, including timelines, fault trees, logic trees, predefined trees, checklists, and application of human factors. Examples are included to demonstrate how they apply to the types of incidents readers are likely to encounter.
This chapter describes human factor considerations in incident investigation. It provides insight and tools to identify and address applicable human factor issues throughout an investigation. Practical models are presented along with examples.
Once the likely causes of an incident have been identified, investigation teams evaluate what can be done to help prevent recurrence or mitigate consequences. The incident investigation recommendations are the product of this evaluation. This chapter addresses types of recommendations, attributes of high quality recommendations, methods to document and present recommendations, and related management responsibilities.
In the case of incident investigation, a major milestone is completed when the final incident investigation report is submitted. The incident report documents the investigation team’s findings, conclusions, and recommendations. This chapter describes practical considerations for writing formal incident reports, and discusses the attributes of quality reports and differences among incident notifications, interim reports, and a final report. Considerations and associated practical techniques are provided for stating report scope, preparing preliminary notices, documenting the investigation process and results, developing a report format, and performing a quality assurance check that includes management review and approval.
The recommendations generated from an incident investigation, when implemented in a timely and effective fashion, decrease the probability of recurrence and/or reduce the potential consequences of an event. This chapter begins with case examples that underscore key concepts, and then focuses on the critical aspects of effectively implementing recommendations. It addresses initial resolution of the recommendations, their full implementation, effectiveness of follow-up, and tracking.
The adage “if it ain’t broke, don’t fix it” does not apply to process safety management systems. A continuous improvement pillar is an integral part of the process safety management system. This chapter describes techniques that can help the incident investigation element of process safety remain strong and viable in an ever-changing technical, business, and regulatory environment. It includes considerations for assessing existing incident investigation programs as well as approaches for implementing continuous improvement.
Sharing lessons learned, not only across the organization but also across industry and related agencies, is an extremely effective way to learn from the occurrences of others. This chapter focuses on how to obtain and critically analyze incident information and share core learnings, and provides examples.
The appendices provide a wealth of supplemental information on the subject of incident investigation. Topics include:
A. Photography Guidelines for Maximum Results
B. Example Protocol – Checking Position of a Chain Valve
C. Process Safety Events Leveling Criteria
D. Example Case Study
E. Quick Checklist for Investigators
F. Evidence Preservation Checklist – Prior to Arrival of the Investigation Team
G. Guidance on Classifying Potential Severity of a Loss of Primary Containment
The glossary provides definitions of terms used throughout the book. To the greatest extent possible, definitions are consistent with the CCPS Process Safety Glossary (https://www.aiche.org/ccps/resources/glossary).
An extensive list of references is assembled to allow the reader to obtain the source reference papers and reports for investigation methodologies.
Like all the elements of process safety management, the incident investigation element continues to evolve. The AIChE Center for Chemical Process Safety assists this evolution by providing information to help companies safely operate process facilities. To this purpose, CCPS and the contributing authors offer this third edition of this guidebook on investigating process safety incidents.
For an investigation of a chemical process incident to be effective, the investigation team should apply a systematic approach that identifies the root causes of the incident, as defined in Chapter 1. As a rule, the benefits of this systematic approach result from:
Applying a consistent and effective investigative effort, and
Implementing sound process safety management principles.
The investigation team should apply an approach based on basic incident causation concepts. When a system or process fails, it may be difficult to trace the reasons for its failure. Based on available historic incident data, the makeup of a major incident is rarely simple and rarely results from a single root cause. Serious process safety incidents typically involve a complex sequence of occurrences and conditions that can include, but are not limited to:
equipment faults or faulty design,
latent unsafe conditions,
environmental circumstances, and
human errors.
Understanding the concepts of incident causation is essential to comprehensively investigate incidents and prevent their recurrence or mitigate their consequences through implementation of effective recommendations.
Numerous theories and models of incident causation have been developed over the years (Heinrich, 1936; Gibson, 1961; Recht, 1965; Haddon, 1980; Peterson, 1984, etc.). These theories and models may appear at first to be diverse and disparate, but they do contain a number of common themes and concepts. As a result of this research, industry best practices in incident investigation have evolved significantly over the last few decades, based upon a number of key incident causation theories.
This chapter discusses models that illustrate how a process safety incident can develop in a staged manner, often as a result of weaknesses in the management system. It also provides a brief overview of key causation concepts such as loss of primary containment, linkage between root causes and the management system, involvement of human factors, and multiple root causes.
Experience from systematic analyses of past process safety incidents has allowed researchers to develop incident models that display the makeup of a process-related incident using a conceptual framework.
The progression of any process-related incident could be described as occurring in three different phases or stages (DoE, 1985):
Change from normal operating state into a state of abnormal (or disturbed) operation, i.e. a deviation from intended safe operation.
Loss of control of the abnormal operating phase, which may involve a breakdown of a barrier function. A barrier function is a safety feature such as a shutdown valve or containment system, a procedure, or the communication system. When safety systems fail, the incident can evolve from an undesirable occurrence to a near-miss and, if enough barriers fail, the incident could progress to an operational interruption or accident, depending upon the consequences or circumstances.
The severity of subsequent consequences is influenced by [the impact of] loss of control of energy accumulations. Process safety incidents can involve different hazardous energies, such as chemical, mechanical, electrical, thermal and pressure.
This model introduces the general concept that there is typically a sequence of events leading to a process safety incident. Understanding the sequence of events and the barriers that have failed can help investigators understand the progression of an incident.
An event tree model is an example of a more structured conceptual framework encompassing the three phases of a process-related incident. Figure 2.1 illustrates an example of an event tree of incident causation.
Figure 2.1 Event Tree for a Process-related Incident
In the Figure 2.1 example, there is:
Deviation from normal operation into abnormal operation. An example is the tank level deviation, which could be caused by various events or conditions, such as: operator error, faulty instrumentation, etc.
Breakdown of control of the abnormal operation. An example is the distributed control system (DCS) not compensating properly. Another example is the operator not detecting the deviation.
Loss of control of energy. An example is the operator not responding, which allows the tank to overflow.
This example has three contributors to incident causation in each of the three phases: equipment, process systems, and human. Under different circumstances, the organization, the environment and/or external factors may also contribute. There are two detection systems and two intervention opportunities. Depending on the success or failure of each, there are three potential paths that result in no adverse consequences and four potential paths that lead to failure, with overflow as the immediate consequence. Note that sometimes there are more opportunities for things to go wrong than to go right and the event tree clearly depicts the specific paths that can lead to an undesired event.
This example illustrates that event trees can be useful models of an incident sequence because they provide a graphical, logic-based depiction of the various potential consequences that could occur, depending on the pathway of an event. This is a more structured sequence model than the three-phase model, but it does not fully address the weaknesses in barriers and the management systems behind them.
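The logic of an event tree can be sketched in a few lines of code. The example below is a simplified, hypothetical tank-overflow tree with three branch points; it does not reproduce Figure 2.1, and the branch names, the success probabilities, and the assumption that branches are independent are all illustrative.

```python
from itertools import product

# Hypothetical branch points with assumed success probabilities (illustrative only).
BRANCHES = {
    "DCS compensates": 0.90,
    "operator detects deviation": 0.80,
    "operator responds in time": 0.95,
}


def outcome(dcs_ok: bool, detected: bool, responded: bool) -> str:
    """Map one path through the tree to its consequence."""
    if dcs_ok:
        return "no consequence"
    # The operator can only intervene successfully after detecting the deviation.
    if detected and responded:
        return "no consequence"
    return "tank overflow"


def outcome_probabilities() -> dict:
    """Sum path probabilities per consequence, assuming independent branches."""
    totals = {"no consequence": 0.0, "tank overflow": 0.0}
    probs = list(BRANCHES.values())
    for path in product([True, False], repeat=len(probs)):
        p = 1.0
        for success, p_success in zip(path, probs):
            p *= p_success if success else 1.0 - p_success
        totals[outcome(*path)] += p
    return totals


if __name__ == "__main__":
    for name, p in outcome_probabilities().items():
        print(f"{name}: {p:.3f}")
```

In an investigation the tree is usually read qualitatively, to see which path was actually followed; the probabilities here only show how the branches combine.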
Another way to represent the staged events and conditions that result in an incident is by using the Swiss Cheese model (Reason, 1990). This model takes one of the failure paths defined in the event tree that leads to a consequence of concern. The protective barriers (safety systems) are represented by parallel slices of Swiss cheese. These barriers represent the equipment, procedures/practices, and people that comprise elements of the management system for the facility.
Ideally each barrier should be robust, but like the holes in Swiss cheese, all barriers have weaknesses (Figure 2.2) resulting from:
Active failures (e.g., equipment failures, unsafe acts, human errors, procedural violations, etc.).
Latent failures (e.g., design/equipment deficiencies, inadequate/impractical procedures, time pressure, unsafe conditions, fatigue, etc.) – see Section 2.1.4 below.
Figure 2.2 Swiss Cheese Model
These weaknesses can lead to management system failures resulting in a process safety incident (see Section 2.2.2 below).
In reality, the holes or weaknesses are not static; they are dynamic and continually open and close. For example, one personnel shift may be more experienced and diligent than another, so that some barriers begin to degrade further at shift change. Each barrier may not work when needed, and is fully dependent on management system implementation to ensure a reasonable probability of working on demand.
If a weakness occurs in one barrier, there may be one or more other barriers that can provide sufficient protection and, while the weakness may have an undesirable outcome, it is unlikely that a significant incident will occur. However, most process safety incidents involve a combination of multiple active and latent failures. Therefore, investigators should understand that no layer of protection is perfect, and look for weaknesses in all barriers.
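The idea that an incident requires the holes in every barrier to line up, and that the holes change over time, can be illustrated numerically. The sketch below is an assumption-laden simplification: the barrier names and failure probabilities are invented, and the barriers are treated as independent, which rarely holds in practice because a single latent failure can weaken several barriers at once.

```python
from math import prod

# Hypothetical probability that each barrier fails on demand, for two shifts.
BARRIER_FAILURE = {
    "experienced shift": {
        "alarm": 0.05,
        "operator response": 0.05,
        "relief system": 0.01,
    },
    "less experienced shift": {
        "alarm": 0.05,
        "operator response": 0.20,  # a larger "hole" in this slice
        "relief system": 0.01,
    },
}


def all_barriers_fail(failure_probs: dict) -> float:
    """Probability that every barrier fails together, assuming independence."""
    return prod(failure_probs.values())


if __name__ == "__main__":
    for shift, probs in BARRIER_FAILURE.items():
        print(f"{shift}: {all_barriers_fail(probs):.1e}")
```

Even in this toy case, degrading one barrier changes the combined result in direct proportion, which is why investigators look for weaknesses in all barriers rather than only the last one to fail.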
The Swiss Cheese model introduced the concept of latent failures (also known as latent conditions). Historic incident data show that latent failures have played an important role in incident causation (Reason, 1990). The term latent failure implies the condition is dormant or hidden. Normally the latent failure can be revealed before an incident occurs, through testing or auditing during typical operations within the process, as shown in Figure 2.3.
Figure 2.3 Latent (hidden) Failure
There is always a possibility, however, that a latent failure may remain hidden during testing. There are several reasons a latent failure may not be detected, including, but not limited to:
It was not activated by the test used.
The test was deficient, gave wrong results, or did not test the system properly.
The test activity itself activates failure upon the next use of the process.
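The link between testing and hidden failures can be made concrete with a common reliability approximation that is not taken from this chapter: for a single component with a constant rate of hidden failures, tested at a fixed interval T, the average probability of failure on demand (PFD) is roughly the failure rate times T divided by 2. The sketch below assumes that the test reveals every hidden failure, that repair is immediate, and that the failure rate times T is much less than 1; the function name and inputs are hypothetical.

```python
def average_pfd(failure_rate_per_hour: float, test_interval_hours: float) -> float:
    """Approximate average PFD for one periodically tested component.

    Uses PFD_avg ~ lambda * T / 2, which holds only when lambda * T << 1
    and each test finds and corrects every hidden (latent) failure.
    """
    return failure_rate_per_hour * test_interval_hours / 2.0


if __name__ == "__main__":
    # Illustrative values: 1e-5 hidden failures per hour, tested yearly vs. quarterly.
    print(average_pfd(1e-5, 8760))
    print(average_pfd(1e-5, 2190))
```

The reasons listed above are exactly the cases in which this approximation is optimistic: a test that does not activate the failure, or that is itself deficient, leaves the latent failure in place however short the interval.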
It is important that investigators understand the concept that latent failures can contribute to an incident, in addition to more obvious active factors, such as unsafe acts and spontaneous equipment failure. Latent failures may involve organizational influences, inadequate supervision, human error and equipment/system preconditions that were hidden from, or unknown to, personnel responsible for the process.
Some of the common concepts from incident causation theories that are relevant to the investigation of process safety incidents are:
There is potential or actual loss of containment or energy,
There is a direct linkage between root causes and the management system,
Most incidents involve human factors,
Each incident will likely have multiple root causes,
Events are not root causes, and
Risk is not reduced until effective remedies are implemented.
Each of these causation concepts and a number of avoidable pitfalls that incident investigators should be aware of are discussed below.
