A straightforward explanation of root cause analysis and systems thinking, illustrating, with real-world examples and first-hand accounts, why things can 'slip through our fingers' and what to do to reduce the chances of things going off track. Beyond the Five Whys summarises, for the first time, many of the tried and tested ways of understanding problems, using insights from aviation, high-reliability organisations and a range of thought-provoking sources. The book provides readers with a clear and structured explanation of how to analyse setbacks and head off problems in the first place. It will challenge much of the received wisdom, such as the idea that there can be just one root cause, or that a person or a bad culture could be a root cause. Specific areas covered:
* Learn what root causes are, how they differ from immediate and contributing causes, and why it's so important to go beyond the Five Whys technique for root cause analysis.
* Recalibrate the way you think about things going wrong, incorporating insights from systems thinking, so you can be clearer about what 'cultural' or systemic problems mean in practice.
* Learn about the eight principal ways things can slip through our fingers.
* Go beyond the blame game and firefighting to avoid the never-ending cycle of repeating issues.
* Strengthen your ability to read the output of a 'lessons learned' or inquiry report.
* Get a fresh perspective, using these techniques, on why the Titanic tragedy turned out so badly, and understand the numerous parallels between what happened then and a range of recent setbacks, such as the Covid-19 pandemic.
* Consider the broader application of these techniques to some of the challenges we face in the 21st century.
Beyond the Five Whys also contains supplementary guidance on how to make improvements in an organisation. It will be of value to business managers and those in specialist roles, such as GRC, ESG, risk, compliance, quality, project management, H&S, IT and internal audit.
Page count: 476
Year of publication: 2023
Cover
Table of Contents
Title Page
Copyright
Dedication
Introduction
Notes
Section 1: A High‐Level Overview Of RCA and Systems Thinking
Chapter 1: Critical Points Concerning Root Cause Analysis (RCA)
Immediate, Contributing and Root Causes
RCA Fundamentals: Facts, Timelines and Causality
The Bowtie Diagram: Thinking through What Can Happen and What to Do about It
Prevention, Detection and Recovery Measures
High‐Reliability Organisations and RCA
The 5 Whys (Five Whys) and More
Rare, If Ever, to Have Just One Root Cause: Beyond the Five Whys
Summary
Notes
Chapter 2: The Fishbone Diagram and Eight Ways of Understanding Why
Can Root Cause Analysis Be Applied to Minor Defects? Insights from Lean Ways of Working
The Fishbone Diagram and Different Categories of Causes
The Modified Fishbone Diagram: Eight Ways of Understanding Why
Summary
Notes
Chapter 3: Systems Thinking and Eight Ways to Understand Why, with Connections
Systems Thinking – The Value of Stepping Back and Seeing the Bigger Picture
Illustrating the Power of Thinking in Systems – Causal Loop Analysis
The Value of Stepping Back to See Patterns, Tipping Points and Vicious Circles
Every System Is Perfectly Capable of Giving the Results it Currently Gives
A Systems Perspective on the Financial Crisis of 2007–2008
When to Stop and Boundary Diagrams
Eight Ways to Understand Why, with Connections
Summary
Notes
Section 2: Eight Ways to Understand Why, with Connections
Chapter 4: Setting Appropriate Strategies, Goals, and Understanding Risks
The Successes and Failures of the Wright Brothers at Kitty Hawk
Realism and Learning in Pharmaceuticals
Business Case Transparency – Balancing Optimism and Realism with an Eye on Downside Risks
Summary
Practical Suggestions
Notes
Chapter 5: Having Timely, Accurate Data, Information and Communications
Lessons Learned in Cockpit Design
Accounting and Business Performance Measurement
Summary
Practical Suggestions
Notes
Chapter 6: Setting Appropriate Roles, Responsibilities, Accountabilities and Authorities
In Many Contexts, Clear Roles and Checks and Balances Are Essential
Summary
Practical Suggestions
Notes
Chapter 7: Ensuring a Suitable Design
Ahead of Its Time and Still Standing Today: The Eiffel Tower
Good Design Applies to More Than Just Buildings and Structures
Summary
Practical Suggestions
Notes
Chapter 8: Understanding and Accommodating External Factors
External Factors Are Not the Root Cause of Problems
Know Your Customer
Summary
Practical Suggestions
Notes
Chapter 9: Effective Building, Maintenance and Change
One Side of the Coin: Building, Change Management and Project Management
The Other Side of the Coin: Keeping Things the Same
Summary
Practical Suggestions
Notes
Chapter 10: Understanding and Adapting to ‘Human Factors’
Human Error and the Evolution of Human Factors Thinking
Human Variability and ‘the Miracle on the Hudson’
Human Factors
Just Culture – Balancing Holding People to Account Against Unfair Scapegoating
Reframing What Needs to Be Fixed
Considering ‘Culture’, ‘Tone at the Top’ and ‘Behavioural Risk’
Human Factors in Aviation
The Consequences of Machiavellian Behaviour and What to Do About It
Silence in Many Quarters About Organisational Politics
Summary
Practical Suggestions
Notes
Chapter 11: Addressing Resources, Priorities and Dilemmas
Safe Staffing
Prioritisation in Aviation
Prioritisation in Other Contexts
Addressing Dilemmas and Managing Trade‐Offs
Summary
Practical Suggestions
Notes
Chapter 12: The Titanic Tragedy and Parallels with Modern Disasters
Some Basic Facts About the Titanic Tragedy
The Scope and Boundaries of the Two Titanic Inquiries
The Essential RCA Questions and Reflections on the Rigour of the Inquiries
The Titanic Tragedy – Overall Root Causes
The Titanic Tragedy, in Summary: An RCA and Systems Analysis Perspective
Notes
Section 3: Taking Action, Now and In the Future
Chapter 13: The Challenges with Action Planning
Diagnosis, Then Treatment
Be Prepared for Not All Root Causes to Be Addressed
A Paradox at the Heart of an RCA or Inquiry Scope
Some Strategies for Overcoming Barriers to Action
Other Considerations
Notes
Chapter 14: Where Are We Now and Looking Ahead
Where Are We Now: Going Beyond the Five Whys
Making Improvements in Your Working Environment (If Applicable)
Things to Consider, Taking a Broader Perspective, and Looking at Current and Future Challenges
No Organisation or Person Should Be ‘Above’ the Need to Learn
‘The Journey of a Thousand Miles Begins with One Step’ – Lao Tzu
Notes
Appendix A: Practical Advice on Action Planning
The Types of Actions That Are More and Less Likely to Address Root Causes
Going Beyond Training: Sustaining Interest and Vigilance
Timescales for Remediation and Tracking Remediation Progress
Validating Closure of Action Plans
Notes
Appendix B: Practical Advice to Improve RCA in an Organisation
Where to Focus Your Improvement Efforts
Making the Most of Existing RCA Expertise
Building a Case for Change
Selecting Appropriate RCA Techniques
Thematic Analysis – Utilising RCA for Insights at an Organisational Level
Reflections from Managers Who Have Improved Their RCA Practice
Supplement B1: List of Different RCA and Systems Analysis Techniques
Notes
Appendix C: Practical Advice for Internal Audit and Others in an Audit or Inspection Role
Getting Your Bearings
Start Thinking About RCA When Making Audit/Inspection Plans
Consider Root Causes from the Start of Any Assignment
Incorporate RCA Thinking into the Audit/Inspection Methodology and Work Programmes
Use Root Causes to Consolidate Observations and Improve Action Plans
Be Prepared to Look in the Mirror
Applying RCA to Day‐to‐Day Challenges the Audit Team Encounters
Other Materials That May Be of Use
Seeing RCA Improvement as a Journey and Remembering Mindset Change Takes Time
Supplement C1: Calling Out Unhelpful Patterns and Cultural Problems
Supplement C2: Overcoming Resistance to Act – Joining the Dots
Recruitment
Bond Issues
Acknowledgements
Index
End User License Agreement
Introduction
Illustration 0.1 Using Tools and Insights for the journey ahead
Chapter 1
Diagram 1.1 Bowtie diagram (illustrative)
Diagram 1.2 Single‐loop learning
Diagram 1.3 Double‐loop learning
Diagram 1.4 Three‐way five whys
Diagram 1.5 Fault tree showing three‐way five whys
Chapter 2
Diagram 2.1 Fishbone diagram
Diagram 2.2 Eight ways to understand why: a modified fishbone diagram
Chapter 3
Diagram 3.1 An organisation as an open system (illustration)
Diagram 3.2 Electric vehicle sales, impact of charging points (causal loop d...
Diagram 3.3 Causal loop analysis – 2007–2008 financial crisis
Diagram 3.4 Boundary diagram choices for RCA (illustration)
Diagram 3.5 Eight ways to understand why, with connections (illustration)
Chapter 4
Illustration 4.1 Goals, risks and acceptable losses
Diagram 4.2 Eight ways to understand why, with connections – scaling up surp...
Chapter 5
Illustration 5.1 Communication and information
Table 5.2 Disciplined filtering upwards including FMEA ratings
Illustration 5.3 Perspectives on progress
Illustration 5.4 How will you reach your goal?
Chapter 6
Illustration 6.1 Accountability
Table 6.2 Accountability mapping as a tool to clarify role clarity or role c...
Diagram 6.3 The systemic nature of unclear roles, responsibilities etc. (ill...
Chapter 7
Illustration 7.1 Design
Chapter 8
Illustration 8.1 External factors
Chapter 9
Illustration 9.1 Building and maintenance
Table 9.2 Overview of Key Findings Concerning the Fire Control project
Diagram 9.3 Challenges with a project where there are stretch goals, top dow...
Chapter 10
Illustration 10.1 The Human Factor
Table 10.2 Just Culture – Action Options (illustration of possible responses...
Illustration 10.3 Group dynamics – fight flight
Illustration 10.4 Political animals (after Kim James & Simon Baddeley)
Chapter 11
Illustration 11.1 Safe staffing
Diagram 11.2 Outsourcing – tensions in what is wanted
Chapter 12
Diagram 12.1 Bowtie diagram applied to the Titanic/Iceberg strike
Illustration 12.2 The Titanic tragedy
Diagram 12.3 The Titanic tragedy – many lines of defence in place
Diagram 12.4 Titanic tragedy – reducing the risk of encountering ice
Chapter 13
Diagram 13.1 RCA does not ensure everything will get fixed
Chapter 14
Illustration 14.1 The choices ahead
Appendix A
Table A.1 The Pros and Cons of Different Corrective Actions
Diagram A.2 Recommended remediation timescales
Appendix B
Diagram B.1 Change management – key ingredients
Table B.2 Summary of various RCA tools with potential pros and cons
Diagram B.3 A new type of thematic analysis
Appendix C
Diagram C.1 Some key ingredients of an internal audit report
Diagram C.2 Progressive approach to assignment ratings
James C. Paterson
This edition first published 2024
James C. Paterson © 2024
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of James C. Paterson to be identified as the Author of this work has been asserted in accordance with law.
Registered Office(s)
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats. Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organisation, website or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organisation, website or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential or other damages.
Library of Congress Cataloguing-in-Publication Data is available.
ISBN 9781394191055 (Hardback)
ISBN 9781394191062 (ePDF)
ISBN 9781394191079 (ePub)
Cover Design: Wiley
Cover Image: © Michael Rosskothen/Adobe Stock
To: Isabelle, for everything;
To: Timothy, Claudie and Elodie; William; Nicholas and Felicity – and all those near and dear to the family;
‘Don't Panic’.1
For the past 10 years, I have been working with clients in relation to root cause analysis and systems thinking, and it's time to provide an overview for a wider audience.
Why might this be of value?
To clarify why specific problems can be hard to fix and why surprises can come from ‘out of the blue’ despite lots of efforts to the contrary;
To help you think through, when things don't go to plan, how you might assess what any individuals involved did or did not do; put bluntly, should anyone take the blame?
To illustrate the benefits of moving beyond firefighting: addressing a few root causes now can head off many more problems in the future;
To give some suggestions to avoid things going off track in the first place.
From the start, I want to recognise how easy it is to misunderstand root causes. For example, the iceberg was not the root cause of the Titanic tragedy.2
Root cause analysis involves resisting the desire to blame someone or something immediately. Thus, neither Captain Edward Smith nor Lookout Frederick Fleet was the root cause of the Titanic disaster, nor was the limited number of lifeboats.3 That said, I will illustrate the problems of a 'no-blame' mindset.
Root cause analysis helps us untangle the ‘what, where, when, how and who’ of a disaster or setback and helps us understand why it happened.
I will share numerous real‐life accounts from executives and managers illustrating the ‘hairline cracks’ to look out for that can result in significant setbacks. Their stories challenge a lot of the received wisdom and urban myths about what separates success from failure.
This book gives an overview of the essentials of root cause analysis (RCA henceforth) and systems thinking that should be of practical value. The essence of what I am aiming for is summed up in Illustration 0.1.
The book has three main sections comprising 14 chapters and three optional appendices.
Illustration 0.1 Using Tools and Insights for the journey ahead
Section 1 provides a high‐level overview of RCA and systems thinking.
Chapter 1 clarifies some critical points about RCA, including:
What are root causes and how do these differ from other types of cause?
Several ways to arrive at root causes, going beyond the popular ‘five whys’ approach.
Can there be just one root cause or a primary root cause?
Chapter 2 explains the fishbone diagram and outlines eight ways to understand why things can go off track.
Chapter 3 introduces systems thinking and links this with RCA.
The discussion also broadens to consider inquiries as well as RCA investigations.
Section 2 provides real‐world examples of why things can slip through your fingers.
Chapters 4–11 focus on ‘eight ways to understand why’:
With real‐world accounts from board members, executives and others, to see how and why things didn't go as planned.
Outlining what it means to say some problems are ‘systemic’ or ‘cultural’.
Chapter 12 provides a fresh perspective on the Titanic tragedy of April 1912, offering parallels with contemporary setbacks.
Section 3 closes the discussion by looking at practical actions that might be taken so we can start to better understand and head off the ‘wicked problems’ we are increasingly faced with.
Chapter 13 explains how to encourage action after an RCA or inquiry.
You can diagnose a disease through an RCA, or inquiry, but that doesn't mean the patient will take the medication needed.
Chapter 14 offers some further perspectives on how to address contemporary issues of importance.
Optional appendices providing further practical advice:
Appendix A covers effective action planning.
Appendix B offers advice for managers (including those working in quality, risk and compliance roles, etc.), including an overview of RCA and systems thinking techniques and useful links.
Appendix C offers advice for those in audit, inspection or supervisory roles.
Overall, I hope to make the case that unless we can challenge old ways of thinking, we'll stay stuck in a ‘Groundhog Day’ of seeing the same old problems again, and again, and again.4 As Einstein said, ‘We can't solve problems by using the same kind of thinking we used when we created them’.5
1. Douglas Adams, The Hitchhiker's Guide to the Galaxy.
2. The sinking and significant loss of life of the crew and passengers of SS/RMS Titanic in April 1912. The iceberg was the immediate cause, as explained in Chapter 12.
3. Concerning Captain Edward Smith, the Board of Trade Titanic Inquiry concluded, 'He made a mistake, a very grievous mistake, but one in which, in the face of practice and past experience, negligence cannot be said to have had any part; and in the absence of negligence, it is, … impossible to fix Captain Smith with blame'.
4. In the film Groundhog Day, the main character re-lives the same day repeatedly.
5. https://www.brainyquote.com/quotes/albert_einstein_385842
Section 1 provides an introduction for those unfamiliar with root cause analysis (RCA) and systems thinking. It will also serve as a brief recap for those familiar with these topics.
Illustrations, diagrams and tables are included at intervals, recognising that sometimes ‘a picture is worth a thousand words’.
Notes are available to provide references and to expand on specific points that may be of particular interest.
At the end of each chapter is a bullet point list of many of the key points covered. Thereafter are a few suggestions for those working in organisations that are looking for specific ideas to put into practice.
‘Addiction is finding a quick and dirty solution to the symptom of the problem, which prevents or distracts one from the harder and longer‐term task of solving the real problem’.
– Donella H. Meadows1
If someone asks, ‘Why did the Titanic sink?’, the reply, ‘Because it hit an iceberg’, is a reasonable first answer. Specifically, the immediate cause that led to the sinking of the Titanic was that it hit an iceberg whilst travelling at around 20 knots on 14 April 1912.
However, an immediate cause is not the same as a root cause.
An immediate cause is the event or trigger that sets off a chain of events that results in an adverse outcome: think of a spark that might light a fire.
A contributing cause is something that ‘sets the stage’ for an immediate cause to create an adverse impact: think of dry tinder on a forest floor or the limited number of lifeboats on the Titanic.
So, beyond immediate and contributing causes, root causes are the reasons why things didn't go as planned.
Root cause analysis (RCA) describes a range of tools and techniques that examine what, when, where and how something happened, and who was involved, but ultimately seek to understand why something happened. It's about properly diagnosing an illness and – if we can – finding a cure, not just offering ‘sticking plaster’ solutions.2 So, RCA can help you better understand what can be done to stop problems from arising in the first place, or if you can't prevent problems, be clear about what can be done to make any setbacks tolerable.
The first cornerstone for an RCA, or inquiry, is gathering relevant facts and evidence in enough detail. The second is establishing a timeline of what happened; after all, one thing can't cause another without occurring before it, or at the same time as it. And then we need to cross-check whether one thing is really causing the other or is just loosely associated with it.3
Given that facts and evidence are central to an effective RCA or inquiry, the root causes of a specific setback, or disaster, will always depend on the precise facts and circumstances at the time. This is where forensic analysis comes in. And many of the crime and crash investigation programmes we see on TV show us clearly how much can be involved in getting to the truth. However, for this overview book, I want to concentrate on establishing root causes after this hard work has been done.
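To make the timeline and causality checks concrete, here is a minimal sketch in Python; the event names and helper functions are my own illustration, not a prescribed RCA tool:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Event:
    """A single established fact in an RCA investigation."""
    when: datetime
    what: str

def build_timeline(events: list) -> list:
    """Order the established facts chronologically."""
    return sorted(events, key=lambda e: e.when)

def could_have_caused(candidate: Event, outcome: Event) -> bool:
    """A necessary (but not sufficient) test: a cause cannot occur
    after its supposed effect. Association still needs cross-checking."""
    return candidate.when <= outcome.when

# Hypothetical facts for a pipeline-leak investigation
events = [
    Event(datetime(2023, 5, 2, 14, 30), "Pressure drop alarm"),
    Event(datetime(2023, 5, 2, 14, 5), "Scheduled valve maintenance skipped"),
    Event(datetime(2023, 5, 2, 15, 0), "Leak confirmed on site"),
]

for e in build_timeline(events):
    print(e.when, "-", e.what)

leak = events[2]
print(could_have_caused(events[1], leak))  # True: it precedes the leak
```

Passing the temporal test only makes a candidate cause possible; the cross-check against loose association still has to be done by examining the evidence.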
The Bowtie diagram was developed over 40 years ago in the chemical industry.4 It sets out how you can picture an event that has gone wrong or might go wrong. It is a very useful way of helping you think through some of the most essential ideas in RCA. An example of a Bowtie is provided in Diagram 1.1:
Diagram 1.1 Bowtie diagram (illustrative)
You read the diagram from left to right on the top row and then again on the bottom row, thinking about measures to ensure that incidents do not occur in the first place (e.g. through preventative and detective measures) or, if they do occur, that they do not have significant adverse consequences (e.g. because of recovery measures that can be deployed after an incident).5
Let's consider travelling by air as a simple example. We can think about the ingredients that make up a Bowtie in this context:
A key objective is to travel safely without damage to the plane or injury to any passengers or crew.
The threats include the risk that an aircraft might hit something.
Preventative measures to avoid encountering threats in the first place include flight plans for each aircraft, air traffic control with radar, a competent crew who can steer the aircraft away from danger, a plane that is capable of manoeuvring safely, etc.
Detective measures to spot threats coming up include radar on aircraft, anti-collision devices, the crew who will look out for other aircraft, etc.
Recovery measures include emergency procedures in case of an incident, trained crew capable of dealing with difficult events, air masks for passengers, etc.
Exactly what measures will be needed will depend on clearly understanding the threats the aircraft might face. But the overall message is that there should be, and are, multiple ways to keep an aircraft and its passengers safe. This is one of the reasons flying is so safe: we are not relying on just one or two measures to protect us. Indeed, we have an entire system that is designed and operated, as far as possible, to support air travel and keep passengers safe.
When you think about air safety, prevention and detection are far preferable to recovery because there may be only so many things you can do after an incident occurs. Thus, the Bowtie reminds us to consider, in each context, the appropriate balance between prevention, detection and recovery measures.6
As well as being used to think through aircraft and chemical plant safety, the Bowtie can be applied to medical procedures and a range of other domains.7 Of course, in daily life we don't usually explicitly refer to preventative, detective or recovery measures, but much of what goes on in practice seeks to achieve those ends.
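For readers who find code helpful, here is a minimal sketch of a Bowtie's ingredients using the air-travel example above; the class and field names are my own illustration, not a standard Bowtie notation:

```python
from dataclasses import dataclass, field

@dataclass
class Bowtie:
    """Skeleton of a Bowtie: an objective, the threats to it, and the
    preventative / detective / recovery measures around the 'knot'."""
    objective: str
    threats: list = field(default_factory=list)
    preventative: list = field(default_factory=list)
    detective: list = field(default_factory=list)
    recovery: list = field(default_factory=list)

air_travel = Bowtie(
    objective="Travel safely, without damage to the plane or injury to anyone",
    threats=["Aircraft hits something"],
    preventative=["Flight plans", "Air traffic control with radar",
                  "Competent crew", "Aircraft capable of manoeuvring safely"],
    detective=["On-board radar", "Anti-collision devices", "Crew lookout"],
    recovery=["Emergency procedures", "Crew trained for difficult events",
              "Air masks for passengers"],
)

# Multiple layers of defence: safety never rests on one or two measures
total = len(air_travel.preventative) + len(air_travel.detective) + len(air_travel.recovery)
print(f"{total} measures protecting the objective: {air_travel.objective}")
```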
With the Bowtie perspective, we can start the process of RCA if we fail to achieve an objective. We can 'unpack' causes in a step-by-step way:
First, was the objective we had clear, and did we understand the threats to achieving it?
Second, what measures were planned, or designed, to prevent and/or detect things going off track, or to recover afterwards?
Third, were the measures implemented in practice and working effectively?
Considering how things are planned or designed provides an illustration of how RCA techniques can be used to anticipate and engineer away potential problems before they occur, increasing our chances of success. This takes us to High‐Reliability Organisations (HROs).
HROs can be found in aviation, oil and gas, nuclear, the chemical industry and other areas with very high stakes.
Consider a chemical plant where a pipeline starts to leak. You would expect the operators to be able to detect a pressure drop and then organise a rapid response to look at the leak. After that, you would expect them to repair or replace the pipe and clean up any pollution. The mindset of identifying, planning for and fixing problems is sometimes called ‘single‐loop’ learning, which can be summarised in Diagram 1.2.8
However, imagine that a few months later, there is another leak in a pipe and then another leak. It would likely become clear that a series of relatively quick‐fix solutions aren't good enough; you need to go deeper into the underlying reasons why similar issues are arising. For example, is the pressure in the pipes too high? Or are the lines being damaged by external factors, for example, falling branches?
Thinking about why problems might be recurring takes us to what is called ‘double‐loop’ learning, as illustrated in Diagram 1.3.
Diagram 1.2 Single‐loop learning
Diagram 1.3 Double‐loop learning
‘Quick fixes’ have their place, but the smart thing to do is to recognise that seeing the same, or similar, problems repeatedly is likely to be a sign that we have missed some other factor that's causing problems. In the example given, maybe we missed opportunities to protect the pipeline adequately. Such a hypothesis could be investigated, and if it explained why the problem was recurring, you would work on this. And this way of thinking might encourage you to find even smarter ways to stop a leak in the first place. For example, you might reroute the pipeline or change the chemicals you are using.
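As a rough illustration of the difference (a sketch of my own, with a made-up incident log and recurrence threshold): single-loop learning responds to each symptom as it appears; double-loop learning notices the recurrence and prompts a deeper look at underlying assumptions:

```python
from collections import Counter

incident_log = ["pipe leak", "pipe leak", "pump trip", "pipe leak"]

def single_loop(incident: str) -> str:
    """Single-loop learning: respond to the symptom with a quick fix."""
    return f"Detect, repair and clean up: {incident}"

def double_loop(log: list, threshold: int = 2) -> list:
    """Double-loop learning: flag recurring issues for a deeper RCA that
    questions underlying assumptions (pressure? routing? protection?)."""
    return [issue for issue, count in Counter(log).items() if count >= threshold]

for incident in incident_log:
    print(single_loop(incident))

for issue in double_loop(incident_log):
    print(f"Recurring issue '{issue}': investigate the underlying reasons")
```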
So, HROs aim to maintain a high level of safety, quality, security, etc., over an extended period, by continuously improving operational activities beyond quick fixes. The key ingredients of an HRO are as follows:9
A constant recognition that problems and failures might arise;
Continuous monitoring and awareness of what is happening and what is changing;
A reluctance to oversimplify things, meaning respect for expert views and insights;
A commitment to resilience; designing ‘layers of defence’ that will stop or minimise the chances of minor problems turning into catastrophes.
An HRO perspective says, ‘keep looking out for potential trouble’ and ‘it's better to be safe than sorry’, even if there is an additional cost in the short term.
HROs have developed a range of tools and techniques to minimise the chances of problems arising by thinking about (i) design excellence, (ii) excellence in operational delivery and monitoring, and (iii) carrying out rigorous RCAs if they encounter setbacks or 'near misses'.10
In an ideal world, you might hope that all organisations would aspire to be HROs. However, where risks seem lower or costs are constrained, this way of operating can seem too cautious, bureaucratic and wasteful. Nonetheless, as the discussion about RCA and systems thinking progresses, I hope to show that an intelligent application of HRO practices can – in the right situations – significantly help organisations reduce the number of setbacks they experience.11
One of the most well‐known RCA techniques is called the Five Whys. Toyota developed it in quality circles in Japan and it has been used across the motor industry and beyond. It arose from a ‘Lean manufacturing’ approach, seeking to make things ‘right first time’.
Thus, if things didn't turn out as expected on the production line, operators learned to ask why, why, etc., to diagnose the reasons for this and, as a result, to put in place solutions to fix problems over the long run. Other Lean techniques developed alongside the Five Whys include poka yoke (or mistake proofing) and a kaizen culture (of continuous improvement).
As you gain experience, it can become clear that there are different ways to focus these 'why' questions. So, if something bad happens, you can ask:
Why measures that might have prevented things from going wrong didn't work, or didn't work in time;
Why measures that might have detected a threat before things went wrong didn't work, or didn't work in time;
Why recovery measures that might have reduced the impact of a problem didn't work, or didn't work in time.
Thus, we arrive at the first RCA technique that goes beyond just asking why five times; it's called the ‘three‐way five whys’, where we ask ‘why’ in relation to prevention, detection and recovery and recognise the need to go beyond looking for just one root cause. A template for the three‐way five whys is provided in Diagram 1.4. It would be completed with answers to why something went off track, supported by appropriate evidence to underpin each of the conclusions so that they can be defended.
So, if a serious incident arises because of a leaking fuel pipe (say on an aircraft), it is quite possible it could arise because of all three of the following:
A flaw in the fuel pipeline – which, if built with a better design, might have prevented the leak; and
The absence of a fuel pressure monitor – which, if it was there, might have detected that fuel was leaking; and
A missing secondary fuel pipeline – which might have been switched to if the first fuel pipeline failed.
Diagram 1.4 Three-way five whys
All three of these factors are likely to be important because, had any one of the measures been working fully effectively, it might have stopped the incident or reduced its impact.
Doing RCA using the three‐way five whys can be likened to ‘digging into’ all the facts and looking for the important roots. Thus, the three‐way five whys can also be depicted as a ‘fault tree’ using Diagram 1.5.
You can see how an analogy with the roots of a plant is useful. If you don't find and pull out all the roots, the plant grows back; if you remove all the roots, it won't. So, if you see repeating or similar issues arising, there may still be root causes that have not been identified or addressed. It's a tool that can take us from 'single-loop' learning and quick fixes to 'double-loop' learning, where we aim to understand all the reasons why things went wrong so we can fix things for the long run.
The fault tree in Diagram 1.5 is one example of the various 'tree' diagrams that can be used in RCA, e.g. issue trees, logic trees, etc. Tree diagrams are commonly used in oil and gas, aviation and engineering environments, but can be used in a range of other contexts. Their power lies in the way they can show how causes 'split' (i.e. how an outcome can be the result of one thing as well as another). However, I am not going to cover 'tree' techniques further at this juncture, since there is a range of other techniques that I want to look at given the scope of this book. More information about different tree approaches to RCA can be found in Appendix B.
Diagram 1.5 Fault tree showing three‐way five whys
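As a minimal sketch (the representation is my own; neither the technique nor the book prescribes code), the three-way five whys for the fuel-pipe example above might be captured like this:

```python
# Ask 'why' repeatedly along each of the three lines of defence.
# The questions are illustrative, based on the fuel-pipe example above.
three_way_five_whys = {
    "prevention": [
        "Why did the fuel pipe leak?",
        "Why was the pipe design flawed?",
        # ...keep asking 'why' (typically up to five times) per branch
    ],
    "detection": [
        "Why wasn't the leak detected in time?",
        "Why was there no fuel pressure monitor?",
    ],
    "recovery": [
        "Why couldn't the impact be reduced?",
        "Why was there no secondary fuel line to switch to?",
    ],
}

def report(analysis: dict) -> None:
    """Print each branch; each branch may surface its own root cause,
    so there is rarely just one."""
    for branch, whys in analysis.items():
        print(branch.upper())
        for depth, question in enumerate(whys, start=1):
            print("  " * depth + question)

report(three_way_five_whys)
```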
As you become more familiar with RCA, you will discover it's rare, perhaps impossible, to find just one root cause for a significant disaster or recurring problem. This is one of the reasons this book is called 'Beyond the Five Whys': imagining you can simply ask 'why' five times to get to the bottom of a problem can easily result in missing things.
The argument that there is no such thing as one root cause, or even a 'main' root cause, can sometimes be hard to accept. I understand this, given that the term 'root cause analysis' seems to refer to a singular cause. If you are unsure about my assertion that there will normally be more than one root cause, I would ask you to keep an open mind for a little longer.12 As I run through the real-life examples in this book and explain other RCA techniques beyond the five whys, I hope it will become clearer why it's useful and necessary to have this perspective.
In a nutshell:
‘Everything should be made as simple as possible, but not simpler’.
– after Albert Einstein.13
It is crucial to distinguish between immediate, contributing and root causes.
Root causes will always depend on the context and facts of the situation, so rigorous fact‐finding techniques and the development of detailed timelines are standard in RCA investigations.
Root causes are underlying reasons why something didn't go to plan.
Identifying root causes should help you understand what might be done to stop things from going wrong, or to ensure that any setbacks are tolerable.
The Bowtie diagram is a useful tool when doing an RCA. It sets out objectives, threats, and then preventative, detective and recovery measures to keep things on track.
It's always tempting to look at the people involved when things go wrong, but RCA encourages us to look beyond the who to understand the why.
High‐Reliability Organisations focus on putting in place a range of measures and controls to reduce to a minimum the chances of an adverse event.
HROs apply ‘double‐loop’ learning and avoid getting stuck in firefighting mode; this involves being able to question underlying assumptions.
The '5 whys' is a well-known RCA technique. Still, it must be recognised that you can ask why in different ways – e.g. by looking at prevention, detection and recovery. This leads to the three-way five whys technique, which recognises there is no such thing as just one root cause.
‘Fault trees’ are commonplace in RCA. There are many different variations.
There is rarely, if ever, a single root cause for a problem or near miss. Usually, there are hairline cracks concerning preventative, detective, and recovery measures.
1. Thinking in Systems: A Primer, by Donella H. Meadows. Published by Chelsea Green Publishing Company.
2. There is a good discussion about the evolution of different root cause definitions here: https://www.taproot.com/definition-of-a-root-cause/. Note: one could devote pages to the topic of different cause types, but the scope of this overview book demands a 'common sense' approach in which types of cause should emerge during the course of the discussion, not least when we turn to the question of what action plans are likely to work.
3. There is a TED talk on causation vs. correlation: https://www.youtube.com/watch?v=8B271L3NtAw. A Manchester University page on causality language and types: https://www.phrasebank.manchester.ac.uk/explaining-cause-and-effect/
4. For more detail about the Bowtie, see https://www.caa.co.uk/safety-initiatives-and-resources/working-with-industry/bowtie/about-bowtie/where-did-bowtie-come-from/
5. Some call measures 'barriers', 'mitigations' or 'controls' – they are activities or steps undertaken to prevent, detect or recover from adverse impacts from relevant threats. Of course, some measures are more appropriate and effective than others, which will become obvious during the course of this book.
6. News in 2023 about UK police data leaks and losses from the British Museum highlights the problem of limits to what can be done after an adverse event has occurred.
7. Some more illustrations here: https://risktec.tuv.com/risktec-knowledge-bank/bowtie-risk-management/lessons-learned-from-the-real-world-application-of-the-bow-tie-method/. In a medical context, see https://bmjopenquality.bmj.com/content/10/2/e001240 and https://journals.sagepub.com/doi/pdf/10.1177/0310057X1604400615. For a railway example, see https://www.rssb.co.uk/en/safety-and-health/guidance-and-good-practice/bowties/using-bowtie-analysis. For a project example, see http://wiki.doing-projects.org/index.php/Using_the_bowtie_method_to_evaluate_the_risks_in_a_construction_project
8. As coined by Argyris and Schön: https://infed.org/chris-argyris-theories-of-action-double-loop-learning-and-organizational-learning/
9. See this overview of critical elements, etc.: https://www.hse.gov.uk/research/rrpdf/rr899.pdf
10. See Heinrich's model for safety, an example of how near-miss reporting can be useful to head off major incidents or adverse events: https://www.skybrary.aero/articles/heinrich-pyramid. Further, there is a range of rather complex considerations in relation to quality-related root cause analysis in the Lean and Lean Six Sigma arena, including O-PDCA and DMAIC, as well as issues that potentially involve statistical analysis, where there are choices around One Variable at a Time (OVAT) and Multivariate at a Time (MVAT) approaches. It is out of the scope of this high-level overview to go into detail on these matters, but readers with a particular interest might want to consider A. Smalley's book 'Four Types of Problems': https://www.lean.org/store/book/four-types-of-problems/
11. HROs are a huge subject. See this McKinsey article: https://www.mckinsey.com/capabilities/operations/our-insights/what-high-reliability-organizations-get-right and also https://web.mhanet.com/media-library/high-reliability-organization-toolkit/
12. For example, https://www.taproot.com/is-there-just-one-root-cause-for-a-major-accident/; an article in a medical context that also highlights the problem of looking for a single cause: https://qualitysafety.bmj.com/content/26/5/417
13. https://www.goodreads.com/quotes/search?utf8=%E2%9C%93&q=Everything+should+be+made+as+simple+as+possible%2C+but+not+simpler&commit=Search. There is some debate as to whether this quote originated with Albert Einstein, but the evidence points to him having a role in the form quoted.
‘All the classifications man has ever devised are arbitrary, artificial, and false, but … reflection also shows that such classifications are useful, indispensable, and above all unavoidable since they accord with an innate aspect of our thinking’.
– Egon Friedell1
It should be self‐evident that if something catastrophic happens or things show only a slight improvement over time (e.g. a pipe keeps leaking), this is a time to do a root cause analysis (RCA).
And, if there is a ‘near miss’ (where nothing bad has happened yet, but something bad nearly happened), this can also be a time to consider an RCA, because it may be the only warning you get before something much worse happens.
Over a period of time, Lean manufacturing insights highlighted that if you do something frequently, even the smallest defect might become a big deal (e.g. if you were to have a faulty air bag installed in thousands of cars) and so might merit an RCA, to avoid problems accumulating.
While each RCA investigation, or inquiry, will be different, if you do RCA regularly enough or read enough inquiry reports, you start to see patterns
in what happened or nearly went wrong (e.g. a project did not meet its deadline or keep to the agreed budget) and
in the underlying reasons things didn't go to plan (e.g. a failure to appreciate a project's risks, alongside insufficient contingency plans).
Those doing Lean manufacturing noticed the same thing; quality problems in car production were sometimes the result of similar root causes (e.g. poor maintenance of factory machines or poorly designed parts). So, quality engineer Kaoru Ishikawa developed a ‘fishbone’ diagram to set out cause and effect relationships for quality defects. This has evolved into one of the most well‐known RCA tools.
The diagram starts from an incident, a near miss or a risk of concern.2 It then encourages us to look at the range of reasons (known as causal factors) why this has happened or why things might go off track. It is essentially a modification of a tree diagram, but with more structure: it emphasises certain key causal factors that you should check to make sure you haven't missed anything important when analysing the situation.
These causal factors also give you different ways to ask ‘why’ questions, taking you from immediate to contributing to root causes. You complete a fishbone diagram by asking why along the different ‘bones’ (in a similar manner to questions about prevention, detection and recovery) and – depending on the facts – you will likely establish several root causes for a given area of concern. An example of a fishbone diagram is provided in Diagram 2.1. Note that although there are six cause types listed in the diagram, there may only be two or three cause types present for a specific area of concern; it all depends on the facts.
Different versions of the fishbone diagram can be found. Popular root cause categories you might see include the following:
In a manufacturing context: Manpower, Machines, Materials, Methods, Measurements and Environment (known as 5M+E);
In a sales and marketing context: Price, Place, Promotion, People, Process, Performance, etc. (known as 8Ps).
And more generally:
Strategy, Skills, Staff, Structure, Systems, etc., after the McKinsey 7S framework;3
Strategy, Structure, Processes, People, etc., after the Galbraith Star Model;4
Strategy and Objective Setting, Information and Communication, Governance and Culture, etc., after the COSO Model.5
Diagram 2.1 Fishbone diagram
Over the years of working with clients, I have noticed categories that seem to help people understand why things haven't been working, but also how easy it is to misunderstand some categories and, as a result, get confused about the root causes. Here are two examples to illustrate the point:
If you have 'People' or 'Staff' as an area to focus on, it is easy to concentrate on who was involved and who, perhaps, didn't get it right, rather than why these people did what they did.
Likewise, if you have 'Materials', 'Methods', 'Processes' or 'Systems' as lines of inquiry, you might focus on what and how rather than why.
Dr Mohammad Farhad Peerally (Associate Professor, University of Leicester) offers his perspective:
‘I have studied the use of RCA in a medical context for several years. Over time, RCA practice has evolved and become more powerful. But there is no doubt that when you look at early attempts to analyze causes, the categories of “people” and “process” can be heavily populated. After all, many things come back to these factors, but this doesn't explain why.’
So, I want to share 'eight ways of understanding why', offering another way of structuring the fishbone diagram. The eight ways are distilled from a range of different techniques, including the lists provided earlier and elsewhere, and have been effective in many situations and contexts.
The eight ways will be discussed in more detail, with practical examples, in Section 2, but in overview the eight ways focus on:
Whether strategies and goals were appropriately set, understanding key risks, and recognising what might be a tolerable setback;6
Whether there was timely, accurate data, information and communications, to be able to see if things are going as planned or starting to go off track;
Whether roles, responsibilities, accountabilities and authorities (R2A2) were set appropriately so there are no gaps in who does what and who makes decisions;
Whether there was a suitable design – of facilities, buildings, structures, equipment, systems, policies, protocols, procedures, reward and recognition, etc. – e.g. if you want to build a tall structure, you need a design, or a plan, that factors in the loads and stresses it will be subject to;
Whether external factors were identified and appropriately accommodated, recognising these might knock things off track;
Whether building, maintenance and change were effectively executed – of structures, systems, policies, etc. – because even if a design is good, shoddy building, poor maintenance or poor repairs can be a precursor to problems;
Whether human factors were adequately considered; because if you want to fly to the moon, you need people with ‘the right stuff’, but even then, they will need to work as an effective team (both in the spacecraft and at ground control);
Whether resources, priorities and dilemmas were properly understood and worked through.
The eight ways of understanding why can be set out in a ‘modified fishbone diagram’ – see Diagram 2.2.
Diagram 2.2 Eight ways to understand why: a modified fishbone diagram
So, if a project isn't going to plan, we can ask ‘why’ in different ways: e.g. ‘Is the project timeline and budget realistic?’ and ‘Are there adequate “early warning signs”?’ or ‘Are there issues with resourcing, etc.?’ and, depending on the facts, start to unpick what has been going on and why.
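To show how the eight ways can operate as structured prompts, here is a small sketch; the condensed prompt wording and the function are my own illustration, not a fixed checklist from the book:

```python
EIGHT_WAYS = {
    "Strategies and goals": "Were goals realistic, with key risks understood?",
    "Data, information and communications": "Were there timely, accurate early warnings?",
    "Roles and responsibilities (R2A2)": "Any gaps in who does what and who decides?",
    "Design": "Was the design of structures, systems and policies suitable?",
    "External factors": "Were external factors identified and accommodated?",
    "Building, maintenance and change": "Were build, maintenance and change well executed?",
    "Human factors": "Were human factors and team working addressed?",
    "Resources, priorities and dilemmas": "Were resources and trade-offs worked through?",
}

def review(concern: str) -> None:
    """Walk the modified fishbone: one 'why' prompt per bone."""
    print(f"Area of concern: {concern}")
    for bone, prompt in EIGHT_WAYS.items():
        print(f"- {bone}: {prompt}")

review("Project is behind schedule and over budget")
```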
The eight ways to understand why are not proposed as a definitive RCA framework. However, my experience working with clients has taught me that they usually help to shift people's mindset on root causes. And as I walk through the 'eight ways of understanding why' with real-life examples from a range of contributors, it should become increasingly clear how and why one reason for a setback or recurring problem can be connected to another. To put it another way, it may become apparent how some challenges are 'systemic', which takes us to a discussion about systems thinking.
Lean ways of working encourage a ‘right first‐time’ mindset and use RCA to clarify why things may go off track.
Lean techniques emphasise that RCA is not only for major issues or setbacks. Instead, relatively minor issues, defects, inefficiencies or near misses may merit an RCA because when repeated they can become significant or they may be a warning of the risk of a much bigger adverse event.
The fishbone (Ishikawa) diagram, developed through Lean ways of working, shows how different types of causes may have contributed to a problem or near miss. It automatically recognises there may be more than one root cause.
The categories in the fishbone diagram can help you think about the common reasons why problems arise. They also act as a tool to help you head off problems before they appear.
Categories in the fishbone can vary and should help you focus on why, not just who or what, etc.
The modified fishbone provides ‘eight ways of understanding why’ things haven't gone to plan; the causal factors include: realistic goals; roles and responsibilities; addressing human factors; resources, priorities, and dilemmas.
1. http://scihi.org/egon-friedells-cutural-histories/
2. In relation to a given goal or objective, and assuming an incident has exceeded the limits for a setback that can be tolerated.
3. McKinsey 7S: https://www.mckinsey.com/capabilities/strategy-and-corporate-finance/our-insights/enduring-ideas-the-7-s-framework
4. Galbraith Star Model: https://www.jaygalbraith.com/component/rsfiles/download?path=StarModel.pdf
5. COSO (ERM): https://www.coso.org/Shared%20Documents/2017-COSO-ERM-Integrating-with-Strategy-and-Performance-Executive-Summary.pdf
6. Often called a 'risk appetite'. So, a project might be budgeted at £5 million and be due to return £10 million. You might be reluctantly prepared to accept costs of £6 million and returns of £9 million, but you would not be happy with costs of £8 million and returns of £8 million. What you are and are not prepared to tolerate going wrong reflects the risk appetite.
‘A bad system will beat a good person every time’.
– W. Edwards Deming1
Gotthold Ephraim Lessing wrote, ‘In nature everything is connected, everything is interwoven, everything changes with everything, everything merges from one into another’2 as long ago as 1769. Over the years, biologists recognised again and again that it was hard to explain why the population of one animal species (e.g. lions) was declining without understanding the way it was affected by the presence or absence of other species (e.g. antelopes). And in turn, the population of antelopes depends on the availability of plants to eat, which depends on the terrain, the presence of water, the climate, etc. So, eventually the term ‘the ecosystem’ was coined by Arthur Tansley in the 1930s.
Fundamental to systems thinking is a recognition that some problems are complex rather than complicated. By way of an example:
Repairing a car is a complicated problem: a car comprises many separate, discrete parts. If you identify something that is not working, parts that are faulty or broken can usually be repaired or replaced separately, in isolation. The parts can then be put back together again, and the car should work.
Improving transport links in a town or city is a complex problem: imagine that cyclists are looking for more traffic-free cycle lanes. If you do everything you can to solve their concerns, they might be happy; but you may create problems for other road users (driving cars and trucks), who might be (rightly or wrongly) unhappy about the impact this has on them.
So, a systems thinking approach recognises that for complex problems you cannot break each bit of a problem into pieces, solve each separately, then put the solutions together and expect them to work. Indeed, systems thinking tells us that treating a complex problem as if it were merely complicated (i.e. addressing issues as discrete, separate phenomena, in isolation) will very often create other 'spin-off' problems, immediately or in the future.
In business terms, if you take an organisation, you can consider its various activities (e.g. sales and marketing) as ‘systems’ that interact with and depend upon other internal ‘systems’ (e.g. procurement, finance and IT). Seen this way, the success or failure of an organisation can depend upon how well these systems (or elements) work together. And a classic term to describe poor working between parts of an organisation is to say there are ‘silos’.
External factors such as customers, competitors, suppliers, regulators, the environment, etc., can also be regarded as systems, and how the organisation interacts with these can also explain how and why it succeeds or fails. So, failing to make customer service central to the way the organisation operates can result in problems.
Diagram 3.1 provides an example of how you can see an organisation as a series of systems that interact with one another internally and are also impacted by external systems.
The ingredients detailed can be varied to suit each organisation and context, but the power of thinking in terms of systems is to ‘step back and see the big picture’. Thus, if you are thinking about an objective you want to reach, or a problem you want to solve, a systems perspective can remind you of all the factors you need to consider. For example, we can ask: ‘Do suitable systems and technology support our goals?’, ‘Do we have the right people and capabilities?’ or ‘Do we have suitable suppliers to give us what we need?’, etc.
Thinking about business issues in systemic terms was first properly developed by Jay W. Forrester.3 In the late 1950s and early 1960s, he showed General Electric that highs and lows in its production of certain domestic appliances, which were thought to be due to external factors, could arise 'from entirely internally determined' elements of the organisation.4 He was able to show that overly simplistic, linear thinking about business (i.e. treating some problems as complicated rather than complex) could result in unexpected spin-off problems.5 His concepts were applied across various industries and integrated with data analysis and computing. This enabled him to create models of what might be happening and help decision-makers tackle even more complex challenges (e.g. urban planning, policy setting, etc.).
Diagram 3.1 An organisation as an open system (illustration)
Since then, systems thinking has become of interest in certain contexts, but it is still not 'mainstream' in many domains.6 However, there have been some popular books on systems thinking, such as:
Peter Senge's The Fifth Discipline (1990), which highlights the benefits of looking at the big picture when trying to address challenges and the importance of a learning culture.
Barry Oshry's Seeing Systems (2007), which discusses some of the common interactions you see in organisations, including top/bottom and provider/customer, and the dysfunctional patterns ('dances') that can arise between different parts of an organisation.
Donella Meadows' Thinking in Systems (2008) (first written in the 1990s and emerging from work with Jay Forrester at MIT), which explains systems analysis and feedback loops in a step-by-step way and illustrates how they can be applied to some of the global challenges we face.
Derek and Laura Cabrera's Systems Thinking Made Simple (2015), which provides a very accessible overview of the topic. They highlight, among other things, the importance of understanding different perspectives.
Common ingredients in a systems thinking approach include the following:
Analysing systems into their sub-systems and looking at the interactions between systems to see how causes and effects work (i.e. systems analysis).7
Recognising that unless you understand causal connections and feedback loops within a system, and between systems, you will find yourself surprised when things don't work and surprised when new problems appear to come from 'out of the blue'.8
Let's consider a simple example and see the power of systems thinking using causal loop analysis.
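As a taster, and in the spirit of Diagram 3.2 (electric vehicle sales and the impact of charging points), a toy reinforcing loop can be simulated in a few lines; the starting numbers and growth rules below are invented purely for illustration:

```python
# Reinforcing ('virtuous circle') loop: more EVs encourage more charging
# points to be built, which in turn makes EV ownership more attractive.
evs, chargers = 1_000.0, 100.0

for year in range(1, 6):
    chargers += 0.05 * evs   # infrastructure grows with the fleet (assumed rate)
    evs += 2.0 * chargers    # adoption grows with infrastructure (assumed rate)
    print(f"Year {year}: {evs:,.0f} EVs, {chargers:,.0f} charging points")
```

Even this toy model shows the accelerating, non-linear behaviour that linear, piece-by-piece thinking tends to miss.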