Design for Reliability E-Book

Description

A unique, design-based approach to reliability engineering

Design for Reliability provides engineers and managers with a range of tools and techniques for incorporating reliability into the design process for complex systems. It clearly explains how to design for zero failure of critical system functions, leading to enormous savings in product life-cycle costs and a dramatic improvement in the ability to compete in global markets. Readers will find a wealth of design practices not covered in typical engineering books, allowing them to think outside the box when developing reliability requirements. They will learn to address high failure rates associated with systems that are not properly designed for reliability, avoiding expensive and time-consuming engineering changes, such as excessive testing, repairs, maintenance, inspection, and logistics.

Special features of this book include:

* A unified approach that integrates ideas from computer science and reliability engineering
* Techniques applicable to reliability as well as safety, maintainability, system integration, and logistic engineering
* Chapters on design for extreme environments, developing reliable software, design for trustworthiness, and HALT influence on design

Design for Reliability is a must-have guide for engineers and managers in R&D, product development, reliability engineering, product safety, and quality assurance, as well as anyone who needs to deliver high product performance at a lower cost while minimizing system failure.

Page count: 499

Year of publication: 2012

Design for Reliability

WILEY SERIES IN QUALITY & RELIABILITY ENGINEERING

and related titles*

Electronic Component Reliability:

Fundamentals, Modelling, Evaluation and Assurance

Finn Jensen

Measurement and Calibration Requirements

For Quality Assurance to ISO 9000

Alan S. Morris

Integrated Circuit Failure Analysis:

A Guide to Preparation Techniques

Friedrich Beck

Test Engineering

Patrick D. T. O’Connor

Six Sigma: Advanced Tools for Black Belts and Master Black Belts*

Loon Ching Tang, Thong Ngee Goh, Hong See Yam, Timothy Yoap

Secure Computer and Network Systems: Modeling, Analysis and Design*

Nong Ye

Failure Analysis:

A Practical Guide for Manufacturers of Electronic Components and Systems

Marius Bâzu and Titu Băjenescu

Reliability Technology:

Principles and Practice of Failure Prevention in Electronic Systems

Norman Pascoe

Effective FMEAs: Achieving Safe, Reliable, and Economical Products and Processes Using Failure Mode and Effects Analysis

Carl Carlson

Design for Reliability

Dev Raheja and Louis J. Gullo (Editors)

Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Raheja, Dev.

Design for reliability / Dev Raheja & Louis J. Gullo.

p. cm.

ISBN 978-0-470-48675-7 (hardback)

1. Reliability (Engineering) I. Gullo, Louis J. II. Title.

TA169.R348 2011

620’.00452-dc23

2011042405

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

To my wife, Hema, and my children, Gauri, Pramod, and Preeti
Dev Raheja

To my wife, Diane, and my children, Louis, Jr., Stephanie, Catherine, Christina, and Nicholas
Louis J. Gullo

Contents

Contributors

Foreword

Preface

Introduction: What You Will Learn

1 Design for Reliability Paradigms

Dev Raheja

Why Design for Reliability?

Reflections on the Current State of the Art

The Paradigms for Design for Reliability

Summary

References

2 Reliability Design Tools

Joseph A. Childs

Introduction

Reliability Tools

Test Data Analysis

Summary

References

3 Developing Reliable Software

Samuel Keene

Introduction and Background

Software Reliability: Definitions and Basic Concepts

Software Reliability Design Considerations

Operational Reliability Requires Effective Change Management

Execution-Time Software Reliability Models

Software Reliability Prediction Tools Prior to Testing

References

4 Reliability Models

Louis J. Gullo

Introduction

Reliability Block Diagram: System Modeling

Example of System Reliability Models Using RBDs

Reliability Growth Model

Similarity Analysis and Categories of a Physical Model

Monte Carlo Models

Markov Models

References

5 Design Failure Modes, Effects, and Criticality Analysis

Louis J. Gullo

Introduction to FMEA and FMECA

Design FMECA

Principles of FMECA-MA

Design FMECA Approaches

Example of a Design FMECA Process

Risk Priority Number

Final Thoughts

References

6 Process Failure Modes, Effects, and Criticality Analysis

Joseph A. Childs

Introduction

Principles of P-FMECA

Use of P-FMECA

What Is Required Before Starting

Performing P-FMECA Step by Step

Improvement Actions

Reporting Results

Suggestions for Additional Reading

7 FMECA Applied to Software Development

Robert W. Stoddard

Introduction

Scoping an FMECA for Software Development

FMECA Steps for Software Development

Important Notes on Roles and Responsibilities with Software FMECA

Lessons Learned from Conducting Software FMECA

Conclusions

References

8 Six Sigma Approach to Requirements Development

Samuel Keene

Early Experiences with Design of Experiments

Six Sigma Foundations

The Six Sigma Three-Pronged Initiative

The RASCI Tool

Design for Six Sigma

Requirements Development: The Principal Challenge to System Reliability

The GQM Tool

The Mind Mapping Tool

References

9 Human Factors in Reliable Design

Jack Dixon

Human Factors Engineering

A Design Engineer’s Interest in Human Factors

Human-Centered Design

Human Factors Analysis Process

Human Factors and Risk

Human Error

Design for Error Tolerance

Checklists

Testing to Validate Human Factors in Design

References

10 Stress Analysis During Design to Eliminate Failures

Louis J. Gullo

Principles of Stress Analysis

Mechanical Stress Analysis or Durability Analysis

Finite Element Analysis

Probabilistic vs. Deterministic Methods and Failures

How Stress Analysis Aids Design for Reliability

Derating and Stress Analysis

Stress vs. Strength Curves

Software Stress Analysis and Testing

Structural Reinforcement to Improve Structural Integrity

References

11 Highly Accelerated Life Testing

Louis J. Gullo

Introduction

Time Compression

Test Coverage

Environmental Stresses of HALT

Sensitivity to Stresses

Design Margin

Sample Size

Conclusions

Reference

12 Design for Extreme Environments

Steven S. Austin

Overview

Designing for Extreme Environments

Designing for Cold

Designing for Heat

References

13 Design for Trustworthiness

Lawrence Bernstein and C. M. Yuhas

Introduction

Modules and Components

Politics of Reuse

Design Principles

Design Constraints That Make Systems Trustworthy

Conclusions

References and Notes

14 Prognostics and Health Management Capabilities to Improve Reliability

Louis J. Gullo

Introduction

PHM Is Department of Defense Policy

Condition-Based Maintenance vs. Time-Based Maintenance

Monitoring and Reasoning of Failure Precursors

Monitoring Environmental and Usage Loads for Damage Modeling

Fault Detection, Fault Isolation, and Prognostics

Sensors for Automatic Stress Monitoring

References

15 Reliability Management

Joseph A. Childs

Introduction

Planning, Execution, and Documentation

Closing the Feedback Loop: Reliability Assessment, Problem Solving, and Growth

References

16 Risk Management, Exception Handling, and Change Management

Jack Dixon

Introduction to Risk

Importance of Risk Management

Why Many Risks Are Overlooked

Program Risk

Design Risk

Risk Assessment

Risk Identification

Risk Estimation

Risk Evaluation

Risk Mitigation

Risk Communication

Risk and Competitiveness

Risk Management in the Change Process

Configuration Management

References

17 Integrating Design for Reliability with Design for Safety

Brian Moriarty

Introduction

Start of Safety Design

Reliability in System Safety Design

Safety Analysis Techniques

Establishing Safety Assessment Using the Risk Assessment Code Matrix

Design and Development Process for Detailed Safety Design

Verification of Design for Safety Includes Reliability

Examples of Design for Safety with Reliability Data

Final Thoughts

References

18 Organizational Reliability Capability Assessment

Louis J. Gullo

Introduction

The Benefits of IEEE 1624-2008

Organizational Reliability Capability

Reliability Capability Assessment

Design Capability and Performability

IEEE 1624 Scoring Guidelines

SEI CMMI Scoring Guidelines

Organizational Reliability Capability Assessment Process

Advantages of High Reliability

Conclusions

References

Index

Contributors

Steven S. Austin, Missile Defense Agency, Department of Defense, Huntsville, Alabama
Lawrence Bernstein, Stevens Institute of Technology, Hoboken, New Jersey
Joseph A. Childs, Missiles and Fire Control, Lockheed Martin Corporation, Orlando, Florida
Jack Dixon, Dynamics Research Corporation, Orlando, Florida
Louis J. Gullo, Missile Systems, Raytheon Company, Tucson, Arizona
Samuel Keene, Keene and Associates, Inc., Lyons, Colorado
Brian Moriarty, Engility Corporation, Lake Ridge, Virginia
Dev Raheja, Raheja Consulting, Inc., Laurel, Maryland
Robert W. Stoddard, Six Sigma IDS, LLC, Venetia, Pennsylvania
C. M. Yuhas

Foreword

The importance of quality and reliability to a system cannot be disputed. Product failures in the field inevitably lead to losses in the form of repair cost, warranty claims, customer dissatisfaction, product recalls, loss of sales, and in extreme cases, loss of life. Thus, quality and reliability play a critical role in modern science and engineering and so enjoy various opportunities and face a number of challenges.

As quality and reliability science evolves, it reflects the trends and transformations of technological support. A device utilizing a new technology, whether it be a solar power panel, a stealth aircraft, or a state-of-the-art medical device, needs to function properly and without failure throughout its mission life. New technologies bring about new failure mechanisms (chemical, electrical, physical, mechanical, structural, etc.), new failure sites, and new failure modes. Therefore, continuous advancement of the physics of failure, combined with a multi-disciplinary approach, is essential to our ability to address those challenges in the future.

In addition to the transformations associated with changes in technology, the field of quality and reliability engineering has been going through its own evolution: developing new techniques and methodologies aimed at process improvement and reduction of the number of design- and manufacturing-related failures.

The concept of design for reliability (DFR) has been gaining popularity in recent years and its development is expected to continue for years to come. DFR methods shift the focus from reliability demonstration and the outdated “test-analyze-fix” philosophy to designing reliability into products and processes using the best available science-based methods. These concepts intertwine with probabilistic design and design for six sigma (DFSS) methods, focusing on reducing variability at the design and manufacturing levels. As such, the industry is expected to increase the use of simulation techniques, enhance the applications of reliability modeling, and integrate reliability engineering earlier and earlier in the design process. DFR also transforms the role of the reliability engineer from being focused primarily on product test and analysis to being a mentor to the design team, which is responsible for finding and applying the best design methods to achieve reliability. A properly applied DFR process ensures that pursuit of reliability is an enterprise-wide activity.

Several other emerging and continuing trends in quality and reliability engineering are also worth mentioning here. For an increasing number of applications, risk assessment will enhance reliability analysis, addressing not only the probability of failure but also the quantitative consequences of that failure. Life-cycle engineering concepts are expected to find wider applications in reducing life-cycle risks and minimizing the combined cost of design, manufacturing, quality, warranty, and service. Advances in prognostics and health management will bring about the development of new models and algorithms that can predict the future reliability of a product by assessing the extent of degradation from its expected operating conditions. Other advancing areas include human and software reliability analysis.

Additionally, continuous globalization and outsourcing affect most industries and complicate the work of quality and reliability professionals. Having various engineering functions distributed around the globe adds a layer of complexity to design coordination and logistics. Moving design and production into regions with little knowledge depth regarding design and manufacturing processes, with a less robust quality system in place and where low cost is often the primary driver of product development, affects a company's ability to produce reliable and defect-free parts.

Despite its obvious importance, quality and reliability education is paradoxically lacking in today's engineering curriculum. Few engineering schools offer degree programs or even a sufficient variety of courses in quality or reliability methods. Therefore, a majority of quality and reliability practitioners receive their professional training from colleagues, professional seminars, and from a variety of publications and technical books. The lack of formal education opportunities in this field greatly emphasizes the importance of technical publications for professional development.

The real objective of the Wiley Series in Quality & Reliability Engineering is to provide a solid educational foundation for both practitioners and researchers in quality and reliability and to expand the reader's knowledge base to include the latest developments in this field. This series continues Wiley's tradition of excellence in technical publishing and provides a lasting and positive contribution to the teaching and practice of engineering.

Andre Kleyner

Editor

Wiley Series in Quality & Reliability Engineering

Preface

Design for reliability (DFR) has become a worldwide goal, regardless of the industry and market. The best organizations around the world have become increasingly intent on harvesting the value proposition for competing globally while significantly lowering life-cycle costs. The DFR principles and methods aim to prevent faults, failures, and product malfunctions proactively, which results in cheaper, faster, and better products. In Japan, this tool is used to gain customer loyalty and customer trust. However, we still face some challenges. Very few engineering managers and design engineers understand the value added by design for reliability; they often fail to see the savings in warranty costs, the increased customer satisfaction, and the gain in market share.

These facts, combined with the current worldwide economic challenges, have created perfect conditions for this science of engineering. It is also an art, because many decisions have to be based not only on evidence but also on engineering creativity, in order to design out failure at lower cost. Readers will be delighted with the wealth of knowledge because all contributors to this book have at least 20 years of hands-on experience with these methods.

The idea for this book was conceived during our participation in the IEEE Design for Reliability Technical Committee. We saw the need for a DFR volume not only for hardware engineers, but also for software and system engineers. The traditional books on reliability engineering are written for reliability engineers who rely more on statistical analysis than on improvements in inherent design to mitigate hardware and software failures. Our book attempts to fill a gap in the published body of knowledge by communicating the tremendous advantages of designing for reliability during the very early development phase of a new product or system. This volume fulfills the needs of entry-level design engineers, experienced design engineers, and engineering managers, as well as reliability engineers and managers who are looking for hands-on knowledge on how to work collaboratively on design engineering teams.

ACKNOWLEDGMENTS

We would like to thank the IEEE Reliability Society for sowing the seed for this book, especially the encouragement from a former society president, Dr. Samuel Keene, who also contributed chapters in the book. We would like to recognize a few of the authors for conducting peer reviews of several chapters: Joe Childs, Jack Dixon, Larry Bernstein, and Sam Keene. We also thank the guest editors—Tim Adams, at NASA Kennedy Center, and Dr. Nat Jambulingam, at NASA Goddard Space Flight Center—who helped edit several chapters. We are grateful to Diana Gialo, at Wiley, who has always been gracious in helping and guiding us.

We acknowledge the contributions of the following:

Steve Austin (Chapter 12)

Larry Bernstein (Chapter 13)

Joe Childs (Chapters 2, 6, and 15)

Jack Dixon (Chapters 9 and 16)

Lou Gullo (Chapters 4, 5, 10, 11, 14, and 18)

Sam Keene (Chapters 3 and 8)

Brian Moriarty (Chapter 17)

Dev Raheja (Chapter 1)

Bob Stoddard (Chapter 7)

C. M. Yuhas (Chapter 13)

Dev Raheja

Louis J. Gullo

Introduction: What You Will Learn

Chapter 1 Design for Reliability Paradigms (Raheja)

This chapter introduces what it means to design for reliability. It shows the technical gaps between the current state of the art and what it takes to design reliability as a value proposition for new products. It gives real examples of how to get a high return on investment, to help readers understand the art of design for reliability. The chapter introduces readers to the deeper-level topics with eight practical paradigms for best practices.

Chapter 2 Reliability Design Tools (Childs)

This chapter summarizes reliability tools that exist throughout the product's life cycle, from creation through requirements, development, design, production, testing, use, and end of life. The need for tools in understanding and communicating reliability performance is also explained. Many of these tools are explained in further detail in the chapters that follow.

Chapter 3 Developing Reliable Software (Keene)

This chapter describes good design practices for developing the reliable software embedded in most high-technology products. It shows how to prevent software faults and failures often inherent in the design by applying evidence-based reliability tools to software, such as FMEA, capability maturity modeling, and software reliability modeling. It introduces the most popular software reliability estimation tool, CASRE (Computer Aided Software Reliability Estimation).

Chapter 4 Reliability Models (Gullo)

This chapter is on reliability modeling, one of the most important tools for design for reliability in the early stages of design, to determine strategy for overall reliability. The chapter covers models for system reliability and component reliability, and shows the use of block diagrams in modeling. It discusses the reliability growth process, similarity analysis used for physical modeling, and widely used models for simulation.

Chapter 5 Design Failure Modes, Effects, and Criticality Analysis (Gullo)

This chapter on FMECA contains the core knowledge for reliability analysis at the system, subsystem, and component levels. The chapter shows how to perform risk assessment using a risk index called the Risk Priority Number and shows how to eliminate single-point failures, making a design significantly less vulnerable. It explains the difference between FMEA and FMECA and how to use them for improving product performance and maintenance effectiveness.

Chapter 6 Process Failure Modes, Effects, and Criticality Analysis (Childs)

The previous chapter showed how to make a design more robust. This chapter applies the FMEA tool to analyze a process for robustness so that manufacturing defects are eliminated before they show up in production. The end result is improved product reliability with lower manufacturing costs. It covers a step-by-step procedure for performing the analysis, including risk assessment using the Risk Priority Number.

Chapter 7 FMECA Applied to Software Development (Stoddard)

The FMEA tool is just as applicable to software design, yet there is very little literature on how to apply it to software. This chapter shows the details of how to use it to improve software reliability. It covers the lessons learned and shows different ways of integrating the FMECA into the most widely used software development model, known as the “V” model. The chapter describes roles and responsibilities for proper use of this tool.

Chapter 8 Six Sigma Approach to Requirements Development (Keene)

In this chapter the author explains why Design of Experiments (DOE) is a sweet spot for identifying the key input variables to a Six Sigma program. The chapter covers the origin of this program, the meaning of Six Sigma measurements, and how it is applied to improve the design. It then proceeds to cover the tools for designing the product for Six Sigma performance to reduce failure rates as close to zero as possible.

Chapter 9 Human Factors in Reliable Design (Dixon)

Humans are often blamed for many product failures when in fact the fault lies in insufficient attention to human factors engineering. This chapter covers the principles of human-centered design to make the man-machine interface robust and error-tolerant. It covers how to perform the human factors analysis and how to integrate it to make the product design user-friendly.

Chapter 10 Stress Analysis During Design to Eliminate Failures (Gullo)

This chapter explains why it is critical to reduce the design stress to improve durability as well as reliability. It introduces the concept of derating as a design tool. The author includes examples on electrical and mechanical stress analysis including how to apply this theory to software design. The chapter also shows how to apply Finite Element Analysis, a numerical technique, to solve specific design problems.

Chapter 11 Highly Accelerated Life Testing (Gullo)

Usually designers cannot predict what failures will occur for a new design. This chapter shows how highly accelerated life tests and highly accelerated stress tests can reveal the failure modes quickly. It covers how to design these tests and how to estimate the design margin from the test results. It shows different methods of accelerating the stresses.

Chapter 12 Design for Extreme Environments (Austin)

When a product is used in extreme cold or extreme heat, such as in Alaska or in a desert in Arizona, we must design for such environments to assure that the product can last long enough. This chapter shows what factors need to be considered and how to design for each condition. It shows how lessons learned from space programs and overseas experience can help make products durable, reliable, and safe.

Chapter 13 Design for Trustworthiness (Bernstein and Yuhas)

This is a very important chapter because software design methods for reliability are not standardized yet. This chapter goes beyond reliability to design software such that it is also safe, and secure from errors in engineering changes, which are very frequent. This chapter covers design methods and offers suggestions for improving the architecture, modules, and interfaces, and for using the right policies for reusing the software. The chapter offers good design practices.

Chapter 14 Prognostics and Health Management Capabilities to Improve Reliability (Gullo)

Design for reliability practices should include detecting an impending malfunction before the product actually malfunctions. This chapter covers prognostics and product health monitoring principles that can be designed into the product. The result is enhanced system reliability. The chapter includes condition-based maintenance and time-based maintenance, use of failure precursors to signal an imminent failure event, and automatic stress monitoring to enhance prognosis.

Chapter 15 Reliability Management (Childs)

This chapter provides both motivation and guidance in outlining the importance of good reliability management. Management participation is the key to any successful reliability in design. It shows how to manage, plan, execute, and document the needs of the program during early design. It describes the important tasks, and closing the feedback loops after reliability assessment, problem solving, and reliability growth testing.

Chapter 16 Risk Management, Exception Handling, and Change Management (Dixon)

Many risks are overlooked in a product design. This chapter defines what risk is in engineering terms and how to predict, assess, and mitigate it. It highlights the role of risk management culture in mitigating risks and the critical role of configuration management in avoiding new risks from design changes. Included in this chapter is how to minimize oversights and omissions, including requirements creep.

Chapter 17 Integrating Design for Reliability with Design for Safety (Moriarty)

This chapter integrates reliability with safety, including how to design for safety. It covers several safety analysis techniques that apply equally to reliability. It shows how a risk assessment code matrix is used widely in aerospace and many commercial products to make risk management decisions. It includes examples of risk reduction.

Chapter 18 Organizational Reliability Capability Assessment (Gullo)

This chapter describes the benefits of using the IEEE 1624-2008 standard to describe how the reliability capability of an organizational entity is determined by assessing eight key reliability practices and associated metrics. Management should know the capability of an organization to deliver a reliable product, which is defined as organizational reliability capability. The chapter describes the process in detail with case studies.

Chapter 1

Design for Reliability Paradigms

Dev Raheja

Why Design for Reliability?

The science of reliability has not kept pace with user expectations. Many corporations still use MTBF (mean time between failures) as a measure of reliability, which, depending on the statistical distribution of failure data, implies acceptance of roughly 50 to 70% failures during the time indicated by the MTBF. No user today can tolerate such a high number of failures. Ideally, a user does not want any failures for the entire expected life! The life expected is determined by the life inferred by users, such as 100,000 miles or 10 years for an automobile, at least 10 years for kitchen appliances, and at least 20 years for a commercial airliner. Most commercial companies, such as automotive and medical device manufacturers, have stopped using the MTBF measure and aim at 1 to 10% failures during a self-defined time. This is still not in line with users' dreams. The real question is: Why not design for zero failures if we can increase profits and gain more market share? Zero failures implies zero mission-critical failures or zero safety-critical system failures. As a minimum, systems in which failures can lead to catastrophic consequences must be designed for zero failures. There are companies that are able to do this. Toyota, Apple, Gillette, Honda, Boeing, Johnson & Johnson, Corning, and Hewlett-Packard are a few examples.
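The dependence of the "50 to 70%" figure on the failure distribution can be checked with a short calculation. The sketch below is only an illustration: it assumes that lifetimes follow a Weibull distribution, a model the text does not name, and the function name and the shape values are hypothetical choices. A shape of 1 corresponds to the exponential (constant failure rate) case.

```python
import math


def fraction_failed_at_mtbf(shape: float) -> float:
    """Fraction of a population expected to have failed by t = MTBF.

    Assumes Weibull-distributed lifetimes (an assumption of this sketch).
    For shape k and scale s the mean life is s * gamma(1 + 1/k), so
    F(mean) = 1 - exp(-(gamma(1 + 1/k)) ** k) and the scale cancels out.
    """
    if shape <= 0:
        raise ValueError("shape must be positive")
    return 1.0 - math.exp(-(math.gamma(1.0 + 1.0 / shape) ** shape))


# Hypothetical shape values: early-life, constant-rate, and wear-out behavior.
for shape in (0.7, 1.0, 2.0, 3.5):
    print(f"shape={shape}: {fraction_failed_at_mtbf(shape):.1%} failed by the MTBF")
```

For the exponential case the result is 1 - 1/e, or about 63%, and for a nearly symmetric wear-out distribution it approaches 50%, which is consistent with the range quoted above.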

The aim of design for reliability (DFR) is to design-out failures of critical system functions in a system. The number of such failures should be zero for the expected life of the product. Some components may be allowed to fail, such as in redundant systems. For example, in aerospace, as long as a system can function at least for the duration of the mission and the failed components are replaced prior to the next mission to maintain redundancy, certain failures can be tolerated. This is, however, insufficient for complex systems where thousands of software interactions, hundreds of wiring connections, and hundreds of human factors affect the systems' reliability. Then there are issues of compatibility [1] among components and materials, among subsystems, and among hardware and software interactions. Therefore, for complex systems we may find it impossible to have zero failures, but we must at least prevent the potential failures we know about. Since failures can come from unknown and unexpected interactions, we should try to design-in fallback modes for unexpected events. A “what-if” analysis usually points to some events of this type. To minimize failures in complex systems, in this book we describe techniques for improving software and interface reliability.
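The redundancy argument above can be made concrete with a small calculation. This is a minimal sketch under stated assumptions that are not taken from the text: each unit has a constant failure rate, units fail independently, any one working unit is enough for the function, and every mission starts with all units replaced or repaired. The function names and all numbers are hypothetical.

```python
import math


def mission_reliability(failure_rate_per_hour: float, mission_hours: float) -> float:
    """Probability that one unit survives the mission (constant failure rate assumed)."""
    return math.exp(-failure_rate_per_hour * mission_hours)


def parallel_reliability(unit_reliability: float, n_units: int) -> float:
    """Probability that at least one of n independent redundant units survives."""
    return 1.0 - (1.0 - unit_reliability) ** n_units


# Hypothetical numbers, chosen only for illustration.
r_single = mission_reliability(failure_rate_per_hour=1e-4, mission_hours=10.0)
r_dual = parallel_reliability(r_single, n_units=2)
print(f"single unit: {r_single:.6f}, redundant pair: {r_dual:.8f}")
```

The independence assumption is the weak point. Unexpected interactions among software, wiring, and human factors, as described above, can defeat both redundant units at once, which is why redundancy alone is insufficient for complex systems.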

As indicated earlier, some companies have built a strong and long-lasting reputation for reliability based on aiming at zero failures. Toyota and Sony built their world leadership mostly on high reliability, and Hyundai has been offering a 10-year warranty and increasing its market share steadily. In 1974, when nobody in the world gave a warranty longer than one year, Cooper Industries gave a 15-year warranty to electric power utilities on high-voltage transformer components and stood out as the leader in profitability among all Fortune 500 electrical companies. Progress has been made since then. Raytheon has established a culture at the highest level in the corporation of providing customers with mission assurance through a “no doubt” mindset. Says Bill Swanson, chairman and CEO of Raytheon: “[T]here must be no doubt that our products will work in the field when they are needed” (Raytheon Company, Technology Today, 2005, Issue 4). Similarly, with its new lifetime power train warranty, Chrysler is creating new standards for reliability.

Reflections on the Current State of the Art

Reliability is defined as the probability of performing all the functions (including safety functions) satisfactorily for a specified time under specified use conditions. The functions and use conditions come from the specification. If a specification omits requirements or states them vaguely 60% or more of the time, the reliability predictions are of very little value. This is usually the case [2]. The second big issue is: How many failures should be tolerable? Some readers may not agree that we can design for zero critical failures, but the evidence supports the contrary conclusion. We may not be able to prevent failures that we did not foresee, but we can design out all the critical failure modes that we discover during the requirements analysis and in the failure mode and effects analysis (FMEA). In over 30 years' experience, I have yet to encounter a failure mode that cannot be designed-out. The cost is usually not an issue if the FMEA is conducted and the improvements are made during the early design stage. The time specified for critical failures in the reliability definition should be the entire lifetime expected.

In this chapter we address how to write a good system specification and how to design so as not to fail. We make it clear that the design for reliability should concentrate on the critical and major failures. This prevents us from solving easy problems and ignoring the complex ones. The following incident raises issues that are central to designing for reliability.

The lessons learned from the collapse of the Interstate 35 bridge in Minnesota into the Mississippi River on August 1, 2007, which killed 13 people, give us some clues about what needs to be done. Similar failure mechanisms can be found in many large electrical and mechanical systems, such as aircraft and electric power plants.

The bridge was expanded from four lanes to six, and eventually to eight. Some wonder whether that might have played a role in its collapse. Investigators said the failure resulted from a flaw in its design. The designers had specified a metal plate that was too thin to serve as a junction of several girders.

Like many products, the bridge was gradually exposed to higher loads, adding strain to the weak spot. At the time of the collapse, the maintenance crews had brought tons of equipment and material onto the deck for a repair job. The bridge was of a design known as a nonredundant structure, meaning that if a single part failed, the entire structure could collapse. Experts say that the pigeon dung all over the steel could have caused faster corrosion than was predicted.

This case history challenges the fundamentals of engineering taught in the universities.

Should the design margin be 100% or 800%? How does the designer determine the design margin?

Should we design for pigeons doing their dirty job?

What about designing for all the other environmental stressors, such as chemicals sprayed during snow emergencies, tornados, and earthquakes?

Should we design-in redundancy on large mechanical systems to avoid disasters?

Conventional wisdom says that redundancy delays failures but may not avoid disasters. The failure could occur in all the redundant paths at once, such as in an aircraft accident where flying debris cut through all three redundant hydraulic lines.

Should we design for sudden shocks experienced by the bridge during repair and maintenance?

These concerns apply to any product, such as electronics, electrical power systems, and even a complex software design. In software, the corrosion can be symbolic for applying too many patches without knowing the interactions. Call it “software corrosion.”

The answers to the questions above should be a resounding “yes.” An engineering team should foresee all these and many more failure scenarios before starting to design. The obvious strategy is to write a good system specification by first predicting all major potential failures and avoiding them by writing robust requirements. Oversights and omissions in specifications are the biggest weakness in the design for reliability. Typically, 200 to 300 requirements are missing or vague for a reasonably complex system such as an automotive transmission.

Analysis techniques covered in this book for hardware and software help us discover many missing requirements, and a good brainstorming session for overlooked requirements always results in discovering many more. What we really need, perhaps, are paradigms based on lessons learned.

The Paradigms for Design for Reliability

Reliability is a process. If the right process is followed, results are likely to be right. The opposite is also true in the absence of the right process. There is a saying: “If we don't know where we are going, that's where we will go.” It is difficult enough to do the right things, but it is even more difficult to know what the right things are!

Knowledge of the right things comes from practicing the use of lessons learned. Just having all the facts at your fingertips does not work. One must utilize the accumulated knowledge for arriving at correct decisions. Theory is not enough. One must keep becoming better by practicing. Take the example of swimming. One cannot learn to swim from books alone; one must practice swimming. It is okay to fail as long as mistakes are the stepping stones to failure prevention. Thomas Edison was reminded that he failed 2000 times before the success of the light bulb. His answer, “I never failed. There were 2000 steps in this process.”

One of the best techniques is to use lessons learned in the form of paradigms. They are easy to remember and they make good topics for brainstorming during design reviews.

Paradigm 1: Learn To Be Lean Instead of Mean

When engineers say that a component's life is five years, they usually imply the calculation of the mean value, which says that there is a 50% chance of failure during the five years. In other words, either the supplier or the customer has to pay for 50% failures during the product cycle. This is expensive for both: a lose–lose situation. Besides, there are many indirect expenses: for warranties, production testing, and more inventories to replace failed parts. This is mean management. It has a negative return on investment. It is mean to the supplier because of loss of future business and mean to the customer in putting up with the frustrations of downtime and the cost of business interruptions. Therefore, our failure rate goal should be as lean as possible. Engineers should promise minimum life to customers, not mean life. Never use averages in reliability; they are of no use to anyone.
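The difference between mean life and minimum life can be made concrete with a short calculation. The following is a minimal sketch in Python, assuming that component life follows a Weibull distribution; the shape and scale values are hypothetical and chosen only so that the mean life comes out near five years. It computes the fraction of units failed by the mean life and the B1 life, the age by which 1% of units have failed, which is closer to what a minimum-life promise would rest on.

```python
import math


def weibull_fraction_failed(t, shape, scale):
    """Fraction of units expected to have failed by time t (Weibull CDF)."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def weibull_mean_life(shape, scale):
    """Mean life of a Weibull-distributed population."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def weibull_b_life(percent_failed, shape, scale):
    """Age by which `percent_failed` percent of units have failed (B1, B10, ...)."""
    p = percent_failed / 100.0
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# Hypothetical component: these parameters are illustrative assumptions,
# not data from any real product.
shape, scale = 2.0, 5.64  # scale in years

mean_life = weibull_mean_life(shape, scale)
failed_by_mean = weibull_fraction_failed(mean_life, shape, scale)
b1_life = weibull_b_life(1.0, shape, scale)

print(f"mean life:             {mean_life:.2f} years")
print(f"failed by mean life:   {failed_by_mean:.0%}")
print(f"B1 life (1% failed):   {b1_life:.2f} years")
```

Under these assumptions roughly half of the units fail before the mean life, while the B1 life is only a small fraction of it. The exact fraction failed at the mean depends on the distribution, so it should be calculated for the component in question rather than assumed.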

Paradigm 2: Spend a Lot of Time on Requirement Analysis

It is worth repeating that the sources of most failures are incomplete, ambiguous, and poorly defined requirements. That is why we introduce unnecessary design changes and write deviations when we are in a hurry to ship a product. Look particularly for missing functions in the specifications. There is often practically nothing in a specification about modularity, reliability, safety, serviceability, logistics, human factors, reduction of “no faults found,” diagnostics capability, and prevention of warranty failures. Very few specifications address even obvious requirements, such as internal interface, external interface, user–hardware interface, user–software interface, and how the product should behave if and when a sneak failure occurs. Developing a good specification is an iterative process with inputs from the customer and the entities that are downstream in the process. Those who are trying to build reliability around a faulty specification should only expect a faulty product. Unfortunately, most companies think of reliability when the design is already approved. At this stage there is no budget and no time for major design changes. The only thing a company can do is to hope for reasonable reliability and commit to do better the next time.

To identify missing functions, a cross-functional team is necessary. At least one member from each discipline should be present, such as manufacturing, field service, and marketing, as well as a customer representative. If the specification contains only 50% of the necessary features, how can one even think of reliability? Reliability is not possible without accurate and comprehensive specifications. Therefore, writing accurate performance specifications is a prerequisite for reliability. Such specifications should aim at zero failures for the modes that result in product recalls, high downtime, and inability to diagnose. My interviews with those attending my reliability courses reveal that the dealers are unable to diagnose about 65% of the problems (no faults found). Obviously, fault isolation requirements in the specifications are necessary to reduce downtime.

To ensure the accuracy and completeness of a specification, only those who have knowledge of what makes a good specification should approve it. They must ensure that the specification is clear on what the product should never do, however stupid it may sound. For example: “There shall be no sudden acceleration during landing” for an aircraft. In addition, the marketing and sales experts should participate in writing the specification to make sure that old warranty problems “shall not” be in the new product and that there is enough gain in reliability to give the product a competitive edge.

The “shall not” specification is not limited to failures. That would be too simple. We must be able to see the complexity in this simplicity. This is called interconnectedness. We need to know that reliability is intertwined with many elements of life-cycle costs. The costs of downtime, repairs, preventive maintenance, amount of logistics support required, safety, diagnostics, and serviceability are dependent on the level of reliability. In the same spirit, we should also analyze product friendliness and modularity, which are interconnected with reliability. For example, General Motors is designing its hydrogen cars to have a single chassis for all models instead of 80 different chassis as is the case with current production. This action influences reliability in many ways. Similarly, an analysis of downtime should be conducted by service engineering staff to ensure that each fault will be diagnosed in a timely manner, repairs will be quick, and life-cycle costs will be reduced by extending the maintenance cycles or eliminating the need for maintenance altogether. The specification should be critiqued for quick serviceability and ease of access. Until the specification is written thoroughly and approved, no design work should begin. An example of the need to identify missing requirements is that nearly 1000 people around the world lost their lives while the kinks were being removed from the 290-ton McDonnell Douglas DC-10 during the 1970s. Blown-out cargo doors, shredded hydraulic lines, and engines dropped during the flight were just a few of the behemoth's early problems. It is obvious that the company did not have the right system performance specification. We rely on customers to tell us what they want, but they themselves don't know many requirements until there is a breakdown. Customers are not going to tell us that the cargo doors should not blow out during a crowded flight. It is the design team's responsibility to figure out what the customers did not say.

To find the design flaws early, a team has to view the system from various angles. You would not buy a house by just looking at the front view. You want to see it from all sides. Similarly, a product concept has to be viewed from at least the following perspectives:

Functions of the product

Range of applications

Range of environments

Active safety

Duty cycles during life

Reliability

Robustness for user or servicing mistakes

Logistics requirements

Manufacturability requirements

Internal interface requirements

External interface requirements

Installation requirements

Shipping and handling capabilities

Serviceability and diagnostics capabilities

Prognostics health monitoring

Usability on other products

Sustainability

Sustainability, the last item in the list above, needs some explanation. Good product design is about meeting current needs without compromising the needs of future generations, such as by pollution or global warming. Current electronics and computers are not designed for sustainability. They should have been designed for reuse; the ability to recycle is not enough. Not everyone makes an effort to recycle. According to NBC News on October 4, 2007, there are over 3 billion such devices and only 15% are recycled. About 200 million tons, with mercury in the monitors and lead in the solder, wind up in landfills and often in drinking water.

Most designers are likely to miss many of the requirements noted above. This knowledge is not new. It can be included by inviting experts in these areas to brainstorm. There is no mechanism for customers to specify all of these. Suppliers that want to do productive work will teach customers how to develop good requirements as a team member. This makes the customer understand what needs to be in the contract. The point here is that if we have to fix many mistakes later (expensively), we cannot be proud of reliability, as craftsmen once were.

Paradigm 3: Measure Reliability by Life-Cycle Costs

It is wrong to measure reliability in terms of failure rates alone. Such a negative index with unknown impact does not get much attention from management, except when there is a crisis. It is the cost of failures that is important. It should be measured by reduction in life-cycle costs. The fewer the failures, the lower is the life-cycle cost. The costs should be measured over the expected life. They are not just warranty costs; they include the cost of downtime, repairs, logistics, human errors, and product liability. When I was in charge of the reliability of the Baltimore Rapid Transit Train system design, the reliability performance was measured in terms of cost per track mile. Similarly, at Baltimore Gas & Electric, reliability is measured in terms of cost per circuit mile. Smart customers look for only one performance feature: the life-cycle cost per unit of use. Those who approve the specification should concentrate on this measure. Reliability must result in cheaper, faster, and better products.
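The measure described above, life-cycle cost per unit of use, reduces to simple bookkeeping. Here is a minimal sketch in Python; the cost categories follow the ones named in the text, the `acquisition` field is an added assumption, and every figure is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LifeCycleCosts:
    """Costs accumulated over the expected life (all values hypothetical)."""
    acquisition: float   # assumption: purchase price counted alongside failure costs
    warranty: float
    downtime: float
    repairs: float
    logistics: float
    human_error: float
    liability: float

    def total(self) -> float:
        return (self.acquisition + self.warranty + self.downtime + self.repairs
                + self.logistics + self.human_error + self.liability)


def cost_per_unit_of_use(costs: LifeCycleCosts, units_of_use: float) -> float:
    """Life-cycle cost divided by use, e.g. track miles or circuit miles."""
    if units_of_use <= 0:
        raise ValueError("units_of_use must be positive")
    return costs.total() / units_of_use


# Two hypothetical designs serving the same 100 track miles.
baseline = LifeCycleCosts(1000, 150, 400, 250, 100, 50, 50)
more_reliable = LifeCycleCosts(1100, 30, 80, 50, 40, 20, 10)

print(cost_per_unit_of_use(baseline, 100))
print(cost_per_unit_of_use(more_reliable, 100))
```

In this made-up comparison the design with the higher purchase price has the lower cost per track mile, which is the comparison the measure is meant to expose.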

Paradigm 4: Design for Twice the Life

Why twice the life? The simple answer is that it is a fundamental taught in Engineering 101, which seems to have been forgotten. Remember the 100% design margin? Second, it is cheaper than designing for one life if we measure reliability by the life-cycle cost savings. A division of Eaton Corporation requires twice the life at 500% return on investment [3]. It actually turns the situation into a positive cash flow, since there is nothing to be monitored if the failures occur beyond the first life. The 50% failure rate is now shifted to the second life, when the product is going to be obsolete. Engineers try to design transmission components without increasing the size or weight, using alternative means such as heat treating in a different way or eliminating joints in the assemblies. Occasionally, they may increase the size by a very minor amount, such as on wires or connectors, to expedite the solution. This is acceptable as long as the return on investment is at least 500%.
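The 500% threshold can be checked with simple arithmetic. The following is a minimal sketch in Python, assuming the conventional definition of ROI as net gain divided by investment (the text does not spell out the formula used); the function names and the dollar figures are hypothetical.

```python
def roi_percent(life_cycle_savings: float, investment: float) -> float:
    """Return on investment in percent: net gain divided by investment."""
    if investment <= 0:
        # The ratio is undefined when nothing is invested.
        raise ValueError("investment must be positive")
    return 100.0 * (life_cycle_savings - investment) / investment


def meets_roi_goal(life_cycle_savings: float, investment: float,
                   goal_percent: float = 500.0) -> bool:
    """True when a proposed design change reaches the ROI goal."""
    return roi_percent(life_cycle_savings, investment) >= goal_percent


# Hypothetical design change: 10,000 invested, 60,000 saved over the life cycle.
print(roi_percent(60_000, 10_000))
print(meets_roi_goal(60_000, 10_000))

# A change that saves only 40,000 for the same investment falls short of the goal.
print(meets_roi_goal(40_000, 10_000))
```

The savings figure should come from the life-cycle cost comparison, not from the component price alone.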

Another reason for twice the life is the need to avoid engineering changes, which seems to be obvious. Imagine a bridge designed for 20-ton trucks and a 30-year life. It may have no problems in the beginning. But the bridge degrades over time. After 10 years it may not be strong enough to take even 15 tons, and it is very likely to collapse. If it had been designed for twice the load (for 40 tons) or for a 60-year life, it should not fail at all during 30 years. It should be noted that designing for twice the load also results in twice the life most of the time, but one must still use some engineering judgment. This is similar to a 100% design margin. For the same reason, the electronic components in the aerospace industry are derated 50%. In one assembly the load-bearing capability was more than doubled by using a cheaper round key instead of a rectangular key. The round key has practically no stress concentration points. In another design, twice the life as well as twice the load capability were achieved by molding two parts as a single piece, preventing stresses at the joint. The cost was lower because no assembly was required, there were fewer part numbers in the inventory, no failures, and no downtime for customers.

What if we cannot design for twice the life? There are times when we cannot think of a proper solution for twice the life. Then one can go to other options, such as:

Providing redundancy on the weakest links, such as bolts, corroded joints, cables, and weak components.

Designing to fail safely such that no one is injured. For automobiles a safe mode can be that the car can switch to a degraded performance with enough time left to reach home or a repair facility.

Designing-in early prognostics-type warnings so that the user still has sufficient time to correct the situation—before failure occurs. One of the purposes of prognostics is to predict the remaining life.

Paradigm 5: Safety-Critical Components Should Be Designed for Four Lives

The rule of thumb in aerospace for safety-related components is to design for four times the life. A U.S. Navy policy (NAVAIR) is to design safety-critical components for four times the life and conduct a test for a minimum of twice the life. The expected life should include future increases in load. Many airlines use their aircraft beyond the design life by performing more maintenance. This indirectly forces many components to work beyond one normal life. This is the main reason for designing for four times the life: to maintain a 100% design margin all the time. Similarly, many consumers drive cars far beyond the expected 10-year life.

We should also design for peak loads, not the usual mean load. When a high-voltage cable used in power lines broke easily, engineers could not duplicate the failure with average loads. When they applied the peak loads, they could.
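The cable example shows why a margin computed against the mean load can hide a failure. Here is a minimal sketch in Python, assuming margin is expressed as strength divided by load, minus one; the load history and the rated strength are hypothetical numbers, not data from the incident described.

```python
from statistics import mean


def load_margins(loads, rated_strength):
    """Return (margin at mean load, margin at peak load) as fractions.

    A margin of 1.0 is a 100% design margin; a negative margin means the
    load exceeds the rated strength.
    """
    average_load = mean(loads)
    peak_load = max(loads)
    return (rated_strength / average_load - 1.0,
            rated_strength / peak_load - 1.0)


# Hypothetical load history with one rare peak, and a hypothetical strength.
history = [40, 45, 50, 42, 48, 95, 44]
margin_at_mean, margin_at_peak = load_margins(history, rated_strength=80.0)

print(f"margin at mean load: {margin_at_mean:+.0%}")
print(f"margin at peak load: {margin_at_peak:+.0%}")
```

With these numbers the design looks comfortable against the average load and is overstressed at the peak, which matches the pattern in the text: the failure could be reproduced only when peak loads were applied.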

Designing for four times the life does not mean overdesigning. It is the art of choosing the right concept. If the attention is placed on innovation rather than marginal improvements, engineers can design for multiple lives with little or no investment, as shown earlier by several examples. They must encourage themselves to think differently rather than latching on to outdated traditional methods of increasing the size or weight. Engineers who talk of costs when solving problems usually block out creativity. They draw the boundary around the solution. Their first thought is to increase the size or weight to design for high loads. This is very common defective thinking. This is where the universities need to be more knowledgeable. We need to balance logic with creativity and should still be able to show a high return on investment.

Paradigm 6: Learn to Alter the Paradox of Cost and Performance into a Win–Win Situation

Most engineers are of the opinion that high reliability costs more. World-class organizations embrace the paradox of increasing reliability and lowering costs simultaneously. Trade-off between reliability and cost is not always necessary. Toyota has mastered this paradigm, where high reliability and lower life-cycle costs are a way of life. Toyota has learned over the years that preventing failures is always cheaper than fixing them if the failure prevention process starts early in the design. If we capture the potential failures during the requirements analysis, we can include design for reliability without making wasteful engineering changes later. Similarly, during detailed design reviews, such tools as design failure modes, effects, and criticality analysis (FMECA), process FMECA, and fault tree analysis, if used early, can help us discover many missing, vague, and incomplete requirements. Engineering changes are the biggest source of waste in organizations, because most of them can be prevented. Here are some examples of achieving high reliability with very little or no investment. Since high reliability reduces life-cycle costs, the insignificant amount of investment does not negatively affect the win–win scenario.

Example 1

A company in Brazil had designed a large warning light bulb on a control console, with a plastic cover to reduce glare. They told me that they tried all kinds of plastics for the cap but that all of them melted after a few months. Someone suggested using a glass cover. We received the usual stupid answer: “Glass will cost three times as much as plastic. The cost of the product will be high.” The bad part is that many engineers look only at the cost of the component and completely ignore the cost of losing customers and the warranty costs to the employer. They are unaware that the cost of getting a new customer is at least five times the cost of retaining a current customer. When the team calculated the life-cycle costs of plastic versus the glass cap, the return on investment (ROI) turned out to be 300% in favor of the glass material. The author asked them to put the solution on hold because we had agreed on an ROI goal of at least 500%. The author advised the entire team to take long showers for three weeks in the hope that someone would come up with a better idea. Why? Because when you take a long shower, your brain is calmed. In this state it is able to use over 1000 billion neurons that you have never used.

It so happened that the present author (the facilitator) was the one taking the long shower. Suddenly I began to feel that the engineers were giving me a snow job! They said that they tried all the plastics and they all melted. This could not be true. There are fundamentally two types of plastics: thermoplastics, which melt with heat, and thermoset plastics, which harden with the heat. I sent them an e-mail suggesting that they try thermoset plastic. It worked. They could not melt it, no matter how much heat they put in. They sent a nice e-mail: “Thanks for the research you did for us.” The cost of the new plastic was almost the same. Zero investment. One hundredfold life. One million percent ROI!

Example 2

The original European jet aircraft, the Comets, were cracking around the windows. They were taken out of service for two years. The engineers, as usual, started to design thicker fuselage walls and proposed an enormous cost increase. Then someone suggested examining the failures and discovered that all the failures were around the corners of the windows. He suggested increasing the radius at the corners. Problem solved quickly, with hardly any investment. The ROI was at least 100,000% if you consider the ratio of the cost of thickening the fuselage to the investment in changing the radius on the corners of the windows.

Example 3

At a General Motors facility, the headlamps were failing after about 1000 hours of use. The supplier was going to raise the price 100% to design for twice the life. An engineer turned the filament in the headlamp 90° to avoid harmful vibration and the life increased at least sixfold. Practically zero investment.

Example 4

A dent in a Caterpillar tractor spring was causing premature breakdowns. The reason for the dent was that the spring under the tractor occasionally hit rocks on the ground. The engineers reduced the diameter of the spring such that it wouldn't hit rocks and replaced it with a tougher spring. With a very small investment they got a better than 10,000% ROI.

Paradigm 7: Design to Avoid Latent Manufacturing Flaws

We can design for reliability as much as we want, but if manufacturing processes are subject to operator error and to wide swings in variability, a good design is bound to have premature failures. We need to identify manufacturing features such as the correct torque for fasteners, vulnerability to installing components backward, or vulnerability to using the wrong components. These features could be certain dimensions, alignment, proper fit of mating parts, property of a lubricant, workmanship, and so on. A product should be designed to avoid such vulnerabilities or should be testable during manufacturing to detect abnormalities. For lack of current terminology, we can call it design to avoid latent manufacturing flaws.

Let's look at an example of designing to reduce vulnerability to manufacturing variations. A new motorcycle design involved over 50 different fasteners. Following process FMEA, the production operators discovered that a separate torque was required for each fastener joint. They approached design engineers to ask if they could choose about 20 different fasteners instead of 50. This would allow them to concentrate on fewer fasteners and fewer fastening standards. Engineers were flabbergasted: Such advice coming from the hourly workers was an aha! moment for them. They standardized on a few fasteners.

Another example is from Delco Electronics (now Delphi). A plastic panel required that a plating process have a conductive surface. The plating had been peeling off in two to three years and six sigma team efforts failed to control the plating durability. Someone came up with the bright idea of adding carbon particles to the plastic to make it conductive. The entire plating process was eliminated. The cost went down by 70%. The reliability of the conductivity was now 100%! A good example of over 100,000% ROI.

The secret of controlling manufacturing flaws is to identify where inspection is needed and to design the process such that no inspection is required—if such a solution is possible.

One more example may help. In this case, the process is the focus. Assume that we want to design a dinner table with four legs such that the legs must be equal. If we cut one leg at a time, we cannot get them all equal because of the variability in the cutting process. But if we take all four legs together, and cut all of them with a single cut, they will all be equal.

Paradigm 8: Design for Prognostics Health Monitoring

In complex systems such as telecommunications and fly-by-wire systems, most system failures are not from component failures. They are from very complex interactions and sneak circuits. Failure rates are very difficult to predict. The sudden acceleration experienced by Audi 5000 users during the 1980s was a result of a software sneak failure. A bit in the integrated circuit register got stuck at zero value, which rapidly increased the speed when the gear was engaged in reverse mode. One way to prevent system failures is to monitor the health of critical features such as “stuck at” faults, critical functions, and critical inputs to the system. A possible solution is to develop a software program to determine prognostics, diagnostics, and possible fallback modes.

The following data on a major airline, announced at a Federal Aviation Administration (FAA) and National Aeronautics and Space Administration (NASA) workshop [4], show the extent of unpredicted failures:

Problems reported confidentially by airline employees: about 13,000

Number actually in airline files: about 2%, or 260

Number known to the FAA: about 1%, or 130

The sneak failures are more likely to be in embedded software, where it is impractical to do a thorough analysis. Frequently, the software requirements are faulty because they are not derived completely from the system requirements. Peter Neumann, a computer scientist at SRI International, highlights the nature of damage from software defects in the last 15 years [5]:

Wrecked a European satellite launch

Delayed the opening of the new Denver airport by one year

Destroyed a NASA Mars mission

Induced a U.S. Navy ship to destroy an airliner

Shut down ambulance systems in London, leading to several deaths

To counter such risks, we need an early warning, early enough to prevent a major mishap. This tool is prognostics health monitoring. It consists of tracking all the possible unusual events, such as signal rates, the quality of the inputs to the system, or unexpected outputs from the system, and designing in intelligence to detect unusual system behavior. The intelligence may consist of measuring important features and making a decision as to their impact. For example, a sensor input occasionally occurs after 30 milliseconds instead of 20 milliseconds as the timing requirement states. The question is: Is this an indication of a disaster? If so, the sensor calibration may be required before the failure manifests as a mishap.
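The timing example above can be sketched as a small monitor. This is a minimal illustration in Python, not a prescribed implementation: the class name, the window size, and the 20% late-fraction threshold are all assumptions chosen for the example. Only the 20-millisecond requirement and the 30-millisecond late arrival come from the text.

```python
from collections import deque


class TimingMonitor:
    """Tracks how often a sensor input arrives later than its requirement."""

    def __init__(self, limit_ms=20.0, window=50, max_late_fraction=0.2):
        self.limit_ms = limit_ms                    # timing requirement
        self.max_late_fraction = max_late_fraction  # assumed warning threshold
        self.recent = deque(maxlen=window)          # True where a sample was late

    def record(self, interval_ms):
        """Record one arrival interval and return 'ok', 'late', or 'warning'.

        The late fraction is computed over the samples seen so far (at most
        `window` of them), so an early late sample weighs heavily.
        """
        late = interval_ms > self.limit_ms
        self.recent.append(late)
        late_fraction = sum(self.recent) / len(self.recent)
        if late_fraction > self.max_late_fraction:
            return "warning"  # trend suggests degradation: recalibrate or inspect
        return "late" if late else "ok"


monitor = TimingMonitor(limit_ms=20.0, window=10, max_late_fraction=0.2)
for interval in [20, 20, 19, 20, 20, 30, 30]:
    print(interval, monitor.record(interval))
```

A single late arrival is logged but tolerated; a rising share of late arrivals raises a warning while the system is still working, which is the point of prognostics. What the warning triggers (recalibration, a fallback mode, a maintenance request) is a design decision that belongs in the specification.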

Summary

In summary we can say that we need to define functions correctly. We need to design not to fail, and we need to implement all the paradigms covered in this chapter, including designing to avoid manufacturing problems. Once I was at a company meeting where the customers were asked to describe the warranty they would wish to have. One of them said (and others agreed): No warranty is the best warranty. Very few understood the paradox—the best warranty would be one that would never experience a claim. In other words, the customers wanted a failure-free design for reliability.

References

[1] Kuo, W., Compatibility and simplicity: the fundamentals of reliability, IEEE Trans. Reliab., vol. 56, Dec. 2007.

[2] Raheja, D. G., Product Assurance Technologies, Design for Competitiveness, Inc., 2002.

[3] Raheja, D. G., and Allocco, M., Assurance Technologies Principles and Practices: A Product, Process, and System Safety Perspective, 2nd ed., Wiley, Hoboken, NJ, 2006, Appendix.

[4] Farrow, D. R., presented at the Fifth International Workshop on Risk Analysis and Performance Measurement in Aviation, sponsored by FAA and NASA, Baltimore, Aug. 19–21, 2003.

[5] Mann, C. C., Why software is so bad, Technol. Rev., July-Aug. 2002.

Chapter 2

Reliability Design Tools

Joseph A. Childs

Introduction

The importance of designing reliability into a product was the focus of Chapter 1. As technology continues to advance, products continue to increase in complexity. Their ability to perform when needed and to last longer are becoming increasingly important. Similarly, it is becoming more and more critical to be able to predict failure occurrences for today's products more effectively and more thoroughly. This means that reliability engineers must be increasingly effective at understanding what is at stake, assessing reliability, and assuring that product reliability maturity is at the level required. To assure this effectiveness, tools have been developed in the reliability engineering discipline. This chapter is a summary of such tools that exist in all aspects of a product's life: from invention, design, production, and testing, to its use and end of life.

The automation of reliability methods into tools is important for the repeatability of the process and results, for value-added benefits in terms of cost savings during the application of design analysis methods, and for achieving desired results faster, improving design cycle times. As design processes evolve, the tools should evolve. Innovation in the current electrical and mechanical design tool suite should include interfacing to the current design reliability tool suite.

Reliability Tools in the Life Cycles of Products

One important thing about reliability engineering as a discipline is that it is involved in all parts of a product's life: from product inception, its manufacture and use, to its end of life. This is because reliability is an intrinsic part of a product's essence, whether it is a “throwaway” coffee cup or a sophisticated spacecraft intended to last 10 years in outer space. As an intrinsic parameter, it must be taken into account in the definition, design, building, test, and use (and abuse) of the product. For each program phase, tools have been devised to enable engineers to gain insight into the requirements and status of reliability. Figure 1 provides a generalized flow, representing any product's life cycle and how reliability mirrors those phases throughout a development program. Figure 2 notes key activities and events throughout a product's life cycle.

Figure 1 Reliability involvement in program and product life.

Figure 2 Program and product life tasks.

The reliability tools are designed to help the reliability function to assess and enhance the design so that the product is capable of meeting and exceeding its goals.

In this chapter we provide an overview of many of the tools used in the design life of a product: what they are, how they are performed, and how their results are used by the various design disciplines—reliability, electrical, mechanical, and software design, test, and manufacturing engineering. Figure 3 illustrates the reliability tools that are discussed here, when they might be used in a product's life cycle, and how these tools match the actions and events in each phase of a product's lifetime.

Figure 3 Program and product life tasks, tied to reliability tasks.

The Need for Tools: Understanding and Communicating Reliability Performance

Engineering