Reliability Prediction for Microelectronics
Wiley Series in Quality & Reliability Engineering

REVOLUTIONIZE YOUR APPROACH TO RELIABILITY ASSESSMENT WITH THIS GROUNDBREAKING BOOK

Reliability evaluation is a critical aspect of engineering, without which safe performance within desired parameters over the lifespan of machines cannot be guaranteed. With microelectronics in particular, the challenges to evaluating reliability are considerable, and statistical methods for creating microelectronic reliability standards are complex. With nano-scale microelectronic devices increasingly prominent in modern life, it has never been more important to understand the tools available to evaluate reliability.

Reliability Prediction for Microelectronics meets this need with a cluster of tools built around principles of reliability physics and the concept of remaining useful life (RUL). It takes as its core subject the 'physics of failure', combining a thorough understanding of conventional approaches to reliability evaluation with a keen knowledge of their blind spots. It equips engineers and researchers with the capacity to overcome decades of errant reliability physics and place their work on a sound engineering footing.

Reliability Prediction for Microelectronics readers will also find:

* Focus on the tools required to perform reliability assessments in real operating conditions
* Detailed discussion of topics including failure foundation, reliability testing, acceleration factor calculation, and more
* New multi-physics of failure on DSM technologies, including TDDB, EM, HCI, and BTI

Reliability Prediction for Microelectronics is ideal for reliability and quality engineers, design engineers, and advanced engineering students looking to understand this crucial area of product design and testing.
Joseph B. Bernstein
Ariel University, Israel
Alain A. Bensoussan
Toulouse, France
Emmanuel Bender
Massachusetts Institute of Technology (MIT), Cambridge, USA
This edition first published 2024
© 2024 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The rights of Joseph B. Bernstein, Alain A. Bensoussan, and Emmanuel Bender to be identified as the authors of this work have been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data applied for:
ISBN: HB: 9781394210930; ePDF: 9781394210947; epub: 9781394210954
Cover design: Wiley
Cover image: © Richard Newstead/Getty Images
To our patient and dedicated wives:
Rina Batya
Revital
To my wife Corinne, my son Edwin,and to the memory of my parents, Myriem and Isaac Bensoussan.
Joseph B. Bernstein
Professor, Ariel University, Ariel (Israel)
Biography
Professor Joseph B. Bernstein specializes in several areas of nano-scale micro-electronic device reliability and physics of failure research, including packaging, system reliability modeling, gate oxide integrity, radiation effects, Flash NAND and NOR memory, SRAM and DRAM, MEMS, and laser-programmable metal interconnect. He directs the Laboratory for Failure Analysis and Reliability of Electronic Systems, teaches VLSI design courses, and heads the VLSI program at Ariel University. His laboratory is a center of research activity dedicated to serving the needs of manufacturers of highly reliable electronic systems using commercially available off-the-shelf parts. Research areas include thermal, mechanical, and electrical interactions of failure mechanisms of ultra-thin gate dielectrics, nonvolatile memory, advanced metallization, and power devices. He also works extensively with the semiconductor industry on projects relating to failure analysis, defect avoidance, programmable interconnect used in field-programmable analog arrays, and repair in microelectronic circuits and packaging. Professor Bernstein was a Fulbright Senior Researcher/Lecturer at Tel Aviv University in the Department of Electrical Engineering, Physical Electronics. Professor Bernstein is a senior member of IEEE.
Alain A. Bensoussan
Thales Alenia Space France (1988–2019)
Senior Engineer, Optics and Opto-electronics Parts Expert at Thales Alenia Space France, Toulouse (France) (2010–2019)
Formerly, Technical Advisor for Microelectronic and Photonic Components Reliability at IRT Saint Exupery, Toulouse, France (2014–2017)
Biography
Dr. Alain Bensoussan is a Doctor of Engineering (Dr.-Ing.) and docteur d'Etat from University Paul Sabatier (Toulouse, France) in applied physics. His field of expertise is microelectronic parts reliability at Thales Alenia Space. He worked at the Institut de Recherche Technologique (IRT) Saint Exupery (Aeronautics, Space, and Embedded Systems, AESE), Toulouse (France), as a technical adviser for microelectronic and photonic components reliability and was recognized at Thales Alenia Space as an expert on optics and opto-electronics parts. Dr. Bensoussan's interests lie in several areas of microelectronics reliability and physics of failure applied research: GaAs and III-V compounds, monolithic microwave integrated circuits (MMICs), microwave hybrid modules, Si and GaN transistors, ICs and deep-sub-micron technologies, MEMS and MOEMS, and active and passive optoelectronic devices and modules. He has represented Thales Alenia Space in space organizations such as EUROSPACE and ESA for more than 15 years.
Emmanuel Bender
Postdoctoral Researcher at Ariel University and Research Affiliate at the Massachusetts Institute of Technology (MIT), Cambridge, USA
Biography
Dr. Emmanuel Bender received the Ph.D. degree in electrical and electronics engineering from Ariel University, Ariel, Israel, in 2022. He specializes in statistical failure analysis of silicon VLSI technologies, including 16 nm FinFETs. His work focuses on failure phenomena in packaged devices, including bias temperature instability, electromigration, hot carrier instability, and the self-heating effect. He applied the Multiple Temperature Operational Life (MTOL) testing method to generate reliability profiles on FPGA-programmed test structures in 45 nm, 28 nm, and 16 nm technologies. He is currently working as a Postdoctoral Researcher with Ariel University and has a research affiliation with the Microsystems Technology Laboratories, MIT, with a primary focus on advanced packaging device failure analysis. Dr. Bender is a member of IEEE.
E-mail addresses:
Joseph B. Bernstein: [email protected]
Alain A. Bensoussan: [email protected]
Emmanuel Bender: [email protected]
Wiley Series in Quality & Reliability Engineering
Dr. Andre V. Kleyner, Series Editor
The Wiley Series in Quality & Reliability Engineering aims to provide a solid educational foundation for both practitioners and researchers in the Q&R field and to expand the reader's knowledge base to include the latest developments in this field. The series will provide a lasting and positive contribution to the teaching and practice of engineering. The series coverage includes, but is not limited to:
Statistical methods
Physics of failure
Reliability modeling
Functional safety
Six‐sigma methods
Lead‐free electronics
Warranty analysis/management
Risk and safety analysis
Wiley Series in Quality & Reliability Engineering
Reliability Prediction for Microelectronics. Joseph B. Bernstein, Alain A. Bensoussan, Emmanuel Bender. March 2024.
Software Reliability Techniques for Real-World Applications. Roger K. Youree. December 2022.
System Reliability Assessment and Optimization: Methods and Applications. Yan-Fu Li, Enrico Zio. April 2022.
Design for Excellence in Electronics Manufacturing. Cheryl Tulkoff, Greg Caswell. April 2021.
Design for Maintainability. Louis J. Gullo (Editor), Jack Dixon (Editor). March 2021.
Reliability Culture: How Leaders can Create Organizations that Create Reliable Products. Adam P. Bahret. February 2021.
Lead-free Soldering Process Development and Reliability. Jasbir Bath (Editor). August 2020.
Automotive System Safety: Critical Considerations for Engineering and Effective Management. Joseph D. Miller. February 2020.
Prognostics and Health Management: A Practical Approach to Improving System Reliability Using Condition-Based Data. Douglas Goodman, James P. Hofmeister, Ferenc Szidarovszky. April 2019.
Improving Product Reliability and Software Quality: Strategies, Tools, Process and Implementation, 2nd Edition. Mark A. Levin, Ted T. Kalal, Jonathan Rodin. April 2019.
Practical Applications of Bayesian Reliability. Yan Liu, Athula I. Abeyratne. April 2019.
Dynamic System Reliability: Modeling and Analysis of Dynamic and Dependent Behaviors. Liudong Xing, Gregory Levitin, Chaonan Wang. March 2019.
Reliability Engineering and Services. Tongdan Jin. March 2019.
Design for Safety. Louis J. Gullo, Jack Dixon. February 2018.
Thermodynamic Degradation Science: Physics of Failure, Accelerated Testing, Fatigue and Reliability. Alec Feinberg. October 2016.
Next Generation HALT and HASS: Robust Design of Electronics and Systems. Kirk A. Gray, John J. Paschkewitz. May 2016.
Reliability and Risk Models: Setting Reliability Requirements, 2nd Edition. Michael Todinov. November 2015.
Applied Reliability Engineering and Risk Analysis: Probabilistic Models and Statistical Inference. Ilia B. Frenkel (Editor), Alex Karagrigoriou (Editor), Anatoly Lisnianski (Editor), Andre V. Kleyner (Editor). October 2013.
Design for Reliability. Dev G. Raheja (Editor), Louis J. Gullo (Editor). July 2012.
Effective FMEAs: Achieving Safe, Reliable, and Economical Products and Processes Using Failure Modes and Effects Analysis. Carl Carlson. April 2012.
Failure Analysis: A Practical Guide for Manufacturers of Electronic Components and Systems. Marius Bazu, Titu Bajenescu. April 2011.
Reliability Technology: Principles and Practice of Failure Prevention in Electronic Systems. Norman Pascoe. April 2011.
Improving Product Reliability: Strategies and Implementation. Mark A. Levin, Ted T. Kalal. March 2003.
Test Engineering: A Concise Guide to Cost-Effective Design, Development and Manufacture. Patrick O'Connor. April 2001.
Integrated Circuit Failure Analysis: A Guide to Preparation Techniques. Friedrich Beck. January 1998.
Measurement and Calibration Requirements for Quality Assurance to ISO 9000. Alan S. Morris. October 1997.
Electronic Component Reliability: Fundamentals, Modelling, Evaluation, and Assurance. Finn Jensen. November 1995.
The Wiley Series in Quality & Reliability Engineering aims to provide a solid educational foundation for researchers and practitioners in the field of quality and reliability engineering and expand the knowledge base by including the latest developments in these disciplines.
The importance of quality and reliability to a system can hardly be disputed. Product failures in the field inevitably lead to losses in the form of repair costs, warranty claims, customer dissatisfaction, product recalls, loss of sale, and in extreme cases, loss of life.
Engineering systems are becoming increasingly complex with added functions and capabilities. The rapid development of autonomous vehicles and growing attention to functional safety bring quality and reliability to the forefront of the product development cycle, with the continuous evolution of semiconductor devices playing an important role in this process. Perpetual miniaturization of integrated circuits (ICs), where feature sizes can go as low as a few atomic layers, presents new reliability and quality challenges to design and manufacturing processes. The old Test, Analyze, and Fix (TAAF) technique is no longer a sustainable part of a product development process and is being steadily replaced by design for reliability (DfR) methods. Reliability prediction, an important part of DfR, is a challenging task, especially considering the growing number of stress factors, operating environments, and potential failure modes. Besides the ability to predict the reliability of the product at various points of its lifecycle, it is also important to assess its remaining useful life (RUL) based on the product applications, mission profile, user severity, and other usage characteristics affecting the degradation process.
The book you are about to read is written by leading experts in the field of reliability prediction and physics of failure. Joseph Bernstein and Alain Bensoussan, whom I have the privilege of knowing personally, discuss the methods and approaches of reliability assessment by examining a wide variety of failure modes and failure mechanisms of electronics and explaining how to model the process of degradation in order to make realistic reliability predictions and to assess the RUL. The authors also dedicate significant parts of the book to accelerated testing and acceleration models, which are critical to reliability prediction and to the overall reliability modeling process.
Both authors have lifelong experience in studying quality and reliability of electronic products at both system and component levels. This book offers an excellent mix of theory, practice, useful applications, and common-sense engineering, making it a perfect addition to Wiley Series in Quality and Reliability Engineering. It also promotes the advanced study of physics of failure, which improves our ability to address a number of technological and engineering challenges now and in the future.
The purpose of this Wiley book series is not only to capture the latest trends and advancements in quality and reliability engineering but also to influence future developments in these disciplines. As quality and reliability science evolves, it reflects the trends and transformations of the technologies it supports. In addition to the transformations associated with changes in technology, the field of quality and reliability engineering has been going through its own evolution by developing new techniques and methodologies aimed at process improvement and at reducing the number of design- and manufacturing-related failures. Prognostics and Health Management (PHM) is among the fastest-developing fields of study in reliability engineering, and it actively utilizes the concepts of physics of failure, degradation analysis, and the effects of environmental stresses extensively discussed in this book.
Despite its obvious importance, quality and reliability education is paradoxically lacking in today's engineering curriculum. Very few engineering schools offer degree programs, or even a sufficient variety of courses, in quality or reliability methods, and the topics of physics of failure and reliability prediction do not receive sufficient coverage in today's engineering curriculum. Therefore, the majority of quality and reliability practitioners receive their professional training from colleagues, professional seminars, publications, and technical books. The lack of opportunities for formal education in this field underscores the importance of technical publications for professional development.
We are confident that this book, as well as this entire book series, will continue Wiley's tradition of excellence in technical publishing and provide a lasting and positive contribution to the teaching and practice of reliability and quality engineering.
Dr. Andre Kleyner,
Editor of the Wiley Series in Quality & Reliability Engineering
This book provides statistical analysis and physics of failure methods used in engineering and in applied "Physics of Health" (PoH). The engineering and statistical analyses deal with the concept of remaining useful life (RUL) of electronics in the new deep sub-micron (DSM) and nano-scale technologies era and of integrated electronics in operation. Many concepts developed in this book are of high interest and benefit to product and system managers, manufacturers, and users in many commercial industries: aerospace, automotive, telecommunication, civil, and energy (nuclear, wind power farms, solar energy, or even oil and gas).
The overall structure of the book was designed with a broad audience in mind: practitioners, engineers, applied scientists, technical managers, experimental physicists, test equipment laboratory managers, college professors and students, and instructors of various continuing education programs.
Engineering products must offer a worthwhile lifetime and operate safely during their service period under predefined conditions and environmental hazards. To achieve this goal, engineers must understand the ways in which the useful life-in-service of the product can be evaluated and incorporate this understanding into the design (by hardware or software). Reliability prediction involves analyzing the design, testing, manufacturing, operation, and maintenance of the product, system, or structure at all stages of development and use, from the moment of design to the cessation of maintenance and repair at the end of life. Reliability standards are based on experience and consider a random failure rate associated with an activation energy for a single failure mechanism. Today, we are using devices (smartphones, PCs) with IC feature sizes as small as a few atomic layers (below the 5 nm range). Reliability analysis concepts must therefore consider multiple stresses and multiple failure mechanisms activated simultaneously.
This physics of failure-based book is meant to teach reliability prediction for electronic systems and to offer more accurate reliability estimation by highlighting the problematic areas of conventional approaches and giving alternative suggestions to cover the shortcomings that lead to inaccurate estimations. It is the opinion of the authors that the major limitation in reliability prediction is the reliance on incorrect prediction mathematics that was improperly introduced early in the days of reliability physics methodology, starting with MIL-HDBK-217 and the like. These errors continue to propagate to this day. Hence, the motivation for this book derives from the need we see: the practice of relying on incorrect statistical analysis and false combinations of physical phenomena leads to completely wrong approaches to reliability assessment and qualification.
We describe herein a competing failure mechanism approach based on an acceleration test matrix. We present a cell-based reliability characterization and a statistical comparison of constant failure rate approximations for various physical rate-based mechanisms. Our alternative suggestion, which is justified mathematically rather than by assuming a single "base" failure rate multiplied by vaguely justified "π" factors, should lead to correct reliability predictions.
The problem at hand is not that conventional handbooks cannot give instructions for electronic system failure rate calculation or that they are not based on the physics of failure foundation; it is that they still apply fundamentally immature and incorrect assumptions. One example of such a basic false assumption is that there exists a “base” failure rate, λb, for some average condition and that small modifications, called “π” factors, can be multiplied to get a modified “true” failure rate. We will show here that this π‐factor modification has no mathematical justification and that a proper sum‐of‐failure‐rate approach would be much more consistent with today’s understanding of reliability physics.
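To make the contrast concrete, the following sketch (with invented numbers, not handbook values) compares the multiplicative π-factor construction with a sum-of-rates calculation in which each mechanism contributes its own rate at the use condition:

```python
# Minimal sketch contrasting the two failure-rate formulations.
# All numbers are illustrative placeholders, not handbook values.

base_rate = 50e-9          # assumed "base" failure rate, failures per hour
pi_factors = {             # hypothetical adjustment factors
    "temperature": 2.0,
    "application": 1.5,
    "quality": 0.8,
    "environment": 4.0,
}

# Handbook-style part stress estimate: one base rate scaled by every factor.
lambda_multiplicative = base_rate
for factor in pi_factors.values():
    lambda_multiplicative *= factor

# Sum-of-rates view: each mechanism has its own rate at the use condition,
# and the rates of independent competing mechanisms simply add.
mechanism_rates = {        # hypothetical per-mechanism rates at use conditions
    "electromigration": 12e-9,
    "TDDB": 30e-9,
    "HCI": 5e-9,
    "BTI": 20e-9,
}
lambda_sum_of_rates = sum(mechanism_rates.values())

print(f"pi-factor model : {lambda_multiplicative:.3e} failures/hour")
print(f"sum of rates    : {lambda_sum_of_rates:.3e} failures/hour")
```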
Our assumptions and suggestions need to be articulated further in order to build a sound electronic system reliability paradigm.
"Zero failure" qualification, as reported by industry today, is one of the practices that blocks progress in the domain of reliability. One of the engineers' responsibilities is to make conjectures and find new ways to test not only the products but also the physical theories behind them. When many devices are tested and zero failures occur during the qualification test, there is no way to distinguish which failure mechanism did NOT fail, since no failure was found. In that case, which is the only acceptable outcome by most industry standards, it is impossible to tell what acceleration factor can be assigned to that lack of failures, especially when we know from the beginning that multiple mechanisms compete for dominance at any operating condition.
The electronics industry, for example, takes this to the extreme in the JEDEC standards (formerly Joint Electron Device Engineering Council), where they propose a χ2 statistic and allow for improperly adding degrees of freedom and a completely unjustified acceleration factor that can never be falsified, since there are no failures to be found. We will discuss this in more detail in what follows; however, we hope to show that there is no statistical validity to adding imaginary data that never occurred and, furthermore, to attributing acceleration factors that were never measured.
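For reference, the zero-failure calculation being criticized typically takes the following form; this is a sketch of the common χ² upper-bound estimate, with the sample size, test hours, confidence level, and acceleration factor invented for illustration:

```python
# Sketch of the chi-squared upper-bound failure-rate estimate commonly used
# to report a failure rate from a zero-failure qualification test.
# Numbers are illustrative only.
from scipy.stats import chi2

devices = 800             # sample size
hours = 1000              # test duration per device
failures = 0              # "zero failure" qualification result
confidence = 0.60         # typical 60% upper confidence level
acceleration_factor = 50  # assumed (and, as argued above, unverifiable) AF

device_hours = devices * hours * acceleration_factor
# Upper bound: chi-squared quantile with 2*(failures + 1) degrees of freedom.
lam_upper = chi2.ppf(confidence, 2 * (failures + 1)) / (2 * device_hours)

print(f"claimed upper-bound failure rate: {lam_upper:.2e} failures/hour "
      f"({lam_upper * 1e9:.1f} FIT)")
```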
"Competing failure mechanisms" is one of the suggestions that could replace the "zero failure" paradigm. Once we accept it (it is now accepted even by industry and standard handbooks), we can apply an acceleration test matrix to provide accurate acceleration factors. Then not only can the lifetime of the system be predicted accurately, but flaws and weaknesses can also be revealed.
Our primary purpose in this book is to challenge the "π" model of multiplying acceleration factors when these "adjustment" factors reflect multiple mechanisms that need to be separated. Second, we reject the idea that zero-failure test "results" can give a predictable time to fail. As an alternative, we illustrate a multiple-mechanism failure rate matrix approach that considers multiple failure mechanisms consistently and simultaneously, allowing for practical and accurate failure rate predictions and calculations.
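A minimal sketch of the idea, assuming the acceleration factor of each mechanism at each test condition is known from its physical model, is shown below; the matrix entries and measured rates are invented for illustration and are not data from the book:

```python
# Sketch of a multiple-mechanism failure-rate matrix: each test condition
# accelerates every mechanism by a different (modeled) factor, and the
# measured rate at each condition is the weighted sum of the underlying
# mechanism rates at use conditions. All numbers are illustrative.
import numpy as np

# Rows: test conditions; columns: mechanisms (EM, TDDB, HCI, BTI).
acceleration = np.array([
    [200.0,  10.0,  2.0,  5.0],   # high temperature, nominal voltage
    [ 20.0, 300.0,  4.0, 40.0],   # high voltage, moderate temperature
    [  5.0,  15.0, 80.0,  8.0],   # low temperature, high frequency
    [ 50.0,  60.0, 10.0, 90.0],   # combined stress
])

# Failure rates measured at each accelerated condition (failures/hour).
measured = np.array([2.15e-6, 3.10e-6, 0.67e-6, 2.11e-6])

# Solve (least squares) for the per-mechanism rates at use conditions.
use_rates, *_ = np.linalg.lstsq(acceleration, measured, rcond=None)
total_use_rate = use_rates.sum()   # rates of competing mechanisms add

for name, rate in zip(["EM", "TDDB", "HCI", "BTI"], use_rates):
    print(f"{name:4s}: {rate:.2e} failures/hour")
print(f"total: {total_use_rate:.2e} failures/hour")
```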
The history of reliability engineering goes back to the 1950s, when electronics played a major role for the first time. At that time, there was great concern within the US military establishment for the reliability and maintainability of electronic systems. Many meetings and ad hoc groups were created to cope with the problems. Developing better parts, quantifying the reliability of parts, and collecting field data on actual part failures to determine the root cause of problems were the three major fields of research in those days.
When the complexity of electronic equipment began to increase significantly and new demands were placed on system reliability, a permanent committee, the Advisory Group on Reliability of Electronic Equipment (AGREE), was established in 1952 to identify the actions that could be taken to provide more reliable electronic equipment. The reliability era began when the first Radio Corporation of America (RCA) report on the reliability of electronic parts was released in 1956, the first time reliability was defined as a probability. In the same year, one of the first reliability handbooks, titled Reliability Factors for Ground Electronic Equipment, was published by McGraw-Hill under the sponsorship of the Rome Air Development Center (RADC); while the McGraw-Hill handbook gave information on design considerations, human engineering, interference reduction, and a section on reliability mathematics, failure prediction was mentioned only as a topic under development.
Reliability prediction and assessment can be traced to November 1956, with publication of the RCA release TR-1100, titled "Reliability Stress Analysis for Electronic Equipment," which presented models for computing rates of component failures. It was the first time the concepts of activation energy and the Arrhenius relationship were used in modeling component failure rates. In the 1960s, the first version of a military handbook for the reliability prediction of electronic equipment (MIL-HDBK-217) was published by the US Navy [1]. It covered a broad range of part types, and it has since been widely used for military and commercial electronic systems.
In July 1973, RCA proposed a new prediction model for microcircuits, based on previous work by the Boeing Aircraft Company. In the early 1970s, RADC further updated the military handbook, and revision B was published in 1974. The advent of more complex microelectronic devices pushed the application of MIL-HDBK-217B beyond reason. This decade is known for the development of new, innovative models for reliability prediction. RCA then developed a physics-of-failure model, which was initially rejected because of the lack of availability of essential data.
To keep pace with the accelerating and ever‐changing technology base, MIL‐HDBK‐217C was updated to MIL‐HDBK‐217D on January 15, 1982 and to MIL‐HDBK‐217E on October 27, 1986. In December 1991, MIL‐HDBK‐217F became a prescribed US military reliability prediction document. Two teams were responsible for providing guidelines for the last update. Both teams suggested:
that the constant failure rate (CFR) model could not be used;
that some of the individual wear‐out failure mechanisms (like electromigration and time‐dependent dielectric breakdown) could be modeled with a lognormal distribution;
that the Arrhenius‐type formulation of the failure rate in terms of temperature should not be included in the package failure model; and
that stresses such as temperature change and humidity should be considered.
Both groups noted that temperature cycling is more detrimental to component reliability than the steady-state temperature at which the device operates, so long as the temperature is below a critical value. This conclusion has been further supported by a National Institute of Standards and Technology (NIST) study and an Army Fort Monmouth study, which stated that the influence of steady-state temperature on microelectronic reliability under typical operating changes is inappropriately modeled by an Arrhenius relationship [2–4]. However, given the ability to separate failure mechanisms by distinct Arrhenius activation energies, it may be possible to return to the physics of failure (PoF) assumption that each mechanism has a unique activation energy.
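Since per-mechanism activation energies recur throughout the book, a brief worked illustration of the Arrhenius acceleration factor may be useful here; the activation energies and temperatures below are assumed values chosen only to show the arithmetic:

```python
# Arrhenius acceleration factor per mechanism:
#   AF = exp[(Ea / k) * (1/T_use - 1/T_stress)], temperatures in kelvin.
# Each mechanism gets its own activation energy; values are illustrative.
import math

BOLTZMANN_EV = 8.617e-5   # Boltzmann constant, eV/K

def arrhenius_af(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Acceleration of a stress temperature relative to the use temperature."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV) * (1.0 / t_use - 1.0 / t_stress))

# Two hypothetical mechanisms with different activation energies.
for name, ea in [("mechanism A (0.7 eV assumed)", 0.7),
                 ("mechanism B (0.3 eV assumed)", 0.3)]:
    af = arrhenius_af(ea, t_use_c=55.0, t_stress_c=125.0)
    print(f"{name}: AF(125 C vs 55 C) = {af:.1f}")
```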
There are several different approaches to the reliability prediction of electronic systems and equipment. Each approach has unique advantages and disadvantages, and several papers have been published comparing reliability assessment approaches. Broadly, however, there are two distinguishable approaches to reliability prediction: the traditional/empirical approach and the PoF approach.
Traditional, empirical models are those that have been developed from historical reliability databases either from fielded applications or from laboratory tests [5].
Handbook prediction methods are appropriate only for predicting the reliability of electronic and electrical components and systems that exhibit CFRs. All handbook prediction methods contain one or more of the following types of prediction:
Tables of operating and/or non‐operating CFR values arranged by part type,
Multiplicative factors for different environmental parameters to calculate the operating or non‐operating CFR, and
Multiplicative factors that are applied to a base operating CFR to obtain the non-operating CFR [6].
The MIL-HDBK-217 reliability prediction methodology, which was developed under the activity of the RADC (now Rome Laboratory) and whose last version was released in February 1995, was intended to "establish and maintain consistent and uniform methods for estimating the inherent reliability (i.e. the reliability of a mature design) of military electronic equipment and systems. The methodology provided a common basis for reliability predictions during acquisition programs for military electronic systems and equipment. It also established a common basis for comparing and evaluating reliability predictions or related competitive designs. The handbook was intended to be used as a tool to increase the reliability of the equipment being designed."
In 2001, the office of the US Secretary of Defense stated that “…. the Defense Standards Improvement Council (DSIC) decided several years ago to let MIL‐HDBK‐217 ‘die the death.’ This is still the current OSD position, i.e. we will not support any updates/revisions to MIL‐HDBK‐217” [6].
Two basic methods for performing data-based predictions are the parts count method and the parts stress analysis method. The parts count reliability prediction method is used for the early design phases, when not enough data is available but the numbers of component parts are known. The information for the parts count method includes generic part types (complexity for microelectronics), part quantity, part quality levels (when known or assumable), and environmental factors. Since equipment consists of parts operating in more than one environment, the "parts count" equation is applied to each portion of the equipment in a distinct environment. The overall equipment failure rate is obtained by summing the failure rate for each component over its expected operating condition.
A part stress model is based on the effect of mechanical, electrical, and environmental stresses and duty cycles, such as temperature, humidity, and vibration, on the part failure rate. The part failure rate varies with the applied stress, and the stress–strength interaction determines it. This method is used when most of the design is complete and detailed part stress information is available; it is applicable during later design phases as well. Since more information is available at this stage, the result is more accurate than that of the parts count method.
The environmental factor captures the influence of environmental stress on the device. Different prediction methods have their own lists of environmental factors suited to their device conditions. For instance, the environmental factor of MIL-HDBK-217F covers almost all the environmental stresses relevant to military electronic devices except ionizing radiation. The learning factor reflects the maturity of the device; it suggests that the first production runs are less reliable than the next generations [7, 8]. The parts stress model is applied at the component level to obtain a part failure rate (λp) estimate from stress analysis. A typical part failure rate can be estimated as:

λp = λb × πT × πA × πV × πQ × πE

where λb is the base failure rate obtained from statistical analysis of empirical data, and the adjustment factors include πT (temperature factor), πA (application factor), πV (voltage stress factor), πQ (quality factor), and πE (environmental factor). The equipment failure rate (λEQUIP) can be further predicted through the parts count method:

λEQUIP = Σ Ni (λg πQ)i, summed over i = 1 to n,
where λg is the generic failure rate for the ith generic part, πQ is the quality factor of the ith generic part, Ni is the quantity of the ith generic part, and n is the number of different generic part categories in the equipment. To accommodate the advancement of technology, a reliability growth model of the form

λ(t1) = λ(t2) exp[−Gr (t1 − t2)]

was introduced in the handbook approach to reflect state-of-the-art technology, where Gr is the growth rate, t1 is the year of manufacture for which a failure rate is estimated, and t2 is the year of manufacture of the parts on which the data were collected. It takes time to collect field data and obtain the growth rate Gr, especially when the growth is fast. Furthermore, the validity of applying a reliability growth model without taking technology generation into consideration has not been confirmed. It is said that predictions based on the handbook approach usually lead to conservative failure rate estimates. However, the data are very vague as to the actual validity of the calculated failure rate, and no correlation with reality has ever been shown.
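To make the parts count arithmetic above concrete, here is a minimal sketch; the part categories, quantities, generic rates, and quality factors are invented placeholders, not values taken from any handbook:

```python
# Parts count sketch: equipment failure rate as the sum over generic part
# categories of quantity x (generic rate x quality factor).
# All values are illustrative placeholders.
parts = [
    # (generic part category, N_i, lambda_g [failures per 1e6 h], pi_Q)
    ("microcircuit, digital", 12, 0.050, 1.0),
    ("resistor, film",        85, 0.002, 3.0),
    ("capacitor, ceramic",    60, 0.004, 3.0),
    ("connector",              4, 0.120, 2.0),
]

lambda_equip = sum(n * lam_g * pi_q for _, n, lam_g, pi_q in parts)
print(f"lambda_EQUIP = {lambda_equip:.3f} failures per 1e6 hours")
```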
Telcordia SR-332, BT-HRD-5, NTT, CNET, RDF 93 and 2000, SAE, Siemens SN29500, Prediction of Reliability, Integrity and Survivability of Microsystems (PRISM) [9], and FIDES are all instances of traditional prediction models, each drawing on its own source of data: ground military, civil equipment, automotive, Siemens products, telecom, commercial-military, and aeronautical and military environments, respectively. Most of these models have gained popularity over time because of their ease of use and uniqueness [10].
The stated purpose of Telcordia SR‐332 is “to document the recommended methods for predicting device and unit hardware reliability and for predicting serial system hardware reliability.” The methodology is based on empirical statistical modeling of commercial telecommunication systems whose physical design, manufacture, installation, and reliability assurance practices meet the appropriate Telcordia (or equivalent) generic and system‐specific requirements. In general, Telcordia SR‐332 adapts the equations in MIL‐HDBK‐217 to represent what telecommunications equipment experience in the field. Results are provided as a CFR, and the handbook provides the upper 90% confidence‐level point estimate for the CFR [6].
The basis of the Telcordia math models for devices is the Black Box Technique. This parts count method defines a black box steady-state failure rate for different device types and a parts count steady-state failure rate for units; the system-level failure rate is simply the sum of the failure rates of all the units contained in a system.
The main concepts in MIL-HDBK-217 and Telcordia SR-332 are similar, but Telcordia SR-332 can also incorporate burn-in, field, and laboratory test data, using a Bayesian analysis [7]. All these methods, in the end, continue to rely on the "π" model for adjusting a base failure rate, making mathematical justification quite elusive.
PRISM was developed in the 1990s by the Reliability Analysis Center (RAC) under contract with the US Air Force. The latest version of the method, which is available as software, was released in July 2001. RAC Rates is the name of the PRISM mathematical model for component failure rates; the component models are based on field data drawn from several sources. PRISM applies Bayesian methods to the empirical data to get the system-level predictions. This methodology considers the failures of components as well as those related to the system. However, the component models are the heart of the analysis. It provides different models for capacitors, diodes, integrated circuits, resistors, thyristors, transistors, and software. The total component failure rate is composed of:
operating conditions,
non‐operating conditions,
temperature cycling,
solder joint,
electrical overstress (EOS).
Each mechanism is treated independently.
For components without RAC Rates models, PRISM provides the Nonelectronic Parts Reliability Data and Electronic Parts Reliability Data books. A multitude of part types can be found in these data books, with failure rates for various environments.
Unlike the other handbooks based on CFR models, the RAC Rates models do not have a separate factor for part quality level. Quality level is implicitly accounted for by a method known as process grading. Process grades address factors such as design, manufacturing, part procurement, and system management, which are intended to capture the extent to which measures have been taken to minimize the occurrence of system failures [6–8, 11].
RDF 2000, released in July 2000, is a French telecom standard that was developed by the Union Technique de l'Electricite (UTE). It is the latest version of the CNET reliability prediction methodology developed by the Centre National d'Etudes des Telecommunications (CNET). "RDF 2000 provides a unique approach to failure rate predictions in that it does not provide a parts count prediction. Rather component failure is defined in terms of an empirical expression containing a base failure rate multiplied by factors influenced by mission profiles." These mission profiles contain information about operational cycling and thermal variations during various working phases [7].
The FIDES prediction method attempts to predict the CFR experienced in the useful life portion of the classic bathtub curve. The approach models intrinsic failures, such as those related to item technology and distribution quality. It also considers extrinsic failures resulting from equipment specification, design, production, and integration, as well as selection of the procurement route. The methodology considers failures resulting from development and manufacturing, and the overstresses linked to the application, such as electrical, mechanical, and thermal stresses.
Some proponents claim that FIDES predictions are "close" to the observed failure rate; its predictions fall somewhere between PRISM predictions, which are more optimistic, and those of MIL-HDBK-217, which are more conservative [11, 12]. RAC PRISM and FIDES are two methodologies that proceed in two stages; the key element is applying Bayesian statistical techniques to the initial reliability prediction.
Due to the wide range of available traditional reliability predictions, several articles have been published to study and compare the prediction models and methodologies. As an instance of these efforts, IEEE Std 1413‐1998 was developed to identify the key required elements for an understandable and credible reliability prediction and to provide its users with sufficient information to evaluate prediction methodologies and to effectively use their results.
A comparison of some of the more popular electronic reliability prediction tools is shown in Table 1.1 [10, 13–16].
Table 1.1 Electronic reliability prediction tool comparator.
Source: Reproduced with permission from [16]/Quanterion Solutions Incorporated.
Attribute | MIL-HDBK-217 | IEC-TR-62380 | TELCORDIA | PRISM | FIDES
--- | --- | --- | --- | --- | ---
Version/date/source | F, Notice 2, February 1995 | Edition 1, August 2004 | SR-332, Issue 1, May 2001 | 1.5, July 2003 | Issue A, October 2021
Distribution method | Handbook | Handbook | Handbook | Software | Handbook
Updates anticipated | No | Yes | Yes | Yes | Yes
Metric used | Failures per 10^6 operating hours | Failures per 10^6 operating hours | Failures per 10^9 operating hours | Failures per 10^6 calendar hours | Failures per 10^9 calendar hours
Software available | Yes | Yes | Yes | Yes | Yes
Environmental predefined choices | 14 | 12 | Limited to 5 | 37 | 7
Additional environmental modifiers | None | None | None | Operating temperature, dormant temperature, amplitude and frequency of thermal cycles, relative humidity, vibrational level | Operating temperature, amplitude and frequency of thermal cycles, relative humidity, vibrational level, ambient pollution level, overstress exposure
Part model type | Multiplicative | Multiplicative | Multiplicative | Additive | Additive
Operating profile | No | Yes | No | Yes | Yes
Thermal cycling | No | Yes | No | Yes | Yes
Thermal rise in part | Yes | Yes | No | Yes | Yes
Solder joints failures | No | Yes | No | Yes | Yes
Induced failures | No | No | No | Yes | Yes
Failure rate database for other parts | Limited | Limited | No | Yes | Yes
Infant mortality | No | No | Yes | Yes | No
Dormant failure rate | No | No | No | Yes | Yes
Test data integration | No | No | Yes | Yes | No
Bayesian analysis | No | No | No | Yes | No
The most controversial aspects of traditional approaches can be classified as the concept of CFR, the use of the Arrhenius relation, the difficulty of maintaining support data, and problems in collecting good-quality field data; the diversity of failure rates across source data is another limitation of traditional approaches [4].
Despite the disadvantages and limitations of traditional/empirical handbooks, they are still used by engineers; strong factors such as good performance centered around field reliability, ease of use, and the provision of approximate field failure rates keep them popular. A Crane survey shows that almost 80% of respondents use the MIL handbook, while PRISM and Telcordia take second and third places in the chart [17].
The first publications on reliability prediction for electronic equipment were all based on curve fitting a mathematical model to historical field failure data to determine the CFR of parts. None of them included a root-cause analysis of the traditional/empirical approach. The PoF approach received a big boost when the US Army Materiel Command authorized a program to institute a transition away from exclusive reliance on MIL-HDBK-217. The command's Army Materiel System Analysis Activity (AMSAA), Communications-Electronics Command (CECOM), the Laboratory Command (LABCOM), and the University of Maryland's Computer-Aided Life-Cycle Engineering (CALCE) group collaborated on developing a physics-of-failure handbook for reliability assurance. The methodology behind the handbook was to assess system reliability based on environmental and operating stresses, the materials used, and the packaging selected. Two simulation tools, called Computer-Aided Design of Microelectronic Packages (CADMP-2) and CALCE, were developed to help with the assessment; CADMP-2 assesses the reliability of electronics at the package level, while CALCE assesses the reliability of electronics at the printed wiring board level. Together, these two models provide a framework to support a physics-of-failure approach to reliability in electronic systems design.
PoF uses the knowledge of root-cause failure to design and perform reliability assessment, testing, screening, and stress margining to prevent product failures. The main tasks of the PoF approach are to identify potential failure mechanisms, failure sites, and failure modes, to select the appropriate failure models and their input parameters, to determine the variability of each design parameter, and to compute the effective reliability function. In summary, the objective of any physics-of-failure analysis is to determine or predict when a specific end-of-life failure mechanism will occur for an individual component in a specific application. A physics-of-failure prediction looks at each individual failure mechanism, such as electromigration, solder joint cracking, die bond adhesion, etc., to estimate the probability of component wear-out within the useful life of the product. This analysis requires detailed knowledge of all material characteristics, geometries, and environmental conditions. The subject is constantly challenging, considering that new failure mechanisms are discovered and even the old ones are not completely explained. One of the most important advantages of the physics-of-failure approach is the accurate prediction of wear-out mechanisms and their cumulative effect on the time to fail. Moreover, since accelerated testing is one of the main means of finding the model parameters, the approach can also provide the necessary test criteria for the product. To sum up, modeling potential failure mechanisms, predicting the end of life, and using generic failure models effective for new materials and structures are the achievements of this approach [4].
The disadvantages of PoF approaches are related to their cost, the complexity of combining knowledge of materials, processes, and failure mechanisms, the difficulty of estimating field reliability, and their limited applicability to devices already in production, since the approach cannot readily assess the whole system [4, 18, 19]. The method needs access to the product materials, processes, and data. Partly inspired by and extended from Cushing et al. [15], Table 1.2 compares MIL-HDBK-217, the FIDES approach, and PoF.
Nowadays, circuit designers have reliability simulators as an integral part of their design tools, such as Cadence UltraSim and Mentor Graphics Eldo. These simulators model the most significant physical failure mechanisms and help designers meet lifetime performance requirements. However, there are disadvantages that hinder designers from adopting these tools. First, the tools are not fully integrated into the design software, because full integration requires technical support from both the tool developers and the foundry. Second, they cannot handle large-scale designs efficiently; the increasing complexity makes it impossible to exercise full-scale simulation, considering the resources that simulation would consume. Chip-level reliability prediction focuses only on the chip's end of life, when the known wear-out mechanisms are dominant; these prediction tools do not predict the random, post-burn-in failure rate that would be seen in the field [5, 21].
Considering all the available reliability prediction tools and methods, they are all based on the original MIL-handbook approach of assuming a base failure rate, λb, and multiplying it by what should be small correction factors, π. We will show in the following chapters that this fundamentally contradicts the presumption of a failure rate, λ, since failure rates add and do not multiply. This brings us to the established and known reliability assumption that linear rate mechanisms add and must be separated into a properly weighted sum-of-rates model.
Table 1.2 Comparison between MIL-HDBK-217, FIDES, and physics of failure.
Issue: Model development

MIL-HDBK-217: The effectiveness of models in providing accurate design or manufacturing guidance is questionable since they have been developed based on assumed constant failure-rate data rather than root cause or time-to-failure data. As highlighted by Morris [20], the data available is often fragmented, requiring interpolation or extrapolation for the development of new models. Consequently, it is not appropriate to associate statistical confidence intervals with the overall model results.

FIDES: FIDES and MIL-HDBK are both methodologies used for reliability prediction and analysis, but they have some key differences. FIDES offers advantages over MIL-HDBK in terms of utilizing real-world data, improving root cause analysis, and acknowledging the challenges of data completeness and quality. While statistical confidence intervals may not be directly associated with model results in FIDES, the emphasis on using real-world data helps mitigate some of the limitations observed in MIL-HDBK's model development approach. In addition, the FIDES model is presented as being based on field failure data from a few aerospace and military companies, which represent only a small part of the worldwide original equipment manufacturers. The model assumes that each component has a constant failure rate, which is not relevant in terms of part-type failure.

Physics of failure: Physics of Failure (PoF) models based on science and engineering first principles offer an alternative physical approach to reliability prediction and analysis compared to MIL-HDBK and FIDES models. PoF models rely on a deep understanding of the physical mechanisms and failure processes involved in a system, allowing for more accurate predictions and a clearer understanding of the underlying causes of failures. Unlike MIL-HDBK and FIDES, PoF models consider the fundamental physics and engineering principles governing the behavior of materials, components, and systems. This approach enables the modeling of complex interactions between different stress factors, such as temperature, mechanical stresses, humidity, and vibration, leading to a more comprehensive understanding of failure mechanisms. PoF models can support both deterministic and probabilistic applications. Deterministic PoF models focus on understanding and predicting specific failure modes, providing valuable insights into the system's weak points and guiding targeted mitigation strategies. On the other hand, probabilistic PoF models incorporate statistical variations and uncertainties in the inputs to provide a probabilistic assessment of reliability, considering the inherent variability in material properties, manufacturing processes, and operational conditions. By leveraging scientific principles and detailed knowledge of the system's behavior, PoF models can overcome some of the limitations of MIL-HDBK and FIDES approaches. They provide a solid foundation for reliability analysis, enabling engineers to make informed decisions based on an understanding of physical failure mechanisms rather than relying solely on empirical data or statistical models.

Issue: Device design modeling