Failure Analysis - Marius Bâzu - E-Book

Description

Failure analysis is the preferred method to investigate product or process reliability and to ensure optimum performance of electrical components and systems. The physics-of-failure approach is the only internationally accepted solution for continuously improving the reliability of materials, devices and processes. The models have been developed from the physical and chemical phenomena that are responsible for degradation or failure of electronic components and materials and now replace popular distribution models for failure mechanisms such as Weibull or lognormal.

Reliability engineers need practical orientation around the complex procedures involved in failure analysis. This guide acts as a tool for all advanced techniques, their benefits and vital aspects of their use in a reliability programme. Using twelve complex case studies, the authors explain why failure analysis should be used with electronic components, when implementation is appropriate and methods for its successful use.

Inside you will find detailed coverage on:

  • a synergistic approach to failure modes and mechanisms, along with reliability physics and the failure analysis of materials, emphasizing the vital importance of cooperation among all members of the product development team involved
  • the reasons why failure analysis is an important tool for improving yield and reliability by corrective actions
  • the design stage, highlighting the ‘concurrent engineering' approach and DfR (Design for Reliability) 
  • failure analysis during fabrication, covering reliability monitoring, process monitors and package reliability 
  • reliability testing after fabrication, including reliability assessment at this stage and corrective actions
  • a large variety of methods, such as electrical methods, thermal methods, optical methods, electron microscopy, mechanical methods, X-Ray methods, spectroscopic, acoustical, and laser methods
  • new challenges in reliability testing, such as its use in microsystems and nanostructures

This practical yet comprehensive reference is useful for manufacturers and engineers involved in the design, fabrication and testing of electronic components, devices, ICs and electronic systems, as well as for users of components in complex systems wanting to discover the roots of the reliability flaws for their products.

Page count: 845

Year of publication: 2011




Table of Contents

Title Page

Copyright

Dedication

Series Editor's Foreword

Foreword by Dr Craig Hillman

Series Editor's Preface

Preface

About the Authors

Chapter 1: Introduction

1.1 The Three Goals of the Book

1.2 Historical Perspective

1.3 Terminology

1.4 State of the Art and Future Trends

1.5 General Plan of the Book

Chapter 2: Failure Analysis—Why?

2.1 Eight Possible Applications

2.2 Forensic Engineering

2.3 Reliability Modelling

2.4 Reverse Engineering

2.5 Controlling Critical Input Variables

2.6 Design for Reliability

2.7 Process Improvement

2.8 Saving Money through Early Control

2.9 A Synergetic Approach

Chapter 3: Failure Analysis—When?

3.1 Failure Analysis during the Development Cycle

3.2 Failure Analysis during Fabrication Preparation

3.3 FA during Fabrication

3.4 FA after Fabrication

3.5 FA during Operation

Chapter 4: Failure Analysis—How?

4.1 Procedures for Failure Analysis

4.2 Techniques for Decapsulating the Device and for Sample Preparation

4.3 Techniques for Failure Analysis

Chapter 5: Failure Analysis—What?

5.1 Failure Modes and Mechanisms at Various Process Steps

5.2 Failure Modes and Mechanisms of Passive Electronic Parts

5.3 Failure Modes and Mechanisms of Silicon Bipolar Technology

5.4 Failure Modes and Mechanisms of MOS Technology

5.5 Failure Modes and Mechanisms of Optoelectronic and Photonic Technologies

5.6 Failure Modes and Mechanisms of Non-Silicon Technologies

5.7 Failure Modes and Mechanisms of Hybrid Technology

5.8 Failure Modes and Mechanisms of Microsystem Technologies

Chapter 6: Case Studies

6.1 Case Study No. 1: Capacitors

6.2 Case Study No. 2: Bipolar Power Devices

6.3 Case Study No. 3: CMOS Devices

6.4 Case Study No. 4: MOS Field-Effect Transistors

6.5 Case Study No. 5: Thin-Film Transistors

6.6 Case Study No. 6: Heterojunction Field-Effect Transistors

6.7 Case Study No. 7: MEMS Resonators

6.8 Case Study No. 8: MEMS Micro-Cantilevers

6.9 Case Study No. 9: MEMS Switches

6.10 Case Study No. 10: Magnetic MEMS Switches

6.11 Case Study No. 11: Chip-Scale Packages

6.12 Case Study No. 12: Solder Joints

6.13 Conclusions

Chapter 7: Conclusions

Acronyms

Glossary

Terms Related to Electronic Components and Systems

Terms Related to Failure Analysis

Index

This edition first published 2011

© 2011 John Wiley & Sons, Ltd.

Registered office

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data

Bâzu, M. I. (Marius I.), 1948-

Failure Analysis : A Practical Guide for Manufacturers of Electronic Components and Systems / Marius Bâzu, Titu-Marius Bjenescu.

p. cm. – (Quality and Reliability Engineering Series ; 4)

Includes bibliographical references and index.

ISBN 978-0-470-74824-4 (hardback)

1. Electronic apparatus and appliances–Reliability. 2. Electronic systems–Testing. 3. System failures (Engineering)–Prevention. I. Bjenescu, Titu, 1938- II. Title.

TK7870.23.B395 2011

621.381–dc22

2010046383

A catalogue record for this book is available from the British Library.

Print ISBN: 978-0-470-74824-4 (HB)

E-PDF ISBN: 978-1-119-99010-9

O-book ISBN: 978-1-119-99009-3

E-Pub ISBN: 978-1-119-99000-0

With all my love, to my wife Cristina.

Marius I. Bâzu

To my charming dear wife Andrea—an unfailing source of inspiration—thankful and grateful for her love, patience, encouragement, and faithfulness to me, for all my projects, during our whole common life of a half-century.

To my descendants, with much love.

Titu-Marius I. Bjenescu

Series Editor's Foreword

During my eventful career in the aerospace and automotive industries (as well as in consulting and teaching) I have had plenty of opportunities to appreciate the contribution failure analysis makes to successful product development. While performing various engineering design functions associated with product reliability and quality, I have often found myself or a team member rushing to the failure analysis lab calling for help. And help we would always receive.

Reliability Science combines two interrelated disciplines: 1. Reliability Mathematics including probability, statistics and data analysis and 2. Physics of Failure. While the literature on the former is plentiful with a significant degree of depth, literature on the latter is somewhat scarce. The book you are about to read will successfully reduce that knowledge gap. It is not only a superb tutorial on the physics of failure, but also a comprehensive guide on failure mechanisms and the various ways to analyze them.

Engineering experience accumulated over the years clearly indicates that in reliability science Physics trumps Mathematics most of the time. Too often, jumping directly to statistical data analysis leads to flawed results and erroneous conclusions. Indeed, certain questions about the nature of failures need to be answered before life data analysis and probability plotting are carried out. Failure analysis is the key tool in providing answers to those questions. Physics of Failure is also vital to the emerging field of design for reliability (DfR). DfR utilizes past knowledge of product failure mechanisms to avoid those failures in future designs. There is no question that failure analysis requires sizable expenditures in both equipment and people; nevertheless, done properly, the return on this investment will be quick and substantial in terms of improved design and future cost reduction.

In this work, Marius Bâzu and Titu Bjenescu effectively demonstrate how critical it is to many engineering disciplines to understand failure mechanisms, and how important it is for the engineering community to continue to develop their knowledge of failure analysis. In addition, the authors clearly raise the bar on connecting theory and practice in engineering applications by putting together an exceptional collection of case studies at the end of the book.

Bâzu and Bjenescu successfully capture the essence and inner harmony of reliability analysis in this work. It undoubtedly will present a wealth of theoretical and practical knowledge to a variety of readers, from college students to seasoned engineering professionals in the fields of quality, reliability, electronics design, and component engineering.

Dr Andre Kleyner,

Global Reliability Engineering Leader at Delphi Corporation,

Adjunct Professor at Purdue University

Foreword by Dr Craig Hillman

Common sense.

In some respects, this is a simple, but powerful phrase. It implies grounding, a connection to the basics of knowledge and the physical world. But, it can also suggest a lack of insight or intelligence when thrown in an accusatory way. And, of course, common sense isn't always so common.

This phrase came to me as I was reading Marius' and Titu's manuscript, as it brought me back to my first days performing failure analysis on and predicting reliability of electronic components, boards, and systems. My background had been in Metallurgy and Material Science and I had experienced the revolution propagating through those disciplines as scientific approaches to processing and material behavior had resulted in dramatic improvements in cost and performance. These approaches had been germinating for decades, but demonstrated successes, economic forces, and the need for higher levels of quality control had forced their implementation throughout the materials supply chain.

So, imagine my surprise and amusement when I was informed that ‘Physics of Failure’ was the future of electronics reliability prediction and failure analysis. The future? Shouldn't this have been the past and present as well? How else could you ensure reliability? Statistics were useful in extrapolating laboratory results to larger volumes (either in quantity or size), but the fundamental understanding of materials reliability was always based on physics-based mechanisms. Isn't Physics of Failure, also known as Reliability Physics, common sense?

However, as Marius and Titu elaborate so well in their tome, the approaches that have evolved so well in other single-disciplinary fields did not always seamlessly integrate into the world of electronics. Even back in the 1940's and 1950's, electronics were complex (at least compared to bridges and boilers), involving multiple materials (copper, steel, gold, alumina, glass, etc.), multiple suppliers, multiple assembly processes, and multiple disciplines (electrical, magnetic, physics, material science, mechanics, etc.). The concept of assessing each mechanism and extrapolating its evolution over a range of conditions must have seemed mind-boggling at the time. As described succinctly, “a physics-of-failure-like model developed for small-scale CMOS was virtually unusable by system manufacturers, requiring input data (details about design layout, process variables, defect densities, etc.) that are known only by the component manufacturer”. Even users of the academic PoF tools out of the University of Maryland have admitted to the weeks or months necessary to gather the information required, which results in analyses that are more interesting exercises than a true part of the product design process.

Despite these frustrations, the path forward to physics-based failure analysis and reliability prediction is clear. Studies by Professor Joseph Bernstein of Bar-Ilan University and Intel have clearly demonstrated that wearout behavior will dominate the newer technologies being incorporated into electronics today. In addition, standard statistical approaches will fumble over trying to capture the increasing complexities in silicon-based devices (what is the FIT rate of a system-in-package consisting of a DRAM, SRAM, microcontroller, and RF device? The same as discrete devices? Is there a relevant lambda here?).

By laying out a methodical process of Why/When/How/What, Marius and Titu are not only laying out the menu for incorporating these best practices into the design and manufacturing processes that are elemental to electronic components, products, and systems, but they are also laying out the arguments for why the electronics community should be indulging in this feast of knowledge and understanding.

With knowledge, comes insight. With insight, comes success. And isn't that just Common sense?

Dr Craig Hillman, PhD

CEO

DfR Solutions

5110 Roanoke Place

College Park, MD 20740

301-474-0607 x308 (w)

301-452-5989 (c)

Series Editor's Preface

The book you are about to read re-launches the Wiley Series in Quality and Reliability Engineering. The importance of quality and reliability to a system can hardly be disputed. Product failures in the field inevitably lead to losses in the form of repair cost, warranty claims, customer dissatisfaction, product recalls, loss of sale, and in extreme cases, loss of life.

As quality and reliability science evolves, it reflects the trends and transformations of the technologies it supports. For example, continuous development of semiconductor technologies such as system-on-chip devices brings about unique design, test and manufacturing challenges. Silicon-based sensors and micromachines require the development of new accelerated tests along with advanced techniques of failure analysis. A device utilizing a new technology, whether it be a solar power panel, a stealth aircraft or a state-of-the-art medical device, needs to function properly and without failure throughout its mission life. New technologies bring about: new failure mechanisms (chemical, electrical, physical, mechanical, structural, etc.); new failure sites; and new failure modes. Therefore, continuous advancement of the physics of failure combined with a multi-disciplinary approach is essential to our ability to address those challenges in the future.

The introduction and implementation of Restriction of Hazardous Substances Directive (RoHS) in Europe has seriously impacted the electronics industry as a whole. This directive restricts the use of several hazardous materials in electronic equipment; most notably, it forces manufacturers to remove lead from the soldering process. This transformation has seriously affected manufacturing processes, validation procedures, failure mechanisms and many engineering practices associated with lead-free electronics. As the transition continues, reliability is expected to remain a major concern in this process.

In addition to the transformations associated with changes in technology, the field of quality and reliability engineering has been going through its own evolution developing new techniques and methodologies aimed at process improvement and reduction of the number of design- and manufacturing-related failures.

The concepts of Design for Reliability (DfR) were introduced in the 1990's but their development is expected to continue for years to come. DfR methods shift the focus from reliability demonstration and ‘Test-Analyze-Fix’ philosophy to designing reliability into products and processes using the best available science-based methods. These concepts intertwine with probabilistic design and design for six sigma (DFSS) methods, focusing on reducing variability at the design and manufacturing level. As such, the industry is expected to increase the use of simulation techniques, enhance the applications of reliability modeling and integrate reliability engineering earlier and earlier in the design process.

Continuous globalization and outsourcing affect most industries and complicate the work of quality and reliability professionals. Having various engineering functions distributed around the globe adds a layer of complexity to design co-ordination and logistics. Also, moving design and production into regions with little depth of knowledge regarding design and manufacturing processes, with less robust quality systems in place, and where low cost is often the primary driver of product development, affects a company's ability to produce reliable and defect-free parts.

The past decade has shown a significant increase in the role of warranty analysis in improving the quality and reliability of design. Aimed at preventing existing problems from recurring in new products, the product development process is becoming more and more attuned to engineering analysis of returned parts. Effective warranty engineering and management can greatly improve design and reduce costs, positively affecting the bottom line and a company's reputation.

Several other emerging and continuing trends in quality and reliability engineering are also worth mentioning here. Six Sigma methods including Lean and DFSS are expected to continue their successful quest to improve engineering practices and facilitate innovation and product improvement. For an increasing number of applications, risk assessment will replace reliability analysis, addressing not only the probability of failure, but also the quantitative consequences of that failure. Life cycle engineering concepts are expected to find wider applications to reduce life cycle risks and minimize the combined cost of design, manufacturing, quality, warranty and service. Reliability Centered Maintenance will remain a core tool to address equipment failures and create the most cost-effective maintenance strategy. Advances in Prognostics and Health Management will bring about the development of new models and algorithms that can predict the future reliability of a product by assessing the extent of degradation from its expected operating conditions. Other advancing areas include human reliability analysis and software reliability.

This discussion of the challenges facing quality and reliability engineers is neither complete nor exhaustive; there are myriad methods and practices the professionals must consider every day to effectively perform their jobs. The key to meeting those challenges is continued development of state-of-the-art techniques and continuous education.

Despite its obvious importance, quality and reliability education is paradoxically lacking in today's engineering curriculum. Few engineering schools offer degree programs or even a sufficient variety of courses in quality or reliability methods. Therefore, a majority of the quality and reliability practitioners receive their professional training from colleagues, professional seminars, publications and technical books. The lack of formal education opportunities in this field greatly emphasizes the importance of technical publications for professional development.

The main objective of Wiley Series in Quality & Reliability Engineering is to provide a solid educational foundation for both practitioners and researchers in quality and reliability and to expand the readers' knowledge base to include the latest developments in this field. This series continues Wiley's tradition of excellence in technical publishing and provides a lasting and positive contribution to the teaching and practice of engineering.

Dr Andre Kleyner,

Editor of the Wiley Series in Quality & Reliability Engineering

Preface

The goal of a technical book is to provide well-structured information on a specific field of activity, which has to be clearly delimited by the title and then completely covered by the content. To accomplish such a book, two conditions must be fulfilled simultaneously: the book must be written at the right moment and it must be written by the right people.

If the potential readers represent a significant segment of the technical experts, and if they have already been prepared by the development of the specific field, one may say that the book is written at the right moment. Is this the case with our book? Yes, we think so. In the last 40 years, failure analysis (FA) has received growing attention, and since 1990 it has become the central point of any reliability analysis. Due to its important role in identifying design and process failure risks and in developing corrective actions, one may say that FA has been a motor of development for industries with harsh reliability requirements, such as aeronautics, automotive, nuclear, military and so on.

We have chosen to focus the book mainly on the field of electronic components, with electronic systems seen as direct users of the components. Today, in this domain, it is almost impossible to conceive a serious investigation into the reliability of a product or process without FA. The idea that failure acceleration by various stress factors (the key to accelerated testing) could be modelled only for the population affected by the same failure mechanism (FM) greatly promoted FA as the only way to segregate such a population damaged by specific FMs.

Moreover, the simple statistical approach to reliability, which was the dominant one for years, is no longer sufficient. The physics-of-failure approach is the only one accepted internationally, being the solution for continuously improving the reliability of materials, devices and processes. Even for the modelling of FMs, the well-known models based on distributions like Weibull or lognormal have today been replaced by analytical models built on an accurate description of the physical or chemical phenomena responsible for the degradation or failure of electronic components and materials.

In FA, a large range of methods are now used, from (classical) visual inspection to expensive and modern methods such as transmission electron microscopy, secondary ion mass spectroscopy and so on. Nice photos and clever diagrams, true examples of scientific beauty, represent a sort of siren's song for the ears of the specialist in FA. It is easy to fill an FA report with the attractive results of sophisticated methods, without identifying in the end the root causes of the failure. The FA specialist has to be as strong as Ulysses, continuing to drive the ‘boat' of FA guided only by the ‘compass' of a logical and coherent analysis! On the other hand, the customers of FA, manufacturers of electronic components and systems, have to be aware that a good FA report is not necessarily a report with many nice pictures and 3D simulations, but a report with a logical demonstration, based on results obtained with the necessary FA techniques, and, very importantly, with solid conclusions about the root causes of the failure of the studied item. It is the most difficult route, but the only one with fruitful results.

Consequently, for all the above reasons, we think that a practical guide to FA of electronic components and systems is a necessary tool. Today, a book of this kind, capable of orienting reliability engineers in the complicated procedures of FA, is sadly lacking. It is our hope to fill this void.

The second question is: are we the right people to write this book? Yes, we think so. In answering this question, we have to put aside the natural modesty of any scientist and show here the facts that support this statement. We are two people with different but perfectly complementary backgrounds, as anyone can see from the attached biographies. Marius Bâzu has behind him almost 40 years of activity in the academic world of Romania, as leader of a reliability group in an applied research institute focused on microtechnologies and as author of books and papers on the reliability of components. Titu Bjenescu has had an outstanding career in industry, being responsible for the reliability domain in many Swiss companies and authoring many technical papers and books, written in English, French, German and Romanian. He is also a university professor and a visiting lecturer or speaker at various European universities and other venues. Between them, the authors cover the three key domains of industry, research and teaching. Moreover, we have already worked together, publishing two books on the reliability of electronic components.

We hope we have convinced you that this is indeed a book written at the right moment and by the right people.

While writing this book, we were constantly asking ourselves who might be its potential readers. As you will see, the book is aimed to be useful to many people.

If you are working as a component manufacturer, the largest part of the book is directly focused on your activity. Even if you are not a reliability engineer, you will find significant information about the possible failure risks in your current work, with suggestions about how to avoid them or how to correct wrong operations.

You will also be interested in this book if you are a component user. This includes reliability engineers and researchers, electrical and electronic engineers, those involved in all facets of electronics and telecommunications product design and manufacturing, and those responsible for implementing quality and process improvement programmes.

If you are from the academic world, including teachers and students of electrical and electronic faculties, you will find in this book all the necessary information on reliability issues related to electronic components. The book does not contain complicated scientific developments and is written at a level that allows easy understanding, because one of its goals is to promote reliability and failure analysis as a fascinating subject within the technical environment. Moreover, the book's companion website contains Q&A sessions for each chapter.

If you are a manufacturer of electronic components, but not directly involved in the reliability field, you will be interested in this book. By gathering together the main issues related to this subject, a fruitful idea is promoted: all parties contributing to the final product (designers, test engineers, process engineers, marketing staff and so on) participate together in its quality and reliability (the so-called ‘concurrent engineering’ approach).

If you are the manager of a company manufacturing electronic components, you may use this book as a tool for convincing your staff to get involved in all reliability issues.

A prognosis on the evolution of FA in the coming years is both easy and difficult to make. It is easy, because everyone working in this domain can see the current trend. FA is still in a ‘romantic’ period, with many new and powerful techniques being used to analyse more and more complex devices. Still, the need for new tools is pressing, especially for the smallest devices, at nano level, but it is difficult to predict the way these issues will be solved. On the other hand, we think that procedures for executing FA will be stabilised and standardised very soon, allowing any user of an electronic component to verify the reliability of the product. In fact, our book is intended to be a step on this road!

Marius I. Bâzu, Romania

Titu-Marius I. Bjenescu, Switzerland

September 2010

About the Authors

Marius Bâzu received his BE and PhD degrees from the Politehnica University of Bucharest, Romania. Since 1971 he has worked at the National Institute for Research and Development in Microtechnology, IMT-Bucharest, Romania. He is currently Head of the Reliability Laboratory and was Scientific Director (2000–2003) and Vice-President of the Scientific Council of IMT-Bucharest (2003–2008). His past work involved the design of semiconductor devices and research in semiconductor physics. Recent research interests include design for reliability, concurrent engineering, methods of reliability prediction in microelectronics and microsystems, a synergetic approach to reliability assurance and the use of computational intelligence methods for reliability assessment. He developed accelerated reliability tests, building-in reliability and concurrent engineering approaches for semiconductor devices. He was a member of the Management Board and chair of the Reliability Cluster of the Network of Excellence Design for Micro and Nano Manufacture, PATENT-DfMM (a 2004–2008 project in the Sixth Framework Program of the European Union), and leader of a European project (Phare/TTQM) on building-in reliability technology (1997–1999). He currently sits on the Board of the European network EUMIREL (European Microsystem Reliability), established in 2007, which aims to deliver reliability services on microsystems.

Dr Bâzu has authored or co-authored more than 130 scientific papers and contributions to conferences, including two books (co-authored with Titu Bjenescu, published in 2010 by Artech House and in 1999 by Springer Verlag), and is a reviewer for the journals Microelectronics Reliability, Sensors, IEEE Transactions on Reliability, IEEE Transactions on Components and Packaging and Electron Device Letters and Associate Editor of the journal Quality Assurance. Recipient of the AGIR (General Association of Romanian Engineers) Award for the year 2000, he has chaired and presented invited lectures at several international conferences: CIMCA 1999 and 2005 (Vienna, Austria), CAS 1991–2009 (Sinaia, Romania) and MIEL 2004 (Niš, Serbia and Montenegro), amongst others. He is currently Invited Professor for Post-Graduate Courses at the Politehnica University of Bucharest.

Contact: 49, Bld. Timisoara, Bl. CC6, ap.34, Sector 6, 061315 Bucharest (Romania); Tel: +40(21)7466109; [email protected], [email protected]

Titu-Marius I. Bjenescu received his engineering training at Politehnica University of Bucharest, Romania. He designed and manufactured experimental equipment for the army research institute and for the air defence system. He specialised in QRA management in Switzerland, the USA, the UK and West Germany, and is a former Senior Member of the IEEE (USA). In 1969 he moved to Switzerland, joining first Asea Brown Boveri (as research and development engineer, involved in the design and manufacture of equipment for telecommunications), then, in 1974, Ascom, as Reliability Manager (recruitment by competitive examination), where he set up QRA and R&M teams, developed policies, procedures and training and managed QRA and R&M programmes. He also acted as QRA manager, monitoring and reporting on production quality and in-service reliability. As a Swiss official, he contributed to the development of new ITU and IEC standards. In 1985, he joined Messtechnik und Optoelektronik (Neuchâtel, Switzerland and Haar, West Germany), a subsidiary of Messerschmitt-Bölkow-Blohm (MBB) Munich, as quality and reliability manager, where he was product assurance manager of ‘intelligent cables' and managed applied research on reliability (electronic components, system analysis methods, test methods and so on). Since 1986 he has worked as an independent consultant and international expert on engineering management, telecommunications, reliability, quality and safety.

He has authored many technical books, on a large range of subjects, published in English, French, German and Romanian. He is a university professor and has written many papers and articles on modern telecommunications and on quality and reliability engineering and management; he lectures on these subjects as invited professor, visiting lecturer and speaker at various European universities and other venues. Since 1991, he has won many awards and distinctions, presented by the Romanian Academy, Romanian Society for Quality, Romanian Engineers Association and so on, for his contributions to reliability science and technology. Recently, he received the honorary title of Doctor Honoris Causa from the Romanian Military Technical Academy and from the Technical University of the Republic of Moldavia. He is Invited Professor at the Romanian Military Technical Academy and at the Technical University of the Republic of Moldavia.

His extensive list of publications includes: Initiation à la fiabilité en électronique moderne, Masson, Paris, 1978; Elektronik und Zuverlässigkeit, Hallwag-Verlag, Berne and Stuttgart, 1979; Problèmes de la fiabilité des composants électroniques actifs actuels, Masson, Paris, 1980; Zuverlässigkeit elektronischer Komponenten, VDE-Verlag, Berlin, 1985; Reliability of Electronic Components. A Practical Guide to Electronic System Manufacturing (with M. Bâzu), Springer, Berlin and New York, 1999; Component Reliability for Electronic Systems (with M. Bâzu), Artech House, Boston and London, 2010.

Contact: 13, Chemin de Riant-Coin, CH-1093 La Conversion (Switzerland); Tel: ++41(0)217913837; Fax: ++41(0)217913837; [email protected].

Chapter 1

Introduction

1.1 The Three Goals of the Book

In our society, which is focused on success in any domain, failure is an extremely negative value. We all still remember the strong emotion produced worldwide by the crash of the Space Shuttle Challenger. On 28 January 1986, Challenger exploded after 73 seconds of flight, leading to the deaths of its seven crew members. The cause was identified after a careful failure analysis (FA): an O-ring seal in its solid rocket booster failed at lift-off, causing a breach in the joint it sealed and allowing pressurised hot gas from the solid rocket motor to reach the outside and impinge upon the attachment hardware. Eventually, this led to the structural failure of the external tank and to the crash of the shuttle. This is a classical example of failure produced by the low quality of a part used in a system. Other examples of well-known events produced by failures of technical systems include the following:

  • On 10 April 1912, RMS Titanic, at that time the largest and most luxurious ship ever built, set sail on its maiden voyage from Southampton to New York. On 14 April, at 23:40, the Titanic struck an iceberg about 400 miles off Newfoundland, Canada. Although the crew had been warned about icebergs several times that evening by other ships navigating through the region, the Titanic was travelling at close to its top speed of about 20.5 knots when the iceberg grazed its side. Less than three hours later, the ship plunged to the bottom of the sea, taking more than 1500 people with it. Only a fraction of the passengers were saved. This was a terrible failure of a complex technical system, made possible because the captain had ignored the necessary precautions; hence the failure was produced by a human fault. The high casualty rate was further explained by the insufficient number of life boats, which implies a design fault.
  • On 24 April 1980, the US President Jimmy Carter authorised the military operation Eagle Claw (or Evening Light) to rescue 52 hostages from the US Embassy in Tehran, Iran. The hostages had been held since 4 November 1979 by a commando of the Iranian Revolutionary Guard. Eight RH-53D helicopters participated in the operation, which failed due to many technical problems. Two helicopters suffered avionics failures en route and a sand storm damaged the hydraulic systems of another two. Because the mission plan called for a minimum of six helicopters, the rest were not able to continue and the mission was aborted. This is considered a typical case of reliability failure of complex technical systems.

Obviously, we have to fight against failures. Consequently FA has been promoted and has quickly become a necessary tool. FA attempts to identify root causes and to propose corrective actions aimed at avoiding future failure.

Given the large range of possible human actions, many specific procedures of FA are needed, starting with medical procedures for curing or preventing diseases (which are failures of the human body) and continuing with various procedures for avoiding the failure of technical systems. In this respect, the word ‘reliability’ has two meanings: first, it is the aim of diminishing or removing failures, and second, it is the property of any system (human or artefact) to function without failures, in some given conditions and for a given duration. In fact, FA is a component of reliability analysis. This idea will be detailed in the following pages.
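The second meaning above is the quantitative one used throughout this book: reliability is a probability, defined for a stated duration and for stated operating conditions. A minimal formal statement, written here in standard notation rather than anything specific to this book, is:

```latex
% Reliability function: the probability that the time to failure T of the item
% exceeds the mission duration t, under the stated operating conditions.
R(t) = \Pr\{\, T > t \mid \text{stated conditions} \,\}, \qquad F(t) = 1 - R(t)
```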

From the above, one can see that the first goal of this book is to present the basics of FA, which is considered the key action for solving reliability issues. But there is a second purpose, equally important: to promote the idea of reliability, to show the importance of this discipline and the necessity of supporting its goals in achieving a given level of reliability as a key characteristic of any product.

Unfortunately, the first reliability issues were solved by statisticians, which led to a mathematical approach to reliability, predominant in the first 25–30 years of the modern history of the domain. Today other disciplines, such as physics and chemistry, are equally involved. All these issues are detailed in Section 1.2, where a short history of reliability as a discipline is presented.

The mathematical approach was restrictive and created the incorrect impression that the aim of reliability analysis was to impede the work of real specialists, forcing them to undertake redesigns due to cryptic results that nobody could understand. Today this misapprehension has generally been overcome, but its after-effects are still present in the mentality of some specialists. We want to persuade component manufacturers that reliability engineers are their best friends, simulating the behaviour of their product in real-life conditions and then recommending necessary improvements before the product can be sold. Manufacturers and reliability engineers must form a team, with information flowing in both directions.

Even more importantly, this book is aimed at showing to industry managers the reasons for taking reliability issues into account from the design phase onwards, through the whole cycle of development of a product. It has been proved that the only way to promote reliability requirements is top-down, starting with the manager and continuing down to every worker.

The third goal of the book starts from our subjective approach to reliability. We think reliability is a beautiful domain, offering immense satisfaction to any specialist, and involving a large range of knowledge, from physics, chemistry and mathematics to all engineering disciplines. That is why strong interdisciplinary teams are needed to solve reliability issues, which are difficult challenges for the human mind.

We want to show to young readers the beauty of reliability analysis, which can be compared to a simple mathematical demonstration. Another approach is to consider a reliability analysis to be similar to the activity of a detective: we have a ‘dead component’ and, based on the information gathered from those involved, we have to find out why this happened and ‘who did it’. This is possible because failures follow the law of cause and effect [1].

Our focus on the above three goals has shaped the structure of this first chapter. As you can see, we want not only to deliver a large amount of information, but also to convince the specialists who manufacture electronic components and systems how important FA is and, more generally, to attract the reader to the ‘charming land of reliability'. This first chapter is therefore of great importance, as our main attractor. Consequently, we have tried to structure it as straightforwardly as possible. We have presumed the subject is a new one for the reader, so we thought it best to begin with a short history of reliability as a discipline, with a special emphasis on FA. A section on terminology will furnish definitions of the most important terms. Finally, the state of the art in FA will be described, including a short description of the main challenges for the near future.

This first chapter will thus show the past, present and future of FA, together with the main terminology. With this knowledge acquired, we think the reader will be ready to learn the general plan of the book, which is given in the final part of this chapter.

1.2 Historical Perspective

There is a general consensus that reliability as a discipline was established during World War II (1939–1945), when the high number of failures noticed for military equipment became a concern, requiring an institutional approach. However, attempts to design a fair quality into an artefact or to monitor the way this quality is maintained during usage (i.e. reliability concerns) were first made a long time ago.

1.2.1 Reliability Prehistory

This story may begin thousands of years ago, during the Fifth Dynasty of the Ancient Egyptian Empire (2563−2423 BCE), when the pharaoh Ptah-hotep stated (in other words, of course, but this was the idea) that good rules are beneficial for those who follow them [2]. This is the first known remark about the quality of a product, specifically about the design quality. Obviously, during the following ages, many other milestones in quality and reliability history occurred:

  • In Ancient Babylon, the Code of Hammurabi (1760 BCE) said: ‘If the ship constructed for somebody is damaged during the first year, the manufacturer has to re-build it without any supplementary cost.' This could be considered the first specification about the reliability of a product!
  • In China, during the Soong dynasty (960–1279 CE), there were six criteria for the quality and reliability of arches: to be light and elastic, to withstand bending and temperature cycles, and so on. Close enough to modern specifications!
  • At the same time in Europe, the guilds (associations of artisans in a particular trade) elaborated principles of quality control, based on standards. Royal governments promoted the control of quality for purchased materials; for instance, King John of England (1199–1216) asked for reports on the construction of ships.
  • In the 1880s mass production began and F.W. Taylor proposed the so-called ‘Scientific Management': assembly lines, division of labour, introduction of work standards and wage incentives. Later, he wrote two basic books on management: Shop Management (1905) and The Principles of Scientific Management (1911).
  • On 16 May 1924, W.A. Shewhart, an engineer at the Western Electric Company, prepared a little memorandum about a page in length, containing the basics of the control chart. He later became the ‘father' of statistical quality control (SQC): methods based on continual on-line monitoring of process variation and the concepts of ‘common cause' and ‘assignable cause' variability.
  • In 1930, H.F. Dodge and H.G. Romig, working at Bell Laboratories, introduced the so-called Dodge–Romig tables: acceptance sampling methods based on a probabilistic approach to predicting lot acceptability from sampling results, centred on defect detection and the concept of acceptable quality level (AQL).

All these contributions (and many others) have paved the way for current, modern approaches in quality and have prepared the development of reliability as a discipline (see Sections 1.2.2 and 1.2.3).

Following World War II, in parallel with the rise of reliability as a discipline, the quality field continued to develop, mainly in the USA. Two eminent Americans, W. Edwards Deming and Joseph Juran, alongside the Japanese professor Kaoru Ishikawa, were successful in promoting this field in Japan. Another name that has to be mentioned is Philip B. Crosby, who initiated the quality-control programme named ‘Zero Defects' at Martin Company, Orlando, Florida, in the late 1960s. In 1983, Don Reinertsen proposed the concept of concurrent engineering as an idea to quantify the value of development speed for new products.

Due to the efforts of the above, a new discipline, called quality assurance, was born, aimed at covering all activities from design, development and production to installation, servicing, documentation, verification and validation.

1.2.2 The Birth of Reliability as a Discipline

The following two events are considered the founding steps of reliability as a discipline:

1. During World War II, the team led by the German rocket engineer Wernher Magnus Maximilian Freiherr von Braun (later the father of the American space programme) developed the V-1 rocket (also known as the Buzz-Bomb) and then the V-2 rocket. The repeated failures of the rockets made a safe launch impossible. Von Braun and his team tried to obtain a better device, focusing on improving the weakest part, but the rockets continued to fail. Eric Pieruschka, a German mathematician, proposed a different approach: the reliability of the rocket would be equal to the product of the reliabilities of its components (a relationship made explicit in the sketch after this list). That is, the reliability of every component matters to the overall reliability. This could be considered the first modern predictive reliability model. Following this approach, the team was able to overcome the problem [3].

2. In 1947, Aeronautical Radio Inc. and Cornell University conducted a reliability study on more than 100 000 electronic tubes, trying to identify the typical causes of failures. This could be considered the first systematic FA. As a consequence of this study, on 7 December 1950, the US Department of Defense (DoD) established the Ad Hoc Group on Reliability of Electronic Equipment (AHGREE), which became in 1952 the Advisory Group on the Reliability of Electronic Equipment (AGREE) [4]. The objectives proposed by this group are still valid today: (i) more reliable components, (ii) reliability testing methods before production, (iii) quantitative reliability requirements and (iv) improved collection of reliability data from the field (including failure analyses). Later, in 1956, AGREE elaborated the first reliability handbook, titled ‘Reliability Factors for Ground Electronic Equipment’. This report is considered the fundamental milestone in the birth of reliability engineering.
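The product rule proposed by Pieruschka (first item above) is worth writing down, because it explains why improving only the weakest part could not save the rockets. A minimal worked example follows; the component count and reliability figure are illustrative, not historical data:

```latex
% Series-system reliability: the system survives only if every component survives.
R_{\text{system}} = \prod_{i=1}^{n} R_i
% Illustrative numbers: n = 500 components, each with R_i = 0.995 for the mission,
% give R_{\text{system}} = 0.995^{500} \approx 0.08
```

Even with components that are individually 99.5% reliable, a 500-part chain completes fewer than one mission in ten, so every component's reliability matters, not just that of the weakest link.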

1.2.3 Historical Development of Reliability

The reliability discipline has evolved around two main subjects: reliability testing and reliability building. In this discussion, we think the history of prediction methods (which are based on FA) is the relevant element, being deeply involved in both subjects: as input data for reliability building and as output data for reliability testing.

Following the issue of the first reliability handbook, the TR-1100 ‘Reliability Stress Analysis for Electronic Equipment’, released by RCA in November 1956, proposed the first models for computing failure rates of electronic components, based on the concept of activation energy and on the Arrhenius relationship.
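For readers meeting the Arrhenius relationship for the first time, the sketch below shows the form in which it is usually applied to component failure rates; the activation energy and the two temperatures are illustrative values of our own, not figures taken from TR-1100.

```latex
% Arrhenius model: the failure rate increases exponentially with absolute temperature T.
\lambda(T) = A \, e^{-E_a/(kT)}
% Acceleration factor between a use temperature T_u and a stress temperature T_s:
AF = \frac{\lambda(T_s)}{\lambda(T_u)}
   = \exp\!\left[ \frac{E_a}{k} \left( \frac{1}{T_u} - \frac{1}{T_s} \right) \right]
% Illustrative values: E_a = 0.7 eV, k = 8.617e-5 eV/K, T_u = 328 K (55 C) and
% T_s = 398 K (125 C) give AF of roughly 78, i.e. one hour of testing at 125 C
% ages the part, for this mechanism, about as much as 78 hours of use at 55 C.
```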

On 30 October 1959, the Rome Air Development Center (RADC; later the Rome Laboratory, RL) issued a ‘Reliability Notebook’, followed by some other basic papers contributing to the development of knowledge in the field: ‘Reliability Applications and Analysis Guide’ by D.R. Earles (September 1960); ‘Failure Rates’ by D.R. Earles and M.F. Eddins (April 1962); and ‘Failure Concepts in Reliability Theory’ by Kirkman (December 1963).

From the early 1960s, efforts in the new reliability discipline focused on one of the RADC objectives: developing prediction methods for electronic components and systems. Two main approaches were followed:

1. ‘The statistical approach', using reliability data gathered in the field. The first reliability prediction handbook, MIL-HDBK-217A, was published in December 1965 by the US Navy. It was a huge success, well received by all designers of electronic systems due to its flexibility and ease of use. In spite of its flawed basic assumption of an exponential distribution of failures [5], the handbook became virtually the only prediction method for the reliability of electronic systems, and other sources of reliability data gradually disappeared [6]. (The arithmetic that this constant-failure-rate assumption leads to is sketched just after this list.)

2. ‘The physics-of-failure (PoF) approach', based on knowledge of the failure mechanisms (FMs) (investigated by FA) by which the components and systems under study are failing. The first symposium devoted to this topic was the ‘Physics of Failure in Electronics' symposium, sponsored by the RADC and the IIT Research Institute (IITRI), in 1962. This symposium later became the ‘International Reliability Physics Symposium' (IRPS), the most influential scientific event in failure physics. On 1 May 1968, the MIL-HDBK-175 Microelectronic Device Data Handbook appeared (revised on 24 October 2000), with a section focused on FA (‘Reliability and Physics of Failure').
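Returning to the first approach, here is a minimal sketch of the arithmetic behind a MIL-HDBK-217-style parts-count prediction: under the exponential assumption each part has a constant failure rate, the system failure rate is the sum of the part failure rates, and the MTBF is its reciprocal. The part names and failure-rate values below are invented for illustration and are not taken from the handbook.

```python
import math

# Hypothetical part failure rates in FIT (failures per 1e9 device-hours);
# the part list and values are illustrative, not handbook data.
part_failure_rates_fit = {
    "bipolar transistor": 12.0,
    "ceramic capacitor": 3.5,
    "film resistor": 1.2,
    "logic IC": 25.0,
}

# Parts-count logic: under the constant-failure-rate (exponential) assumption,
# the system failure rate is simply the sum of the part failure rates.
lambda_system_fit = sum(part_failure_rates_fit.values())
lambda_system_per_hour = lambda_system_fit * 1e-9

# MTBF is the reciprocal of the (assumed constant) system failure rate.
mtbf_hours = 1.0 / lambda_system_per_hour

# Reliability over a 10 000-hour mission under the exponential model.
mission_hours = 10_000
reliability = math.exp(-lambda_system_per_hour * mission_hours)

print(f"System failure rate: {lambda_system_fit:.1f} FIT")
print(f"MTBF: {mtbf_hours:,.0f} hours")
print(f"R(10 000 h) = {reliability:.4f}")
```

The simplicity of this arithmetic explains the handbook's popularity; the flat, time-independent failure rate is precisely the assumption that the physics-of-failure community later showed to be unrealistic.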

The two approaches seemed to be diverging; system engineers were focused on the ‘statistical approach’ while component engineers working in FA were focused on the PoF approach. But soon both groups realised that the two approaches were complementary and attempts to unify the two methods have been made.

This was facilitated by the fact that, in 1974, the RADC, the promoter of the PoF approach, became responsible for preparing the second version of MIL-HDBK-217 and the successive versions (C…F), which tried to update the handbook by taking new advances in technology into account. However, instead of improved results, more and more sophisticated models were obtained, considered ‘too complex, too costly and unrealistic' by the user community [6]. Another attempt to develop PoF-based models, executed by RCA under contract to the RADC, was also unsuccessful, because the model users did not have access to information about the design and construction of components and systems.

In the 1980s, various manufacturers of electronic systems tried to develop specific prediction methods for reliability. Examples include the models proposed for automotive electronics by the Society of Automotive Engineers (SAE) Reliability Standards Committee and for the telecommunication industry (Bellcore reliability-prediction standards).

For the last version (F) of MIL-HDBK-217, issued on 10 July 1992, two teams (IIT/Honeywell and CALCE/Westinghouse, where CALCE is the Centre for Advanced Life Cycle Engineering) were commissioned by the RADC (later RL) to provide guidelines. Both teams reached the following conclusions:

  • The constant-failure-rate model (based on an exponential distribution) is not valid in real life.
  • Electromigration and time-dependent dielectric breakdown could be modelled with a lognormal distribution.
  • Arrhenius-type formulation of the failure rate in terms of temperature should not be included in the package failure model.
  • Temperature change and humidity must be considered as key acceleration factors.
  • Temperature cycling is more detrimental for component reliability than the steady-state temperature at which the device is operating, so long as the temperature is below a critical value.
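The last two conclusions are the ones usually backed by physics-of-failure models that the list does not name. As a hedged illustration only, two forms commonly cited in the reliability literature are the Coffin–Manson relation for temperature cycling and Peck's temperature–humidity model; the symbols below are generic and the constants are mechanism- and material-dependent.

```latex
% Coffin–Manson: cycles to failure fall off as a power of the temperature swing per cycle.
N_f = C \, (\Delta T)^{-m}
% Peck's model: time to failure for humidity-driven mechanisms.
TTF = A \, (RH)^{-n} \, e^{E_a/(kT)}
% \Delta T: temperature swing per cycle; RH: relative humidity;
% C, A, m, n and E_a: empirical constants fitted per failure mechanism.
```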

Similar conclusions were supported by the studies [7, 8]. In fact, these conclusions pave the way for a unified prediction approach, which has not yet been issued.

During the first 30 years of the reliability discipline, military products acted as the main drivers of reliability developments. However, starting from the 1980s, commercial electronic components became more and more reliable. In June 1994, the so-called ‘Acquisition Reform’ took place: the US DoD abolished the use of military specifications and standards in favour of performance specifications and commercial standards in DoD acquisitions [9]. Consequently, in October 1996, MIL-Q-9858, Quality Program Requirements, and MIL-I-45208 A, Inspection System Requirement, were cancelled without replacement. Moreover, contractors were henceforth required to propose their own methods for quality assurance, when appropriate. The DoD policy allows the use of military handbooks only for guidance. Many professional organisations (e.g. IEEE Reliability Society) attempted to produce commercial reliability documents to replace the vanishing military standards [10]. A number of international standards were also produced, including IEC TC-56, some NATO documents, British documents and Canadian documents. In addition to the new standardisation activities, the RL is also undertaking a number of research programmes to help implement acquisition reform.

However, some voices, such as Demko [11], consider a logistic and reliability disaster to be possible, because commercial parts, standards and practice may not meet military requirements. For this purpose, in June 1997 the IITRI of Rome (USA) developed SELECT, a tool that allows users to quantify the reliability of commercial off-the-shelf (COTS) equipment in severe environments [12]. Also, beginning from April 1994, a new organisation, the Government and Industry Quality Liaison Panel (GIQLP), made up of government agencies, industry associations and professional societies, was intimately involved in the vast changes being made in the government acquisition process [13].

MIL-HDBK-217 was among the targets of the Acquisition Reform, but it was impossible to replace, because no other candidate prediction methods were available. However, attempts were made to elaborate a new handbook for predicting the reliability of electronic systems. Intended to supplement the existing handbook, this document, called the ‘New System Reliability Assessment Method', has to manage system-level factors [6]. The method has to take into account previous information about the reliability of similar systems (built with similar technologies, for similar applications and performing similar functions), as well as test data about the new system (aimed at producing an initial estimate of the system's reliability).

On the other hand, a well-known example of commercial standards replacing the old military standards is given by the ISO 9000 family, first issued in 1987, with updates in 2000, 2004 and 2008. Basically, the ISO 9000 standards aimed to provide a framework for assessing the management system in which an organisation operates in relation to the quality of the furnished goods or services. The concept was developed from the US Military Standard for quality, MIL-Q-9858, which was introduced in the 1950s as a means of assuring the quality of products built for the US military services. But there is a fundamental new idea promoted by ISO 9000: the quality management systems of the suppliers are audited by independent organisations, which assess compliance with the standard and issue certificates of registration. The suppliers of defence equipment were assessed against the standards by their customers. In a well-documented and convincing paper about ISO 9000 standards, Patrick O'Connor elucidated the weak points of this approach [14]:

  • The philosophy of ‘total quality' demands close partnership between supplier and purchaser, which is destroyed if the ‘third party' has to audit the quality management of the supplier.
  • It is hard to believe that this ‘third party' is able to have the appropriate specialist knowledge about the products to be delivered.
  • In fact, the ISO 9000 standards aim only to verify that the personnel of the supplier are strictly observing the working procedures and not whether the procedures are able to ensure a specified quality and reliability level. Of course, some organisations have generated real improvements as a result of registration, but it is not obvious that this will happen in all cases. Moreover, the high costs required to implement ISO 9000 may have a detrimental effect on real improvement, by technical corrective actions, in the manufacturing process.
  • Two explanations are proposed for the wide adoption of these standards, in spite of the solid arguments of many leading teachers of quality management: (i) the tendency to believe that people perform better when told what to do, rather than when they are given freedom and the necessary skills and motivation to determine the best ways to perform their work; and (ii) working only with ‘registered' suppliers is the easy way for many bureaucrats to select the most appropriate suppliers for their products. The main responsibility is transferred to the ‘third party'.

As one can see, the ISO 9000 approach seeks to ‘standardise’ methods that directly contradict the essential lessons of the modern quality and productivity revolution (e.g. ‘total quality’), as well as those of modern management thinking.

Some other important contributors to the domain of reliability include the following:

- Genichi Taguchi, who proposed robust design, using fractional factorial designs and orthogonal arrays in order to minimise loss by obtaining products with minimal variation in their functional characteristics (a minimal sketch of his quadratic loss function follows this list).
- Dorian Shainin, reliability consultant for various companies, including NASA (the Apollo Lunar Module), who supported the idea of discovering and solving problems early in the design phase, before costly manufacturing steps are taken and before customers experience failures in the field.
- Wayne B. Nelson, who developed a series of methodologies on accelerated testing, based on FA.
- Larry H. Crow, independent consultant as well as an instructor and consultant for ReliaSoft Corporation, who made contributions in the areas of reliability growth and repairable-system data analysis.
- Gregg K. Hobbs, the inventor of the highly accelerated life test (HALT), a stress-testing methodology aimed at obtaining information about the reliability of a product.
- Michael Pecht, the founder of CALCE and the Electronic Products and Systems Consortium at the University of Maryland, which has made essential contributions to the study of FMs of electronic components.
- Patrick O'Connor, who made contributions to our understanding of the role of failure physics in estimating component reliability; he also proposed convincing arguments against ISO 9000.
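
As orientation only (not from the text, and with purely hypothetical numbers), Taguchi's ‘loss through variation’ idea is usually expressed as a quadratic loss function, L(y) = k * (y - T)^2, whose average over a batch grows with both the variance and the offset from the target T:

# Minimal sketch of Taguchi's quadratic loss function (hypothetical values).
# Loss per unit: L(y) = k * (y - T)^2, where T is the target value of a
# functional characteristic and k is a cost coefficient.

import statistics

def average_taguchi_loss(measurements, target, k):
    """Average loss over a batch; equals k * (variance + bias^2)."""
    return k * statistics.fmean((y - target) ** 2 for y in measurements)

target = 5.0          # target value of the characteristic (e.g. volts)
k = 2.0               # cost per squared unit of deviation (currency units)

tight_process = [4.98, 5.01, 5.02, 4.99, 5.00]   # small variation
loose_process = [4.80, 5.15, 5.30, 4.70, 5.05]   # large variation

print(average_taguchi_loss(tight_process, target, k))   # low loss
print(average_taguchi_loss(loose_process, target, k))   # much higher loss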

1.2.4 Tools for Failure Analysis

Initiated in 1947 for electronic tubes, FA was subsequently developed mainly for microelectronic devices (transistors, integrated circuits (ICs), optoelectronic devices, microsystems and so on), but also for electronic systems. Since 1965, the number of transistors per chip has been doubling every 24 months, as predicted by Moore. Today, ICs are made up of hundreds of millions of transistors fabricated on a single chip. Key FA tools have been developed continuously, driving the growth of the semiconductor industry by solving difficult test problems.
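
As a back-of-the-envelope illustration of that doubling rate (the starting count and years below are assumptions, not figures from the text), the trend can be written as N(year) = N0 * 2^((year - 1965) * 12 / 24):

# Back-of-the-envelope sketch of the 'doubling every 24 months' trend.
# Starting count and years are illustrative, not figures from the text.

def transistors_per_chip(year, base_year=1965, base_count=64, months_to_double=24):
    """Estimate transistor count assuming doubling every `months_to_double` months."""
    doublings = (year - base_year) * 12 / months_to_double
    return base_count * 2 ** doublings

for year in (1971, 1985, 2000, 2010):
    print(year, f"{transistors_per_chip(year):,.0f}")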

The search for physical-failure root causes (especially by physical inspection and electrical localisation) is aimed at breaking through any technology barrier at each stage of chip development, package development, manufacturing and field application, offering the real key to eradicating the error. Testing provides us with information on the electrical performance; FA can discover the detractors responsible for the poor performance [15]. In today's electronics industry, FA is squeezed between the need for very rapid analysis to support manufacturing and the exploding complexity of the devices. This requires knowledge of subjects like design, testing, technology, processing, materials science, physics, chemistry and even mathematics [16]!

As can be seen in Figure 1.1, a number of key FA tools and advanced techniques have been developed. These play essential roles in the development of semiconductor technology.

Figure 1.1 History of the main techniques used in failure analysis: year of appearance, name, acronym, name of inventor

1.3 Terminology

The lack of precision in terminology is a well-known disease of our times. Very often, specialists with the same opinion about a phenomenon are in disagreement because they are implicitly using different definitions for the same term, or because they are using different terms with the same meaning. Of course, standards for the main terms of any technical field have been elaborated, but it is difficult to read a book while keeping an eye on one or more standards. This is why we felt that this book needs a glossary. The glossary is located in the back matter and contains only the basic terms referring to the subject of the book, failure analysis for electronic components and systems, divided into two main sections:

- Terms related to electronic components and systems.
- Terms related to FA.

Other more specific terms will be explained within the main body of the text, when necessary.

1.4 State of the Art and Future Trends

The current issues related to FA of electronic components and systems can be structured around three main areas:

- techniques of FA;
- FMs;
- models for the PoF.

In this section, the most important subjects of each area will be discussed.

1.4.1 Techniques of Failure Analysis

Today, when FA is performed at system level, we have to analyse not only discrete (active and passive) components but a large range of ultra-high-density ICs, with a design complexity that exceeds 300 million gates, manufactured in a huge variety of technologies (bipolar silicon, CMOS, BiCMOS, GaAs, InP, GaN, SiC, complex heterojunction structures and microelectromechanical systems (MEMS)), which will be detailed in Chapter 5. This is why today's analyst faces complex equipment sets (curve tracers, optical microscopes, decapsulation tools, X-ray and acoustic microscopes, electron and/or optical and/or focused ion beam (FIB) tools, thermal detection techniques, the scanning-probe atomic force microscope, surface-science tools, a great variety of electrical testing hardware and so on) that are necessary to carry out a complete, spatially resolved FA. FA is a highly technical activity with increasingly complex, sophisticated and costly specialised equipment. It is very difficult to achieve a balance between customer satisfaction, cost-effectiveness and future challenges. Very often the analyst must use a limited set of tools, as the cost of all the required tools exceeds the budget of current operations. FA techniques are used to confirm and localise physical defects. The final objective is to find the root cause.

Failure modes and effects analysis (FMEA) is a systematic method of studying failure, formally introduced in the late 1940s for military use by the US armed forces [3]. Later, FMEA was used in aerospace/rocket development to avoid errors in the small sample sizes of costly rocket technology. Nowadays FMEA methodology is extensively used in a variety of industries, including semiconductor processing, food service, plastics, software and health care. It is integrated into advanced product quality planning (APQP) to provide the primary risk-mitigation tools and timing in the prevention strategy, in both design and process formats. FMEA is also useful at component level, especially for complex components (ICs, MEMS, etc.). In FMEA, failures are prioritised according to how serious their consequences are, how frequently they occur and how easily they can be detected. Current knowledge about the risks of failures and the preventive actions for use in continuous improvement is also documented. FMEA is used during the design stage in order to avoid future failures, and later for process control, before and during ongoing operation of the process. Ideally, FMEA begins during the earliest conceptual stages of design and continues throughout the life of the product or service. The purpose of FMEA is to take action to eliminate or reduce failures, starting with the highest-priority ones.
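
That prioritisation is commonly carried out with a risk priority number, RPN = severity x occurrence x detection, each factor scored on a 1-10 scale; the following sketch (the failure modes and scores are hypothetical, not taken from the text) shows how the highest-priority items surface first:

# Minimal FMEA prioritisation sketch using the common risk priority number
# RPN = severity x occurrence x detection (each scored 1-10; 10 = worst,
# or hardest to detect). Failure modes and scores below are hypothetical.

from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int      # how serious the consequences are
    occurrence: int    # how frequently it occurs
    detection: int     # how hard it is to detect before reaching the customer

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

modes = [
    FailureMode("Wire-bond lift-off",   severity=8, occurrence=3, detection=6),
    FailureMode("Solder-joint crack",   severity=7, occurrence=5, detection=4),
    FailureMode("Package delamination", severity=6, occurrence=2, detection=8),
]

# Address the highest-RPN failure modes first.
for fm in sorted(modes, key=lambda m: m.rpn, reverse=True):
    print(f"{fm.name:25s} RPN = {fm.rpn}")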

At component level, a broad definition of FA includes: collection of background data, visual examination, chemical analysis, mechanical properties, macroscopic examination, metallographic examination, micro-hardness, scanning electron microscopy (SEM) analysis, microprobe analysis, residual stresses and phases, simulation/tests, summary of findings, preservation of evidence, formulation of one or more hypotheses, development of test methodologies, implementation of tests/collection of data, review of results and revision of hypotheses. At each step, the customer is kept informed.

First, the causes of a failure can be classified according to the phase of a product's life cycle in which they arise: design, materials processing, component manufacturing, or service environment and operating conditions. Then two main areas of FA enable fast chip-level circuit isolation and circuit editing for quick diagnosis and problem-solving, helping to advance semiconductor development:

- Physical inspection, represented by three important tools: SEM, emission microscopy and transmission electron microscopy (TEM).
- Electrical localisation, executed mainly with liquid crystal analysis (LCA), photon emission microscopy (PEM) and FIB.

The package-level global localisation tool infra-red lock-in thermography (IR-LIT) became widely available in 2005 and is the most popular tool for global localisation in complex packages, such as system-in-package (SiP) and system-on-chip (SoC). Today the tool of choice for SoC development support is X-ray CT, thanks to significant improvements in resolution and speed. Throughout the history of semiconductor development, FA has made, and continues to make, a continuous contribution to technological innovation.

When ICs are analysed, a number of tools and techniques are used due to device-specific issues: additional interconnection levels, power distribution planes and flip-chip packaging completely eliminate the possibility of employing standard optical or voltage-contrast FA techniques without destructive intervention. The defect localisation utilises techniques based on advanced imaging, and on the interaction of various probes with the electrical behaviour of devices and defects.

The thermal interaction between actively operated electronic components and the applied characterisation tools is one of the most important interactions within FA and reliability investigations. It allows different kinds of thermal interaction mechanism to be utilised, which would normally have to be separated, for instance into classes with respect to thermal excitation and/or detection, spatial limitations and the underlying physical principle. Although they all have in common the ability to link the thermo-electric device characteristic to a representative output signal, they have to be interpreted in completely different ways. Recently, the complementarity of the methods for localisation and characterisation, as well as the corresponding industrial demands and related limitations, has been demonstrated [17]; techniques such as IR-LIT and thermally induced voltage alteration (TIVA), case studies and the capabilities of non-established techniques like scanning thermal microscopy (SThM), thermal reflectance microscopy (TRM) and time-domain thermal reflectance (TDTR) are also presented, and their impact on reliability investigations is discussed.
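
To make the lock-in principle behind IR-LIT concrete, the sketch below (synthetic data and hypothetical parameter values, not from the text) correlates a stack of thermal frames with sine and cosine references at the excitation frequency; a weak periodic hot spot that is invisible in any single frame stands out in the resulting amplitude image:

# Illustrative sketch of lock-in demodulation as used in IR lock-in
# thermography (IR-LIT): synthetic frame stack, hypothetical parameters.

import numpy as np

fs = 50.0            # frame rate of the IR camera (frames per second)
f_lockin = 1.0       # excitation (lock-in) frequency in Hz
n_frames = 500
t = np.arange(n_frames) / fs

# Synthetic image stack: noise everywhere, plus a weak periodic heat source
# at pixel (12, 20) that is buried in the noise of any single frame.
rng = np.random.default_rng(0)
frames = rng.normal(0.0, 1.0, size=(n_frames, 32, 32))
frames[:, 12, 20] += 0.5 * np.sin(2 * np.pi * f_lockin * t)

# Lock-in correlation: project the time series of every pixel onto
# sine and cosine references at the excitation frequency.
ref_sin = np.sin(2 * np.pi * f_lockin * t)
ref_cos = np.cos(2 * np.pi * f_lockin * t)
in_phase   = np.tensordot(ref_sin, frames, axes=(0, 0)) * 2 / n_frames
quadrature = np.tensordot(ref_cos, frames, axes=(0, 0)) * 2 / n_frames

amplitude = np.hypot(in_phase, quadrature)   # amplitude image
phase = np.arctan2(quadrature, in_phase)     # phase image (carries depth information)

# The amplitude image peaks at the hot-spot location despite the noise.
print(np.unravel_index(np.argmax(amplitude), amplitude.shape))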

Over the last few years, the increased complexity of devices has increased the difficulty of performing FA. Higher integration has led to smaller geometries and higher wire-to-cell ratios, thus increasing the complexity of the design. These changes have reduced the effectiveness of most current FA techniques; over the past few years, a variety of techniques and tools, such as electron-beam (E-beam) probers, FIB, enhanced-imaging SEM and field-emission SEM (FESEM), have been developed to localise defects at wafer level. All these tools improve FA capabilities, but at substantial cost, running into hundreds of thousands of dollars. Some other examples of new techniques are given below:

A strategy was derived for FA in random logic devices (such as microprocessors and other VLSI chips) where the electrical scheme is not known. This strategy is based on the use of a test tool composed of an SEM combined with voltage contrast, an exerciser, an image-processing system and a control and data-processing system [18].