Industrial Data Analytics for Diagnosis and Prognosis E-Book

Shiyu Zhou

Publisher: John Wiley & Sons
Category: Science and New Technologies
Language: English

Description

Discover data analytics methodologies for the diagnosis and prognosis of industrial systems under a unified random effects model.

In Industrial Data Analytics for Diagnosis and Prognosis: A Random Effects Modelling Approach, distinguished engineers Shiyu Zhou and Yong Chen deliver a rigorous and practical introduction to the random effects modeling approach for industrial system diagnosis and prognosis. In the book's two parts, general statistical concepts and useful theory are described and explained, as are industrial diagnosis and prognosis methods. The authors describe and model fixed effects, random effects, and variation in univariate and multivariate datasets and cover the application of the random effects approach to diagnosis of variation sources in industrial processes. They offer a detailed performance comparison of different diagnosis methods before moving on to the application of the random effects approach to failure prognosis in industrial processes and systems.

In addition to presenting the joint prognosis model, which integrates the survival regression model with the mixed effects regression model, the book also offers readers:

* A thorough introduction to describing variation of industrial data, including univariate and multivariate random variables and probability distributions
* Rigorous treatments of the diagnosis of variation sources using PCA pattern matching and the random effects model
* An exploration of the extended mixed effects model, including the mixture prior and Kalman filtering approaches, for real-time prognosis
* A detailed presentation of the Gaussian process model as a flexible approach for the prediction of temporal degradation signals

Ideal for senior year undergraduate students and postgraduate students in industrial, manufacturing, mechanical, and electrical engineering, Industrial Data Analytics for Diagnosis and Prognosis is also an indispensable guide for researchers and engineers interested in data analytics methods for system diagnosis and prognosis.

Number of pages: 537

Year of publication: 2021

Industrial Data Analytics for Diagnosis and Prognosis

A Random Effects Modelling Approach

Shiyu Zhou

University of Wisconsin – Madison

Yong Chen

University of Iowa

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008.

Limit of Liability/Disclaimer of Warranty:

While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.

Library of Congress Cataloging-in-Publication Data:

Names: Zhou, Shiyu, 1970- author. | Chen, Yong (Professor of industrial and systems engineering), author.

Title: Industrial data analytics for diagnosis and prognosis : a random effects modelling approach / Shiyu Zhou, Yong Chen.

Description: Hoboken, NJ : John Wiley & Sons, Inc., 2021. | Includes bibliographical references and index.

Subjects: LCSH: Industrial engineering--Statistical methods. | Industrial management--Mathematics. | Random data (Statistics) | Estimation theory.

Classification: LCC T57.35 .Z56 2021 (print) | LCC T57.35 (ebook) | DDC 658.0072/7--dc23

LC record available at https://lccn.loc.gov/2021000379

LC ebook record available at https://lccn.loc.gov/2021000380

Cover image: © monsitj/ iStock/Getty Images

Cover design by Wiley

Set in 9.5/12.5pt STIX Two Text by Integra Software Services, Pondicherry, India.

To our families:

Yifan and Laura

Jinghui, Jonathan, and Nathan

Cover

Title page

Dedication

Preface

Acknowledgments

Acronyms

Table of Notation

Chapter 1: Introduction

1.1 Background and Motivation

1.2 Scope and Organization of the Book

1.3 How to Use This Book

Bibliographic Notes

Part 1 Statistical Methods and Foundation for Industrial Data Analytics

Chapter 2: Introduction to Data Visualization and Characterization

2.1 Data Visualization

2.1.1 Distribution Plots for a Single Variable

2.1.2 Plots for Relationship Between Two Variables

2.1.3 Plots for More than Two Variables

2.2 Summary Statistics

2.2.1 Sample Mean, Variance, and Covariance

2.2.2 Sample Mean Vector and Sample Covariance Matrix

2.2.3 Linear Combination of Variables

Bibliographic Notes

Exercises

Chapter 3: Random Vectors and the Multivariate Normal Distribution

3.1 Random Vectors

3.2 Density Function and Properties of Multivariate Normal Distribution

3.3 Maximum Likelihood Estimation for Multivariate Normal Distribution

3.4 Hypothesis Testing on Mean Vectors

3.5 Bayesian Inference for Normal Distribution

Bibliographic Notes

Exercises

Chapter 4: Explaining Covariance Structure: Principal Components

4.1 Introduction to Principal Component Analysis

4.1.1 Principal Components for More Than Two Variables

4.1.2 PCA with Data Normalization

4.1.3 Visualization of Principal Components

4.1.4 Number of Principal Components to Retain

4.2 Mathematical Formulation of Principal Components

4.2.1 Proportion of Variance Explained

4.2.2 Principal Components Obtained from the Correlation Matrix

4.3 Geometric Interpretation of Principal Components

4.3.1 Interpretation Based on Rotation

4.3.2 Interpretation Based on Low-Dimensional Approximation

Bibliographic Notes

Exercises

Chapter 5: Linear Model for Numerical and Categorical Response Variables

5.1 Numerical Response – Linear Regression Models

5.1.1 General Formulation of Linear Regression Model

5.1.2 Significance and Interpretation of Regression Coefficients

5.1.3 Other Types of Predictors in Linear Models

5.2 Estimation and Inferences of Model Parameters for Linear Regression

5.2.1 Least Squares Estimation

5.2.2 Maximum Likelihood Estimation

5.2.3 Variable Selection in Linear Regression

5.2.4 Hypothesis Testing

5.3 Categorical Response – Logistic Regression Model

5.3.1 General Formulation of Logistic Regression Model

5.3.2 Significance and Interpretation of Model Coefficients

5.3.3 Maximum Likelihood Estimation for Logistic Regression

Bibliographic Notes

Exercises

Chapter 6: Linear Mixed Effects Model

6.1 Model Structure

6.2 Parameter Estimation for LME Model

6.2.1 Maximum Likelihood Estimation Method

6.2.2 Distribution-Free Estimation Methods

6.3 Hypothesis Testing

6.3.1 Testing for Fixed Effects

6.3.2 Testing for Variance–Covariance Parameters

Bibliographic Notes

Exercises

Part 2 Random Effects Approaches for Diagnosis and Prognosis

Chapter 7: Diagnosis of Variation Source Using PCA

7.1 Linking Variation Sources to PCA

7.2 Diagnosis of Single Variation Source

7.3 Diagnosis of Multiple Variation Sources

7.4 Data Driven Method for Diagnosing Variation Sources

Bibliographic Notes

Exercises

Chapter 8: Diagnosis of Variation Sources Through Random Effects Estimation

8.1 Estimation of Variance Components

8.2 Properties of Variation Source Estimators

8.3 Performance Comparison of Variance Component Estimators

Bibliographic Notes

Exercises

Chapter 9: Analysis of System Diagnosability

9.1 Diagnosability of Linear Mixed Effects Model

9.2 Minimal Diagnosable Class

9.3 Measurement System Evaluation Based on System Diagnosability

Bibliographic Notes

Exercises

Appendix

Chapter 10: Prognosis Through Mixed Effects Models for Longitudinal Data

10.1 Mixed Effects Model for Longitudinal Data

10.2 Random Effects Estimation and Prediction for an Individual Unit

10.3 Estimation of Time-to-Failure Distribution

10.4 Mixed Effects Model with Mixture Prior Distribution

10.4.1 Mixture Distribution

10.4.2 Mixed Effects Model with Mixture Prior for Longitudinal Data

10.5 Recursive Estimation of Random Effects Using Kalman Filter

10.5.1 Introduction to the Kalman Filter

10.5.2 Random Effects Estimation Using the Kalman Filter

Bibliographic Notes

Exercises

Appendix

Chapter 11: Prognosis Using Gaussian Process Model

11.1 Introduction to Gaussian Process Model

11.2 GP Parameter Estimation and GP Based Prediction

11.3 Pairwise Gaussian Process Model

11.3.1 Introduction to Multi-output Gaussian Process

11.3.2 Pairwise GP Modeling Through Convolution Process

11.4 Multiple Output Gaussian Process for Multiple Signals

11.4.1 Model Structure

11.4.2 Model Parameter Estimation and Prediction

11.4.3 Time-to-Failure Distribution Based on GP Predictions

Bibliographical Notes

Exercises

Chapter 12: Prognosis Through Mixed Effects Models for Time-to-Event Data

12.1 Models for Time-to-Event Data Without Covariates

12.1.1 Parametric Models for Time-to-Event Data

12.1.2 Non-parametric Models for Time-to-Event Data

12.2 Survival Regression Models

12.2.1 Cox PH Model with Fixed Covariates

12.2.2 Cox PH Model with Time Varying Covariates

12.2.3 Assessing Goodness of Fit

12.3 Joint Modeling of Time-to-Event Data and Longitudinal Data

12.3.1 Structure of Joint Model and Parameter Estimation

12.3.2 Online Event Prediction for a New Unit

12.4 Cox PH Model with Frailty Term for Recurrent Events

Bibliographical Notes

Exercises

Appendix

Appendix: Basics of Vectors, Matrices, and Linear Vector Space

References

Index

Preface

Today, we are facing a data rich world that is changing faster than ever before. The ubiquitous availability of data provides great opportunities for industrial enterprises to improve their process quality and productivity. Industrial data analytics is the process of collecting, exploring, and analyzing data generated from industrial operations and throughout the product life cycle in order to gain insights and improve decision-making. This book describes industrial data analytics approaches with an emphasis on diagnosis and prognosis of industrial processes and systems.

A large number of textbooks/research monographs exist on diagnosis and prognosis in the engineering field. Most of these engineering books focus on model-based diagnosis and prognosis problems in dynamic systems. The model-based approaches adopt a dynamic model for the system, often in the form of a state space model, as the basis for diagnosis and prognosis. Unlike these existing books, this book focuses on the concept of random effects and its applications in system diagnosis and prognosis. The impetus for this book arose from the current digital revolution. In this digital age, the essential feature of a modern engineering system is that a large amount of data from multiple similar units/machines during their operations are collected in real time. This feature poses significant intellectual opportunities and challenges. As for opportunities, since we have observations from potentially a very large number of similar units, we can compare their operations, share the information, and extract common knowledge to enable accurate and tailored prediction and control at the individual level. As for challenges, because the data are collected in the field and not in a controlled environment, the data contain significant variation and heterogeneity due to the large variations in working/usage conditions for different units. This requires analytics approaches that are not only general (so that the common information can be learned and shared), but also flexible (so that the behavior of an individual unit can be captured and controlled). Random effects modeling approaches are well suited to these opportunities and challenges.

Random effects, as the name implies, refer to the underlying random factors in an industrial process or system that affect the outcome of the process. In diagnosis and prognosis applications, random effects can be used to model the sources of variation in a process and the variation among individual characteristics of multiple heterogeneous units. Some excellent books are available in the industrial statistics area. However, these existing books mainly focus on population level behavior and fixed effects models. The goal of this book is to adapt and bring the theory and techniques of random effects to the application area of industrial system diagnosis and prognosis.

The book contains two main parts. The first part covers general statistical concepts and theory useful for describing and modeling variation, fixed effects, and random effects for both univariate and multivariate data, which provides the necessary background for the second part of the book. The second part covers advanced statistical methods for variation source diagnosis and system failure prognosis based on the random effects modeling approach. An appendix summarizing the basic results in linear spaces and matrix theory is also included at the end of the book for the sake of completeness.

This book is intended for students, engineers, and researchers who are interested in using modern statistical methods for variation modeling, analysis, and prediction in industrial systems. It can be used as a textbook for a graduate level or advanced undergraduate level course on industrial data analytics and/or quality and reliability engineering. We also include “Bibliographic Notes” at the end of each chapter that highlight relevant additional reading materials for interested readers. These bibliographic notes are not intended to provide a complete review of the topic. We apologize for missing literature that is relevant but not included in these notes.

Many of the materials of this book come from the authors’ recent research works in variation modeling and analysis, variation source diagnosis, and system condition and failure prognosis for manufacturing systems and beyond. We hope this book can stimulate some new research and serve as a reference book for researchers in this area.

Shiyu Zhou
Madison, Wisconsin, USA

Yong Chen
Iowa City, Iowa, USA

Acknowledgments

We would like to thank the many people whose collaboration with us led up to the writing of this book. In particular, we would like to thank Jianjun Shi, our Ph.D. advisor at the University of Michigan (now at Georgia Tech), for his continuous advice and encouragement. We are grateful to our colleagues Daniel Apley, Darek Ceglarek, Yu Ding, Jionghua Jin, Dharmaraj Veeramani, and Yilu Zhang for their collaborations with us on the related research topics. Grateful thanks also go to Raed Kontar, Junbo Son, and Chao Wang, who helped with the book, including writing computational code to create some of the illustrations and designing the exercise problems. Many students, including Akash Deep, Salman Jahani, Jaesung Lee, and Congfang Huang, read parts of the manuscript and helped with the exercise problems. We thank the National Science Foundation for the support of our research work related to the book.

Finally, a very special note of appreciation is extended to our families who have provided continuous support over the past years.

S.Z. and Y.C.

Acronyms

AIC       Akaike Information Criterion
BIC       Bayesian Information Criterion
CDF       Cumulative Distribution Function
EM        Expectation–Maximization
GP        Gaussian Process
i.i.d.    Independent and identically distributed
IoT       Internet of Things
IQR       Interquartile Range
KCC       Key Control Characteristic
KQC       Key Quality Characteristic
LME       Linear Mixed Effects
LRT       Likelihood Ratio Test
MLE       Maximum Likelihood Estimation
MINQUE    Minimum Norm Quadratic Unbiased Estimation
MOGP      Multiple Output Gaussian Process
PCA       Principal Component Analysis
pdf       probability density function
PH        Proportional Hazards
RREF      Reduced Row Echelon Form
REML      Restricted Maximum Likelihood
RUL       Remaining Useful Life
r.v.      random variable(s)
SNR       Signal-to-Noise Ratio

1 Introduction

1.1 Background and Motivation

Today, we are facing a data rich world that is changing faster than ever before. The ubiquitous availability of data provides great opportunities for industrial enterprises to improve their process quality and productivity. Indeed, the fast development of sensing, communication, and information technology has turned modern industrial systems into a data rich environment. For example, in a modern manufacturing process, it is now common to conduct a 100% inspection of product quality through automatic inspection stations. In addition, many modern manufacturing machines are numerically-controlled and equipped with many sensors and can provide various sensing data of the working conditions to the outside world.

One particularly important enabling technology in this trend is the Internet of Things (IoT) technology. IoT represents a network of physical devices, which enables ubiquitous data collection, communication, and sharing. One typical application of the IoT technology is the remote condition monitoring, diagnosis, and failure prognosis system for after-sales services. Such a system typically consists of three major components as shown in Figure 1.1: (i) the in-field units (e.g., cars on the road), (ii) the communication network, and (iii) the back-office/cloud data processing center. The sensors embedded in the in-field unit continuously generate data, which are transmitted through the communication network to the back office. The aggregated data are then processed and analyzed at the back-office to assess system status and produce prognosis. The analytics results and the service alerts are passed individually to the in-field unit. Such a remote monitoring system can effectively improve the user experience, enhance the product safety, lower the ownership cost, and eventually gain competitive advantage for the manufacturer. Driven by the rapid development of information technology and the critical needs of providing fast and effective after-sales services to the products in a globalized market, the remote monitoring systems are becoming increasingly available.

Figure 1.1 A diagram of an IoT enabled remote condition monitoring system.

The unprecedented data availability provides great opportunities for more precise and contextualized system condition monitoring, diagnosis, and prognosis, which are very challenging to achieve if only scarce data are available. Industrial data analytics is the process of collecting, exploring, and analyzing data generated from industrial operations throughout the product life cycle in order to gain insights and improve decision-making. Industrial data analytics encompasses a vast set of applied statistics and machine learning tools and techniques, including data visualization, data-driven process modeling, statistical process monitoring, root cause identification and diagnosis, predictive analytics, system reliability and robustness, and design of experiments, to name just a few. The focus of this book is industrial data analytics approaches that can take advantage of the unprecedented data availability. Particularly, we focus on the concept of random effects and its applications in system diagnosis and prognosis.

The terms diagnosis and prognosis were originally used in the medical field. Diagnosis is the identification of the disease that is responsible for the symptoms of the patient’s illness, and prognosis is a forecast of the likely course of the disease. In the field of engineering, these terms have similar meanings: for an industrial system, diagnosis is the identification of the root cause of a system failure or abnormal working condition; and prognosis is the prediction of the system degradation status and the future failure or breakdown. Diagnosis and prognosis therefore play a critical role in assuring smooth, efficient, and safe system operations, and they have attracted ever-growing interest in recent years. This trend has been driven by the increasing pressure on capital goods manufacturers to open up new sources of revenue and profit. Maintenance service costs constitute around 60–90% of the life-cycle costs of industrial machinery and installations. Systematic extension of the after-sales service business will be an increasingly important driver of profitable growth.

Due to the importance of diagnosis and prognosis in industrial system operations, a relatively large number of books/research monographs exist on this topic [Lewis et al., 2011, Niu, 2017, Wu et al., 2006, Talebi et al., 2009, Gertler, 1998, Chen and Patton, 2012, Witczak, 2007, Isermann, 2011, Ding, 2008, Si et al., 2017]. As implied by their titles, many of these books focus on model-based diagnosis and prognosis problems in dynamic systems. A model-based approach adopts a dynamic model, often in the form of a state space model, as the basis for diagnosis and prognosis. The differences between the observations and the model predictions, called residuals, are then examined to achieve fault identification and diagnosis. For prognosis, data-driven dynamic forecasting methods, such as time series modeling methods, are used to predict the future values of the system signals of interest. The modeling and analysis of the system dynamics are the focus of the existing literature.
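To make the residual idea concrete, here is a minimal sketch rather than a method taken from the cited books. It assumes a hypothetical scalar model x[k+1] = a*x[k] + u with known a and u, a sensor bias that appears at a chosen step, and a fixed alarm threshold. All names and numeric values are illustrative assumptions.

```python
import random

def simulate_and_detect(n_steps=100, fault_step=60, fault_bias=1.0,
                        noise_sd=0.05, threshold=0.5, seed=0):
    """Return the time steps at which the residual exceeds the threshold."""
    rng = random.Random(seed)
    a, u = 0.9, 0.1      # assumed nominal model: x[k+1] = a * x[k] + u
    x_true = 0.0         # state of the (simulated) real system
    x_model = 0.0        # state predicted by the nominal model
    alarms = []
    for k in range(n_steps):
        # Observation: true state plus sensor noise, plus a bias once the fault occurs.
        bias = fault_bias if k >= fault_step else 0.0
        y = x_true + bias + rng.gauss(0.0, noise_sd)
        residual = y - x_model          # observation minus model prediction
        if abs(residual) > threshold:
            alarms.append(k)
        # Advance both the real system and the nominal model by one step.
        x_true = a * x_true + u
        x_model = a * x_model + u
    return alarms

alarms = simulate_and_detect()
print("first alarm at step:", alarms[0] if alarms else None)
```

In this toy setting the residual is pure sensor noise until the bias appears, so a simple threshold separates the two regimes. Practical model-based schemes typically add a state estimator and a statistically chosen threshold.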

Unlike the existing literature, this book focuses on the concept of random effects and its applications in system diagnosis and prognosis. Random effects, as the name implies, refer to the underlying random factors in an industrial process that affect the outcome of the process. In diagnosis and prognosis applications, random effects can be used to model the sources of variation in a process and the variation among individual characteristics of multiple heterogeneous units. The following two examples illustrate the random effects in industrial processes.

Example 1.1 Random effects in automotive body sheet metal assembly processes

The concept of variation source is illustrated for an assembly operation in which two parts are welded together. In an automotive sheet metal assembly process, the sheet metals will be positioned and clamped on the fixture system through the matching of the locators (also called pins) on the fixture system and the holes on the sheet metals. Then the sheet metals will be welded together. Clearly, the accuracy of the positions of the locating pins and the tightness of the matching between the pins and the holes significantly influence the dimensional accuracy of the final assembly. Figure 1.2(a) shows the final product as designed. The assembly process is as follows: Part 1 is first located on the fixture and constrained by 4-way Pin L1 and 2-way Pin L2. A 4-way pin constrains the movement in two directions, while a 2-way pin only constrains the movement in one direction. Then, Part 2 is located by 4-way Pin L3 and 2-way Pin L4. The two parts are then welded together in a joining operation and released from the fixture.

Figure 1.2 Random effects in an assembly operation.

If the position or diameter of Pin L1 deviates from design nominal, then Part 1 will consequently not be in its designed nominal position, as shown in Figure 1.2(b). After joining Part 1 and Part 2, the dimensions of the final parts will deviate from the designed nominal values. One critical point that needs to be emphasized is that Figure 1.2(b) only shows one possible realization of produced assemblies. If we produce another assembly, the deviation of the position of Part 1 could be different. For instance, if the diameter of a pin is reduced due to pin wear, then the matching between the pin and the corresponding hole will be loose, which will lead to random wobble of the final position of the part. This will in turn cause increased variation in the dimension of the produced final assemblies. As a result, mislocations of the pin can manifest as either a mean shift or a variance change in the dimensional quality measurements such as M1 and M2 in the figure. In the case of a mean shift error (for example due to a fixed position shift of the pin), the error can be compensated by process adjustment such as realignment of the locators. The variance change errors (for example due to a worn-out pin or the excessive looseness of a pin) cannot be easily compensated for in most cases. Also, note that each locator in the process is a potential source of the variance change errors, which is referred to as a variation source. The variation sources are random effects in the process that affect the final assembly quality. In most assembly processes, the pin wear is difficult to measure, so the random effects are not directly observed. In a modern automotive body assembly process, hundreds of locators are used to position a large number of parts and sub-assemblies. An important and challenging diagnosis problem is to estimate and identify the variation sources in the process based on the observed quality measurements.
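The difference between a mean shift and a variance change can be illustrated with a small simulation. This is a sketch under stated assumptions, not the model used later in the book: one quality measurement is assumed to respond linearly and one-to-one to the pin position error, and the offset, wobble, and noise magnitudes are hypothetical.

```python
import random
import statistics

def simulate_deviations(n, pin_offset, pin_wobble_sd, seed=0):
    """Simulate the deviation of one quality measurement over n assemblies."""
    rng = random.Random(seed)
    noise_sd = 0.05  # hypothetical measurement noise (mm)
    deviations = []
    for _ in range(n):
        # Pin position error for this assembly: fixed offset plus random wobble.
        pin_error = pin_offset + rng.gauss(0.0, pin_wobble_sd)
        # Assumed 1:1 linear effect of the pin error on the measurement.
        deviations.append(pin_error + rng.gauss(0.0, noise_sd))
    return deviations

scenarios = {
    "nominal": simulate_deviations(2000, pin_offset=0.0, pin_wobble_sd=0.0),
    "shifted pin": simulate_deviations(2000, pin_offset=0.5, pin_wobble_sd=0.0),
    "worn pin": simulate_deviations(2000, pin_offset=0.0, pin_wobble_sd=0.3),
}
summary = {name: (statistics.mean(d), statistics.stdev(d))
           for name, d in scenarios.items()}
for name, (mean, sd) in summary.items():
    print(f"{name:12s} mean={mean:+.3f}  sd={sd:.3f}")
```

The shifted pin moves the sample mean while leaving the spread close to the noise level, which is why a one-time realignment can compensate for it. The worn pin leaves the mean near zero but inflates the standard deviation, and no fixed adjustment removes that.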

Example 1.2 Random effects in battery degradation processes

In industrial applications, the reliability of a critical unit is crucial to guarantee the overall functional capabilities of the entire system. Failure of such a unit can be catastrophic. Turbine engines of airplanes, power supplies of computers, and batteries of automobiles are typical examples where failure of the unit would lead to breakdown of the entire system. For these reasons, the working condition of such critical units must be monitored and the remaining useful life (RUL) of such units should be predicted so that we can take preventive actions before catastrophic failure occurs. Many system failure mechanisms can be traced back to some underlying degradation processes. An important prognosis problem is to predict RUL based on the degradation signals collected, which are often strongly associated with the failure of the unit. For example, Figure 1.3 shows the evolution of the internal resistance signals of multiple automotive lead-acid batteries. The internal resistance measurement is known to be one of the best condition monitoring signals for the battery life prognosis [Eddahech et al., 2012]. As we can see from Figure 1.3, the internal resistance measurement generally increases with the service time of the battery, which indicates that the health status of the battery is deteriorating.

Figure 1.3 Internal resistance measures from multiple batteries over time.

We can clearly see from Figure 1.3 that, although similar, the progression paths of the internal resistance over time of different batteries are not identical. The difference is certainly expected due to many random factors in the material, manufacturing processes, and the working environment that vary from unit to unit. The random characteristics of degradation paths are random effects, which impact the observed degradation signals of multiple batteries.
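One common way to express such unit-to-unit variation is a linear path whose intercept and slope differ randomly across units: signal_i(t) = (beta0 + b0_i) + (beta1 + b1_i) * t + noise. The sketch below simulates that form and fits each unit separately by least squares. The linear shape and every numeric value are illustrative assumptions, not estimates from the battery data in Figure 1.3.

```python
import random
import statistics

def fit_line(ts, ys):
    """Ordinary least squares fit of y = intercept + slope * t."""
    t_bar, y_bar = statistics.mean(ts), statistics.mean(ys)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

rng = random.Random(1)
beta0, beta1 = 5.0, 0.10     # hypothetical population-level intercept and slope
sd_b0, sd_b1 = 0.30, 0.03    # hypothetical spread of the unit-level random effects
noise_sd = 0.05              # hypothetical measurement noise
times = list(range(21))

unit_slopes = []
for _ in range(50):          # 50 simulated units
    b0 = rng.gauss(0.0, sd_b0)   # this unit's random intercept deviation
    b1 = rng.gauss(0.0, sd_b1)   # this unit's random slope deviation
    signal = [(beta0 + b0) + (beta1 + b1) * t + rng.gauss(0.0, noise_sd)
              for t in times]
    _, slope = fit_line(times, signal)
    unit_slopes.append(slope)

mean_slope = statistics.mean(unit_slopes)
sd_slope = statistics.stdev(unit_slopes)
print(f"average fitted slope: {mean_slope:.3f}")
print(f"spread of fitted slopes across units: {sd_slope:.3f}")
```

The average fitted slope reflects what the units have in common, while the spread of the fitted slopes reflects the random effects. Fitting each unit in isolation, as done here, ignores the shared information that a mixed effects model is designed to exploit.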

The available data from multiple similar units/machines pose interesting intellectual opportunities and challenges for prognosis. As for opportunities, since we have observations from potentially a very large number of similar units, we can compare their operations/conditions, share the information, and extract common knowledge to enable accurate prediction and control at the individual level. As for challenges, because the data are collected in the field and not in a controlled environment, the data contain significant variation and heterogeneity due to the large variations in working conditions for different units. The data analytics approaches should not only be general (so that the common information can be learned and shared), but also flexible (so that the behavior of an individual unit can be captured and controlled).

Random effects always exist in industrial processes. The process variation caused by random effects is detrimental, and thus random effects should be modeled, analyzed, and controlled, particularly in system diagnosis and prognosis. However, because of limited data availability, data analytics approaches that consider random effects have not been widely adopted in industrial practice. Indeed, before the significant advancement in communication and information technology, data collection in industry often occurred locally in very similar environments. With such limited data, the impact of random effects cannot be exposed and modeled easily. This situation has changed significantly in recent years due to the digital revolution as mentioned at the beginning of the section.

The statistical methods for random effects provide a powerful set of tools for us to model and analyze the random variation in an industrial process. The goal of this book is to provide a textbook for engineering students and a reference book for researchers and industrial practitioners to adapt and bring the theory and techniques of random effects to the application area of industrial system diagnosis and prognosis. The detailed scope of the book is summarized in the next section.

1.2 Scope and Organization of the Book

This book focuses on industrial data analytics methods for system diagnosis and prognosis, with an emphasis on random effects in the system. Diagnosis concerns identification of the root cause of a failure or an abnormal working condition. In the context of random effects, the goal of diagnosis is to identify the variation sources in the system. Prognosis concerns using data to predict what will happen in the future. Regarding random effects, prognosis focuses on addressing unit-to-unit variation and making degradation/failure predictions for each individual unit, taking into account the unique characteristics of that unit.

The book contains two main parts:

Part I: Statistical Methods and Foundation for Industrial Data Analytics

This part covers general statistical concepts, methods, and theory useful for describing and modeling the variation, the fixed effects, and the random effects for both univariate and multivariate data. It provides the necessary background for the later chapters in Part II. In Part I, Chapter 2 introduces the basic statistical methods for visualizing and describing data variation. Chapter 3 introduces the concept of random vectors and the multivariate normal distribution, along with basic concepts in statistical modeling and inference. Chapter 4 focuses on the principal component analysis (PCA) method. PCA is a powerful method for exposing and describing the variations in multivariate data, and it has broad applications in variation source identification. Chapter 5 focuses on linear regression models, which are useful in modeling the fixed effects in a dataset. Statistical inference in linear regression, including parameter estimation and hypothesis testing approaches, is also discussed. Chapter 6 focuses on the basic theory of the linear mixed effects model, which captures both the fixed effects and the random effects in the data.

Part II: Random Effects Approaches for Diagnosis and Prognosis

This part covers the applications of the random effects modeling approach to diagnosis of variation sources and to failure prognosis in industrial processes/systems. Through industrial application examples, Chapter 7 presents variation source identification based on variation patterns. Chapter 8 introduces variation source estimation methods based on the linear mixed effects model, together with a detailed performance comparison of different methods for practical applications. Chapter 9 studies the diagnosability issue for the variation source diagnosis problem. Chapter 10 introduces the mixed effects longitudinal modeling approach for forecasting system degradation and predicting remaining useful life based on the first time hitting probability. Some variations of the basic method, such as the method considering a mixture prior for unbalanced data in remaining useful life prediction, are also presented. Chapter 11 introduces the concept of Gaussian processes as a nonparametric approach to the modeling and analysis of multiple longitudinal signals, and presents the application of the multi-output Gaussian process to failure prognosis. Chapter 12 introduces the method for failure prognosis that combines degradation signals and time-to-event data. It presents the advanced joint prognosis model, which integrates the survival regression model and the mixed effects regression model.

1.3 How to Use This Book

This book is intended for students, engineers, and researchers who are interested in using modern statistical methods for variation modeling, diagnosis, and prediction in industrial systems.

This book can be used as a textbook for a graduate level or advanced undergraduate level course on industrial data analytics. The book is fairly self-contained, although a background in basic probability and statistics (such as the concepts of random variable, probability distribution, and moments) and basic knowledge of linear algebra (such as matrix operations and matrix decomposition) would be useful. The appendix at the end of the book provides a summary of the necessary concepts and results in linear space and matrix theory. The materials in Part II of the book are relatively independent, so the instructor can combine selected chapters in Part II with Part I as the basic materials for different courses. For example, topics in Part I can be used for an advanced undergraduate level course on introduction to industrial data analytics. The materials in Part I and some selected chapters in Part II (e.g., Chapters 7, 8, and 9) can be used in a master’s level statistical quality control course. Similarly, materials in Part I and selected later chapters in Part II (e.g., Chapters 10, 11, and 12) can be used in a master’s level course with emphasis on prognosis and reliability applications. Finally, Part II alone can be used as the textbook for an advanced graduate level course on diagnosis and prognosis.

One important feature of this book is that we provide detailed descriptions of software implementation for most of the methods and algorithms. We adopt the statistical programming language R in this book. The R language is versatile and has a very large number of up-to-date packages implementing various statistical methods [R Core Team, 2020]. This feature makes the book well suited to practitioners in engineering fields who wish to study the statistical modeling and analysis methods on their own and implement them. All the R code and data sets used in this book can be found at the book companion website.

Bibliographic Notes

Some examples of good books on system diagnosis and prognosis in engineering are Lewis et al. [2011], Niu [2017], Wu et al. [2006], Talebi et al. [2009], Gertler [1998], Chen and Patton [2012], Witczak [2007], Isermann [2011], Ding [2008], and Si et al. [2017]. Many good textbooks are available on industrial statistics. For example, Montgomery [2009], DeVor et al. [2007], Colosimo and Del Castillo [2006], and Wu and Hamada [2011] cover statistical monitoring and design. On failure event analysis and prognosis, Meeker and Escobar [2014], Rausand et al. [2004], and Elsayed [2012] are commonly cited references.

Part I Statistical Methods and Foundation for Industrial Data Analytics
