The first book to discuss robust aspects of nonlinear regression—with applications using R software
Robust Nonlinear Regression: with Applications using R covers a variety of theories and applications of robust nonlinear regression. It discusses both the classical and robust aspects of nonlinear regression, focusing on outlier effects. It develops new methods in robust nonlinear regression and implements a set of objects and functions in the S language under S-PLUS and R. The software covers a wide range of robust nonlinear fitting and inference methods, and is designed to let users define their own nonlinear models as objects, fit models using classical and robust methods, and detect outliers. The implemented objects and functions can be applied by practitioners as well as researchers.
The book offers comprehensive coverage of the subject in 9 chapters: Theories of Nonlinear Regression and Inference; Introduction to R; Optimization; Theories of Robust Nonlinear Methods; Robust and Classical Nonlinear Regression with Autocorrelated and Heteroscedastic errors; Outlier Detection; R Packages in Nonlinear Regression; A New R Package in Robust Nonlinear Regression; and Object Sets.
Page count: 323
Year of publication: 2018
Cover
Dedication
Preface
Acknowledgements
About the Companion Website
Part One: Theories
Chapter 1: Robust Statistics and its Application in Linear Regression
1.1 Robust Aspects of Data
1.2 Robust Statistics and the Mechanism for Producing Outliers
1.3 Location and Scale Parameters
1.4 Redescending M‐estimates
1.5 Breakdown Point
1.6 Linear Regression
1.7 The Robust Approach in Linear Regression
1.8 S‐estimator
1.9 Least Absolute and Quantile Estimates
1.10 Outlier Detection in Linear Regression
Chapter 2: Nonlinear Models: Concepts and Parameter Estimation
2.1 Introduction
2.2 Basic Concepts
2.3 Parameter Estimations
2.4 A Nonlinear Model Example
Chapter 3: Robust Estimators in Nonlinear Regression
3.1 Outliers in Nonlinear Regression
3.2 Breakdown Point in Nonlinear Regression
3.3 Parameter Estimation
3.4 Least Absolute and Quantile Estimates
3.5 Quantile Regression
3.6 Least Median of Squares
3.7 Least Trimmed Squares
3.8 Least Trimmed Differences
3.9 S‐estimator
3.10 τ‐estimator
3.11 MM‐estimate
3.12 Environmental Data Examples
3.13 Nonlinear Models
3.14 Carbon Dioxide Data
3.15 Conclusion
Chapter 4: Heteroscedastic Variance
4.1 Definitions and Notations
4.2 Weighted Regression for the Nonparametric Variance Model
4.3 Maximum Likelihood Estimates
4.4 Variance Modeling and Estimation
4.5 Robust Multistage Estimate
4.6 Least Squares Estimate of Variance Parameters
4.7 Robust Least Squares Estimate of the Structural Variance Parameter
4.8 Weighted M‐estimate
4.9 Chicken‐growth Data Example
4.10 Toxicology Data Example
4.11 Evaluation and Comparison of Methods
Chapter 5: Autocorrelated Errors
5.1 Introduction
5.2 Nonlinear Autocorrelated Model
5.3 The Classic Two‐stage Estimator
5.4 Robust Two‐stage Estimator
5.5 Economic Data
5.6 ARIMA(1,0,1)(0,0,1)₇ Autocorrelation Function
Chapter 6: Outlier Detection in Nonlinear Regression
6.1 Introduction
6.2 Estimation Methods
6.3 Point Influences
6.4 Outlier Detection Measures
6.5 Simulation Study
6.6 Numerical Example
6.7 Variance Heteroscedasticity
6.8 Conclusion
Part Two: Computations
Chapter 7: Optimization
7.1 Optimization Overview
7.2 Iterative Methods
7.3 Wolfe Condition
7.4 Convergence Criteria
7.5 Mixed Algorithm
7.6 Robust M‐estimator
7.7 The Generalized M‐estimator
7.8 Some Mathematical Notation
7.9 Genetic Algorithm
Chapter 8: nlr Package
8.1 Overview
8.2 nl.form Object
8.3 Model Fit by nlr
8.4 nlr.control
8.5 Fault Object
8.6 Ordinary Least Squares
8.7 Robust Estimators
8.8 Heteroscedastic Variance Case
8.9 Autocorrelated Errors
8.10 Outlier Detection
8.11 Initial Values and Self‐start
Chapter 9: Robust Nonlinear Regression in R
9.1 Lakes Data Examples
9.2 Simulated Data Examples
Appendix A: nlr Database
A.1 Data Set used in the Book
A.2 Nonlinear Regression Models
A.3 Robust Loss Functions Data Bases
A.4 Heterogeneous Variance Models
References
Index
End User License Agreement
Appendix A
Table A.1 Chicken‐growth data (g).
Table A.2 Methane data.
Table A.3 Carbon dioxide data.
Table A.4 Lakes data.
Table A.5 Net money data.
Table A.6 Iran trademark data.
Table A.7 ntp data.
Table A.8 Cow milk production data for a single cow in a year.
Table A.9 Three cases of simulated outliers from logistic model.
Table A.10 Artificially contaminated data.
Table A.11 Robust rho functions.
Table A.12 Variance model functions.
Chapter 1
Table 1.1 Linear regression formulas.
Chapter 3
Table 3.1 The robust MM‐ and classical OLS estimates for the scaled exponential model, CH₄ data.
Table 3.2 The robust MM‐ and classical OLS estimates for the exponential convex model, CH₄ data.
Table 3.3 The robust MM‐ and classical OLS estimates for the power model, CH₄ data.
Table 3.4 The robust MM‐ and classical OLS estimates for the exponential with intercept model, CH₄ data.
Table 3.5 The robust MM‐ and classical OLS estimates for the scaled exponential model, CO₂ data.
Table 3.6 The robust MM‐ and classical OLS estimates for the scaled exponential convex model, CO₂ data.
Table 3.7 The robust MM‐ and classical OLS estimates for the power model, CO₂ data.
Table 3.8 The robust MM‐ and classical OLS estimates for the exponential with intercept model, CO₂ data.
Chapter 4
Table 4.1 Parameter estimates for chicken‐growth data using a logistic model with a power heteroscedastic function.
Table 4.2 Parameter estimates of the Hill model for chromium concentration in the kidney of mouse (values in parentheses show estimated errors).
Chapter 5
Table 5.1 The robust MM‐estimates of fitted four models for the Iran trademark data.
Table 5.2 The RTS estimates of the fitted exponential with intercept models for the Iran trademark data, with ARIMA(1,0,1)(0,0,1)7 autocorrelated structure.
Chapter 6
Table 6.1 Measures based on OLS estimates for a data set with one outlier (Case1): mixture of OLS with tangential leverage and Jacobian leverage.
Table 6.2 Measures based on MM‐estimates for a data set with one outlier (Case1): mixture of MM with tangential leverage, Jacobian leverage, and robust Jacobian leverage.
Table 6.3 Measures based on OLS estimates for a data set with three outliers (Cases 6, 7, and 8): mixture of OLS with tangential leverage and Jacobian leverage.
Table 6.4 Measures based on MM‐estimates for a data set with three outliers (Cases 6, 7, and 8): mixture of MM with tangential leverage, Jacobian leverage and robust Jacobian leverage.
Table 6.5 Measures based on OLS estimates for a data set with six outliers (last six observations in Cases 6, 7, and 8): mixture of OLS with tangential leverage and Jacobian leverage.
Table 6.6 Measures based on MM‐estimates for a data set with six outliers (last six observations in Cases 6, 7, and 8): mixture of MM with tangential leverage, Jacobian leverage, and robust Jacobian leverage.
Table 6.7 Outlier measures for lakes data, computed using the hat matrix and Jacobian leverage, and the MM‐estimator.
Chapter 9
Table 9.1 Parameter estimates for lakes data, using the nlr, nlrq, nlrob, and nls functions.
Table 9.2 Parameter estimates for artificially contaminated data, using nlr, nlrq, nlrob, and nls functions, and parameter biases.
Chapter 1
Figure 1.1 Contaminated normal densities: (a) mixture of two normal distributions with different means; (b) mixture of two normal distributions with different variances.
Figure 1.2 (a) ρ functions; (b) ψ functions.
Chapter 2
Figure 2.1 Fitted Wood model for a single cow's milk production data for a year.
Chapter 3
Figure 3.1 Simulated logistic model, artificially contaminated. (a) Logistic model: solid line, least squares estimate; dashed line, robust MM‐estimate. (b) Computed residuals. Upper chart: least squares estimate; lower chart: MM‐estimate.
Figure 3.2 Simulated linear model, artificially contaminated. (a) Least squares fit: solid line: least squares estimate; dashed line: robust MM‐estimate. (b) Computed residuals: top, least squares estimate; bottom, MM‐estimate.
Figure 3.3 Simulated nonlinear exponential model, artificially contaminated. (a) Exponential model: solid line, estimates using OLS; dotted line, LMS using Nelder–Mead optimization; dashed line, LMS using a genetic algorithm. (b) Median of error squares: objective function LMS for different parameter values.
Figure 3.4 Four models fitted to methane data using the robust MM‐estimator.
Figure 3.5 Four models fitted to carbon dioxide data using the robust MM‐estimator.
Chapter 4
Figure 4.1 Chicken‐growth data estimates using the classical multistage estimator.
Figure 4.2 Plot of the standardized residual and its square against the predictor and response to assess the adequacy of the weighted least squares of the model. Source: Riazoshams and Midi (2009). Reproduced with permission of European Journal of Science.
Figure 4.3 Graphs for choosing the model variance: (a) exponential variance model assessment; (b) power variance model assessment; (c) adequacy of estimated variance model; (d) assessment of the equality of standard deviation and estimate thereof.
Figure 4.4 Toxicology data, fitted by robust and classical methods: (a) classical estimates, (b) robust estimates.
Chapter 5
Figure 5.1 Fitted exponential with intercept model for the Iran trademark data using OLS and MM‐estimates.
Figure 5.2 Fitted exponential with intercept model for the Iran net money data using OLS and MM‐estimates.
Figure 5.3 ACF, PACF, and robust ACF for OLS and MM‐estimates. (a) MM‐estimate residuals, (b) MM‐estimate residuals, (c) OLS‐estimate residuals, (d) OLS‐estimate residuals, (e) MM‐estimate residuals, (f) OLS‐estimate residuals.
Figure 5.4 ACF, PACF, and robust ACF for OLS and MM‐estimates. The residual of the outlier was deleted from the computed residuals from model fits. (a) MM‐estimate, (b) MM‐estimate, (c) OLS‐estimate, (d) OLS‐estimate, (e) robust ACF MM‐estimate, (f) robust ACF OLS estimate.
Figure 5.5 ACF, PACF, and robust ACF for OLS and MM‐estimates. The fits were performed after the outlier had been deleted and the residuals were computed from the fitted model without the outlying point: (a) MM‐estimate, (b) MM‐estimate, (c) OLS‐estimate, (d) OLS‐estimate, (e) robust ACF, (f) robust ACF.
Figure 5.6 Model fit using OLS and MM‐estimators when the response of the outlier point was replaced by its prediction.
Figure 5.7 ACF, PACF, and robust ACF for OLS and MM‐estimates. Residuals were computed from the fitted model when the response of the outlier point was replaced by its prediction. (a) MM‐estimate, (b) MM‐estimate, (c) OLS‐estimate, (d) OLS‐estimate, (e) robust ACF, (f) robust ACF.
Chapter 6
Figure 6.1 Simulated outliers. Case 6.1: one outlier; Case 6.2: three outliers; Case 6.3: six high leverage points.
Figure 6.2 Artificial data: (a) without outliers, (b) contaminated data with outlier data points but variances are not outliers, (c) contaminated data with extremely small variance but data points are not outliers.
Figure 6.3 Artificial data, nonreplicated data.
Figure 6.4 Artificial data, replicated data.
Figure 6.5 Artificial data, replicated data: (a) without outlier, (b) with data outliers in the middle but their variance is not outlier.
Figure 6.6 Artificial replicated and nonreplicated data: (a) replicated data with small variance outlier at the end of the data series, (b) nonreplicated data with a single outlier in the middle.
Chapter 8
Figure 8.1 Plot of predicted values for the scaled exponential model, computed using initial values from the selfStart slot for carbon dioxide data.
Figure 8.2 Output objects, inheritance hierarchy.
Figure 8.3 Fitted scale exponential convex model for the carbon data, using the history=T command.
Figure 8.4 Fitted scale exponential convex model for the carbon data, using the Nelder–Mead algorithm: (a) using the default settings; (b) using modified settings, both using selfStart as initial values.
Figure 8.5 Fitted scale exponential convex model for methane gas data using least mean squares.
Figure 8.6 Fitted logistic model with the power variance function model for chicken‐growth data.
Figure 8.7 Fitted Hill model and Hill variance function model for mouse kidney data. WM: weighted M‐estimate; RME: robust multistage estimate.
Figure 8.8 Outlier detection measures for lakes data: studentized residual, Cook distance, and related measures.
Figure 8.9 Tangential plane leverage, robust Jacobian leverage and their differences for lakes data.
Figure 8.10 Fitted exponential with intercept for Iran net money data using: (a) LMS and (b) selfStart as the initial value.
Figure 8.11 Fitted power model for Iran net money data.
Chapter 9
Figure 9.1 Outlier detection measures based on manual initial values (both set to 1).
Figure 9.2 Artificially contaminated data fitted with six methods from three packages.
Hossein Riazoshams
Lamerd Islamic Azad University, Iran Stockholm University, Sweden University of Putra, Malaysia
Habshah Midi
University of Putra, Malaysia
Gebrenegus Ghilagaber
Stockholm University, Sweden
This edition first published 2019
© 2019 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Hossein Riazoshams, Habshah Midi and Gebrenegus Ghilagaber to be identified as the authors of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data
Names: Riazoshams, Hossein, 1971– author. | Midi, Habshah, author. | Ghilagaber, Gebrenegus, author.
Title: Robust nonlinear regression: with applications using R / Hossein Riazoshams, Habshah Midi, Gebrenegus Ghilagaber.
Description: Hoboken, NJ : John Wiley & Sons, 2018. | Includes bibliographical references and index. |
Identifiers: LCCN 2017057347 (print) | LCCN 2018005931 (ebook) | ISBN 9781119010456 (pdf) | ISBN 9781119010449 (epub) | ISBN 9781118738061 (cloth)
Subjects: LCSH: Regression analysis. | Nonlinear theories. | R (Computer program language)
Classification: LCC QA278.2 (ebook) | LCC QA278.2 .R48 2018 (print) | DDC 519.5/36-dc23
LC record available at https://lccn.loc.gov/2017057347
Cover Design: Wiley
Cover Image: © Wavebreakmedia Ltd/Getty Images; © Courtesy of Hossein Riazoshams
To my wife Benchamat Hanchana, from Hossein
This book is the result of the first author's research in robust nonlinear regression between 2004 and 2016, while he was affiliated with the institutions listed. The lack of computer programs, together with the mathematical developments in this area, encouraged us to write this book and to provide an R package called nlr, for which a guide is provided here. The book concentrates on applications, and thus practical examples are presented.
Robust statistics describes the methods used when the classical assumptions of statistics do not hold. It is mostly applied when a data set includes outliers that lead to violation of the classical assumptions.
The book is divided into two parts. In Part 1, the mathematical theories of robust nonlinear regression are discussed and parameter estimation for heteroscedastic error variances, autocorrelated errors, and several methods for outlier detection are presented. Part 2 presents numerical methods and R‐tools for nonlinear regression using robust methods.
In Chapter 1, the basic theories of robust statistics are discussed. Robust approaches to linear regression and outlier detection are presented. These mathematical concepts of robust statistics and linear regression are then extended to nonlinear regression in the rest of the book. Since the book is about nonlinear regression, the proofs of theorems related to robust linear regression are omitted.
Chapter 2 presents the concepts of nonlinear regression and discusses the theory behind several methods of parameter estimation in this area; the robust forms of these methods are outlined in Chapter 3. Chapter 2 also presents the generalized least squares estimate, which will be used for non‐classical situations.
Chapter 3 discusses the concepts of robust statistics, such as robustness and breakdown points, in the context of nonlinear regression. It also presents several robust parameter estimation techniques.
Chapter 4 develops the robust methods for a null condition when the error variances are not homogeneous. Different kinds of outlier are defined and their effects are discussed. Parameter estimation for nonlinear function models and variance function models are presented.
Another null condition, when the errors are autocorrelated, is discussed in Chapter 5. Robust and classical methods for estimating the nonlinear function model and the autocorrelation structure of the error are presented. The effect of different kinds of outlier are explained, and appropriate methods for identifying the correlation structure of errors in the presence of outliers are studied.
Chapter 6 explains the methods for identifying atypical points. The outlier detection methods that are developed in this chapter are based mainly on statistical measures that use robust estimators of the parameters of the nonlinear function model.
In Chapter 7, optimization methods are discussed. These techniques are then modified to solve the minimization problems that arise in robust nonlinear regression. They are used to solve the mathematical problems discussed in Part 1 of the book, and their implementation in a new R package called nlr is covered in Chapter 8.
Chapter 8 is a guide to the R package implemented for this book. It covers object definition for a nonlinear function model, parameter estimation, and outlier detection for the several model assumption situations discussed in Part 1. This chapter shows how to fit nonlinear models to real‐life and simulated data.
In Chapter 9, other R packages for robust nonlinear regression are presented and compared to nlr. Appendix A presents and describes the databases embedded in nlr, and the nonlinear models and functions available.
At the time of writing, the nlr package is complete, and is available at The Comprehensive R Archive Network (CRAN‐project) at https://cran.r‐project.org/package=nlr.
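For readers who want to follow along, the package can be obtained directly from CRAN; the sketch below shows the standard installation commands (the exact object and function names are documented in Chapter 8):

```r
# Install the nlr package from CRAN and load it (requires internet access).
install.packages("nlr")
library(nlr)
# The package help index lists the bundled model objects and functions:
help(package = "nlr")
```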
Because of the large number of figures and programs involved, there are many examples that could not be included in the book. Materials, programs, further examples, and a forum to share and discuss program bugs are all provided at the author's website at http://www.riazoshams.com/nlr and at the book's page on the Wiley website.
Response Manager, Shabdiz Music School of Iran;
Full‐time faculty member, Islamic Azad University of Lamerd, Iran;
Department of Statistics, Stockholm University, Sweden;
Institute for Mathematical Research, University Putra Malaysia
November 2017
Hossein Riazoshams
I would like to thank the people and organizations who have helped me at all stages of the research that has culminated in this book. Firstly, I would like to express my appreciation to Mohsen Ghodousi Zadeh and Hamid Koohbor for helping me collect data for the first time in 2005. This led me to a program of research in nonlinear modeling.
I would like to recognize the Department of Statistics at Stockholm University, Sweden, for financial support while writing most of this book during my stay as a post‐doctoral researcher in 2012–2014.
A special note of appreciation is also due to the Islamic Azad University of Abadeh and Lamerd for financial support in connection with collecting some materials for this book.
I would like to note my appreciation of the Institute for Mathematical Research of University Putra Malaysia for financial support during my PhD in 2007–2010 and afterwards.
I owe my gratitude to the John Wiley editing team, especially Shyamala and others, for their careful editing during the preparation of the book.
Last but by no means least, I would like to thank my wife, Benchamat Hanchana, for her great patience with the financial and physical adversity that we experienced during this research.
November 2017
Hossein Riazoshams
Don't forget to visit the companion website for this book:
www.wiley.com/go/riazoshams/robustnonlinearregression
There you will find valuable material designed to enhance your learning, including:
Figures
Examples
Scan this QR code to visit the companion website
This is an introductory chapter giving the mathematical background to the robust statistics that are used in the rest of the book. Robust linear regression methods are then generalized to nonlinear regression in the rest of the book.
The robust approach to linear regression is described in this chapter. It is the main motivation for extending statistical inference approaches used in linear regression to nonlinear regression. This is done by considering the gradient of a nonlinear model as the design matrix in a linear regression. Outlier detection methods used in linear regression are also extended for use in nonlinear regression.
In this chapter the consistency and asymptotic distributions of robust estimators and robust linear regression are presented. The validity of the results requires certain regularity conditions, which are presented here. Proofs of the theorems are very technical and since this book is about nonlinear regression, they have been omitted.
Robust statistics were developed to interpret data for which classical assumptions, such as randomness, independence, distribution models, prior assumptions about parameters and other prior hypotheses do not apply. Robust statistics can be used in a wide range of problems.
The classical approach in statistics assumes that data are collected from a distribution function; that is, the observed values x₁, …, x_n follow the joint distribution function F(x₁, …, x_n). If the observations are identically and independently distributed (i.i.d.) with common distribution F, we write x_i ~ F (the tilde sign designates a distribution). In real‐life data, these explicit or other implicit assumptions might not be true. Outlier effects are examples of situations that require robust statistics to be used for such null conditions.
Robust statistics were developed to analyse data drawn from a wide range of distributions, and particularly data that do not follow a normal distribution, for example when a normal distribution is mixed with another known statistical distribution:

F(x) = (1 − ε)Φ(x) + εH(x),   (1.1)

where ε is a small value representing the proportion of outliers, Φ is the normal cumulative distribution function (CDF) with appropriate mean and variance, and H belongs to a suitable class of CDFs. A normal distribution with a large variance can produce a wide contaminating distribution, such as:

F(x) = (1 − ε)Φ(x) + εΦ(x/k)

for a large value of k (see Figure 1.1b). A mixture of two normal distributions with a large difference in their means can be generated by:

F(x) = (1 − ε)Φ(x) + εΦ((x − μ₀)/σ₀),

where the variance σ₀² is much smaller than that of the wide contaminating distribution, and μ₀ is the mean of the shifted distribution (see Figure 1.1a). The models in this book will be used to interpret data sets with outliers. Figure 1.1a shows the CDF of a mixture of two normal distributions with different means, and Figure 1.1b shows the CDF of a mixture of two normal distributions with different variances.
Figure 1.1 Contaminated normal densities: (a) mixture of two normal distributions with different means; (b) mixture of two normal distributions with different variances.
Source: Maronna et al. (2006). Reproduced with permission of John Wiley and Sons.
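The contamination model (1.1) is easy to simulate. The following base‐R sketch (with an arbitrarily chosen ε = 0.1 and a mean shift of 9, values not taken from the text) draws a contaminated normal sample and shows how strongly the sample mean reacts compared with the median:

```r
set.seed(1)                  # reproducible example
n   <- 1000
eps <- 0.1                   # contamination proportion (assumed value)
# Mixture: with prob. 1 - eps draw from N(0, 1), with prob. eps from N(9, 1)
outlier <- runif(n) < eps
x <- ifelse(outlier, rnorm(n, mean = 9), rnorm(n, mean = 0))
mean(x)     # dragged toward the contaminating component
median(x)   # stays close to the centre of the clean component
```

The mean is pulled toward εμ₀ ≈ 0.9, while the median moves only slightly, which is the qualitative behaviour the mixture model is meant to capture.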
In this section we discuss the location and scale models for random sample data. In later chapters these concepts will be extended to nonlinear regression. The location model is a nonlinear regression model and the scale parameter describes the nonconstant variance case, which is common in nonlinear regression.
Nonlinear regression, and linear regression in particular, can be represented by a location model, a scale model, or simultaneously by a location model and a scale model (Maronna et al. 2006). Not only regression but also many other random models can be studied systematically using this probabilistic interpretation. We assume that an observation x_i depends on the unknown true value μ and that a random process acts additively as

x_i = μ + u_i,  i = 1, …, n,

where the errors u_i are random variables. This is called the location model and was defined by Huber (1964). If the errors are independent with common distribution F₀, then the outcomes x_i are independent, with common distribution function

F(x) = F₀(x − μ)

and density function f(x) = f₀(x − μ). An estimate μ̂ is a function of the observations, μ̂ = μ̂(x₁, …, x_n). We are looking for estimates that, with high probability, satisfy μ̂ ≈ μ. The maximum likelihood estimate (MLE) of μ is the function of the observations that maximizes the likelihood function (joint density):

L(x₁, …, x_n; μ) = ∏_{i=1}^{n} f₀(x_i − μ).   (1.3)

The estimate of the location can be obtained from:

μ̂ = arg max_μ ∏_{i=1}^{n} f₀(x_i − μ).

Since f₀ is positive and the logarithm function is an increasing function, the MLE of the location can be calculated from the equivalent minimization:

μ̂ = arg min_μ ∑_{i=1}^{n} (− log f₀(x_i − μ)).   (1.4)
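The equivalence between maximizing the likelihood and minimizing the sum of negative log densities can be checked numerically; a minimal base‐R sketch, assuming a normal error model so that ρ(u) = u² up to constants (the toy data are illustrative only):

```r
# Location MLE by direct minimization of sum(rho(x - mu)).
# For normal errors rho(u) = u^2 (up to constants), so the minimizer
# should coincide with the sample mean.
neg_loglik <- function(mu, x, rho) sum(rho(x - mu))
x   <- c(2.1, 3.4, 2.9, 3.0, 2.6)
fit <- optimize(neg_loglik, interval = range(x),
                x = x, rho = function(u) u^2)
fit$minimum   # close to mean(x) = 2.8
```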
If the distribution F₀ is known then the MLE will have desirable mathematical and optimality properties, in the sense that among unbiased estimators it has the lowest variance and an approximately normal distribution. In the presence of outliers, since the distribution and, in particular, the mixture distribution (1.1) are unknown or only approximately known, statistically optimal properties might not be achieved. In this situation, some optimal estimates can still be found, however. Maronna et al. (2006, p. 22) state that to achieve optimality, the goal is to find estimates that are:

- nearly optimal when F₀ is normal
- nearly optimal when F₀ is approximately normal.
To this end, since MLEs have good properties such as sufficiency, a known distribution, and minimal bias among unbiased estimators, but are sensitive to the distribution assumptions, an MLE‐type estimate of the form (1.4) can be defined. This is called an M‐estimate. As well as the M‐estimate for location, a more general definition can be developed. Let:

ρ(x) = − log f₀(x).

The negative logarithm of (1.3) can then be written as ∑_{i=1}^{n} ρ(x_i − μ).
A more sophisticated form of M‐estimate can be defined by generalizing ρ to give an estimator for a multidimensional unknown parameter θ of an arbitrary model of a given random sample x₁, …, x_n.

If a random sample x₁, …, x_n is given, and θ is an unknown p‐dimensional parameter of a statistical model describing the behavior of the data, any estimator θ̂ of θ is a function of the random sample, θ̂ = θ̂(x₁, …, x_n). The M‐estimate of θ can be defined in two different ways: by a minimization problem of the form (estimating equation and functional form are represented together):

θ̂ = arg min_θ ∑_{i=1}^{n} ρ(x_i; θ),   (1.6)

or as the solution of the equation, with its functional form,

∑_{i=1}^{n} ψ(x_i; θ) = 0,  ∫ ψ(x; θ) dF_n(x) = 0,   (1.7)

where F_n is the empirical CDF, and ρ (the robust loss function) and ψ are arbitrary functions. If ρ is partially differentiable, we can define the psi function as ψ(x; θ) = (∂/∂θ)ρ(x; θ), which is proportional to the derivative, and the results of Equations (1.6) and (1.7) are equal. In this section we are interested in the M‐estimate of the location, for which ρ(x; μ) = ρ(x − μ).
The M‐estimate was first introduced for the location parameter by Huber (1964). Later, Huber (1972) developed the general form of the M‐estimate, and the mathematical properties of the estimator were established in subsequent work (Huber 1973, 1981).
The M‐estimate of location is defined as the answer to the minimization problem:

μ̂ = arg min_μ ∑_{i=1}^{n} ρ(x_i − μ),   (1.8)

or the answer to the equation:

∑_{i=1}^{n} ψ(x_i − μ) = 0.   (1.9)

If the ρ function is differentiable, with derivative ψ = ρ′, the M‐estimate of the location (1.8) can be computed from the implicit equation (1.9).

If F₀ is a normal distribution, the ρ function, ignoring constants, is the quadratic function ρ(x) = x², and the parameter estimate is equivalent to the least squares estimate, given by:

μ̂ = arg min_μ ∑_{i=1}^{n} (x_i − μ)²,

which has the average solution μ̂ = x̄.

If F₀ is a double exponential distribution with density f₀(x) = (1/2)e^{−|x|}, the rho function, apart from constants, is the absolute value function ρ(x) = |x|, and the parameter estimate is equivalent to the least absolute value estimate, given by:

μ̂ = arg min_μ ∑_{i=1}^{n} |x_i − μ|,

which has the median solution (see Exercise 1). Apart from the mean and median, the distribution of an M‐estimate is not known in general, but its convergence properties and asymptotic distribution can be derived. The M‐estimate is defined under two different formulations: the estimating equation ∑ψ(x_i − μ) = 0, or minimization of ∑ρ(x_i − μ), where ρ is a primitive function of ψ with respect to μ. The consistency and asymptotic properties of the M‐estimate depend on a variety of assumptions. The ψ approach does not always have a unique root or an exact root, and a rule is required for selecting a root when multiple roots exist.
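Between these two extremes sits, for example, the Huber M‐estimate, whose ρ is quadratic near zero and linear in the tails. A minimal base‐R sketch via iteratively reweighted means (the tuning constant k = 1.345 and the MAD scale are conventional choices, not taken from the text):

```r
# Huber M-estimate of location, computed by iteratively reweighted means.
huber_location <- function(x, k = 1.345, tol = 1e-8, maxit = 100) {
  s  <- mad(x)        # robust scale estimate (MAD)
  mu <- median(x)     # robust starting value
  for (i in seq_len(maxit)) {
    r <- (x - mu) / s
    w <- ifelse(abs(r) <= k, 1, k / abs(r))   # Huber weights
    mu_new <- sum(w * x) / sum(w)
    if (abs(mu_new - mu) < tol) break
    mu <- mu_new
  }
  mu
}

x <- c(1:9, 1000)   # nine regular points plus one gross outlier
huber_location(x)   # near the median 5.5, while mean(x) is 104.5
```

For clean data the estimate behaves like the mean; the single gross outlier receives a weight near zero and barely moves the fit.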
Theorem 1.3 Let λ(μ) = E_F[ψ(x − μ)]. Assume that:

1. λ(μ) has a unique root μ₀;
2. ψ is continuous and either bounded or monotone.

Then the equation ∑_{i=1}^{n} ψ(x_i − μ) = 0 has a sequence of roots μ̂_n that converges in probability to μ₀.
In most cases, the equation ∑ψ(x_i − μ) = 0 does not have an explicit solution and has to be solved using numerical iteration methods. Starting from a consistent estimate μ̂₀, one step of the Newton–Raphson iteration is

μ̂₁ = μ̂₀ + (∑_{i=1}^{n} ψ(x_i − μ̂₀)) / (∑_{i=1}^{n} ψ′(x_i − μ̂₀)).

The consistency and normality of μ̂₁ are automatic, but μ̂₁ is not an exact root of ∑ψ(x_i − μ) = 0, and further iteration does not change the first‐order asymptotic properties.
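The one‐step update above can be sketched in a few lines of base R; here the Huber ψ and the median starting value are illustrative choices (residuals are left unscaled, which is adequate for data with unit‐scale errors):

```r
# One-step Newton-Raphson M-estimate of location:
#   mu1 = mu0 + sum(psi(x - mu0)) / sum(psi'(x - mu0))
psi  <- function(u, k = 1.345) pmax(-k, pmin(k, u))      # Huber psi
dpsi <- function(u, k = 1.345) as.numeric(abs(u) <= k)   # its derivative
one_step <- function(x, mu0 = median(x)) {
  r <- x - mu0
  mu0 + sum(psi(r)) / sum(dpsi(r))
}

set.seed(7)
x <- c(rnorm(50), 25)   # clean N(0, 1) sample plus one outlier at 25
one_step(x)             # stays near 0 despite the outlier
```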
Theorem 1.5 Suppose the following assumptions are satisfied:

1. λ(μ) = E_F[ψ(x − μ)] has a unique root μ₀;
2. ψ is monotone in t;
3. A(μ) = E_F[ψ²(x − μ)] exists, is finite in some neighborhood of μ₀, and is continuous at μ₀;
4. B(μ) = E_F[ψ′(x − μ)] exists, is nonzero in some neighborhood of μ₀, and is continuous at μ₀.

Then any sequence of roots μ̂_n of ∑_{i=1}^{n} ψ(x_i − μ) = 0 satisfies

√n (μ̂_n − μ₀) →d N(0, A(μ₀)/B(μ₀)²).

For a proof, see DasGupta (2008) and Serfling (2002).
Thus the location estimate from (1.9) or (1.8), under the conditions of Theorem 1.3, will converge in probability to the exact solution μ₀ of E_F[ψ(x − μ)] = 0 as n → ∞. Under the conditions of Theorem 1.5 it will have an asymptotically normal distribution:

√n (μ̂ − μ₀) →d N(0, E_F[ψ²(x − μ₀)] / (E_F[ψ′(x − μ₀)])²).
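The asymptotic variance can be estimated empirically by plugging sample averages into this sandwich formula. A sketch for the Huber ψ, evaluated at a robust location estimate (the median serves here as a stand‐in for the M‐estimate itself; for a standard normal sample and k = 1.345 the theoretical value is about 1.04):

```r
# Empirical sandwich estimate of the asymptotic variance of an M-estimate:
#   v = E[psi^2(x - mu0)] / (E[psi'(x - mu0)])^2
psi  <- function(u, k = 1.345) pmax(-k, pmin(k, u))
dpsi <- function(u, k = 1.345) as.numeric(abs(u) <= k)

set.seed(42)
x  <- rnorm(500)      # clean standard normal sample
mu <- median(x)       # stand-in for the M-estimate of location
r  <- x - mu
v  <- mean(psi(r)^2) / mean(dpsi(r))^2
v
```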