A hands-on approach to statistical inference that addresses the latest developments in this ever-growing field.

This clear and accessible book for beginning graduate students offers a practical and detailed approach to the field of statistical inference, providing complete derivations of results, discussions, and MATLAB programs for computation. It emphasizes the relevance of the material, builds intuition, and keeps a view towards modern statistical inference.

In addition to the classic subjects of mathematical statistics, topics include an intuitive presentation of the (single and double) bootstrap for confidence interval calculations; shrinkage estimation; tail (maximal moment) estimation; and a variety of methods of point estimation besides maximum likelihood, including the use of characteristic functions and indirect inference. Practical examples of all methods are given. Estimation issues associated with discrete mixtures of normal distributions, and their solutions, are developed in detail. Much emphasis throughout is on non-Gaussian distributions, including details on working with the stable Paretian distribution and fast calculation of the noncentral Student's t. An entire chapter is dedicated to optimization, covering the development of Hessian-based methods as well as heuristic/genetic algorithms that do not require continuity, with MATLAB code provided.

The book includes both theory and nontechnical discussion, along with substantial references to the literature, with an emphasis on alternative, more modern approaches. The recent literature on the misuse of hypothesis testing and p-values for model selection is discussed, and alternative model selection methods are emphasized, though hypothesis testing of distributional assumptions is covered in detail, notably for the normal distribution.

Presented in three parts (Essential Concepts in Statistics; Further Fundamental Concepts in Statistics; and Additional Topics), Fundamental Statistical Inference: A Computational Approach offers comprehensive chapters on: Introducing Point and Interval Estimation; Goodness of Fit and Hypothesis Testing; Likelihood; Numerical Optimization; Methods of Point Estimation; Q-Q Plots and Distribution Testing; Unbiased Point Estimation and Bias Reduction; Analytic Interval Estimation; Inference in a Heavy-Tailed Context; and The Method of Indirect Inference. An appendix, A Review of Fundamental Concepts in Probability Theory, keeps the book self-contained and covers advanced subjects such as saddlepoint approximations, expected shortfall in finance, calculation with the stable Paretian distribution, and convergence theorems and proofs.
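To give a flavor of the computational, simulation-based style the book advocates, the following minimal MATLAB sketch (illustrative only, and not taken from the book's listings) computes a single parametric bootstrap percentile confidence interval for the success probability in the i.i.d. geometric model, the setting of Chapter 1. It assumes the Statistics Toolbox functions geornd and quantile are available; all parameter values are arbitrary choices for the example.

% Minimal sketch (not from the book): single parametric bootstrap
% percentile c.i. for the success probability p of i.i.d. geometric data.
n = 50; p_true = 0.3; B = 1000; alpha = 0.10;   % arbitrary example values
x = geornd(p_true, n, 1) + 1;      % simulated data with support {1, 2, ...}
p_hat = 1 / mean(x);               % m.l.e. of p for the geometric model
p_boot = zeros(B, 1);
for b = 1:B
    xb = geornd(p_hat, n, 1) + 1;  % resample from the fitted model
    p_boot(b) = 1 / mean(xb);      % re-estimate on each bootstrap sample
end
ci = quantile(p_boot, [alpha/2, 1 - alpha/2])   % nominal 90% percentile c.i.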
Page count: 974
Year of publication: 2018
Cover
Preface
Part I: Essential Concepts in Statistics
Chapter 1: Introducing Point and Interval Estimation
1.1 Point Estimation
1.2 Interval Estimation via Simulation
1.3 Interval Estimation via the Bootstrap
1.4 Bootstrap Confidence Intervals in the Geometric Model
1.5 Problems
Chapter 2: Goodness of Fit and Hypothesis Testing
2.1 Empirical Cumulative Distribution Function
2.2 Comparing Parametric and Nonparametric Methods
2.3 Kolmogorov–Smirnov Distance and Hypothesis Testing
2.4 Testing Normality with KD and AD
2.5 Testing Normality with … and …
2.6 Testing the Stable Paretian Distributional Assumption: First Attempt
2.7 Two-Sample Kolmogorov Test
2.8 More on (Moron?) Hypothesis Testing
2.9 Problems
Chapter 3: Likelihood
3.1 Introduction
3.2 Cramér–Rao Lower Bound
3.3 Model Selection
3.4 Problems
Chapter 4: Numerical Optimization
4.1 Root Finding
4.2 Approximating the Distribution of the Maximum Likelihood Estimator
4.3 General Numerical Likelihood Maximization
4.4 Evolutionary Algorithms
4.5 Problems
Chapter 5: Methods of Point Estimation
5.1 Univariate Mixed Normal Distribution
5.2 Alternative Point Estimation Methodologies
5.3 Comparison of Methods
5.4 A Primer on Shrinkage Estimation
5.5 Problems
Part II: Further Fundamental Concepts in Statistics
Chapter 6: Q-Q Plots and Distribution Testing
6.1 P-P Plots and Q-Q Plots
6.2 Null Bands
6.3 Q-Q Test
6.4 Further P-P and Q-Q Type Plots
6.5 Further Tests for Composite Normality
6.6 Combining Tests and Power Envelopes
6.7 Details of a Failed Attempt
6.8 Problems
Chapter 7: Unbiased Point Estimation and Bias Reduction
7.1 Sufficiency
7.2 Completeness and the Uniformly Minimum Variance Unbiased Estimator
7.3 An Example with i.i.d. Geometric Data
7.4 Methods of Bias Reduction
7.5 Problems
Chapter 8: Analytic Interval Estimation
8.1 Definitions
8.2 Pivotal Method
8.3 Intervals Associated with Normal Samples
8.4 Cumulative Distribution Function Inversion
8.5 Application of the Nonparametric Bootstrap
8.6 Problems
Part III: Additional Topics
Chapter 9: Inference in a Heavy-Tailed Context
9.1 Estimating the Maximally Existing Moment
9.2 A Primer on Tail Estimation
9.3 Noncentral Student's t Estimation
9.4 Asymmetric Stable Paretian Estimation
9.5 Testing the Stable Paretian Distribution
Chapter 10: The Method of Indirect Inference
10.1 Introduction
10.2 Application to the Laplace Distribution
10.3 Application to Randomized Response
10.4 Application to the Stable Paretian Distribution
10.5 Problems
Appendix A: Review of Fundamental Concepts in Probability Theory
A.1 Combinatorics and Special Functions
A.2 Basic Probability and Conditioning
A.3 Univariate Random Variables
A.4 Multivariate Random Variables
A.5 Continuous Univariate Random Variables
A.6 Conditional Random Variables
A.7 Generating Functions and Inversion Formulas
A.8 Value at Risk and Expected Shortfall
A.9 Jacobian Transformations
A.10 Sums and Other Functions
A.11 Saddlepoint Approximations
A.12 Order Statistics
A.13 The Multivariate Normal Distribution
A.14 Noncentral Distributions
A.15 Inequalities and Convergence
A.16 The Stable Paretian Distribution
A.17 Problems
A.18 Solutions
References
Index
End User License Agreement
Chapter 1
Table 1.1 Comparison of three point estimators for the geometric model
Chapter 2
Table 2.1 Power of the KD and AD tests for the mixture of geometric distributions example
Table 2.2 Cutoff values for the KD and AD composite tests of normality, as a function of sample size n and significance level α, to four significant digits, based on simulation with 10 million replications
Chapter 3
Table 3.1 Mean squared error values for five models. The true model is Student's t, column T, with five degrees of freedom
Table 3.2 Mean squared error values for five models. True model is NormLap, column L, with …
Table 3.3 Mean squared error values for five models. True model is symmetric stable, column S, with …
Chapter 5
Table 5.1 Empirical coverage of one-at-a-time 95% c.i.s of mixed normal models (5.28) (left) and (5.4) (right) based on … observations
Table 5.2 Similar to Table 5.1 but for the experiment 2 and contaminated models
Table 5.3 The time required to estimate 100 of the contaminated model data sets (5.7), each with n = 100, on a standard 3.2 GHz PC, and given in seconds unless otherwise specified. All methods using the generic optimizer are based on a convergence tolerance of …, while the EM algorithm used a convergence tolerance of …. The calculation of the direct m.l.e. is just denoted by m.l.e., whereas EM indicates the use of the EM algorithm, and q-B.e. denotes the quasi-Bayesian estimator with shrinkage prior and strength w = 4
Table 5.4 Actual coverage of nominal one-at-a-time c.i.s based on the bootstrap, for four models and two estimation methods
Table 5.5 Similar to Table 5.4 but using the qB(1) and qB(4) estimation methods
Chapter 6
Table 6.1 Coefficients in regression (6.3)
Table 6.2 Comparison of power for various normal tests of size 0.05, using the two-component mixed normal distribution as the alternative, obtained via simulation with 1 million replications for each model, and based on two sample sizes, … and …. Model #0 is the normal distribution, used to serve as a check on the size. The entry with the highest power for each alternative model and sample size appears in bold. Entries with power lower than the nominal/actual value of 0.05, indicating a biased test, are given in italic
Table 6.3 Correlation between tests for normality under the null, using sample size …, based on 1 million replications. For the χ² test, nine bins were used
Table 6.4 The power of various tests for Laplace, against the Gaussian alternative, for … and …
Chapter 7
Table 7.1 … as a function of …, for Problem 16
Chapter 8
Table 8.1 Comparison of lengths of 95% c.i.s for … in the normal model with sample size …
Table 8.2 Quantiles and lengths for the equal-tail (et) and minimal-length (min) 95% c.i.s for …
Table 8.3 Accuracy of the 95% c.i. of … that conditions on the observed sum. EC is the empirical coverage proportion, with lo and hi denoting the endpoints of the asymptotic 95% c.i. for EC
Chapter 9
Table 9.1 Actual sizes of the nominal 5% and 1% … test (9.11) for data of length … with tail index …. The entries in italics (for … and 1000) make use of the adjustment procedure via multiplicative factor …. The remaining entries do not use the adjustment procedure
Table 9.2 Actual sizes of the nominal 5% and 1% … test (9.15) for data of length … with tail index …
Table 9.3 Actual sizes of the nominal 5% and 1% … test (9.16) for data of length … with tail index …. The rows labeled … and … indicate the use of the true value of … instead of using … and linear interpolation into the constructed table of …-values
Table 9.4 Actual sizes of tests, for i.i.d. symmetric stable data with tail index …, for sample size …, using the nominal size of 5%
Table 9.5 Power against the Student's t alternative, for degrees-of-freedom values … and sample size …, using the nominal size of 5%
Table 9.6 Power against the mixed normal alternative, with p.d.f. …, for second component scale values …, using the nominal size of 5%
Table 9.7 Power against the NIG alternative, with p.d.f. (9.17), using …, …, …, and shape values …, for the nominal size of 5%
Table 9.8 Power against the GAt alternative using … and … (and …, …, though the location and scale terms are irrelevant for power considerations), and shape values …, for nominal size of 5%
Table 9.9 Actual sizes of the … and ALHADI nominal 5% tests, as designed for symmetric stable data but applied to asymmetric stable data, using 10,000 replications, and based on … and …, using sample size …, and ignoring the asymmetry
Table 9.10 Similar to Table 9.9, showing actual size for a nominal size of 5%, again based on sample size … and using 10,000 replications, but accounting for asymmetry by having applied transform (9.19). Also shown are the actual sizes of the combined test … and l.r.t.
Table 9.11 For … and nominal size 5%, power values against asymmetric alternatives of the …, ALHADI, combined …, and l.r.t. (9.18) tests, using transform (9.19); and, in the last row, l.r.t. (9.21). The left panels show the power for the noncentral Student's t based on … degrees of freedom and noncentrality (asymmetry) parameters …. The center panels use the asymmetric NIG (9.17) with NIG shape parameters … and …. The rightmost column is for the IHS distribution (9.20) with … and …
Appendix A
Figure A.1 (a) True … expected shortfall of a standard skew normal random variable as a function of asymmetry parameter … (solid) and its s.p.a. based on (A.114) (dashed). (b) The relative percentage error of the s.p.a. based on (A.114) (denoted SPA1) and that of the less accurate of two second-order s.p.a.s (denoted SPA2) developed in Broda et al. (2017).
Figure A.2 The vertical axis is …, and the horizontal axis is …. This graphically verifies that …, where … is the region above the line … (left plot), … is the region indicated by horizontal lines, and … is the region indicated by vertical lines.
Figure A.3 The function …, where … is given in (A.291).
Figure A.4 (a) Asymmetric stable p.d.f. for … and two values of …. (b) Discrepancy between the two computation methods using the … case.
Figure A.5 Expected shortfall for … as a function of …, using … (solid) and … (dashed).
Figure A.6 (a) The exact expected shortfall (solid lines) and its saddlepoint approximation (dashed lines) as a function of … for an … random variable (truncated at … for visibility reasons), and three values of …. (b) The relative percentage error of the s.p.a., shown up to …. The relative percentage error is symmetric about ….
Figure A.7 Exact and s.p.a. skew normal p.d.f. with ….
Figure A.8 (a) The true (solid) and second-order s.p.a. (dashed) of the convolution of two independent skew normal r.v.s with …, …, …, and …, …, and …. (b) The relative percentage error of the first (solid) and second-order (dashed) s.p.a.
Figure A.9 Venn diagram for events such that …, …, and ….
Figure A.10 Mass functions (A.344) (solid lines) and kernel density estimates of the simulated density (dashed lines) based on 10,000 replications, for (from left to right) …, …, and ….
Figure A.11 Theoretical density and kernel density from simulation, as well as fitted beta, for … and two values of ….
Chapter 1
Figure 1.1 Distribution of point estimators … (a), … (b), and … (c) using output from the program in Listing 1.1 with … and …, based on simulation with 10,000 replications.
Figure 1.2 Histogram of point estimator … for … and four values of …, based on simulation with 1 million replications.
Figure 1.3 The m.s.e. of estimators … (lines) and … (lines with circles) for parameter … in the geometric model, as a function of …, for three sample sizes, obtained by simulation with 100,000 replications.
Figure 1.4 Simulations of … for …, …, for … and … (a) and … (b), based on 10,000 replications.
Figure 1.5 Mapping between nominal and actual coverage probabilities for c.i.s of the success parameter in the i.i.d. Bernoulli model, based on the (single) parametric bootstrap, each computed via simulation with 100,000 replications.
Figure 1.6 (a) Actual coverage, based on simulation with 10,000 replications, of nominal 90% c.i.s using the (single) nonparametric bootstrap (with …). Graph is truncated at 0.6, with the actual coverage for … and … and … being about …. (b) Same but using the modified c.i. in (1.5) and (1.6).
Figure 1.7 Same as Figure 1.6(b), but using different numbers of bootstrap replications.
Figure 1.8 (a) Actual coverage of nominal 90% c.i.s using the double bootstrap (truncated at 0.3), based on 1000 replications. (b) Same but using the modified c.i. in (1.5) and (1.6) applied to each simulated data set and to each bootstrap sample in the outer bootstrap loop.
Figure 1.9 Similar to Figure 1.5 (mapping between nominal and actual coverage probabilities for c.i.s of the success parameter in the i.i.d. Bernoulli model) except that, instead of using the single bootstrap for the c.i.s, this uses the analytic method. In (b), actual coverage for a given … is identical to that for ….
Figure 1.10 Actual coverage of nominal 90% c.i.s using the double bootstrap with inner loop replaced by the analytic c.i., and having used the modification (1.5) and (1.6) in the outer bootstrap loop. (a) uses …; (b) uses ….
Figure 1.11 Mapping between nominal and actual coverage probabilities for c.i.s of the success parameter … in the i.i.d. geometric model, using the parametric bootstrap. Based on 100,000 replications and ….
Figure 1.12 Same as Figure 1.11, also with … and …, but with the nonparametric bootstrap (NPB).
Figure 1.13 (a) Actual coverage of the three types of c.i.s (lines), along with the true nominal coverage, …, from (1.7), as dark circles. (b) The average length of the c.i.s.
Chapter 2
Figure 2.1 The true distribution, obtained via simulation with 10,000 replications, of the Kolmogorov–Smirnov goodness-of-fit test statistic, and its asymptotic distribution (2.7) for the standard (location-zero, scale-one) normal (a) and Cauchy (b) distributions.
Figure 2.2 … in (2.15) versus … based on simulation.
Figure 2.3 (a) The e.c.d.f. (solid) based on 50 observations and true c.d.f. (dotted) of an … distribution. (b) Same, but adds horizontal 95% error bounds obtained by simulation of order statistics using the true … model. The horizontal line at the 34th order statistic just serves as a reminder that the bounds are to be understood horizontally.
Figure 2.4 The same e.c.d.f. (solid) and true … c.d.f. (dotted) as in Figure 2.3 but with 95% nonparametric bootstrap c.i.s (a) and 95% asymptotic c.i.s (b).
Figure 2.5 (a) The e.c.d.f. (solid) based on 50 observations and true c.d.f. (dashed) of a … distribution. (b) Same, but adds vertical 95% error bounds (these are not c.i.s) obtained by simulation of order statistics using the true … model.
Figure 2.6 The same e.c.d.f. (solid) and true … c.d.f. (dotted) as in Figure 2.5 but with 95% nonparametric bootstrap c.i.s (a) and 95% asymptotic c.i.s (b).
Figure 2.7 The m.s.e. comparison of the three estimators: nonparametric (solid), parametric using the m.l.e. … (dashed), and parametric using the efficient estimator … (dash-dotted), as a function of …, using sample size …, for estimating the probability of getting pregnant within (up to, and including) 4 (top) and 8 (bottom) months, where in the graphics moi stands for “month of interest.” Based on simulation with 100,000 replications.
Figure 2.8 Comparison of behavior of correctly specified (top) and misspecified (bottom) fitted c.d.f.s.
Figure 2.9 Top: Actual coverage of 95% parametric (solid) and nonparametric (dashed) bootstrap c.i.s for … as a function of sample size … under study A (left) and study B (right). Bottom: Same, but for 90% c.i.s.
Figure 2.10 Distribution of the KD statistic for the i.i.d. … model with … observations, with marked cutoff values …, … and ….
Figure 2.11 The actual acceptance probability … (vertical axis) versus the nominal probabilities … (solid line), for the KD statistic and the geometric pregnancy example, for … and … (a) and … (b). The dashed line indicates the case when nominal and actual are equal.
Figure 2.12 (Top) Power of the KD test for normality, using significance level …, for three different sample sizes, and the Student's t alternative (left) and skew normal alternative (right), based on 1 million replications. (Middle) Same, but for the AD test. (Bottom) Same, but power of the … (lines without circles) and … (lines with circles) tests for normality. The … and … power curves for the Student's t alternative are graphically indistinguishable.
Figure 2.13 Actual size of the four tests, for nominal size 0.05, based on 10,000 replications.
Figure 2.14 (a) Boxplots of … resulting when estimating all four parameters of the stable model, but with the data generated as Student's t with various degrees of freedom. (b) Power of the proposed set of tests against a Student's t alternative, for various degrees of freedom, and based on 10,000 replications.
Figure 2.15 The true distribution, obtained via simulation with 10,000 replications, of the Kolmogorov–Smirnov two-sample goodness-of-fit test statistic, and its asymptotic distribution (2.7) for the normal (a) and Cauchy (b) distributions.
Chapter 3
Figure 3.1 Standardized log-likelihoods (solid) and quadratic approximation (3.4) (dashed) for the Poisson with … (a) and … (b), with solid and dashed vertical lines showing the m.l.e. and true parameter, respectively.
Figure 3.2 Percentage error of (3.26) when using the Laplace approximation to the … function, for … (a) and … (b), for sample sizes …, and … (lines from top to bottom). The …-axis indicates the value of … in (3.26).
Figure 3.3 (a) Density of … for … and …, and …. (b) Bias of … as given in (3.25) and (3.28), for … (solid), … (dashed), and … (dash-dotted), as a function of …. There is no graphical difference when using the Laplace approximation for the … function instead of its exact values.
Figure 3.4 (a) The traditional Mahalanobis distances (3.33) based on the m.l.e.s … and … for the 1945 observations of the returns on the components of the DJIA 30 index. Fifteen percent of the observations lie above the cutoff line. (b) Similar, but having used the robust Mahalanobis distance (3.34) based on the mean vector and covariance matrix from the m.c.d. method, resulting in 33% of observations above the cutoff line.
Figure 3.5 Kernel density estimate using 10,000 replications of the coefficient of variation based on … and … (solid) and the asymptotic normal distribution (dashed), for … (a) and … (b).
Chapter 4
Figure 4.1 Kernel density estimates of the m.l.e. of scale parameter … based on … i.i.d. Cauchy observations, …, 50, and 100. The larger … is, the more mass is centered around the true value of ….
Figure 4.2 Estimation results for the location parameter of a Cauchy model and illustration of a likelihood with multiple roots.
Figure 4.3 Comparison of the m.l.e. of … and the median of Cauchy samples with … (a) and … (b).
Figure 4.4 The m.s.e. of … versus … as an estimator of the location parameter … of Student's t data with known scale 1 and degrees of freedom 1 (a), 3 (b), 10 (c), and 50 (d), based on a sample size of … observations. The vertical axis was truncated to improve appearance. The dashed line in each plot is the m.s.e. of ….
Figure 4.5 The top left panel plots … versus … for …, each obtained via simulation using 25,000 replications. The top right is the same, but using a log scale. The bottom panels show the least squares residuals for the linear (left) and quadratic (right) fits for ….
Figure 4.6 Same as the top right panel in Figure 4.5 but for three additional sample sizes.
Figure 4.7 Daily returns for the NASDAQ index.
Figure 4.8 (a) Kernel density (solid) and fitted Student's t density (dashed) of the NASDAQ returns. (b) Simulation results of the m.l.e. for the Student's t model, based on … observations and true parameter values taken to be the m.l.e. of the … model for the NASDAQ returns. The boxplots show their differences.
Figure 4.9 Convergence of the method of iterating on the score functions (a), method of steepest descent (b), and the BFGS algorithm (c), for the log-likelihood of a … sample of size 100. The number of iterations required to arrive at the m.l.e. of … with the same accuracy was 56, 16, and 11, respectively.
Figure 4.10 (a) Kernel density (dashed) and fitted GAt density of the NASDAQ returns. (b) Same, but showing only the left tail and including the Student's t fit (dash-dotted).
Figure 4.11 Evolution of the DE population over time for selected iteration states, showing (from left to right, top to bottom) iterations 1, 10, 20, 30, 40, and 46.
Figure 4.12 Evolution of the CMAES population over time for selected iteration states, showing (from left to right, top to bottom) iterations 1, 5, 10, 15, 20, and 28.
Chapter 5
Figure 5.1 (a) Mixed normal density with parameters (5.4) shown as the solid line. The two components (multiplied by their respective mixture weights) are shown as dashed lines. (b) Simulated data set from model (5.4) using … and seed value 50, illustrating the possibility of an outlier.
Figure 5.2 (a) Realization of model (5.4) with …. (b) Fitted models (5.10), both being from local maxima of the likelihood.
Figure 5.3 Using showcase model (5.4) with … and the data set shown in Figure 5.1b, the plots show the true density (thick solid line; the same as in Figure 5.1a) and 100 fitted densities (thin lines), each having used a different (randomly chosen) starting value. The box constraint numbers … are given in (5.11), and increasingly place more restrictions on the allowable parameter space.
Figure 5.4 (a) Of the 100 fitted densities shown in Figure 5.3, this shows the one corresponding to the highest likelihood, for each of the four constraints. (b) Same, but from the densities shown in Figure 5.5(a)–(d), which are based on the …-rounded likelihood for ….
Figure 5.5 Same as Figure 5.3 but using the …-rounded likelihood from (3.1) with … (a–d) and … (e–h).
Figure 5.6 (a) Same as Figure 5.3a but having used 1000 instead of 100 fitted densities. (b) Same, but having used the EM algorithm (which implicitly imposes the same constraints on the … and … as does our constraint 0) with 1000 fitted densities.
Figure 5.7 Comparison of log total m.s.e. for … (leftmost boxplot in all six panels) and … from (5.19), using shrinkage form (5.20), for … (left) and … (right), for a grid of values of …, dictating the strength of the shrinkage. The top panels are for the showcase constellation (5.4) with … observations. The middle and bottom panels correspond to (5.21) and (5.22), respectively. The simulation is based on 1000 replications. The horizontal dashed lines show the median m.l.e. value of … from (5.6). The other dashed line traces the mean of ….
Figure 5.8 Comparison of total m.s.e. for … (leftmost boxplot in both panels) and … using, for the latter, prior (5.27) with varying strength …. The simulation is based on 1000 replications. The horizontal dashed lines show the median m.l.e. value of … from (5.6). The other dashed line traces the mean of ….
Figure 5.9 Bias of the m.l.e. (left half of both panels) and q.B.e. (right half of both panels) based on …, for two sample sizes … (a) and … (b), all based on 1000 replications.
Figure 5.10 (a) For the contaminated normal model (5.7) with …, measure … from (5.6) for the m.l.e. computed via the direct method (denoted MLE), the m.l.e. computed via the EM algorithm (denoted EM), the m.m.e. restricted to have equal means, and the unrestricted m.m.e., for the 824 out of 1000 data sets for which the unrestricted (and restricted) m.m.e. existed. The horizontal dashed line shows the median m.l.e. value of …. (b) Same, but using sample size ….
Figure 5.11 (a) For showcase model (5.4), measure … from (5.6) for the m.l.e. and the m.m.e., using …, and based on 1000 replications. (b) Same, but using the four goodness-of-fit measures in Section 5.2.2.
Figure 5.12 Same as Figure 5.11, but using the contaminated normal model (5.7) with ….
Figure 5.13 (a) Same as Figure 5.11 but using the m.l.e. and q.l.s. estimators for several values of …, applied to the showcase model (5.4) for …. (b) Same, but for the … estimator for several values of ….
Figure 5.14 Left: Similar to Figure 5.13a, using the … estimator applied to our showcase model (5.4), … replications and sample size …, for fixed number of bins …, and penalized according to (5.36) for … and two sets of …-values (top and bottom). Right: Same, but for the contaminated normal model (5.7), with …, …, ….
Figure 5.15 (a) Same as Figures 5.11 and 5.13, but using the m.l.e. and the empirical m.g.f. estimator (5.37) for several values of …. Q&R denotes the use of …, with the … being those suggested by Quandt and Ramsey (1978). (b) Same, but using the model for experiment 2 in (5.21).
Figure 5.16 (a) Similar to Figure 5.15(b), but using the empirical m.g.f. estimator, with …, with shrinkage, for … and a set of shrinkage values …, as in (5.36), and based on 1000 replications. The …-axis gives the value of … times …, that is, the values of … are very close to zero. (b) Same, but for the contaminated normal model (5.7), with …, ….
Figure 5.17 Horse race between the various methods of estimation for the models considered throughout the chapter. All are based on … replications and sample size …, except experiment 4, which uses ….
Figure 5.18 Mean squared error, as a function of …, based on simulation with 1000 replications, of … (a) and … (b) for the m.m.e. using two moment equations, with …. It is based on data …, …, where …, with …, and … and … are to be estimated. True values are …, ….
Figure 5.19 Boxplot of 1000 values of … using the Tailx estimator (and the last boxplot being the McCulloch quantile estimator), using the true values of …, … and sample size … as indicated in the titles of the plots.
Figure 5.20 Measure … from (5.6) for the m.l.e. and empirical m.g.f. estimator (with …) using the contaminated normal model (5.7) but for different values of ….
Chapter 6
Figure 6.1 Q-Q plots for the same Cauchy data set, just differing by the range on the x- and y-axes.
Figure 6.2 Q-Q plot for a random … sample of size … with 10% and 5% pointwise null bands obtained via simulation (top panels), using the estimated parameters (left) and the true parameters (right) of the data. The bottom panels are similar, but based on the asymptotic distribution in (A.189).
Figure 6.3 Q-Q plots with pointwise null bands, using a size of 0.05, for the same Cauchy data as shown in Figure 6.1.
Figure 6.4 The mapping between pointwise and simultaneous significance levels, for normal data (a) and Weibull data (b) using sample size ….
Figure 6.5 Power of the Q-Q test for normality, for three different sample sizes, and Student's t alternative (a) and skew normal alternative (b), based on simulation with 1000 replications.
Figure 6.6 (a) Cauchy S-P plot with null bands, obtained via simulation, using a pointwise significance level of 0.01. (b) Same, but using the horizontal format.
Figure 6.7 (Top) Stabilized P-P plot using the same random … sample of size … as in Figure 6.2 with 10% and 5% pointwise null bands obtained via simulation, using the estimated parameters (left) and the true parameters (right) of the data. (Bottom) Same as top, but with constant-width null bands.
Figure 6.8 Same as Figure 6.7, but plotted in horizontal format.
Figure 6.9 (a) The solid, dashed, and dash-dotted lines are the widths for the pointwise null bands of the normal MSP plot, as a function of the pointwise significance level …, computed using simulation with 50,000 replications. The overlaid dotted curves are the same, but having used the instantaneously computed approximation from (6.4) and (6.3). There is no optical difference between the simulation and the approximation. (b) For the normal MSP plot, the mapping between pointwise and simultaneous significance levels using sample size ….
Figure 6.10 Power of the MSP test for normality, for three different sample sizes, and Student's t alternative (a) and skew normal alternative (b), based on 1 million replications.
Figure 6.11 Kernel density and fitted skew normal distribution of sample size … times the MSP test statistic (6.6), computed under the null, and based on 1 million replications.
Figure 6.12 One million p-values from the MSP test with …, under the null (a) and for a Student's t with … degrees of freedom alternative (b).
Figure 6.13 Normal Fowlkes-MP (left) and normal MSP (right) plots, with simultaneous null bands, for normal data (top) and mixed normal data (bottom).
Figure 6.14 Power of the Fowlkes-MP test for normality, for three different sample sizes, and Student's t alternative (a) and skew normal alternative (b), based on 1 million replications.
Figure 6.15 The average and smallest p-values of the MSP univariate test of normality from Section 6.4.3, for the 30 stocks comprising the DJIA, in each of the two separated mixed normal components, and based on moving windows of sample size ….
Figure 6.16 Kernel density estimate (solid) of the log of the JB test statistic, under the null of normality and using a sample size of …, based on 10 million replications (and having used Matlab's ksdensity function with 300 equally spaced points). (a) Fitted GAt density (dashed). (b) Fitted noncentral … (dashed) and asymmetric stable (dash-dotted).
Figure 6.17 Simulated p-values of the JB test statistic, based on 1 million replications, using the GAt approximation (a) and the two-component GAt mixture (b).
Figure 6.18 Power of the JB test for normality, for three different sample sizes, and Student's t alternative (a) and skew normal alternative (b), based on 100,000 replications.
Figure 6.19 Power of the Ghosh (1996) test for normality, for three different sample sizes, and Student's t alternative (a) and skew normal alternative (b), based on 100,000 replications.
Figure 6.20 Power of the KL1 test for normality, for three different sample sizes, and Student's t alternative (a) and skew normal alternative (b), based on 100,000 replications.
Figure 6.21 Power of the Torabi et al. (2016) (TMG) test for normality, for three different sample sizes, and Student's t alternative (a) and skew normal alternative (b), based on 100,000 replications.
Figure 6.22 Histograms of 1000 … values from the p.i.t., using 30 bins and … the normal c.d.f. with mean and variance parameters estimated from the data.
Figure 6.23 Actual size of the χ² test as a function of the number of bins …, used as a composite test of normality (two unknown parameters), based on 100,000 replications, using the built-in Matlab function chi2gof (solid), and the custom implementation in Listing 6.9, using the asymptotically valid cutoff values from the χ² distribution (dashed), and cutoff values obtained via simulation (dash-dotted).
Figure 6.24 The power of the χ² test for normality, against Student's t alternatives with various degrees of freedom and for four sample sizes …, with nominal size …, using the method from Listing 6.9 with simulated cutoff values, and based on 1 million replications.
Figure 6.25 Same as Figure 6.24 but using the skew normal as the alternative, with various asymmetry parameters … and sample sizes ….
Figure 6.26 (a) Power of the size 0.05 Pearson χ² test for normality, based on 100,000 replications, using three bins (the optimal number, as indicated in Figure 6.24) and simulated cutoff values, for three different sample sizes, and Student's t alternative. (b) Same but using 11 bins (the compromise value from Figure 6.25) and the skew normal alternative.
Figure 6.27 Power of the MSP, JB, and joint tests using … and 100,000 replications.
Figure 6.28 (a) The power of the JB test (6.7) against the alternative of a Student's t (lines with circles; same power curves as given in Figure 6.18(a)), along with the power of the likelihood ratio test (using the Student's t as the specific alternative), based on 10,000 replications. (b) The power of the MSP test for normality against the alternative of skew normal (lines with circles; same power curves as in Figure 6.10(b)), along with the power of the likelihood ratio test (using the skew normal as the specific alternative), based on 10,000 replications.
Figure 6.29 (a) (Left) Histogram of an i.i.d. normal sample of size …; (right) its MSP plot. (b) Similar, but using a resample from the original data set.
Figure 6.30 (a) The nonparametric bootstrap distribution of the p-value based on a random normal sample of size …, using the MSP test and … resamples. Its p-value is marked with the vertical line. (b) Same, but for a different random normal sample.
Figure 6.31 (a) Scatterplot based on 10,000 replications, with the x-axis showing the p-value … from the MSP test, using a data set from the null, for …, and the y-axis showing the fraction of bootstrap p-values (…), based on that data set, that were less than 0.05. The lines were obtained from quantile regression using regressors a constant, …, … and …. (b) Similar to the top panel, but the scatterplot corresponds to points obtained using a skew normal alternative with …, …, but the lines are the same as those in the top panel, that is, correspond to the quantiles under the null.
Figure 6.32 Similar to Figure 6.31 but using the JB test, and only showing the median and 95% quantile fitted lines.
Figure 6.33 (a) Power of the Laplace … test against the normal distribution as a function of …, for …, for three different test sizes (see legend in the bottom panel). (b) Power of the … test for Laplace with … against Student's t alternative with … degrees of freedom, for three different test sizes.
Figure 6.34 The mapping between pointwise and simultaneous significance levels, for the Laplace Q-Q test using sample size …, with the actual points obtained from the simulation (circles) and the regression line with intercept, linear, and quadratic term (dashed).
Figure 6.35 (a) The exact and second-order s.p.a. p.d.f. (on the half-line) of …, where …, …. (b) The p.d.f. (on the half-line) of the standardized sum of 30 independent Laplace r.v.s, using (positive) random values as scale parameters, and the central limit theorem approximation.
Figure 6.36 For the same data set used in Figure 6.2, the top panel shows the normal Q-Q plot using the correct pointwise significance level … to obtain a simultaneous one of 0.05. The bottom uses the value of … determined using the “fast but wrong” method.
Figure 6.37 Normal Q-Q plots with size 0.05 as in Figure 6.36, but using a random sample of 50 observations from a Student's t distribution with 3 degrees of freedom.
Chapter 7
Figure 7.1 Bias (a) and m.s.e. (b) for estimators … (solid), … (dashed) and … (dash-dotted) for sample size … for the model in Example 7.15.
Figure 7.2 Bias (a) and m.s.e. (b) as a function of sample size … for estimators of parameter … in the discrete uniform example, for the m.l.e. (solid), u.m.v.u.e. (dashed), m.m.e. (dash-dotted) and bias-adjusted estimator (dotted). The m.s.e. of the u.m.v.u.e. and bias-adjusted estimator are graphically indistinguishable.
Figure 7.3 Variance of … (the u.m.v.u.e. for …) (solid) and the CRlb (dashed), as a function of ….
Figure 7.4 Bias for the m.l.e. of the geometric distribution parameter for sample size ….
Figure 7.5 Illustration of how the mean-adjusted estimator is determined. The graph shows the function … for …. If the observed value of …, …, is …, then, as indicated with arrows in the figure, ….
Figure 7.6 Based on output from the program in Listing 7.1, this shows the mean and median bias, and the m.s.e., of the five estimators: the m.l.e. … given in (3.25) (denoted MLE in the legend); the mean-bias-adjusted estimator … given in (7.24) (ADJ); the median-unbiased estimator … given in (7.26) (MED); the u.m.v.u.e. … given in (7.25) (UNB); and the mode-adjusted estimator … given in (7.28) (MOD), as a function of …, based on 10,000 replications, for … (left) and … (right).
Figure 7.7 Based on output from the program in Listing 7.1, this shows kernel density estimates of the five estimators: the m.l.e. … given in (3.25) (denoted MLE in the legend); the mean-bias-adjusted estimator … given in (7.24) (ADJ); the median-unbiased estimator … given in (7.26) (MED); the u.m.v.u.e. … given in (7.25) (UNB); and the mode-adjusted estimator … given in (7.28) (MOD), for … (left) and … (right), based on 10,000 replications, for … (top) and … (bottom). The vertical dashed line indicates the true value of ….
Figure 7.8 Same as Figure 7.6 but with overlaid results, as the new, thicker line, corresponding to the properties of the estimator resulting from taking the value … if … is less than 0.5, and … otherwise.
Figure 7.9 Same as Figure 7.6 but with overlaid results, as the new, thicker line, corresponding to the estimator (7.31).
Figure 7.10 The bias (a) and m.s.e. (b) of the m.l.e. … (solid), the jackknife estimator … (dashed), and the unbiased estimator … given in (7.25) (dash-dotted), based on sample size … and 50,000 replications. The smoothness of the curves is obtained by using the same seed value when generating the data for each value of …, but this is almost irrelevant given the large number of replications.
Figure 7.11 For Problem 19, this shows the bias (left) and m.s.e. (right) of the m.l.e. … (solid), the jackknife … (dashed) and (7.12) (dash-dotted), as a function of …, based on 2000 replications, for … (top) and … (bottom).
Chapter 8
Figure 8.1 Simulated lengths of 95% c.i.s for … in the … model assuming … unknown.
Figure 8.2 Length of the c.i. (8.4) for parameter … of the location exponential model.
Figure 8.3 Left: The normal c.d.f. … versus … for … (solid), … (dashed) and … (dash-dotted). Middle: … for … (solid), … (dashed) and … (dash-dotted). Right: … versus ….
Figure 8.4 Left: The c.d.f. of … in Example 8.9 for … (solid), … (dashed) and … (dash-dotted). Right: Ratio of pivotal method c.i. length to c.d.f. inversion method c.i. length versus ….
Figure 8.5 The length of the 90% c.i. of … in Example 8.14, with … and …, as a function of ….
Chapter 9
Figure 9.1 Moment plots for … (top, from left to right) and … (bottom, from left to right) for 2000 simulated i.i.d. Student's t realizations.
Figure 9.2 Moment plots for … (from left to right) for the NASDAQ return series.
Figure 9.3 (a) Estimated values of the degrees-of-freedom parameter for the Student's t distribution, but for data sets that are symmetric stable with tail index 1.6. (b) Estimated values of tail index … for the symmetric stable distribution, but the data are Student's t with … degrees of freedom.
Figure 9.4 Thin, solid lines correspond to the 30 estimated tail index values … for the location–scale asymmetric stable Paretian model, while the thick, empty boxes correspond to the 30 estimated degrees of freedom values … for the (symmetric) location–scale Student's t model.
Figure 9.5 Hill estimates as a function of …, known as Hill plots, for simulated Pareto, symmetric stable Paretian, and Student's t data, with tail index for Pareto and stable …, and Student's t degrees of freedom … (a) and …, … (b), based on sample size ….
Figure 9.6 Comparison of four estimators of the NCT distribution based on 10,000 replications, two sample sizes, and two parameter constellations. (a) Corresponds to …, …, …; (b) to …, …, …. True parameter values are indicated by vertical dashed lines. The m.l.e.-based distributions are optically almost indistinguishable.
Figure 9.7 Performance comparison via boxplots of the Hint, McCulloch, and ML estimators of tail index … for i.i.d. symmetric stable Paretian data based on sample size ….
Figure 9.8 Comparison of the small-sample distribution of the McCulloch and maximum likelihood estimators of the parameters of the … model for an i.i.d. data set with … and …, based on values …, …, …, and ….
Figure 9.9 Mean squared error of … for the McCulloch estimator (solid) and m.l.e. (dashed) for … (a) and … (b), for the i.i.d. model with … observations and … distribution. For both McCulloch and the m.l.e., all four parameters are assumed unknown and are estimated.
Figure 9.10 Mean squared error of … for the McCulloch estimator (solid), the Hint estimator (9.7) (dashed), the m.l.e. (dash-dotted), and the method of moments estimator … from Example 5.6 (circles), for … (a) and … (b), for the i.i.d. model with … observations and … distribution. For the m.l.e., maximization was done only with respect to …; parameters …, … and … were fixed at their known values.
Figure 9.11 First row: Kernel density, based on 10,000 replications, of the McCulloch estimator (left) and the Kogon and Williams (1998) empirical c.f. estimator (right) of …, for sample size …. Second row: Same, for the m.l.e. of …, but based on only 1000 replications, using the FFT method to calculate the stable density (and, thus, the log-likelihood) (left) and the fast spline-approximation routine for the stable density provided in Nolan's toolbox (function stableqkpdf) (right). Third and fourth rows: The bottom four panels are the same as the top four, but using … observations.
Figure 9.12 (a) The 90%, 95%, and 99% Wald confidence intervals for … for each of the 30 DJIA stock return series, obtained from having estimated the four-parameter location–scale asymmetric stable distribution. (b) Likelihood ratio test statistics and associated 90%, 95%, and 99% cutoff values.
Figure 9.13 The first boxplot represents the 30 estimated stable Paretian asymmetry parameters, …, for the 30 daily return series on the Dow Jones Industrial Average index, using the McCulloch estimator. The dashed line illustrates their median. Each of the other 19 boxplots is based on the 30 values, …, the …th of which was estimated from a simulated data set of 2020 i.i.d. … values.
Figure 9.14 Values of …, as a function of stable tail index …, based on 10,000 replications, for the … test (9.11) for the two sample sizes … (a) and … (b).
Figure 9.15 Plots associated with the … summability test, based on …, using (a) symmetric stable data with …, (b) Student's t data with three degrees of freedom.
Figure 9.16 Boxplots of …, …, and … based on 1000 simulated symmetric stable data sets, each of length … and for tail index ….
Figure 9.17 Similar to Figure 9.16 but based on simulated Student's t data with … degrees of freedom (here denoted by df), and using …. MLE refers to the maximum likelihood estimator of stable tail index ….
Figure 9.18 Boxplots of …, …, and … under four non-stable-Paretian distributional assumptions, based on 1000 replications, each of length ….
Figure 9.19 The Hint (thick circle); the m.l.e. estimating all four parameters as unknown (star); the m.l.e. estimating just …, …, and …, taking … (thin circle); and McCulloch (square) estimates of stable tail index … for each of the 30 DJIA daily stock return series. The lines indicate the interval of … using the four-parameter m.l.e.
Figure 9.20 Top left: Simulated distribution of the ALHADI test statistic (9.16), …, using 2000 series of i.i.d. … data of length …, where the parameter vector … is the m.l.e. of the daily returns of the AT&T closing stock price, this being the fourth component of the DJIA index. Top right: The nonparametric bootstrap distribution of …, using … bootstrap draws from the AT&T return series. The thin vertical line shows the actual value of … for the AT&T returns. Bottom: Similar, but using … instead of the ALHADI test statistic.
Figure 9.21 Top: The ALHADI test statistic for each of the 30 DJIA return series: for each, the left boxplot corresponds to the distribution of the ALHADI test statistic based on simulation of …
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by Walter A. Shewhart and Samuel S. Wilks
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay
Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels
The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods.
Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches. This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.
A complete list of titles in this series can be found at http://www.wiley.com/go/wsps
Marc S. Paolella
Department of Banking and Finance, University of Zurich, Switzerland
This edition first published 2018
© 2018 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Marc S. Paolella to be identified as the author of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.