M-statistics

Eugene Demidenko
Description

M-STATISTICS: A comprehensive resource providing new statistical methodologies and demonstrating how the new approaches work in applications.

M-statistics introduces a new approach to statistical inference, redesigning the fundamentals of statistics and improving on the classical methods we already use. The book targets exact optimal statistical inference for a small sample. Two competing approaches are offered, maximum concentration (MC) and mode (MO) statistics, combined under one methodological umbrella – hence the symbolic equation M = MC + MO. M-statistics defines an estimator as the limit point of the MC or MO exact optimal confidence interval when the confidence level approaches zero – the MC and MO estimator, respectively. Neither mean nor variance plays a role in M-statistics theory. Novel statistical methodologies in the form of double-sided unbiased and short confidence intervals and tests apply to major statistical parameters:

* Exact statistical inference for small sample sizes is illustrated with the effect size and coefficient of variation, the rate parameter of the Pareto distribution, and two-sample statistical inference for the normal variance and the rate of exponential distributions.

* M-statistics is illustrated with discrete, binomial, and Poisson distributions. Novel estimators eliminate the paradoxes of the classic unbiased estimators when the observed outcome is zero.

* Exact optimal statistical inference applies to correlation analysis, including the Pearson correlation, the squared correlation coefficient, and the coefficient of determination. New MC and MO estimators, along with optimal statistical tests accompanied by their respective power functions, are developed.

* M-statistics is extended to the multidimensional parameter and illustrated with simultaneous statistical inference for the mean and standard deviation, the shape parameters of the beta distribution, the two-sample binomial distribution, and, finally, nonlinear regression.

The new developments are accompanied by respective algorithms and R codes, available on GitHub and as such readily available for applications. M-statistics is suitable for professionals and students alike. It is highly useful for theoretical statisticians, teachers, researchers, and data science analysts as an alternative to classical and approximate statistical inference.




Table of Contents

Cover

Title Page

Copyright

Dedication

Preface

Chapter 1: Limitations of classic statistics and motivation

1.1 Limitations of classic statistics

1.2 The rationale for a new statistical theory

1.3 Motivating example: normal variance

1.4 Neyman-Pearson lemma and its extensions

References

Chapter 2: Maximum concentration statistics

2.1 Assumptions

2.2 Short confidence interval and MC estimator

2.3 Density level test

2.4 Efficiency and the sufficient statistic

2.5 Parameter is positive or belongs to a finite interval

References

Chapter 3: Mode statistics

3.1 Unbiased test

3.2 Unbiased CI and MO estimator

3.3 Cumulative information and the sufficient statistic

References

Chapter 4: P-value and duality

4.1 P-value for the double-sided hypothesis

4.2 The overall powerful test

4.3 Duality: converting the CI into a hypothesis test

4.4 Bypassing assumptions

4.5 Overview

References

Chapter 5: M-statistics for major statistical parameters

5.1 Exact statistical inference for standard deviation

5.2 Pareto distribution

5.3 Coefficient of variation for lognormal distribution

5.4 Statistical testing for two variances

5.5 Inference for two-sample exponential distribution

5.6 Effect size and coefficient of variation

5.7 Binomial probability

5.8 Poisson rate

5.9 Meta-analysis model

5.10 M-statistics for the correlation coefficient

5.11 The square multiple correlation coefficient

5.12 Coefficient of determination for linear model

References

Chapter 6: Multidimensional parameter

6.1 Density level test

6.2 Unbiased test

6.3 Confidence region dual to the DL test

6.4 Unbiased confidence region

6.5 Simultaneous inference for normal mean and standard deviation

6.6 Exact confidence inference for parameters of the beta distribution

6.7 Two-sample binomial probability

6.8 Exact and profile statistical inference for nonlinear regression

References

Index

End User License Agreement

List of Tables

Chapter 4

Table 4.1 Summary of the optimal double-sided exact statistical inference.

List of Illustrations

Chapter 1

Figure 1.1: The Gaussian kernel density for a sample of 234,986 hourly wages...

Figure 1.2: Image of comet C/2001 Q4 (NEAT) taken at the WIYN 0.9-meter tele...

Figure 1.3: Comparison of the short unequal-tail and equal-tail 95% confiden...

Figure 1.4: Percent length reduction of the short CI compared to the traditi...

Figure 1.5: The power functions of three tests for the two-sided variance hy...

Figure 1.6: Sample size determination for the stock prices in Example 1.2. T...

Figure 1.7: Illustration of Proposition 1.8. The shortest interval (bold seg...

Chapter 2

Figure 2.1: Illustration of the MC estimator of the normal variance, . The c...

Figure 2.2: Illustration of the MC estimator with the cdf of the exponential...

Figure 2.3: Concentration probabilities for the MC and classic unbiased esti...

Figure 2.4: The plot of the derivative versus the cdf for a sufficient sta...

Chapter 3

Figure 3.1: The power function for biased equal-tailed and unbiased unequal-...

Figure 3.2: Comparison of two MO estimators for normal variance using cumula...

Chapter 4

Figure 4.1: Four -dependent -values versus the alternative -value for the DL...

Figure 4.2: Four -values as functions of the normalized variance, with and...

Figure 4.3: Three tests for the normal variance. The line version of the pow...

Figure 4.4: The relative area above the power function for three tests for n...

Chapter 5

Figure 5.1: The power functions of equal- and unequal-tail tests for the two...

Figure 5.2: The length of three 95% CIs for as a function of the sample siz...

Figure 5.3: Comparison of the relative length of three 95% CIs as a function...

Figure 5.4: Power functions of three exact tests (). Symbols represent valu...

Figure 5.5: Two power functions of the MO exact unbiased and asymptotic test...

Figure 5.6: Power functions for unbiased and equal-tail -tests with small sa...

Figure 5.7: The optimal sample size yields the desired power of detection wi...

Figure 5.8: The power functions for the two tests: equal-tail and unbiased. ...

Figure 5.9: Optimal and to achieve the powers 0.8 and 0.9. The line is ta...

Figure 5.10: Simulation derived coverage probability and relative length of ...

Figure 5.11: Illustration of the MC estimator of with and when for the s...

Figure 5.12: Percent difference from the sample ES for different sample size...

Figure 5.13: The power function of three tests for the null hypothesis vers...

Figure 5.14: Simulation-derived probabilities of the equal-tailed CI for the...

Figure 5.15: Simulation-derived coverage probability of six CIs for the coef...

Figure 5.16: MC and MO estimates of the CV as the percent difference from th...

Figure 5.17: Three power functions for testing the double-sided null hypothe...

Figure 5.18: Cdf of the binomial distribution and its approximations as func...

Figure 5.19: Comparison of CIs for a binomial probability. Top: The coverage...

Figure 5.20: Power functions and simulation-derived values via rejection rul...

Figure 5.21: Four power functions as functions of, the number of individual...

Figure 5.22: Coverage probability and the average length on the log scale as...

Figure 5.23: The power functions of the equal-tailed and unbiased Poisson te...

Figure 5.24: Simulation-derived densities of three estimators of the heterog...

Figure 5.25: Comparison of the equal-tail and MCL CIs for heterogeneity vari...

Figure 5.26: Concentration probabilities around the true against The MC es...

Figure 5.27: The MC estimate of shown by a circle as the inflection point o...

Figure 5.28: Comparison of the Olkin-Pratt, MC, and MO estimates of the corr...

Figure 5.29: The power function for three tests with and This graph was c...

Figure 5.30: The MC estimate as the limit of the optimized CI when for the ...

Figure 5.31: Four 95% CIs as a function of the observed correlation coeffici...

Figure 5.32: The power function of the unbiased test for two null values, a...

Figure 5.33: The upper-sided 95% CI does not exist for the observed when an...

Figure 5.34: The 95% two-sided CIs for MSCC, the CI on the log scale, and th...

Figure 5.35: Three estimators of : the traditional (observed) MSCC the MCL ...

Figure 5.36: Density functions of the MCC () and CoD () for several and Fo...

Figure 5.37: The power functions for the unbiased and equal-tailed tests for...

Figure 5.38: MCL and adjusted CoD, as functions of the observed CoD for wi...

Chapter 6

Figure 6.1: The contour plot of the power functions for the DL test (6.1) an...

Figure 6.2: Three confidence regions for The minimum volume and unbiased co...

Figure 6.3: Contours of the pdf (6.22) as a function of the shape parameters...

Figure 6.4: The confidence regions for three methods: likelihood-ratio, Wal...

Figure 6.5: Solution of (6.27) for and with two sample sizes and This gr...

Figure 6.6: Two acceptance regions for testing and with the sample sizes ...

Figure 6.7: Three confidence regions for binomial probabilities with and ...

Figure 6.8: Two examples of confidence regions with confidence levels 95, 8...

Figure 6.9: The simulation-derived coverage probability for three 95% CIs fo...

Figure 6.10: Contours of the power function for testing in the exponential ...

Figure 6.11: The CIs and the power functions for in the exponential regress...

Guide

Cover

Title Page

Copyright

Dedication

Preface

Table of Contents

Begin Reading

Index

End User License Agreement


M-statistics

Optimal Statistical Inference for a Small Sample

 

 

 

Eugene Demidenko

Dartmouth College

Hanover, NH, USA

 

 

 

 

 

Copyright © 2023 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data

Name: Demidenko, Eugene, 1948- author.

Title: M-statistics : optimal statistical inference for a small sample /  Eugene Demidenko.

Description: Hoboken, NJ : Wiley, 2023. | Includes bibliographical  references and index.

Identifiers: LCCN 2023000520 (print) | LCCN 2023000521 (ebook) | ISBN  9781119891796 (hardback) | ISBN 9781119891802 (adobe pdf) | ISBN  9781119891819 (epub)

Subjects: LCSH: Mathematical statistics.

Classification: LCC QA276.A2 D46 2023 (print) | LCC QA276.A2 (ebook) |  DDC 519.5/4–dc23/eng20230410

LC record available at https://lccn.loc.gov/2023000520

LC ebook record available at https://lccn.loc.gov/2023000521

Cover Design: Wiley

Cover Image: © 500px Asia/Getty Images; Equations by Eugene Demidenko

 

To my family

Preface

Mean and variance are the pillars of classic statistics and statistical inference. These metrics are appropriate for symmetric distributions but are routinely used when distributions are not symmetric. Moreover, the mean is not coherent with the human perception of centrality – we never sum; instead, we look for a typical value, the mode in the language of statistics. Simply put, the mean is a suitable metric for computers, but the mode is for people. It is difficult to explain why the mean and median are part of any statistical package but the mode is not. Mean and variance penetrate the theory of statistical inference: the minimum variance unbiased estimator is the landmark of classic statistics. Unbiased estimators rarely exist outside of linear statistical models, but even when they do, they may produce incomprehensible values of the squared correlation coefficient or a variance component. This work offers a new statistical theory for small samples with an emphasis on exact optimal tests and unequal-tail confidence intervals (CI), using the cumulative distribution function (cdf) of a statistic as a pivotal quantity.

The organization of the book is as follows. The first chapter outlines the limitations of classic statistics and uses statistical inference for the normal variance as a brief introduction to our theory. It contains a section on some extensions of the Neyman-Pearson lemma to be used in subsequent chapters. The second and third chapters introduce two competing approaches, maximum concentration and mode statistics, with optimal confidence intervals and tests. An optimal confidence interval (CI) with coverage probability going to zero gives birth to a maximum concentration or mode estimator, depending on the choice of the CI. Two tracks for statistical inference are offered, combined under the one umbrella of M-statistics: maximum concentration (MC) theory stems from the density level test, the short-length CI, and the implied MC parameter estimator; mode (MO) theory includes the unbiased test, the unbiased CI, and the implied MO parameter estimator. Unlike the classic approach, we suggest different CI optimality criteria and respective estimators depending on the parameter domain. For example, the CI with the minimum length on the log scale yields a new estimator of the binomial probability and the Poisson rate, both positive even when the number of successes is zero. New criteria for efficient estimation are developed as substitutes for the variance and the mean square error. Chapter 4 discusses definitions of the P-value for asymmetric distributions – a generally overlooked but important statistical problem. M-statistics is demonstrated in action in Chapter 5. Novel exact optimal CIs and statistical tests are developed for major statistical parameters: effect size, binomial probability, Poisson rate, variance component in the meta-analysis model, correlation coefficient, squared correlation coefficient, and coefficient of determination in the linear model. M-statistics is extended to the multidimensional parameter in Chapter 6. Exact confidence regions and unbiased tests for the normal mean and variance, the shape parameters of the beta distribution, and nonlinear regression illustrate the theory.

The R codes can be freely downloaded from my website

www.eugened.org

and are stored on GitHub. By default, each code saves itself in the folder C:\Projects\Mode\ via the dump command every time it is called. If you are using a Mac or do not want to save to this directory, remove or comment out (#) this line before running the code (sometimes external R packages must be downloaded and installed beforehand).

I would like to hear comments, suggestions, and opinions from readers. Please e-mail me at [email protected].

Eugene Demidenko

Dartmouth College

Hanover, NH, USA

April 2023

Chapter 1: Limitations of classic statistics and motivation

In this chapter, we discuss the limitations of classic statistics that build on the concepts of the mean and variance. We argue that the mean and variance are appropriate measures of the center and the scatter of symmetric distributions. Many distributions we deal with are asymmetric, including distributions of positive data. The mean not only has a weak practical appeal but also may create theoretical trouble in the form of unbiased estimation – the existence of an unbiased estimator is more an exception than the rule.

Optimal statistical inference for the normal variance in the form of the minimum length or unbiased CI was developed more than 50 years ago and has been largely forgotten. This example serves as motivation for our theory. Many central concepts, such as unbiased tests and the mode and maximum concentration estimators for the normal variance, serve as prototypes for the general theory deployed in subsequent chapters.

The Neyman-Pearson lemma is a fundamental statistical result that establishes the test of maximum power among all tests with a fixed type I error. In this chapter, we prove two results, as extensions of this lemma, to be used later for demonstrating some optimal properties of M-statistics, such as the superiority of the sufficient statistic and the minimum volume of the density level test.

1.1 Limitations of classic statistics

1.1.1 Mean

A long time ago, several prominent statisticians pointed out the limitations of the mean as a measure of central tendency, or center for short (Deming 1964; Tukey 1977). Starting with introductory statistics textbooks, the mean is often criticized for not being robust to outliers. We argue that the mean's limitations are conceptually more serious in comparison with the other centers, the median and the mode.

For example, when characterizing the distribution of English letters, the mean is not applicable, but the mode is “e”. Occasionally, statistics textbooks discuss the difference between the mean, mode, and median from the application standpoint. For example, consider the distribution of house prices on the real estate market in a particular town. For a town clerk, the most appropriate measure of the center is the mean because the total property taxes received by the town are proportional to the sum of house values and therefore to the mean. For a potential buyer who compares prices between small nearby towns, the most appropriate center is the mode, as the typical house price. A person who can afford a house at the median price knows that they can afford 50% of the houses on the market.

Remarkably, modern statistical packages, like R, compute the mean and median as mean(Y) and median(Y), but not the mode, although it requires just two lines of code
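(a minimal sketch, assuming base R's density function with its default Gaussian kernel; the book's exact lines may differ):

den <- density(Y)                    # kernel density estimate (Gaussian kernel by default)
mode.est <- den$x[which.max(den$y)]  # grid point where the estimated density peaks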

where Y is the array of data. The centerpiece of the mode computation is the density function, which by default assumes the Gaussian kernel and the bandwidth computed by Silverman's “rule of thumb” (1986).

Consider another example: reporting summary statistics for U.S. hourly wages (the data are obtained from the Bureau of Labor Statistics at https://www.bls.gov/mwe). Figure 1.1 depicts the Gaussian kernel density of hourly wages for 234,986 employees. The mean is almost twice as large as the mode because of the heavy right tail. What center should be used when reporting the average wage? The answer depends on how the center is used. The mean may be informative to the U.S. government because the sum of wages is proportional to consumer buying power and collected income taxes. The median has a clear interpretation: 50% of workers earn less than $17.10 per hour. The mode offers a better interpretation at the individual level as the typical wage – the point of maximum concentration of wages. In parentheses, we report the proportion of workers whose wage falls within $1 of each center. The mode has the maximum data concentration probability – that is why we call the mode the typical value. The mean ($20.40) may be happily reported by the government, but $11.50 is what people typically earn.

Figure 1.1: The Gaussian kernel density for a sample of 234,986 hourly wages in the country. The percent value in the parentheses estimates the probability that the wage is within $1 of the respective center.

Mean is a convenient quantity for computers, but humans never count and sum – they judge and compare samples based on the typical value.

Figure 1.2 illustrates this statement. It depicts a NASA comet image downloaded from https://solarviews.com/cap/comet/cometneat.htm. The bull's-eye of the comet is the mode, where the concentration of mass is maximum. The mean does not have a particular interpretation.

Mean is for computers, and mode is for people. People immediately identify the mode as the point of maximum concentration of the distribution, but we never sum the data in our head and divide by the number of points – this is what computers do. This picture points to the heart of this book: the mean is easy to compute because it requires only arithmetic operations suitable for computers. The mode requires more sophisticated techniques, such as density estimation – unavailable at the time when statistics was born. Estimation of the mode is absent even from comprehensive modern statistics books. The time has come to reconsider and rewrite statistical theory.

Figure 1.2: Image of comet C/2001 Q4 (NEAT) taken at the WIYN 0.9-meter telescope at Kitt Peak National Observatory near Tucson, Arizona, on May 7, 2004. Source: NASA.

1.1.2 Unbiasedness

The mean dominates not only statistical applications but also statistical theory in the form of the unbiased estimator. Finding a new unbiased estimator is regarded as one of the most rewarding accomplishments for a statistician. However, unbiasedness has serious limitations:

The existence of an unbiased estimator is an exception, not the rule. Chen (2003) writes “… the condition on unbiasedness is generally a strong one.” Unbiased estimators mostly exist in the framework of linear statistical models, and yet classic statistical inference, such as the Cramér-Rao lower bound or the Lehmann-Scheffé theorem, relies on unbiasedness (Rao 1973; Zacks 1971; Casella and Berger 1990; Cox and Hinkley 1996; Lehmann and Casella 1998; Bickel and Doksum 2001; Lehmann and Romano 2005). Unbiased estimators do not exist for simple nonlinear quantities such as the coefficient of variation or the ratio of regression coefficients (Fieller 1932).

Unbiasedness is not invariant to nonlinear transformations. For example, the square root of the unbiased sample variance is a downward-biased estimator of the standard deviation (see the simulation sketch below). If an unbiased estimator exists for a parameter, it rarely exists for its reciprocal.

An unbiased estimator of a positive parameter may take a negative value, especially with small degrees of freedom. The most notorious examples are the unbiased estimation of variance components and of the squared correlation coefficient. For example, as shown by Ghosh (1996), a nonnegative unbiased estimator of the variance component does not exist.

Variance and mean square error (MSE), as established criteria of statistical efficiency, suffer from the same illness: they are appropriate measures of scattering for symmetric distributions and may not exist even in simple statistical problems.
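To make the non-invariance point concrete, here is a quick simulation sketch (our illustration, not from the book), assuming normal data with σ = 1 and n = 5:

set.seed(2)
n <- 5
sd.hat <- replicate(1e5, sd(rnorm(n)))  # square root of the unbiased sample variance
mean(sd.hat)                            # about 0.94 < 1: underestimates sigma = 1

The theoretical expected value here is about 0.940·σ for n = 5, confirming the downward bias of the square-root transformation.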

Note that while we criticize the unbiased estimators, there is nothing wrong with unbiased statistical tests and CIs – although the same term unbiasedness is used, these concepts are not related. Our theory embraces unbiased tests and CIs and derives the mode (MO) and maximum concentration (MC) estimator as the limit point of the unbiased and minimum length CI, respectively, when the coverage probability approaches zero.

1.1.3 Limitations of equal-tail statistical inference

Classic statistics uses the equal-tail approach for statistical hypothesis testing and CIs. This approach works well for symmetric distributions or large sample sizes. It was convenient in the pre-computer era, when tables at the end of statistics textbooks were used. The unequal-tail approach, embraced in the present work, requires computer algorithms and yields optimal statistical inference for any sample size. Why use a suboptimal equal-tail approach when a better one exists? True, for a fairly large sample size the difference is negligible, but when the number of observations is small, say from 5 to 10, we may gain up to a 20% improvement measured by the length of the CI or the power of the test.

1.2 The rationale for a new statistical theory

Classic statistical inference was developed almost 100 years ago. It tends to offer simple procedures, often relying on precomputed tables of distributions printed at the end of books. This explains why equal-tail tests and CIs are widely used even now, although for asymmetric distributions the respective inference is suboptimal. Certainly, for a moderate sample size the difference is usually negligible, but when the sample size is small the difference can be considerable. Classic equal-tail statistical inference is outdated. Yes, unequal-tail inferences do not have a closed-form solution, but this should not serve as an excuse for using suboptimal inference. Iterative numeric algorithms are an inherent part of M-statistics – Newton's algorithm is very effective when started from the classic equal-tail values.

In classic statistics, CIs and estimators are conceptually disconnected. In M-statistics, they are on the same string of theory – the maximum concentration (MC) and mode (MO) estimators are derived as the limit points of the CI when the confidence level approaches zero.

Finally, we want to comment on the style of this work. Novel ideas are illustrated with statistical inference for classic distribution parameters along with numeric implementation and algorithms that are readily available for applications.

Our theory of M-statistics proceeds on two parallel competing tracks:

MC-statistics: density level test, short CI, and implied MC estimator.

MO-statistics: unbiased test, unbiased CI, and implied MO estimator.

In symbolic form, M = MC + MO.

Our new tests and CIs are exact and target small samples. Neither the expected value of the estimator nor its variance/mean square error plays a role in our theory. In classic statistics, the CI and the estimator are not related conceptually. In M-statistics, the latter is derived from the former when the confidence level approaches zero.

Our theory relies on the existence of a statistic S with the cumulative distribution function (cdf) F(S; θ), where θ is the parameter of interest. Since F(S; θ) has the uniform distribution on [0, 1], this cdf can be viewed as a pivotal quantity. Statistical inference based on a pivotal quantity is well known but of limited use, mainly because a pivotal quantity exists only in rare cases. On the other hand, derivation of the cdf of any statistic is often feasible via integration.
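As a quick illustration of the pivotal property (a simulation sketch of our own, using the chi-square statistic of the next section), F(S; θ) is uniform regardless of the true parameter value:

set.seed(1)
n <- 10; sigma2 <- 3
S <- sigma2 * rchisq(1e5, df = n - 1)  # simulated sums of squares under sigma^2 = 3
U <- pchisq(S / sigma2, df = n - 1)    # cdf of the statistic evaluated at itself
hist(U, breaks = 40)                   # flat histogram: U is uniform on [0, 1]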

The organization of the book is as follows: We start with a motivating example of the CI for the normal variance, dating back to the classic work of Neyman and Pearson (1933). These CIs serve as a template for our further development of unbiased CIs and tests in the general case. Not surprisingly, sufficient statistics produce optimal CIs. Classic fundamental concepts, such as the Neyman-Pearson lemma and Fisher information, receive a new look in M-statistics. We argue that for a positive parameter, the length of the CI should be measured on a relative scale, such as the logarithmic scale, leading to a new CI and an implied MC estimator. This concept applies to the binomial probability and the Poisson rate. The cdf is just a special case of a pivotal quantity for constructing unbiased CIs and MC estimators, which are illustrated with a simple meta-analysis variance components model. This example emphasizes the optimal properties of the MC estimator of the variance, which is at the center of the maximum concentration probability and is always positive. The correlation coefficient, squared correlation coefficient, and coefficient of determination in the linear model with fixed predictors are the next subjects of application of M-statistics. Although these fundamental statistical parameters were introduced long ago, until recently exact unbiased CIs and statistical tests were unknown. Our work is finalized with the extension of M-statistics to the multidimensional parameter. Novel exact confidence regions and tests are derived and illustrated with two examples: (a) simultaneous testing of the normal mean and standard deviation, and (b) nonlinear regression.

Finally, the goal of the book is not to treat every single statistical inference problem and apply every method for CI or hypothesis testing – instead, the goal is to demonstrate a new theory. Much work must be done to apply the new statistical methodology to other practically important statistical problems and develop efficient numerical algorithms.

1.3 Motivating example: normal variance

This section reviews the exact statistical inference for the variance of the normal distribution: shortly, the normal variance. First, we discuss the CI, and then we move on to statistical hypothesis testing. Our point of criticism is the widespread use of the equal-tail CI and hypothesis test, which silently assume a symmetric distribution. For example, equal-tail statistical inference for the normal variance is used in the recent R package DescTools. However, the distribution of the underlying statistic, that is, the sum of squares, is not symmetric: it has a chi-square distribution. We argue that equal-tail CIs were convenient in the pre-computer era, when tables were used. Instead, we popularize computationally demanding unequal-tail CIs: specifically, the unbiased CI and the CI with the minimum expected length (the short CI). Although unequal-tail CIs have been known for many years, they are rarely applied or implemented in modern statistical software.

Here we apply unequal-tail statistical inference to normal variance as a motivating example and an introduction to our M-statistics theory. This example will serve as a benchmark for many novel methods of statistical inference developed in this book.

1.3.1 Confidence interval for the normal variance

To motivate our M-statistics, we revive the classic confidence interval (CI) for the normal variance, having n independent normally distributed observations. The basis for the CI is the fact that S/σ², where S is the sum of squares around the sample mean, has a chi-square distribution with n − 1 degrees of freedom (df).

Traditionally, to construct a double-sided CI for the normal variance σ², the equal-tail CI is used, where the limits are computed by solving the equations

(1.1)  F(q1) = α/2,  1 − F(q2) = α/2,

where F denotes the cdf of the chi-square distribution with n − 1 degrees of freedom (df) and 1 − α is the desired coverage probability. If q1 and q2 are these quantiles of the chi-square distribution, the lower and upper confidence limits are S/q2 and S/q1. This interval covers the true σ² with the exact probability 1 − α. The equal-tail CI is offered in various textbooks, handbooks, and statistical software, such as SAS, STATA, and R, despite being biased and not having minimal expected length.

The short unequal-tail CI was introduced almost 100 years ago by Neyman and Pearson (1936). Anderson and Bancroft (1952) and Tate and Klett (1959) applied this CI to the normal variance; El-Bassiouni (1994) carried out comparisons with other CIs. Two asymmetric CIs for the normal variance are revived next: the minimum expected length CI and the unbiased CI. The former is a representative of the maximum concentration (MC) statistics and the latter of the mode (MO) statistics.

Short CI

To achieve exact coverage of σ², the quantiles do not necessarily have to be chosen with equal tail probabilities. An obvious choice, given the observed S, is to find q1 and q2 such that they satisfy (1.1) and the length of the CI, S(1/q1 − 1/q2), is minimum. Since minimization of S(1/q1 − 1/q2) is equivalent to minimization of 1/q1 − 1/q2, we replace (1.1) with the coverage condition

(1.2)  F(q2) − F(q1) = 1 − α

and arrive at the minimization of the reciprocal length:

(1.3)  1/q1 − 1/q2 → min.

This problem is solved by the Lagrange multiplier technique with the function defined as

L(q1, q2; λ) = 1/q1 − 1/q2 + λ[F(q2) − F(q1) − (1 − α)],

where λ is the Lagrange multiplier. Differentiating with respect to the unknown quantiles, we obtain the necessary conditions for the minimum:

−1/q1² − λf(q1) = 0,  1/q2² + λf(q2) = 0,

where f denotes the probability density function (pdf) of the chi-square distribution with n − 1 df:

(1.4)  f(q) = q^((n−3)/2) e^(−q/2) / (2^((n−1)/2) Γ((n−1)/2)).

After eliminating the Lagrange multiplier, we arrive at the following equation for q1 and q2:

(1.5)  q1² f(q1) = q2² f(q2),

or equivalently,

(1.6)  2 ln q1 + ln f(q1) = 2 ln q2 + ln f(q2).

Hence, the optimal quantiles that produce the minimal expected length CI are found by solving equations (1.2) and (1.5). After q1 and q2 are determined, the short CI has the limits S/q2 and S/q1. Shao (2003, Theorem 7.3) generalized the short-length CI to any unimodal distribution having a pivotal quantity of a special kind.

Now we turn our attention to solving equations (1.2) and (1.5) using Newton's algorithm (Ortega and Rheinboldt 2000), starting from the equal-tail quantiles. Using the density function (1.4), one can show that equation (1.5) simplifies to

(1.7)  (n + 1) ln(q2/q1) = q2 − q1.

The system of equations (1.2) and (1.7) is solved iteratively for q = (q1, q2)′ as

(1.8)  q(k+1) = q(k) + Δ,

where k is the iteration index and the adjustment vector Δ is given by (iteration index not shown)

(1.9)  Δ = −J⁻¹ g,  where
       J = [ −f(q1)           f(q2)
             1 − (n + 1)/q1   (n + 1)/q2 − 1 ],
       g = [ F(q2) − F(q1) − (1 − α)
             (n + 1) ln(q2/q1) − (q2 − q1) ].

We start iterations from the equal-tail quantiles, the α/2 and 1 − α/2 quantiles of the chi-square distribution with n − 1 df. Our practice shows that only three or four iterations are required to converge. Note that the optimal quantiles are not random: they depend on the confidence level and n but not on the statistic S. After q1 and q2 are computed, the short CI for the normal variance takes the form (S/q2, S/q1). The R function that implements Newton's iterations is var.ql, with the call shown below.
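The argument list, reconstructed from the description that follows (exact defaults are not shown here):

var.ql(n, adj, alpha, eps, maxit)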

In this function, n is the sample size, adj is the adjustment parameter for n entering the quantile equation through the coefficient n − adj, alpha passes the value of α, eps is the tolerance difference between iterations, and maxit is the maximum number of Newton's iterations. The function returns the two-dimensional vector (q1, q2) at the final iteration. An example of the call and output is shown here.

> var.ql(n=20,adj=-1,alpha=0.05)

[1] 9.899095 38.327069

If S is the observed sum of squares, the short 95% CI for σ² is (S/38.327069, S/9.899095). We use the adj argument to accommodate computation of other quantiles for the normal variance, such as for unbiased CIs and tests (see the following section).
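For readers who want to experiment before downloading the original code, here is a minimal self-contained sketch of such a Newton solver; the function name var.ql.sketch, its defaults, and its internals are our reconstruction from equations (1.2) and (1.7) (and, with adj = 1, the unbiased-CI variant of the next subsection), not the book's var.ql:

# Newton's iterations for the optimal chi-square quantiles.
# adj = -1: short CI, tail-balance coefficient m = n + 1;
# adj =  1: unbiased CI, coefficient m = n - 1.
var.ql.sketch <- function(n, adj = -1, alpha = 0.05, eps = 1e-10, maxit = 20) {
  df <- n - 1                                   # df of the chi-square statistic
  m <- n - adj                                  # coefficient in the balance equation
  q <- qchisq(c(alpha / 2, 1 - alpha / 2), df)  # equal-tail start
  for (it in 1:maxit) {
    g <- c(pchisq(q[2], df) - pchisq(q[1], df) - (1 - alpha),
           m * log(q[2] / q[1]) - (q[2] - q[1]))
    J <- rbind(c(-dchisq(q[1], df), dchisq(q[2], df)),
               c(1 - m / q[1], m / q[2] - 1))
    delta <- solve(J, -g)                       # Newton adjustment, as in (1.9)
    q <- q + delta
    if (max(abs(delta)) < eps) break
  }
  q
}
var.ql.sketch(n = 20, adj = -1)  # about 9.899 and 38.327, matching var.ql above

With adj=1 this sketch returns approximately 9.267 and 33.921, matching the unbiased-CI quantiles reported in the next subsection.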

The limits of the equal- and unequal-tail CIs are compared in Figure 1.3 as functions of n. Since the limits of the CI have the form S/q, the reciprocals of the quantiles, 1/q, are shown on the y-axis. Each panel compares an equal-tail quantile reciprocal with its short-CI counterpart, where the short-CI quantiles q1 and q2 are the solutions of (1.2) and (1.7) found by Newton's iterations (1.9). The limits of the short CI are shifted downward. Figure 1.4 depicts the percent length reduction of the short CI compared with the traditional equal-tail CI. Not surprisingly, the difference disappears as n grows, but for small sample sizes the unequal-tail CI is shorter by about 30%.

Figure 1.3: Comparison of the short unequal-tail and equal-tail 95% confidence limits for the normal variance using reciprocal quantiles.

Figure 1.4: Percent length reduction of the short CI compared to the traditional equal-tail CI for the normal variance.

Unbiased CI

The traditional equal-tail and short-length CIs are biased for asymmetric distributions, such as the chi-square, meaning that the coverage probability of a “wrong” variance may be greater than the coverage probability of the true variance σ². In this section, we choose the quantiles q1 and q2 that make the CI unbiased. Writing the wrong variance as λσ² with λ > 0, the coverage probability of λσ² is

F(λq2) − F(λq1).

We demand that this probability reach its maximum at λ = 1, which requires the necessary condition

q2 f(q2) − q1 f(q1) = 0.

Thus we conclude that to make the CI unbiased the following must hold:

(1.10)  q1 f(q1) = q2 f(q2).

After some trivial algebra, we arrive at a similar system of equations where (1.7) is replaced with

(1.11)  (n − 1) ln(q2/q1) = q2 − q1.

Again, Newton's algorithm (1.8) applies with the coefficient n + 1 in (1.9) replaced with n − 1. More specifically, the delta-vector takes the same form as (1.9) with n + 1 replaced by n − 1 in both J and g.

The quantiles are computed by the same R function var.ql with adj=1. The call var.ql(n=20,adj=1) returns 9.267006 33.920798. If S is the sum of squares, the 95% unbiased CI for σ² is (S/33.920798, S/9.267006). Since

1/9.267006 − 1/33.920798 ≈ 0.0784 > 0.0749 ≈ 1/9.899095 − 1/38.327069,

the length of the unbiased CI is greater than the length of the short CI, as it is supposed to be.

1.3.2 Hypothesis testing for the variance