Advanced Analysis of Variance

Chihiro Hirotsu

Description

Introducing a revolutionary new model for the statistical analysis of experimental data.

In this important book, internationally acclaimed statistician Chihiro Hirotsu goes beyond the classical analysis of variance (ANOVA) model to offer a unified theory and advanced techniques for the statistical analysis of experimental data. Dr. Hirotsu introduces the groundbreaking concept of advanced analysis of variance (AANOVA) and explains how the AANOVA approach exceeds the limitations of ANOVA methods, allowing global reasoning via special methods of simultaneous inference that lead to individual conclusions. Focusing on normal, binomial, and categorical data, Dr. Hirotsu explores ANOVA theory and practice and reviews current developments in the field. He then introduces three new advanced approaches, namely: testing for equivalence and non-inferiority; simultaneous testing for directional (monotonic or restricted) alternatives and change-point hypotheses; and analyses emerging from categorical data. Using real-world examples, he shows how these three recognizable families of problems have important applications in most practical activities involving experimental data, in research areas including bioequivalence, clinical trials, industrial experiments, pharmaco-statistics, and quality control, to name just a few.

* Written in an expository style which will encourage readers to explore applications for AANOVA techniques in their own research
* Focuses on dealing with real data, providing real-world examples drawn from the fields of statistical quality control, clinical trials, and drug testing
* Describes advanced methods developed and refined by the author over the course of his long career as research engineer and statistician
* Introduces advanced technologies for AANOVA data analysis that build upon the basic ANOVA principles and practices

Introducing a breakthrough approach to statistical analysis which overcomes the limitations of the ANOVA model, Advanced Analysis of Variance is an indispensable resource for researchers and practitioners working in fields within which the statistical analysis of experimental data is a crucial research component.

Chihiro Hirotsu is a Senior Researcher at the Collaborative Research Center, Meisei University, and Professor Emeritus at the University of Tokyo. He is a fellow of the American Statistical Association, an elected member of the International Statistical Institute, and he has been awarded the Japan Statistical Society Prize (2005) and the Ouchi Prize (2006). His work has been published in Biometrika, Biometrics, and Computational Statistics & Data Analysis, among other premier research journals.




Table of Contents

Cover

Title Page

Preface

Notation and Abbreviations

Notation

Abbreviations

1 Introduction to Design and Analysis of Experiments

1.1 Why Simultaneous Experiments?

1.2 Interaction Effects

1.3 Choice of Factors and Their Levels

1.4 Classification of Factors

1.5 Fixed or Random Effects Model?

1.6 Fisher’s Three Principles of Experiments vs. Noise Factor

1.7 Generalized Interaction

1.8 Immanent Problems in the Analysis of Interaction Effects

1.9 Classification of Factors in the Analysis of Interaction Effects

1.10 Pseudo Interaction Effects (Simpson’s Paradox) in Categorical Data

1.11 Upper Bias by Statistical Optimization

1.12 Stage of Experiments: Exploratory, Explanatory or Confirmatory?

References

2 Basic Estimation Theory

2.1 Best Linear Unbiased Estimator

2.2 General Minimum Variance Unbiased Estimator

2.3 Efficiency of Unbiased Estimator

2.4 Linear Model

2.5 Least Squares Method

2.6 Maximum Likelihood Estimator

2.7 Sufficient Statistics

References

3 Basic Test Theory

3.1 Normal Mean

3.2 Normal Variance

3.3 Confidence Interval

3.4 Test Theory in the Linear Model

3.5 Likelihood Ratio Test and Efficient Score Test

References

4 Multiple Decision Processes and an Accompanying Confidence Region

4.1 Introduction

4.2 Determining the Sign of a Normal Mean – Unification of One‐ and Two‐Sided Tests

4.3 An Improved Confidence Region

Reference

5 Two‐Sample Problem

5.1 Normal Theory

5.2 Non‐parametric Tests

5.3 Unifying Approach to Non‐inferiority, Equivalence and Superiority Tests

References

6 One‐Way Layout, Normal Model

6.1 Analysis of Variance (Overall F-Test)

6.2 Testing the Equality of Variances

6.3 Linear Score Test (Non‐parametric Test)

6.4 Multiple Comparisons

6.5 Directional Tests

References

7 One‐Way Layout, Binomial Populations

7.1 Introduction

7.2 Multiple Comparisons

7.3 Directional Tests

References

8 Poisson Process

8.1 Max acc. t1 for the Monotone and Step Change‐Point Hypotheses

8.2 Max acc. t2 for the Convex and Slope Change‐Point Hypotheses

References

9 Block Experiments

9.1 Complete Randomized Blocks

9.2 Balanced Incomplete Blocks

9.3 Non‐parametric Method in Block Experiments

References

10 Two‐Way Layout, Normal Model

10.1 Introduction

10.2 Overall ANOVA of Two‐Way Data

10.3 Row‐wise Multiple Comparisons

10.4 Directional Inference

10.5 Easy Method for Unbalanced Data

References

11 Analysis of Two‐Way Categorical Data

11.1 Introduction

11.2 Overall Goodness‐of‐Fit Chi‐Square

11.3 Row‐wise Multiple Comparisons

11.4 Directional Inference in the Case of Natural Ordering Only in Columns

11.5 Analysis of Ordered Rows and Columns

References

12 Mixed and Random Effects Model

12.1 One‐Way Random Effects Model

12.2 Two‐Way Random Effects Model

12.3 Two‐Way Mixed Effects Model

12.4 General Linear Mixed Effects Model

References

13 Profile Analysis of Repeated Measurements

13.1 Comparing Treatments Based on Upward or Downward Profiles

13.2 Profile Analysis of 24‐Hour Measurements of Blood Pressure

14 Analysis of Three‐Way Categorical Data

14.1 Analysis of Three‐Way Response Data

14.2 One‐Way Experiment with Two‐Way Categorical Responses

14.3 Two‐Way Experiment with One‐Way Categorical Responses

References

15 Design and Analysis of Experiments by Orthogonal Arrays

15.1 Experiments by Orthogonal Array

15.2 Ordered Categorical Responses in a Highly Fractional Experiment

15.3 Optimality of an Orthogonal Array

References

Appendix

Index

End User License Agreement

List of Tables

Appendix

Table A Upper percentiles tα(a, f) of max acc. t1.

Table B Upper percentiles of max acc. t2.

Table C Upper percentiles of the largest eigenvalue of Wishart matrix.

Chapter 01

Table 1.1 Fixing time of special aluminum printing.

Table 1.2 Simpson’s paradox.

Chapter 02

Table 2.1 The efficiency of sample mean and median.

Chapter 03

Table 3.1 Comparing the critical values of the t (3.11) and u (3.18) tests.

Table 3.2 Preparation for calculating statistics.

Table 3.3 Critical values of the uniformly most powerful unbiased test.

Table 3.4 Power of the right one‐sided test (df 6).

Table 3.5 Power of the two‐sided test.

Table 3.6 2 × 2 contingency table.

Chapter 05

Table 5.1 Half‐life of NFLX at two doses.

Table 5.2 Cholesterol measurements before and after treatment.

Table 5.3 Measurements in the chemical analysis.

Table 5.4 Density of food (Moriguti, 1976).

Table 5.5 Degree of general improvement in a phase II trial.

Table 5.6 Calculation of Wilcoxon rank test statistic.

Table 5.7 Transpose of the data in Table 7.1.

Table 5.8 Collapsing columns of Table 5.5 at five cut points.

Table 5.9 Power comparison.

Table 5.10 A phase III trial in the infectious disease of respiratory organs.

Table 5.11 Consumer’s risk at non‐inferiority margin Δ = 0.1.

Table 5.12 Required sample size per arm.

Table 5.13 Phase III trial for chronic urticaria.

Table 5.14 Comparison of log10 Cmax between Japanese and Caucasians at 6 mg dose.

Table 5.15 Comparison of log10 Cmax between Japanese and Caucasians at 3 mg dose.

Chapter 06

Table 6.1 ANOVA table for one‐way layout.

Table 6.2 Magnetic strength (μ) of a ferrite core (Moriguti, 1976).

Table 6.3 Squared data.

Table 6.4 ANOVA table for one‐way layout.

Table 6.5 Calculation of Kruskal–Wallis test statistic.

Table 6.6 Simultaneous confidence intervals by Tukey’s and Scheffé’s methods.

Table 6.7 Decrease in red cell counts (10 – original data).

Table 6.8 Simultaneous lower bounds SLB for p(i, j).

Table 6.9 Half‐life of NFLX (antibiotics).

Table 6.10 SLB for the isotonic contrasts.

Table 6.11 (1) Power comparisons by numerical integration.

Table 6.12 Probability of selecting acceptable models by max acc. t1 and max acc. t2.

Table 6.13 Accuracy of approximation (6.40) at upper five percentile.

Table 6.14 Power comparisons of max acc. t2, maximin, and polynomial linear tests.

Table 6.15 Power comparisons of max acc. t3, maximin, and polynomial linear tests.

Chapter 07

Table 7.1 Improvement rate of a drug for heart disease.

Table 7.2 Simultaneous lower bounds SLB.

Table 7.3 Process to determine the conformable sequence.

Table 7.4 Bottom‐up process for constructing the probability distribution.

Table 7.5 Number of conformable Poisson sequences and computing time of probabilities.

Table 7.6 Mortality by sex of donor and degree of erythroblastosis.

Table 7.7 Power comparisons of max acc. t2, χ†2, and the polynomial test.

Table 7.8 Results of diesel fuel aerosol experiment.

Chapter 08

Table 8.1 Spontaneous reporting of adverse events per month at PMDA.

Chapter 09

Table 9.1 ANOVA table for block experiment.

Table 9.2 Measurements of enzymatic activation.

Table 9.3 ANOVA table for measurements of enzymatic activation.

Table 9.4 ANOVA table for BIBD.

Table 9.5 Efficacy of antibiotics.

Table 9.6 ANOVA table for the antibiotic data.

Table 9.7 Data structure of the jth matched pair.

Table 9.8 Unlike pair.

Table 9.9 Maxwell’s data.

Table 9.10 Quantitative analysis of IgE antibody.

Table 9.11 Data structure of the jth matched pair.

Table 9.12 Summary of the culture test.

Table 9.13 Data structure of the kth block.

Table 9.14 Accumulated data according to step change at J.

Table 9.15 Ranks of four metals at 25 points.

Table 9.16 Data structure of the first block.

Table 9.17 Table of .

Table 9.18 Table of .

Table 9.19 Score sheet of the Pacific League in Japan in 1981.

Table 9.20 Fitted values to the score sheet of the Pacific League in Japan in 1981.

Table 9.21 Squared distances between teams of the Pacific League in Japan in 1981.

Chapter 10

Table 10.1 ANOVA table for two‐way layout.

Table 10.2 Corrosion resistance of aluminum alloys.

Table 10.3 ANOVA table for Example 10.1.

Table 10.4 Yields of corn in bushels per acre (JASA, Johnson and Graybill, 1972).

Table 10.5 Matrix of squared distance.

Table 10.6 Estimation of μij.

Table 10.7 Matrix of squared distance.

Table 10.8 Averaged response.

Table 10.9 Repetition number.

Table 10.10 Size of the naïve approximate test and related measure of unbalance.

Table 10.11 Increase in systolic pressure due to the treatment.

Table 10.12 Analysis of variance by easy method.

Table 10.13 ANOVA by a linear model.

Table 10.14 Comparison of p‐values of the easy methods with a standard inference.

Chapter 11

Table 11.1 Number of cancer patients by occupation and severity of illness at their first visit at the National Cancer Institute of Japan.

Table 11.2 Israeli adults cross‐classified by principal worries and residence area.

Table 11.3 Collapsed data.

Table 11.4 Comparing upper percentiles by several approximation methods.

Table 11.5 Calculation of the components of cumulative chi‐squared.

Table 11.6 Chi‐squared distances between two rows (rows rearranged).

Table 11.7 Collapsed data and simple departure measure.

Table 11.8 Original data from phase II trial for an antibiotic.

Table 11.9 Sub‐tables for maximal contrast test for rows.

Table 11.10 Sub‐tables for doubly accumulated chi‐square χ**2.

Chapter 12

Table 12.1 Content rate of sulfur.

Table 12.2 ANOVA table for one‐way random effects model.

Table 12.3 Repetition number.

Table 12.4 Comparing the relative variance of several estimators.

Table 12.5 Length of the ramus bone (mm).

Chapter 13

Table 13.1 Total cholesterol amounts.

Table 13.2 The squared distances (13.3) between two rows.

Table 13.3 Observed distribution for active drug and placebo.

Table 13.4 Summary of the classification into five subgroups.

Table 13.5 Summary of blood pressure data.

Table 13.6 Process of classification.

Chapter 14

Table 14.1 Data of 2 × 2 × 2 table expressed by y111 and two‐way marginal totals.

Table 14.2 Cancer patients cross‐classified by levels of age i, metastasis j, and saturation k.

Table 14.3 Expected cell frequencies under Hω.

Table 14.4 Table of metastasis and saturation obtained by collapsing age levels.

Table 14.5 Allele frequencies at the D20S95 locus.

Table 14.6 p‐Values of the proposed methods.

Table 14.7 Haplotype allele frequencies at two loci in fragile X and normal chromosome.

Table 14.8 Configuration of pijk (normalizing constant omitted).

Table 14.9 Cell frequencies of pooled 2 × 2 × 2 table expressed in the function of .

Table 14.10 The values of statistics and their p‐values.

Table 14.11 Maximum likelihood estimates under Hω.

Table 14.12 Odds ratio for high‐risk mammographic patterns according to HRT use starting time within HRT duration categories.

Table 14.13 Coal miners classified by breathlessness, wheezing, and age.

Chapter 15

Table 15.1 Factors for improving the fixing time of special aluminum printing.

Table 15.2 Design of 16 experiments.

Table 15.3 Orthogonal array L16(2^15).

Table 15.4 Column number where the interaction of two specified columns appears.

Table 15.5 Analysis of variance table for the aluminum experiment.

Table 15.6 Design and workability data for arc‐welding experiment.

Table 15.7 Summarized data with respect to factors D, F, and G.

Table 15.8 Summarized data with respect to the factors D, F, and G.

Table 15.9 Worked example.

Table 15.10 The marginal tables for the worked example.

Table 15.11 Estimated cell frequencies under Hφ.

List of Illustrations

Chapter 01

Figure 1.1 Average plots at (Fi, Gj).

Figure 1.2 No interaction.

Figure 1.3 Cause‐and‐effect diagram.

Chapter 02

Figure 2.1 t‐distribution fν(y).

Figure 2.2 Observation vector y and its projection onto the estimation space.

Chapter 03

Figure 3.1 An upper 100α percentile of N(0, 1).

Figure 3.2 Rejection regions R and R2.

Figure 3.3 Power function.

Figure 3.4 Construction of F‐test.

Chapter 04

Figure 4.1 The acceptance region.

Figure 4.2 The accepted hypotheses.

Figure 4.3 An improved confidence region.

Chapter 05

Figure 5.1 A confidence region with confidence coefficient 1 − α.

Chapter 06

Figure 6.1 Decrease in red cell counts.

Figure 6.2 NFLX half‐life data.

Chapter 07

Figure 7.1 Fitting a logit linear regression curve.

Chapter 10

Figure 10.1 Yields of corn in bushels per acre.

Figure 10.2 Yields of corn in bushels per acre.

Chapter 13

Figure 13.1 Interaction plots for drug and placebo.

Figure 13.2 Interaction plots for each of three subgroups.

Figure 13.3 Interaction plots of blood pressure for each of five sub‐classes.

Chapter 15

Figure 15.1 The required interaction diagram.

Figure 15.2 The interaction diagrams for L16(2^15).

Figure 15.3 The required pattern taken on Fig. 15.2 (2).

Figure 15.4 The required pattern taken on Fig. 15.2 (3).


WILEY SERIES IN PROBABILITY AND STATISTICS

Established by Walter A. Shewhart and Samuel S. Wilks

Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay

Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels

The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods.

Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches.

This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.

A complete list of titles in this series can be found at http://www.wiley.com/go/wsps

Advanced Analysis of Variance

Chihiro Hirotsu

This edition first published 2017
© 2017 John Wiley & Sons, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Chihiro Hirotsu to be identified as the author of this work has been asserted in accordance with law.

Registered Office
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office
111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty: The publisher and the authors make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for every situation. In view of on‐going research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. The fact that an organization or website is referred to in this work as a citation and/or potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising herefrom.

Library of Congress Cataloging‐in‐Publication Data

Names: Hirotsu, Chihiro, 1939– author.
Title: Advanced analysis of variance / by Chihiro Hirotsu.
Description: Hoboken, NJ : John Wiley & Sons, 2017. | Series: Wiley series in probability and statistics | Includes bibliographical references and index.
Identifiers: LCCN 2017014501 (print) | LCCN 2017026421 (ebook) | ISBN 9781119303343 (pdf) | ISBN 9781119303350 (epub) | ISBN 9781119303336 (cloth)
Subjects: LCSH: Analysis of variance.
Classification: LCC QA279 (ebook) | LCC QA279 .H57 2017 (print) | DDC 519.5/38–dc23
LC record available at https://lccn.loc.gov/2017014501

Cover Design: Wiley
Cover Image: © KTSDESIGN/SCIENCE PHOTO LIBRARY/Gettyimages; Illustration Courtesy of the Author

Preface

Scheffé’s old book (The Analysis of Variance, Wiley, 1959) still seems to be the best source for basic ANOVA theory. Indeed, his interpretation of the identification conditions on the main and interaction effects in a two‐way layout is excellent, while some textbooks give an erroneous explanation even of this. Miller’s book Beyond ANOVA (BANOVA; Chapman & Hall/CRC, 1998) intended to go beyond this a long time after Scheffé, and succeeded to some extent in bringing new ideas into the book – such as multiple comparison procedures, the monotone hypothesis, bootstrap methods, and empirical Bayes. He also gave detailed explanations of departures from the underlying assumptions of ANOVA – such as non‐normality, unequal variances, and correlated errors. In this way he presented the basics of applied statistics very nicely. However, I think this is still insufficient for dealing with real data, especially with regard to the points below, and there is a real need for an advanced book on ANOVA (AANOVA). Thus, this book intends to provide new technologies for data analysis built on the precise and exact basic theory of ANOVA.

A Unifying Approach to the Shape and Change‐point Hypotheses

The shape hypothesis (e.g., monotone) is essential in dose–response analysis, where a rigid parametric model is usually difficult to assume. It appears also when comparing treatments based on ordered categorical data. Then, the isotonic regression is the most well‐known approach to the monotone hypothesis in the normal one‐way layout model. It has been, however, introduced rather intuitively and has no obvious optimality for restricted parameter spaces like this. Further, the restricted maximum likelihood approach employed in the isotonic regression is too complicated to extend to non‐normal distributions, to the analysis of interaction effects, and also to other shape constraints such as convexity and sigmoidicity. Therefore, in the BANOVA book by Miller, a choice of Abelson and Tukey’s maximin linear contrast test is recommended for isotonic inference to escape from the complicated calculations of the isotonic regression. However, such a one‐degree‐of‐freedom contrast test cannot keep high power against the wide range of the monotone hypothesis, even by a careful choice of the contrast. Instead, the author’s approach is robust against the wide range of the monotone hypothesis and can be extended in a systematic way to various interesting problems, including analysis of the two‐way interaction effects. It starts from a complete class lemma for the tests against the general restricted alternative, suggesting the use of singly, doubly, and triply accumulated statistics as the basic statistics for the monotone, convexity, and sigmoidicity hypotheses, respectively. It also suggests two‐way accumulated statistics for two‐way data with natural ordering in rows and columns. Two promising statistics derived from these basic statistics are the cumulative chi‐squared statistics and the maximal contrast statistics. The cumulative chi‐squared is very robust and nicely characterized as a directional goodness‐of‐fit test statistic. In contrast, the maximal contrast statistic is characterized as an efficient score test for the change‐point hypothesis. It should be stressed here that there is a close relationship between the monotone hypothesis and the step change‐point model. Actually, each component of the step change‐point model is a particular monotone contrast, forming the basis of the monotone hypothesis in the sense that every monotone contrast can be expressed by a unique and positive linear combination of the step change‐point contrasts. The unification of the monotone and step change‐point hypotheses is also important in practice, since in monitoring the spontaneous reporting of the adverse events of a drug, for example, it is interesting to detect a change point as well as a general increasing tendency of reporting. The idea is extended to convexity and slope change‐point models, and sigmoidicity and inflection point models, thus giving a unifying approach to the shape and change‐point hypotheses generally. The basic statistics of the newly proposed approach are very simple and have a nice Markov property for elegant and exact probability calculation, not only for the normal distribution but also for the Poisson and multinomial distributions. This approach is of so simple a structure that many of the procedures for a one‐way layout model can be extended in a systematic way to two‐way data, leading to the two‐way accumulated statistics. These approaches have been shown repeatedly to have excellent power (see Chapters 6 to 11 and 13 to 15).
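To make the link between the monotone and step change‐point hypotheses concrete, here is a minimal sketch (not the author’s code; equal group sizes and a known error variance are assumed for simplicity) of a maximal contrast statistic of the max acc. t1 type: the kth component is the standardized step change‐point contrast comparing the first k group means with the remaining ones, and the statistic is the maximum of these components.

```python
import numpy as np

def max_acc_t1(means, n, sigma):
    """Maximal standardized step change-point contrast over k = 1, ..., a - 1.

    means: group means of a one-way layout with natural ordering,
    n: common group size, sigma: error standard deviation (assumed known).
    """
    a = len(means)
    components = []
    for k in range(1, a):
        left, right = np.mean(means[:k]), np.mean(means[k:])
        se = sigma * np.sqrt(1.0 / (k * n) + 1.0 / ((a - k) * n))
        components.append((right - left) / se)  # large for an upward step after k
    return max(components), components

# Hypothetical monotone dose-response profile, five observations per group
t_max, comps = max_acc_t1(np.array([0.1, 0.2, 0.8, 0.9]), n=5, sigma=1.0)
print([round(c, 2) for c in comps], round(t_max, 2))
```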

The Analysis of Two‐way Data

One of the central topics of data science is the analysis of interactions in the generalized sense. In a narrow sense, interactions are a departure from the additive effects of two factors. However, in the one‐way layout the main effects of a treatment also become the interaction effects between the treatment and the response if the response is given by a categorical response instead of quantitative measurements. In this case the data yij are the frequency of cell (i, j) for the ith treatment and the jth categorical response. If we denote the probability of cell (i, j) by pij, the treatment effect is a change of the profile (pi1, pi2, …, pib) of the ith treatment, and it is the interaction effects in terms of pij that are of concern. In this case, however, the naïve additive model is often inappropriate and a log linear model

log pij = μ + αi + βj + (αβ)ij

is assumed. Then, the interaction factor (αβ)ij denotes the treatment effects. In this sense the regression analysis is also a sort of interaction analysis between the explanatory and response variables. Further, the logit model, the probit model, the independence test of a contingency table, and the canonical correlation analysis are all regarded as a sort of interaction analysis. In previous books, however, interaction analysis has received less attention than it deserves, and mainly an overall F‐ or χ2‐test has been described for the two‐way ANOVA. There are several immanent problems in the analysis of two‐way data which are hardly described anywhere:

The characteristics of the rows and columns – such as controllable, indicative, variational, and response – should be taken into consideration.

The degrees of freedom are often so large that an overall analysis can tell almost nothing about the details of the data. In contrast, multiple comparison procedures based on one‐degree‐of‐freedom contrasts, as taken in BANOVA (1998), lack power, and the test result is usually unclear.

There is often natural ordering in the rows and/or columns, which should be taken into account in the analysis. The isotonic regression is, however, too complicated for the analysis of two‐way interaction effects.

In the usual two‐way ANOVA with controllable factors in the rows and columns, the purpose of the experiment will be to determine the best combination of the two factors that gives the highest productivity. However, let us consider an example of the international adaptability test of rice varieties, where the rows represent the 44 regions [e.g., Niigata (Japan), Seoul, Nepal, Egypt, and Mexico] and the columns represent the 18 varieties of rice [e.g., Rafaelo, Koshihikari, Belle Patna, and Hybrid]. Then the columns are controllable but the rows are indicative, and the problem is by no means to choose the best combination of row and column as in the usual ANOVA. Instead, the purpose should be to assign an optimal variety to each region. Then, the row‐wise multiple comparison procedures for grouping rows with a similar response profile to columns and assigning a common variety to those regions in the same group should be an attractive approach (see the sketch after this paragraph). As another example, let us consider a dose–response analysis based on the ordered categorical data in a phase II clinical trial. Then, the rows represent dose levels and are controllable. The columns are the response variables and the data are characterized by the ordinal rows and columns. Of course, the purpose of the trial is to choose an optimal dose level based on the ordered categorical responses. Then, applying the step change‐point contrasts to rows should be an attractive approach to detecting the effective dose. There are several ideas for dealing with the ordered columns, including the two‐way accumulated statistics. The approach should be regarded as a sort of profile analysis and can also be applied to the analysis of repeated measurements. These examples show that each set of two‐way data requires its own analysis. Indeed, the analysis of two‐way data is a rich source of interesting theories and applications (see Chapters 10, 11, 13, and 14).
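As one concrete rendering of the row‐wise grouping idea (a sketch under assumptions, not the book’s exact procedure), one can compute a chi‐squared–type squared distance between the response profiles of every pair of rows; rows at a small distance are candidates to receive a common treatment, as in the rice‐variety example:

```python
import numpy as np

def profile_distance(y, i, k):
    """Chi-squared-type squared distance between the profiles of rows i and k
    of a two-way frequency table y, weighting columns by the overall column
    distribution as in the usual chi-squared metric (a sketch)."""
    p_i = y[i] / y[i].sum()          # row profiles
    p_k = y[k] / y[k].sum()
    p_col = y.sum(axis=0) / y.sum()  # overall column distribution
    return np.sum((p_i - p_k) ** 2 / p_col)

# Hypothetical 3 x 4 table: rows = regions, columns = ordered responses
y = np.array([[10, 20, 30, 40],
              [12, 18, 33, 37],
              [40, 30, 20, 10]])
for i in range(3):
    for k in range(i + 1, 3):
        print(i, k, round(profile_distance(y, i, k), 4))
# Rows 0 and 1 have almost identical profiles and would be grouped together.
```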

Multiple Decision Processes

Unification of non‐inferiority, equivalence, and superiority tests

Around the 1980s there were several serious problems in the statistical analysis of clinical trials in Japan, among which two major ones were the multiplicity problem and the treatment of non‐significance as equivalence. These were international problems as well. The outline of the latter problem is as follows.

In a phase III trial for a new drug application in Japan, the drug used to be compared with an active control instead of a placebo, and approved if it was evaluated as equivalent to the control in terms of efficacy and safety. The problem was that non‐significance by the usual t or Wilcoxon test had long been regarded as proof of equivalence in Japan. This was stupid, since non‐significance can so easily be achieved by an imprecise clinical trial with a small sample size. The author (and several others) fought against this, and introduced a non‐inferiority test which requires rejecting the handicapped null hypothesis

H0: p1 − p0 ≤ −Δ

against the one‐sided alternative

H1: p1 − p0 > −Δ,

where p1 and p0 are the efficacy rates of the test and control drugs, respectively, and Δ > 0 is the non‐inferiority margin.

Further, the author found that a margin Δ = 0.1, with one‐sided significance level 0.05, would usually be appropriate in the sense that approximately equal observed efficacy proportions of the two drugs will clear the non‐inferiority criterion at the sample sizes usually employed in Japanese phase III clinical trials. Actually, the Japanese Statistical Guideline employed the procedure six years in advance of the International Guideline (ICH E9), which employed it in 1998. However, there still remains the problem of how to justify the usual practice of superiority testing after proving non‐inferiority. This has been overcome by a unifying approach to non‐inferiority and superiority tests based on multiple decision processes. It nicely combines the one‐ and two‐sided tests, replacing the usual simple confidence interval for normal means by a more useful confidence region. It does not require a pre‐choice of the non‐inferiority or superiority test, or of the one‐ or two‐sided test. The procedure gives essentially the power of the one‐sided test, while keeping the two‐sided statistical inference without any prior information (see Chapter 4 and Section 5.3).
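A minimal sketch of the non‐inferiority test just described for two binomial proportions (a normal approximation with the usual standard error and the margin Δ = 0.1 are assumed here; the exact formulation appears in Chapter 5):

```python
import math

def noninferiority_z(x1, n1, x0, n0, delta=0.1):
    """One-sided z test of H0: p1 - p0 <= -delta against H1: p1 - p0 > -delta;
    reject at one-sided level 0.05 if z > 1.645 (normal approximation)."""
    p1, p0 = x1 / n1, x0 / n0
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return (p1 - p0 + delta) / se

# Hypothetical trial: nearly equal observed efficacy rates clear the margin.
z = noninferiority_z(x1=160, n1=200, x0=162, n0=200)
print(round(z, 2), z > 1.645)   # 2.27 True
```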

Mixed and Random Effects Model

In factorial experiments, if all the factors except error are fixed effects, it is called a fixed effects model. If the factors are all random except for a general mean, it is called a random effects model. If both types of factor are involved in the experiment, it is called a mixed effects model. In this book mainly fixed effects models are described, but there are cases where it is better to consider the effects of a factor to be random; we discuss basic ideas regarding mixed and random effects models in Chapter 12. In particular, a recent development of the mixed effects model in the engineering field, profile analysis, is introduced in Chapter 13. There is a factor, the variation factor, which is dealt with as fixed in the laboratory but acts as if it were random in the extension to the real world. Therefore, this is a problem of interpretation of data rather than of mathematics (see Chapters 12 and 13).

Software and Tables

The algorithms for calculating the p‐values of the maximal contrast statistics introduced in this book have been developed widely and extensively by my colleagues, and I have decided to support some of them on my website. They are based on the Markov properties of the component statistics. As described in the text, they are simple in principle, and the reader is also encouraged to develop his or her own algorithms. Presently, the probabilities of popular distributions such as the normal, t, F, and chi‐squared are obtained very easily on the Internet (see keisan.casio.com, for example), so only a few tables are given in the Appendix – those not easily available elsewhere. Among them, Tables A and B are original ones calculated by the proposed algorithm.

Examples

Finally, it should be stressed that all the newly introduced methods originated from real problems which the author experienced in his work in statistical quality control, clinical trials, and the evaluation of New Drug Applications from the regulatory side. There are naturally plenty of real examples supplied in this book compared with previous books. Also, this book is not restricted to ANOVA in the narrow sense, but extends these methodologies to discrete data (including contingency tables). Thus, the book intends to provide advanced techniques of applied statistics beyond the previous elementary books on ANOVA.

Acknowledgments

I would like to thank first the late Professor Tetsuichi Asaka for inviting me to take an interest in applied statistics through the real field of quality control. I would also like to thank Professor Kei Takeuchi for inviting me to study statistical methods based on rigid mathematical statistics. I would also like to thank Professor Sir David Cox for sharing his interest in the wide range of statistical methods available. In particular, my stay at Imperial College in 1978 when visiting him was stimulating and had a significant impact on my later career. The publication of this book is itself due to his encouragement. I would like to thank my research colleagues abroad – Muni Srivastava, Fortunato Pesarin, Ludwig Hothorn, and Stanislaw Mejza in particular – for long‐term discussions and also for direct comments on the draft of this book.

I must not forget to thank my students at the University of Tokyo, including Tetsuhisa Miwa, Hiroe Tsubaki, and Satoshi Kuriki, but they are too many to mention one by one. The long and heated discussions with them at seminars were indeed helpful for me to widen and deepen my interest in both theoretical and applied statistics. In particular, as seen in the References, most of my papers published after 2000 are co‐authored with these students.

My research would never have been complete without the help of the colleagues who developed various software supporting the newly proposed statistical methods. They include Kenji Nishihara, Shoichi Yamamoto, and most recently Harukazu Tsuruta, who took over and extended the algorithms that had been developed over many years. He also read an early draft of this book carefully and gave many valuable comments, as well as a variety of technical support in preparing this book. For technical support, I would also like to thank Yasuhiko Nakamura and Hideyasu Karasawa.

Financially, my research has been supported for a long time by a grant‐in‐aid for scientific research of the Japan Society for the Promotion of Science. My thanks are also due to Meisei University, which provided me with a laboratory and managerial support for a long time, even after my retirement from the faculty. My thanks are also due to the Wiley Executive Statistics Editor Jon Gurstelle, Project Editor Divya Narayanan, Production Editor Vishnu Priya, and other staff for their help and useful suggestions in publishing this book. Finally, thanks to my wife Mitsuko, who helped me in calculating Tables 10.10 and 12.4 a long time ago and has given continuous support of all kinds since then.

Chihiro HirotsuTokyo, March 2017

Notation and Abbreviations

Notation

Asterisks on a number (e.g., t*, t**): statistical significance at level 0.05 or 0.01

Column vector: bold lowercase italic letter, v

Matrix: bold uppercase italic letter, M

Transpose of vector and matrix: v′, M′

Observation vector: one‐way, y = (y11, y12, …)′; two‐way, y = (y111, y112, …)′, with elements arranged in dictionary order of the suffixes

Dot and bar notation: a dot in a suffix denotes the sum and a bar the average over that suffix; one‐way, yi· = Σj yij and ȳi· = yi·/ni; two‐way, yij·, ȳij·, and so on

Dot and bar notation in vectors: the same convention applied to the elements of a vector

0n: a zero vector of size n, the suffix is omitted when it is obvious

jn: a vector of size n with all elements unity, the suffix is omitted when it is obvious

In: an identity matrix of size n, the suffix is omitted when it is obvious

Pn: an n × (n − 1) orthonormal matrix satisfying P′nPn = In−1, P′njn = 0

|A|: determinant of a matrix A

tr(A): trace of a matrix A

v2: v′v, the squared length of a vector v

Dλ: a diagonal matrix with diagonal elements λ1, …, λn arranged in dictionary order

⊗: direct (Kronecker’s) product of two matrices

Kronecker’s delta δij: δij = 1 if i = j and δij = 0 otherwise

a ≈ b: a is nearly equal to b

N(μ, σ2): normal distribution with mean μ and variance σ2

N(μ, Ω): multivariate normal distribution with mean vector μ and variance–covariance matrix Ω

zα: upper α point of the standard normal distribution N(0, 1)

tν(α): upper α point of the t‐distribution with degrees of freedom ν

χ2ν(α): upper α point of the χ2‐distribution with degrees of freedom ν

Fν1,ν2(α): upper α point of the F‐distribution with degrees of freedom (ν1, ν2)

qa,ν(α): upper α point of the Studentized range

B(n, p): binomial distribution

M(n, p): multinomial distribution

H(y | R1, C1, N): hypergeometric distribution

MH(yij | yi·, y·j), MH(yij | Ri, Cj, N): multivariate hypergeometric distribution for two‐way data

f(y, θ) and p(y, θ): density function and probability function

Pr(A), Pr{A}: probability of event A

L(y, θ), L(θ): likelihood function

E(y) and E(y): expectation of a random variable y and of a random vector y

1 Introduction to Design and Analysis of Experiments

1.1 Why Simultaneous Experiments?

Let us consider the problem of estimating the weight μ of a material W using four measurements by a balance. The statistical model for this experiment is written as

yi = μ + ei, i = 1, …, 4,

where the ei are uncorrelated with expectation zero (unbiasedness) and equal variance σ2. Then, the natural estimator

ȳ = (y1 + y2 + y3 + y4)/4

is an unbiased estimator of μ with minimum variance σ2/4 among all the linear unbiased estimators of μ. Further, if the normal distribution is assumed for the error ei, then ȳ is the minimum variance unbiased estimator of μ among all the unbiased estimators, not necessarily linear.

In contrast, when there are four unknown means μ1, μ2, μ3, μ4, we can estimate all the μi with variance σ2/4 and unbiasedness simultaneously by the same four measurements. This is achieved by measuring the total weight and the differences among the μi's according to the following design, where ± means putting the material on the right or left side of the balance:

y1 = μ1 + μ2 + μ3 + μ4 + e1
y2 = μ1 − μ2 + μ3 − μ4 + e2
y3 = μ1 + μ2 − μ3 − μ4 + e3
y4 = μ1 − μ2 − μ3 + μ4 + e4 (1.1)

Then, the estimators

μ̂1 = (y1 + y2 + y3 + y4)/4, μ̂2 = (y1 − y2 + y3 − y4)/4,
μ̂3 = (y1 + y2 − y3 − y4)/4, μ̂4 = (y1 − y2 − y3 + y4)/4

are the best linear unbiased estimators (BLUE; see Section 2.1), each with variance σ2/4. Therefore, the naïve method of replicating four measurements for each μi to achieve variance σ2/4 is a considerable waste of time. More generally, when the number of measurements n is a multiple of 4, we can form unbiased estimators of all n weights, each with variance σ2/n. This is achieved by applying a Hadamard matrix for the coefficients of the μi's on the right‐hand side of equation (1.1) (see Section 15.3 for details, as well as the definition of a Hadamard matrix).
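To see the variance claim numerically, here is a minimal sketch (not from the book; the sign pattern of (1.1) as reconstructed above is assumed) that simulates the four weighings and checks that each estimator is unbiased with variance σ2/4:

```python
import numpy as np

# Sylvester-type Hadamard matrix: rows = weighings, columns = materials.
# The first weighing takes the total; the rest take sign differences, as in (1.1).
H = np.array([[ 1,  1,  1,  1],
              [ 1, -1,  1, -1],
              [ 1,  1, -1, -1],
              [ 1, -1, -1,  1]])

rng = np.random.default_rng(0)
mu = np.array([2.0, 1.0, 3.0, 0.5])   # hypothetical true weights
sigma = 0.2
n_rep = 100_000

e = rng.normal(0.0, sigma, size=(n_rep, 4))
y = mu @ H.T + e                      # each row is one run of the four weighings
mu_hat = y @ H / 4.0                  # BLUE, since H'H = 4I

print(mu_hat.mean(axis=0))            # close to mu (unbiasedness)
print(mu_hat.var(axis=0))             # close to sigma**2 / 4 = 0.01
```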

1.2 Interaction Effects

Simultaneous experiments are necessary not only for the efficiency of the estimators, but also for detecting interaction effects. The data in Table 1.1 show the results of 16 experiments (with averages in parentheses) for improving a printing machine with an aluminum plate. The measurements are fixing times (s); the shorter, the better. The factor F is the amount of ink and G the drying temperature. The plots of the averages are given in Fig. 1.1.

Table 1.1 Fixing time of special aluminum printing.

Ink supply      Temperature G1               Temperature G2
F1: large       5.9, 3.7, 4.6, 4.4 (4.65)    5.7, 5.0, 4.9, 2.1 (4.43)
F2: small       4.7, 3.3, 4.5, 1.0 (3.38)    8.2, 5.9, 10.7, 8.5 (8.33)

Figure 1.1 Average plots at (Fi, Gj).

From Fig. 1.1, (F2, G1) is suggested as the best combination. On the contrary, if we compare the amount of ink first, fixing the drying temperature at G2, we shall erroneously choose F1. Then we may fix the ink level at F1 and try to compare the drying temperatures, reaching the conclusion that (F1, G2) is the optimal combination without ever trying the best combination, (F2, G1). In this example the optimal level of ink is reversed according to the levels G1 and G2 of the other factor. If there is such an interaction effect between the two factors, then a one‐factor‐at‐a‐time experiment will fail to find the optimal combination. In contrast, if there is no such interaction effect, then the effects of the two factors are called additive. In this case, denoting the mean for the combination (Fi, Gj) by μij, the equation

μij − μ̄i· − μ̄·j + μ̄·· = 0 (1.2)

holds, where the dot and overbar denote the sum and average with respect to the suffix replaced by the dot throughout the book. Therefore, μ̄·· implies the overall average (general mean), for example. If equation (1.2) holds, then the plot of the averages becomes like that in Fig. 1.2. Although in this case a one‐factor‐at‐a‐time experiment will also reach the correct decision, simultaneous experiments to detect the interaction effects are strongly recommended in the early stage of the experiment.

Figure 1.2 No interaction.
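As a quick check of (1.2) on the data at hand, the following sketch (illustrative, not from the book) computes the interaction contrasts μij − μ̄i· − μ̄·j + μ̄·· from the four cell averages of Table 1.1; a markedly nonzero pattern reflects the reversal seen in Fig. 1.1:

```python
import numpy as np

# Cell averages from Table 1.1 (rows: F1, F2; columns: G1, G2)
m = np.array([[4.65, 4.43],
              [3.38, 8.33]])

grand = m.mean()                       # overall average (general mean)
row = m.mean(axis=1, keepdims=True)    # averages over temperature G
col = m.mean(axis=0, keepdims=True)    # averages over ink supply F

interaction = m - row - col + grand    # left-hand side of (1.2)
print(interaction)
# [[ 1.2925 -1.2925]
#  [-1.2925  1.2925]] -- far from zero, so the effects are not additive.
```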

1.3 Choice of Factors and Their Levels

A cause affecting the target value is called a factor. Usually, there are assumed to be many affecting factors at the beginning of an experiment. To write down all those factors, a ‘cause‐and‐effect diagram’ like that in Fig. 1.3 is useful. It uses the thick and thin bones of a fish to express the rough and detailed causes, arranged in order of operation. In drawing up the diagram it is necessary to collect as many opinions as possible from the various participants in the different areas. However, it is impossible to include all factors in the diagram at the very beginning of the experiment, so it is necessary to examine past data or carry out preliminary experiments. Further, it is essential to obtain as much information as possible on the interaction effects among those factors. For every factor employed in the experiment, several levels are set up – such as the places of origin of materials or the reaction temperatures. The levels of a nominal variable are naturally determined by the environment of the experiment. However, choosing the levels of a quantitative factor is rather arbitrary. Therefore, sequential experiments are sometimes required: first to outline the response surface roughly, and then to design precise experiments near the suggested optimal points. In Fig. 1.1, for example, the optimal level of temperature G with respect to F2 is unknown – it may lie below G1 or between G1 and G2. Therefore, in the first stage of the experiment it is desirable to design the experiment so as to obtain an outline of the response curve. The choice of factors and their levels is discussed in more detail in Cox (1958).

Figure 1.3 Cause‐and‐effect diagram.

1.4 Classification of Factors

This topic is discussed more in Japan than in other countries, and we follow here the definition of Takeuchi (1984).

Controllable factor.

The level of the controllable factor can be determined by the experimenter and is reproducible. The purpose of the experiment is often to find the optimal level of this factor.

Indicative factor.

This factor is reproducible but not controllable by the experimenter. The region in the international adaptability test of rice varieties is a typical example, while the variety is a controllable factor. In this case the region is not the object of an optimal choice; the purpose is instead to choose an optimal variety for each region, so that an interaction analysis between the controllable and indicative factors is of major interest.

Covariate factor.

This factor is reproducible but impossible to define before the experiment. It is known only after the experiment, and used to enhance the precision of the estimate of the main effects by adjusting its effect. The covariate in the analysis of covariance is a typical example.

Variation (noise) factor.

This factor is reproducible and possible to specify only in laboratory experiments. In the real world it is not reproducible and acts as if it were noise. In the real world it is quite common for users not to follow exactly the specifications of the producer. For example, a drug for an infectious disease may be used before identifying the causal germ intended by the producer, or administered to a subject with some kidney difficulty who would have been excluded from the trial. Such a factor is called a noise factor in the Taguchi method.

Block factor.

This factor is not reproducible but can be introduced to eliminate the systematic error in fertility of land or temperature change with passage of time, for example.

Response factor.

This factor appears typically as a categorical response in a contingency table, and there are two important cases: nominal and ordinal. The response is usually not called a factor, but mathematically it can be regarded and dealt with as a factor, with categories just like levels.

One should also refer to Cox (1958) for a classification of the factors from another viewpoint.

1.5 Fixed or Random Effects Model?

Among the factors introduced in Section 1.4, the controllable, indicative and covariate factors are regarded as fixed effects. The variation factor is dealt with as fixed in the laboratory but as random in extending laboratory results to the real world. Therefore, the levels specified in the laboratory should be wide enough to cover the wide range of real applications. The block is premised to have no interaction with other factors, so that treating it either as fixed or random does not affect the result. However, it is necessarily random in the recovery of inter‐block information in the incomplete block design (see Section 9.2).

The definition of fixed and random effects models was first introduced by Eisenhart (1947), but it has also been remarked that the two are mathematically equivalent and that the definitions are rather misleading. Although a little controversial, the distinction between fixed and random still seems useful for the interpretation and application of experimental results, and is discussed in detail in Chapters 12 and 13.

1.6 Fisher’s Three Principles of Experiments vs. Noise Factor

To compare the treatments in experiments, Fisher (1960) introduced three principles: (1) randomization, (2) replication and (3) local control.

To explain randomization, Fisher introduced the sensory test of tasting a cup of tea made with milk. The problem then is to know whether it is true or not that a lady can declare correctly whether the milk or the tea infusion was added to the cup first. The experiment consists of mixing eight cups of tea, four in one way and four in the other, and presenting them to the subject for judgment. There are, however, numerous uncontrollable causes which may influence the result: the requirement that all the cups are exactly alike is impossible; the strength of the tea infusion may change between pouring the first and last cup; and the temperature at which the tea is tasted will change in the course of the experiment. One procedure that is used to escape from such systematic noise is to randomize the order of the eight cups for tasting. This process converts the systematic noise to random error, giving the basis of statistical inference.

Secondly, it is necessary to replicate the experiments to raise the sensitivity of comparison. It is also necessary to separate and evaluate the noise from treatment effects, since the outcomes of experiments under the same experimental conditions can vary due to unknown noise. The treatment effects of interest should be beyond such random fluctuations, and to ensure this several replications of experiments are necessary to evaluate the effects of noise.

Local control is a technique to ensure homogeneity within a small area for comparing treatments by splitting a total area with large deviations of noise. In field experiments comparing a plant varieties, the whole area is partitioned into n blocks so that the fertility becomes homogeneous within each block. Then, the precision of comparisons is improved compared with completely randomized experiments over all an plots.

Fisher’s idea to enhance the precision of comparisons is useful in laboratory experiments in the first stage of research development. However, in a clinical trial for comparing antibiotics, for example, too rigid a definition of the target population and the causal germs may not coincide with real clinical treatment. This is because, in the real world, antibiotics may be used by patients with some kidney trouble who might be excluded from the trial, by older patients beyond the range of the trial, before identifying the causal germ exactly, or with poor compliance with the dosing interval. Therefore, in the final stage of research development it is necessary to purposely introduce variations in users and environments into the experiments so as to achieve a product that is robust in the real world. It should be noted here that the purpose of experiments is not to know all about the sample, but to know all about the background population from which the sample is taken – so the experiment should be designed to simulate or represent well the target population.

1.7 Generalized Interaction

A central topic of data science is the analysis of interaction in a generalized sense. In a narrow sense, it is the departure from the additive effects of two factors. If the effect of one factor differs according to the levels of the other factor, then the departure becomes large (as in the example of Section 1.2).

In the one‐way layout also, the main effects of a treatment become the interaction between the treatment and the response if the response is given by a categorical response instead of quantitative measurements. In this case, the data yij are the frequency of the (i, j) cell for the ith treatment and the jth categorical response. If we denote the probability of cell (i, j) by pij, then the treatment effect is a change in the profile (pi1, pi2, …, pib) of the ith treatment, and it is the interaction effects in terms of pij that are of concern. In this case, however, a naïve additive model like (1.2) is often inappropriate, and the log linear model

log pij = μ + αi + βj + (αβ)ij

is assumed. Then, the factor (αβ)ij denotes the ith treatment effect. In this sense, the regression analysis is also a sort of interaction analysis between the explanatory and response variables. Further, the logit model, probit model, independence test of a contingency table, and canonical correlation analysis are all regarded as a sort of interaction analysis. One should also refer to Section 7.1 regarding this idea.

1.8 Immanent Problems in the Analysis of Interaction Effects

In spite of its importance, the analysis of interaction receives much less attention than it deserves, and often in textbooks only an overall F‐ or χ2‐test is described. However, the degrees of freedom for interaction are usually large, and such an overall test cannot tell any detail of the data – even if the test result is highly significant. The degrees of freedom are explained in detail in Section 2.5.5. In contrast, the multiple comparison procedure based on one‐degree‐of‐freedom statistics is far less powerful, and the interpretation of its result is usually unclear. Textbooks usually recommend estimating the combination effect μij by the cell mean ȳij· if interaction exists. However, it often occurs that only a few degrees of freedom can explain the interaction very well, and in this case we can recover information for μij from other cells and improve the naïve estimate. This also implies that it is possible to separate the essential interaction from the noisy part without replicated experiments. Further, the purpose of interaction analysis has many aspects, although textbooks usually describe only how to find an optimal combination of the controllable factors. In this regard the classification of factors plays an essential role (see Chapters 10, 11, 13, and 14).

1.9 Classification of Factors in the Analysis of Interaction Effects

In the case of a two‐factor experiment, one factor should be controllable, since otherwise the experiment cannot result in any action. In the case of controllable vs. controllable, the purpose of the experiment will be to specify the optimal combination of the levels of those two factors for the best productivity. Most textbooks describe this situation. However, the usual F test is not useful in practice, and the simple interaction model derived from the multiple comparison approach would be more useful.

In the case of controllable vs. indicative, the indicative factor is not the object of optimization; the purpose is to specify the optimal level of the controllable factor for each level of the indicative factor. In the international adaptability test of rice varieties, for example, the purpose is obviously not to select an overall best combination but to specify an optimal variety (controllable) for each region (indicative). However, it would be inconvenient to maintain a separate optimal variety for each of the many regions of the world, and a multiple comparison procedure for grouping regions with similar response profiles is therefore required.

The case of controllable vs. variation is most controversial. If the purpose is to maximize the characteristic value, then the interaction is a sort of noise in extending the laboratory result to the real world, where the variation factor cannot be specified rigidly and may take diverse levels. Therefore, it is necessary to search for a robust level of the controllable factor to give a large and stable output beyond the random fluctuations of the variation factor. Testing main effects by interaction effects in the mixed effects model of controllable vs. variation factors is one method in this line (see Section 12.3.5).

1.10 Pseudo Interaction Effects (Simpson’s Paradox) in Categorical Data

In the case of categorical responses, the data are presented as the number of subjects satisfying a specified attribute. Binary (1, 0) data, with or without the specified attribute, are a typical example. In such cases it is controversial how to define the interaction effects (see Darroch, 1974). In most cases an additive model is inappropriate, and is replaced by a multiplicative model. The numerical example in Table 1.2 will explain well how the additive model can be inappropriate; the two response categories are ‘useful’ and ‘useless’. In Table 1.2
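The paradox is easy to reproduce numerically. Since Table 1.2’s actual figures are not given in this excerpt, the following sketch uses hypothetical counts, labeled as such, to show how a treatment can be better within every stratum and yet worse in the pooled 2 × 2 table:

```python
# Hypothetical counts (not Table 1.2): (useful, useless) for treatments A and B
strata = {
    "stratum 1": {"A": (90, 10),   "B": (800, 200)},   # A: 0.90, B: 0.80
    "stratum 2": {"A": (100, 300), "B": (10, 90)},     # A: 0.25, B: 0.10
}

pooled = {"A": [0, 0], "B": [0, 0]}
for name, cells in strata.items():
    for trt, (useful, useless) in cells.items():
        pooled[trt][0] += useful
        pooled[trt][1] += useless
        print(f"{name} {trt}: {useful / (useful + useless):.2f}")

for trt in ("A", "B"):
    useful, useless = pooled[trt]
    print(f"pooled    {trt}: {useful / (useful + useless):.2f}")
# A beats B in each stratum (0.90 > 0.80 and 0.25 > 0.10), yet pooling gives
# A: 0.38 vs. B: 0.74 -- a pseudo interaction caused by unequal stratum sizes.
```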