An Introduction to Cochran-Mantel-Haenszel Testing and Nonparametric ANOVA - J. C. W. Rayner - E-Book

Description

Complete reference for applied statisticians and data analysts that uniquely covers the new statistical methodologies that enable deeper data analysis.

An Introduction to Cochran-Mantel-Haenszel Testing and Nonparametric ANOVA provides readers with powerful new statistical methodologies that enable deeper data analysis. The book offers applied statisticians an introduction to the latest topics in nonparametrics. The worked examples with supporting R code provide analysts the tools they need to apply these methods to their own problems. Co-authored by an internationally recognised expert in the field and an early career researcher with broad skills including data analysis and R programming, the book discusses key topics such as:

* NP ANOVA methodology
* Cochran-Mantel-Haenszel (CMH) methodology and design
* Latin squares and balanced incomplete block designs
* Parametric ANOVA F tests for continuous data
* Nonparametric rank tests (the Kruskal-Wallis and Friedman tests)
* CMH MS tests for the nonparametric analysis of categorical response data

Applied statisticians and data analysts, as well as students and professors in data analysis, can use this book to gain a complete understanding of the modern statistical methodologies that are allowing for deeper data analysis.

Page count: 301

Year of publication: 2023

Table of Contents

Cover

Title Page

Copyright

Dedication

Preface

1 Introduction

1.1 What Are the CMH and NP ANOVA Tests?

1.2 Outline

1.3 R

1.4 Examples

Bibliography

Note

2 The Basic CMH Tests

2.1 Genesis: Cochran (1954), and Mantel and Haenszel (1959)

2.2 The Basic CMH Tests

2.3 The Nominal CMH Tests

2.4 The CMH Mean Scores Test

2.5 The CMH Correlation Test

Bibliography

3 The Completely Randomised Design

3.1 Introduction

3.2 The Design and Parametric Model

3.3 The Kruskal–Wallis Tests

3.4 Relating the Kruskal–Wallis and ANOVA F Tests

3.5 The CMH Tests for the CRD

3.6 The KW Tests Are CMH MS Tests

3.7 Relating the CMH MS and ANOVA F Tests

3.8 Simulation Study

3.9 Wald Test Statistics in the CRD

Bibliography

4 The Randomised Block Design

4.1 Introduction

4.2 The Design and Parametric Model

4.3 The Friedman Tests

4.4 The CMH Test Statistics in the RBD

4.5 The Friedman Tests are CMH MS Tests

4.6 Relating the CMH MS and ANOVA F Tests

4.7 Simulation Study

4.8 Wald Test Statistics in the RBD

Bibliography

5 The Balanced Incomplete Block Design

5.1 Introduction

5.2 The Durbin Tests

5.3 The Relationship Between the Adjusted Durbin Statistic and the ANOVA F Statistic

5.4 Simulation Study

5.5 Orthogonal Contrasts for Balanced Designs with Ordered Treatments

5.6 A CMH MS Analogue Test Statistic for the BIBD

Bibliography

6 Unconditional Analogues of CMH Tests

6.1 Introduction

6.2 Unconditional Univariate Moment Tests

6.3 Generalised Correlations

6.4 Unconditional Bivariate Moment Tests

6.5 Unconditional General Association Tests

6.6 Stuart's Test

Bibliography

7 Higher Moment Extensions to the Ordinal CMH Tests

7.1 Introduction

7.2 Extensions to the CMH Mean Scores Test

7.3 Extensions to the CMH Correlation Test

7.4 Examples

Bibliography

8 Unordered Nonparametric ANOVA

8.1 Introduction

8.2 Unordered NP ANOVA for the CMH Design

8.3 Singly Ordered Three‐Way Tables

8.4 The Kruskal–Wallis and Friedman Tests Are NP ANOVA Tests

8.5 Are the CMH MS and Extensions NP ANOVA Tests?

8.6 Extension to Other Designs

8.7 Latin Squares

8.8 Balanced Incomplete Blocks

Bibliography

9 The Latin Square Design

9.1 Introduction

9.2 The Latin Square Design and Parametric Model

9.3 The RL Test

9.4 Alignment

9.5 Simulation Study

9.6 Examples

9.7 Orthogonal Trend Contrasts for Ordered Treatments

9.8 Technical Derivation of the RL Test

Bibliography

10 Ordered Non‐parametric ANOVA

10.1 Introduction

10.2 Ordered NP ANOVA for the CMH Design

10.3 Doubly Ordered Three‐Way Tables

10.4 Extension to Other Designs

10.5 Latin Square Rank Tests

10.6 Modelling the Moments of the Response Variable

10.7 Lemonade Sweetness Data

10.8 Breakfast Cereal Data Revisited

Bibliography

11 Conclusion

11.1 CMH or NP ANOVA?

11.2 Homosexual Marriage Data Revisited for the Last Time!

11.3 Job Satisfaction Data

11.4 The End

Bibliography

Appendix A: Appendix

A.1 Kronecker Products and Direct Sums

A.2 The Moore–Penrose Generalised Inverse

Subject Index

References Index

Data Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Growth of strawberry plants after applying pesticides.

Table 1.2 Opinions on homosexual marriage by religious beliefs and education...

Chapter 2

Table 2.1 The style of contingency table addressed.

Table 2.2 Three jams ranked for sweetness by eight judges.

Table 2.3 Cross classification of age and whiskey.

Table 2.4 Analysis of Jams data.

Chapter 3

Table 3.1 Corn data.

Table 3.2 Proportion of rejections using Kruskal–Wallis and F tests for a no...

Chapter 4

Table 4.1 Cross‐classified counts of categorical responses made by 15 subjec...

Table 4.2 Long form of the same data in Table 4.1.

Table 4.3 Observed frequencies with expected frequencies in parentheses for ...

Table 4.4 Observed frequencies with expected frequencies in parentheses for ...

Table 4.5 Rankings of five applicants by a selection committee.

Table 4.6 Proportion of rejections using the Friedman and ANOVA F tests for ...

Chapter 5

Table 5.1 Scores for ice creams.

Table 5.2 Rankings for breakfast cereals.

Table 5.3 Proportion of rejections using the unadjusted Durbin, adjusted Dur...

Table 5.4 Linear and quadratic coefficients.

Table 5.5 Lemonades ranked by five tasters.

Table 5.6 Proportion of rejections using chi‐squared and F contrasts for a n...

Chapter 6

Table 6.1 Wechsler adult intelligence scores.

Table 6.2 Lizards data summarising the number of ants consumed based on mont...

Table 6.3 Generalised correlations with t‐test and permutation test p‐valu...

Table 6.4 Genuine trivariate generalised correlations together with p‐valu...

Table 6.5 Homosexual marriage opinions p‐values.

Table 6.6 Cross‐over clinical data.

Table 6.7 Saltiness scores for three products.

Chapter 7

Table 7.1 Jams orthonormal polynomials for (a) response and (b) treatment.

Table 7.2 CMH GC p‐values when testing for zero generalised correlations for...

Table 7.3 CMH generalised correlation p‐values for the homosexual marriage d...

Chapter 8

Table 8.1 Unused red‐light time in minutes.

Chapter 9

Table 9.1 Row and column block effects together with treatment effects for e...

Table 9.2 Dynamite data cross‐tabulated with batches and operators, with tre...

Table 9.3 Peanut yields and the corresponding varieties.

Chapter 10

Table 10.1 P‐values for generalised correlations tested for consistency with...

Table 10.2 ANOVA F test p‐values assessing if generalised correlations vary ...

Table 10.3 Lemonade sweetness ranked by 10 judges.

Table 10.4 Summary of the non‐parametric unordered analysis for the lemonade...

Table 10.5 Lemonade polynomial means.

Chapter 11

Table 11.1 Homosexual marriage opinions p‐values for conditional and uncondi...

Table 11.2 Satisfaction and income in males and females. The satisfaction co...

Table 11.3 Job satisfaction p‐values.

List of Illustrations

Chapter 6

Figure 6.1 Scatter plot of the ranked intelligence scores versus age group w...

Chapter 9

Figure 9.1 A scatter plot for the means of the ARL test statistics against t...

Figure 9.2 Side‐by‐side boxplots showing test sizes for each of the tests br...

Figure 9.3 A subset of indicative power curves generated for the indicated i...

Figure 9.4 Rank sums of the aligned and then ranked dynamite data plotted ag...

Figure 9.5 Rank sums of the aligned and then ranked peanuts data plotted aga...

Figure 9.6 Rank sums of the aligned and then ranked traffic data plotted aga...

Chapter 10

Figure 10.1 Scatter plot of the response variable versus pesticide type for ...

Figure 10.2 Exact (circles) pesticide means shown using with modelled (trian...

Figure 10.3 Lemonade rank sums versus lemonades.

Figure 10.4 Exact means (circles) and models with just the (1, 2)th generali...

Figure 10.5 Exact variances (circles) and modelled variance using the (1, 2)...

Figure 10.6 Ranks versus breakfast cereal type.


An Introduction to Cochran–Mantel–Haenszel Testing and Nonparametric ANOVA

 

J.C.W. Rayner and G. C. Livingston Jr.

University of Newcastle, Australia

 

 

This edition first published 2023

© 2023 John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of J.C.W. Rayner and G. C. Livingston Jr. to be identified as the authors of this work has been asserted in accordance with law.

Registered Office

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Office

The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication Data is Applied for:

Hardback ISBN: 9781119831983

Cover Design: Wiley

Cover Images: © metamorworks/Shutterstock, KatieDobies/Getty Images

John: For John Best, my friend and colleague for over half a century

Glen: For my boys, Huxley and Duke

Preface

My, that is, John Rayner's first acquaintance with the Cochran‐Mantel‐Haenszel (CMH) tests was through my long‐term colleague and friend, John Best. We were undergraduate students together. Afterwards he made a career in the CSIRO, a government science organization in Australia, while I worked in academia. When we started researching together, we found that his focus was primarily on the applications and computational matters, whereas mine gravitated to the theory and writing. There was considerable overlap, so we could have meaningful dialogue, but there was sufficient difference such that we each brought something different to the collaboration. I like to think that together we achieved more than what would have been possible separately.

Our early focus was on goodness of fit testing, but of course we delved into other matters. For example, from time to time we tried and failed to find satisfactory nonparametric tests for the Latin squares design.

When I did come to grips with the CMH methodology, I was largely frustrated. Inference is conditional, and that is a dated paradigm. Software is not commonly available, so application is inconvenient. In addition data entry is often inconvenient. The design, that I call the CMH design, is very restricted. If I want a nonparametric test for categorical response data, I want it for a wide range of designs – such as the Latin square design. In the text of this manuscript we quite deliberately try to get as much as possible from CMH methods for the Latin square and balanced incomplete block designs. The outcome is far from satisfactory.

The nonparametric ANOVA methodology came about as an offshoot of partitioning the Pearson statistic. Once developed, it is natural to compare it with CMH, but there are issues here. The CMH tests are of two types: nominal and ordinal. To my mind the Pearson tests should be compared with the nominal CMH tests, and a natural competitor for the ordinal CMH tests is the nonparametric ANOVA tests. But the latter are able to assess differences in treatment effects beyond location, and if treatments are ordinal as well as the response, beyond the simple correlation. It was therefore necessary to develop the ordinal CMH tests so that they could assess higher moment effects, to enable a meaningful comparison.

The text following attempts to clearly and fully develop the CMH methodology with some recent enhancements. It is interesting that the Kruskal–Wallis and Friedman tests are particular cases of CMH mean scores tests, although they were, of course, developed from different perspectives. The completely randomised design and the randomised block design both get individual chapters in the text firstly because of their fundamental importance. Moreover the ordinal CMH tests in these cases have simple forms, and therefore are accessible to users of statistics who will benefit from applying them for these designs, and who will benefit from knowing these methods are available in more complex designs.

Towards the end of the text, we outline the nonparametric ANOVA methodology, so that the reader can make comparisons. In the first two papers developing these methods, the context was simple factorial models subsequently extended to higher order models. For this text we develop the methods for the CMH design, and then extend them. The presentation is, we believe, clearer than in the original papers.

We are aware there are issues with some of the tests derived using the rank transform method. Nevertheless these tests, and others, arise from the nonparametric ANOVA. It is worth noting here that nonparametric ANOVA starts with a table of counts of categorical response data. The Pearson test for association can be applied to such tables and decomposed into components. This is an arithmetic decomposition and requires no model. The components are indexed by what might be called ‘degree’, and it is natural to enquire what degree means and what the components tell us about the data. One way of investigating this is to apply the components of a particular degree to the ANOVA to which the data was to be applied. It is then of interest to investigate the properties and performance of these tests. Here we apply both the nominal CMH and nonparametric ANOVA tests to data for which both are applicable. In the cases where we have done so, we find similar outcomes.

Another important aspect of nonparametric ANOVA is that if we assume weak multinomial models, then inferences based on components of different degrees are uncorrelated. Thus, for example, first degree inference does not influence second degree inference, and so on. In other words, a significant mean effect cannot cause a significant second degree effect.

The climax of this manuscript is the examples in the concluding chapter where we apply all the CMH and nonparametric ANOVA tests. Our conclusion from doing this is that although the tools are different, the conclusions are largely the same. A significant advantage of the nonparametric ANOVA is that it is more generally applicable than the ordinal CMH. For this reason we would recommend the use of the Pearson tests instead of the nominal CMH tests, and the unordered and ordered nonparametric ANOVA tests instead of the ordinal CMH tests. However, the CMH tests are familiar to many users and appropriate in the areas of application of greatest relevance to them; their use is embedded. For these users we hope that the extensions and new results here enrich their future data analysis.

Another issue with the application of the CMH methodology is the availability of suitable software. For example, there does not appear to be an R function that is able to apply all four tests to a data set that allows for it. Other popular software such as SAS and SPSS do not easily allow for the application of all four tests. The manuscript and the R package written in conjunction with it will hopefully provide a suite of examples and data sets to the reader. These will hopefully make the methodologies outlined familiar to data analysts, so that they can be applied to their own data sets of interest. The R package is available at: https://cran.r-project.org/web/packages/CMHNPA/index.html.

J.C.W. Rayner & G. C. Livingston Jr

1 Introduction

1.1 What Are the CMH and NP ANOVA Tests?

The Cochran–Mantel–Haenszel (CMH) tests are a suite of four nonparametric (NP) tests used to test the null hypothesis of no association against various alternatives. They are applicable when several treatments are applied on several independent strata, or blocks. Every treatment is applied on every stratum. The responses are categorical, and so are recorded as counts. Two of the tests require scores, and so are called ordinal tests; the two that don't are called nominal tests. Ranks, or if observations are not distinct then mid‐ranks, are often used as category scores. However, there are many other options, such as the class midpoints if the data are real‐valued and continuous. Often 'natural' scores are convenient: just score the categories 1, 2, ….

From time to time in the material following we will refer to the CMH design, meaning that categorical responses are recorded for a number of treatments applied to a number of independent strata or blocks. The simplest CMH designs are arguably the completely randomised design (CRD) and the randomised block design (RBD). The methods do not directly apply to the balanced incomplete block design (BIBD) and the Latin square design (LSD). Both of these have, in a sense, missing observations.

One reason why the CMH tests are important is they provide a third option for the two simplest designs: the CRD and the RBD. For the CRD if the data are consistent with certain assumptions the parametric one‐way ANOVA test is available. If these assumptions aren't satisfied then the Kruskal–Wallis rank test is often applicable. If the responses are categorical then the CMH tests may be used. Similarly for the RBD the options of most interest are the parametric two‐way ANOVA test, the Friedman rank test, and the CMH tests.
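As a rough illustration of the first two options for the CRD, the following R sketch runs the one‐way ANOVA F test and the Kruskal–Wallis test on the same data. It uses `PlantGrowth`, a data set that ships with base R and is not one of the examples in this text; the object names are ours.

```r
# PlantGrowth (base R 'datasets' package): dried plant weight for a control
# and two treatment groups, 10 plants each. Used here only as a stand-in CRD.
data(PlantGrowth)

# Option 1: parametric one-way ANOVA F test.
fit <- aov(weight ~ group, data = PlantGrowth)
anova_table <- summary(fit)[[1]]
anova_p <- anova_table[["Pr(>F)"]][1]

# Check the normality assumption on the residuals before trusting the F test.
sw <- shapiro.test(residuals(fit))

# Option 2: Kruskal-Wallis rank test, which uses mid-ranks when ties occur.
kw <- kruskal.test(weight ~ group, data = PlantGrowth)

cat("ANOVA F test p-value:   ", format.pval(anova_p, digits = 3), "\n")
cat("Shapiro-Wilk p-value:   ", format.pval(sw$p.value, digits = 3), "\n")
cat("Kruskal-Wallis p-value: ", format.pval(kw$p.value, digits = 3), "\n")
```

If the responses were recorded only as categories, neither call would be the natural choice, and that is the situation in which the CMH tests offer the third option.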

Where possible we seek to give analyses that will prove to be as accessible as possible. The nominal tests are simply related to what are often called chi‐squared tests, but we prefer to call them Pearson tests. These are well‐understood and often available in 'click‐and‐point packages' such as JMP, and are on call in R (Note 1). The Pearson tests are natural alternatives to the nominal CMH tests. We give a simple expression for the CMH correlation test. This expression involves familiar sums of squares and gives useful additional information as a corollary. For the CRD and the RBD we give simple expressions for the other CMH ordinal test, the CMH mean scores test.

This simplicity means that some analyses can be done by hand, meaning pencil and paper. Some are better suited to 'click‐and‐point' packages such as JMP. Otherwise analyses are best done by computer packages.

In summary the CMH tests are nonparametric tests for categorical response data. The applicable designs include, but are not limited to, the CRD and the RBD. These designs are of fundamental importance in many areas of application.

The nonparametric (NP) ANOVA tests are competitor tests for the ordinal CMH tests. They apply to data sets for which a fixed effects ANOVA is an appropriate analysis. These tests involve transforming the responses, and possibly, if treatments are ordered, the ordered treatment scores as well, both using orthonormal polynomials. The ANOVA is then applied to the transformed data of given degree. The resulting analysis permits testing univariate treatment effects and bivariate treatment effects. Under weak assumptions tests of different degrees do not affect each other. The NP ANOVA methodology is available more generally than the CMH methodology.
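The transformation step can be sketched in R as follows. This is a minimal illustration, assuming the polynomials are taken to be orthonormal with respect to the observed proportions in the response categories; the function name `orthonormal_polys` is hypothetical and is not part of any package. The counts are the response totals of Table 1.2.

```r
# Gram-Schmidt construction of polynomials g_0, g_1, ..., g_d that are
# orthonormal with respect to a discrete distribution on the category scores.
orthonormal_polys <- function(scores, probs, max_degree = 2) {
  stopifnot(length(scores) == length(probs), abs(sum(probs) - 1) < 1e-8)
  powers <- outer(scores, 0:max_degree, `^`)      # columns: 1, x, x^2, ...
  g <- matrix(0, nrow = length(scores), ncol = max_degree + 1)
  for (d in seq_len(max_degree + 1)) {
    v <- powers[, d]
    for (k in seq_len(d - 1)) {                   # remove lower-degree parts
      v <- v - sum(probs * v * g[, k]) * g[, k]
    }
    g[, d] <- v / sqrt(sum(probs * v^2))          # scale to unit norm
  }
  colnames(g) <- paste0("degree", 0:max_degree)
  g
}

scores <- 1:3                      # natural scores for three ordered categories
counts <- c(72, 19, 42)            # agree, neutral, disagree totals in Table 1.2
probs  <- counts / sum(counts)

g <- orthonormal_polys(scores, probs, max_degree = 2)

# Orthonormality check: sum_h p_h g_r(h) g_s(h) should be 1 if r == s, else 0.
gram <- t(g) %*% (probs * g)
print(round(gram, 10))

# A response in category h is replaced by g[h, "degree1"] for a first degree
# (location) analysis, or by g[h, "degree2"] for a second degree analysis.
```

The first degree polynomial is simply the standardised score, so a first degree analysis is an analysis of means; the second degree polynomial is what allows effects beyond location to be examined.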

1.2 Outline

A fundamental aim of this book is to introduce users of statistics to new methods in CMH and related testing. Before discussing the new CMH methods, it is necessary to give the old, or basic methods. This will be done in Chapter 2.

This is followed by a discussion of the CMH tests in the CRD and RBD. This will introduce the Kruskal–Wallis and Friedman tests that can be shown to be CMH tests. Although CMH methods are not directly applicable to the BIBD and the LSD, consideration will subsequently also be given to these designs.

Data for the traditional CMH tests are given as tables of counts. Inference assumes that all marginal totals in such a table are known before the data are collected. This is conditional inference, inference conditional on these quantities being known. There are dual methods for which the marginals are not known. This is unconditional inference. Confusion can arise from a lack of clarity as to whether a particular test is conditional or not.

Next we turn to what we call nonparametric ANOVA. The data may be unranked or ranked, categorical or not. The primary objective is to analyse the data using an ANOVA model available via the linear model platform inherent in, for example, JMP and R. If only the responses are ordered the method enables higher moment effects to be scrutinised. If both treatments and responses are ordered, then as well as the usual order (1, 1) correlation, those of degree (1, 2) and (2, 1) may be assessed. These reflect umbrella effects. Higher degree generalised correlations may also be scrutinised.

Given that nonparametric ANOVA can assess higher degree effects, it is natural to generalise the ordered CMH tests so that they can do so too. It is then possible to give a comparison of analyses by both methods.

Our discussion will involve some well‐known rank tests: the Kruskal–Wallis, Friedman, and Durbin tests. It is therefore sensible to make some general comments about ranking. Of particular interest is the treatment of ties. For general ranking methods see, for example, https://en.wikipedia.org/wiki/Ranking. For a treatment of ties in the sign test, see Rayner and Best (1999).

We now say what we mean by ranks. Given a set of data, $x_1, \ldots, x_n$, there exists a transformation $x_i \to x_{(r_i)}$ such that $x_{(1)}, \ldots, x_{(n)}$ are ordered. To say that $x_i$ has rank $r_i$, namely, $x_i = x_{(r_i)}$, means $x_{(r_i - 1)} \le x_{(r_i)}$ and $x_{(r_i)} \le x_{(r_i + 1)}$ (or the reverse set of inequalities). That the ranks are distinct means these inequalities are strict: $x_{(1)} < x_{(2)} < \cdots < x_{(n)}$. Ties occur when the ranks are not distinct. Of the many ways of dealing with ties, mid‐ranks is perhaps the most used. Mid‐ranks assigns to a group of tied data the mean of the ranks they would otherwise have been assigned. Thus if $x_{(j-1)} < x_{(j)} = \cdots = x_{(j+k-1)} < x_{(j+k)}$ then the mid‐rank for these data is $j + (k - 1)/2$.

Essentially ranking takes a set of $n$ observations, categorises them into $c$ categories, and assigns to these categories distinct scores, $b_1, \ldots, b_c$, say, with $b_1 < \cdots < b_c$. Clearly for untied data $c = n$ and $b_h = h$ for $h = 1, \ldots, n$. Suppose that if the $j$th observation of treatment $i$ falls in category $h$ then the indicator variable $x_{ijh} = 1$, and zero otherwise. Then $\sum_h b_h x_{ijh} = r_{ij}$, the rank for this observation.
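The mid‐rank convention, and the view of ranking as categorising and then scoring, can be checked with a few lines of base R. The data vector is hypothetical and the object names are ours.

```r
x <- c(12, 15, 15, 15, 20)   # hypothetical data with a three-way tie

# Mid-ranks: tied values share the mean of the ranks they would have occupied.
# The three 15s would have taken ranks 2, 3 and 4, so each gets (2 + 3 + 4) / 3.
midranks <- rank(x, ties.method = "average")
print(midranks)   # 1 3 3 3 5

# Ranking viewed as categorise-then-score: the distinct values are the
# categories and each category is scored by its mid-rank.
categories <- sort(unique(x))
scores <- as.numeric(tapply(midranks, factor(x, levels = categories), mean))

# Indicator matrix: entry (j, h) is 1 if observation j falls in category h.
indicator <- outer(x, categories, "==") * 1

# Summing score times indicator over categories recovers each observation's rank.
recovered <- as.numeric(indicator %*% scores)
print(all(recovered == midranks))   # TRUE
```

Other tie‐handling rules are available through the same `ties.method` argument of `rank()`, but `"average"` is the one that corresponds to mid‐ranks.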

1.3 R

We have written an R package called CMHNPA, which will serve as an accompaniment to this text. The package contains all the data sets which are analysed as well as functions written for the statistical methods and techniques discussed. Within each of the chapters there is R code where example data sets are used. If the output from the functions is excessive, it will sometimes be suppressed; however, the code will be presented for the reader to execute the functions themselves.

In the following set of example code, the package is loaded along with the car package. Code is also shown to attach a data set called dataset to the workspace. Attaching the data files to the workspace allows the variables within the data frame to be accessed directly when using the functions, as opposed to using dataframe$variable syntax.
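A set‐up of this kind can be sketched as follows. This is our own illustration rather than the book's listing: the packages are loaded only if they are installed, and a small hypothetical data frame stands in for `dataset`, the placeholder name used in the text.

```r
# Load the accompanying package and car, if they are installed.
for (pkg in c("CMHNPA", "car")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    library(pkg, character.only = TRUE)
  } else {
    message("Package '", pkg, "' is not installed; try install.packages(\"", pkg, "\")")
  }
}

# Hypothetical stand-in for a data set supplied by the package.
dataset <- data.frame(
  treatment = factor(c("A", "A", "B", "B")),
  response  = c(3.1, 2.7, 4.0, 4.4)
)

# After attach(), the columns can be referred to directly by name.
attach(dataset)
direct_mean <- mean(response)

# The equivalent dataframe$variable syntax, which works without attach().
dollar_mean <- mean(dataset$response)

# Detach when finished so the columns do not mask other objects later on.
detach(dataset)

print(c(direct = direct_mean, dollar = dollar_mean))
```

Attaching is convenient for short interactive sessions; in longer scripts the `data =` argument of modelling functions is a common alternative because it avoids name clashes.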

All of the code that follows in this text has the type of code shown above omitted. Therefore, if the reader wishes to recreate the output in later chapters, the packages will need to be loaded, and the data set attached to the workspace.

The package is currently available from: https://cran.r-project.org/web/packages/CMHNPA/index.html. It will undergo ongoing development, so the output of functions may change and additional options may be added over time.

This brief introduction will be ended with two examples. The first involves a non‐standard design to which the CMH methods do not apply. The purpose is to demonstrate another nonparametric approach: the rank transform (RT) method discussed by Conover and Iman (1981). If, in performing a parametric test, the assumptions are found to be dubious, then the idea is to replace the original data by their ranks – usually mid‐ranks if ties occur – and perform the intended analysis on these. It will often be the case that the assumptions will be closer to what is required. On the other hand, the hypotheses will now be about the ranks rather than the original data. Also the reader should be aware there are caveats concerning the rank transform method. See, for example, https://en.wikipedia.org/wiki/ANOVA_on_ranks.

The second example is in archetypical CMH format. We return to the strawberry data in Chapters 8 and 10 and to the homosexual marriage data in Chapters 2, 3, 6, 7, and 11.

1.4 Examples

1.4.1 Strawberry Data

The data in Table 1.1 are from Pearce (1960). Pesticides are applied to strawberry plants to inhibit the growth of weeds. The response represents the total spread in inches of twelve plants per plot approximately two months after the application of the weedkillers. The question is, do they also inhibit the growth of the strawberries? Pesticide O is a control. The design is a supplemented balanced design.

The means for pesticides A–O are 142.4, 144.4, 103.6, 123.8, and 174.625 respectively. Pearce (1960) found a strong treatment effect: the pesticides do appear to inhibit the growth of the strawberries.

Using R, a pesticides p‐value of less than 0.0001 is found. Blocks have a p‐value of 0.0304, indicating blocking is important. The Shapiro–Wilk test applied to the residuals gave a p‐value of 0.1313. There appears to be no problem with the parametric model.

Table 1.1 Growth of strawberry plants after applying pesticides.

| Block I | Block II | Block III | Block IV |
|---|---|---|---|
| C, 107 (5) | A, 136 (14) | B, 118 (8) | O, 173 (23) |
| A, 166 (21.5) | O, 146 (16) | A, 117 (7) | C, 95 (1) |
| D, 133 (13) | C, 104 (4) | O, 176 (24) | C, 109 (6) |
| B, 166 (21.5) | B, 152 (18) | D, 132 (11.5) | A, 130 (10) |
| O, 177 (25) | D, 119 (9) | B, 139 (15) | D, 103 (2.5) |
| A, 163 (19) | O, 164 (20) | O, 186 (27) | O, 185 (26) |
| O, 190 (28) | D, 132 (11.5) | C, 103 (2.5) | B, 147 (17) |

Nevertheless the same analysis was applied to the ranked data that are given in parentheses in Table 1.1. A pesticides p‐value of less than 0.0001 is found. Blocks have a p‐value of 0.0308, indicating blocking is important. The Shapiro–Wilk test applied to the residuals gave a p‐value of 0.0909. There appears to be no problem with this model.

The two analyses are almost identical.
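The rank transform comparison can be sketched in base R using the values in Table 1.1, read row by row so that each table row supplies one plot for each of Blocks I to IV. This is a sketch under our own assumptions: an additive block plus pesticide model, with each term tested after adjusting for the other. It is not the code used for the p‐values quoted above, and with an unbalanced design the results depend on the type of sums of squares chosen.

```r
# Table 1.1, row by row (Block I, II, III, IV within each row).
pesticide <- c("C", "A", "B", "O",   "A", "O", "A", "C",   "D", "C", "O", "C",
               "B", "B", "D", "A",   "O", "D", "B", "D",   "A", "O", "O", "O",
               "O", "D", "C", "B")
growth <- c(107, 136, 118, 173,   166, 146, 117,  95,   133, 104, 176, 109,
            166, 152, 132, 130,   177, 119, 139, 103,   163, 164, 186, 185,
            190, 132, 103, 147)
strawberry <- data.frame(
  block     = factor(rep(c("I", "II", "III", "IV"), times = 7)),
  pesticide = factor(pesticide),
  growth    = growth
)

# Treatment means, for comparison with those quoted in the text.
print(tapply(strawberry$growth, strawberry$pesticide, mean))

# Rank transform: replace the responses by their mid-ranks over all 28 plots.
strawberry$rank <- rank(strawberry$growth, ties.method = "average")

# Same additive model for the raw and the ranked responses.
fit_raw  <- aov(growth ~ block + pesticide, data = strawberry)
fit_rank <- aov(rank   ~ block + pesticide, data = strawberry)

# drop1() tests each term adjusted for the other, so the order of the terms
# in the formula does not matter.
print(drop1(fit_raw,  test = "F"))
print(drop1(fit_rank, test = "F"))

# Shapiro-Wilk tests on the residuals of each fit.
print(shapiro.test(residuals(fit_raw)))
print(shapiro.test(residuals(fit_rank)))
```

The mid‐ranks computed by `rank()` agree with the values in parentheses in Table 1.1, which is a useful check that the table has been entered correctly.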

1.4.2 Homosexual Marriage Data

Scores of 1, 2, and 3 are assigned to the responses agree, neutral, and disagree respectively to the proposition ‘Homosexuals should be able to marry’ and scores of 1, 2, and 3 are assigned to the religious categories fundamentalist, moderate, and liberal respectively. See Table 1.2. Pearson tests on the overall table and on the table obtained by aggregating the strata School and College both have p‐values of 0.000. There is strong evidence of an association between the proposition responses and religion, but perhaps there is more information in the data. Analysis from Agresti (2003) finds the three CMH tests have p‐values of 0.000, agreeing with the conclusion that there is strong evidence of an association between the proposition responses and religion. The ordinal CMH tests that we will describe in Chapter 2 find evidence of mean differences in the responses and of a (linear–linear) correlation between responses and religion, corroborating the CMH inference.
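For readers who want to try tests of this kind, the sketch below enters the counts of Table 1.2 as a three‐way array and applies two base R functions: `chisq.test()` to the table aggregated over strata, and `mantelhaen.test()`, which for tables larger than 2 × 2 × K gives a generalised CMH test that treats both classifications as nominal. It is an illustration with base R only, not the analysis reported above, and it does not use the CMHNPA package.

```r
# Counts from Table 1.2. Within each stratum the values are entered column by
# column: the Agree column for the three religions, then Neutral, then Disagree.
counts <- array(
  c( 6,  8, 11,   2, 3, 5,   10, 9, 6,     # School
     4, 21, 22,   2, 3, 4,   11, 5, 1),    # College
  dim = c(3, 3, 2),
  dimnames = list(
    Religion  = c("Fundamentalist", "Moderate", "Liberal"),
    Opinion   = c("Agree", "Neutral", "Disagree"),
    Education = c("School", "College")
  )
)

# Aggregate over the Education strata to get a single 3 x 3 table.
collapsed <- apply(counts, c(1, 2), sum)
print(collapsed)

# Pearson test of association on the aggregated table.
pearson <- chisq.test(collapsed)
print(pearson)

# Generalised CMH test of conditional association given Education.
cmh <- mantelhaen.test(counts)
print(cmh)
```

Both statistics are referred to a chi‐squared distribution with (3 − 1)(3 − 1) = 4 degrees of freedom. The ordinal tests, which use the scores 1, 2, and 3, are not available through these two functions.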

Table 1.2 Opinions on homosexual marriage by religious beliefs and education levels for ages 18–25.

Response columns: Homosexuals should be able to marry

| Education | Religion | Agree | Neutral | Disagree |
|---|---|---|---|---|
| School | Fundamentalist | 6 | 2 | 10 |
| | Moderate | 8 | 3 | 9 |
| | Liberal | 11 | 5 | 6 |
| College | Fundamentalist | 4 | 2 | 11 |
| | Moderate | 21 | 3 | 5 |
| | Liberal | 22 | 4 | 1 |

The parametric analysis finds the mean responses for fundamentalist, moderate, and liberal are 2.314, 1.694, and 1.469, respectively. Using R, a religion p‐value of 0.000 is found. Education (blocks) has a p‐value of 0.011, indicating blocking is important. The Shapiro–Wilk test applied to the residuals gave a p‐value of 0.000. The parametric model is problematic.

Applying the same analysis to the mid‐ranks of the response categories finds the mean responses for fundamentalist, moderate, and liberal are 87.30, 63.79, and 55.71, respectively. Using R, a religion p‐value of 0.000 is found. Education (blocks) has a p‐value of 0.010, indicating blocking is important. The Shapiro–Wilk test applied to the residuals gave a p‐value of 0.000. The parametric model is again problematic.
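The means quoted in the last two paragraphs can be recovered by expanding the counts of Table 1.2 into one record per respondent. The sketch below does this in base R and then fits an additive model with education as blocks to the scores and to the mid‐ranks. The model form and the use of `drop1()` are our assumptions, so the p‐values need not match those quoted exactly.

```r
# Counts from Table 1.2 in long form: one row per cell. expand.grid() varies
# the first factor fastest, which fixes the order of the counts below.
cells <- expand.grid(
  religion  = c("Fundamentalist", "Moderate", "Liberal"),
  opinion   = 1:3,                       # 1 = agree, 2 = neutral, 3 = disagree
  education = c("School", "College")
)
cells$count <- c(6, 8, 11,   2, 3, 5,   10, 9, 6,
                 4, 21, 22,  2, 3, 4,   11, 5, 1)

# One row per respondent, then mid-ranks of the responses over all respondents.
people <- cells[rep(seq_len(nrow(cells)), times = cells$count),
                c("religion", "opinion", "education")]
people$midrank <- rank(people$opinion, ties.method = "average")

# Mean score and mean mid-rank by religion.
mean_score   <- tapply(people$opinion, people$religion, mean)
mean_midrank <- tapply(people$midrank, people$religion, mean)
print(round(mean_score, 3))
print(round(mean_midrank, 2))

# Additive model with education as blocks, for scores and for mid-ranks.
fit_scores <- aov(opinion ~ education + religion, data = people)
fit_ranks  <- aov(midrank ~ education + religion, data = people)
print(drop1(fit_scores, test = "F"))
print(drop1(fit_ranks,  test = "F"))

# Shapiro-Wilk tests on the residuals of each fit.
print(shapiro.test(residuals(fit_scores)))
print(shapiro.test(residuals(fit_ranks)))
```

With only three response categories the residuals take few distinct values, which is one reason a normality test can be expected to reject for data of this kind.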

In both examples the two analyses are almost identical.

Chapter 2 describes the basic CMH tests in detail.

Bibliography

Agresti, A. (2003). Categorical Data Analysis. Hoboken, NJ: John Wiley & Sons.

Conover, W. J. and Iman, R. L. (1981). Rank transformations as a bridge between parametric and nonparametric statistics. The American Statistician, 35(3):124–129.

Pearce, S. C. (1960). Supplemented balance. Biometrika, 47(3/4):263–271.

Rayner, J. C. W. and Best, D. J. (1999). Modelling ties in the sign test. Biometrics, 55(2):663–665.

Note

1 Source: The R Foundation.

2 The Basic CMH Tests

2.1 Genesis: Cochran (1954), and Mantel and Haenszel (1959)

The origins of the Cochran–Mantel–Haenszel (CMH) methodology go back to Cochran (1954), a wide‐ranging paper about the use of the Pearson tests of goodness of fit and of association, and Mantel and Haenszel (1959), which focused on methodological issues. In the former a test for average partial association was proposed, while the latter developed the test using a hypergeometric model. A quarter‐century of development was consolidated by Landis and his colleagues in two important papers: the review of Landis et al. (1978) and the description of the computer program PARCAT in Landis et al. (1979). PARCAT implemented the CMH methodology and the program was made available to users ‘for a modest cost’. Landis et al. (1978) is our starting point, the basis of the methodology described in subsequent sections of this chapter. But first we review the genesis of that methodology.

First, we need to discuss notation. We use $X^2$ for the test statistic and $\chi^2$ for the distribution. We use subscripts on $X^2$ to distinguish between variants of the test statistics, such as $X_P^2$ for the Pearson statistic that involves no parameter estimation and $X_{PF}^2$ for the Pearson–Fisher statistic that involves maximum likelihood estimation of the unknown parameters when the data are categorical. There are several other named tests. Some of the early papers, Cochran (1954) included, used $\chi^2$ where we would use $X^2$.

The title of Cochran (1954) announces that the paper is about ‘strengthening the common tests’: the Pearson tests of goodness of fit and of association. It begins by observing that these are omnibus tests, and hence cannot be expected to be as powerful in detecting alternatives of specified interest as more focused and alternative tests. Subsequently he discusses several tests based on components.

There is a section that addresses strengthening the Pearson tests, where he revisits advice on the issue of small cell expectations. Early studies had indicated that the approximation of the $\chi^2$ distribution to the null distribution of $X^2$ was poor unless the cell expectations all exceeded 5. Cochran felt that was conservative, and resulted in excessive pooling, especially in the tails. An example is given where such pooling causes a substantial loss of power. Recommendations for minimum cell expectations are given.
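To make the role of cell expectations concrete, the following R sketch computes the expected counts and the Pearson statistic directly for the aggregated religion by opinion counts of Table 1.2. It only illustrates the traditional threshold of 5 discussed here; it does not implement Cochran's own recommendations, and the object names are ours.

```r
# Aggregated Religion x Opinion counts from Table 1.2 (strata summed).
collapsed <- matrix(
  c(10, 4, 21,
    29, 6, 14,
    33, 9,  7),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    Religion = c("Fundamentalist", "Moderate", "Liberal"),
    Opinion  = c("Agree", "Neutral", "Disagree")
  )
)

# Expected counts under no association: (row total x column total) / n.
expected <- outer(rowSums(collapsed), colSums(collapsed)) / sum(collapsed)
print(round(expected, 2))

# Flag cells whose expectation does not exceed the traditional threshold of 5.
print(expected <= 5)

# Pearson statistic computed directly from observed and expected counts.
X2 <- sum((collapsed - expected)^2 / expected)
df <- (nrow(collapsed) - 1) * (ncol(collapsed) - 1)
p_value <- pchisq(X2, df = df, lower.tail = FALSE)
cat("X2 =", round(X2, 3), "on", df, "df, p-value =",
    format.pval(p_value, digits = 3), "\n")
```

Here the smallest expectation, for fundamentalists answering neutral, is exactly 35 × 19 / 133 = 5, so the table sits right at the boundary of the traditional rule.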

Sections 3, 4, and 5 concern goodness of fit for the Poisson, binomial and normal distributions and consider using components of $X^2$ and alternative tests to test for specific alternatives. Section 6 is titled Subdivision of degrees of freedom in the contingency table. In a similar theme the test statistic