Analysis of Ordinal Categorical Data - Alan Agresti - E-Book

Analysis of Ordinal Categorical Data E-Book

Alan Agresti

Description

Statistical science's first coordinated manual of methods for analyzing ordered categorical data, now fully revised and updated, continues to present applications and case studies in fields as diverse as sociology, public health, ecology, marketing, and pharmacy. Analysis of Ordinal Categorical Data, Second Edition provides an introduction to basic descriptive and inferential methods for categorical data, giving thorough coverage of new developments and recent methods. Special emphasis is placed on interpretation and application of methods including an integrated comparison of the available strategies for analyzing ordinal data. Practitioners of statistics in government, industry (particularly pharmaceutical), and academia will want this new edition.

Page count: 799




Table of Contents

Series

Title Page

Copyright

Preface

Chapter 1: Introduction

1.1 Ordinal Categorical Scales

1.2 Advantages of Using Ordinal Methods

1.3 Ordinal Modeling Versus Ordinary Regression Analysis

1.4 Organization of This Book

Chapter 2: Ordinal Probabilities, Scores, and Odds Ratios

2.1 Probabilities and Scores for an Ordered Categorical Scale

2.2 Ordinal Odds Ratios for Contingency Tables

2.3 Confidence Intervals for Ordinal Association Measures

2.4 Conditional Association in Three-Way Tables

2.5 Category Choice for Ordinal Variables

Chapter Notes

Exercises

Chapter 3: Logistic Regression Models Using Cumulative Logits

3.1 Types of Logits for an Ordinal Response

3.2 Cumulative Logit Models

3.3 Proportional Odds Models: Properties and Interpretations

3.4 Fitting and Inference for Cumulative Logit Models

3.5 Checking Cumulative Logit Models

3.6 Cumulative Logit Models Without Proportional Odds

3.7 Connections with Nonparametric Rank Methods

Chapter Notes

Exercises

Chapter 4: Other Ordinal Logistic Regression Models

4.1 Adjacent-Categories Logit Models

4.2 Continuation-Ratio Logit Models

4.3 Stereotype Model: Multiplicative Paired-Category Logits

Chapter Notes

Exercises

Chapter 5: Other Ordinal Multinomial Response Models

5.1 Cumulative Link Models

5.2 Cumulative Probit Models

5.3 Cumulative Log-Log Links: Proportional Hazards Modeling

5.4 Modeling Location and Dispersion Effects

5.5 Ordinal ROC Curve Estimation

5.6 Mean Response Models

Chapter Notes

Exercises

Chapter 6: Modeling Ordinal Association Structure

6.1 Ordinary Loglinear Modeling

6.2 Loglinear Model of Linear-by-Linear Association

6.3 Row or Column Effects Association Models

6.4 Association Models for Multiway Tables

6.5 Multiplicative Association and Correlation Models

6.6 Modeling Global Odds Ratios and Other Associations

Chapter Notes

Exercises

Chapter 7: Non-Model-Based Analysis of Ordinal Association

7.1 Concordance and Discordance Measures of Association

7.2 Correlation Measures for Contingency Tables

7.3 Non-Model-Based Inference for Ordinal Association Measures

7.4 Comparing Singly Ordered Multinomials

7.5 Order-Restricted Inference with Inequality Constraints

7.6 Small-Sample Ordinal Tests of Independence

7.7 Other Rank-Based Statistical Methods for Ordered Categories

Appendix: Standard Errors for Ordinal Measures

Chapter Notes

Exercises

Chapter 8: Matched-Pairs Data with Ordered Categories

8.1 Comparing Marginal Distributions for Matched Pairs

8.2 Models Comparing Matched Marginal Distributions

8.3 Models for the Joint Distribution in a Square Table

8.4 Comparing Marginal Distributions for Matched Sets

8.5 Analyzing Rater Agreement on an Ordinal Scale

8.6 Modeling Ordinal Paired Preferences

Chapter Notes

Exercises

Chapter 9: Clustered Ordinal Responses: Marginal Models

9.1 Marginal Ordinal Modeling with Explanatory Variables

9.2 Marginal Ordinal Modeling: GEE Methods

9.3 Transitional Ordinal Modeling, Given the Past

Chapter Notes

Exercises

Chapter 10: Clustered Ordinal Responses: Random Effects Models

10.1 Ordinal Generalized Linear Mixed Models

10.2 Examples of Ordinal Random Intercept Models

10.3 Models with Multiple Random Effects

10.4 Multilevel (Hierarchical) Ordinal Models

10.5 Comparing Random Effects Models and Marginal Models

Chapter Notes

Exercises

Chapter 11: Bayesian Inference for Ordinal Response Data

11.1 Bayesian Approach to Statistical Inference

11.2 Estimating Multinomial Parameters

11.3 Bayesian Ordinal Regression Modeling

11.4 Bayesian Ordinal Association Modeling

11.5 Bayesian Ordinal Multivariate Regression Modeling

11.6 Bayesian Versus Frequentist Approaches to Analyzing Ordinal Data

Chapter Notes

Exercises

Appendix: Software for Analyzing Ordinal Categorical Data

SAS

R

Stata

SPSS

Other Programs

Bibliography

Example Index

Subject Index

Series

Copyright © 2010 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Agresti, Alan.

Analysis of ordinal categorical data / Alan Agresti.—2nd ed.

p. cm.

Includes bibliographical references and index.

ISBN 978-0-470-08289-8 (cloth)

1. Multivariate analysis. I. Title.

QA278.A35 2010

519.5′35—dc22

2009038760

Preface

In recent years methods for analyzing categorical data have matured considerably in their development. There has been a tremendous increase in the publication of research articles on this topic. Several books on categorical data analysis have introduced the methods to audiences of nonstatisticians as well as to statisticians, and the methods are now used frequently by researchers in areas as diverse as sociology, public health, and wildlife ecology. Yet some types of methods are still in the process of development, such as methods for clustered data, Bayesian methods, and methods for sparse data sets with large numbers of variables.

What distinguishes this book from others on categorical data analysis is its emphasis on methods for response variables having ordered categories, that is, ordinal variables. Specialized models and descriptive measures are discussed that use the information on ordering efficiently. These ordinal methods make possible simpler description of the data and permit more powerful inferences about population characteristics than do models for nominal variables that ignore the ordering information.

This is the second edition of a book published originally in 1984. At that time many statisticians were unfamiliar with the relatively new modeling methods for categorical data analysis, so the early chapters of the first edition introduced generalized linear modeling topics such as logistic regression and loglinear models. Since many books now provide this information, this second edition takes a different approach, assuming that the reader already has some familiarity with the basic methods of categorical data analysis. These methods include descriptive summaries using odds ratios, inferential methods including chi-squared tests of the hypotheses of independence and conditional independence, and logistic regression modeling, such as presented in Chapters 1 to 6 of my books An Introduction to Categorical Data Analysis (2nd ed., Wiley, 2007) and Categorical Data Analysis (2nd ed., Wiley, 2002).

On an ordinal scale, the technical level of this book is intended to fall between that of the two books just mentioned. I intend the book to be accessible to a broad audience, particularly professional statisticians and methodologists in areas such as public health, the pharmaceutical industry, the social and behavioral sciences, and business and government. Although there is some discussion of the underlying theory, the main emphasis is on presenting various ordinal methodologies. Thus, the book has more discussion of interpretation and application of the methods than of the technical details. However, I also intend the book to be useful to specialists who may want to become aware of recent research advances, to supplement the background provided. For this purpose, the Notes section at the end of each chapter provides supplementary technical comments and embellishments, with emphasis on references to related research literature.

The text contains significant changes from and additions to the first edition, so it seemed as if I were writing a new book! As mentioned, the basic introductions to logistic regression and loglinear models have been removed. New material includes chapters on marginal models and random effects models for clustered data (Chapters 9 and 10) and Bayesian methods (Chapter 11), coverage of additional models such as the stereotype model, global odds ratio models, and generalizations of cumulative logit models, coverage of order-restricted inference, and more detail throughout on established methods.

Nearly all the methods presented can be implemented using standard statistical software packages such as R and S-Plus, SAS, SPSS, and Stata. The use of software for ordinal methods is discussed in the Appendix. The web site www.stat.ufl.edu/~aa/cda/software.html gives further details about software for applying methods of categorical data analysis. The web site www.stat.ufl.edu/~aa/ordinal/ord.html displays data sets not shown fully in the text (in the form of SAS programs), several examples of the use of an R function (mph.fit) that can conduct many of the nonstandard analyses in the text, and a list of known errata in the text.

The first edition was prepared mainly while I was visiting Imperial College, London, on sabbatical leave in 1981–1982. I would like to thank all who commented on the manuscript for that edition, especially Sir David Cox and Bent Jørgensen.

For this edition, special thanks to Maria Kateri and Joseph Lang for reading a complete draft and making helpful suggestions and critical comments. Maria Kateri also very generously provided bibliographic checking and pointed out many relevant articles that I did not know about. Thanks to Euijung Ryu for computing help with a few examples, for help with improving a graphic and with my LaTeX code, and for many helpful suggestions on the text and the Bibliography. Bhramar Mukherjee very helpfully discussed Bayesian methods for ordinal data and case–control methods and provided many suggestions about Chapter 11. Also, Ivy Liu and Bernhard Klingenberg made helpful suggestions based on an early draft, Arne Bathke suggested relevant research on rank-based methods, Edgar Brunner provided several helpful comments about rank-based methods and elegant ways of constructing statistics, and Carla Rampichini suggested relevant research on ordinal multilevel models. Thanks to Stu Lipsitz for data for Example 9.2.3 and to John Williamson and Kyungmann Kim for data for Example 9.1.3. Thanks to Beka Steorts for WinBUGS help, Cyrus Mehta for the use of StatXact, Jill Rietema for arranging for the use of SPSS, and Oliver Schabenberger for arranging for the use of SAS. I would like to thank co-authors of mine on various articles for permission to use materials from those articles. Finally, thanks as always to my wife, Jacki Levine, for her unwavering support during the writing of this book.

A truly wonderful reward of my career as a university professor has been the opportunity to work on research projects with Ph.D. students in statistics and with statisticians around the world. It is to them that I would like to dedicate this book.

Alan Agresti

Gainesville, Florida and Brookline, Massachusetts

January 2010

Chapter 1

Introduction

1.1 Ordinal Categorical Scales

Until the early 1960s, statistical methods for the analysis of categorical data were at a relatively primitive stage of development. Since then, methods have been developed more fully, and the field of categorical data analysis is now quite mature. Since about 1980 there has been increasing emphasis on having data analyses distinguish between ordered and unordered scales for the categories. A variable with an ordered categorical scale is called ordinal. In this book we summarize the primary methods that can be used, and usually should be used, when response variables are ordinal.

Examples of ordinal variables and their ordered categorical scales (in parentheses) are opinion about government spending on the environment (too high, about right, too low), educational attainment (grammar school, high school, college, postgraduate), diagnostic rating based on a mammogram to detect breast cancer (definitely normal, probably normal, equivocal, probably abnormal, definitely abnormal), and quality of life in terms of the frequency of going out to have fun (never, rarely, occasionally, often). A variable with an unordered categorical scale is called nominal. Examples of nominal variables are religious affiliation (Protestant, Catholic, Jewish, Muslim, other), marital status (married, divorced, widowed, never married), favorite type of music (classical, folk, jazz, rock, other), and preferred place to shop (downtown, Internet, suburban mall). Distinct levels of such variables differ in quality, not in quantity. Therefore, the listing order of the categories of a nominal variable should not affect the statistical analysis.

Ordinal scales are pervasive in the social sciences for measuring attitudes and opinions. For example, each subject could be asked to respond to a statement such as “Same-sex marriage should be legal” using categories such as (strongly disagree, disagree, undecided, agree, strongly agree) or (oppose strongly, oppose mildly, neutral, favor mildly, favor strongly). Such a scale with a neutral middle category is often called a Likert scale. Ordinal scales also occur commonly in medical and public health disciplines: for example, for variables describing pain (none, mild, discomforting, distressing, intense, excruciating), severity of an injury in an automobile crash (uninjured, mild injury, moderate injury, severe injury, death), illness after a period of treatment (much worse, a bit worse, the same, a bit better, much better), stages of a disease (I, II, III), and degree of exposure to a harmful substance, such as measuring cigarette smoking with the categories (nonsmoker, <1 pack a day, ≥1 pack a day) or measuring alcohol consumption of college students with the scale (abstainer, non-binge drinker, occasional binge drinker, frequent binge drinker). In all fields, ordinal scales result when inherently continuous variables are measured or summarized by researchers by collapsing the possible values into a set of categories. Examples are age measured in years (0–20, 21–40, 41–60, 61–80, above 80), body mass index (BMI) measured as (<18.5, 18.5–24.9, 25–29.9, ≥30) for (underweight, normal weight, overweight, obese), and systolic blood pressure measured as (<120, 120–139, 140–159, ≥160) for (normal, prehypertension, stage 1 hypertension, stage 2 hypertension).
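As a minimal sketch of how a continuous measurement is collapsed into an ordered categorical scale, the Python snippet below applies the BMI cut points quoted above. The function names are illustrative only, and assigning values between 24.9 and 25 to "normal weight" is an assumption, since the quoted ranges leave that gap open.

```python
from bisect import bisect_right

# Cut points and labels taken from the BMI example in the text.
BMI_CUTS = [18.5, 25.0, 30.0]
BMI_LABELS = ["underweight", "normal weight", "overweight", "obese"]


def bmi_category(bmi: float) -> str:
    """Collapse a continuous BMI value into one of four ordered categories."""
    # bisect_right counts how many cut points are <= bmi, which is the
    # zero-based index of the category that contains bmi.
    return BMI_LABELS[bisect_right(BMI_CUTS, bmi)]


def category_counts(values):
    """Tabulate counts per category, keeping the categories in their natural order."""
    counts = {label: 0 for label in BMI_LABELS}
    for value in values:
        counts[bmi_category(value)] += 1
    return counts
```

Calling `category_counts` on a list of BMI values returns the counts in category order, which is the form in which ordinal data are usually tabulated.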

Often, for each observation the choice of a category is subjective, such as in a subject's report of pain or in a physician's evaluation regarding a patient's stage of a disease. (An early example of such subjectivity was U.S. President Thomas Jefferson's suggestion during his second term that newspaper articles could be classified as truths, probabilities, possibilities, or lies.) To lessen the subjectivity, it is helpful to provide guidance about what the categories represent. For example, the College Alcohol Study conducted at the Harvard School of Public Health defines “binge drinking” to mean at least five drinks for a man or four drinks for a woman within a two-hour period (corresponding to a blood alcohol concentration of about 0.08%); “occasional binge drinking” is defined as binge drinking once or twice in the past two weeks; and “frequent binge drinking” is binge drinking at least three times in the past two weeks.

For ordinal scales, unlike interval scales, there is a clear ordering of the levels, but the absolute distances among them are unknown. Pain measured with categories (none, mild, discomforting, distressing, intense, excruciating) is ordinal, because a person who chooses “mild” feels more pain than if he or she chose “none,” but no numerical measure is given of the difference between those levels. An ordinal variable is quantitative, however, in the sense that each level on its scale refers to a greater or smaller magnitude of a certain characteristic than another level. Such variables are of quite a different nature than qualitative variables, which are measured on a nominal scale and have categories that do not relate to different magnitudes of a characteristic.

1.2 Advantages of Using Ordinal Methods

Many well-known statistical methods for categorical data treat all response variables as nominal. That is, the results are invariant to permutations of the categories of those variables, so they do not utilize the ordering if there is one. Examples are the Pearson chi-squared test of independence and multinomial response modeling using baseline-category logits. Test statistics and P-values take the same values regardless of the order in which categories are listed. Some researchers routinely apply such methods to nominal and ordinal variables alike because they are both categorical.
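The invariance to permutations can be checked directly. The following Python sketch computes the Pearson chi-squared statistic for a two-way table and for the same table with its columns reordered. The counts are hypothetical and the helper names are illustrative only.

```python
def pearson_chi2(table):
    """Pearson chi-squared statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def permute_columns(table, order):
    """Return a copy of the table with its columns listed in the given order."""
    return [[row[j] for j in order] for row in table]


# Hypothetical counts: two groups (rows) by three ordered response categories.
table = [[30, 20, 10], [10, 20, 30]]
shuffled = permute_columns(table, [2, 0, 1])

chi2_original = pearson_chi2(table)
chi2_shuffled = pearson_chi2(shuffled)
```

The two statistics agree even though the first table shows a clear trend across the ordered columns and the reordered table hides it, which is exactly the information a nominal method discards.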

Recognizing the discrete nature of categorical data is useful for formulating sampling models, such as in assuming that the response variable has a multinomial distribution rather than a normal distribution. However, the distinction regarding whether data are continuous or discrete is often less crucial to substantive conclusions than whether the data are qualitative (nominal) or quantitative (ordinal or interval). Since ordinal variables are inherently quantitative, many of their descriptive measures are more like those for interval variables than those for nominal variables. The models and measures of association for ordinal data presented in this book bear many resemblances to those for continuous variables.

A major theme of this book is how to analyze ordinal data by utilizing their quantitative nature. Several examples show that the type of ordinal method used is not that crucial, in the sense that we obtain similar substantive results with ordinal logistic regression models, loglinear models, models with other types of response functions, or measures of association and nonparametric procedures. These results may be quite different, however, from those obtained using methods that treat all the variables as nominal.

Many advantages can be gained from treating an ordered categorical variable as ordinal rather than nominal. They include:

Ordinal data description can use measures that are similar to those used in ordinary regression and analysis of variance for quantitative variables, such as correlations, slopes, and means.

Ordinal analyses can use a greater variety of models, and those models are more parsimonious and have simpler interpretations than the standard models for nominal variables, such as baseline-category logit models.

Ordinal methods have greater power for detecting relevant trend or location alternatives to the null hypothesis of “no effect” of an explanatory variable on the response variable.

Interesting ordinal models apply in settings for which standard nominal models are trivial or else have too many parameters to be tested for goodness of fit.

Table 1.1 Data Set for Which Ordinal Analyses Give Very Different Results from Unordered Categorical Analyses

1.3 Ordinal Modeling Versus Ordinary Regression Analysis

There are two relatively extreme ways to analyze ordered categorical response variables. One way, still common in practice, ignores the categorical nature of the response variable and uses standard parametric methods for continuous response variables. This approach assigns numerical scores to the ordered categories and then uses ordinary least squares (OLS) methods such as linear regression and analysis of variance (ANOVA). The second way restricts analyses solely to methods that use only the ordering information about the categories. Examples of this approach are nonparametric methods based on ranks and models for cumulative response probabilities.

1.3.1 Latent Variable Models for Ordinal Data

Many other methods fall between the two extremes described above, using ordinal information but having some parametric structure as well. For example, often it is natural to assume that an unobserved continuous variable underlies the ordinal response variable. Such a variable is called a latent variable.

In a study of political ideology, for example, one survey might use the categories liberal, moderate, and conservative, whereas another might use very liberal, slightly liberal, moderate, slightly conservative, and very conservative or an even finer categorization. We could regard such scales as categorizations of an inherently continuous scale that we are unable to observe. Then, rather than assigning scores to the categories and using ordinary regression, it is often more sensible to base description and inference on parametric models for the latent variable. In fact, we present connections between this approach and a popular modeling approach that has strict ordinal treatment of the response variable: In Chapters 3 and 5 we show that a logistic model and a probit model for cumulative probabilities of an ordinal response variable can be motivated by a latent variable model for an underlying quantitative response variable that has a parametric distribution such as the normal.

1.3.2 Using OLS Regression with an Ordinal Response Variable

In this book we do present methods that use only the ordering information. It is often attractive to begin a statistical analysis by making as few assumptions as possible, and a strictly ordinal approach does this. However, in this book we also present methods that have some parametric structure or that require assigning scores to categories. We believe that strict adherence to operations that utilize only the ordering in ordinal scales limits the scope of useful methodology too severely. For example, to utilize the ordering of categories of an ordinal explanatory variable, nearly all models assign scores to the categories and regard the variable as quantitative—the alternative being to ignore the ordering and treat the variable as nominal, with indicator variables. Therefore, we do not take a rigid view about permissible methodology for ordinal variables.

That being said, we recommend against the simplistic approach of posing linear regression models for ordinal response scores and fitting them using OLS methods. Although that approach can be useful for identifying variables that clearly affect a response variable, and for simple descriptions, limitations occur. First, there is usually not a clear-cut choice for the scores. Second, a particular response outcome is likely to be consistent with a range of values for some underlying latent variable, and an ordinary regression analysis does not allow for the measurement error that results from replacing such a range by a single numerical value. Third, unlike the methods presented in this book, that approach does not yield estimated probabilities for the response categories at fixed settings of the explanatory variables. Fourth, that approach can yield predicted values above the highest category score or below the lowest. Fifth, that approach ignores the fact that the variability of the responses is naturally nonconstant for categorical data: For an ordinal response variable, there is little variability at predictor values for which observations fall mainly in the highest category (or mainly in the lowest category), but there is considerable variability at predictor values for which observations tend to be spread among the categories.

Related to the second, fourth, and fifth limitations, the ordinary regression approach does not account for “ceiling effects” and “floor effects,” which occur because of the upper and lower limits for the ordinal response variable. Such effects can cause ordinary regression modeling to give misleading results. These effects also result in substantial correlation between values of residuals and values of quantitative explanatory variables.
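A floor effect of this kind can be sketched with a small simulation in Python. Every numerical setting here is an assumption chosen for illustration: the coefficients 20, 0.6, and -40, the uniform distribution for x, and the cut points 20, 40, 60, 80 are hypothetical. Only the general setup (a normal main-effects model with a continuous and a binary predictor, error standard deviation 10, and five response categories) follows the example described in this section.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope from a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def categorize(latent, cuts=(20.0, 40.0, 60.0, 80.0)):
    """Map a latent value to a score 1..5 using hypothetical cut points."""
    return 1 + sum(latent > c for c in cuts)


def simulate_slopes(n=100, seed=1):
    """Return OLS slopes of the category score on x for z = 0 and for z = 1."""
    rng = random.Random(seed)
    data = {0: ([], []), 1: ([], [])}
    for _ in range(n):
        x = rng.uniform(0.0, 100.0)
        z = rng.randint(0, 1)
        # Hypothetical main-effects model: the x slope is the same in both groups.
        latent = 20.0 + 0.6 * x - 40.0 * z + rng.gauss(0.0, 10.0)
        data[z][0].append(x)
        data[z][1].append(categorize(latent))
    return ols_slope(*data[0]), ols_slope(*data[1])
```

Under these assumed settings, many observations in the z = 1 group pile up in the lowest category, so the fitted slope for that group comes out smaller than the slope for z = 0 even though the latent model contains no interaction.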

1.3.3 Example: Floor Effect Causes Misleading OLS Regression

... and standard deviation 10. The first scatterplot in Figure 1.1 shows the 100 observations on y* and x, each data point labeled by the category for z. The plot also shows the OLS fit that estimates this model.

Figure 1.1 Ordered categorical data (in second panel) for which ordinary regression suggests interaction, because of a floor effect, but ordinal modeling does not. The data were generated (in first panel) from a normal main-effects regression model with continuous (x) and binary (z) explanatory variables. When the continuous response y* is categorized and y is measured as (1, 2, 3, 4, 5), the observations labeled “1” for the category of z have a linear x effect with only half the slope of the observations labeled “0” for the category of z.

We then categorized the 100 generated values on y* into five categories to create observations for an ordinal variable y, as follows:

1.3.4 Ordinal Methods with Truly Quantitative Data

Even when the response variable is interval scale rather than ordered categorical, ordinal models can still be useful. One such case occurs when the response outcome is a count but when standard sampling models for counts, such as the Poisson, do not apply. For example, each year the British Social Attitudes Survey asks a sample of people their opinions on a wide range of issues. In several years the survey asked whether abortion should be legal in each of seven situations, such as when a woman is pregnant as a result of rape. The number of cases to which a person responds “yes” is a summary measure of support for legalized abortion. This response variable takes values between 0 and 7. It is inappropriate to treat it as a binomial variate because the separate situations would not have the same probability of a “yes” response or have independent responses. It is inappropriate to treat it as a Poisson or negative binomial variate, because there is an upper bound for the possible outcome, and at some settings of explanatory variables most observations could cluster at the upper limit of 7. Methods for ordinal data are valid, treating each observation as a single multinomial trial with eight ordered categories.

For historical purposes it is interesting to read the extensive literature of about 40 years ago, much of it in the social sciences, regarding whether it is permissible to assign scores to ordered categories and use ordinary regression methods. See, for example, Borgatta (1968), Labovitz (1970), and Kim (1975) for arguments in favor and Hawkes (1971), Mayer (1971), and Mayer and Robinson (1978) for arguments against.

1.4 Organization of This Book

The primary methodological emphasis in this book is on models that describe associations and interactions and provide a framework for making inferences. In Chapter 2 we introduce ordinal odds ratios that are natural parameters for describing most of these models. In Chapter 3 we introduce the book's main focus, presenting logistic regression models for the cumulative probabilities of an ordinal response. In Chapter 4 we summarize other types of models that apply a logit link function to ordinal response variables, and in Chapter 5 we present other types of link functions for such models.

The remainder of the book deals with multivariate ordinal responses. In Chapter 6 we present loglinear and other models for describing association and interaction structure among a set of ordinal response variables, and in Chapter 7 present bivariate ordinal measures of association that summarize the entire structure by a single number. The following three chapters deal with multivariate ordinal responses in which each response has the same categories, such as happens in longitudinal studies and other studies with repeated measurement. This topic begins in Chapter 8 with methods for square contingency tables having ordered rows and the same ordered columns and considers applications in which such tables arise. Chapters 9 and 10 extend this to an analysis of more general forms of correlated, clustered ordinal responses. Primary attention focuses on models for the marginal components of a multivariate response and on models with random effects for the clusters.

In Chapters 2 to 10 we take a frequentist approach to statistical inference, focusing on methods that use only the likelihood function. In the final chapter we show ways of implementing Bayesian methods with ordinal response variables, combining prior information about the parameters with the likelihood function to obtain a posterior distribution of the parameters for inference. The book concludes with an overview of software for the analysis of ordered categorical data, emphasizing R and SAS.

For other surveys of methods for ordinal data, see Hildebrand et al. (1977), Agresti (1983a, 1999), Winship and Mare (1984), Armstrong and Sloan (1989), Barnhart and Sampson (1994), Clogg and Shihadeh (1994), Ishii-Kuntz (1994), Ananth and Kleinbaum (1997), Scott et al. (1997), Johnson and Albert (1999), Bender and Benner (2000), Guisan and Harrell (2000), Agresti and Natarajan (2001), Borooah (2002), Cliff and Keats (2002), Lall et al. (2002), Liu and Agresti (2005), and O'Connell (2006).

Chapter 2

Ordinal Probabilities, Scores, and Odds Ratios

In this chapter we introduce ways of using odds ratios and other summary measures to describe the association between two ordinal categorical variables. The measures apply to sample data or to a population. We also present confidence intervals for these measures. First, though, we introduce some probabilities and scores that are a basis of ways of describing marginal and conditional distributions of ordinal response variables.

2.1 Probabilities and Scores for an Ordered Categorical Scale

For an observation randomly selected from the corresponding population, let πj denote the probability of response in category j. Some measures and some models utilize the cumulative probabilities

$$F_j = \pi_1 + \cdots + \pi_j, \qquad j = 1, \ldots, c.$$

These reflect the ordering of the categories, with

$$F_1 \le F_2 \le \cdots \le F_c = 1.$$

2.1.1 Types of Scores for Ordered Categories

How can summary measures utilize the ordinal nature of the categorical scale? One simple way uses the cumulative probabilities to identify the median response: namely, the minimum j such that Fj ≥ 0.50. With a categorical response, an unappealing aspect of this measure for making comparisons of groups is its discontinuous nature: Changing a tiny bit of probability can have the effect of moving the median from one category to the next. Also, two groups can have the same median even when an underlying latent variable has distribution shifted upward for one group relative to the other.
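The median rule and its discontinuity can be sketched in Python as follows. The function names and the probabilities in the usage note are hypothetical.

```python
from itertools import accumulate


def cumulative_probabilities(probs):
    """Return F_j = pi_1 + ... + pi_j for j = 1, ..., c."""
    return list(accumulate(probs))


def median_category(probs):
    """Return the smallest category j (counting from 1) with F_j >= 0.50."""
    for j, f in enumerate(cumulative_probabilities(probs), start=1):
        if f >= 0.50:
            return j
    raise ValueError("probabilities must sum to at least 0.50")
```

For example, the distributions (0.20, 0.29, 0.31, 0.20) and (0.20, 0.31, 0.29, 0.20) differ by only 0.02 of probability, yet the first has its median in category 3 and the second in category 2.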

Alternatively, we could assign ordered scores to the categories.

An alternative approach to selecting scores uses the data themselves to determine the scores. One such set uses the average cumulative proportions for the ordinal response variable. For sample proportions {pj}, the average cumulative proportion in category j is

$$a_j = \sum_{k=1}^{j-1} p_k + \tfrac{1}{2}\, p_j, \qquad j = 1, \ldots, c,$$

that is, the proportion of subjects below category j plus half the proportion in category j. In terms of the sample cumulative proportions $\hat{F}_j = p_1 + \cdots + p_j$,

$$a_j = \frac{\hat{F}_{j-1} + \hat{F}_j}{2},$$

with $\hat{F}_0 = 0$. Bross (1958) introduced the term ridits for the average cumulative proportion scores.

The ridits have the same ordering as the categories, $a_1 \le a_2 \le \cdots \le a_c$. Their weighted average with respect to the sample distribution satisfies

$$\sum_{j=1}^{c} p_j a_j = 0.50.$$

Whereas midrank scores fall between 1 and $n$, ridit scores fall between 0 and 1. The linear relationship between them is

$$a_j = \frac{r_j - 0.5}{n},$$

where $r_j$ denotes the midrank score for category j and $n$ the sample size.
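As a rough check on these properties, the Python sketch below computes ridit and midrank scores from category counts. The counts are hypothetical, and the midrank is taken to be the average rank of the observations in a category when all n observations are ranked from 1 to n.

```python
def ridits(counts):
    """Ridit scores: proportion below category j plus half the proportion in j."""
    n = sum(counts)
    scores, below = [], 0
    for count in counts:
        scores.append((below + count / 2) / n)
        below += count
    return scores


def midranks(counts):
    """Midrank scores: average rank (1..n) of the observations in each category."""
    scores, below = [], 0
    for count in counts:
        # Ranks below + 1, ..., below + count average to below + (count + 1) / 2.
        scores.append(below + (count + 1) / 2)
        below += count
    return scores


# Hypothetical counts for four ordered categories.
counts = [10, 20, 40, 30]
n = sum(counts)
a = ridits(counts)
r = midranks(counts)

# Weighted average of the ridits with respect to the sample proportions.
weighted_mean = sum(c / n * score for c, score in zip(counts, a))
```

With these counts the ridits are increasing in the category index, their weighted average is 0.50, and each ridit equals the corresponding midrank minus 0.5, divided by n.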
