Survival Models and Data Analysis E-Book

Regina C. Elandt-Johnson

Publisher: John Wiley & Sons
Category: Science and New Technologies
Series: Wiley Series in Probability and Statistics
Language: English

Description

Survival analysis deals with the distribution of life times, essentially the times from an initiating event such as birth or the start of a job to some terminal event such as death or pension. This book, originally published in 1980, surveys and analyzes methods that use survival measurements and concepts, and helps readers apply the appropriate method for a given situation. Four broad sections cover introductions to data, univariate survival function, multiple-failure data, and advanced topics.

Details

You can read the e-book in Legimi apps or in any app that supports the following format:

EPUB

Page count: 553

Publication year: 2014

Sample

Contents

Cover

Half Title page

Title page

Preface

Part 1: Survival Measurements and Concepts

Chapter 1: Survival Data

1.1 Scope of the Book

1.2 Sources of Data

1.3 Types of Variables

1.4 Exposure to Risk

1.5 Use of Probability Theory

1.6 The Collection of Survival Data

Chapter 2: Measures of Mortality and Morbidity. Ratios, Proportions, and Rates

2.1 Introduction

2.2 Ratios and Proportions

2.3 Rates of Continuous Processes

2.4 Rates for Repetitive Events

2.5 Crude Birth Rate

2.6 Mortality Measures Used in Vital Statistics

2.7 Relationships Between Crude and Age Specific Rates

2.8 Standardized Mortality Ratio (SMR): Indirect Standardization

2.9 Direct Standardization

2.10 Evaluation of Person-Years of Exposed to Risk in Long-Term Studies

2.11 Prevalence and Incidence of a Disease

2.12 Association Between Disease and Risk Factor. Relative Risk and Odds Ratio

References

Exercises

Chapter 3: Survival Distributions

3.1 Introduction

3.2 Survival Distribution Function

3.3 Hazard Function (Force of Mortality)

3.4 Conditional Probabilities of Death (Failure) and Central Rate

3.5 Truncated Distributions

3.6 Expectation and Variance of Future Lifetime

3.7 Median of Future Lifetime

3.8 Transformations of Random Variables

3.9 Location-Scale Families of Distribution

3.10 Some Survival Distributions

3.11 Some Models of Failure

3.12 Probability Integral Transformation

3.13 Compound Distributions

3.14 Miscellanea

3.15 Maximum Likelihood Estimation and Likelihood Ratio Tests

References

Exercises

Part 2: Mortality Experiences and Life Tables

Chapter 4: Life Tables: Fundamentals and Construction

4.1 Introduction

4.2 Life Table: Basic Definitions and Notation

4.3 Force of Mortality. Mathematical Relationships Among Basic Life Table Functions

4.4 Central Death Rate

4.5 Interpolation for Life Table Functions

4.6 Some Approximate Relationships Between ${}_nq_x$ and ${}_nm_x$

4.7 Some Approximations to $\mu_x$

4.8 Concepts of Stationary and Stable Populations

4.9 Construction of an Abridged Life Table from Mortality Experience of a Current Population

4.10 Some Other Approximations Used in Construction of Abridged Life Tables

4.11 Construction of a Complete Life Table from an Abridged Life Table

4.12 Selection

4.13 Select Life Tables

4.14 Some Examples

4.15 Construction of Select Tables

References

Exercises

Chapter 5: Complete Mortality Data. Estimation of Survival Function

5.1 Introduction. Cohort Mortality Data

5.2 Empirical Survival Function

5.3 Estimation of Survival Function from Grouped Mortality Data

5.4 Joint Distribution of the Numbers of Deaths

5.5 Distribution of i

5.6 Covariance of i and j (i<j)

5.7 Conditional Distribution of i

5.8 Greenwood’s Formula for the (Conditional) Variance of i

5.9 Estimation of Curve of Deaths

5.10 Estimation of Central Death Rate and Force of Mortality in [ti, ti+1)

5.11 Summary of Results

References

Exercises

Chapter 6: Incomplete Mortality Data: Follow-up Studies

6.1 Basic Concepts and Terminology

6.2 Actuarial Estimator of qi from Grouped Data

6.3 Some Maximum Likelihood Estimators of qi

6.4 Some Other Estimators of qi

6.5 Comparison of Various Estimators of qi

6.6 Estimation of Curve of Deaths

6.7 Product-Limit Method of Estimating the Survival Function from Individual Times at Death

6.8 Estimation of Survival Function Using the Cumulative Hazard Function

References

Exercises

Chapter 7: Fitting Parametric Survival Distributions

7.1 Introduction

7.2 Some Methods of Fitting Parametric Distribution Functions

7.3 Exploitation of Special Forms of Survival Function

7.4 Fitting Different Distribution Functions Over Successive Periods of Time

7.5 Fitting a ‘Piece-Wise’ Parametric Model to a Life Table: An Example

7.6 Mixture Distributions

7.7 Cumulative Hazard Function Plots—Nelson’s Method for Ungrouped Data

7.8 Construction of the Likelihood Function for Survival Data: Some Examples

7.9 Minimum Chi-Square and Minimum Modified Chi-Square

7.10 Least Squares Fitting

7.11 Fitting a Gompertz Distribution to Grouped Data: An Example

7.12 Some Tests of Goodness of Fit

References

Exercises

Chapter 8: Comparison of Mortality Experiences

8.1 Introduction

8.2 Comparison of Two Life Tables

8.3 Comparison of Mortality Experience With a Population Life Table

8.4 Some Distribution-Free Methods for Ungrouped Data

8.5 Special Problems Arising in Clinical Trials and Progressive Life Testing

8.6 Censored Kolmogorov-Smirnov (Or Tsao-Conover) Test

8.7 Truncated Data. Pearson’s Conditional X² Test

8.8 Testing for Consistent Differences in Mortality. Mantel-Haenszel and Logrank Tests

8.9 Parametric Methods

8.10 Sequential Methods

References

Exercises

Part 3: Multiple Types of Failure

Chapter 9: Theory of Competing Causes: Probabilistic Approach

9.1 Causes of Death: Basic Assumptions

9.2 Some Basic Problems

9.3 “Times Due to Die”

9.4 The Overall and ‘Crude’ Survival Functions

9.5 Case When X1,…, Xk are Independent

9.6 Equivalence and Nonidentifiability Theorems in Competing Risks

9.7 Proportional Hazard Rates

9.8 Examples

9.9 Heterogeneous Populations: Mixture of Survival Functions

References

Exercises

Chapter 10: Multiple Decrement Life Tables

10.1 Multiple Decrement Life Tables: Notation

10.2 Definitions of the MDLT Functions

10.3 Relationships Among Functions of Multiple Decrement Life Table

10.4 Crude Forces of Mortality

10.5 Construction of Multiple Decrement Life Tables from Population (Cross-Sectional) Mortality Data

10.6 Some Major Causes of Death: An Example of Constructing the MDLT

References

Exercises

Chapter 11: Single Decrement Life Tables Associated with Multiple Decrement Life Tables: Their Interpretation and Meaning

11.1 Elimination, Prevention, and Control of a Disease

11.2 Mortality Pattern from Cause Cα Alone: ‘Private’ Probabilities of Death

11.3 Estimation of Waiting Time Distribution for Cause Cα: Single Decrement Life Table

References

Exercises

Chapter 12: Estimation and Testing Hypotheses in Competing Risk Analysis

12.1 Introduction. Experimental Data

12.2 Grouped Data. Nonparametric Estimation

12.3 Grouped Data. Fitting Parametric Models

12.4 Cohort Mortality Data with Recorded Times at Death or Censoring. Nonparametric Estimation

12.5 Cohort Mortality Data with Recorded Times at Death or Censoring. Parametric Estimation

References

Exercises

Part 4: Some More Advanced Topics

Chapter 13: Concomitant Variables in Lifetime Distributions Models

13.1 Concomitant Variables

13.2 The Role of Concomitant Variables in Planning Clinical Trials

13.3 General Parametric Model of Hazard Function with Observed Concomitant Variables

13.4 Additive Models of Hazard Rate Function

13.5 Multiplicative Models

13.6 Estimation in Multiplicative Models

13.7 Assessment of the Adequacy of a Model: Tests of Goodness of Fit

13.8 Selection of Concomitant Variables

13.9 Treatment-Covariate Interaction

13.10 Logistic Linear Models

13.11 Time Dependent Concomitant Variables

13.12 Concomitant Variables Regarded as Random Variables

13.13 Posterior Distribution of Concomitant Variables

13.14 Concomitant Variables in Competing Risk Models

References

Exercises

Chapter 14: Age of Onset Distributions

14.1 Introduction

14.2 Models of Onset Distributions

14.3 Estimation of Incidence Onset Distribution from Cross-Sectional Incidence Data

14.4 Estimation of Incidence Onset Distribution from Prevalence Data

14.5 Estimation of Waiting Time Onset Distribution from Population Data

14.6 Estimation of Waiting Time Onset Distribution from Retrospective Data

References

Exercises

Chapter 15: Models of Aging and Chronic Diseases

15.1 Introduction

15.2 Aging and Chronic Diseases

15.3 Some Models of Carcinogenesis

15.4 Some “Mosaic” Models of a Chronic Disease

15.5 “Fatal Shock” Models of Failure

15.6 Irreversible Markov Processes in Illness-Death Modeling

15.7 Reversible Models: The Fix-Neyman Model

References

Exercises

Author Index

Subject Index

Survival Models and Data Analysis

Published simultaneously in Canada.

Wiley Classics Library Edition Published 1999.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750–4744. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, (212) 850-6011, fax (212) 850-6008, E-Mail: [email protected].

Library of Congress Cataloging in Publication Data:

Elandt-Johnson, Regina C. 1918– Survival models and data analysis.

(Wiley series in probability and mathematical statistics, applied section) Includes index. 1. Failure time data analysis. 2. Mortality. 3. Medical statistics. 4. Competing risks. I. Johnson, Norman Lloyd, joint author. II. Title.

QA276.E39 312’.01’51 79-22836 ISBN 0-471-03174-7 ISBN 0-471-34992-5 (Wiley Classics Paperback Edition)

Preface

This book contains material and techniques developed in several different disciplines: vital statistics, epidemiology, demography, actuarial science, reliability theory, statistical methods, among others. Despite this diversity of origin, these techniques are all relevant to aspects of the analysis of survival data.

Survival data take many different forms, ranging from the results of small-scale laboratory tests to massive records from long-term clinical trials, so it is impossible to lay down universal rules of procedure. Attempts to do so, even if apparently successful, are likely to lead to an uncritical, authoritarian approach, following whatever is currently regarded as the “correct approach.” We have tried to set out general principles to be used in each particular case. A number of the exercises require considerable independent thought and cannot be said to have a unique “correct” answer. They call rather for sound appraisal of a situation.

We have also endeavored to avoid the use of hidden assumptions. Much of our analysis is, indeed, based on assumptions (of independence, stability, etc.), but we have always sought to make it clear what assumptions are being made, and we encourage the reader to consider what might be the effects of departures from these assumptions.

The remarkable increase of activity in the statistical analysis of survival data over the last two decades, largely stimulated by problems arising in the analysis of clinical trials, has resulted in a considerable volume of writing on the topic. A major purpose of this book is to act as a guide for using this literature, assist in the choice of appropriate methods, and warn against uncritical use.

The content of the book might be subclassified according to several different criteria—statistical approach, relevant scientific disciplines, types of applications, and so on. We have, in fact, divided the book into four broad parts.

Part 1 introduces the type of data to be analyzed and basic concepts used in their analysis.

Part 2 deals with problems related to univariate survival functions. These include construction of life tables from population (cross-sectional) data and from experimental-type follow-up data. Considerable space is devoted to fitting parametric distributions and comparisons of two or more mortality experiences.

Part 3 is concerned with multiple-failure data. Time as well as cause of death are identified. Parametric and nonparametric theories of competing causes and estimation of different kinds of failure distributions are presented in some detail.

Part 4 presents some more advanced topics, including speculative mathematical models of biological processes of disease progression and aging. These are not intended to be definitive. Rather, they give the reader some ideas of ways in which models may be constructed.

Some readers may find the mathematical level uneven. This is because mathematical techniques are used as they are needed, and never for their own sake.

We take this opportunity to acknowledge the help we have received while working on this book. The typing was done by Joyce Hill (in major part), June Maxwell, and Mary Riddick. We are especially grateful to Anna Colosi, who did all the calculations and obtained graphical presentations using an electronic computer.

REGINA C. ELANDT-JOHNSON
NORMAN L. JOHNSON

Chapel Hill, North Carolina
October 1979

Part 1 SURVIVAL MEASUREMENTS AND CONCEPTS

CHAPTER 1 Survival Data

1.1 SCOPE OF THE BOOK

The title of this book indicates that we discuss the treatment of “mortality data.” The direct meaning of this term is data that arise from recording times of death of individuals in a specified group. There will usually be additional data from observations of characters (other than survival or death) on the individuals in the group. These may be made at or near the moment of death (e.g., cause of death, length of illness, physical characteristics near the moment of death) or at earlier times (e.g., sex, age, family history, physical characteristics at earlier epochs). Certain of these variables—most commonly age (time elapsed since birth) and/or time elapsed since other important events (e.g., commencement of illness, date of operation)—are regarded as being of primary interest. It is often desired to assess the relationship between mortality and these primary variables, allowing, as far as possible, for some of the other characteristics. The latter, in this context, are called concomitant variables. (Note that, for a given set of data, the distinction between primary and concomitant variables depends on the relationships to be studied.)

Individuals in the group may be humans, animals, fishes, insects, and so on. The group itself may be defined in various ways—by geographical location (e.g., population of a town or state, patients in a hospital or in a set of hospitals), by previous history (e.g., medical treatment, type of sickness, employment).

Occasionally we consider situations in which the replacement of “mortality” by the more general term “failure” is appropriate. In such contexts, the individuals are not necessarily (although they may be) living organisms. They may, for example, be mass-produced articles, such as electric lamps, with failure meaning inability to function in a specified role.

We are not primarily concerned with reversible changes of status, such as sickness causing temporary inability to work or repairable failure of electrical or mechanical systems. However, there are occasional references to these matters, and Chapter 14 is devoted to discussing the distribution of age of onset of a disease.

Also, we are not concerned with statistics of birth, except as defining entry into a specific group of individuals and contributing to the assessment of mortality at juvenile ages. In particular, we do not study the measurement of fertility or the general province of demography.

Primarily, we are concerned with the study of failure data, and the relation of failure to a few important variables, such as age or time elapsed since some event (other than birth or manufacture). Other variables (concomitant variables) are introduced because of a possible relationship with failure but are not studied for their own sake.

1.2 SOURCES OF DATA

From the foregoing description, it can be seen that the methods discussed are applicable to a wide variety of situations. The sources of data are correspondingly varied. We first describe sources of mortality data, later turning to the topic of failure data in general.

A major subdivision of mortality data is between data relating to populations under more or less uncontrolled conditions (such as statistics of human deaths in a state or nation) and those observed under controlled conditions of a more or less experimental nature (as in a clinical trial).

Usually, the amount of data collected in the former situation is considerably greater than in the latter, though this need not be so. On the other hand, we almost always have more detailed information on each individual exposed to risk in the latter situation. In fact, in the uncontrolled situation we rarely have an exact enumeration of all the individuals who might be observed to fail (those exposed to risk). (A more precise discussion and definition of exposed to risk can be found in Chapter 2.)

When the date of death is recorded in a specific area over a specific period of time, estimates of the number exposed to risk are usually based on census data. For convenience, we use the term census-type data generally to describe data in which the numbers exposed to risk are estimated indirectly. When records are available from which the numbers exposed to risk can be ascertained directly we, again for convenience, use the term experimental-type data. Sometimes these terms may not appear to be very relevant to the data actually under consideration. Their function is to remind ourselves what type of data we are considering.

As we have already mentioned, experimental-type data are usually considerably smaller in volume than census-type data. An exception arises in the mortality experience of insurance companies. The records of such companies contain information on all persons insured with them, from which it is possible to determine exactly the exposed to risk, among whom the deaths (resulting in claims) are also recorded. For a given year of age, the numbers exposed to risk can quite easily be in the tens of thousands or more, and in practice some approximations to the exposed to risk may be used, corresponding to various groupings of ages (according to last birthday or nearest birthday), dates of entry, withdrawal, and death.

1.3 TYPES OF VARIABLES

We have already introduced the concept of concomitant variables in Section 1.1. Here we examine our classification of variables in somewhat greater detail.

The basic variable, representing survival or failure, is essentially a variable taking just two values (a binary variable) that can be chosen arbitrarily and are usually, and conveniently, taken to be 0 and 1. It can be measured by direct counting, as in experimental-type situations, or by indirect estimation, as in some census-type situations. We are most often concerned with studying the proportion of individuals surviving specified periods as a function of a few important variables. By far the most important variable is age, although in clinical trials, duration since an event such as initiation of treatment is sometimes taken as the “variable of interest.”

Thus life tables (which will be discussed in Chapter 4) usually represent the pattern of mortality (or failure) as a function of age, for particular groups of individuals, but this is not always the case. For human or animal populations, the time since a specific event may be used as the variable of interest. Occasionally, as in select life tables (see Sections 4.12–4.14), both age and time since a specific event are variables of interest.

The remaining measured variables, beyond the basic (survival) variable and the variable(s) of interest are concomitant variables. In so far as they do affect the mortality (failure) pattern, it is desirable to allow for them. This may be done analytically, (1) by introducing some fairly simple model that (it is hoped) will represent adequately the influence of concomitant variables, or (2) by constructing separate life tables for different values of the concomitant variable(s). The latter is the safer method, but can be usefully applied only when the concomitant variable(s) can be defined in terms of very few categories. An important example, in living populations, is sex. It is quite common to have separate life tables for females and males.
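As a rough illustration of option (2), the sketch below tabulates the proportion surviving a fixed period separately for each category of a single concomitant variable. The records and the function name `surviving_proportion_by_group` are hypothetical, invented for this illustration, and are not taken from the book.

```python
from collections import defaultdict

def surviving_proportion_by_group(records):
    """Return {group: proportion surviving} for (group, survived) records.

    `survived` is the binary survival variable: 1 = survived the period,
    0 = died during it.
    """
    alive = defaultdict(int)
    total = defaultdict(int)
    for group, survived in records:
        total[group] += 1
        alive[group] += survived
    return {g: alive[g] / total[g] for g in total}

# Hypothetical data: (sex, survived the observation period?)
records = [
    ("F", 1), ("F", 1), ("F", 0), ("F", 1),
    ("M", 1), ("M", 0), ("M", 0), ("M", 1),
]
proportions = surviving_proportion_by_group(records)
for sex, p in sorted(proportions.items()):
    print(f"{sex}: {p:.2f}")
```

This kind of separate tabulation stays manageable only while the concomitant variable has very few categories, which is the limitation noted above.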

Other important concomitant variables are geographical location, social class (often measured as an index based on income, education, etc.), and physical characteristics such as blood pressure, weight, and vital capacity. This last group is especially relevant in most clinical trial data.

In mechanical and electrical systems, the variable age (duration of effective service) is again of major importance. Other variables, mainly representing conditions of use, include temperature, pressure, operator training, chemical content of contact materials, and so on.

In particular, we study (in Chapter 13) the relationship between failure and one or more variables of interest. Age is very often the variable of interest, but in clinical trials time since some specified event, other than birth, is usually the variable of interest. There is a wide variety of other concomitant variables that may need attention.

A concomitant variable of some importance is chronological year (e.g., 1965, 1975). Although not always recognized as such, its importance is acknowledged, for example, in national life tables which always relate to a specific period of time (e.g., U.S. Life Tables for 1959–1961, 1969–1971, etc.).

1.4 EXPOSURE TO RISK

In most analyses of survival data we are interested in studying the proportions of failure among groups of individuals under specified conditions. Clearly, the longer the period for which an individual is under observation, the more likely it is that failure will be observed, sooner or later. Comparability of numbers of failures requires that they be referred to some unit period of observation. To do this, we would like to know, for each individual, the period of exposure to risk, that is, the period of time during which the failure (or death) of an individual will actually be recorded and contribute to the observed failures. For census-type data this period is usually not known and has to be estimated. For experimental-type data, information from which the period of exposure to risk can be determined for each individual is usually available, although when the volume of data is large, approximate evaluation may be used.
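For experimental-type data, where entry and exit dates are known for each individual, the idea of referring failures to a unit period of observation can be sketched as follows. The records, the helper `exposure_years`, and the 365.25-day year are assumptions made only for this illustration.

```python
from datetime import date

def exposure_years(entry, exit_):
    """Years under observation between entry and exit (365.25-day years)."""
    return (exit_ - entry).days / 365.25

# Hypothetical follow-up records: (entry date, exit date, failed at exit?)
records = [
    (date(2000, 1, 1), date(2003, 1, 1), True),
    (date(2000, 7, 1), date(2005, 7, 1), False),  # left observation alive
    (date(2001, 1, 1), date(2002, 1, 1), True),
]

# Total period of exposure to risk, summed over individuals.
total_exposure = sum(exposure_years(a, b) for a, b, _ in records)
# Only failures occurring while under observation are counted.
failures = sum(1 for _, _, failed in records if failed)
rate_per_person_year = failures / total_exposure

print(f"{failures} failures in {total_exposure:.2f} person-years, "
      f"{rate_per_person_year:.3f} per person-year")
```

Dividing by total exposure rather than by the number of individuals makes groups observed for different lengths of time comparable.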

1.5 USE OF PROBABILITY THEORY

Since we are studying proportions, it is natural to represent them in terms of probabilities. We are then able to use the very well-developed techniques and concepts of the theory of probability to assist in understanding the data. It is assumed that readers of this book already have some knowledge of elementary statistics and probability theory. Chapter 3 provides a recapitulation of the necessary knowledge, together with some special definitions directed toward applications in survival analysis.

In all applications of statistical methods based on probabilities, the models and assumptions on which they are based are not usually fully satisfied. This is certainly so in applications to survival data. For this reason we wish to de-emphasize application of these methods—especially when they are complicated, for example, in many cases of maximum likelihood estimation. We do, indeed, give accounts of the more useful statistical techniques (see in particular Chapters 7 and 8), although in a somewhat condensed form, but we strongly urge the reader to regard direct comparison—for example, by use of graphs—between observed data and model(s) to be of first importance. We therefore give special prominence to the use of graphical methods.

Probability theory provides essential background and tools for survival analysis, but possibilities of departures from theoretical models are so great and varied that direct confrontation between data and model(s), in terms of adequate descriptive agreement, is essential.
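A minimal sketch of such a direct comparison is given below, as a table standing in for a graph. It assumes complete (uncensored) lifetimes and an exponential model chosen purely for illustration; the data and function names are hypothetical and are not the book's own example.

```python
import math

def empirical_survival(times, t):
    """Proportion of observed lifetimes strictly greater than t."""
    return sum(1 for x in times if x > t) / len(times)

def exponential_survival(rate, t):
    """Survival function exp(-rate * t) of an exponential model."""
    return math.exp(-rate * t)

# Hypothetical complete (uncensored) lifetimes, in years.
times = [0.5, 1.2, 1.9, 2.4, 3.1, 4.0, 5.5, 7.3]

# Maximum likelihood estimate of the exponential rate for complete data.
rate = len(times) / sum(times)

print(" t  observed  model  difference")
comparison = []
for t in [1, 2, 3, 4, 5, 6]:
    obs = empirical_survival(times, t)
    fit = exponential_survival(rate, t)
    comparison.append((t, obs, fit, obs - fit))
    print(f"{t:2d}  {obs:8.3f}  {fit:5.3f}  {obs - fit:+10.3f}")
```

Plotting the observed and model columns against t would give the graphical comparison the text recommends; large or systematic differences suggest that the assumed model describes the data poorly.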

1.6 THE COLLECTION OF SURVIVAL DATA

It is a feature of much data collection, and particularly of survival and failure data, that it depends heavily for accuracy on the meticulous compilation and preservation of records. This is especially important, since the relevant failures are, typically, spread out over considerable periods of time, and each failure requires a fresh entry in the records.

Experimental-type data require even more effort in recording and storing, because more or less detailed records have to be kept on individuals coming under observation during the period of study. Concomitant variables for each individual also need to be recorded, and updated when necessary. It is essential, of course, that the records state clearly, and as accurately as possible, the periods over which an individual was exposed to risk in the sense that failure would have been recorded if it had occurred at any moment during that period, but not if it occurred at any other time.

All observations should be recorded to the greatest apparent accuracy that is reasonable, practicable, and meaningful. This general principle, which is justifiable on the grounds of making use of as much of the available information as possible, is of great importance in regard to recording times of death or failure. Deaths are sometimes recorded only as occurring in rather broad time intervals (e.g., in a specified week as opposed to a specified day or hour). The broader the time intervals, the greater the chance that one time interval will contain more than one death (this is often referred to as “multiple deaths”). In such cases it is not possible to decide the order in which the deaths occur, and the apparently equal times of death are referred to as tied observations. Some statistical techniques (especially the nonparametric procedures described in Chapter 8) make use of the temporal order in which the failures occur, and they cannot be applied directly when the data include tied observations (multiple failures). Some ingenious methods of dealing with ties have been suggested, but they all involve extra trouble and are based on dubious assumptions. We feel that application of the following relatively straightforward principles is usually preferable to reliance on automatic rules for applying specific formulas:

1. For any sets of ties the actual ordering must be one of a finite number of possibilities.

2. If it is feasible, every possible ordering can be analyzed using methods appropriate when there are no ties.

3. It will often suffice to consider only a few, perhaps only two, extreme orderings. If these give concordant results they can be adopted with some confidence. Even if not, it may be possible to see that only a relatively minute proportion of possible orderings give results markedly different from those of the remaining orderings.

If there is no clear-cut preponderance of reasonably consistent results among the possible orderings, the conclusion must be faced that, in fact, the recording interval is too broad, and attempts should be made to narrow it.
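The three principles above can be sketched in code: enumerate every ordering compatible with the recorded ties, apply a no-ties method to each, and inspect the spread of results. The rank sum used here is only a stand-in statistic chosen for illustration, and the data and function names are hypothetical.

```python
from itertools import permutations, product

def possible_orderings(observations):
    """Yield every full ordering consistent with the recorded times.

    `observations` is a list of (recorded_time, group) pairs; observations
    sharing a recorded time are tied and may have occurred in any order.
    """
    by_time = {}
    for time, group in observations:
        by_time.setdefault(time, []).append(group)
    # Distinct arrangements of the group labels within each set of ties.
    blocks = [sorted(set(permutations(by_time[t]))) for t in sorted(by_time)]
    for choice in product(*blocks):
        yield [g for block in choice for g in block]

def rank_sum(ordering, group):
    """Sum of the ranks (1 = earliest failure) held by `group`."""
    return sum(rank for rank, g in enumerate(ordering, start=1) if g == group)

# Hypothetical failure times in weeks for two groups, A and B.
observations = [
    (1, "A"), (2, "A"), (2, "B"), (3, "B"), (4, "A"), (4, "B"), (5, "B"),
]

sums = [rank_sum(o, "A") for o in possible_orderings(observations)]
print("orderings:", len(sums), "rank sum of A from", min(sums), "to", max(sums))
```

With these data there are four possible orderings, and the rank sum for group A lies between 8 and 10. If the minimum and maximum lead to the same conclusion, the two extreme orderings are enough; the number of orderings grows quickly with the size of the tie sets, so full enumeration is feasible only for small ones.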

Of course, it is possible to make various assumptions about resolving recorded ties. For example, if two persons are recorded as dying in a period of six months, and their ages last birthday at the beginning of the period were 28 and 58, it is reasonable (other things being equal) to give greater weight to the possibility of the second person actually being the first of the two to die.

CHAPTER 2 Measures of Mortality and Morbidity. Ratios, Proportions, and Rates

2.1 INTRODUCTION

In Chapter 1, we discussed certain characteristics that describe an individual’s health. Health, in turn, is closely related to chances of survival beyond a certain age. We discussed the collection of such data—commonly called vital statistics—over large populations. Phenomena such as deaths and diseases were our main concern. In this chapter, we discuss the construction of some summary measures from such data, and also the interpretation of the measures.

It is important to distinguish two aspects of description of a community of living organisms: static and dynamic. Measures appropriate to the description of the static state of a population at a point of time (or over a specific, short period of time) are usually ratios and/or proportions. Living processes are, of course, dynamic, proceeding from birth to death of individuals. We need to describe the rapidity of change in living communities, and this leads to the second class of measures. For example, we need to measure how fast a population is dying off and replenishing itself by new births and immigration; how rapidly epidemics spread, and how quickly they abate; or the number of new cases of specific diseases such as cancer, heart attack, or tuberculosis per year, or more precisely, per year and per 100 or 1000 individuals in the population. Appropriate measures of such phenomena are rates.
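The distinction between a static proportion and a dynamic rate can be illustrated with a small calculation. The figures below are invented, and using the mid-year population as the number of person-years is a simplifying assumption made only for this sketch.

```python
def proportion(part, whole):
    """Static measure: fraction of the population in a given state."""
    return part / whole

def rate_per_1000_per_year(events, person_years):
    """Dynamic measure: events per 1000 individuals per year of exposure."""
    return 1000 * events / person_years

# Hypothetical community observed for one year.
population = 20_000     # mid-year population, taken as person-years at risk
existing_cases = 300    # individuals with the disease on the survey date
new_cases = 50          # cases first diagnosed during the year

prop = proportion(existing_cases, population)
rate = rate_per_1000_per_year(new_cases, population)

print(f"proportion with the disease: {prop:.3f}")
print(f"new cases per 1000 per year: {rate:.1f}")
```

The proportion is dimensionless and refers to a point in time, whereas the rate carries a time unit and describes how fast new cases arise.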

In almost every scientific discipline, there is some ambiguity and lack of precision in terminology. In the analysis of vital statistics data, misunderstanding and misuse of the terms “proportion” and “rate” are especially confusing. So much so that it is difficult to follow the text of some papers on epidemiology and survival analysis, in which these terms are used arbitrarily and often inconsistently. It is necessary, therefore, to start by clarifying this terminology before introducing any of the well-known measures. Sections 2.2, 2.3, and 2.4 are devoted to this clarification, and Sections 2.6 through 2.10 present the basic principles of calculating “exposed to risk” and their use in calculating death rates. Two important concepts—prevalence and incidence of a disease—are outlined in Section 2.11.

2.2 RATIOS AND PROPORTIONS

2.2.1 Ratio

In a very broad sense, a ratio results from dividing one quantity by another. In science, however, it is mostly used in a more specific sense, that is, when the numerator and the denominator are two distinct quantities. Two kinds of ratios are in common use.

1. A ratio is used in comparing the frequencies of two mutually exclusive classes. For example,

in a given population. Another example is

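$$\text{Fetal death ratio} = \frac{\text{Number of fetal deaths}}{\text{Number of live births}} \tag{2.1}$$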

in a given year, for a given population. This expresses the number of fetal deaths as compared to live births in the same population. The composition of a heterogeneous population consisting of different ethnic groups can be described by ratios.
