
Applied Epidemiology and Biostatistics E-Book

Giuseppe La Torre


Publisher: SEEd Edizioni Scientifiche
Category: Specialist literature
Language: English

Description

In the era of Evidence Based Medicine, health professionals are required to fully understand design, analysis and interpretation of the results of research. Furthermore, they should be able to assess the needs of their communities and respond accordingly. To achieve these goals, clinicians need to be familiar with the basic concepts of epidemiology and biostatistics. But epidemiology is more than “the study of.” Its application and practice are essential to address public health issues. That is why this book provides not only the theory, but also the opportunity of applying it in practice. In fact, each chapter presents one or more specific examples on how to perform an epidemiological or statistical data analysis and includes download access to the software and databases, giving the reader the possibility of replicating the analyses described. The final purpose is, therefore, to introduce epidemiologic and biostatistical methods as applied to clinical research, and to develop proficiency with computer software for performing the analysis of clinical datasets.

Details

You can read this e-book in the Legimi apps or in any app that supports the following format:

EPUB

Year of publication: 2010


Applied Epidemiology and Biostatistics

Giuseppe La Torre

Piazza Carlo Emanuele II, 19 – 10123 Torino – Italy

Tel. +39.011.566.02.58 – Fax +39.011.518.68.92

www.edizioniseed.it – [email protected]

First edition

September 2010

ISBN 978-88-8968-856-4

Although the information about medication given in this book has been carefully checked, the author and publisher accept no liability for the accuracy of this information. In every individual case the user must check such information by consulting the relevant literature.

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the Italian Copyright Law in its current version, and permission for use must always be obtained from SEEd Medical Publishers srl. Violations are liable to prosecution under the Italian Copyright Law.

Table of Contents

Colophon

Preface

1. Measures of Occurrence

1.1. Introduction

1.2. Prevalence

1.3. Incidence

1.4. Practical issues

1.4.1. Denominator issues

1.4.2. Numerator issues

1.5. Practical examples

References

2. Measures of Association

2.1. Relative risk

2.2. Risk difference

2.3. Other measures of attributable risk

2.3.1. Attributable risk percent

2.3.2. Population attributable risk

2.3.3. Population attributable risk percent

2.3.4. Odds ratio

2.4. Practical examples

2.4.1. Example 1

2.4.2. Example 2

2.4.3. Example 3

2.4.4. Example 4

References

3. Controlling for Confounding

3.1. What is confounding in epidemiology?

3.2. Controlling for confounding factors

3.2.1. Study design

3.2.2. Data analysing

3.3. How to control for confounding factors

3.3.1. Stratified analysis

3.3.2. The multivariate analysis

3.4. Practical examples

3.4.1. Example 1

3.4.2. Example 2

References

4. Cross-Sectional Studies

4.1. Introduction

4.2. Performing a cross-sectional study

4.3. A practical example

References

5. Cohort Studies

5.1. What is a cohort study?

5.2. Why do we need a cohort study?

5.3. The eligibility criteria

5.4. The structure of a cohort study

5.5. Censoring

5.6. The statistical analysis in a cohort study

5.7. Practical examples

5.7.1. Example 1

5.7.2. Example 2

5.7.3. Example 3

References

6. Experimental Studies

6.1. What is a sample experimental study?

6.2. Why do we need an experimental study?

6.3. The eligibility criteria

6.4. The randomisation process

6.5. The blinding

6.6. The structure of an experimental study

6.7. The statistical analysis in an experimental study

6.8. Practical examples

6.8.1. Example 1

6.8.2. Example 2

6.8.3. Example 3

References

7. Temporal Trend Analysis

7.1. Introduction

7.2. Basic principles of temporal trend analysis

7.3. Practical examples

7.3.1. Example 1

7.3.2. Example 2

References

8. The Surveillance of Sexually Transmitted Infections: the Theory and the Practice

8.1. Introduction

8.2. Surveillance of sexually transmitted infections in the third millennium

8.3. Attributes of a STI surveillance system

8.3.1. Simplicity

8.3.2. Acceptability

8.3.3. Sensitivity

8.3.4. Representativeness

8.3.5. Timeliness

8.4. Universal versus sentinel surveillance systems

8.5. How to perform STI surveillance

8.5.1. Which infections should be included in surveillance?

8.5.2. The example of Italy’s STI surveillance

8.5.3. Case definition

8.5.4. Sensitivity and specificity of the case definition

8.5.6. Data collection forms

8.5.7. Geographic distribution and representativeness

8.5.8. Stability of the sentinel population

8.5.9. Population denominators

8.5.10. Data dissemination

8.6. Data management and analysis

8.7. Practical exercises for analysing a dataset of STIs

8.7.1. The dataset

8.7.2. Characteristics of cases by reporting centre

8.7.3. Demographic and behavioural characteristics of patients

References

9. Systematic Reviews and Meta-Analysis of Clinical Trials

9.1. What is a systematic review? What is a meta-analysis?

9.2. Why do we need systematic reviews and meta-analyses?

9.2.1. The “Streptokinase case”

9.3. Practical steps of a meta-analysis

9.3.1. Production of an explicit and reproducible research protocol

9.3.2. Definition of criteria for the inclusion or exclusion of individual studies

9.3.3. Explicit and systematic bibliographic search

9.3.4. Assessment of the methodological quality of the studies

9.3.5. Data extraction from included studies

9.3.6. Statistical combination of data and presentation of the results

9.3.7. Sensitivity analysis and interpretation of the findings

9.3.8. Publication bias

9.4. A practical example of a meta-analysis of RCTs

9.4.1. Introduction to the statistical analysis: fixed and random effects models

9.4.2. Addressing heterogeneity

9.4.3. Meta-regression

9.4.4. Publication bias

9.4.5. Other commands and options for the analysis

References

10. Meta-Analysis of Observational Studies

10.1. Introduction

10.2. Practical example

10.2.1. Study justification and objectives

10.2.2. Brief description of the search strategy and data extraction

10.3. Worked examples

10.3.1. Example 1: Meta-analysis done by hand

10.3.2. Example 2: Meta-analysis using WINPEPI

10.3.3. Example 3: Meta-analysis using Stata basic commands

References

11. Genetic Epidemiology

11.1. Key concepts of genetic epidemiology

11.1.1. Definition and goals

11.1.2. The Human Genome

11.1.3. Study design

11.2. A practical example: the “candidate gene approach”

References

12. Analysis of Cost Data Using Bootstrap Technique

12.1. Introduction

12.2. Basic principles of the bootstrap method

12.3. Bootstrap standard normal confidence interval

12.4. Percentile method confidence interval

12.5. Bias corrected and accelerated (BCa) confidence interval

12.6. Application to example

References

13. Sensitivity, Specificity, and ROC Curves

13.1. Study introduction

13.2. Sensitivity, specificity, and predictive value

13.3. Basic principles of ROC curves

13.3.1. The area under a ROC curve

13.3.2. Practical example

13.4. Use of ROC analysis for comparison

References

14. Measures of Central Tendency and Dispersion

14.1. Introduction

14.2. Measures of central tendency

14.2.1. Mean

14.2.2. Median

14.2.3. Mode

14.3. Measures of dispersion

14.3.1. Range

14.3.2. Percentiles, quartiles, interquartile range

14.3.3. Variance

14.3.4. Standard deviation

14.4. Practical exercise

References

15. Sample Size Calculations

15.1. What is a sample size and why do we need a sample?

15.2. Steps of a sample size calculation

15.2.1. Practical examples

References

16. Representation of Data

16.1. Introduction

16.2. Representation of qualitative variables

16.2.1. Tables

16.2.2. Bar chart

16.2.3. Pie chart

16.3. Representation of quantitative variables

16.3.1. Histogram

16.3.2. Scatter plot

16.3.3. Box plot

References

17. Running Multiple Regression With Quantitative and Qualitative Variables With R

17.1. Introduction

17.2. The regression model with quantitative and qualitative variables

17.2.1. Testing of significance of parameters

17.2.2. The coding of dummy variables

17.3. Practical example: multiple regression with 2 qualitative variables

17.3.1. Installing and running R

17.3.2. Loading the dataset

17.3.3. Exploratory analysis

17.3.4. Correlation analysis

17.3.5. Model development

References

18. Methods for Assessing Normality of Quantitative Variables

18.1. Introduction

18.2. Definition of normality

18.3. Parametric and nonparametric statistics

18.4. How to verify normality of data

18.5. Practical examples

References

19. Quality of Life Evaluation

19.1. Quality of life in the general population

19.1.1. Quality of life measurement

19.1.2. The study

19.2. Quality of life in the clinical setting

19.2.1. Choice of QoL assessment tools

19.2.2. Study introduction

19.2.3. Practical example to calculate SF-36 scores

19.2.4. Correlation analysis between SF-36 scores and clinical findings

References

Appendix. Algorithm to create the SF-36 scales

20. Disability Adjusted Life Years (DALY) Summary Measure of Population Health

20.1. Introduction

20.2. Disability adjusted life year (DALY): the concept and its uses

20.3. Method for DALY estimation used in the Serbian burden of disease study

20.3.1. Disease selection and staging

20.3.2. Social value choices

20.3.3. Population data

20.3.4. Mortality data

20.3.5. Disability data

20.3.6. DALY estimation

20.3.7. Years of life lost (YLL)

20.3.8. Years lost due to disability (YLD)

20.3.9. DALY Estimation

20.4. Practical example: calculation of DALY for colorectal cancer, Serbia, 2000

Acknowledgments

References

Preface

When this book was conceived, during a discussion among members of the Public Health Epidemiology section of the European Public Health Association (EUPHA), the main idea was to describe not only the theory but, above all, how to use the available software for epidemiologic and statistical data analysis.

But epidemiology is more than “the study of.” Its application and practice are essential to address public health issues.

So the purpose of the book is to give the reader both the theory behind specific aspects of technical disciplines such as epidemiology and biostatistics and, at the same time, the opportunity to replicate under guidance the analysis carried out by each chapter’s author and already published in a research article. The idea is to use the available software for epidemiologic and statistical data analysis, which each reader can download freely from the Internet.

As for how this purpose is to be achieved, it is important to underline that each chapter presents one or more specific examples of how to perform an epidemiological or statistical data analysis. Each chapter gives the reader the possibility of conducting such an analysis using a step-by-step approach. In other words, the reader will be able to do the analysis by following the detailed description of the commands to use and the figures showing the software commands and/or output.

Why do we believe the book is needed?

The answer is mainly technical. Many books on epidemiology and biostatistics are available, but none gives practical examples using a range of freely available software. This book uses software such as Epi Info, Episheet, Simcalc, StatCalc, and RevMan, which can be downloaded from the web and cover most topics in the two disciplines. In selected cases, the examples use commercial statistical software, such as Stata and SPSS.

The reader will be interested in this book because it offers solutions to epidemiological and biostatistical problems through practical examples, together with a detailed and efficient guide to using the software.

Have you ever wanted to perform an epidemiological data analysis, but thought you were not able to?

Have you ever had trouble carrying out a statistical analysis because you considered statistics a matter for statisticians only?

Applied Epidemiology and Biostatistics is the answer for you.

Questions such as the following will find an answer:

How to perform a multiple logistic regression using your own data?

How to calculate the 95% confidence intervals of that odds ratio?

How to perform a meta-analysis of papers of your interest?

How to make graphs for your report?

How to make a ROC curve or control for a possible confounder?

How to calculate the sample size needed for the clinical trial?

This is a manual designed for use with software that is freely downloadable from the web and covers most topics in the two disciplines, epidemiology and biostatistics. In selected cases, examples will use commercial statistical packages.

Who is the best reader of this book?

Considering that epidemiology can be seen as the study of the factors affecting the health and illness of a population, and that biostatistics is one of the main pillars of research, this manual is intended primarily for the following readers:

Public Health practitioners (professionals, researchers).

Clinicians (researchers).

Health Managers (professionals, researchers).

Teachers of Epidemiology.

Teachers of Biostatistics.

Finally, I would like to thank all the contributors to this manual. Without their support and suggestions, it would have been impossible to achieve this goal.

Now, do you want to start?

Let’s make Epidemiology and Biostatistics together!

Giuseppe La Torre

Instructions for Downloading

To download the software and databases described in this book, you need to:

access the website http://download.edizioniseed.it

access the “Download area” and

type the code SOB0H46Y.

1. Measures of Occurrence

Carlo Signorelli1, Edoardo Colzani2

1 University of Parma, Italy

2 University of Milano-Bicocca, Italy

Objectives of the chapter

To describe the measures of occurrence.

To give some practical examples showing how to get them from a dataset using the statistical package SPSS.

1.1. Introduction

One of the objectives of epidemiology is to describe the frequency and distribution of diseases and other health-related events and to assess the association between possible risk factors and diseases. An initial contrast must be made between measures of occurrence (the ones describing a health phenomenon) and measures of association (the ones describing the strength of a possible association between an exposure and a health outcome). In this chapter, we will spend some time on the measures of occurrence. Measures of association will be discussed in a later chapter (see chapter 2).

The measures of occurrence can be divided into the following groups [1]:

Description of number of events.

Ratios.

Proportions.

Rates.

The description of the number of events usually only satisfies an administrative need for quantifying a phenomenon, but doesn’t usually give any information about the denominator to which it is referring. Knowing, for instance, how many people are sick with a certain disease could help institutions organize their healthcare facilities accordingly, but it would not give any additional information on how that disease is spread in that particular group of people in the absence of a denominator or a time reference.

1.2. Prevalence

A specific kind of proportion largely used in epidemiology is the prevalence. The prevalence can be defined as the proportion of cases of disease (or of another health-related phenomenon) at a certain time or period in the overall population.

Some refer to point prevalence when it is measured in a specific moment, and to period prevalence when it is measured over a defined period of time (a month, a year, etc.) [3]. The point prevalence can have big variations, especially if it measures diseases with a short duration, like infectious diseases that can wax and wane over relatively short periods of time. For instance, we could have a certain prevalence of influenza on one day and a very different one on the following day.

The period prevalence partially solves this problem since it covers a broader period of time and represents a good average of the phenomenon. The denominator is usually the population at the mid-point of the time period. So, if we consider the prevalence of a certain disease over a year, the denominator will be the overall population on June 30 (over a month, it would be the overall population on the 15th day of that month, and so on), and the numerator will be the number of people with the disease in that year (both new and existing cases). The period prevalence always considers the entire population, and this differentiates it from the incidence rate.

The prevalence has to be considered a static measure since it does not take time into account and also considers the existing cases of disease, not just the new cases. It is like a cross-sectional picture of a certain health-related phenomenon and it is mainly used to describe the fraction of a certain population affected by a disease or a risk factor. Its use is more frequently related to healthcare planning and the cost analysis of certain interventions [4].
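The two definitions above can be illustrated with a minimal Python sketch. The function name `prevalence` and all the figures are assumptions chosen only for illustration; they do not come from the book's datasets.

```python
def prevalence(cases: int, population: int) -> float:
    """Return prevalence as a proportion: cases divided by population."""
    if population <= 0:
        raise ValueError("population must be positive")
    if cases < 0:
        raise ValueError("cases cannot be negative")
    return cases / population


# Hypothetical figures, chosen only for illustration.
# Point prevalence: people with the disease on one given day.
point = prevalence(cases=150, population=10_000)

# Period prevalence over one year: new plus existing cases during the year,
# divided by the population at the mid-point of the period (June 30).
period = prevalence(cases=420, population=10_500)

print(f"Point prevalence:  {point:.1%}")
print(f"Period prevalence: {period:.1%}")
```

The same function serves both measures; only the choice of numerator (cases at one moment or over the whole period) and denominator (population at that moment or at the mid-point) changes.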

1.3. Incidence

The incidence is an epidemiologic measure that indicates the risk of developing a disease over a period of time; in other words, how many new cases of disease have occurred during a specific period of time in an at-risk population. It is a more dynamic measure compared to the prevalence, and can be calculated as a proportion (cumulative incidence, incidence proportion, or incidence risk) or, more frequently, as a rate (incidence rate, incidence density, or person-time incidence) [5].

The incidence proportion (or cumulative incidence or incidence risk) considers the number of new cases of disease in the numerator. The denominator is the population at risk for that disease at the beginning of the period of observation [6], so any individual counted in the denominator has, in theory, some chance of being counted in the numerator as well. Therefore, the denominator does not include people who already have the disease or people who surely cannot develop the disease—for instance, individuals fully immunised against a certain communicable disease.

The time reference of the incidence always has to be specified. If we had four new cases of measles in the last week in a group of 10 subjects, one of whom had been immunised and another of whom had already had the disease, the incidence proportion of measles would be four new cases divided by 8 (10 – 2) subjects at risk, per week, which is equal to 0.5 or 50%. This proportion could also be called the risk that a member of that group will develop measles in a week: that is why it is also called incidence risk.
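The arithmetic of this example can be checked with a short Python sketch. The function name `incidence_proportion` and its parameters are assumptions introduced here; the numbers are the ones given in the paragraph above.

```python
def incidence_proportion(new_cases: int, population: int, not_at_risk: int = 0) -> float:
    """New cases divided by the population at risk at the start of the period."""
    at_risk = population - not_at_risk
    if at_risk <= 0:
        raise ValueError("no one is at risk")
    if not 0 <= new_cases <= at_risk:
        raise ValueError("new cases must lie between 0 and the number at risk")
    return new_cases / at_risk


# 10 subjects, 2 of whom are not at risk (one immunised, one already
# affected), and 4 new cases during the week of observation.
risk = incidence_proportion(new_cases=4, population=10, not_at_risk=2)
print(f"Incidence proportion: {risk:.0%} per week")
```

Leaving the two subjects who are not at risk in the denominator would give 4/10 = 40% and understate the risk.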

The incidence rate (or incidence density or person-time incidence) shares the exact same numerator (number of new cases) as the incidence proportion, but since it is a rate, it also includes time in the denominator so that person-times at risk as well as persons at risk are factored into the formula and it can be more accurate. It is particularly helpful when the event happens to the same person more than once during the study period and the investigators want to take this into account [7].

The advantage of considering the incidence rate instead of the incidence proportion is the higher accuracy of the former. In fact, if the group above were a dynamic cohort with people coming in and out over the period of time considered (one week), we could have easily taken into account each person’s respective contribution to the denominator by only considering the actual at-risk periods each person spent in the cohort without giving the same weight to people staying in the cohort at risk for just one day and to people who were at risk for the entire period.
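A minimal Python sketch of the person-time idea follows. The cohort is hypothetical: the days at risk and the number of cases are invented for illustration, and `incidence_rate` is a name introduced here.

```python
def incidence_rate(new_cases: int, person_time: float) -> float:
    """New cases divided by the total person-time at risk."""
    if person_time <= 0:
        raise ValueError("person-time must be positive")
    return new_cases / person_time


# Hypothetical dynamic cohort observed for one week: the number of days
# each person actually spent at risk (people enter and leave the cohort).
days_at_risk = [7, 7, 7, 3, 1, 7, 5, 5]
new_cases = 4

person_days = sum(days_at_risk)      # total person-days at risk
person_weeks = person_days / 7       # the same amount in person-weeks

rate = incidence_rate(new_cases, person_weeks)
print(f"{person_days} person-days = {person_weeks:.1f} person-weeks")
print(f"Incidence rate: {rate:.2f} cases per person-week")
```

In this sketch a person at risk for a single day adds one person-day to the denominator, not a full week, which is the weighting the paragraph above describes.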

Figure 1.1: Relationship between incidence and prevalence.

The incidence is a more dynamic measure because it takes time into account. The incidence proportion gives more of an estimate of the individual risk of getting a certain disease by not taking into account all the at-risk periods of time, and only using at-risk individuals in the denominator. The incidence rate instead can be seen as an estimate of the speed of a certain health-related phenomenon by taking time into account in the computational formula.

In a steady-state situation, in which the inflow of subjects into the population equals the outflow, and with a steady incidence over time, the following relationship applies:

\[ \text{Prevalence} = \text{Incidence} \times \text{Average duration of disease} \]

Therefore, prevalence is affected both by changes in incidence and disease duration. In fact, if we notice an increase in prevalence of a certain disease, we can expect it to be due either to an increase in incidence (more new cases) or to an increase in disease duration (increased survival), or both (Figure 1.1).

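For instance, if we knew that the incidence of pancreatic cancer was 10/100,000 per year, and that its prevalence was 25/100,000 in a certain year, then we could estimate the average duration of the disease by dividing the prevalence by the incidence [1]:

\[ \text{Average duration} = \frac{\text{Prevalence}}{\text{Incidence}} = \frac{25/100{,}000}{10/100{,}000 \text{ per year}} = 2.5 \text{ years} \]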

1.4. Practical issues

1.4.1. Denominator issues

When investigators are dealing with large populations, the issue of the incidence denominator, which should include only the people (or person-times) actually at risk, is usually a minor one since only a small number of people can be considered not at risk.

Sometimes, however, this can actually be an issue. For instance, when investigators are dealing with the incidence of uterine cancer, women who have had hysterectomies have to be excluded from the denominator (together with males, of course) to prevent an underestimate of the actual incidence or mortality rate [8].

1.4.2. Numerator issues

The most important issue is to define who has the disease. In other words, we must determine who actually is “a case.” To do this, an accurate written definition of a “case” must be followed. There are diseases—like certain psychiatric conditions—that can follow different and more subjective diagnostic criteria. By using different diagnostic criteria, we can come up with different numerators and therefore a different incidence or prevalence. Biases in data collection in general can also potentially affect the measure of frequency obtained.

1.5. Practical examples

Figure 1.2: The simple dataset.

Figure 1.3: The Analyze menu.

Figure 1.4: Getting the results: the output window.

After opening the complete dataset, stored as a Microsoft Excel file, from SPSS (choose Open existing data source from the list shown and then Browse), use the Analyze menu to request the proportions for each field to be considered in the analysis, as shown in Figure 1.3 (click Analyze, then Descriptive statistics, and then Descriptives).

The next step is to select Variables V3, V4 and V5 and move them into the variables box and then click OK.

Once the OK button has been clicked, the absolute and relative frequencies will be available in the output window, with one table for each of the variables chosen (Figure 1.4). Since prevalence is a simple proportion, and in this specific case a point prevalence, the obtained proportion will be the proportion of people testing for HIV, or of people who are at high risk for HIV sexual transmission.
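For readers without SPSS, the same kind of frequency table can be sketched in Python. This is not the book's procedure: the variable `tested_for_hiv` and its values are hypothetical stand-ins for one yes/no field of the survey, and `frequency_table` is a helper introduced here.

```python
from collections import Counter


def frequency_table(values):
    """Return {value: (absolute frequency, relative frequency)}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: (n, n / total) for value, n in sorted(counts.items())}


# Hypothetical answers to one yes/no survey item (1 = yes, 0 = no).
tested_for_hiv = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

table = frequency_table(tested_for_hiv)
for value, (n, share) in table.items():
    print(f"value {value}: n = {n}, proportion = {share:.0%}")

# The relative frequency of the "yes" answers is the point prevalence.
point_prevalence = table[1][1]
```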

If the objective is instead to obtain incidence, a different dataset is needed because incidence cannot be computed with cross-sectional data for obvious reasons. For instance, a dataset about influenza cases in the northern Italian city of Parma can be used. The data were collected during the flu period by the week of surveillance and through an active surveillance system involving three general practitioners (GPs) for a total of 2,700 patients. Specimens were collected for each influenza-like illness (ILI) diagnosed by the GPs to search for influenza viruses and identify the different influenza strains [10].

Since the population is made up of patients (individuals) and not person-times, only the cumulative incidence can be calculated. The proportion of ILI cases (or viruses) detected by week of surveillance can be computed, and this again can be done by opening the dataset from SPSS, as shown in Figure 1.5.

Figure 1.5: The dataset in SPSS.

In order to get the cumulative incidence of influenza-like illnesses (ILIs) by week of surveillance, the values in the ILI column should be divided by the values in the Denominator field. To do this with SPSS, scroll down the Transform menu and choose the Compute option. A window will then open (see Figure 1.6). The variables on the left will have to be selected and moved into the Numeric Expression field in the upper right-hand side. They are then divided by using the computational symbols or functions shown in the bottom fields. The name of the target variable (the new column where the incidence results should be displayed) will also have to be typed into the upper left-hand side corner of the window.

Figure 1.6: Computing the new variable.

The characteristics of the new column for the incidence, originally called Var, can be modified by clicking on the Variable View sheet from the tool bar on the page. This can either be carried out before or even after computing the incidence with the Transform command as discussed above. The variable names can be modified by double-clicking on them. It is also possible to change the characteristics of the variables, and in this case, the number of visible decimals, since the incidences are very small (see Figure 1.7).

Figure 1.7: Variables view.

Going back to the data view sheet, the new outcome column should now be visible, in this case renamed as ILI_incidence (see Figure 1.8) with all the incidences by week of surveillance computed. The same type of calculations can also be carried out to obtain the cumulative incidences of virus isolation in the same population by week.

Figure 1.8: The final data view with the ILI incidences.
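The Compute step described above amounts to dividing one column by another, row by row. A minimal Python sketch is given below. The weekly ILI counts are hypothetical, because the Parma dataset is not reproduced in this text; only the denominator of 2,700 patients and the column names ILI, Denominator, and ILI_incidence are taken from the description above.

```python
# Hypothetical weekly ILI counts; only the denominator of 2,700 patients
# comes from the text.
rows = [
    {"week": 1, "ILI": 5, "Denominator": 2700},
    {"week": 2, "ILI": 12, "Denominator": 2700},
    {"week": 3, "ILI": 27, "Denominator": 2700},
    {"week": 4, "ILI": 18, "Denominator": 2700},
]

# Equivalent of the SPSS Compute step: a new column holding ILI / Denominator.
for row in rows:
    row["ILI_incidence"] = row["ILI"] / row["Denominator"]

for row in rows:
    per_thousand = row["ILI_incidence"] * 1000
    print(f"Week {row['week']}: {row['ILI_incidence']:.4f} "
          f"({per_thousand:.1f} per 1,000 patients)")
```

Expressing the result per 1,000 patients is one way around the very small decimals mentioned above.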

Statistical packages of this kind also make it quite straightforward to obtain grouped or stratified measures of incidence and prevalence. By scrolling through the Transform and Analyze menus, it is easy to find many different options for specific requests. The bottom line, though, is that the quality of the collected data and of their source cannot be stressed enough when accurate measures of occurrence are needed. That is something that no statistical package can correct or account for.

References

1. Signorelli C. Elementi di metodologia epidemiologica. 6th ed. Roma: Società Editrice Universo, 2005.

2. Rothman KJ, Greenland S. Modern Epidemiology. 2nd ed. Philadelphia: Lippincott-Raven Publishers, 1998.

3. Abramson JH. Making Sense of Data. 2nd ed. Oxford: Oxford University Press, 1994.

4. Beaglehole R, Bonita R, Kjellstrom T. Basic Epidemiology. 2nd ed. Geneva: WHO, 2007.

5. Last JM. A Dictionary of Epidemiology. 3rd ed. Oxford: Oxford University Press, 1995.

6. Lopalco PL, Tozzi AE. Epidemiologia facile. 1st ed. Rome: Il Pensiero Unico Editore, 2003.

7. Jekel JF, Elmore JG, Katz DL. Epidemiology, Biostatistics and Preventive Medicine. 2nd ed. Philadelphia: Saunders Text and Review Series, 2004.

8. Gordis L. Epidemiology. 3rd ed. Philadelphia: Elsevier Saunders, 2004.

9. Signorelli C, Pasquarella C, Limina RM, Colzani E, Fanti M, Cielo A, et al. Third Italian national survey on knowledge, attitudes, and sexual behaviour in relation to HIV/AIDS risk and the role of health education campaigns. Eur J Public Health 2006; 16: 498-504.

10. Tanzi ML. Data from the Regional Center for the Surveillance of viral diseases. Parma, 2002.

2. Measures of Association

Giuseppe La Torre1

1 Clinical Medicine and Public Health Unit, Sapienza University, Rome, Italy

Objectives of the chapter

To understand the concept of measures of association in different study designs.

To be able to calculate relative risk, risk difference, and odds ratio with statistical packages.

2.1. Relative risk

In epidemiology, the concept of relative risk (RR, sometimes called risk ratio) concerns a ratio of the probability (risk) of the event occurring in the exposed group versus the probability of the same event in a non-exposed group.

Exposure | Disease present | Disease absent
Drinking | a | b
No drinking | c | d

Table 2.1: A 2 by 2 contingency table: cohort study.

Consider the contingency table above (Table 2.1), corresponding to a cohort study, with the exposure status (i.e., drinking alcohol) in the rows and the disease status of the participants (present vs. absent) in the columns.

From Table 2.1, the risk of getting the disease for drinkers is a/(a + b), and the risk of getting the disease for nondrinkers is c/(c + d). The RR is defined as the ratio between the risk of getting the disease for drinkers and the risk of getting the disease for nondrinkers, i.e., [a/(a + b)] / [c/(c + d)]. In this case, the baseline risk of getting the disease comes from the non-exposed group, which serves as the reference.
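The formula can be sketched in a few lines of Python. The function name `relative_risk` and the counts are hypothetical; the cell labels a, b, c, and d follow the layout of Table 2.1.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Relative risk from a 2 by 2 table laid out as in Table 2.1.

    a, b: exposed with and without the disease
    c, d: non-exposed with and without the disease
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each exposure group needs at least one participant")
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    if risk_unexposed == 0:
        raise ValueError("RR is undefined when the non-exposed risk is zero")
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 40 of 100 drinkers and 20 of 100 nondrinkers
# develop the disease.
rr = relative_risk(a=40, b=60, c=20, d=80)
print(f"RR = {rr:.2f}")
```

An RR above 1 indicates a higher risk in the exposed group than in the reference group, and an RR below 1 a lower one.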

In a randomized clinical trial (RCT), we can define the rate [a/(a + b)] as the experimental event rate (EER), and the rate [c/(c + d)] as the control event rate (CER). If the exposure variable is not dichotomised, but three levels of exposure exist, one can still calculate the RRs, taking one level of exposure as the reference. Looking at Table 2.2, one can calculate the risk of getting the disease for each exposure category.
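Table 2.2 is not part of this excerpt, so the following Python sketch uses invented counts for three hypothetical exposure levels to show how each level is compared with a chosen reference level.

```python
# Hypothetical cohort with three exposure levels; each entry holds
# (participants with the disease, participants without the disease).
counts = {
    "none": (10, 90),
    "moderate": (20, 80),
    "heavy": (40, 60),
}
REFERENCE = "none"  # exposure level used as the baseline


def risk(diseased: int, healthy: int) -> float:
    """Risk of disease within one exposure category."""
    return diseased / (diseased + healthy)


risks = {level: risk(*cells) for level, cells in counts.items()}

# Each RR compares one exposure level with the reference level,
# so the reference level itself always has RR = 1.
relative_risks = {level: r / risks[REFERENCE] for level, r in risks.items()}

for level in counts:
    print(f"{level:>8}: risk = {risks[level]:.2f}, RR = {relative_risks[level]:.1f}")
```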
