Quality of Life - Peter M. Fayers - E-Book

Description

The assessment of patient-reported outcomes and health-related quality of life continues to be a rapidly evolving area of research, and this new edition reflects the field's development from an emerging subject to one that is an essential part of the assessment of clinical trials and other clinical studies.

The analysis and interpretation of quality-of-life assessments rely on a variety of psychometric and statistical methods, which are explained in this book in a non-technical way. The result is a practical guide that covers a wide range of methods and emphasizes the use of simple techniques illustrated with numerous examples, with extensive chapters covering qualitative and quantitative methods and the impact of guidelines. The material in this third edition reflects current teaching methods, and its content has been widened to address continuing developments in item response theory, computer adaptive testing, analyses with missing data, analysis of ordinal data, systematic reviews and meta-analysis.

This book is aimed at everyone involved in quality-of-life research and is applicable to medical and non-medical, statistical and non-statistical readers. It is of particular relevance for clinical and biomedical researchers within both the pharmaceutical industry and clinical practice.

Page count: 1183

Year of publication: 2015

To Tessa and Emma Fayers and Christine Machin

Quality of Life

The assessment, analysis and reporting of patient-reported outcomes

Third Edition

PETER M. FAYERS

Institute of Applied Health Sciences, University of Aberdeen School of Medicine and Dentistry, Scotland, UK and Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway

and

DAVID MACHIN

Medical Statistics Group, School of Health and Related Research, University of Sheffield, Sheffield, UK and Department of Cancer Studies and Molecular Medicine, University of Leicester, Leicester, UK

This edition first published 2016 © 2016 by John Wiley & Sons, Ltd

Second edition published 2007 © 2007 by John Wiley & Sons, Ltd

Registered office: John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial offices: 9600 Garsington Road, Oxford, OX4 2DQ, UK; The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK; 111 River Street, Hoboken, NJ 07030-5774, USA

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell

The right of the author to be identified as the author of this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting a specific method, diagnosis, or treatment by health science practitioners for any particular patient. The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. Readers should consult with a specialist where appropriate. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware that Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising herefrom.

A catalogue record for this book is available from the Library of Congress and British Library.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Cover image: © istockphoto/bopp63

CONTENTS

Preface to the third edition

Preface to the second edition

Preface to the first edition

Acknowledgements

List of abbreviations

Part 1: Developing and Validating Instruments for Assessing Quality of Life and Patient-Reported Outcomes

1: Introduction

1.1 Patient-reported outcomes

1.2 What is a patient-reported outcome?

1.3 What is quality of life?

1.4 Historical development

1.5 Why measure quality of life?

1.6 Which clinical trials should assess QoL?

1.7 How to measure quality of life

1.8 Instruments

1.9 Computer-adaptive instruments

1.10 Conclusions

2: Principles of measurement scales

2.1 Introduction

2.2 Scales and items

2.3 Constructs and latent variables

2.4 Single global questions versus multi-item scales

2.5 Single-item versus multi-item scales

2.6 Effect indicators and causal indicators

2.7 Psychometrics, factor analysis and item response theory

2.8 Psychometric versus clinimetric scales

2.9 Sufficient causes, necessary causes and scoring items

2.10 Discriminative, evaluative and predictive instruments

2.11 Measuring quality of life: reflective, causal and composite indicators?

2.12 Further reading

2.13 Conclusions

3: Developing a questionnaire

3.1 Introduction

3.2 General issues

3.3 Defining the target population

3.4 Phases of development

3.5 Phase 1: Generation of issues

3.6 Qualitative methods

3.7 Sample sizes

3.8 Phase 2: Developing items

3.9 Multi-item scales

3.10 Wording of questions

3.11 Face and content validity of the proposed questionnaire

3.12 Phase 3: Pre-testing the questionnaire

3.13 Cognitive interviewing

3.14 Translation

3.15 Phase 4: Field-testing

3.16 Conclusions

3.17 Further reading

4: Scores and measurements: validity, reliability, sensitivity

4.1 Introduction

4.2 Content validity

4.3 Criterion validity

4.4 Construct validity

4.5 Repeated assessments and change over time

4.6 Reliability

4.7 Sensitivity and responsiveness

4.8 Conclusions

4.9 Further reading

5: Multi-item scales

5.1 Introduction

5.2 Significance tests

5.3 Correlations

5.4 Construct validity

5.5 Cronbach’s α and internal consistency

5.6 Validation or alteration?

5.7 Implications for formative or causal items

5.8 Conclusions

6: Factor analysis and structural equation modelling

6.1 Introduction

6.2 Correlation patterns

6.3 Path diagrams

6.4 Factor analysis

6.5 Factor analysis of the HADS questionnaire

6.6 Uses of factor analysis

6.7 Applying factor analysis: Choices and decisions

6.8 Assumptions for factor analysis

6.9 Factor analysis in QoL research

6.10 Limitations of correlation-based analysis

6.11 Formative or causal models

6.12 Confirmatory factor analysis and structural equation modelling

6.13 Chi-square goodness-of-fit test

6.14 Approximate goodness-of-fit indices

6.15 Comparative fit of models

6.16 Difficulty-factors

6.17 Bifactor analysis

6.18 Do formative or causal relationships matter?

6.19 Conclusions

6.20 Further reading, and software

7: Item response theory and differential item functioning

7.1 Introduction

7.2 Item characteristic curves

7.3 Logistic models

7.4 Polytomous item response theory models

7.5 Applying logistic IRT models

7.6 Assumptions of IRT models

7.7 Fitting item response theory models: Tips

7.8 Test design and validation

7.9 IRT versus traditional and Guttman scales

7.10 Differential item functioning

7.11 Sample size for DIF analyses

7.12 Quantifying differential item functioning

7.13 Exploring differential item functioning: Tips

7.14 Conclusions

7.15 Further reading, and software

8: Item banks, item linking and computer-adaptive tests

8.1 Introduction

8.2 Item bank

8.3 Item evaluation, reduction and calibration

8.4 Item linking and test equating

8.5 Test information

8.6 Computer-adaptive testing

8.7 Stopping rules and simulations

8.8 Computer-adaptive testing software

8.9 CATs for PROs

8.10 Computer-assisted tests

8.11 Short-form tests

8.12 Conclusions

8.13 Further reading

Part 2: Assessing, Analysing and Reporting Patient-Reported Outcomes and the Quality of Life of Patients

9: Choosing and scoring questionnaires

9.1 Introduction

9.2 Finding instruments

9.3 Generic versus specific

9.4 Content and presentation

9.5 Choice of instrument

9.6 Scoring multi-item scales

9.7 Conclusions

9.8 Further reading

10: Clinical trials

10.1 Introduction

10.2 Basic design issues

10.3 Compliance

10.4 Administering a quality-of-life assessment

10.5 Recommendations for writing protocols

10.6 Standard operating procedures

10.7 Summary and checklist

10.8 Further reading

11: Sample sizes

11.1 Introduction

11.2 Significance tests, p-values and power

11.3 Estimating sample size

11.4 Comparing two groups

11.5 Comparison with a reference population

11.6 Non-inferiority studies

11.7 Choice of sample size method

11.8 Non-Normal distributions

11.9 Multiple testing

11.10 Specifying the target difference

11.11 Sample size estimation is pre-study

11.12 Attrition

11.13 Circumspection

11.14 Conclusion

11.15 Further reading

12: Cross-sectional analysis

12.1 Types of data

12.2 Comparing two groups

12.3 Adjusting for covariates

12.4 Changes from baseline

12.5 Analysis of variance

12.6 Analysis of variance models

12.7 Graphical summaries

12.8 Endpoints

12.9 Conclusions

13: Exploring longitudinal data

13.1 Area under the curve

13.2 Graphical presentations

13.4 Reporting

13.5 Conclusions

14: Modelling longitudinal data

14.1 Preliminaries

14.2 Auto-correlation

14.3 Repeated measures

14.4 Other situations

14.6 Conclusions

15: Missing data

15.1 Introduction

15.2 Why do missing data matter?

15.3 Types of missing data

15.4 Missing items

15.5 Methods for missing items within a form

15.6 Missing forms

15.7 Methods for missing forms

15.8 Simple methods for missing forms

15.9 Methods of imputation that incorporate variability

15.10 Multiple imputation

15.11 Pattern mixture models

15.12 Comments

15.13 Degrees of freedom

15.14 Sensitivity analysis

15.15 Conclusions

15.16 Further reading

16: Practical and reporting issues

16.1 Introduction

16.2 The reporting of design issues

16.3 Data analysis

16.4 Elements of good graphics

16.5 Some errors

16.6 Guidelines for reporting

16.7 Further reading

17: Death, and quality-adjusted survival

17.1 Introduction

17.2 Attrition due to death

17.3 Preferences and utilities

17.4 Multi-attribute utility (MAU) measures

17.5 Utility-based instruments

17.6 Quality-adjusted life years (QALYs)

17.7 Utilities for traditional instruments

17.8 Q-TWiST

17.9 Sensitivity analysis

17.10 Prognosis and variation with time

17.11 Alternatives to QALY

17.12 Conclusions

17.13 Further reading

18: Clinical interpretation

18.1 Introduction

18.2 Statistical significance

18.3 Absolute levels and changes over time

18.4 Threshold values: percentages

18.5 Population norms

18.6 Minimal important difference

18.7 Anchoring against other measurements

18.8 Minimum detectable change

18.9 Expert judgement for evidence-based guidelines

18.10 Impact of the state of quality of life

18.11 Changes in relation to life events

18.12 Effect size statistics

18.13 Patient variability

18.14 Number needed to treat

18.15 Conclusions

18.16 Further reading

19: Biased reporting and response shift

19.1 Bias

19.2 Recall bias

19.3 Selective reporting bias

19.4 Other biases affecting PROs

19.5 Response shift

19.6 Assessing response shift

19.7 Impact of response shift

19.8 Clinical trials

19.9 Non-randomised studies

19.10 Conclusions

20: Meta-analysis

20.1 Introduction

20.2 Defining objectives

20.3 Defining outcomes

20.4 Literature searching

20.5 Assessing quality

20.6 Summarising results

20.7 Measures of treatment effect

20.8 Combining studies

20.9 Forest plot

20.10 Heterogeneity

20.11 Publication bias and funnel plots

20.12 Conclusions

20.13 Further reading

Appendix 1: Examples of Instruments

Appendix 2: Statistical tables

Table T1: Normal distribution

Table T2: Probability points of the Normal distribution

Table T3: Student’s t-distribution

Table T4: The χ² distribution

Table T5: The F-distribution

References

Index

EULA

List of Tables

Chapter 1

Table 1.1

Table 1.2

Table 1.3

Table 1.4

Table 1.5

Chapter 2

Table 2.1

Chapter 4

Table 4.1

Table 4.2

Table 4.3

Table 4.4

Table 4.5

Table 4.6

Table 4.7

Table 4.8

Table 4.9

Table 4.10

Table 4.11

Table 4.12

Table 4.13

Table 4.14

Table 4.15

Table 4.16

Table 4.17

Chapter 5

Table 5.1

Table 5.2

Chapter 6

Table 6.1

Table 6.2

Table 6.3

Table 6.4

Table 6.5

Table 6.6

Table 6.7

Table 6.8

Chapter 7

Table 7.1

Table 7.2

Table 7.3

Table 7.4

Table 7.5

Table 7.6

Table 7.7

Chapter 10

Table 10.1

Chapter 11

Table 11.1

Table 11.2

Table 11.3

Table 11.4

Table 11.5

Table 11.6

Chapter 12

Table 12.1

Table 12.2

Table 12.3

Table 12.4

Table 12.5

Table 12.6

Table 12.7

Table 12.8

Table 12.9

Chapter 13

Table 13.1

Chapter 14

Table 14.1

Table 14.2

Table 14.3

Table 14.4

Table 14.5

Table 14.6

Table 14.7

Chapter 15

Table 15.1

Table 15.2

Table 15.3

Table 15.4

Chapter 17

Table 17.1

Table 17.2a

Table 17.3

Table 17.4

Table 17.5

Chapter 18

Table 18.1

Table 18.2

Table 18.3

Table 18.4

Table 18.5

Table 18.6

Table 18.7

Table 18.8

Table 18.9

Table 18.10

Chapter 19

Table 19.1

Chapter 20

Table 20.1

List of Illustrations

Chapter 2

Figure 2.1 Scales, indexes, profiles and batteries.

Chapter 3

Figure 3.1 The Linear Analogue Self Assessment scale (LASA) is an example of a visual analogue scale.

Figure 3.2 Physical functioning scale of the EORTC QLQ-C30 versions 1.0 and 2.0.

Chapter 4

Figure 4.1 Sample size for two-observation ICCs, such as test-retest studies. The plot shows the effect of sample size on the average distance to the lower limit of an (asymmetric) two-sided 95% confidence interval. For example, with 80 patients on average the 95% CI for an ICC of 0.8 will have a lower limit of slightly above 0.7 (distance slightly less than 0.1).

Chapter 5

Figure 5.1 Clinimetric scales and scales with causal items.

Chapter 6

Figure 6.1 Comparison of approaches to analysis.

Figure 6.2 Postulated structure of the HADS questionnaire.

Figure 6.3 Plot of Factor 2 against Factor 1, using the rotated factors from Table 6.5.

Figure 6.4 Scree plot of the eigenvalues in Table 6.3.

Figure 6.5 Histograms of the 14 HADS items, using the dataset from Table 6.1.

Figure 6.6 Conventional EFA model for the RSCL, showing factors for general psychological distress, pain, nausea/vomiting and symptoms/side-effects. Only 17 out of the 30 items on the main RSCL are shown.

Figure 6.7 Postulated causal structure for 17 items on the RSCL. Treatment- or disease-related symptoms and side-effects may be causal rather than effect indicators.

Figure 6.8 A PhysicalHealth/MentalHealth structural model for the EORTC QLQ-C30. Source: Gundy et al., 2012, Figure 1(b). CC BY-NC 2.0 (http://creativecommons.org/licenses/by-nc/2.0/uk/). Reproduced commercially with permission of Springer Science+Business Media.

Figure 6.9 Plot of Factor 2 against Factor 1, using the rotated factors from EFA of the mobility items. Source: Helbostad et al., 2011, Figure 1. Reproduced with permission of Springer Science+Business Media.

Chapter 7

Figure 7.1 Item characteristic curves (ICCs) for two items of differing difficulty.

Figure 7.2 Item characteristic curves for two items of differing discrimination and difficulty.

Figure 7.3 Item characteristic curves for four items, each with four categorical response options. The intersections between adjacent categories correspond to threshold parameters of the Generalized Partial Credit Model. Items A and C exhibit good properties, having steep slopes and covering different trait levels; item B covers a similar range of levels to C, but has weaker discrimination illustrated by less steep slopes; Item D is a weak item with disordered thresholds.

Figure 7.4 Item information curves corresponding to the four items of Figure 7.3. Items A and C are markedly superior to items B and D.

Figure 7.5 Standard error of MMSE score estimates in palliative care patients, showing that a 6-item test has similar properties to the full MMSE. Source: Adapted from Fayers et al., 2005, Figures 2 and 4. Reproduced with permission of Elsevier.

Figure 7.6 Tips for fitting IRT models.

Figure 7.7 DIF analysis of EORTC QLQ-C30 2-item pain scale, by geographical region.

Figure 7.8 Tips for exploring DIF.

Chapter 8

Figure 8.1 Chart representing the computer algorithm for a computer adaptive test (CAT).

Figure 8.2 Item banking.

Figure 8.3 Stages in the development of an item bank, for use in a computer adaptive test (CAT).

Figure 8.4 Information function and standard error of measurement for the HIT item pool compared with the population distribution of headache impact. Source: Bjorner et al., 2003, Figure 3. Reproduced with permission from Springer Science and Business Media.

Figure 8.5 Advantages and disadvantages of CAT.

Figure 8.6 Relation between HIT scores based on the full 54-item pool and the CAT based on 6, 10, 13 or 20 items. Source: Ware et al., 2003, Figure 2. Reproduced with permission of Springer Science and Business Media.

Figure 8.7 Issues to be considered for CAT software.

Chapter 9

Figure 9.1 The SF-36 health status of 318 adults with upper respiratory tract infections (URIs). T-scores were calculated using the general USA population. The URI patients were also contrasted against patients with lung disease, osteoarthritis and depression. Note: * p < 0.001, † p < 0.05 for comparisons with URI. Source: Reproduced with kind permission of Springer Science and Business Media. Linder JA and Singer DE (2003) Health-related quality of life of adults with upper respiratory tract infections. Journal of General Internal Medicine 18: 802–807.

Chapter 10

Figure 10.1 Questionnaire to ascertain the reason why a patient has not completed the current QoL assessment.

Figure 10.2 Patient QoL Information Leaflet.

Figure 10.3 Checklist for writing clinical trials protocols.

Chapter 11

Figure 11.1 The total sample size required to detect an effect of a specified size, using a two-sample t-test to compare two unpaired means with α of 0.05 and powers of 80% and 90%.

Figure 11.2 Sample size multiplication factors to compensate for multiple comparisons when applying a Bonferroni correction.

Chapter 12

Figure 12.1 Cumulative distribution function of responses for Aricept® 5 and 10 mg doses compared to placebo. Important change thresholds considered for ADAS-cog score decreases over 24 weeks are 7, 4 and 0 points. Source: Reproduced from ARICEPT® Oral Solution (Donepezil Hydrochloride) [approval label, Figure 2]. Available at: http://www.accessdata.fda.gov/drugsatfda_docs/label/2004/21719lbl.pdf.

Figure 12.2 Histogram of baseline emotional functioning (EF) in patients with multiple myeloma. Source: Data from Wisløff et al., 1996.

Figure 12.3 Bar chart illustrating the baseline EF scores from multiple myeloma patients, by gender and age group. Source: Data from Wisløff et al., 1996.

Figure 12.4 Scatter plot of the baseline EF in patients with multiple myeloma at different ages. Source: Data from Wisløff et al., 1996.

Figure 12.5 Box-and-whisker plot of the baseline EF in patients with multiple myeloma, by gender and age group. Source: Data from Wisløff et al., 1996.

Figure 12.6 Profile of function and symptom values at one month after commencing treatment with α-interferon (bold line) or without α-interferon (thin line), in patients with myeloma. Source: Data from Wisløff et al., 1996.

Chapter 13

Figure 13.1 Pain profiles of two patients with severe burns. Source: Data from Ang et al., 2003.

Figure 13.2 A trial comparing two radiotherapy regimens for non-small-cell lung cancer, which used a daily diary card to assess QoL. Source: Bleehen et al., 1992, Table 3. Reprinted by permission from Macmillan Publishers Ltd on behalf of Cancer Research UK.

Figure 13.3 Scatter plot showing the timing of FACT QoL assessments in a group of anaemic cancer patients. Each symbol represents a QoL assessment for a patient at the specified time in the study. The last assessment for a patient who withdrew early is represented by a star, the last assessment for a completer is a closed circle, and a continuing assessment is shown by an open circle. Source: Fallowfield et al., 2002, Figure 2. Reprinted with permission of Macmillan Publishers Ltd on behalf of Cancer Research UK.

Figure 13.4 Scatter plot of the HADS depression score at each assessment against day of assessment for patients with small-cell lung cancer. Source: Machin and Weeden 1998. Reproduced with permission of John Wiley & Sons, Ltd.

Figure 13.5 Box-and-whisker plots of HADS depression scores for the 10 scheduled assessments of patients with small-cell lung cancer recruited to the MRC Lung Cancer Working Party trial. Source: Machin and Weeden 1998. Reproduced with permission of John Wiley & Sons, Ltd.

Figure 13.6 Profiles of HADS anxiety score for patients with small-cell lung cancer (a) according to number of assessments completed, (b) by all patients and (c) by treatment group. Source: Hopwood et al., 1994, Figures 2, 3 and 4. Reproduced with permission of Springer Science and Business Media.

Figure 13.7 Mean HADS depression score in patients with small-cell lung cancer, reverse plotted from date of last assessment, subdivided by the number of available assessments. Source: Machin and Weeden, 1998. Reproduced with permission of John Wiley & Sons, Ltd.

Figure 13.8 Kaplan–Meier estimates of the survival curves for patients with myeloma, by treatment received. The apparent survival advantage to MP+IFN (relative risk, RR, = 1.08) is not statistically significant. Source: Data from Wisløff et al., 1996.

Figure 13.9 Histograms of AUC₃₆ for patients with myeloma, by treatment received. A Normal distribution is superimposed. Source: Data from Wisløff et al., 1996.

Chapter 14

Figure 14.1 Scatter plot of emotional functioning (EF) for multiple myeloma patients, prior to treatment and one month after starting treatment. Source: Data from Wisløff et al., 1996.

Figure 14.2 Pairwise scatter diagrams for EF of multiple myeloma patients immediately prior to and during the first 36 months of therapy. Source: Data from Wisløff et al., 1996.

Figure 14.3 Pain score changes for a patient receiving arnica following total abdominal hysterectomy. Source: Adapted from Hart et al., 1997. Reproduced with permission of Sage Publications, Ltd.

Figure 14.4 Mean levels of fatigue in patients with multiple myeloma, before and during treatment with MP or MP+IFN. Source: Data from Wisløff et al., 1996.

Figure 14.5 Brachytherapy or metal stent for palliation of dysphagia from oesophageal cancer. Source: Homs et al., 2004. Reproduced with permission of Elsevier.

Chapter 15

Figure 15.1 Mean physical functioning (PF) score stratified by time of dropout due to death or non-compliance in patients with metastatic breast cancer. Source: Curran et al., 1998a, Figure 14.2. Reproduced with permission of Oxford University Press.

Figure 15.2 Kaplan–Meier estimate of the time to progression for patients with metastatic breast cancer. Source: Curran et al., 1998a, Figure 14.1. Reproduced with permission of Oxford University Press.

Figure 15.3 The emotional functioning (EF) scale of the EORTC QLQ-C30.

Figure 15.4 Illustration of simple imputation methods: imputing an item for a single patient.

Figure 15.5 Pattern mixture model with multiple imputation.

Figure 15.6 The course of PF over time by treatment, estimated using pattern mixture models. Source: Post et al., 2010, Figure 5. CC NC 4.0 (http://creativecommons.org/licenses/by/4.0/). Reproduced commercially with permission of Springer Science+Business Media.

Chapter 16

Figure 16.1 Histogram of baseline emotional functioning (EF) in patients with multiple myeloma. Source: Data from Wisløff et al., 1996.

Figure 16.2 Mean levels of fatigue in patients with multiple myeloma, before and during treatment with MP or MP+IFN: (a) bar chart; (b) line plot. Source: Data from Wisløff et al., 1996.

Figure 16.3 Baseline EF by age, in patients with multiple myeloma: (a) scatter plot; (b) scatter plot with ‘jitter’ and marginal box plots. In the box plots, the box shows the 25% and 75% quartiles and the median, and the ‘whiskers’ indicate the so-called ‘adjacent values’. Source: Data from Wisløff et al., 1996.

Figure 16.4 Change in EF from baseline, plotted against the baseline EF, in patients with multiple myeloma: (a) plotting (EF₁ – EF₀) against EF₀; (b) substituting random numbers in place of EF₀, the initial measurement. Source: Data from Wisløff et al., 1996.

Chapter 17

Figure 17.1 Example of a choice in the DCE. Source: Ryan et al., 2006. Reproduced with permission of Elsevier.

Figure 17.2 Partitioned survival curves for the two treatment arms of the metastatic colorectal cancer trial. (a) panitumumab+BSC and (b) BSC alone, where BSC=best supportive care. REL is the relapse period until death or end of follow up; TOX represents days with grade 3 or worse adverse events; TWiST is the time without symptoms or toxicity. Source: Wang et al., 2011, Figure 1. Reprinted with permission of Macmillan Publishers Ltd on behalf of Cancer Research UK.

Figure 17.3 Threshold utility analysis, showing differences in Q-TWiST (number of weeks) for varying toxicity and relapse utility levels. Positive numbers indicate a benefit in favour of the panitumumab arm. Source: Wang et al., 2011, Figure 3. Reprinted with permission of Macmillan Publishers Ltd on behalf of Cancer Research UK.

Figure 17.4 Change in Q-TWiST gain (months) over time for women with breast cancer receiving Short or Long duration chemotherapy; ‘gain’ indicates the advantage of Long over Short. Source: Gelber et al., 1995, Figure 4.

Chapter 18

Figure 18.1 Age-distribution of the mean scores for the subscales of the EORTC QLQ-C30 (version 2.0), for males (0) and females (Δ) from a sample of the general Norwegian population. Source: Adapted from Hjermstad et al., 1998a. Reproduced with permission from the American Society of Clinical Oncology.

Figure 18.2 EORTC QLQ-C30 scale scores for 439 patients with rectal cancer, divided into non-advanced disease (NAD), locally advanced rectal cancer (LARC) and locally recurrent rectal cancer (LRRC). The bold line shows age- and gender-matched reference values in a sample of the general population in the Netherlands. The reference group had highest QoL, best functioning and least symptoms (based on the data shown in Table 18.3).

Figure 18.3 Bar chart of the mean EORTC QLQ-C30 scale scores shown in Figure 18.2, for 439 patients with rectal cancer, divided into non-advanced disease (NAD), locally advanced rectal cancer (LARC) and locally recurrent rectal cancer (LRRC) (based on the data shown in Table 18.3).

Figure 18.4 Data of Figure 18.2, showing mean differences of the patient scores from the age- and gender-matched reference values of the general population for 439 patients with rectal cancer, divided into non-advanced disease (NAD), locally advanced rectal cancer (LARC) and locally recurrent rectal cancer (LRRC). See footnote to Table 18.3 for scale names.

Figure 18.5 Each graph shows the mean score change and 95% CI for 239 patients with multiple myeloma who stated that they had become much better, moderately better, a little better, unchanged (0), a little worse, moderately worse, or much worse from time 1 (baseline) to time 2 (three months). High scores indicate more symptoms (pain and fatigue) or better functioning (physical function and global HRQL). Source: Kvam et al., 2010. Reproduced with permission of John Wiley & Sons, Ltd.

Figure 18.6 Receiver-operating characteristic (ROC) curve of the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 change score in patients who stated that their pain had deteriorated. AUC, area under the curve. Source: Kvam et al., 2010. Reproduced with permission of John Wiley & Sons, Ltd.

Figure 18.7 Data of Figure 18.4, showing effect sizes instead of absolute mean differences. For 439 patients with rectal cancer, divided into non-advanced disease (NAD), locally advanced rectal cancer (LARC) and locally recurrent rectal cancer (LRRC), compared to age- and gender-matched reference values of the general population. See footnote to Table 18.3 for scale names.

Chapter 19

Figure 19.1 Sprangers and Schwartz (1999) theoretical model of response shift and quality of life. Source: Sprangers and Schwartz, 1999, Figure 1. Reproduced with permission of Elsevier.

Figure 19.2 Ratings of QoL varied according to frame of reference. Source: Fayers et al., 2007. Reproduced with permission of Elsevier.

Figure 19.3 Meta-analysis of response shift affecting assessment of global QoL (see Chapter 20 for an explanation of the ‘Forest plot’). Note: See Schwartz et al., 2006 for detailed references of studies cited. Source: Schwartz et al., 2006, Figure 3. Reproduced with permission of Springer Science and Business Media.

Chapter 20

Figure 20.1 Forest plot of St John’s Wort for depression. The solid squares denote individual mean effects and the horizontal lines represent 95%

CI

s. The diamonds denote pooled weighted mean differences. Note: See Linde

et al.

, 2005 for detailed references of studies cited.

Source:

Linde

et al.,

2005, comparison 02, outcome 04. Reproduced with permission of John Wiley & Sons, Ltd.

Figure 20.2 Forest plot comparing placebo control against extracts from the Zingiberaceae family of plants for treatment of chronic pain. Standardised mean differences (SMDs) are shown for each study, sorted by date of publication. (Based on Lakhan et al., 2015).

Figure 20.3 Forest plot of pain severity following open mesh or non-mesh groin hernia repairs.

Source: Scott et al., 2001. Reproduced with permission of John Wiley & Sons, Ltd.

Figure 20.4 Funnel plot of 23 placebo-controlled trials that reported information on the number of responders according to the Hamilton Rating Scale for Depression.

Source: Linde et al., 2005, Figure 1. Reproduced with permission of John Wiley & Sons, Ltd.

Preface to the third edition

When the first edition of this book was published in 2000, the assessment of quality of life (QoL) as an important outcome in clinical trials and other research studies was, at best, controversial. More traditional endpoints were the norm – measures such as disease status, cure and patient’s survival time dominated in research publications. How times have changed. Nowadays it is generally accepted that the patients’ perspective is paramount, patient representatives are commonly involved in the design of clinical trials, and patient-reported outcomes (PROs) have become recognised as standard outcomes that should be assessed and reported in a substantial proportion of trials, either as secondary outcomes or, in many instances, as a primary outcome from the study. Indeed, in 2000 the term ‘patient-reported outcome’ hardly existed and the focus at that time was on the ill-defined but all embracing concept of ‘quality of life’. Now, we regard QoL as but one PRO, with the latter encompassing anything reported by ‘asking the patient’ – symptoms such as pain or depression, physical or other functioning, mobility, activities of daily living, satisfaction with treatment or other aspects of management, and so on. Drug regulatory bodies have also embraced PROs and QoL as endpoints, while at the same time demanding higher standards of questionnaire development and validation.

In parallel with this, research into instrument development, validation and application continues to grow apace. There is increasing recognition of the importance of qualitative methods to secure a solid foundation when developing new instruments, and a corresponding rigour in applying and reporting qualitative research. There has also been a radical shift towards using item response theory, both as a tool for developing and validating new instruments and as the basis of computer-adaptive tests (CATs). Many of the major research groups have been developing new CAT instruments for assessing PROs, and this new generation of questionnaires is becoming widely available for use on computer tablets and smartphones.

Analysis, too, has benefited in various ways from the increased importance being attached to PROs. Two examples are (i) methods for handling missing data, in particular for reducing the biases that can arise when data are missing, and (ii) the greater rigour demanded for the reporting of PROs.

As a consequence of these and many other developments, we have taken the opportunity to update many chapters. The examples, too, have been refreshed and largely brought up to date, although some of the classic citations still stand proud and have been retained. A less convenient aspect of the changes is, perhaps, the resultant increase in page count.

We continue to be grateful to our many colleagues; their continued encouragement and enthusiasm have fuelled the energy to produce this latest edition. Mogens Groenvold in particular contributed to the improvement of Chapter 3.

Peter M. Fayers and David Machin
September 2015

Preface to the second edition

We have been gratified by the reception of the first edition of this book, and this new edition offers the opportunity to respond to the many suggestions we have received for further improving and clarifying certain sections. In most cases the changes have meant expanding the text, to reflect new developments in research.

Chapters have been reorganised, to follow a more logical sequence for teaching. Thus sample size estimation has been moved to Part C, Clinical Trials, because it is needed for trial design. In the first edition it followed the chapters about analysis where we discussed choice of statistical tests, because the sample size computation depends on the test that will be used.

Health-related quality of life is a rapidly evolving field of research, and this is illustrated by shifting names and identity: quality of life (QoL) outcomes are now also commonly called patient- (or person-) reported outcomes (PROs), to reflect more clearly that symptoms and side effects of treatment are included in the assessments; we have adopted that term as part of the subtitle. Drug regulatory bodies have also endorsed this terminology, with the USA Food and Drug Administration (US FDA) bringing out guidance notes concerning the use of PROs in clinical trials for new drug applications; this new edition reflects the FDA (draft) recommendations.

Since the first edition of this book there have been extensive developments in item response theory and, in particular, computer-adaptive testing; these are addressed in a new chapter. Another area of growth has been in systematic reviews and meta-analysis, as evinced by the formation of a Quality of Life Methods Group by the Cochrane Collaboration. QoL presents some particular challenges for meta-analysis, and this led us to include the final chapter.

We are very grateful to the numerous colleagues who reported finding this book useful, some of whom also offered constructive advice for this second edition.

Peter M. Fayers and David Machin
June 2006

Preface to the first edition

Measurement of quality of life has grown to become a standard endpoint in many randomised controlled trials and other clinical studies. In part, this is a consequence of the realisation that many treatments for chronic diseases frequently fail to cure, and that there may be limited benefits gained at the expense of taking toxic or unpleasant therapy. Sometimes therapeutic benefits may be outweighed by quality of life considerations. In studies of palliative therapy, quality of life may become the principal or only endpoint of consideration. In part, it is also recognition that patients should have a say in the choice of their therapy, and that patients place greater emphasis upon non-clinical aspects of treatment than healthcare professionals did in the past. Nowadays, many patients and patient-support groups demand that they should be given full information about the consequences of their disease and its therapy, including impact upon aspects of quality of life, and that they should be allowed to express their opinions. The term quality of life has become a catch-phrase, and patients, investigators, funding bodies and ethical review committees often insist that, where appropriate, quality of life should be assessed as an endpoint for clinical trials.

The assessment, analysis and interpretation of quality of life rely upon a variety of psychometric and statistical methods, many of which may be less familiar than the other techniques used in medical research. Our objective is to explain these techniques in a non-technical way. We have assumed some familiarity with basic statistical ideas, but we have avoided detailed statistical theory. Instead, we have tried to write a practical guide that covers a wide range of methods. We emphasise the use of simple techniques in a variety of situations by using numerous examples, taken both from the literature and from our own experience. A number of these inevitably arise from our own particular field of interest – cancer clinical trials. This is also perhaps justifiable in that much of the pioneering work on quality of life assessment occurred in cancer, and cancer still remains the disease area that is associated with the largest number of quality of life instruments and the most publications. However, the issues that arise are common to quality of life assessment in general.

Acknowledgements

We would like to say a general thank you to all those with whom we have worked on aspects of quality of life over the years; especially, past and present members of the EORTC Quality of Life Study Group, and colleagues from the former MRC Cancer Therapy Committee Working Parties. Particular thanks go to Stein Kaasa of the Norwegian University of Science and Technology at Trondheim who permitted PMF to work on this book whilst on sabbatical and whose ideas greatly influenced our thinking about quality of life, and to Kristin Bjordal of The Radium Hospital, Oslo, who made extensive input and comments on many chapters and provided quality of life data that we used in examples. Finn Wisløff, for the Nordic Myeloma Study Group, very kindly allowed us to make extensive use of their QoL data for many examples. We are grateful to the National Medical Research Council of Singapore for providing funds and facilities to enable us to complete this work. We also thank Dr Julian Thumboo, Tan Tock Seng Hospital, Singapore, for valuable comments on several chapters. Several chapters, and Chapter 7 in particular, were strongly influenced by manuals and guidelines published by the EORTC Quality of Life Study Group.

Peter M. Fayers and David Machin
January 2000

List of abbreviations

Note: We have adopted the policy of using italics to indicate variables, or things that take values.

α: Alpha, Type I error
α_Cronbach: Cronbach’s reliability coefficient
β: Beta, Power, 1 − Type II error
κ: Kappa, Cohen’s measure of agreement
θ: Theta, an unobservable or “latent” variable
ρ: Rho, the correlation coefficient
σ: Sigma, the population standard deviation, estimated by SD
ADL: Activities of daily living
ANOVA: Analysis of variance
ARR: Absolute risk reduction
AUC: Area under the curve
CAT: Computer-adaptive test
CFA: Confirmatory factor analysis
CI: Confidence interval
CONSORT: Consolidated Standards of Reporting Trials (http://www.consort-statement.org/)
CPMP: Committee for Proprietary Medicinal Products (European regulatory body)
DCE: Discrete choice experiment
df: Degrees of freedom
DIF: Differential item functioning
EF: Emotional functioning
EFA: Exploratory factor analysis
ES: Effect size
F-statistic: The ratio of two variance estimates; also called F-ratio
F-test: The statistical test used in ANOVA, based on the F-statistic
GEE: Generalised estimating equation
HYE: Healthy-years equivalent
IADL: Instrumental activities of daily living
ICC: Intraclass correlation
ICC: Item characteristic curve
IRT: Item response theory
LASA: Linear analogue self-assessment (scale)
MANOVA: Multivariate analysis of variance
MAR: Missing at random
MCAR: Missing completely at random
MIMIC: Multiple indicator multiple cause (model)
ML: Maximum likelihood (estimation)
MNAR: Missing not at random
MTMM: Multitrait–multimethod
NNT: Number needed to treat
NS: Not statistically significant
OR: Odds ratio
p: Probability, as in p-value
PF: Physical functioning
PRO: Patient-reported outcome
QALY: Quality-adjusted life years
QoL: Quality of life
Q-TWiST: Quality-adjusted time without symptoms and toxicity
RCT: Randomised controlled trial
RE: Relative efficiency
RR: Relative risk
RV: Relative validity
SD: Standard deviation of a sample
SE: Standard error
SEM: Standard error of measurement
SEM: Structured equation model
SG: Standard gamble
SMD: Standardised mean difference
SRM: Standardised response mean
t: Student’s t-statistic
TTO: Time trade-off
TWiST: Time without symptoms and toxicity
VAS: Visual analogue scale
WMD: Weighted mean difference
WTP: Willingness to pay

QoL instruments

AIMS: Arthritis Impact Measurement Scale
AMPS: Assessment of Motor and Process Skills
AQLQ: Asthma Quality of Life Questionnaire
BDI: Beck Depression Inventory
BI: Barthel Index of disability
BPI: Brief Pain Inventory
BPRS: Brief Psychiatric Rating Scale
EORTC QLQ-C30: European Organisation for Research and Treatment of Cancer, Quality of Life Questionnaire, 30-items
EQ-5D: EuroQoL EQ-5D self report questionnaire
FACT-G: Functional Assessment of Cancer–General Version
FAQ: Functional Activity Questionnaire
FLIC: Functional Living Index–Cancer
GPH: General Perceived Health
HADS: Hospital Anxiety and Depression Scale
HDQoL: Huntington’s Disease health-related Quality of Life questionnaire
HOPES: HIV Overview of Problems–Evaluation System
HRSD: Hamilton Rating Scale for Depression
HUI: Health Utilities Index
MFI-20: Multidimensional Fatigue Inventory 20
MMSE: Mini-Mental State Examination
MPQ: McGill Pain Questionnaire
NHP: Nottingham Health Profile
PACIS: Perceived Adjustment to Chronic Illness Scale
PAQLQ: Pediatric Asthma Quality of Life Questionnaire
PASS: Pain Anxiety Symptoms Scale
PCQLI: Pediatric Cardiac Quality of Life Inventory
PGI: Patient Generated Index
POMS: Profile of Mood States
QOLIE-89: Quality of Life in Epilepsy
RSCL: Rotterdam Symptom Checklist
SEIQoL: Schedule for Evaluation of Individual Quality of Life
SF-36: Short Form 36
SIP: Sickness Impact Profile
WPSI: Washington Psychosocial Seizure Inventory

PART 1 Developing and Validating Instruments for Assessing Quality of Life and Patient-Reported Outcomes

1 Introduction

Summary

A key methodology for the evaluation of therapies is the randomised controlled trial (RCT). These clinical trials traditionally considered relatively objective clinical outcome measures, such as cure, biological response to treatment, or survival. Later, investigators and patients alike have argued that subjective indicators should also be considered. These subjective patient-reported outcomes are often regarded as indicators of quality of life. They comprise a variety of outcome measures, such as emotional functioning (including anxiety and depression), physical functioning, social functioning, pain, fatigue, other symptoms and toxicity. A large number of questionnaires, or instruments, have been developed for assessing patient-reported outcomes and quality of life, and these have been used in a wide variety of circumstances. This book is concerned with the development, analysis and interpretation of data from these quality of life instruments.

1.1 Patient-reported outcomes

This book accepts a broad definition of quality of life, and discusses the design, application and use of single- and multi-item, subjective, measurement scales. This encompasses not just ‘overall quality of life’ but also the symptoms and side effects that may or may not reflect – or affect – quality of life. Some researchers prefer to emphasise that we are only interested in health aspects, as in health-related quality of life (HRQoL or HRQL), while others adopt the terms patient-reported outcomes (PROs) or patient-reported outcome measures (PROMs), because those terms indicate interest in a whole host of outcomes, such as pain, fatigue, depression through to physical symptoms such as nausea and vomiting. But not all subjects are ‘patients’ who are ill; it is also suggested that PRO could mean person-reported outcome. Health outcomes assessment has also been proposed, which emphasises that the focus is on health issues and also avoids specifying the respondent: for young children and for the cognitively impaired we may use proxy assessment for cognitive reasons. And for many years some questionnaires have focused on health status or self-reported health (SRH), with considerable overlap to quality of life.

From a measurement perspective, this book is concerned with all the above. For simplicity we will use the now well-established overall term quality of life (QoL) to indicate (a) the set of outcomes that contribute to a patient’s well-being or overall health, or (b) a summary measure or scale that purports to describe a patient’s overall well-being or health. Examples of summary measures for QoL include general questions such as ‘How good is your overall quality of life?’ or ‘How do you rate your overall health?’ that represent global assessments. When referring to outcomes that reflect individual dimensions, we use the acronym PROs. Examples of PROs are pain or fatigue; symptoms such as headaches or skin irritation; function, such as social and role functioning; issues such as body image or existential beliefs; and so on. Mostly, we shall assume the respondent is the patient or person whose experience we are interested in (self-report), but it could be a proxy.

The measurement issues for all these outcomes are similar. Should we use single- or multi-item scales? Content and construct validity – are we measuring what we intend? Sensitivity, reliability, responsiveness – is the assessment statistically adequate? How should such assessments be incorporated into clinical studies? And how do we analyse, report and interpret the results?

1.2 What is a patient-reported outcome?

The definition of patient-reported outcome is straightforward, and has been described as “any report of the status of a patient’s health condition that comes directly from the patient, without interpretation of the patient’s response by a clinician or anyone else” (US FDA, 2009). A PRO can be measured by self-report or by interview provided that the interviewer records only the patient’s response. The outcome can be measured in absolute terms (e.g. severity of a symptom, sign or state of a disease) or as a change from a previous assessment.

1.3 What is quality of life?

In contrast to PRO, the term quality of life is ill-defined. The World Health Organization (WHO, 1948) declares health to be ‘a state of complete physical, mental and social well-being, and not merely the absence of disease’. Many other definitions of both ‘health’ and ‘quality of life’ have been attempted, often linking the two and, for QoL, frequently emphasising components of happiness and satisfaction with life. In the absence of any universally accepted definition, some investigators argue that most people, in the Western world at least, are familiar with the expression ‘quality of life’ and have an intuitive understanding of what it comprises.

However, it is clear that ‘QoL’ means different things to different people, and takes on different meanings according to the area of application. To a town planner, for example, it might represent access to green space and other facilities. In the context of clinical trials we are rarely interested in QoL in such a broad sense, and instead are concerned only with evaluating those aspects that are affected by disease or treatment for disease. This may sometimes be extended to include indirect consequences of disease, such as unemployment or financial difficulties. To distinguish between QoL in its more general sense and the requirements of clinical medicine and clinical trials the term health-related quality of life (HRQoL) is frequently used in order to remove ambiguity.

Health-related QoL is still a loose definition. What aspects of QoL should be included? It is generally agreed that the relevant aspects may vary from study to study but can include general health, physical functioning, physical symptoms and toxicity, emotional functioning, cognitive functioning, role functioning, social well-being and functioning, sexual functioning and existential issues. In the absence of any agreed formal definition of QoL, most investigators circumvent the issues by describing what they mean by QoL, and then letting the items (questions) in their questionnaire speak for themselves. Thus some questionnaires focus upon the relatively objective signs such as patient-reported toxicity, and in effect define the relevant aspects of QoL as being, for their purposes, limited to treatment toxicity. Other investigators argue that what matters most is the impact of toxicity, and therefore their questionnaires place greater emphasis upon psychological aspects, such as anxiety and depression. Yet others try to allow for spiritual issues, ability to cope with illness and satisfaction with life.

Some QoL instruments focus upon a single concept, such as emotional functioning. Other instruments regard these individual concepts as aspects, or dimensions, of QoL, and therefore include items relating to several concepts. Although there is disagreement about what components should be evaluated, most investigators agree that a number of the above dimensions should be included in QoL questionnaires, and that QoL is a multidimensional construct. Because there are so many potential dimensions, it is impractical to try to assess all these concepts simultaneously in one instrument. Most instruments intended for health-status assessment include at least some items that focus upon physical, emotional and social functioning. For example, if emotional functioning is accepted as being one aspect of QoL that should be investigated, several questions could evaluate anxiety, tension, irritability, depression and so on. Thus instruments may contain many items. Although a single global question such as ‘How would you rate your overall quality of life?’ is a useful adjunct to multi-item instruments, global questions are often regarded as too vague and non-specific to be used on their own. Most of the general questionnaires that we describe include one or more global questions alongside a number of other items covering specific issues. Some instruments place greater emphasis upon the concept of global questions, and the EQ-5D questionnaire (Appendix E4) asks a parsimonious five questions before using a single global question that enquires about ‘your health’. Even more extreme is the Perceived Adjustment to Chronic Illness Scale (PACIS) described by Hürny et al. (1993). This instrument consists of a single, carefully phrased question that is a global indicator of coping and adjustment: ‘How much effort does it cost you to cope with your illness?’ This takes responses ranging between ‘No effort at all’ and ‘A great deal of effort’.

One unifying and non-controversial theme throughout all the approaches is that the concepts forming these dimensions can be assessed only by subjective measures, PROs, and that they should be evaluated by asking the patient. Proxy assessments, by a relative or other close observer, are usually employed only if the patient is unable to make a coherent response, for example those who are very young, very old, severely ill or have mental impairment. Furthermore, many of these individual concepts – such as emotional functioning and fatigue – lack a formal, agreed definition that is universally understood by patients. In many cases the problem is compounded by language differences, and some concepts do not readily translate to other tongues. There are also cultural differences regarding the importance of the issues. Single-item questions on these aspects of QoL, as for global questions about overall QoL, are likely to be ambiguous and unreliable. Therefore it is usual to develop questionnaires that consist of multi-item measurement scales for each concept.

1.4 Historical development

One of the earliest references that impinges upon a definition of QoL appears in the Nicomachean Ethics, in which Aristotle (384–322 BCE) notes: “Both the multitude and persons of refinement … conceive ‘the good life’ or ‘doing well’ to be the same thing as ‘being happy’. But what constitutes happiness is a matter of dispute … some say one thing and some another, indeed very often the same man says different things at different times: when he falls sick he thinks health is happiness, when he is poor, wealth.” The Greek εὐδαιμονία is commonly translated as ‘happiness’ although Rackham, the translator that we cite, noted that a more accurate rendering would embrace ‘well-being’, with Aristotle denoting by εὐδαιμονία both a state of feeling and a kind of activity. In modern parlance this is assuredly quality of life. Although the term ‘quality of life’ did not exist in the Greek language of 2000 years ago, Aristotle clearly appreciated that QoL means different things to different people. He also recognised that it varies according to a person’s current situation – an example of a phenomenon now termed response shift. QoL was rarely mentioned until the twentieth century, although one early commentator on the subject noted that happiness could be sacrificed for QoL: “Life at its noblest leaves mere happiness far behind; and indeed cannot endure it … Happiness is not the object of life: life has no object: it is an end in itself; and courage consists in the readiness to sacrifice happiness for an intenser quality of life” (Shaw, [1900] 1972). It would appear that by this time ‘quality of life’ had become a familiar term that did not require further explanation. Specific mention of QoL in relation to patients’ health came much later. The influential WHO 1948 definition of health cited above was one of the earliest statements recognising and stressing the importance of the three dimensions – physical, mental and social – in the context of disease.

Other definitions have been even more general: “Quality of Life: Encompasses the entire range of human experience, states, perceptions, and spheres of thought concerning the life of an individual or a community. Both objective and subjective, quality-of-life can include cultural, physical, psychological, interpersonal, spiritual, financial, political, temporal, and philosophical dimensions. Quality-of-life implies judgement of value placed on experience of communities, groups such as families, or individuals” (Patrick and Erickson, 1993).

One of the first instruments that broadened the assessment of patients beyond physiological and clinical examination was the Karnofsky Performance Scale proposed in 1947 (Karnofsky and Burchenal, 1947) for use in clinical settings. This is a simple scale ranging from 0 for ‘dead’ to 100 indicating ‘normal, no complaints, no evidence of disease’. Healthcare staff make the assessment. Over the years, it has led to a number of other scales for functional ability, physical functioning and activities of daily living (ADL), such as the Barthel Index. Although these questionnaires are still sometimes described as QoL instruments, they capture only one aspect of it and provide an inadequate representation of patients’ overall well-being and QoL.

The next generation of questionnaires, in the late 1970s and early 1980s, that quantified health status were used for the general evaluation of health. These instruments focused on physical functioning, physical and psychological symptoms, impact of illness, perceived distress and life satisfaction. Examples of such instruments include the Sickness Impact Profile (SIP) and the Nottingham Health Profile (NHP). Although these instruments are frequently described as QoL questionnaires, their authors neither designed them nor claimed them as QoL instruments.

Meanwhile, Priestman and Baum (1976) were adapting linear analogue self-assessment (LASA) methods to assess QoL in breast cancer patients. The LASA approach, which is also sometimes called a visual analogue scale (VAS), provides a 10 cm line, with the ends labelled with words describing the extremes of a condition. The patient is asked to mark the point along the line that corresponds with their feelings. An example of a LASA scale is contained in the EQ-5D (Appendix E4). Priestman and Baum (1976) measured a variety of subjective effects, including well-being, mood, anxiety, activity, pain, social activities and the patient’s opinion as to ‘Is the treatment helping?’ Others took the view that one need only ask a single question to evaluate the QoL of patients with cancer: “How would you rate your QoL today?” (Gough et al.