The assessment of patient-reported outcomes and health-related quality of life continues to be a rapidly evolving area of research, and this new edition reflects the development of the field from an emerging subject to one that is an essential part of the assessment of clinical trials and other clinical studies.
The analysis and interpretation of quality-of-life assessments rely on a variety of psychometric and statistical methods, which are explained in this book in a non-technical way. The result is a practical guide that covers a wide range of methods and emphasizes the use of simple techniques that are illustrated with numerous examples, with extensive chapters covering qualitative and quantitative methods and the impact of guidelines. The material in this new third edition reflects current teaching methods, with content widened to address continuing developments in item response theory, computer adaptive testing, analyses with missing data, analysis of ordinal data, systematic reviews and meta-analysis.
This book is aimed at everyone involved in quality-of-life research and is applicable to medical and non-medical, statistical and non-statistical readers. It is of particular relevance for clinical and biomedical researchers within both the pharmaceutical industry and clinical practice.
Page count: 1183
Publication year: 2015
To Tessa and Emma Fayers and Christine Machin
Third Edition
PETER M. FAYERS
Institute of Applied Health Sciences, University of Aberdeen School of Medicine and Dentistry, Scotland, UK and Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
and
DAVID MACHIN
Medical Statistics Group, School of Health and Related Research, University of Sheffield, Sheffield, UK and Department of Cancer Studies and Molecular Medicine, University of Leicester, Leicester, UK
This edition first published 2016 © 2016 by John Wiley & Sons, Ltd
Second edition published 2007 © 2007 by John Wiley & Sons, Ltd
Registered office: John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial offices: 9600 Garsington Road, Oxford, OX4 2DQ, UK; The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK; 111 River Street, Hoboken, NJ 07030-5774, USA
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell
The right of the author to be identified as the author of this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting a specific method, diagnosis, or treatment by health science practitioners for any particular patient. The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. Readers should consult with a specialist where appropriate. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware that Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising herefrom.
A catalogue record for this book is available from the Library of Congress and British Library.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Cover image: © istockphoto/bopp63
Preface to the third edition
Preface to the second edition
Preface to the first edition
Acknowledgements
List of abbreviations
Part 1: Developing and Validating Instruments for Assessing Quality of Life and Patient-Reported Outcomes
1: Introduction
1.1 Patient-reported outcomes
1.2 What is a patient-reported outcome?
1.3 What is quality of life?
1.4 Historical development
1.5 Why measure quality of life?
1.6 Which clinical trials should assess QoL?
1.7 How to measure quality of life
1.8 Instruments
1.9 Computer-adaptive instruments
1.10 Conclusions
2: Principles of measurement scales
2.1 Introduction
2.2 Scales and items
2.3 Constructs and latent variables
2.4 Single global questions versus multi-item scales
2.5 Single-item versus multi-item scales
2.6 Effect indicators and causal indicators
2.7 Psychometrics, factor analysis and item response theory
2.8 Psychometric versus clinimetric scales
2.9 Sufficient causes, necessary causes and scoring items
2.10 Discriminative, evaluative and predictive instruments
2.11 Measuring quality of life: reflective, causal and composite indicators?
2.12 Further reading
2.13 Conclusions
3: Developing a questionnaire
3.1 Introduction
3.2 General issues
3.3 Defining the target population
3.4 Phases of development
3.5 Phase 1: Generation of issues
3.6 Qualitative methods
3.7 Sample sizes
3.8 Phase 2: Developing items
3.9 Multi-item scales
3.10 Wording of questions
3.11 Face and content validity of the proposed questionnaire
3.12 Phase 3: Pre-testing the questionnaire
3.13 Cognitive interviewing
3.14 Translation
3.15 Phase 4: Field-testing
3.16 Conclusions
3.17 Further reading
4: Scores and measurements: validity, reliability, sensitivity
4.1 Introduction
4.2 Content validity
4.3 Criterion validity
4.4 Construct validity
4.5 Repeated assessments and change over time
4.6 Reliability
4.7 Sensitivity and responsiveness
4.8 Conclusions
4.9 Further reading
5: Multi-item scales
5.1 Introduction
5.2 Significance tests
5.3 Correlations
5.4 Construct validity
5.5 Cronbach’s α and internal consistency
5.6 Validation or alteration?
5.7 Implications for formative or causal items
5.8 Conclusions
6: Factor analysis and structural equation modelling
6.1 Introduction
6.2 Correlation patterns
6.3 Path diagrams
6.4 Factor analysis
6.5 Factor analysis of the HADS questionnaire
6.6 Uses of factor analysis
6.7 Applying factor analysis: Choices and decisions
6.8 Assumptions for factor analysis
6.9 Factor analysis in QoL research
6.10 Limitations of correlation-based analysis
6.11 Formative or causal models
6.12 Confirmatory factor analysis and structural equation modelling
6.13 Chi-square goodness-of-fit test
6.14 Approximate goodness-of-fit indices
6.15 Comparative fit of models
6.16 Difficulty-factors
6.17 Bifactor analysis
6.18 Do formative or causal relationships matter?
6.19 Conclusions
6.20 Further reading, and software
7: Item response theory and differential item functioning
7.1 Introduction
7.2 Item characteristic curves
7.3 Logistic models
7.4 Polytomous item response theory models
7.5 Applying logistic IRT models
7.6 Assumptions of IRT models
7.7 Fitting item response theory models: Tips
7.8 Test design and validation
7.9 IRT versus traditional and Guttman scales
7.10 Differential item functioning
7.11 Sample size for DIF analyses
7.12 Quantifying differential item functioning
7.13 Exploring differential item functioning: Tips
7.14 Conclusions
7.15 Further reading, and software
8: Item banks, item linking and computer-adaptive tests
8.1 Introduction
8.2 Item bank
8.3 Item evaluation, reduction and calibration
8.4 Item linking and test equating
8.5 Test information
8.6 Computer-adaptive testing
8.7 Stopping rules and simulations
8.8 Computer-adaptive testing software
8.9 CATs for PROs
8.10 Computer-assisted tests
8.11 Short-form tests
8.12 Conclusions
8.13 Further reading
Part 2: Assessing, Analysing and Reporting Patient-Reported Outcomes and the Quality of Life of Patients
9: Choosing and scoring questionnaires
9.1 Introduction
9.2 Finding instruments
9.3 Generic versus specific
9.4 Content and presentation
9.5 Choice of instrument
9.6 Scoring multi-item scales
9.7 Conclusions
9.8 Further reading
10: Clinical trials
10.1 Introduction
10.2 Basic design issues
10.3 Compliance
10.4 Administering a quality-of-life assessment
10.5 Recommendations for writing protocols
10.6 Standard operating procedures
10.7 Summary and checklist
10.8 Further reading
11: Sample sizes
11.1 Introduction
11.2 Significance tests, p-values and power
11.3 Estimating sample size
11.4 Comparing two groups
11.5 Comparison with a reference population
11.6 Non-inferiority studies
11.7 Choice of sample size method
11.8 Non-Normal distributions
11.9 Multiple testing
11.10 Specifying the target difference
11.11 Sample size estimation is pre-study
11.12 Attrition
11.13 Circumspection
11.14 Conclusion
11.15 Further reading
12: Cross-sectional analysis
12.1 Types of data
12.2 Comparing two groups
12.3 Adjusting for covariates
12.4 Changes from baseline
12.5 Analysis of variance
12.6 Analysis of variance models
12.7 Graphical summaries
12.8 Endpoints
12.9 Conclusions
13: Exploring longitudinal data
13.1 Area under the curve
13.2 Graphical presentations
13.4 Reporting
13.5 Conclusions
14: Modelling longitudinal data
14.1 Preliminaries
14.2 Auto-correlation
14.3 Repeated measures
14.4 Other situations
14.6 Conclusions
15: Missing data
15.1 Introduction
15.2 Why do missing data matter?
15.3 Types of missing data
15.4 Missing items
15.5 Methods for missing items within a form
15.6 Missing forms
15.7 Methods for missing forms
15.8 Simple methods for missing forms
15.9 Methods of imputation that incorporate variability
15.10 Multiple imputation
15.11 Pattern mixture models
15.12 Comments
15.13 Degrees of freedom
15.14 Sensitivity analysis
15.15 Conclusions
15.16 Further reading
16: Practical and reporting issues
16.1 Introduction
16.2 The reporting of design issues
16.3 Data analysis
16.4 Elements of good graphics
16.5 Some errors
16.6 Guidelines for reporting
16.7 Further reading
17: Death, and quality-adjusted survival
17.1 Introduction
17.2 Attrition due to death
17.3 Preferences and utilities
17.4 Multi-attribute utility (MAU) measures
17.5 Utility-based instruments
17.6 Quality-adjusted life years (QALYs)
17.7 Utilities for traditional instruments
17.8 Q-TWiST
17.9 Sensitivity analysis
17.10 Prognosis and variation with time
17.11 Alternatives to QALY
17.12 Conclusions
17.13 Further reading
18: Clinical interpretation
18.1 Introduction
18.2 Statistical significance
18.3 Absolute levels and changes over time
18.4 Threshold values: percentages
18.5 Population norms
18.6 Minimal important difference
18.7 Anchoring against other measurements
18.8 Minimum detectable change
18.9 Expert judgement for evidence-based guidelines
18.10 Impact of the state of quality of life
18.11 Changes in relation to life events
18.12 Effect size statistics
18.13 Patient variability
18.14 Number needed to treat
18.15 Conclusions
18.16 Further reading
19: Biased reporting and response shift
19.1 Bias
19.2 Recall bias
19.3 Selective reporting bias
19.4 Other biases affecting PROs
19.5 Response shift
19.6 Assessing response shift
19.7 Impact of response shift
19.8 Clinical trials
19.9 Non-randomised studies
19.10 Conclusions
20: Meta-analysis
20.1 Introduction
20.2 Defining objectives
20.3 Defining outcomes
20.4 Literature searching
20.5 Assessing quality
20.6 Summarising results
20.7 Measures of treatment effect
20.8 Combining studies
20.9 Forest plot
20.10 Heterogeneity
20.11 Publication bias and funnel plots
20.12 Conclusions
20.13 Further reading
Appendix 1: Examples of Instruments
Appendix 2: Statistical tables
Table T1: Normal distribution
Table T2: Probability points of the Normal distribution
Table T3: Student’s t-distribution
Table T4: The χ² distribution
Table T5: The F-distribution
References
Index
EULA
Chapter 2
Figure 2.1 Scales, indexes, profiles and batteries.
Chapter 3
Figure 3.1 The Linear Analogue Self Assessment scale (LASA) is an example of a visual analogue scale.
Figure 3.2 Physical functioning scale of the EORTC QLQ-C30 versions 1.0 and 2.0.
Chapter 4
Figure 4.1 Sample size for two-observation ICCs, such as test-retest studies. The plot shows the effect of sample size on the average distance to the lower limit of an (asymmetric) two-sided 95% confidence interval. For example, with 80 patients on average the 95% CI for an ICC of 0.8 will have a lower limit of slightly above 0.7 (distance slightly less than 0.1).
Chapter 5
Figure 5.1 Clinimetric scales and scales with causal items.
Chapter 6
Figure 6.1 Comparison of approaches to analysis.
Figure 6.2 Postulated structure of the HADS questionnaire.
Figure 6.3 Plot of Factor 2 against Factor 1, using the rotated factors from Table 6.5
Figure 6.4 Scree plot of the eigenvalues in Table 6.3
Figure 6.5 Histograms of the 14 HADS items, using the dataset from Table 6.1
Figure 6.6 Conventional EFA model for the RSCL, showing factors for general psychological distress, pain, nausea/vomiting and symptoms/side-effects. Only 17 out of the 30 items on the main RSCL are shown.
Figure 6.7 Postulated causal structure for 17 items on the RSCL. Treatment- or disease-related symptoms and side-effects may be causal rather than effect indicators.
Figure 6.8 A PhysicalHealth/MentalHealth structural model for the EORTC QLQ-C30. Source: Gundy et al., 2012, Figure 1(b). CC BY-NC 2.0 (http://creativecommons.org/licenses/by-nc/2.0/uk/). Reproduced commercially with permission of Springer Science+Business Media.
Figure 6.9 Plot of Factor 2 against Factor 1, using the rotated factors from EFA of the mobility items. Source: Helbostad et al., 2011, Figure 1. Reproduced with permission of Springer Science+Business Media.
Chapter 7
Figure 7.1 Item characteristic curves (ICCs) for two items of differing difficulty.
Figure 7.2 Item characteristic curves for two items of differing discrimination and difficulty.
Figure 7.3 Item characteristic curves for four items, each with four categorical response options. The intersections between adjacent categories correspond to threshold parameters of the Generalized Partial Credit Model. Items A and C exhibit good properties, having steep slopes and covering different trait levels; item B covers a similar range of levels to C, but has weaker discrimination illustrated by less steep slopes; Item D is a weak item with disordered thresholds.
Figure 7.4 Item information curves corresponding to the four items of Figure 7.3. Items A and C are markedly superior to items B and D.
Figure 7.5 Standard error of MMSE score estimates in palliative care patients, showing that a 6-item test has similar properties to the full MMSE. Source: Adapted from Fayers et al., 2005, Figures 2 and 4. Reproduced with permission of Elsevier.
Figure 7.6 Tips for fitting IRT models.
Figure 7.7 DIF analysis of EORTC QLQ-C30 2-item pain scale, by geographical region.
Figure 7.8 Tips for exploring DIF.
Chapter 8
Figure 8.1 Chart representing the computer algorithm for a computer adaptive test (CAT).
Figure 8.2 Item banking.
Figure 8.3 Stages in the development of an item bank, for use in a computer adaptive test (CAT).
Figure 8.4 Information function and standard error of measurement for the HIT item pool compared with the population distribution of headache impact. Source: Bjorner et al., 2003, Figure 3. Reproduced with permission from Springer Science and Business Media.
Figure 8.5 Advantages and disadvantages of CAT.
Figure 8.6 Relation between HIT scores based on the full 54-item pool and the CAT based on 6, 10, 13 or 20 items. Source: Ware et al., 2003, Figure 2. Reproduced with permission of Springer Science and Business Media.
Figure 8.7 Issues to be considered for CAT software.
Chapter 9
Figure 9.1 The SF-36 health status of 318 adults with upper respiratory tract infections (URIs). T-scores were calculated using the general USA population. The URI patients were also contrasted against patients with lung disease, osteoarthritis and depression. Note: *p < 0.001, †p < 0.05 for comparisons with URI. Source: Reproduced with kind permission of Springer Science and Business Media. Linder JA and Singer DE (2003) Health-related quality of life of adults with upper respiratory tract infections. Journal of General Internal Medicine 18: 802–807.
Chapter 10
Figure 10.1 Questionnaire to ascertain the reason why a patient has not completed the current QoL assessment.
Figure 10.2 Patient QoL Information Leaflet.
Figure 10.3 Checklist for writing clinical trials protocols.
Chapter 11
Figure 11.1 The total sample size required to detect an effect of a specified size, using a two-sample t-test to compare two unpaired means with α of 0.05 and powers of 80% and 90%.
Figure 11.2 Sample size multiplication factors to compensate for multiple comparisons when applying a Bonferroni correction.
Chapter 12
Figure 12.1 Cumulative distribution function of responses for Aricept® 5 and 10 mg doses compared to placebo. Important change thresholds considered for ADAS-cog score decreases over 24 weeks are 7, 4 and 0 points. Source: Reproduced from ARICEPT® Oral Solution (Donepezil Hydrochloride) [approval label, Figure 2]. Available at: http://www.accessdata.fda.gov/drugsatfda_docs/label/2004/21719lbl.pdf.
Figure 12.2 Histogram of baseline emotional functioning (EF) in patients with multiple myeloma. Source: Data from Wisløff et al., 1996.
Figure 12.3 Bar chart illustrating the baseline EF scores from multiple myeloma patients, by gender and age group. Source: Data from Wisløff et al., 1996.
Figure 12.4 Scatter plot of the baseline EF in patients with multiple myeloma at different ages. Source: Data from Wisløff et al., 1996.
Figure 12.5 Box-and-whisker plot of the baseline EF in patients with multiple myeloma, by gender and age group. Source: Data from Wisløff et al., 1996.
Figure 12.6 Profile of function and symptom values at one month after commencing treatment with α-interferon (bold line) or without α-interferon (thin line), in patients with myeloma. Source: Data from Wisløff et al., 1996.
Chapter 13
Figure 13.1 Pain profiles of two patients with severe burns. Source: Data from Ang et al., 2003.
Figure 13.2 A trial comparing two radiotherapy regimens for non-small-cell lung cancer, which used a daily diary card to assess QoL. Source: Bleehen et al., 1992, Table 3. Reprinted by permission from Macmillan Publishers Ltd on behalf of Cancer Research UK.
Figure 13.3 Scatter plot showing the timing of FACT QoL assessments in a group of anaemic cancer patients. Each symbol represents a QoL assessment for a patient at the specified time in the study. The last assessment for a patient who withdrew early is represented by a star, the last assessment for a completer is a closed circle, and a continuing assessment is shown by an open circle. Source: Fallowfield et al., 2002, Figure 2. Reprinted with permission of Macmillan Publishers Ltd on behalf of Cancer Research UK.
Figure 13.4 Scatter plot of the HADS depression score at each assessment against day of assessment for patients with small-cell lung cancer. Source: Machin and Weeden 1998. Reproduced with permission of John Wiley & Sons, Ltd.
Figure 13.5 Box-and-whisker plots of HADS depression scores for the 10 scheduled assessments of patients with small-cell lung cancer recruited to the MRC Lung Cancer Working Party trial. Source: Machin and Weeden 1998. Reproduced with permission of John Wiley & Sons, Ltd.
Figure 13.6 Profiles of HADS anxiety score for patients with small-cell lung cancer (a) according to number of assessments completed, (b) by all patients and (c) by treatment group. Source: Hopwood et al., 1994, Figures 2, 3 and 4. Reproduced with permission of Springer Science and Business Media.
Figure 13.7 Mean HADS depression score in patients with small-cell lung cancer, reverse plotted from date of last assessment, subdivided by the number of available assessments. Source: Machin and Weeden, 1998. Reproduced with permission of John Wiley & Sons, Ltd.
Figure 13.8 Kaplan–Meier estimates of the survival curves for patients with myeloma, by treatment received. The apparent survival advantage to MP+IFN (relative risk, RR, = 1.08) is not statistically significant. Source: Data from Wisløff et al., 1996.
Figure 13.9 Histograms of AUC₃₆ for patients with myeloma, by treatment received. A Normal distribution is superimposed. Source: Data from Wisløff et al., 1996.
Chapter 14
Figure 14.1 Scatter plot of emotional functioning (EF) for multiple myeloma patients, prior to treatment and one month after starting treatment. Source: Data from Wisløff et al., 1996.
Figure 14.2 Pairwise scatter diagrams for EF of multiple myeloma patients immediately prior to and during the first 36 months of therapy. Source: Data from Wisløff et al., 1996.
Figure 14.3 Pain score changes for a patient receiving arnica following total abdominal hysterectomy. Source: Adapted from Hart et al., 1997. Reproduced with permission of Sage Publications, Ltd.
Figure 14.4 Mean levels of fatigue in patients with multiple myeloma, before and during treatment with MP or MP+IFN. Source: Data from Wisløff et al., 1996.
Figure 14.5 Brachytherapy or metal stent for palliation of dysphagia from oesophageal cancer. Source: Homs et al., 2004. Reproduced with permission of Elsevier.
Chapter 15
Figure 15.1 Mean physical functioning (PF) score stratified by time of dropout due to death or non-compliance in patients with metastatic breast cancer. Source: Curran et al., 1998a, Figure 14.2. Reproduced with permission of Oxford University Press.
Figure 15.2 Kaplan–Meier estimate of the time to progression for patients with metastatic breast cancer. Source: Curran et al., 1998a, Figure 14.1. Reproduced with permission of Oxford University Press.
Figure 15.3 The emotional functioning (EF) scale of the EORTC QLQ-C30.
Figure 15.4 Illustration of simple imputation methods: imputing an item for a single patient.
Figure 15.5 Pattern mixture model with multiple imputation.
Figure 15.6 The course of PF over time by treatment, estimated using pattern mixture models. Source: Post et al., 2010, Figure 5. CC NC 4.0 (http://creativecommons.org/licenses/by/4.0/). Reproduced commercially with permission of Springer Science+Business Media.
Chapter 16
Figure 16.1 Histogram of baseline emotional functioning (EF) in patients with multiple myeloma. Source: Data from Wisløff et al., 1996.
Figure 16.2 Mean levels of fatigue in patients with multiple myeloma, before and during treatment with MP or MP+IFN: (a) bar chart; (b) line plot. Source: Data from Wisløff et al., 1996.
Figure 16.3 Baseline EF by age, in patients with multiple myeloma: (a) scatter plot; (b) scatter plot with ‘jitter’ and marginal box plots. In the box plots, the box shows the 25% and 75% quartiles and the median, and the ‘whiskers’ indicate the so-called ‘adjacent values’. Source: Data from Wisløff et al., 1996.
Figure 16.4 Change in EF from baseline, plotted against the baseline EF, in patients with multiple myeloma: (a) plotting (EF₁ – EF₀) against EF₀; (b) substituting random numbers in place of EF₀, the initial measurement. Source: Data from Wisløff et al., 1996.
Chapter 17
Figure 17.1 Example of a choice in the DCE. Source: Ryan et al., 2006. Reproduced with permission of Elsevier.
Figure 17.2 Partitioned survival curves for the two treatment arms of the metastatic colorectal cancer trial. (a) panitumumab+BSC and (b) BSC alone, where BSC=best supportive care. REL is the relapse period until death or end of follow up; TOX represents days with grade 3 or worse adverse events; TWiST is the time without symptoms or toxicity. Source: Wang et al., 2011, Figure 1. Reprinted with permission of Macmillan Publishers Ltd on behalf of Cancer Research UK.
Figure 17.3 Threshold utility analysis, showing differences in Q-TWiST (number of weeks) for varying toxicity and relapse utility levels. Positive numbers indicate a benefit in favour of the panitumumab arm. Source: Wang et al., 2011, Figure 3. Reprinted with permission of Macmillan Publishers Ltd on behalf of Cancer Research UK.
Figure 17.4 Change in Q-TWiST gain (months) over time for women with breast cancer receiving Short or Long duration chemotherapy; ‘gain’ indicates the advantage of Long over Short. Source: Gelber et al., 1995, Figure 4.
Chapter 18
Figure 18.1 Age-distribution of the mean scores for the subscales of the EORTC QLQ-C30 (version 2.0), for males (0) and females (Δ) from a sample of the general Norwegian population. Source: Adapted from Hjermstad et al., 1998a. Reproduced with permission from the American Society of Clinical Oncology.
Figure 18.2 EORTC QLQ-C30 scale scores for 439 patients with rectal cancer, divided into non-advanced disease (NAD), locally advanced rectal cancer (LARC) and locally recurrent rectal cancer (LRRC). The bold line shows age- and gender-matched reference values in a sample of the general population in the Netherlands. The reference group had highest QoL, best functioning and least symptoms (based on the data shown in Table 18.3).
Figure 18.3 Bar chart of the mean EORTC QLQ-C30 scale scores shown in Figure 18.2, for 439 patients with rectal cancer, divided into non-advanced disease (NAD), locally advanced rectal cancer (LARC) and locally recurrent rectal cancer (LRRC) (based on the data shown in Table 18.3).
Figure 18.4 Data of Figure 18.2, showing mean differences of the patient scores from the age- and gender-matched reference values of the general population for 439 patients with rectal cancer, divided into non-advanced disease (NAD), locally advanced rectal cancer (LARC) and locally recurrent rectal cancer (LRRC). See footnote to Table 18.3 for scale names.
Figure 18.5 Each graph shows the mean score change and 95% CI for 239 patients with multiple myeloma who stated that they had become much better, moderately better, a little better, unchanged (0), a little worse, moderately worse, or much worse from time 1 (baseline) to time 2 (three months). High scores indicate more symptoms (pain and fatigue) or better functioning (physical function and global HRQL). Source: Kvam et al., 2010. Reproduced with permission of John Wiley & Sons, Ltd.
Figure 18.6 Receiver-operating characteristic (ROC) curve of the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 change score in patients who stated that their pain had deteriorated. AUC, area under the curve. Source: Kvam et al., 2010. Reproduced with permission of John Wiley & Sons, Ltd.
Figure 18.7 Data of Figure 18.4, showing effect sizes instead of absolute mean differences. For 439 patients with rectal cancer, divided into non-advanced disease (NAD), locally advanced rectal cancer (LARC) and locally recurrent rectal cancer (LRRC), compared to age- and gender-matched reference values of the general population. See footnote to Table 18.3 for scale names.
Chapter 19
Figure 19.1 Sprangers and Schwartz (1999) theoretical model of response shift and quality of life. Source: Sprangers and Schwartz, 1999, Figure 1. Reproduced with permission of Elsevier.
Figure 19.2 Ratings of QoL varied according to frame of reference. Source: Fayers et al., 2007. Reproduced with permission of Elsevier.
Figure 19.3 Meta-analysis of response shift affecting assessment of global QoL (see Chapter 20 for explanation of the ‘Forest plot’). Note: See Schwartz et al., 2006 for detailed references of studies cited. Source: Schwartz et al., 2006, Figure 3. Reproduced with permission of Springer Science and Business Media.
Chapter 20
Figure 20.1 Forest plot of St John’s Wort for depression. The solid squares denote individual mean effects and the horizontal lines represent 95% CIs. The diamonds denote pooled weighted mean differences. Note: See Linde et al., 2005 for detailed references of studies cited. Source: Linde et al., 2005, comparison 02, outcome 04. Reproduced with permission of John Wiley & Sons, Ltd.
Figure 20.2 Forest plot comparing placebo control against extracts from the Zingiberaceae family of plants for treatment of chronic pain. Standardised mean differences (SMDs) are shown for each study, sorted by date of publication. (Based on Lakhan et al., 2015).
Figure 20.3 Forest plot of pain severity following open mesh or non-mesh groin hernia repairs. Source: Scott et al., 2001. Reproduced with permission of John Wiley & Sons, Ltd.
Figure 20.4 Funnel plot of 23 placebo-controlled trials that reported information on the number of responders according to the Hamilton Rating Scale for Depression. Source: Linde et al., 2005, Figure 1. Reproduced with permission of John Wiley & Sons, Ltd.
Preface to the third edition
When the first edition of this book was published in 2000, the assessment of quality of life (QoL) as an important outcome in clinical trials and other research studies was, at best, controversial. More traditional endpoints were the norm – measures such as disease status, cure and patient’s survival time dominated in research publications. How times have changed. Nowadays it is generally accepted that the patients’ perspective is paramount, patient representatives are commonly involved in the design of clinical trials, and patient-reported outcomes (PROs) have become recognised as standard outcomes that should be assessed and reported in a substantial proportion of trials, either as secondary outcomes or, in many instances, as a primary outcome from the study. Indeed, in 2000 the term ‘patient-reported outcome’ hardly existed and the focus at that time was on the ill-defined but all embracing concept of ‘quality of life’. Now, we regard QoL as but one PRO, with the latter encompassing anything reported by ‘asking the patient’ – symptoms such as pain or depression, physical or other functioning, mobility, activities of daily living, satisfaction with treatment or other aspects of management, and so on. Drug regulatory bodies have also embraced PROs and QoL as endpoints, while at the same time demanding higher standards of questionnaire development and validation.
In parallel with this, research into instrument development, validation and application continues to grow apace. There is increasing recognition of the importance of qualitative methods to secure a solid foundation when developing new instruments, and a corresponding rigour in applying and reporting qualitative research. At the same time, there has been a radical shift towards using item response theory both as a tool for developing and validating new instruments and as the basis of computer-adaptive tests (CATs). Many of the major research groups have been developing new CAT instruments for assessing PROs, and this new generation of questionnaires is becoming widely available for use on computer tablets and smartphones.
Analysis, too, has benefited in various ways from the increased importance being attached to PROs – two examples being (i) methods for handling missing data, and in particular for reducing the biases that can arise when data are missing, and (ii) the greater rigour demanded for the reporting of PROs.
As a consequence of these and many other developments, we have taken the opportunity to update many chapters. The examples, too, have been refreshed and largely brought up-to-date, although some of the classic citations still stand proud and have been retained. A less convenient aspect of the changes is, perhaps, the resultant increase in page-count.
We continue to be grateful to our many colleagues – their continued encouragement and enthusiasm have fuelled the energy to produce this latest edition; Mogens Groenvold in particular contributed to the improvement of Chapter 3.
Peter M. Fayers and David Machin September 2015
Preface to the second edition
We have been gratified by the reception of the first edition of this book, and this new edition offers the opportunity to respond to the many suggestions we have received for further improving and clarifying certain sections. In most cases the changes have meant expanding the text, to reflect new developments in research.
Chapters have been reorganised, to follow a more logical sequence for teaching. Thus sample size estimation has been moved to Part C, Clinical Trials, because it is needed for trial design. In the first edition it followed the chapters about analysis where we discussed choice of statistical tests, because the sample size computation depends on the test that will be used.
Health-related quality of life is a rapidly evolving field of research, and this is illustrated by shifting names and identity: quality of life (QoL) outcomes are now also commonly called patient- (or person-) reported outcomes (PROs), to reflect more clearly that symptoms and side effects of treatment are included in the assessments; we have adopted that term as part of the subtitle. Drug regulatory bodies have also endorsed this terminology, with the USA Food and Drug Administration (US FDA) bringing out guidance notes concerning the use of PROs in clinical trials for new drug applications; this new edition reflects the FDA (draft) recommendations.
Since the first edition of this book there have been extensive developments in item response theory and, in particular, computer-adaptive testing; these are addressed in a new chapter. Another area of growth has been in systematic reviews and meta-analysis, as evinced by the formation of a Quality of Life Methods Group by the Cochrane Collaboration. QoL presents some particular challenges for meta-analysis, and this led us to include the final chapter.
We are very grateful to the numerous colleagues who reported finding this book useful, some of whom also offered constructive advice for this second edition.
Peter M. Fayers and David Machin June 2006
Preface to the first edition
Measurement of quality of life has grown to become a standard endpoint in many randomised controlled trials and other clinical studies. In part, this is a consequence of the realisation that many treatments for chronic diseases frequently fail to cure, and that there may be limited benefits gained at the expense of taking toxic or unpleasant therapy. Sometimes therapeutic benefits may be outweighed by quality of life considerations. In studies of palliative therapy, quality of life may become the principal or only endpoint of consideration. In part, it is also recognition that patients should have a say in the choice of their therapy, and that patients place greater emphasis upon non-clinical aspects of treatment than healthcare professionals did in the past. Nowadays, many patients and patient-support groups demand that they should be given full information about the consequences of their disease and its therapy, including impact upon aspects of quality of life, and that they should be allowed to express their opinions. The term quality of life has become a catch-phrase, and patients, investigators, funding bodies and ethical review committees often insist that, where appropriate, quality of life should be assessed as an endpoint for clinical trials.
The assessment, analysis and interpretation of quality of life rely upon a variety of psychometric and statistical methods, many of which may be less familiar than the other techniques used in medical research. Our objective is to explain these techniques in a non-technical way. We have assumed some familiarity with basic statistical ideas, but we have avoided detailed statistical theory. Instead, we have tried to write a practical guide that covers a wide range of methods. We emphasise the use of simple techniques in a variety of situations by using numerous examples, taken both from the literature and from our own experience. A number of these inevitably arise from our own particular field of interest – cancer clinical trials. This is also perhaps justifiable in that much of the pioneering work on quality of life assessment occurred in cancer, and cancer still remains the disease area that is associated with the largest number of quality of life instruments and the most publications. However, the issues that arise are common to quality of life assessment in general.
We would like to say a general thank you to all those with whom we have worked on aspects of quality of life over the years; especially, past and present members of the EORTC Quality of Life Study Group, and colleagues from the former MRC Cancer Therapy Committee Working Parties. Particular thanks go to Stein Kaasa of the Norwegian University of Science and Technology at Trondheim who permitted PMF to work on this book whilst on sabbatical and whose ideas greatly influenced our thinking about quality of life, and to Kristin Bjordal of The Radium Hospital, Oslo, who made extensive input and comments on many chapters and provided quality of life data that we used in examples. Finn Wisløff, for the Nordic Myeloma Study Group, very kindly allowed us to make extensive use of their QoL data for many examples. We are grateful to the National Medical Research Council of Singapore for providing funds and facilities to enable us to complete this work. We also thank Dr Julian Thumboo, Tan Tock Seng Hospital, Singapore, for valuable comments on several chapters. Several chapters, and Chapter 7 in particular, were strongly influenced by manuals and guidelines published by the EORTC Quality of Life Study Group.
Peter M. Fayers and David Machin January 2000
Note: We have adopted the policy of using italics to indicate variables, or things that take values.
α: Alpha, Type I error
αCronbach: Cronbach’s reliability coefficient
β: Beta, Type II error (power = 1 − β)
κ: Kappa, Cohen’s measure of agreement
θ: Theta, an unobservable or “latent” variable
ρ: Rho, the correlation coefficient
σ: Sigma, the population standard deviation, estimated by SD
ADL: Activities of daily living
ANOVA: Analysis of variance
ARR: Absolute risk reduction
AUC: Area under the curve
CAT: Computer-adaptive test
CFA: Confirmatory factor analysis
CI: Confidence interval
CONSORT: Consolidated Standards of Reporting Trials (http://www.consort-statement.org/)
CPMP: Committee for Proprietary Medicinal Products (European regulatory body)
DCE: Discrete choice experiment
df: Degrees of freedom
DIF: Differential item functioning
EF: Emotional functioning
EFA: Exploratory factor analysis
ES: Effect size
F-statistic: The ratio of two variance estimates; also called F-ratio
F-test: The statistical test used in ANOVA, based on the F-statistic
GEE: Generalised estimating equation
HYE: Healthy-years equivalent
IADL: Instrumental activities of daily living
ICC: Intraclass correlation
ICC: Item characteristic curve
IRT: Item response theory
LASA: Linear analogue self-assessment (scale)
MANOVA: Multivariate analysis of variance
MAR: Missing at random
MCAR: Missing completely at random
MIMIC: Multiple indicator multiple cause (model)
ML: Maximum likelihood (estimation)
MNAR: Missing not at random
MTMM: Multitrait–multimethod
NNT: Number needed to treat
NS: Not statistically significant
OR: Odds ratio
p: Probability, as in p-value
PF: Physical functioning
PRO: Patient-reported outcome
QALY: Quality-adjusted life years
QoL: Quality of life
Q-TWiST: Quality-adjusted time without symptoms and toxicity
RCT: Randomised controlled trial
RE: Relative efficiency
RR: Relative risk
RV: Relative validity
SD: Standard deviation of a sample
SE: Standard error
SEM: Standard error of measurement
SEM: Structural equation model
SG: Standard gamble
SMD: Standardised mean difference
SRM: Standardised response mean
t: Student’s t-statistic
TTO: Time trade-off
TWiST: Time without symptoms and toxicity
VAS: Visual analogue scale
WMD: Weighted mean difference
WTP: Willingness to pay
QoL instruments
AIMS: Arthritis Impact Measurement Scale
AMPS: Assessment of Motor and Process Skills
AQLQ: Asthma Quality of Life Questionnaire
BDI: Beck Depression Inventory
BI: Barthel Index of disability
BPI: Brief Pain Inventory
BPRS: Brief Psychiatric Rating Scale
EORTC QLQ-C30: European Organisation for Research and Treatment of Cancer, Quality of Life Questionnaire, 30-items
EQ-5D: EuroQoL EQ-5D self-report questionnaire
FACT-G: Functional Assessment of Cancer Therapy–General Version
FAQ: Functional Activity Questionnaire
FLIC: Functional Living Index–Cancer
GPH: General Perceived Health
HADS: Hospital Anxiety and Depression Scale
HDQoL: Huntington’s Disease health-related Quality of Life questionnaire
HOPES: HIV Overview of Problems–Evaluation System
HRSD: Hamilton Rating Scale for Depression
HUI: Health Utilities Index
MFI-20: Multidimensional Fatigue Inventory 20
MMSE: Mini-Mental State Examination
MPQ: McGill Pain Questionnaire
NHP: Nottingham Health Profile
PACIS: Perceived Adjustment to Chronic Illness Scale
PAQLQ: Pediatric Asthma Quality of Life Questionnaire
PASS: Pain Anxiety Symptoms Scale
PCQLI: Pediatric Cardiac Quality of Life Inventory
PGI: Patient Generated Index
POMS: Profile of Mood States
QOLIE-89: Quality of Life in Epilepsy
RSCL: Rotterdam Symptom Checklist
SEIQoL: Schedule for Evaluation of Individual Quality of Life
SF-36: Short Form 36
SIP: Sickness Impact Profile
WPSI: Washington Psychosocial Seizure Inventory
Summary
A key methodology for the evaluation of therapies is the randomised controlled trial (RCT). These clinical trials traditionally considered relatively objective clinical outcome measures, such as cure, biological response to treatment, or survival. Later, investigators and patients alike have argued that subjective indicators should also be considered. These subjective patient-reported outcomes are often regarded as indicators of quality of life. They comprise a variety of outcome measures, such as emotional functioning (including anxiety and depression), physical functioning, social functioning, pain, fatigue, other symptoms and toxicity. A large number of questionnaires, or instruments, have been developed for assessing patient-reported outcomes and quality of life, and these have been used in a wide variety of circumstances. This book is concerned with the development, analysis and interpretation of data from these quality of life instruments.
This book accepts a broad definition of quality of life, and discusses the design, application and use of single- and multi-item, subjective, measurement scales. This encompasses not just ‘overall quality of life’ but also the symptoms and side effects that may or may not reflect – or affect – quality of life. Some researchers prefer to emphasise that we are only interested in health aspects, as in health-related quality of life (HRQoL or HRQL), while others adopt the terms patient-reported outcomes (PROs) or patient-reported outcome measures (PROMs), because those terms indicate interest in a whole host of outcomes, ranging from pain, fatigue and depression through to physical symptoms such as nausea and vomiting. But not all subjects are ‘patients’ who are ill; it is also suggested that PRO could mean person-reported outcome. Health outcomes assessment has also been proposed, which emphasises that the focus is on health issues and also avoids specifying the respondent: for young children and for the cognitively impaired we may use proxy assessment. And for many years some questionnaires have focused on health status or self-reported health (SRH), with considerable overlap with quality of life.
From a measurement perspective, this book is concerned with all the above. For simplicity we will use the now well-established overall term quality of life (QoL) to indicate (a) the set of outcomes that contribute to a patient’s well-being or overall health, or (b) a summary measure or scale that purports to describe a patient’s overall well-being or health. Examples of summary measures for QoL include general questions such as ‘How good is your overall quality of life?’ or ‘How do you rate your overall health?’ that represent global assessments. When referring to outcomes that reflect individual dimensions, we use the acronym PROs. Examples of PROs are pain or fatigue; symptoms such as headaches or skin irritation; function, such as social and role functioning; issues such as body image or existential beliefs; and so on. Mostly, we shall assume the respondent is the patient or person whose experience we are interested in (self-report), but it could be a proxy.
The measurement issues for all these outcomes are similar. Should we use single- or multi-item scales? Content and construct validity – are we measuring what we intend? Sensitivity, reliability, responsiveness – is the assessment statistically adequate? How should such assessments be incorporated into clinical studies? And how do we analyse, report and interpret the results?
The definition of patient-reported outcome is straightforward, and has been described as “any report of the status of a patient’s health condition that comes directly from the patient, without interpretation of the patient’s response by a clinician or anyone else” (US FDA, 2009). A PRO can be measured by self-report or by interview provided that the interviewer records only the patient’s response. The outcome can be measured in absolute terms (e.g. severity of a symptom, sign or state of a disease) or as a change from a previous assessment.
In contrast to PRO, the term quality of life is ill defined. The World Health Organization (WHO, 1948) declares health to be ‘a state of complete physical, mental and social well-being, and not merely the absence of disease’. Many other definitions of both ‘health’ and ‘quality of life’ have been attempted, often linking the two and, for QoL, frequently emphasising components of happiness and satisfaction with life. In the absence of any universally accepted definition, some investigators argue that most people, in the Western world at least, are familiar with the expression ‘quality of life’ and have an intuitive understanding of what it comprises.
However, it is clear that ‘QoL’ means different things to different people, and takes on different meanings according to the area of application. To a town planner, for example, it might represent access to green space and other facilities. In the context of clinical trials we are rarely interested in QoL in such a broad sense, and instead are concerned only with evaluating those aspects that are affected by disease or treatment for disease. This may sometimes be extended to include indirect consequences of disease, such as unemployment or financial difficulties. To distinguish between QoL in its more general sense and the requirements of clinical medicine and clinical trials the term health-related quality of life (HRQoL) is frequently used in order to remove ambiguity.
Health-related QoL is still a loose definition. What aspects of QoL should be included? It is generally agreed that the relevant aspects may vary from study to study but can include general health, physical functioning, physical symptoms and toxicity, emotional functioning, cognitive functioning, role functioning, social well-being and functioning, sexual functioning and existential issues. In the absence of any agreed formal definition of QoL, most investigators circumvent the issues by describing what they mean by QoL, and then letting the items (questions) in their questionnaire speak for themselves. Thus some questionnaires focus upon the relatively objective signs such as patient-reported toxicity, and in effect define the relevant aspects of QoL as being, for their purposes, limited to treatment toxicity. Other investigators argue that what matters most is the impact of toxicity, and therefore their questionnaires place greater emphasis upon psychological aspects, such as anxiety and depression. Yet others try to allow for spiritual issues, ability to cope with illness and satisfaction with life.
Some QoL instruments focus upon a single concept, such as emotional functioning. Other instruments regard these individual concepts as aspects, or dimensions, of QoL, and therefore include items relating to several concepts. Although there is disagreement about what components should be evaluated, most investigators agree that a number of the above dimensions should be included in QoL questionnaires, and that QoL is a multidimensional construct. Because there are so many potential dimensions, it is impractical to try to assess all these concepts simultaneously in one instrument. Most instruments intended for health-status assessment include at least some items that focus upon physical, emotional and social functioning. For example, if emotional functioning is accepted as being one aspect of QoL that should be investigated, several questions could evaluate anxiety, tension, irritability, depression and so on. Thus instruments may contain many items. Although a single global question such as ‘How would you rate your overall quality of life?’ is a useful adjunct to multi-item instruments, global questions are often regarded as too vague and non-specific to be used on their own. Most of the general questionnaires that we describe include one or more global questions alongside a number of other items covering specific issues. Some instruments place greater emphasis upon the concept of global questions, and the EQ-5D questionnaire (Appendix E4) asks a parsimonious five questions before using a single global question that enquires about ‘your health’. Even more extreme is the Perceived Adjustment to Chronic Illness Scale (PACIS) described by Hürny et al. (1993). This instrument consists of a single, carefully phrased question that is a global indicator of coping and adjustment: ‘How much effort does it cost you to cope with your illness?’ This takes responses ranging between ‘No effort at all’ and ‘A great deal of effort’.
One unifying and non-controversial theme throughout all the approaches is that the concepts forming these dimensions can be assessed only by subjective measures, PROs, and that they should be evaluated by asking the patient. Proxy assessments, by a relative or other close observer, are usually employed only if the patient is unable to make a coherent response, for example those who are very young, very old, severely ill or have mental impairment. Furthermore, many of these individual concepts – such as emotional functioning and fatigue – lack a formal, agreed definition that is universally understood by patients. In many cases the problem is compounded by language differences, and some concepts do not readily translate to other tongues. There are also cultural differences regarding the importance of the issues. Single-item questions on these aspects of QoL, as for global questions about overall QoL, are likely to be ambiguous and unreliable. Therefore it is usual to develop questionnaires that consist of multi-item measurement scales for each concept.
One of the earliest references that impinges upon a definition of QoL appears in the Nicomachean Ethics, in which Aristotle (384–322 BCE) notes: “Both the multitude and persons of refinement … conceive ‘the good life’ or ‘doing well’ to be the same thing as ‘being happy’. But what constitutes happiness is a matter of dispute … some say one thing and some another, indeed very often the same man says different things at different times: when he falls sick he thinks health is happiness, when he is poor, wealth.” The Greek ευδαιμονια is commonly translated as ‘happiness’ although Rackham, the translator that we cite, noted that a more accurate rendering would embrace ‘well-being’, with Aristotle denoting by ευδαιμονια both a state of feeling and a kind of activity. In modern parlance this is assuredly quality of life. Although the term ‘quality of life’ did not exist in the Greek language of 2000 years ago, Aristotle clearly appreciated that QoL means different things to different people. He also recognised that it varies according to a person’s current situation – an example of a phenomenon now termed response shift.
QoL was rarely mentioned until the twentieth century, although one early commentator on the subject noted that happiness could be sacrificed for QoL: “Life at its noblest leaves mere happiness far behind; and indeed cannot endure it … Happiness is not the object of life: life has no object: it is an end in itself; and courage consists in the readiness to sacrifice happiness for an intenser quality of life” (Shaw, [1900] 1972). It would appear that by this time ‘quality of life’ had become a familiar term that did not require further explanation.
Specific mention of QoL in relation to patients’ health came much later. The influential WHO 1948 definition of health cited above was one of the earliest statements recognising and stressing the importance of the three dimensions – physical, mental and social – in the context of disease.
Other definitions have been even more general: “Quality of Life: Encompasses the entire range of human experience, states, perceptions, and spheres of thought concerning the life of an individual or a community. Both objective and subjective, quality-of-life can include cultural, physical, psychological, interpersonal, spiritual, financial, political, temporal, and philosophical dimensions. Quality-of-life implies judgement of value placed on experience of communities, groups such as families, or individuals” (Patrick and Erickson, 1993).
One of the first instruments that broadened the assessment of patients beyond physiological and clinical examination was the Karnofsky Performance Scale proposed in 1947 (Karnofsky and Burchenal, 1947) for use in clinical settings. This is a simple scale ranging from 0 for ‘dead’ to 100 indicating ‘normal, no complaints, no evidence of disease’. Healthcare staff make the assessment. Over the years, it has led to a number of other scales for functional ability, physical functioning and activities of daily living (ADL), such as the Barthel Index. Although these questionnaires are still sometimes described as QoL instruments, they capture only one aspect of it and provide an inadequate representation of patients’ overall well-being and QoL.
The next generation of questionnaires, developed in the late 1970s and early 1980s, quantified health status and were used for the general evaluation of health. These instruments focused on physical functioning, physical and psychological symptoms, impact of illness, perceived distress and life satisfaction. Examples of such instruments include the Sickness Impact Profile (SIP) and the Nottingham Health Profile (NHP). Although these instruments are frequently described as QoL questionnaires, their authors neither designed them nor claimed them as QoL instruments.
Meanwhile, Priestman and Baum (1976) were adapting linear analogue self-assessment (LASA) methods to assess QoL in breast cancer patients. The LASA approach, which is also sometimes called a visual analogue scale (VAS), provides a 10 cm line, with the ends labelled with words describing the extremes of a condition. The patient is asked to mark the point along the line that corresponds with their feelings. An example of a LASA scale is contained in the EQ-5D (Appendix E4). Priestman and Baum (1976) measured a variety of subjective effects, including well-being, mood, anxiety, activity, pain, social activities and the patient’s opinion as to ‘Is the treatment helping?’ Others took the view that one need only ask a single question to evaluate the QoL of patients with cancer: “How would you rate your QoL today?” (Gough et al.
