Abstract

Aim:

Generic patient-reported outcome measures (PROMs) allow comparison of health-related quality of life across populations and pathologies. For these comparisons to be valid, the PROM must be responsive; the score must change when the patient’s quality of life changes. This study aims to assess the responsiveness of the EQ-5D-three level (3L) in elective shoulder surgery.

Methods:

Pre- and post-operative EQ-5D-3L and Oxford Shoulder Scores (OSS) were prospectively collected across a range of 204 elective shoulder surgeries. Internal responsiveness was assessed through significance testing of mean change scores and standardized response means (SRMs). External responsiveness of the EQ-5D-3L was assessed against the minimal clinically important difference in OSS, using receiver operating characteristic curve and change score correlation.

Results:

Both EQ-5D-3L and OSS scores improved significantly over time (p < 0.05). The SRM for the EQ-5D was 1.27 (95% CI 1.14–1.41) and for the OSS 2.36 (95% CI 2.22–2.50). Area under the curve for EQ-5D was 0.49. Only a weak correlation was found between EQ-5D and OSS change scores (r = 0.21).

Discussion:

The EQ-5D-3L is adequately internally responsive to change following elective shoulder surgery but is unable to differentiate patients demonstrating minimal clinically important change. The EQ-5D therefore only partially reflects patient experience.

Keywords

assessment EQ-5D outcome healthcare PROMs psychometrics responsiveness shoulder

Introduction

It is accepted that only by quantifying the patient’s perspective of their own health can we truly comment on the quality and effectiveness of healthcare interventions. To this end, considerable investment of resources has been made by academics and clinicians to develop robust and valid ways of collecting self-reported health outcome data, the culmination of which is the patient-reported outcome measure (PROM).¹ The use of PROMs is now embedded in the research framework through which health technology appraisal is undertaken, with the United Kingdom’s National Institute for Health and Care Excellence (NICE) requiring PROMs evidence as part of their deliberations.² Through this route, cost-effectiveness is measured in relation to the benefit in quality-adjusted life years (QALY), using health-related quality of life (HRQoL) data derived from generic measures such as the EQ-5D.

The relationship between cost and QALY is becoming increasingly relevant in modern healthcare. This is of particular importance in the heavily scrutinized area of elective orthopaedic surgery. It has been well reported that the volume of elective shoulder surgery has risen exponentially.^3,4 Evidence of effectiveness, or lack thereof, must therefore be valid, reliable and responsive. Generic instruments such as the EQ-5D and SF-36 have previously been shown to fulfil these metric properties in a wide range of conditions^1,5; however, in certain groups, these instruments miss aspects of health that are vitally important to patients. In these circumstances, condition-specific measures, such as the Oxford Shoulder Score (OSS),⁶ are advocated alongside their generic counterparts. However, these measures do not allow comparison across patients with different conditions and cannot provide quality of life data for economic evaluation.⁵ It is therefore in the interests of patients that the responsiveness of the generic PROM is adequately assessed. Only then can we be confident that the generic PROM accurately reflects the experience of the patient.⁵

The EQ-5D-three-level (3L) questionnaire is currently employed by the National Health Service (NHS) England PROMs programme.^7,8 It was introduced in 1990 and comprises questions on the following five dimensions: mobility, self-care, usual activities, pain/discomfort and anxiety/depression. Each dimension has three levels: no problems, some problems and extreme problems. A visual analogue scale (VAS) is also provided within the questionnaire which records the patient’s self-rated health on a vertical scale where the endpoints are labelled ‘Best imaginable health state’ and ‘Worst imaginable health state’. Though its sensitivity has previously been questioned⁸ and a five-level version has been produced and is beginning to be utilized,^3,9 the 3L version remains highly prevalent in recent upper limb research^10,11 and is likely to remain part of the relevant evidence base for many years, perhaps decades.¹² Analysis of the NHS PROMs programme has found that the EQ-5D-3L is adequately responsive in total hip and knee arthroplasty.¹ The only assessment of shoulder-related responsiveness has been in proximal humeral fractures,¹³ where it was recommended as a quality of life measure for that particular injury. In light of the increasing utilization and central importance of cost-effectiveness analysis, it is vital that the responsiveness of the EQ-5D-3L in elective shoulder surgery is assessed. The aim of this study is to evaluate the responsiveness of the generic PROM, the EQ-5D-3L, and the condition-specific OSS in elective shoulder surgery.

Patients and methods

A prospective cohort study on patients undergoing shoulder surgery between January 2009 and January 2012 was undertaken. Patients undergoing surgery for instability were excluded as they were assessed with the Oxford Shoulder Instability Score. All patients completed the OSS and the EQ-5D-3L score preoperatively. Patients who underwent arthroscopic capsular release, arthroscopic subacromial decompression and arthroscopic rotator cuff repair (RCR) completed the same questionnaires 6 months post-operatively, while patients who had a total shoulder replacement (TSR) completed them at 1 year post-operatively. Questionnaires were checked for completion by one of the investigators to facilitate instrument completion without influencing responses.

The OSS⁶ assesses the symptoms and function experienced by the patient during the preceding 4 weeks. It comprises 12 questions with five possible response options, scored from 0 to 4, with 4 representing the best response. Scores from individual questions are added to produce a final score ranging from 0 (worst shoulder function) to 48 (best shoulder function). The OSS has been shown to be a valid, reliable and responsive tool in the assessment of all operative shoulder procedures, excluding surgery for instability.¹⁴

The EuroQol¹⁵ EQ-5D-3L is an internationally validated general measure of HRQoL. The EQ-5D-3L index score is calculated using population-based preference weights and the score ranges from 0 to 1, where 1 represents perfect health and 0 is death. Negative values are allowed and represent a health status considered to be worse than death. The EQ-5D VAS was also collected. This provides a broad measure of overall health from 0 (worst imaginable health) to 100 (best imaginable health). All outcome measures were administered as part of ongoing service evaluation of publicly funded healthcare. The EQ-5D is the intellectual property of the EuroQol Group and the OSS is the intellectual property of Oxford University Innovation.

Statistical methods

Responsiveness is related to an instrument’s ability to capture clinically important changes over time.¹⁶ For a rounded understanding of the utility of a generic health measure, this needs to be assessed in two forms, ‘internal’ and ‘external’ responsiveness. Internal responsiveness is the ability of the measure to change over a pre-specified time period. ‘External responsiveness’ reflects the extent to which change in a measure relates to a corresponding change in a reference measure of clinical or health status.¹⁷ All statistical analysis was undertaken in STATA (StataCorp. 2015. Stata Statistical Software: Release 14. College Station, TX: StataCorp LP). Significance was set at p < 0.05.

Internal responsiveness

The statistical significance of the observed change was assessed for both OSS and EQ-5D using a one-sample (two-sided) t-test applied to the change in scores (preoperative to post-operative), testing against the null hypothesis of no change.
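As a minimal sketch of this test, and not the authors' STATA code, the t statistic for a set of change scores can be computed as below. The five pre- and post-operative scores are hypothetical and the function name `one_sample_t` is an assumption made for illustration.

```python
import math
import statistics


def one_sample_t(changes):
    """Return (t statistic, degrees of freedom) for H0: mean change = 0."""
    n = len(changes)
    mean_change = statistics.mean(changes)
    sd_change = statistics.stdev(changes)  # sample SD (n - 1 denominator)
    standard_error = sd_change / math.sqrt(n)
    return mean_change / standard_error, n - 1


# Hypothetical pre- and post-operative scores for five patients (illustration only)
pre = [18, 22, 15, 25, 20]
post = [36, 40, 30, 41, 39]
changes = [after - before for before, after in zip(pre, post)]

t_stat, df = one_sample_t(changes)
print(f"t = {t_stat:.2f}, df = {df}")
```

The two-sided p-value is then obtained by comparing the statistic with a t distribution on n − 1 degrees of freedom; that step is left to a statistics package here.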

Responsiveness was further assessed by measuring the standardized response mean (SRM). The SRM is the mean score change divided by the standard deviation (SD) of the score change between each time period.¹⁸ The SRM is often used alongside the paired t-test as it removes the dependence on sample size.¹⁷

The SRM is preferred over effect size in measuring responsiveness as it uses the SDs of the change scores as the denominator.¹³ The SRM was interpreted using Cohen’s criteria where the SRM is regarded as large (>0.8), moderate (0.5–0.8) or small (<0.5).¹⁹
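The definition and Cohen's thresholds can be illustrated with a short sketch. It uses the rounded mean and SD of the change scores from the 'All' row of Table 1, so the results may differ in the second decimal place from SRMs calculated on patient-level data. The function names are assumptions made for illustration.

```python
def standardized_response_mean(mean_change, sd_change):
    """SRM = mean change score divided by the SD of the change scores."""
    return mean_change / sd_change


def cohen_category(srm):
    """Classify an SRM as large (>0.8), moderate (0.5-0.8) or small (<0.5)."""
    magnitude = abs(srm)
    if magnitude > 0.8:
        return "large"
    if magnitude >= 0.5:
        return "moderate"
    return "small"


# (mean change, SD of change) taken from the 'All' row of Table 1 (rounded values)
table1_all = {
    "OSS": (18.04, 7.65),
    "EQ-5D-3L": (0.43, 0.34),
    "EQ-5D VAS": (10.11, 18.73),
}

results = {}
for measure, (mean_change, sd_change) in table1_all.items():
    srm = standardized_response_mean(mean_change, sd_change)
    results[measure] = (srm, cohen_category(srm))
    print(f"{measure}: SRM = {srm:.2f} ({results[measure][1]})")
```

Because the one-sample t statistic equals the SRM multiplied by the square root of the sample size, the SRM can be read as the same signal with the sample-size term removed.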

External responsiveness

The external criterion against which the responsiveness of the EQ-5D was tested was the minimal clinically important difference (MCID) of the OSS. This is the smallest difference in a score which the patient perceives as being beneficial.²⁰ The MCID of the OSS has only recently been defined as >4.5 points for elective shoulder surgery.³ This was derived from the distribution method of half a SD.²¹ Anchor-based methodologies have been utilized in the cross-culturally adapted Dutch OSS but not in UK populations.^22,23

Non-parametric receiver operating characteristic (ROC) curves were used to assess the sensitivity and the false positive rate (1-specificity) of the EQ-5D-3L and EQ-5D VAS against the dichotomized outcome of patients with an OSS change score >4.5 (MCID). The ROC curves demonstrate the ability of the change scores (post-operative score minus preoperative score) to discriminate between the patients defined by the external criterion.¹³ The area under the curve (AUC) represents the diagnostic ability of an instrument, with a value of 0.5 denoting performance no better than random chance and a value of 1.0 indicating perfect predictive ability.²⁴
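A minimal sketch of the non-parametric AUC is given below. It relies on the fact that the area under the empirical ROC curve equals the proportion of improved/non-improved patient pairs in which the improved patient has the larger change score, with ties counted as half. The six patients are hypothetical and the function name `roc_auc` is an assumption; this is not the authors' STATA routine.

```python
def roc_auc(change_scores, improved):
    """Non-parametric AUC: share of (improved, non-improved) pairs ranked correctly."""
    positives = [s for s, flag in zip(change_scores, improved) if flag]
    negatives = [s for s, flag in zip(change_scores, improved) if not flag]
    if not positives or not negatives:
        raise ValueError("both improved and non-improved patients are required")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


OSS_MCID = 4.5  # external criterion used in this study

# Hypothetical change scores for six patients (illustration only)
oss_change = [12, 3, 20, 6, 2, 15]
eq5d_change = [0.30, 0.10, 0.45, 0.05, 0.20, 0.25]

# Dichotomize patients by whether their OSS change exceeds the MCID
improved = [change > OSS_MCID for change in oss_change]
auc = roc_auc(eq5d_change, improved)
print(f"AUC = {auc:.2f}")
```

An AUC near 0.5 means the change score ranks improved and non-improved patients no better than chance.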

Logistic regression was used to provide a relative estimate of the level of variance that is explained by the change scores. The derived odds ratio (OR) was calculated with the external criterion as the dependent variable and the EQ-5D change scores as the independent variables.

Correlations between the change scores of the OSS, EQ-5D-3L and EQ-5D VAS were calculated using Spearman’s rank. Under the null hypothesis, a proportional change in all scores would not occur over time.
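Spearman's rank correlation can be sketched as the Pearson correlation of the ranked change scores, with tied values sharing their average rank. The data below are hypothetical and the helper names are assumptions made for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical change scores for six patients (illustration only)
oss_change = [12, 3, 20, 6, 2, 15]
eq5d_change = [0.30, 0.10, 0.45, 0.05, 0.20, 0.25]
rho = spearman_rho(oss_change, eq5d_change)
print(f"rho = {rho:.2f}")
```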

Preoperative and post-operative scores were evaluated for floor (scores reflecting the lowest level of functioning) and ceiling (scores reflecting the maximal level of functioning) effects. An instrument is considered to have a significant floor or ceiling effect if more than 15% of the scores are at the lowest or highest level of functioning.²⁵
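The 15% rule can be expressed as a simple proportion check. The sketch below applies it to post-operative ceiling counts reported in Table 4; the function name `has_significant_effect` is an assumption made for illustration.

```python
THRESHOLD = 0.15  # more than 15% of scores at the extreme counts as significant


def has_significant_effect(n_at_extreme, n_total, threshold=THRESHOLD):
    """True if the share of patients at the floor or ceiling exceeds the threshold."""
    return n_at_extreme / n_total > threshold


# Post-operative ceiling counts from Table 4: (patients at ceiling, group size)
post_op_ceiling = {
    "OSS overall": (7, 204),
    "OSS TSR": (5, 42),
    "EQ-5D overall": (57, 204),
    "EQ-5D ACR": (16, 100),
    "EQ-5D ASD": (12, 21),
}

flags = {}
for group, (count, total) in post_op_ceiling.items():
    flags[group] = has_significant_effect(count, total)
    print(f"{group}: {count / total:.0%} at ceiling, significant = {flags[group]}")
```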

Results

There were 204 patients (125 women, 79 men) who were eligible for inclusion in the study. The demographics of the surgical population studied and the respective subgroups are shown in Table 1. There was a significant post-operative improvement in patients’ function as assessed by the OSS, the EQ-5D-3L and the EQ-5D VAS, with the exception of the EQ-5D VAS in the TSR group.

Table 1.

Patient demographics and mean (SD) of pre- and post-operative and change score for the OSS, EQ-5D-3L and EQ-5D VAS.

Surgical subgroup	No.	Male (%)	Female (%)	Age	Pre-op OSS, mean (SD)	Post-op OSS, mean (SD)	Change score: OSS, mean (SD)	Pre-op EQ-5D, mean (SD)	Post-op EQ-5D, mean (SD)	Change: EQ-5D, mean (SD)	Pre-op EQ-5D VAS, mean (SD)	Post-op EQ-5D VAS, mean (SD)	Change: EQ-5D VAS (SD)
All	204	79 (38.7)	125 (61.3)	60.3	19.64 (6.52)	37.69 (5.61)^a	18.04 (7.65)	0.37 (0.30)	0.80 (0.18)^a	0.43 (0.34)	72.34 (15.33)	82.45 (13.71)^a	10.11 (18.73)
ACR	100	40 (40)	60 (60)	54.3	16.78 (5.81)	36.23 (4.21)^a	19.45 (6.50)	0.24 (0.28)	0.79 (0.11)^a	0.55 (0.30)	69.81 (17.04)	84.05 (8.84)^a	14.24 (18.12)
ASD	21	9 (43)	12 (57)	61	24.24 (4.88)	41.00 (3.95)^a	16.76 (5.56)	0.62 (0.26)	0.86 (0.24)^a	0.24 (0.26)	75.24 (14.27)	86.19 (14.57)^a	10.95 (14.72)
RCR	41	19 (46.3)	22 (53.7)	56	23.85 (6.85)	37.56 (7.04)^a	13.71 (9.28)	0.47 (0.28)	0.84 (0.22)^a	0.37 (0.35)	71.78 (12.76)	79.76 (17.89)^a	7.98 (20.15)
TSR	42	11 (26.2)	31 (73.8)	70	20.05 (4.85)	39.62 (6.53)^a	19.57 (7.86)	0.45 (0.26)	0.74 (0.24)^a	0.29 (0.34)	77.48 (12.52)	79.40 (17.19)	1.93 (18.09)

VAS: visual analogue scale; SD: standard deviation; OSS: Oxford Shoulder Score; ACR: arthroscopic capsular release; ASD: arthroscopic subacromial decompression; RCR: rotator cuff repair; TSR: total shoulder replacement.

^a Denotes statistically significant difference between pre- and post-op score.

The SRM for OSS was significantly higher than the SRM for EQ-5D-3L and EQ-5D VAS in all surgical groups except for those patients who underwent the RCR procedure (Table 2). In accordance with Cohen’s criteria, the OSS SRM scores were large (>0.8) in all categories. The EQ-5D-3L SRM scores were also large in all categories, though smaller in magnitude. The EQ-5D VAS responsiveness was moderate for the capsular release and subacromial decompression groups but small for the RCR and TSR groups.

Table 2.

SRM ± 95% CI for the OSS, EQ-5D-3L and EQ-5D VAS in surgical subgroups.

Surgical subgroup	SRM: OSS (95% CI)	SRM: EQ-5D (95% CI)	SRM: EQ-5D VAS (95% CI)
All	2.36 (2.22–2.50)	1.27 (1.14–1.41)	0.54 (0.40–0.69)
ACR	2.99 (2.79–3.19)	1.85 (1.66–2.05)	0.76 (0.57–0.95)
ASD	3.02 (2.56–3.47)	0.91 (0.45–1.36)	0.58 (0.23–0.94)
RCR	1.48 (1.16–1.79)	1.07 (0.75–1.38)	0.43 (0.086–0.76)
TSR	2.49 (2.18–2.80)	0.84 (0.53–1.15)	0.10 (−0.20 to 0.40)

SRM: standardized response mean; CI: confidence interval; OSS: Oxford Shoulder Score; ACR: arthroscopic capsular release; ASD: arthroscopic subacromial decompression; RCR: rotator cuff repair; TSR: total shoulder replacement; VAS: visual analogue scale.

Weak correlations were noted between OSS and the EQ-5D-3L (r = 0.21) and EQ-5D VAS (r = 0.15) change scores (Table 3). ROC curve analysis found that the EQ-5D did not discriminate between patients judged to be above or below the external criterion threshold, with an AUC of 0.49. The EQ-5D VAS, however, did show some discriminatory ability with an AUC of 0.79. The logistic regression OR demonstrated weak external responsiveness for the EQ-5D VAS but not for the EQ-5D-3L (Table 3 and Figure 1).

Table 3.

EQ-5D change score correlation (Spearman rank) with OSS change score.^a

	Change score correlation with OSS	AUC (95% CI) against OSS external criteria (MCID > 4.5)	Logistic regression OR (95% CI) against OSS external criteria (MCID > 4.5)
EQ-5D-3L	0.21	0.490 (0.234–0.746)	0.91 (0.06–13.07)
EQ-5D VAS	0.15	0.787 (0.599–0.975)	1.06 (1.02–1.11)

AUC: area under the curve; CI: confidence interval; OR: odds ratio; MCID: minimal clinically important difference; OSS: Oxford Shoulder Score; VAS: visual analogue scale.

^a AUC ± 95% CI and logistic regression OR ± 95% CI against MCID criteria of a change in OSS of >4.5 points.

Figure 1.

Non-parametric ROC curves for EQ-5D-3L and EQ-5D VAS against the external criterion of a MCID of a change score in OSS of >4.5 points. ROC: receiver operating characteristic; MCID: minimal clinically important difference; OSS: Oxford Shoulder Score.

There were no floor effects observed in the OSS or the EQ-5D-3L scores preoperatively or post-operatively (Table 4). There were no significant ceiling effects with the OSS, but significant ceiling effects were observed both overall and within all subgroups for the EQ-5D-3L post-operative scores.

Table 4.

Pre- and post-operative floor and ceiling effect of the OSS and EQ-5D in surgical subgroups.

	OSS				EQ-5D
	Preoperative		Post-operative		Preoperative		Post-operative
	Floor effect	Ceiling effect	Floor effect	Ceiling effect	Floor effect	Ceiling effect	Floor effect	Ceiling effect
Overall	0	0	0	7 (3%)	0	0	0	57 (28%)
ACR	0	0	0	0	0	0	0	16 (16%)
ASD	0	0	0	0	0	0	0	12 (57%)
TSR	0	0	0	5 (12%)	0	0	0	10 (24%)
RCR	0	0	0	2 (5%)	0	0	0	19 (46%)

OSS: Oxford Shoulder Score; ACR: arthroscopic capsular release; ASD: arthroscopic subacromial decompression; RCR: rotator cuff repair; TSR: total shoulder replacement.

Discussion

The routine use of PROMs is now an established component of health technology appraisal, monitoring and performance assessment.^1,26 The OSS is collected by the national joint arthroplasty registries of the United Kingdom and several other European countries.²⁷ In addition to the OSS, some national joint registries collect the EQ-5D-3L, allowing assessment against population-based norms and health economic analysis. In the United Kingdom, the EQ-5D has become the instrument of choice for many agencies including the National Institute for Health and Care Excellence (NICE).² The results of this study in elective shoulder procedures demonstrate that though the EQ-5D-3L has adequate internal responsiveness, its external responsiveness was poor and the change in scores correlates only weakly with the change in OSS. Furthermore, it is unable to discriminate between patients who did, or did not, demonstrate MCID in the OSS.

Significant improvement in both scores was noted following a variety of common elective shoulder procedures. However, when quantified through SRM, the OSS was found to be significantly more responsive. Though the SRM values for the EQ-5D-3L exceed Cohen’s benchmark of 0.8 for large effect, they were approximately half those shown by the OSS, and it is relevant to note that the EQ-5D VAS responded poorly, particularly in the arthroplasty group. To a certain extent, this is to be expected: a disease-specific outcome commonly outperforms generic measures, which forms the basis for recommendations that assessments should include both measures.¹ We would certainly concur with this assertion, particularly in light of previous attempts to reduce the patient burden by administration of generic-only questionnaires, an approach which has been found to be sufficient in lower limb surgery, where the inclusion of condition-specific scores did not provide additional information.^28,29

An accepted external criterion, the MCID, was employed to assess the external responsiveness of the EQ-5D. A difference in the pre- and post-operative OSS score of >4.5 is felt to represent a patient improvement. Against this criterion, the EQ-5D was not able to discriminate between improved and non-improved patients. Post hoc modelling of different theoretical MCID reference scores found the responsiveness to improve once the MCID was set at >9 point change in OSS, with an AUC of 0.63. This represents a large change in OSS, and interestingly, using a distribution-based derivation of MCID (half a SD),²¹ our own data set would place the MCID at >3. The EQ-5D change scores weakly correlated with OSS change scores and logistic regression demonstrated poor discriminative ability.

When the MCID was >4.5, the EQ-5D VAS demonstrated good discrimination between improved and unimproved patients. This was surprising in light of the very weak correlation between OSS and EQ-5D VAS change scores and is likely due to the small number of very poor EQ-5D VAS change scores in the non-improved patients. When the MCID was set at a theoretical >9 points, where a larger number of non-improved patients are included, the discriminatory value of the EQ-5D VAS significantly diminished, with an AUC of 0.52.

The presence of ceiling effects limits the use of an instrument due to clustering of scores at a maximum level of functioning.³⁰ In line with previous reports,^31,32 we found the OSS to be resistant to any significant floor and ceiling effects. In contrast, the EQ-5D-3L had a significantly high ceiling effect ranging from 16% in the capsular release group to as high as 57% in the subacromial decompression group. Our results agree with the findings of Slobogean et al.,³³ who evaluated patients with proximal humerus fractures and reported a 30% ceiling effect with its use. The high ceiling rate may be in part due to a bimodal distribution of scores but the content of the questionnaire also makes a difference. If an item is irrelevant to members of a population, then there is limited probability it will show improvement in a longitudinal study.³⁰ For example, although it is possible that patients with shoulder pathology may have had concomitant symptoms of anxiety/depression, most patients are unlikely to be affected in this domain, and this item is therefore unlikely to shift after any treatment.

The authors accept that there are limitations to this study. The EQ-5D-3L was selected for this study, but we recognize that the previously reported high ceiling effect, bimodality and inadequacy of response options in capturing changes in health states in milder health problems drove the development of the EQ-5D-5L, published in 2011.³⁴ This has extended the response options on the five health dimensions from three to five options. It may be that the responsiveness of this newer version is improved, and we would encourage this analysis to be undertaken. However, reference value sets for UK populations have only very recently been published,⁸ and the three-option version continues to be employed in health technology assessment and population monitoring, with particular reference to the English NHS PROMs programme for hip and knee arthroplasty.⁷ If this were extended to shoulder arthroplasty or elective shoulder surgery, it is highly relevant to note that the responsiveness of the three-option version may not represent patients adequately. We also recognize that the use of Cohen’s criteria with SRM data may lead to over- or underestimations of the magnitude of change over time.³⁵ Using the correction method advocated by Middel and van Sonderen,³⁵ by relating repeated measure correlations to SRM data, no classification changes occurred for OSS or EQ-5D-3L groups. We also recognize that responsiveness is only one of the psychometric properties essential to the functioning of a PROM³⁶; however, the appropriateness, validity, repeatability, acceptability and feasibility of these measures have previously been studied.

The EQ-5D-3L is one of the most widely used generic PROMs.³⁷ It is the instrument recommended by NICE² for health technology assessment, which includes assessment of effectiveness and cost-effectiveness. It is vital that any health economic assessment computed on the basis of the EQ-5D is a reliable measure of the health state it represents.⁶ The EQ-5D-3L, though internally responsive in elective shoulder surgery, correlated poorly with the OSS and is unable to differentiate patients whose clinical condition has improved from those who have not. Though the VAS component of the EQ-5D-3L might have utility in distinguishing patients, based on ROC analysis, the limited responsiveness to change demonstrated by a low SRM suggests that it too is inadequate. Though the five-option version may offer improved metric properties, it is worth noting that a 15-dimension, 5-level score (the 15D) has been found to be less responsive than the EQ-5D-3L in some health states,³⁸ and further investigation is therefore warranted. The OSS itself was initially validated against the SF-36, where pre- and post-op correlation coefficients were greater³⁹ than those demonstrated here with the EQ-5D-3L, though their assessment of effect size of the SF-36 was broadly similar to the EQ-5D-3L. Further assessment of SF-36 or SF-6D responsiveness as well as EQ-5D-5L in elective shoulder procedures is therefore required before any measure could be confidently recommended.

Conclusion

This is the first study to demonstrate that the EQ-5D-3L exhibits adequate internal responsiveness but poor external responsiveness in elective shoulder surgery. The EQ-5D-3L does not represent patient improvement and therefore may not provide adequate evidence on QALYs for economic evaluation in this patient population.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

1. Appleby, Devlin, Parkin. Using patient reported outcomes to improve health care. West Sussex: John Wiley & Sons; 2015.

2. National Institute for Health and Care Excellence (NICE). Guide to the methods of technology appraisal 2013, https://www.nice.org.uk/process/pmg9/chapter/foreword (accessed 4 May 2017).

3. Beard, Rees, Rombach. The CSAW Study (Can Shoulder Arthroscopy Work?) – a placebo-controlled surgical intervention trial assessing the clinical and cost effectiveness of arthroscopic subacromial decompression for shoulder pain: study protocol for a randomised controlled trial. Trials 2015; 16(1): 210.

4. Jevne. The sexy scalpel: unnecessary shoulder surgery on the rise. Br J Sports Ex Med 2015; (16): 1031–1032.

5. Payakachat, Ali, Tilford. Can the EQ-5D detect meaningful change? A systematic review. Pharmacoeconomics 2015; 33(11): 1137–1154.

6. Tordrup, Mossman, Kanavos. Responsiveness of the EQ-5D to clinical change: Is the patient experience adequately represented? Int J Technol Assess Health Care 2014; 30(01): 10–19.

7. Devlin, Parkin, Browne. Patient-reported outcome measures in the NHS: new methods for analysing and reporting EQ-5D data. Health Econ 2010; 19(8): 886–905.

8. Devlin, Shah, Feng. Valuing health-related quality of life: An EQ-5D-5L value set for England. Health Econ 2018; 27(1): 7–22.

9. Brealey, Armstrong, Brooksbank. United Kingdom Frozen Shoulder Trial (UK FROST), multi-centre, randomised, 12 month, parallel group, superiority study to compare the clinical and cost-effectiveness of Early Structured Physiotherapy versus manipulation under anaesthesia versus arthroscopic capsular release for patients referred to secondary care with a primary frozen shoulder: study protocol for a randomised controlled trial. Trials 2017; 18(1): 614.

10. Rangan, Handoll, Brealey. Surgical vs nonsurgical treatment of adults with displaced fractures of the proximal humerus: the PROFHER randomized clinical trial. JAMA 2015; 313(10): 1037–1047.

11. Carr, Cooper, Campbell. Clinical effectiveness and cost-effectiveness of open and arthroscopic rotator cuff repair [the UK Rotator Cuff Surgery (UKUFF) randomised trial]. Health Technol Assess 2015; 19(80): 1–218.

12. Wailoo, Alava, Grimm. Comparing the EQ-5D-3L and 5L versions. What are the implications for cost effectiveness estimates? NICE Decision Support Unit, http://www.nicedsu.org.uk/DSU_3L%20to%205L%20FINAL.pdf (2017, accessed 4 May 2017).

13. Olerud, Tidermark, Ponzer. Responsiveness of the EQ-5D in patients with proximal humeral fractures. J Shoulder Elbow Surg 2011; 20(8): 1200–1206.

14. Dawson, Rogers, Fitzpatrick. The Oxford shoulder score revisited. Arch Orthop Trauma Surg 2009; 129(1): 119–123.

15. van Reenen, Oppe. EQ-5D-3L user guide. EuroQol Res Found 2015; 22.

16. Deyo, Diehr, Patrick. Reproducibility and responsiveness of health status measures statistics and strategies for evaluation. Control Clin Trials 1991; 12(4): S142–S158.

17. Husted, Cook, Farewell. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol 2000; 53(5): 459–468.

18. Beaton, Hogg-Johnson, Bombardier. Evaluating changes in health status: reliability and responsiveness of five generic health status measures in workers with musculoskeletal disorders. J Clin Epidemiol 1997; 50(1): 79–93.

19. Cohen. Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, 1988, pp. 20–26.

20. Juniper, Guyatt, Willan. Determining a minimal important change in a disease-specific quality of life questionnaire. J Clin Epidemiol 1994; 47(1): 81–87.

21. Norman, Stratford, Regehr. Methodological problems in the retrospective computation of responsiveness to change: the lesson of Cronbach. J Clin Epidemiol 1997; 50(8): 869–879.

22. van Kampen, Willems, van Beers. Determination and comparison of the smallest detectable change (SDC) and the minimal important change (MIC) of four-shoulder patient-reported outcome measures (PROMs). J Orthop Surg Res 2013; 8(1): 40.

23. Christiansen, Frost, Falla. Responsiveness and minimal clinically important change: a comparison between 2 shoulder outcome measures. J Orthop Sports Phys Ther 2015; 45(8): 620–665.

24. Beard, Harris, Dawson. Meaningful changes for the Oxford hip and knee scores after joint replacement surgery. J Clin Epidemiol 2015; 68(1): 73–79.

25. Terwee, Bot, de Boer. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007; 60(1): 34–42.

26. Dawson, Doll, Fitzpatrick. The routine use of patient reported outcome measures in healthcare settings. BMJ 2010; 340: c186.

27. Rasmussen, Olsen, Fevang BTS. A review of national shoulder and elbow joint replacement registries. J Shoulder Elbow Surg 2012; 21(10): 1328–1335.

28. Busse, Bhandari, Guyatt. Use of both Short Musculoskeletal Function Assessment questionnaire and Short Form-36 among tibial-fracture patients was redundant. J Clin Epidemiol 2009; 62(11): 1210–1217.

29. Dattani, Slobogean, O’Brien. Psychometric analysis of measuring functional outcomes in tibial plateau fractures using the Short Form 36 (SF-36), Short Musculoskeletal Function Assessment (SMFA) and the Western Ontario McMaster Osteoarthritis (WOMAC) questionnaires. Injury 2013; 44(6): 825–829.

30. Hyland. A brief guide to the selection of quality of life instrument. Health Qual Life Outcomes 2003; 1(1): 24.

31. Tuğay, Gelecek. Oxford Shoulder Score: cross-cultural adaptation and validation of the Turkish version. Arch Orthop Trauma Surg 2011; 131(5): 687–694.

32. Ekeberg, Bautz-Holter, Tveitå. Agreement, reliability and validity in 3 shoulder questionnaires in patients with rotator cuff disease. BMC Musculoskelet Disord 2008; 9(1): 68.

33. Slobogean, Noonan, O’Brien. The reliability and validity of the Disabilities of Arm, Shoulder, and Hand, EuroQol-5D, Health Utilities Index, and Short Form-6D outcome instruments in patients with proximal humeral fractures. J Shoulder Elbow Surg 2010; 19(3): 342–348.

34. Herdman, Gudex, Lloyd. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res 2011; 20(10): 1727–1736.

35. Middel, van Sonderen. Statistical significant change versus relevant or important change in (quasi) experimental design: some conceptual and methodological problems in estimating magnitude of intervention-related change in health services research. Inter J Int Care 2002; 2(4): e15.

36. Fitzpatrick, Davey, Buxton. Evaluating patient-based outcome measures for use in clinical trials. Health Technol Assess (Winchester, England) 1998; 2(14): 1–4.

37. Brooks and EuroQol Group. EuroQol: the current state of play. Health Policy 1996; 37(1): 53–72.

38. Stavem, Frøland, Hellum. Comparison of preference-based utilities of the 15D, EQ-5D and SF-6D in patients with HIV/AIDS. Qual Life Res 2005; 14(4): 971–980.

39. Dawson, Fitzpatrick, Carr. Questionnaire on the perceptions of patients about shoulder surgery. J Bone Joint Surg Br 1996; 78(4): 593–600.

Responsiveness of the EQ-5D-3L in elective shoulder surgery: Does it adequately represent patient experience?
