Abstract
Background:
The Patient-Reported Outcomes Measurement Information System (PROMIS) was developed to capture patient-reported outcomes (PROs) in an efficient manner. Few studies have assessed this instrument postoperatively.
Purpose:
To compare the PROMIS Physical Function computer adaptive test (PROMIS PF CAT) and Upper Extremity (PROMIS UE) item bank to other previously validated PRO instruments and to evaluate ceiling and floor effects, construct validity, and responsiveness in patients who underwent operative interventions for shoulder instability.
Study Design:
Cohort study (diagnosis); Level of evidence, 2.
Methods:
A total of 72 patients who underwent operative interventions for shoulder instability completed the American Shoulder and Elbow Surgeons (ASES) assessment form, Marx shoulder activity scale (Marx), 36-Item Short Form Health Survey physical function (SF-36 PF) and general health (SF-36 GH), Western Ontario Shoulder Instability Index (WOSI), PROMIS PF CAT, and PROMIS UE before surgery and then at 6 weeks and 6 months postoperatively. Correlation coefficients were calculated among these tools. The effect size of change was also calculated for each tool at each time point. A total of 91 patients who had also undergone surgery for shoulder instability completed these PRO instruments 2 years postoperatively. The percentage of patients scoring at the floor and at the ceiling of each PRO instrument was calculated at all time points.
Results:
The PROMIS PF CAT demonstrated excellent-good correlation with the SF-36 PF at all postoperative time points (0.61 at 6 weeks, 0.68 at 6 months, and 0.64 at 2 years; P < .01 for all). The PROMIS UE showed excellent correlation with the ASES at 6 weeks postoperatively (0.73, P < .01). Both the PROMIS PF CAT and PROMIS UE demonstrated the ability to detect change after surgical interventions with a medium to large effect size. The PROMIS UE demonstrated a ceiling effect at 6 months (68.1%) and 2 years (67.0%) postoperatively. The PROMIS PF CAT demonstrated no ceiling effect at any time point.
Conclusion:
The PROMIS PF CAT demonstrated good to excellent correlation with other previously validated PRO instruments that assess physical function in patients with shoulder instability postoperatively. The PROMIS UE demonstrated good correlation with other PRO tools but had a significant ceiling effect and is not recommended for this patient population. Both tools demonstrated an ability to detect change after surgical interventions with a medium to large effect size.
Patient-reported outcomes (PROs) are becoming increasingly important in our current health care delivery system. Identifying tools to accurately and efficiently capture PROs will help providers better understand patients’ perspective of their abnormality and treatment. These tools are also valuable to providers by allowing them to appropriately counsel patients about expected outcomes before and after operative interventions.
Shoulder instability is not uncommon, especially in young and athletic populations. The incidence of anterior instability has been shown to be between 11.2 per 100,000 person-years 18 and 23.9 per 100,000 person-years 19 in the United States. These patients are often young and male, and collision and overhead athletes have a higher risk. 7,8
Previously validated PRO tools used for the evaluation of patients with shoulder instability include the American Shoulder and Elbow Surgeons (ASES) assessment form, 12 the Western Ontario Shoulder Instability Index (WOSI), 11 and the Marx shoulder activity scale. 5 Other PRO instruments that address general health–related quality of life have also been used in the assessment of patients with shoulder instability, such as the 36-Item Short Form Health Survey, which includes subscales for physical function (SF-36 PF), general health (SF-36 GH), vitality, bodily pain, physical role functioning, emotional role functioning, social role functioning, and mental health, 13 and the EuroQol–5 Dimensions (EQ-5D) questionnaire. 10 Each of these instruments includes several domains to evaluate a patient’s outcome, including physical function, pain, and general health. The ideal tool should have a low respondent burden and have the ability to detect change before and after an intervention.
The Patient-Reported Outcomes Measurement Information System (PROMIS) was developed by the National Institutes of Health in an effort to improve the assessment of PROs by creating a tool that is highly reliable and precise while decreasing the burden on both patients and health care providers. Specifically, the PROMIS Physical Function computer adaptive test (PROMIS PF CAT) has been investigated in multiple patient populations with orthopaedic abnormalities. 3,4,17 The PROMIS Upper Extremity (PROMIS UE) item bank has also been utilized to study PROs in patients with rotator cuff abnormalities 2,16 and those undergoing shoulder arthroplasty. 6 Both of these tools have been previously investigated in patients with a diagnosis of shoulder instability before operative interventions. Both showed good to excellent correlation with other legacy upper extremity PRO instruments; however, the PROMIS UE had significant ceiling effects in patients aged ≤21 years. 3 Fewer studies have used these 2 tools to evaluate patients after surgery, when activity levels and use of the operative extremity should be higher and pain levels should be improved.
For patients undergoing a surgical intervention, it is of utmost importance to have a PRO instrument that can capture a meaningful change to judge the impact of the surgical intervention. Ideally, a PRO tool should have the ability both to detect change over a time frame (responsiveness) and to correspond to changes in a reference PRO instrument (construct validity). For these reasons, we aimed to investigate the PROMIS PF CAT and PROMIS UE in patients with shoulder instability before and after operative interventions. We hypothesized that (1) these tools would detect a change before and after an intervention and demonstrate a large effect size of that change; (2) the PROMIS UE might show a ceiling effect after shoulder stabilization in young patients; and (3) there would be convergent validity with other measures of physical function, such as the SF-36 PF, ASES, and WOSI, and divergent validity with measures of general health, such as the SF-36 GH and EQ-5D.
Methods
This study was approved by our institutional review board and was determined to be HIPAA (Health Insurance Portability and Accountability Act) compliant. Patients enrolled in our institution’s prospective shoulder instability database are asked to complete the ASES, WOSI, Marx, SF-36 (PF and GH subscales included), EQ-5D, PROMIS PF CAT, and PROMIS UE preoperatively and at 6 weeks, 6 months, and 2 years postoperatively. Data from patients enrolled in the database from January 21, 2015, to November 28, 2016, were extracted. A total of 96 patients were identified from this data extraction, and 72 were found to have complete preoperative as well as 6-week and 6-month postoperative data available for longitudinal analysis. Only 18 patients from this group had complete longitudinal data from preoperatively to 2 years postoperatively. Therefore, a second group of patients was identified from the database who were eligible for 2-year follow-up from June 9, 2015, to November 26, 2017. A total of 106 patients were identified in this time frame, and 91 patients had complete data available at 2 years postoperatively for cross-sectional analysis. Eighteen of these patients had preoperative data and overlapped with the longitudinal group. Patient demographic data including age, body mass index (BMI), sex, operative side, and procedure type were extracted from patient records. The typical rehabilitation protocol followed the Multicenter Orthopaedic Outcomes Network (MOON) shoulder instability protocols, which included starting physical therapy 2 weeks postoperatively and progressing to active range of motion and resisted isometric exercises by 6 weeks after surgery. 14
Spearman correlation coefficients were calculated between each PROMIS tool (PROMIS PF CAT and PROMIS UE) and the previously mentioned legacy PRO instruments: ASES, WOSI, SF-36, Marx, and EQ-5D. Correlations were described as excellent (>0.7), excellent-good (0.61-0.7), good (0.4-0.6), or poor (0.2-0.39). 9 Responsiveness to change was assessed using the effect size (Cohen d) and standardized response mean at 6 weeks and 6 months postoperatively. Values were defined as small (0.2), medium (0.5), or large (0.8). 15 The percentage of patients achieving the lowest (floor) and highest (ceiling) possible scores of each PRO tool was calculated, and a floor or ceiling effect was considered present if >15% of patients achieved these values. SAS statistical software (version 9.4; SAS Institute) was used for analyses, with P < .05 considered statistically significant.
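For reference, the two responsiveness statistics named above are commonly written as shown below. The text does not state which variant of each was computed (for example, which standard deviation served as the denominator of Cohen d), so these forms are an assumption rather than a description of the analysis actually run. Here $\bar{x}_{\text{pre}}$ and $\bar{x}_{\text{post}}$ denote the mean scores before and after surgery, $s_{\text{pre}}$ the standard deviation of the preoperative scores, and $s_{\Delta}$ the standard deviation of the within-patient change scores.

```latex
\[
d = \frac{\bar{x}_{\text{post}} - \bar{x}_{\text{pre}}}{s_{\text{pre}}},
\qquad
\mathrm{SRM} = \frac{\bar{x}_{\text{post}} - \bar{x}_{\text{pre}}}{s_{\Delta}}
\]
```

Under this sign convention, a negative value indicates that scores fell relative to baseline, and the small, medium, and large benchmarks apply to the absolute value.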
Results
Of the 72 patients with data available for longitudinal analysis, the mean age was 22.1 years (range, 14-44 years), 79% were male, and the mean BMI was 26.6 kg/m² (range, 20-38 kg/m²). Of the 91 patients who had complete data at 2 years postoperatively, the mean age was 24.8 years (range, 17-56 years), 80% were male, and the mean BMI was 26.4 kg/m² (range, 20-42 kg/m²). Demographics and procedures performed can be found in Table 1. The median number of questions answered was 7 for the PROMIS PF CAT (range, 3-12) and 16 for the PROMIS UE. Among the other PRO tools, there were 10 questions for the SF-36 PF, 5 questions for the SF-36 GH, 10 questions for the ASES, 7 questions for the Marx, 5 questions for the EQ-5D, and 21 questions for the WOSI.
Table 1. Demographic Data and Procedures Performed a
a Data are reported as mean (range) unless otherwise indicated. Data were available for the longitudinal group preoperatively and at 6 weeks and 6 months postoperatively and for the cross-sectional group at 2 years postoperatively.
PROMIS Changes After Surgery
The PROMIS UE and PROMIS PF CAT demonstrated a statistically significant change between the 3 time points used in longitudinal analysis: preoperatively, 6 weeks postoperatively, and 6 months postoperatively. Scores initially decreased (PROMIS PF CAT: 51.48 to 45.42, P < .01; PROMIS UE: 44.02 to 35.14, P < .01) at 6 weeks postoperatively and then increased above the preoperative scores at 6 months postoperatively (PROMIS PF CAT: to 57.18, P < .01; PROMIS UE: to 52.15, P < .01). The effect size (Cohen d) was medium to large at each time point (PROMIS UE: –1.05 ± 0.30 from preoperatively to 6 weeks and 1.09 ± 0.33 from preoperatively to 6 months; PROMIS PF CAT: –0.94 ± 0.33 from preoperatively to 6 weeks and 0.76 ± 0.32 from preoperatively to 6 months). Results are summarized in Table 2.
Table 2. Effect Size and SRM of PRO Instruments Between Study Time Points a
a Values were defined as small (absolute value of 0.2), medium (absolute value of 0.5), or large (absolute value of 0.8). ASES, American Shoulder and Elbow Surgeons; EQ-5D, EuroQol–5 Dimensions; PRO, patient-reported outcome; PROMIS PF CAT, Patient-Reported Outcomes Measurement Information System Physical Function computer adaptive test; PROMIS UE, Patient-Reported Outcomes Measurement Information System Upper Extremity; SF-36 PF, 36-Item Short Form Health Survey physical function; SRM, standardized response mean; WOSI, Western Ontario Shoulder Instability Index.
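The calculation behind these responsiveness values can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the choice of the baseline standard deviation as the denominator of Cohen d, the treatment of 0.2, 0.5, and 0.8 as lower cutoffs, the "negligible" label, and the five-patient data set are all assumptions made for the example and are not taken from the study.

```python
from statistics import mean, stdev


def cohen_d(pre, post):
    """Mean change divided by the SD of the baseline scores (one common variant)."""
    return (mean(post) - mean(pre)) / stdev(pre)


def standardized_response_mean(pre, post):
    """Mean within-patient change divided by the SD of the change scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    return mean(diffs) / stdev(diffs)


def magnitude(value):
    """Label |value| with the 0.2 / 0.5 / 0.8 benchmarks, used here as lower cutoffs."""
    size = abs(value)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # label assumed for values below the smallest benchmark


# Hypothetical paired T-scores for five patients (not study data).
pre = [50.0, 48.0, 55.0, 52.0, 45.0]
post = [58.0, 54.0, 60.0, 57.0, 56.0]

d = cohen_d(pre, post)
srm = standardized_response_mean(pre, post)
print(f"Cohen d = {d:.2f} ({magnitude(d)}), SRM = {srm:.2f} ({magnitude(srm)})")
```

Because the label is applied to the absolute value, a drop in scores such as the one seen at 6 weeks is classified by its size regardless of sign.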
Floor and Ceiling Effects
Ceiling effects were demonstrated at 6 months (68.1%) and 2 years (67.0%) postoperatively for the PROMIS UE. There was no ceiling effect for the PROMIS PF CAT at any time point. Ceiling effects were also present at 6 months and 2 years postoperatively for the ASES (23.6% and 39.0%, respectively), SF-36 PF (41.7% and 69.0%, respectively), and EQ-5D (48.6% and 39.0%, respectively). A subgroup analysis of patients aged ≤21 years (n = 38) demonstrated even larger ceiling effects for the PROMIS UE at 6 months (71.1%) and 2 years (81.0%) postoperatively. The PROMIS UE has been expanded to a CAT to include 46 items (PROMIS UE CAT v 2.0). A total of 18 patients completed this updated version of the PROMIS UE at 2 years postoperatively. Analysis of this group still demonstrated a ceiling effect of 44.0%. These results are summarized in Table 3.
Table 3. Ceiling and Floor Effects of PRO Instruments a
a Data are reported as percentage of patients. The presence of a ceiling effect was defined as >15% of patients achieving the maximum possible score. ASES, American Shoulder and Elbow Surgeons; PRO, patient-reported outcome; PROMIS PF CAT, Patient-Reported Outcomes Measurement Information System Physical Function computer adaptive test; PROMIS UE, Patient-Reported Outcomes Measurement Information System Upper Extremity; SF-36 PF, 36-Item Short Form Health Survey physical function; WOSI, Western Ontario Shoulder Instability Index.
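The floor and ceiling rule described here (an effect is present when more than 15% of patients score at an extreme) can be illustrated with the short Python sketch below. The function name, the 0 to 100 score range, and the ten example scores are hypothetical. For a computer adaptive test, the minimum and maximum would be the lowest and highest attainable scores of that instrument, which are not given in the text.

```python
def floor_ceiling(scores, min_score, max_score, threshold=15.0):
    """Return the percentage of patients at each extreme and whether it exceeds the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100.0 * sum(s == min_score for s in scores) / n
    ceiling_pct = 100.0 * sum(s == max_score for s in scores) / n
    return {
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
        "floor_effect": floor_pct > threshold,      # strictly greater than 15%
        "ceiling_effect": ceiling_pct > threshold,
    }


# Hypothetical scores on a 0-100 instrument (not study data).
scores = [100, 100, 95, 100, 80, 100, 70, 100, 90, 100]
result = floor_ceiling(scores, min_score=0, max_score=100)
print(result)
```

In this made-up sample, 6 of 10 patients reach the maximum score, so the ceiling flag is set while the floor flag is not.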
Correlation With Previously Validated Instruments
The PROMIS UE demonstrated the strongest correlation with legacy PRO tools used in shoulder instability, such as the ASES, WOSI, and SF-36 PF. It showed excellent correlation with the ASES at 6 weeks postoperatively (0.73, P < .01) and excellent-good correlation with the WOSI (0.62, P < .01) and SF-36 PF (0.68, P < .01) at 6 weeks postoperatively. It demonstrated good correlation at 6 months and 2 years postoperatively with the ASES (0.54 and 0.58, respectively; P < .01), WOSI (0.48 and 0.58, respectively; P < .01), SF-36 PF (0.56 and 0.54, respectively; P < .01), and EQ-5D (0.46 and 0.55, respectively; P < .01). There was poor correlation at all time points with the Marx (6 weeks: 0.06, P = .61; 6 months: 0.26, P = .03) and SF-36 GH (6 weeks: 0.20, P = .10; 6 months: 0.18, P = .12).
The PROMIS PF CAT showed excellent-good correlation with the SF-36 PF at all time points (6 weeks: 0.61, P < .01; 6 months: 0.68, P < .01; 2 years: 0.64, P < .01). There was excellent-good correlation at 2 years postoperatively with the ASES (0.64, P < .01) and WOSI (0.62, P < .01). Results are summarized in Table 4.
Table 4. Correlation of PROMIS PF CAT and PROMIS UE With Other PRO Instruments a
a There was general convergent validity with other instruments focused on physical function and divergent validity with instruments aimed to measure general health. Correlations were defined as excellent (>0.7, black shading), excellent-good (0.61-0.7, dark gray shading), good (0.4-0.6, light gray shading), or poor (0.2-0.39, no shading). ASES, American Shoulder and Elbow Surgeons; EQ-5D, EuroQol–5 Dimensions; PRO, patient-reported outcome; PROMIS PF CAT, Patient-Reported Outcomes Measurement Information System Physical Function computer adaptive test; PROMIS UE, Patient-Reported Outcomes Measurement Information System Upper Extremity; SF-36 GH, 36-Item Short Form Health Survey general health; SF-36 PF, 36-Item Short Form Health Survey physical function; WOSI, Western Ontario Shoulder Instability Index.
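As an illustration of how a Spearman coefficient is obtained and then mapped onto the descriptive categories used here, the following Python sketch ranks two score lists (averaging tied ranks) and computes the Pearson correlation of the ranks. It is a simplified stand-in for the SAS procedure used in the study. The function names, the rounding to 2 decimals before classification, the use of the absolute value, the "below poor range" label for coefficients under 0.2, and the example scores are assumptions.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks.

    Assumes equal-length inputs that are not constant (no zero-variance guard).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def correlation_label(rho):
    """Map a coefficient, rounded to 2 decimals, onto the categories in the table note."""
    r = round(abs(rho), 2)
    if r > 0.7:
        return "excellent"
    if r >= 0.61:
        return "excellent-good"
    if r >= 0.4:
        return "good"
    if r >= 0.2:
        return "poor"
    return "below poor range"  # label assumed; the table note defines no category here


# Hypothetical paired scores on two instruments (not study data).
promis_scores = [45, 50, 52, 57, 60, 48]
legacy_scores = [60, 72, 70, 88, 95, 72]
rho = spearman(promis_scores, legacy_scores)
print(f"rho = {rho:.2f} ({correlation_label(rho)})")
```

The sketch omits the P value, which in practice would come from the statistical package along with the coefficient.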
Discussion
The present study determined that the PROMIS PF CAT and PROMIS UE are able to adequately detect changes after shoulder stabilization for instability. They both correlated well with shoulder outcome measures that have been previously validated. However, we did find significant ceiling effects for the PROMIS UE. The PROMIS PF CAT did not demonstrate floor or ceiling effects either before or after shoulder stabilization for instability and could be considered for use in this population.
PROs are becoming increasingly important in our health care delivery system and are being incorporated into health care legislation. There is incredible value in identifying PRO tools with a low respondent burden that can detect meaningful change before and after an intervention. The PROMIS was developed by the National Institutes of Health to fulfill these requirements and has been shown to correspond with other previously validated PRO tools for patients with shoulder instability preoperatively; yet, the PROMIS UE was found to have ceiling effects in patients aged ≤21 years. 3 Few reports in the literature have studied the ability of the PROMIS to detect change before and after surgical interventions, and no studies have investigated this in patients with shoulder instability.
This study aimed to examine the internal and external responsiveness of 2 PROMIS tools after operative interventions. Internal responsiveness describes the ability of a PRO instrument to detect change from before to after treatment. 15 The PROMIS PF CAT and PROMIS UE both demonstrated a statistically significant change in scores preoperatively compared with 6 weeks and 6 months postoperatively. The PROMIS PF CAT showed a medium to large effect size at 6 weeks and 6 months postoperatively compared with preoperatively, and the PROMIS UE demonstrated a large effect size at both time points. These effect sizes were comparable with or larger than those of the legacy PRO tools evaluated in this study.
External responsiveness can be assessed by examining the correlation between a PRO tool and previously validated PRO instruments that measure a similar health domain. In the current study, we examined the correlation between the PROMIS PF CAT and PROMIS UE and legacy PRO tools used to analyze shoulder function (WOSI, ASES, and Marx) as well as general health and function (SF-36 PF, SF-36 GH, and EQ-5D). The PROMIS UE generally demonstrated convergent validity with the other measures of shoulder function (ASES, WOSI), while the PROMIS PF CAT showed the most robust correlation with the SF-36 PF at all postoperative time points. Therefore, the PROMIS PF CAT could be administered in place of the SF-36 PF and possibly the ASES and WOSI. There was generally divergent validity between the PROMIS PF CAT and PROMIS UE with other tools that measure other health domains, such as general health and quality of life. This suggests that co-administration of these other tools may need to be continued when investigating these other health domains.
Ceiling and floor effects represent the percentage of patients who achieve the highest or lowest possible scores of a particular PRO instrument. Ideally, a PRO tool would have few patients achieving the extremes to stratify patients appropriately. Our study demonstrated a large ceiling effect with the PROMIS UE, especially at 6 months and 2 years postoperatively, which is not unexpected if a patient’s pain and function improve after surgery. This effect was more dramatic in patients aged ≤21 years, which was also detected in preoperative patients. 3 Other studies have also suggested that the PROMIS UE seems to focus on lower levels of function, which is problematic when trying to evaluate young and healthy populations. 9 Such ceiling effects were not present in other studies using the PROMIS UE, including patients with rotator cuff tears 2,16 or those undergoing shoulder arthroplasty, 6 who represent older and likely less functional populations. The PROMIS UE has subsequently been expanded to a CAT to include 46 items (PROMIS UE CAT v 2.0). We had a subgroup of patients (n = 18) complete this version at 2 years postoperatively, and although this reduced the percentage of patients reaching the ceiling, 44.0% of this group still achieved the maximum score. This means that the PROMIS UE and PROMIS UE CAT v 2.0 will not be able to detect higher levels of function in these patients and would need to be revised, or other PRO tools such as the Kerlan-Jobe Orthopaedic Clinic overhead athlete shoulder and elbow scale 1 should be utilized in conjunction with them.
There are several limitations to this study. First, we had incomplete longitudinal data among the patients who had outcome scores at 2 years postoperatively, necessitating cross-sectional analysis at this time point. Therefore, although we were able to measure the correlation of the PROMIS with legacy PRO instruments as well as ceiling and floor effects at 2 years postoperatively, we could not assess responsiveness to change at this time point. In addition, the longitudinal group was only evaluated to 6 months in the postoperative period, which is near the time to return to full activity. Therefore, conclusions can only be drawn about the correlation of the PROMIS with other PRO instruments at this time point and beyond. Second, all patients were required to complete multiple PRO tools at each visit, and questionnaire fatigue may have had an influence on our results. However, in a study conducted on these patients preoperatively, it was found that the order in which the surveys were filled out had no significant influence on the results. 3 In the current study, approximately half the patients filled out PROMIS questions first, and half filled them out last. Third, the population of patients was heterogeneous in terms of the type of shoulder stabilization procedure performed because our prospective database, from which the data were extracted, includes patients based on diagnosis (shoulder instability) rather than procedure. The procedures investigated in the current study were heavily skewed toward arthroscopic stabilization procedures. Although we were able to perform a small subgroup analysis on patients who had used the expanded PROMIS UE CAT v 2.0, further studies are warranted to assess this updated tool. Finally, this study did not address one of the most important outcomes in shoulder instability: recurrence. An additional study is warranted to assess the correlation of the PROMIS and other PRO tools in evaluating outcomes in patients with and without recurrent shoulder instability.
Conclusion
The PROMIS PF CAT and PROMIS UE correlated well with previously validated tools in patients with shoulder instability. Both tools demonstrated the ability to detect meaningful change before and after operative interventions for shoulder instability. The PROMIS UE demonstrated ceiling effects at all time points, but no ceiling effect was found for the PROMIS PF CAT. The PROMIS PF CAT can be considered in place of the SF-36 PF. The PROMIS UE or PROMIS UE CAT v 2.0 may not have the ability to stratify higher levels of function, especially in patients aged ≤21 years, and should not be utilized at this time.
Footnotes
One or more of the authors has declared the following potential conflict of interest or source of funding: R.W.W. has received educational support from Arthrex and Smith & Nephew and hospitality payments from Medical Device Business Services. M.B. has received nonconsulting payments from Arthrex. B.R.W. has received research support from OREF, educational support from Arthrex and Smith & Nephew, and consulting payments from Linvatec. C.H. has received research support from Zimmer Biomet, nonconsulting payments from Pacira Pharmaceuticals, and hospitality payments from Arthrex and Tornier. AOSSM checks author disclosures against the Open Payments Database (OPD). AOSSM has not conducted an independent investigation on the OPD and disclaims any liability or responsibility relating thereto.
Ethical approval for this study was obtained from the University of Iowa Institutional Review Board (ID: 201201714).
