Abstract
Background:
There has not yet been a pictorial version of a patient-reported outcome measure for shoulder pain.
Purpose:
To translate the English version of the Oxford Shoulder Score (OSS) to a simplified Chinese version (SC-OSS) and to validate a new face-scale version of the OSS (FS-OSS), while investigating cross-cultural adaptation, validation, and reproducibility of both versions in patients with shoulder pain.
Study Design:
Cohort study (diagnosis); Level of evidence, 2.
Methods:
The translation and cross-cultural adaptation of the SC-OSS was performed using a forward-backward translation method. The FS-OSS was developed on the basis of the SC-OSS, using the Wong-Baker FACES Pain Rating Scale for reference. Participants were asked to complete the SC-OSS, FS-OSS, Simple Shoulder Test (SST), Constant-Murley score (CMS), and 36-Item Short Form Health Survey (SF-36). Validation and reproducibility were tested by calculating Cronbach α values for internal consistency as well as by intraclass correlation coefficients. The time needed to complete each questionnaire was used to test cross-cultural adaptation.
Results:
A total of 312 respondents participated in the research and completed all outcome measures. The internal consistency was strong, with a Cronbach α of .94 and .91 for the FS-OSS and SC-OSS, respectively. High intraclass correlation coefficient values for the FS-OSS score (0.95) and SC-OSS (0.92) were obtained, which indicated excellent test-retest reliability. The Pearson correlation coefficients of the SC-OSS and FS-OSS with the SST (
Conclusion:
The FS-OSS and SC-OSS were validated as reliable instruments for patients with shoulder pain. For Chinese patients, the face-scale version was easier to understand than the cross-cultural text version.
Shoulder pain is the third most common musculoskeletal problem encountered in orthopaedic practice after low back pain and neck pain. 25 The Oxford Shoulder Score (OSS) is an internationally recognized instrument for assessing pain perception and quality of life in patients with shoulder pain and for evaluating the effectiveness of different treatments. 23,24 Since its original English version was published in 1996, 6 it has been translated into many languages and adapted to different cultural settings to allow international comparison. 2,3,8-12,17,21,22,29-32 In 2015, a simplified Chinese version of the OSS (SC-OSS) was developed by Xu et al. 39
Even so, high illiteracy and low income in the developing countries may limit the applicability of the OSS. Facial expression drawings (face scales) are a popular method of assessing pain severity in pediatric populations; they are easier to understand and more suitable for children and illiterate people. The Wong-Baker FACES Pain Rating Scale (Wong-Baker scale) 38 is one of several scales that have been used for pain assessment in multiple pediatric settings. 4,5
To evaluate shoulder function in Chinese patients, we translated the English version of the OSS to a simplified Chinese version (SC-OSS) by partially referencing the Xu et al version, 39 and we also developed a face-scale version of the OSS (FS-OSS). The purpose of this study was to validate the SC-OSS and FS-OSS by assessing the cross-cultural adaptation, validity, and reproducibility of these 2 versions in Chinese patients with shoulder pain. We hypothesized that both versions of the OSS could overcome language and literacy barriers.
Methods
Translation and Cultural Adaptation Procedure
All study participants signed informed consent forms before inclusion, and the clinical research ethics committee of our hospital approved the study protocol. The translation and cross-cultural adaptation of the SC-OSS (Appendix 1) was performed according to the guidelines reported by Beaton et al, 1 using a “forward-backward translation” method. First, the original OSS (Table 1) was translated by an expert committee that consisted of 1 orthopaedic surgeon (J.J.G.), 1 rehabilitation physician, 1 physical therapist, and 1 language expert, and an initial Chinese-version OSS was created. In this process, some of the translation referenced a previous study. 39 Next, a mediation session between the expert committee and our research team was held to obtain a modified Chinese version. Backward translation was then performed by a native English speaker and the research team, and the original version and the modified Chinese version were compared.
In the cognitive debriefing step, a cohort of 6 patients with shoulder pain (proficient in Mandarin Chinese) was asked to test the initial version, after which the SC-OSS was finalized. The FS-OSS was then created on the basis of the SC-OSS. A flowchart of the translation and testing procedures is shown in Figure 1.

Figure 1. Flowchart of the translation and cultural adaptation process of the OSS from English into SC-OSS and face-scale version. *Expert committee consisting of 1 orthopaedic surgeon, 1 rehabilitation physician, 1 physical therapist, and 1 language expert. OSS, Oxford Shoulder Score; SC-OSS, simplified Chinese version of OSS.
Participants
The finalized SC-OSS and FS-OSS were administered to a sample of 312 consecutive patients affected by shoulder pain who visited the outpatient clinic of The First Affiliated Hospital of Soochow University and West China Hospital, Sichuan University from April 2016 to December 2017. Patients with glenohumeral instability and fracture of the shoulder were excluded. Patient characteristics, including age, age groups, sex, affected side, duration of symptoms, education, and clinical diagnosis, were collected during the first visit to our outpatient clinic.
Psychometric Assessments
All participants were asked to complete the SC-OSS, FS-OSS, Simple Shoulder Test (SST), Constant-Murley score (CMS), and 36-Item Short Form Health Survey (SF-36) during their first visit. Within an interval of 5 to 7 days after the first visit, they completed the SC-OSS and FS-OSS for a second time to evaluate test-retest reliability. The time required to complete the questionnaires and any difficulty encountered in answering a question were recorded. For illiterate or low-literate participants, researchers were available to help with the questionnaires if needed; these situations were also recorded to assess the ability of communication.
OSS and SC-OSS
The OSS is a valid and reliable questionnaire for shoulder injuries (excluding instability) and comprises 12 items on pain and disability (see Table 1). Each question has 5 possible responses with scores from 0 (worst pain and maximal limitation) to 4 (no pain and no functional limitation), for a total possible score of 48. 39 The SC-OSS as developed for this study is shown in Appendix 1.
FS-OSS Scoring
After the SC-OSS was developed, a pictorial version was created using the Wong-Baker scale 38 as a reference (Moola et al, unpublished data, [2020]). The questions, possible responses, item scores, and total score were modeled on the OSS, with corresponding pictures placed after each question. Each question had a scale of 5 facial options from no pain (4 points) to worst pain (0 points). Lower scores indicate higher levels of pain and disability. The FS-OSS is shown in Chinese and English in Appendices 2 and 3.
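The scoring rule described for the OSS and its face-scale version (12 items, each scored 0 to 4, summed to a total of 0 to 48) can be sketched as follows. This is only an illustration; the function name `oss_total` and the example responses are hypothetical and do not come from the study.

```python
from typing import Sequence

N_ITEMS = 12                 # the OSS has 12 items
MIN_ITEM, MAX_ITEM = 0, 4    # 0 = worst pain/limitation, 4 = no pain/limitation


def oss_total(item_scores: Sequence[int]) -> int:
    """Sum 12 item scores (each 0-4) into a total between 0 and 48."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not MIN_ITEM <= score <= MAX_ITEM:
            raise ValueError(f"item score {score} is outside {MIN_ITEM}-{MAX_ITEM}")
    return sum(item_scores)


# Hypothetical responses from one patient (not study data).
example = [2, 3, 1, 4, 2, 2, 3, 0, 1, 2, 3, 1]
print(oss_total(example))  # 24
```

A lower total corresponds to more pain and disability, so a patient choosing the "worst" face on every item would score 0.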
Simple Shoulder Test
The SST is an internationally used, simple self-report questionnaire, which was developed for measuring functional limitations of the affected shoulder in patients with shoulder dysfunction. 18 It consists of 12 questions (yes = 1/no = 0). The items of the SST are about function-related pain (2 items), function/strength (7 items), and range of motion (3 items). The total scores range from 0 (worst function) to 12 (best function).
Constant-Murley Score
The CMS is widely used for evaluating the outcomes for the treatment of shoulder disorders and measures pain perception, functional assessment, range of motion, and strength. The Chinese version of the CMS has been found to have excellent validity and reliability. 20
36-Item Short Form Health Survey
The SF-36 is a generic questionnaire for assessing the health status of patients, and it consists of 8 domains: physical function (PF), bodily pain (BP), general health (GH), vitality (VT), social function (SF), role-physical (RP), role-emotional (RE), and mental health (MH). 36 Scores for each dimension range from 0 (poor health) to 100 (good health). The SF-36 has also been translated and culturally adapted into Chinese. 16
Statistical Analysis
All statistical analyses were performed using SPSS Version 26 (IBM). Descriptive statistics of means, standard deviations, proportions, and percentages were used to describe the baseline characteristics of the participants. The internal consistency was determined by calculating Cronbach α. The intraclass correlation coefficient (ICC) was used to calculate test-retest reliability. The standard error of measurement (SEM) and the minimal detectable change (MDC) were calculated to determine the measurement errors. The Pearson correlation coefficient (
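For readers who want to see what the two reliability statistics compute, a minimal pure-Python sketch is given here. The study itself used SPSS, and the text does not state which ICC model was chosen; the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed purely for illustration. The function names and the small data sets in the usage lines are hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach alpha; rows holds one list of k item scores per respondent."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_2_1(test, retest):
    """ICC(2,1) for two sessions (assumed model, see lead-in)."""
    n, k = len(test), 2
    data = list(zip(test, retest))
    grand = mean(list(test) + list(retest))
    row_means = [mean(r) for r in data]
    col_means = [mean(test), mean(retest)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((x - grand) ** 2 for r in data for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical data: 3 respondents answering 3 perfectly consistent items.
print(cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]]))
# Hypothetical totals where the retest is shifted upward by 2 points.
print(icc_2_1([10, 20, 30], [12, 22, 32]))
```

Because the assumed ICC form measures absolute agreement, a systematic shift between test and retest lowers the coefficient even when the ranking of patients is unchanged.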
Results
Translations and Cross-Cultural Adaptation
After the forward and backward translation process, certain cross-cultural differences were noted and resolved. Chinese people usually eat with chopsticks and spoons; therefore, “knife and fork” was replaced in question 4. “Bowl” was substituted for “tray” for a similar reason. In addition, considering the education level of the patients, the translators used simple words wherever possible for clarity.
There were no major problems in the graphical translation. Some discrepancies were caused by cultural differences and daily habits. For example, in question 1, “worst pain” in the OSS was interpreted by graphics showing pain during typical Chinese competitive sports, such as badminton and swimming. In question 3 of the OSS, “car” and “public transport” have abundant meanings in Chinese. The illustrator used the most representative elements, a Chinese bus and a handhold, which are familiar images for older patients. The term “usual work” in OSS question 11 was represented by illustrations of office work, hoeing, and housekeeping being performed, which were indicative of activities associated with omalgia in the population as a whole.
Participant Characteristics
All 312 participants completed the questionnaires. The average age was 52.2 ± 10.8 years (range, 18-75 years). Educational qualification was divided into 4 levels: tertiary education (43; 14%), secondary school (156; 50%), primary school (91; 29%), and illiterate (22; 7%). The demographic characteristics of the patients are summarized in Table 2.
Table 2. Participant Characteristics
Ceiling and Floor Effects
There were no missing data for individual items in either the SC-OSS or the FS-OSS. No patient scored the minimum or maximum possible score; therefore, no ceiling or floor effects were observed.
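A check of this kind can be sketched as the proportion of respondents at the lowest and highest possible totals. The 15% cut-off used here is a commonly cited convention and is an assumption, not a threshold stated in this study; the function name and the sample totals are likewise hypothetical.

```python
def floor_ceiling(totals, min_score=0, max_score=48):
    """Return the fractions of respondents at the minimum and maximum total."""
    n = len(totals)
    floor = sum(1 for t in totals if t == min_score) / n
    ceiling = sum(1 for t in totals if t == max_score) / n
    return floor, ceiling


THRESHOLD = 0.15  # assumed convention, not taken from the study

# Hypothetical totals (not study data); none sits at 0 or 48.
totals = [12, 24, 31, 18, 40, 27, 9, 35]
floor, ceiling = floor_ceiling(totals)
print(floor > THRESHOLD, ceiling > THRESHOLD)  # False False
```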
Reliability
The average FS-OSS score was 24.23 ± 9.11 initially and 24.04 ± 8.75 in the retest taken 5 to 7 days later, with median (interquartile range) values of 24 (18-32) and 24 (17-32), respectively. The average initial SC-OSS score was 25.56 ± 9.98, and it was 24.78 ± 9.51 during retesting, with median (interquartile range) values of 25 (16-33) and 24 (16-32), respectively. The Cronbach α was .91 and the ICC was 0.92 (95% CI, 0.86-0.95) for the SC-OSS, indicating high internal consistency and excellent test-retest reliability. The Cronbach α was .94 and the ICC was 0.95 (95% CI, 0.91-0.98) for the FS-OSS, which indicated better internal consistency and test-retest reliability than the SC-OSS. The measurement errors expressed as SEM were 2 (8%) and 3 (11%) for the FS-OSS and SC-OSS, respectively, while the MDC values were 6 (23%) and 8 (31%), respectively (Table 3). Table 4 shows the Cronbach α distributions.
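The formulas behind the SEM and MDC are not given in the text. Assuming the conventional definitions, SEM = SD × √(1 − ICC) and MDC at the 95% level = 1.96 × √2 × SEM, the baseline means, standard deviations, and ICCs reported above give values that round to the reported figures. The sketch below is a worked check under that assumption, not a description of the SPSS procedure actually used.

```python
import math


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement (assumed formula: SD * sqrt(1 - ICC))."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem_value: float) -> float:
    """Minimal detectable change at 95% confidence (assumed formula)."""
    return 1.96 * math.sqrt(2) * sem_value


# Baseline mean, SD, and ICC as reported in the Reliability paragraph.
reported = {
    "FS-OSS": {"mean": 24.23, "sd": 9.11, "icc": 0.95},
    "SC-OSS": {"mean": 25.56, "sd": 9.98, "icc": 0.92},
}

for name, v in reported.items():
    s = sem(v["sd"], v["icc"])
    m = mdc95(s)
    # Express each value in points and as a percentage of the baseline mean.
    print(name, round(s), round(100 * s / v["mean"]),
          round(m), round(100 * m / v["mean"]))
# FS-OSS 2 8 6 23
# SC-OSS 3 11 8 31
```

Under these assumed formulas, a change smaller than about 6 points on the FS-OSS or 8 points on the SC-OSS cannot be distinguished from measurement error in an individual patient.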
Table 3. Descriptive Analysis of the FS-OSS and SC-OSS
Table 4. Internal Consistency of the SC-OSS and FS-OSS
Construct Validity
Table 5 shows the correlation among the SC-OSS, FS-OSS, SST, CMS, and each domain of the SF-36 to assess the construct validity. The Pearson
The FS-OSS was marginally better than the SC-OSS, with very good correlation between it and the SST (
Table 5. Construct Validity of the SC-OSS and FS-OSS
Time to Completion
Table 6 shows the time needed to complete the FS-OSS, SC-OSS, and SST, used as a surrogate marker of comprehension. Patients with a higher education level needed a shorter time to complete the questionnaires. The mean time needed to complete the FS-OSS (182 ± 107 seconds) was notably less than that needed for the SC-OSS (222 ± 180 seconds) and the SST (306 ± 307 seconds). Figure 2 presents comparisons of the time spent completing the FS-OSS and SC-OSS by education level. Illiterate and primary school groups needed significantly less time to complete the FS-OSS compared with the SC-OSS (
Table 6. Time Needed to Complete the Scores

Figure 2. Comparison of the time needed to complete the score of the simplified Chinese version and the face-scale version of the OSS. *
Discussion
The present findings are generally consistent with our hypotheses and suggest that FS-OSS and SC-OSS are valid and reliable instruments for patients with shoulder pain in China. The present findings also suggest that the FS-OSS is easier to understand.
Various questionnaires evaluate shoulder symptoms, such as the Western Ontario Osteoarthritis of the Shoulder Index, 19 the Shoulder Pain and Disability Index, 28 the OSS, and the Western Ontario Rotator Cuff Index. 14,15 The OSS was originally created by Dawson et al in 1996 6 to evaluate patients with chronic shoulder complaints, and it has been proven to be a valid and reliable instrument that is well accepted and easily completed by patients. It can be used to evaluate most types of shoulder pain, including rotator cuff tears 7 and frozen shoulder. 26 Table 7 summarizes the cross-cultural adaptation versions of the OSS.
Table 7. Description of Cross-Cultural Adaptation Versions for the OSS
Translation, cross-cultural adaptation, and validation are necessary for direct international comparisons and cultural equivalence. Currently, the Western Ontario Rotator Cuff Index, 34 the Western Ontario Osteoarthritis of the Shoulder Index, 13 the Rotator Cuff Quality of Life Index, 35 and the Shoulder Pain and Disability Index 33 are available for application in Chinese patients, as well as the simplified Chinese version of the OSS developed in 2015. 39 Unevenly distributed wealth and educational resources as well as high illiteracy rates are prevailing challenges in most developing countries, and simplified patient-reported outcome measures are needed to overcome some of the resulting cultural and language barriers. In the current study, the SC-OSS and FS-OSS achieved conceptual, semantic, idiomatic, and experiential equivalence during the translation and cross-cultural adaptation process. The short time to complete the newly developed questionnaires suggests they were acceptable to the participants and easily understood.
In terms of reliability, the ICC values of the SC-OSS (0.92) and FS-OSS (0.95) assessed at 5 to 7 days were high and indicated excellent test-retest reliability. Our version of the SC-OSS had a comparable ICC value at 5 to 7 days when compared with the previously published version 39 (0.97 at 3-5 days). The ICC value for the SC-OSS was higher than those of the Norwegian (0.83) 8 and French (0.91) 32 versions and equal to those of the Portuguese (0.92) 10 and Brazilian (0.92) 17 versions. Regarding the FS-OSS, the ICC value was higher than those of the SC-OSS and Persian (0.93) 22 versions and comparable with those obtained for the Korean (0.95) 29 and Romanian (0.95) 11 versions. The test-retest interval (5-7 days) of our study was longer than that of most of the other studies and marginally shorter than those of the versions validated for Brazil (7-15 days) 17 and Poland (7-14 days). 2 The SC-OSS also showed good internal consistency, with a Cronbach α of .70 to .95. This was higher than the Norwegian (0.87) 8 and Portuguese (0.90) 10 versions and comparable with the Korean (0.91) 9 version. For the FS-OSS, the Cronbach α was equal to that of the German (0.94) version 12 and higher than those of most versions of the OSS.
The SST, CMS, and SF-36 subscales demonstrated satisfactory construct validity of the SC-OSS and FS-OSS. The SC-OSS showed fair correlation with the GH, RE, and MH subscales of the SF-36 and good correlation with the remaining SF-36 subscales, the SST, and the CMS. Only fair correlation between the FS-OSS and the SF-36 RE subscale was found. According to the results, the SC-OSS and FS-OSS had good convergent and divergent validity.
Pictorial scores have been used in obstetrics and gynecology, 27 as well as in health-related quality of life instruments. 37 We used the time needed for participants to complete our instruments as a surrogate marker of comprehension. The results suggest that the pictorial version (FS-OSS) was more easily understood than SC-OSS and SST, especially for low-literacy participants, and might therefore be more applicable for countries and regions with high illiteracy rates.
Limitations
Several limitations exist in the present study. First, all participants included in our study were patients who visited our hospitals and may not be fully representative of the population of our country. Second, the responsiveness of the SC-OSS and FS-OSS was not evaluated accurately, as we used the time needed to complete the instruments as a surrogate marker of responsiveness and understanding. Further research and more accurate assessment are still needed. Third, although we divided the participants into 4 groups according to the level of education, the sample size was too small to compare literate versus illiterate patients; this parameter was included mainly to describe the population adequately. Further studies should focus on the comparison of reliability between literate and illiterate populations with a larger, adequately powered sample. In addition, we did not have a standardized way to introduce the image-only scoring system, and all illiterate patients needed explanation. Therefore, investigators may have influenced how illiterate patients understood the questionnaire, which may have introduced bias.
Conclusion
The FS-OSS and SC-OSS were validated as reliable instruments for patients with shoulder pain. For Chinese patients, the face-scale version was easier to understand than the cross-cultural text version.
Footnotes
Acknowledgment
The authors thank Mr An Liu, medical student at Soochow University School of Medicine, for collecting and analyzing the data for this study. They also acknowledge all the translators and expert committee members (Professor Ling Qin from The Chinese University of Hong Kong, Professor Tiansi Tang, Professor Huilin Yang, and Mr Ju-Sheng Shieh) who assisted with the translation of the OSS into Chinese, the patients who participated in the study, and the illustrators who assisted in the process of making the pictorial version.
Final revision submitted January 27, 2021; accepted February 24, 2021.
The authors declared that there are no conflicts of interest in the authorship and publication of this contribution. AOSSM checks author disclosures against the Open Payments Database (OPD). AOSSM has not conducted an independent investigation on the OPD and disclaims any liability or responsibility relating thereto.
Ethical approval for this study was obtained from The First Affiliated Hospital of Soochow University.
