Abstract
Study Design
Systematic review of cross-cultural adaptations.
Objectives
The Spine Oncology Study Group Outcomes Questionnaire 2.0 (SOSGOQ 2.0) is widely used to assess the health-related quality of life (HRQOL) of patients with spinal metastasis. Because the methodological quality of its adaptations has not been assessed, using the questionnaire in routine practice is challenging. This study aims to comprehensively evaluate the translation procedures and measurement properties of SOSGOQ 2.0 adaptations according to the COSMIN guidelines.
Methods
The literature was reviewed in accordance with the PRISMA guidelines. The translation process and cross-cultural adaptation methods of each study were classified according to the Guidelines for the Process of Cross-Cultural Adaptation of Self-Report Measures, and the methodological quality of the included studies was evaluated according to the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN).
Results
Six publications met the inclusion criteria. Regarding translation procedures and cross-cultural adaptation, two adaptations did not report detailed information on the translation and adaptation steps (synthesis, back translation, expert committee review, pretest); factor analysis and sample size calculation were mentioned in only two studies; and only one adaptation met the minimum sample size standard. Regarding the methodological quality of the measurement properties, all adaptations assessed internal consistency, structural validity and reliability. However, none of the adaptations reported measurement error, and only one reported responsiveness.
Conclusions
We found that the methodological quality of the current adaptations was uneven and that the reporting of measurement properties was not comprehensive. Among the available versions, we recommend the German, Italian and Chinese adaptations as being of higher quality.
Introduction
Spinal metastases (SM) affect 30% to 70% of cancer patients.1 With more precise radiotherapy techniques, newer anticancer agents, and various targeted drugs, the survival of patients with SM has been prolonged. However, many patients have a reduced quality of life because of skeletal-related events. The Short Form 36 (SF-36), the EuroQol 5-Dimensions 5-Level (EQ-5D-5L) questionnaire, the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire C30 (EORTC QLQ-C30) and the Eastern Cooperative Oncology Group (ECOG) performance status are valid instruments for assessing the health-related quality of life (HRQOL) of cancer patients. However, none of them is an HRQOL questionnaire designed specifically for patients with SM.
In 2010, a novel HRQOL questionnaire, the Spine Oncology Study Group Outcomes Questionnaire (SOSGOQ), was developed by the Spine Oncology Study Group and exhibited satisfactory internal consistency and coverage.2,3 It comprised 27 questions with 5 answer options each, divided into 6 domains: physical function (4 items), neurological function (4 items), pain (4 items), mental health (4 items), social function (4 items), and post-therapy questions (7 items). Subsequently, Versteeg et al further validated the first version of the SOSGOQ and introduced its revision (SOSGOQ 2.0).4 SOSGOQ 2.0 consists of 5 domains (physical function, neurological function, pain, mental health, and social function) and an additional set of post-treatment questions for follow-up evaluation.
The SOSGOQ 2.0 has been translated and cross-culturally adapted into Italian, German, Thai, Simplified Chinese and Dutch.5–9 Because of cultural differences, a direct translation of the original questionnaire often cannot guarantee similar measurement properties, and a rough translation may introduce bias that undermines the validity of the adapted version. Consequently, methodological rigor is important in cross-cultural comparisons: the process of cross-cultural adaptation should be followed, and the measurement properties of the adapted version should be assessed. In 2005, the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) initiative began developing practical tools to improve the selection of outcome measurement instruments in research and clinical practice. It is necessary to bring these psychometric tools closer to this field in order to evaluate measurement instruments in a clinical context.10,11 The purpose of this systematic review was to comprehensively assess the translation procedures and measurement properties of the cross-cultural adaptations of SOSGOQ 2.0.
Materials and Methods
Study Selection and Eligibility Criteria
The literature was systematically reviewed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.12 To identify cross-cultural adaptations of the SOSGOQ 2.0 into non-English languages/cultures, PubMed, Embase, SinoMed, Web of Science, Google Scholar and Scopus were searched from their inception to September 2022, without language restrictions. The search terms were (“The Spine Oncology Study Group Outcomes Questionnaire” OR “SOSGOQ”) AND (“cross-cultural” OR “valid” OR “equivalence” OR “transl” OR “adaptation” OR “version” OR “cultur”).
Studies were included if they (1) concerned the cross-cultural adaptation of the SOSGOQ 2.0 into a specific language/culture; (2) reported the process of cross-cultural adaptation; (3) reported the evaluation of at least one measurement property; and (4) were published as a manuscript in a peer-reviewed journal. Studies for which the full text or the detailed translation process of the cross-cultural adaptation could not be obtained were excluded. Two reviewers independently screened the relevant abstracts and full texts against the inclusion and exclusion criteria and then selected the eligible studies. Disagreements were resolved by consensus with the third author.
Data Extraction of Eligible Studies
Two reviewers independently extracted the relevant characteristics of each study, including the language and population, the sample size, whether a sample size calculation was performed, whether the original author approved the adaptation, whether patients were recruited consecutively, the test-retest interval, and the type of patients. Any disagreement between the two reviewers was resolved through consultation with the third author.
Methodological Quality Assessment of the Translation Process and Cross-Cultural Adaptation
The translation process and cross-cultural adaptation methods of each study were classified according to the Guidelines for the Process of Cross-Cultural Adaptation of Self-Report Measures. The complete process of cross-cultural adaptation includes initial translation, synthesis of the translations, back translation, expert committee review, a test of the prefinal adaptation, and appraisal of the adaptation process. Any disagreement between the two reviewers was resolved through consultation with the third author.
Methodological Quality Assessment of the Measurement Properties
Following the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) guideline, we classified the measurement properties as content validity, internal structure (structural validity and internal consistency), reliability, measurement error, construct validity and responsiveness. The methodological quality assessment was conducted by two reviewers. Any disagreement between the two reviewers was resolved through consultation with the third author.
Results
Description of Cross-Cultural Adaptation of the SOSGOQ 2.0
A total of 63 publications were identified by our search. We excluded 38 duplicates and 19 irrelevant studies. Ultimately, 6 publications met the inclusion criteria of this review (Figure 1). Five cross-cultural adaptations of the SOSGOQ 2.0 for 5 different languages/cultures were included: the Chinese, German, Italian, Thai and Dutch adaptations. In addition, 1 original study was included (Table 1).
Figure 1. Flow chart.
Table 1. Characteristics of Included Studies.
The sample size for validity testing ranged from 68 to 238. Only the original study and the Italian and Dutch adaptations reported a sample size calculation and met the guideline criterion for the required number of patients (more than 140, and at least 7 times the number of items). None of the adaptations reported whether patients were recruited consecutively. In the original study, the measurement properties were tested in patients who had a diagnosis of spinal metastatic disease and had undergone surgery and/or radiotherapy. Likewise, the Chinese, German, Italian, Thai and Dutch adaptations recruited patients with spinal malignancy. Only the Thai, Italian and German adaptations were translated with the approval of the original author.
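The sample size criterion quoted above (more than 140 patients, and at least 7 patients per item) can be expressed as a simple check. This is a minimal sketch: the function name `meets_sample_size`, its default values, and the item count of 20 used in the example are illustrative assumptions rather than values taken from the included studies.

```python
def meets_sample_size(n_patients, n_items, per_item=7, minimum=140):
    """Check the rule quoted in the text: n > minimum and n >= per_item * n_items.

    per_item and minimum mirror the figures quoted in this review; adjust them
    if a different guideline threshold applies.
    """
    return n_patients > minimum and n_patients >= per_item * n_items


# Smallest and largest validity samples reported in the review (68 and 238),
# checked against a hypothetical 20-item questionnaire (assumed item count).
for n in (68, 238):
    print(n, meets_sample_size(n, n_items=20))
```

Keeping both thresholds as parameters makes it easy to rerun the check if a different guideline version is applied.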
Methodological Quality Assessment of the Translation Process and Cross-Cultural Adaptation
Cross-Cultural Adaptation of Self-Report Measures.
Note: +, positive rating; −, negative rating; 0, there was no information.
In the “translation” stage, the Chinese, Thai and Italian adaptations each used 2 translators, one of whom had no medical background. The Chinese and Thai adaptations used bilingual translators (2 in the Chinese adaptation and 1 in the Thai adaptation), whereas the Italian adaptation used 2 native Italian speakers. The Dutch adaptation was translated by 2 native Dutch-speaking researchers. In the “backward translation” stage, the Chinese adaptation was back-translated by a new bilingual professional translator and a spinal tumor expert, the Italian adaptation by 2 new translators (both native English speakers without a background in the subject), the Dutch adaptation by 2 native English speakers, and the Thai adaptation by only 1 native English-speaking professional translator.
The Chinese, Italian and Dutch adaptations used expert committees, but the German and Thai adaptations did not. A methodologist was involved only in the Italian and Dutch adaptations. Although the Chinese adaptation had an expert committee, its composition was unclear. Except for the Italian and Dutch adaptations, none had a sufficient sample size. Only the Chinese, Italian and Dutch adaptations provided information about the prefinal adaptation.
As the final stage of the adaptation process, a pretest was conducted for every adaptation except the Thai one. Details of the pretest stage were provided for the Chinese, Italian, German and Dutch adaptations: the pretest sample size was 30 for the Chinese and Italian adaptations, 15 for the Dutch adaptation, and 20 for the German adaptation. According to the guidelines, the sample sizes of the Dutch and German adaptations did not meet the requirement.13
Methodological Quality Assessment of the Measurement Properties
The Consensus-Based Standards for the Selection of Health Status Measurement Instruments (COSMIN) Checklist for Cross-Cultural Validity.
Note: +, positive rating; −, negative rating; 0, there was no information.
The Summary of the Measurement Properties of Cross-Cultural SOSGOQ 2.0 Adaptation.
Internal consistency was evaluated in all the adaptations using Cronbach’s α coefficient. The German, Italian and Dutch adaptations did not fully meet the criterion for internal consistency (Cronbach’s α should lie between .70 and .95). Cronbach’s α was less than .7 for the German adaptation in D4 and D5, the Italian adaptation in D5, and the Dutch adaptation in D5, which indicates unacceptable reliability.
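For readers unfamiliar with the statistic, the standard Cronbach’s α formula and the .70 to .95 acceptance range cited above can be sketched as follows. The function names and the toy scores are illustrative assumptions; none of the numbers come from the included studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for one domain.

    scores: list of respondents, each a list of item scores (same length).
    Uses alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
    Assumes at least two items, two respondents, and a non-constant total.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_in_range(alpha, low=0.70, high=0.95):
    """Acceptance range quoted in the text (.70 to .95)."""
    return low <= alpha <= high


# Toy data: 3 respondents answering a 2-item domain (made-up values).
toy = [[1, 2], [2, 1], [3, 3]]
a = cronbach_alpha(toy)
print(round(a, 3), alpha_in_range(a))
```

The upper bound matters as well as the lower one: values above .95 are usually read as a sign of redundant items rather than of better reliability.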
The original study and the Chinese, Italian, Thai and Dutch adaptations reported concurrent validity using Spearman correlation coefficients, whereas the German adaptation used the Bland-Altman method. The original version was compared with the SF-36 and, in the pain domain, the NRS. The Chinese adaptation was compared with the EQ-5D-5L and SF-36, the Italian adaptation with the SF-36 and ECOG, the Thai adaptation with the EQ-5D-5L and the EQ-VAS, and the Dutch adaptation with the SF-36 and pain NRS. The Spearman correlation coefficients of the Thai adaptation with the EQ-5D-5L in D3, D4 and D5 were less than .50. In the Dutch adaptation, the Spearman correlation coefficient between D4 and the social functioning domain of the SF-36 was only .37, which increased to .54 after item 20 was removed. The Spearman correlation coefficients between the other adaptations and the comparison scales ranged from .53 to .86, representing acceptable concurrent validity. The deviation in each domain of the German adaptation was within 10%, which indicates excellent agreement.
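As an illustration of the statistic used for concurrent validity, the following sketch computes Spearman’s rank correlation as the Pearson correlation of average ranks (ties share the mean of their rank positions). The helper names and the example scores are hypothetical and do not reproduce any data from the included studies.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical domain scores on the adapted questionnaire and a comparator.
adapted = [40, 55, 60, 72, 90]
comparator = [35, 50, 65, 70, 95]
rho = spearman_rho(adapted, comparator)
print(round(rho, 2), rho >= 0.50)  # .50 is the cut-off referred to in the text
```

Because the statistic depends only on ranks, it is unaffected by the different score ranges of the instruments being compared.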
In addition, the test-retest reliability of the Chinese, German, Thai and Dutch adaptations was assessed using the intraclass correlation coefficient (ICC). ICC values range from 0 to 1; values of .61-.80 indicate very good repeatability, and values of .81-1.0 indicate excellent repeatability. The values for the original version in D6, the Chinese adaptation in D5, and the Dutch adaptation in D2 (bowel) and D5 were less than .6. The result of the two one-sided tests (TOST) procedure in the German adaptation indicated no significant difference. The interval between the two tests ranged from 2 to 14 days, and the sample size ranged from 20 to 68. According to the guidelines of Terwee et al, only the Thai and Chinese adaptations met the sample size standard (more than 50). Test-retest reliability was not reported for the Italian adaptation.
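The review does not state which ICC form each study used. As one possibility, the sketch below computes the two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1), from the usual ANOVA mean squares. The choice of this form and the function name are assumptions; the example matrix is the widely reproduced textbook data set from Shrout and Fleiss, not data from the included studies.

```python
def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    data: list of subjects, each a list of k scores (e.g. test and retest).
    ICC = (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Textbook example: 6 subjects rated on 4 occasions.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

For a test-retest design, each row would hold one patient's two administrations, so `k` is 2.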
As for discriminative validity, in the original study and the German, Dutch and Italian adaptations, patients with an Eastern Cooperative Oncology Group (ECOG) performance score of 0 or 1 were compared with patients with an ECOG score of ≥2. The Chinese adaptation compared the scores of patients with spinal metastatic disease with those of healthy participants. Good discrimination was achieved in all 4 of these adaptations.
Floor and ceiling effects were reported only for the Chinese and Dutch adaptations. Measurement error was not reported for any adaptation.
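Floor and ceiling effects are usually summarized as the share of respondents who obtain the lowest or highest possible score. The sketch below assumes the commonly used 15% cut-off, which is not stated in this review; the function name and the sample scores are likewise illustrative.

```python
def floor_ceiling(scores, min_score, max_score, threshold=0.15):
    """Share of respondents at the scale extremes.

    threshold=0.15 is an assumed convention, not a value from this review.
    An effect is flagged when the share exceeds the threshold.
    """
    n = len(scores)
    floor_share = sum(1 for s in scores if s == min_score) / n
    ceiling_share = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Made-up domain scores on a 0-100 scale for 10 respondents.
result = floor_ceiling([0, 0, 50, 60, 70, 80, 90, 100, 40, 30], 0, 100)
print(result)
```

Reporting both shares, rather than only the flag, lets readers apply a different cut-off if they prefer one.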
Construct validity was evaluated in the original study and the Italian adaptation by confirmatory factor analysis (CFA). In both, the comparative fit index (CFI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual showed a good model fit.
Convergent and divergent validity were assessed in the original study and the German and Italian adaptations by comparing the correlation of each item score with the total score of its own domain against its correlation with the total scores of the other domains. The Chinese adaptation evaluated only divergent validity, and the Thai adaptation compared only the correlations between the total scores of the domains.
The results of all the construct validity assessments above supported internal validity. Only the German adaptation reported responsiveness; the minimum clinically relevant thresholds in each domain varied between 4 and 7.5 points compared with the EORTC QLQ-C30.
Discussion
In recent decades, the scientific community has increasingly recognized the importance of using instruments to measure patients’ HRQOL.14 Because of the heterogeneity of the population with spinal metastases and the variety of clinical symptoms, generic HRQOL instruments have not proved clinically useful in this population. SOSGOQ 2.0 is a disease-specific HRQOL instrument for the SM population, and its clinical validity has been verified, but its cross-cultural adaptations have never been properly evaluated. Therefore, it is important to evaluate the translation, the cross-cultural adaptation process and the measurement properties of the SOSGOQ 2.0 adaptations, so as to help achieve more standardized and effective cross-cultural adaptation in future research and clinical practice.
In this systematic review, six studies were identified, and their translation procedures and measurement properties were evaluated. These studies differ in patient population, sample size, adaptation details, types of translators, and other respects. Two fifths of the adaptations did not report detailed information on translation and cross-cultural adaptation (synthesis, back translation, expert committee review, pretest). Factor analysis and sample size calculation were mentioned in only two studies. Only one fifth of the adaptations met the sample size standard. This reflects many deficiencies in the translation and cross-cultural adaptation process. The overall methodological quality is good, and all adaptations assessed internal consistency, structural validity and reliability. However, none of the included studies reported sufficient information on all the measurement properties of the instrument according to the COSMIN standard. Three fifths of the adaptations did not report whether factor analysis had been carried out before structural validity was evaluated. Another three fifths of the adaptations did not fully meet the internal consistency criteria. Most studies did not assess responsiveness. Across the included studies, agreement, responsiveness, and floor and ceiling effects were reported only sporadically. Based on the process and measurement properties of cross-cultural adaptation, we recommend the German, Italian and Chinese adaptations as being of higher quality.
The methodological quality of the adapted versions still needs to be improved. The cross-cultural validation and measurement properties of quality-of-life questionnaires are difficult to deal with. The internationally recognized guidelines are critical and useful for assessing functional equivalence and operational equivalence, but they are not perfect, because scale equivalence and measurement equivalence are difficult to investigate. In other words, there is still no gold standard for cross-cultural validation, and research on these issues is ongoing. Peder argued that adapting a questionnaire from one language to another, and from one culture or setting to another, affects the results even when the translation is highly accurate.15 In addition, nonstandard translation of the questionnaire further reduces the accuracy of the measurement results.
The five adaptations included in this review all have different translation shortcomings, and none of them describes the process completely. As for the details of the translation process, the Thai adaptation reports only translation, synthesis and back translation, while the German adaptation provides hardly any relevant information. The lack of such information seriously hampers the methodological quality assessment of the translation process, and the ambiguity of the relevant details can, to some extent, mislead reviewers.
The selection of translators is another important step, and the adaptations do not share uniform standards for translators. Both the Italian and Dutch versions used two native-speaking translators, while the Thai version used only one researcher. In the back translation step, the versions used translators who differed in number and background. Whether the participation of translators without a medical background affects the accuracy of this step needs further verification. In the pretest, the sample sizes of the Dutch and German adaptations did not meet the requirements of the guidelines. Although the statistical results were satisfactory, the random error caused by the small sample size cannot be ignored. Therefore, establishing a unified standard for translation procedures, standardizing the selection of translators and reporting translation details more fully will help to improve methodological quality.
In some cases, because of cultural differences between populations, the translated version may show measurement properties different from those of the original version. Only the Dutch adaptation was rated positive for hypothesis testing; no relevant information was reported in the other studies. Only the German adaptation addressed responsiveness. When internal consistency was evaluated using Cronbach’s α coefficient, German D4 and D5, Italian D5 and Dutch D5 were all less than .7, which means that reliability was unacceptable. In the assessment of the cross-cultural adaptations, the studies did not use a uniform reference instrument. The original report used the SF-36 and NRS to test concurrent validity, while other reports used different scales, including the EQ-5D-5L and EQ-VAS. The deviation of the German adaptation in each domain was within 10%, which makes the agreement of the German version more satisfactory than that of the other adaptations. In the Thai version, for both the EQ-VAS and the EQ-5D-5L, the Spearman coefficients of D3, D4 and D5 were less than .5, which means that concurrent validity was unacceptable and the methodological quality was rated as inadequate.
For test-retest reliability, the minimum sample size for an adaptation should be 50. Three reports did not meet this standard, so their methodological quality was rated as doubtful. All adaptations conformed to the measurement interval specified in the COSMIN guidelines, but the stability of the construct being measured and the similarity of the measurement conditions across the two administrations are also very important. If the measurement conditions change, the stability of the instrument may be underestimated. Most of the adaptation studies did not provide clear evidence that the construct being measured and the patients’ circumstances remained stable. More attention should be paid to this aspect in future research design.
This study has certain strengths and limitations. To the best of our knowledge, this is the first time that the PRISMA and COSMIN guidelines have been used to evaluate the cross-cultural adaptation and measurement properties of the SOSGOQ 2.0. Following the recommendations of the COSMIN team, two independent reviewers, with the help of a third reviewer, assessed the quality of each study to prevent inconsistencies. Although we used a broad search strategy, we found only six relevant studies, which indicates that the cross-cultural adaptation of SOSGOQ 2.0 needs to be further promoted. Some cross-cultural adaptations may have been carried out correctly but not fully described according to COSMIN standards, which affects their methodological quality assessment. Although the methodological quality assessment shows that most cross-cultural adaptations are valid, their practical performance across different cultures needs to be further confirmed.
Conclusion
Based on the COSMIN guidelines, this study systematically evaluated the cross-culturally adapted versions of SOSGOQ 2.0. The results show that the methodological quality of the current adaptations is uneven and that the reporting of measurement properties is not comprehensive. This study identifies the shortcomings of most adaptations, which should help to standardize the translation and cross-cultural adaptation process.
Footnotes
Acknowledgments
We would like to thank the reviewers for their thorough review of our manuscript, especially under the difficult circumstances of the worldwide COVID-19 pandemic, and we wish everybody a safe passage through it.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was supported by the Project of Shanghai Municipal Health Commission (20224Y0165), Project of Nantong Health Commission (MB2021022), National Natural Science Foundation of China (82205145), and Shanghai “Rising Stars of Medical Talents”-Youth Development Program-Youth Medical Talents-Specialist Program SHWSRS (2023-062).
