Abstract
Background:
Patient-reported outcome measures are essential tools for assessing function and quality of life. The Banff Patellofemoral Instability Instrument Version 2.0 (BPII 2.0) is specifically designed to evaluate adolescents and adults experiencing patellofemoral instability (PFI).
Purpose:
Because no French version was available, this study aimed to translate and validate a Canadian-French (CF) version of the BPII 2.0.
Study Design:
Cohort study (Diagnosis); Level of evidence, 3.
Methods:
The BPII 2.0 was translated using a forward-backward translation method (BPII 2.0-CF), following guidelines from the American Association of Orthopaedic Surgeons and the Institute for Work & Health. Participants aged ≥14 years, fluent in French, and with a confirmed diagnosis of PFI were included. Patients completed the questionnaire at their initial consultation, 7 days later, and 6 months after treatment. Construct validity was assessed through Spearman correlation with the Kujala score. Internal consistency was evaluated using the Cronbach alpha coefficient, and test-retest reliability using the intraclass correlation coefficient (ICC). Responsiveness to change was analyzed using the Wilcoxon signed-rank test and the corresponding effect size r.
Results:
A total of 45 participants were included (female, n = 33; 73%), with a median age of 21 years (range, 14-54). The median symptom duration before consultation was 48 months (range, 0.5-480.0). Mean ± SD BPII 2.0-CF scores were 38.46 ± 14.48 (N = 45), 39.52 ± 15.99 (n = 34), and 60.99 ± 20.24 (n = 29) at baseline, day 7, and 6 months after surgery, respectively. No floor or ceiling effects were observed. The correlation between the BPII 2.0-CF and the French Kujala score at baseline, used to assess construct validity, was fair and statistically significant (r = 0.56; 95% CI, 0.28-0.77; P < .001). Internal consistency was acceptable, with a Cronbach alpha coefficient of .87. Test-retest reliability was good, with an ICC of 0.75 (95% CI, 0.56-0.87; P < .001), and responsiveness to change was strong, with an effect size of 0.79 (95% CI, 0.60-0.90; P < .001).
Conclusion:
The CF version of the BPII 2.0 is a valid, reliable, and appropriate tool for assessing patients with PFI. The rigorous translation and validation process supports its use in Francophone clinical settings.
The use of patient-reported outcome measures has become essential in evaluating overall quality of care. Once grouped within the broad category of patellofemoral syndromes, patellofemoral instability (PFI) is now recognized as a distinct clinical entity. PFI is more prevalent among adolescents and young adults in the active population, with an incidence of 77 per 100,000 people per year in at-risk populations. 11 Approximately 50% of these patients will experience recurrent symptoms,15,24,29 which can become disabling and hinder their ability to fully participate in daily or recreational activities, thereby negatively affecting their quality of life.
With advances in the understanding of PFI, its clinical presentation, diagnosis, and treatment, condition-specific quality-of-life assessment tools have increasingly been developed or adapted. These include the Banff Patellofemoral Instability Instrument 2.0 (BPII 2.0), the Norwich Patellar Instability score, and the Kujala score. The Norwich Patellar Instability score is a questionnaire that assesses the patient’s perceived patellar instability during various movements and activities, but it has not yet been translated into French. 11 The Kujala score is another tool that evaluates anterior knee pain associated with patellofemoral disorders; although it is not specific to PFI, a French cross-cultural translation was performed by the University of Liège in Belgium. 14 The BPII 2.0 is a questionnaire specific to PFI and has also shown effectiveness in adolescent target populations. 18
The BPII 2.0, originally validated and published in English in 2016, replaced the first version from 2013 by reducing the number of questions from 32 to 23. It evaluates PFI across 5 domains: physical symptoms or complaints; work- and/or school-related concerns; recreation, sports, and activities; lifestyle; and social and emotional concerns. Patients are asked to place a vertical mark on a horizontal 100-mm visual analog scale. Each question contributes equally to the overall score, and the final score, ranging from 0 to 100, is the mean of all answered items. Of the 23 questions, ≥19 must be completed for the score to be valid. A higher score indicates a better quality of life. 17
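The scoring rule described above (mean of the answered items, valid only when at least 19 of the 23 items are completed) can be sketched as follows. This is an illustrative sketch, not the developers' scoring software: the function name `bpii_score`, the use of `None` for an unanswered item, and the range check are assumptions made for the example.

```python
from typing import Optional, Sequence

N_ITEMS = 23        # number of items in the BPII 2.0
MIN_ANSWERED = 19   # minimum number of answered items for a valid score


def bpii_score(responses: Sequence[Optional[float]]) -> Optional[float]:
    """Return the mean of the answered items (0-100).

    `responses` holds one visual analog scale reading (in mm, 0-100) per
    item; None marks an unanswered item (an assumption of this sketch).
    Returns None when fewer than MIN_ANSWERED items were answered.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    answered = [r for r in responses if r is not None]
    if any(not 0 <= r <= 100 for r in answered):
        raise ValueError("each response must lie between 0 and 100 mm")
    if len(answered) < MIN_ANSWERED:
        return None  # too many missing items: score is not valid
    # Every answered item carries equal weight.
    return sum(answered) / len(answered)
```

For example, a questionnaire with 19 items marked at 50 mm and 4 items left blank yields a score of 50, whereas one with only 18 answered items yields no valid score.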
The BPII 2.0 subjectively assesses the functional impact of PFI on patients’ quality of life, one of the most important factors in evaluating treatment outcomes.10,18 Growing interest in the BPII 2.0 has prompted its translation into multiple languages, including German, 4 Spanish, 19 Indonesian, 23 Norwegian, 13 Portuguese, 9 Swedish, 27 Dutch, 26 Arabic, 2 and Turkish. 30 However, to date, no French or Canadian-French (CF) version has been documented in the literature. Validating the BPII 2.0 in CF would enhance its relevance and applicability within Canada’s francophone population and potentially in other French-speaking communities worldwide.
The objective of this study was to produce a CF translation of the BPII 2.0 and to test the validity, reliability, and responsiveness of the translated version. Our hypothesis was that the cross-cultural translation of the BPII 2.0 into CF would be reliable and valid for evaluating pre- and postoperative function in francophone patients ≥14 years of age with PFI.
Methods
Before initiating the cross-cultural adaptation and translation process, authorization was obtained from the original developers of the BPII 2.0. The study was approved by the research ethics board of the Centre de recherche du Centre intégré universitaire de santé et de services sociaux de l’Estrie–Centre hospitalier universitaire de Sherbrooke (CIUSSS de l’Estrie–CHUS). Calibration and validation of the psychometric properties of the final version of the questionnaire were conducted following the recommendations of the International Quality of Life Assessment Project. 22
Cross-Cultural Translation Process
The original BPII 2.0 questionnaire was translated into CF using a double reverse translation method. The process followed 6 distinct steps, in accordance with the cross-cultural adaptation guidelines developed by the American Association of Orthopaedic Surgeons and the Institute for Work & Health. 3 These steps included initial translation, synthesis, back translation, expert committee review, pretesting of the prefinal version, and submission of the final version to the senior author (F.V.) (Figure 1).

Figure 1. Flowchart of translation process. BT1-2, first and second back translator; FACS, Fear-Avoidance Components Scale; Fr/CF, French/Canadian-French; T1-2, first and second translator.
For the initial translation, 2 independent translators, both native French speakers, translated the English questionnaire into French. A synthesized version of their translations was developed after discussions with the translators and the research team. Two other independent translators, who were blinded to the original questionnaire and whose mother tongue was English, then translated the synthesized version back into English. An expert committee composed of the 4 translators and the research team consolidated the prefinal version.
The content validity of this prefinal CF version was pretested with 30 participants diagnosed with varying knee pathologies to evaluate item clarity and relevance. Each participant was invited to comment on interpretation difficulties and any additional feedback for every questionnaire item. 21 Finally, the corrected CF version of the BPII 2.0 (BPII 2.0-CF), along with its translation process documentation, was submitted to the senior author.
The translation process took place between August 27, 2021, and November 11, 2021.
Participant Selection and Data Collection
This was a prospective study conducted at 2 sites: the CIUSSS de l’Estrie–CHUS in Sherbrooke and the Shriners Hospital for Children in Montreal. Eligible participants were ≥14 years of age, fluent in French, and had a diagnosis of PFI confirmed by an orthopaedic surgeon. Participants who were unable to understand or answer the questionnaire and/or who had a concurrent diagnosis of another knee condition were excluded from the study. Participants completed both the BPII 2.0-CF and the Kujala questionnaire during their initial consultation (t0). Sociodemographic information, including age, sex, education level, native language, and country of origin, was also recorded.
Seven days later (t1), participants completed only the BPII 2.0-CF, either on paper via postal mail or online through the Research Electronic Data Capture (REDCap; Vanderbilt University) platform. The final assessment (t2), including both questionnaires, was completed during the 6-month follow-up visit.
Sample Size and Statistical Analysis
Following guidelines established by the Consensus-based Standards for the Selection of Health Measurement Instruments, 10 the following psychometric properties of the French version of the BPII 2.0 were analyzed: construct validity, internal consistency, test-retest reliability, and responsiveness.
A target sample size of 41 participants was calculated to evaluate these criteria, based on an alpha error rate of .05, power of 0.80, and a 10% expected attrition rate. Psychometric property validation of the BPII 2.0-CF was carried out by comparing it with the Kujala score.
Normality of distribution was assessed through visual inspection of histograms and QQ plots, as well as the Shapiro-Wilk test, with a significance threshold set at .05. Appropriate statistical tests were selected based on the distribution characteristics. All coefficients were reported with 95% CI. A pairwise deletion method was applied to missing data, as visual inspection confirmed a missing at random pattern.
Construct validity (convergent validity) was evaluated by correlating each patient’s BPII 2.0-CF score with the corresponding Kujala score at t0, using the nonparametric Spearman rho correlation coefficient. Correlation values were interpreted as follows, based on previously published thresholds: <0.3 = poor; 0.3-0.65 = fair; 0.66-0.8 = moderately strong; >0.8 = very strong; and 1 = perfect.1,7
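As an illustration of this step, the sketch below computes Spearman rho as the Pearson correlation of average ranks and maps the coefficient onto the interpretation bands listed above. It is not the SPSS procedure used in the study; the function names are hypothetical, the coefficient is interpreted by its absolute value, and values falling between 0.65 and 0.66 are assigned to the "fair" band, which is an assumption of the sketch.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rho: Pearson correlation computed on the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


def interpret_correlation(rho: float) -> str:
    """Map |rho| onto the bands quoted in the text."""
    a = abs(rho)
    if a >= 1.0 - 1e-12:
        return "perfect"
    if a > 0.8:
        return "very strong"
    if a >= 0.66:
        return "moderately strong"
    if a >= 0.3:
        return "fair"
    return "poor"
```

Under these bands, a coefficient of 0.56 is labeled "fair".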
Internal consistency was assessed by analyzing the intercorrelation between the 23 questionnaire items at t0. Cronbach alpha was calculated, with values between .70 and .90 considered acceptable. 25
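For readers who want to reproduce this kind of analysis outside SPSS, Cronbach alpha can be sketched with the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score). The sketch assumes complete cases only (one row per participant, one column per item); the function names are hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha for a participants-by-items matrix of complete cases."""
    if len(rows) < 2:
        raise ValueError("at least 2 participants are required")
    n_items = len(rows[0])
    if n_items < 2:
        raise ValueError("at least 2 items are required")
    if any(len(r) != n_items for r in rows):
        raise ValueError("every participant must have the same number of items")
    # Sample variance of each item across participants.
    item_vars = [variance([r[i] for r in rows]) for i in range(n_items)]
    # Sample variance of the per-participant total score.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


def is_acceptable(alpha: float) -> bool:
    """Acceptability range quoted in the text (.70 to .90)."""
    return 0.70 <= alpha <= 0.90
```

With 23 items, each row would hold the 23 item responses of one participant at t0.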
Test-retest reliability was determined using the intraclass correlation coefficient (ICC), based on paired data collected at t0 and t1 for the same patients. A 2-way mixed-effects model with absolute agreement and single measurement was used, corresponding to ICC (3,1). Based on previously published thresholds, ICC values <0.50 were defined as poor, between 0.50 and 0.75 as moderate, between 0.76 and 0.9 as good, and >0.9 as excellent. 16 Using established formulas, the standard error of measurement (SEM) was calculated as SD × √(1 − ICC), and the minimal detectable change at 95% confidence (MDC95) as 1.96 × √2 × SEM. 5 An approximation of the minimal clinically important difference (MCID) was also calculated using a distribution-based method by dividing the baseline standard deviation by 2, because no anchor item was included in the patient questionnaire. 20
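The three distribution-based quantities defined in this paragraph are simple arithmetic and can be sketched as below. The function names are hypothetical, and the standard deviation passed in the usage lines is an illustrative value, not one taken from the study.

```python
from math import sqrt


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0 <= icc <= 1:
        raise ValueError("icc must lie between 0 and 1")
    return sd * sqrt(1 - icc)


def mdc95(sem_value: float) -> float:
    """Minimal detectable change at 95% confidence: 1.96 x sqrt(2) x SEM."""
    return 1.96 * sqrt(2) * sem_value


def mcid_half_sd(baseline_sd: float) -> float:
    """Distribution-based MCID approximation: half the baseline SD."""
    return baseline_sd / 2


# Illustrative values only: an SD of 10 points with an ICC of 0.75.
example_sem = sem(10.0, 0.75)      # 10 x sqrt(0.25) = 5.0
example_mdc = mdc95(example_sem)   # 1.96 x sqrt(2) x 5.0
```

As a check on the formula, an SEM of 7.1 points gives an MDC95 of about 19.7 points.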
Responsiveness was analyzed by comparing BPII 2.0-CF scores at t0 and t2 using the nonparametric Wilcoxon signed-rank test. Rank-biserial correlation was reported as the effect size, with 95% CI. Effect size magnitudes were interpreted as follows: ≥0.20 was considered small, ≥0.50 medium, and ≥0.80 large. 8
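One common definition of the matched-pairs rank-biserial correlation is r = (W+ − W−) / (W+ + W−), where W+ and W− are the sums of the ranks of the positive and negative paired differences. The sketch below implements that definition as an assumption; it is not necessarily the exact computation performed by SPSS, and the function names and the "negligible" label for values below 0.20 are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_biserial(before: Sequence[float], after: Sequence[float]) -> float:
    """Matched-pairs rank-biserial correlation for paired scores."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    if not diffs:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return (w_plus - w_minus) / (w_plus + w_minus)


def effect_size_magnitude(r: float) -> str:
    """Bands quoted in the text: >=0.20 small, >=0.50 medium, >=0.80 large."""
    a = abs(r)
    if a >= 0.80:
        return "large"
    if a >= 0.50:
        return "medium"
    if a >= 0.20:
        return "small"
    return "negligible"  # label assumed; the text defines no band below 0.20
```

A positive value indicates that scores tended to increase from the first to the second assessment.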
Floor and ceiling effects were computed as the percentage of scores at the minimal (0) and maximal (100) ends of the BPII 2.0-CF scale. A threshold of 15% was set a priori for defining significant floor or ceiling effects, as commonly cited in the literature. 25
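The floor and ceiling computation can be sketched as follows. The function name and the returned dictionary keys are hypothetical, and the sketch assumes that an effect is flagged when the percentage strictly exceeds the 15% threshold.

```python
from typing import Dict, Sequence, Union


def floor_ceiling_effects(
    scores: Sequence[float],
    minimum: float = 0.0,
    maximum: float = 100.0,
    threshold_pct: float = 15.0,
) -> Dict[str, Union[float, bool]]:
    """Percentage of scores at the scale minimum and maximum."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == minimum) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == maximum) / n
    return {
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
        # Assumption: "significant" means strictly above the threshold.
        "floor_effect": floor_pct > threshold_pct,
        "ceiling_effect": ceiling_pct > threshold_pct,
    }
```

When no participant scores exactly 0 or 100, both percentages are 0 and neither effect is flagged.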
All statistical analyses were performed using SPSS Version 30.0.0.0 (IBM Corp). A significance level of P ≤ .05 was applied to all tests.
The manuscript was initially written in French, then translated into English using the artificial intelligence software affiliated with the University of Sherbrooke (Copilot; Microsoft 2025; https://copilot.microsoft.com/). After that, the authors with a better command of English reviewed and made corrections.
Results
A total of 52 participants were enrolled in the study, of whom 45 were included in the final analysis (Figure 2). Among the 45 participants included, 33 (73.3%) were women and 12 were men, with a median age of 21 years (range, 14-54). The median duration of symptoms before consultation was 48 months (range, 0.5-480.0 months; IQR, 116 months). Among all study participants, 91.1% were of Canadian origin, 84.4% reported CF as their native language, and 97.8% had an education level ≥7th grade (Table 1).

Figure 2. Flowchart of participant recruitment for the validation of the Canadian-French version of the Banff Patellofemoral Instability Instrument Version 2.0. n1, participants from Integrated University Health and Social Services Center of Estrie–University Hospital Center of Sherbrooke; n2, participants from the Shriners Hospital.
Table 1. Sociodemographic Characteristics a
a Data are presented as median (range), n (%), or %.
Chinese (2.2%), First Nations (2.2%), Mexican (2.2%), Vietnamese (2.2%).
English (4.4%), Spanish (2.2%), Mandarin (2.2%), Romanian (2.2%), Vietnamese (2.2%).
Not reported.
The mean ± SD BPII 2.0-CF scores were 38.46 ± 14.48 (N = 45), 39.52 ± 15.99 (n = 34), and 60.99 ± 20.24 (n = 29) at baseline, day 7, and 6 months after surgery, respectively.
Across the baseline and 2 follow-up evaluations (t0-t2) (Table 2), no floor or ceiling effects were observed for the BPII 2.0-CF, as 0% of participants scored the minimal or maximal possible values.
Table 2. Descriptive Results of Floor and Ceiling Effects a
a t0, during initial consultation; t1, 7 days after initial consultation; t2, final assessment at 6-month follow-up.
Regarding construct validity, a fair correlation was shown between the BPII 2.0-CF and the French version of the Kujala score at t0, with a Spearman correlation coefficient of 0.56 (95% CI, 0.28-0.77; P < .001) (Table 3).
COSMIN, Consensus-based Standards for the Selection of Health Measurement Instruments; ICC, intraclass correlation coefficient; t0, during initial consultation; t1, 7 days after initial consultation; t2, final assessment at 6-month follow-up. Dash indicates not applicable.
Internal consistency of the BPII 2.0-CF at t0 was acceptable, with a Cronbach alpha coefficient of .87 (95% CI, 0.77-0.92) (Table 3).
Test-retest reliability over a 7-day interval for the BPII 2.0-CF was good, with an ICC of 0.75 (95% CI, 0.56-0.87; P < .001) (Table 3). The SEM was 7.1 points, yielding an MDC95 of 19.7 points. The distribution-based MCID was estimated at 6.77 points. Regarding responsiveness, mean BPII 2.0-CF scores were 38.46 ± 14.48 at t0 and 60.99 ± 20.24 at t2 (6 months after surgery), a statistically significant improvement on the Wilcoxon signed-rank test (P < .001) (Table 3). The median paired difference (Hodges-Lehmann estimate) was 22.78 points (95% CI, 14.96-31.09). The effect size was large, at r = 0.79 (95% CI, 0.60-0.90).
Discussion
This study aimed to establish and validate a cross-cultural adaptation of the BPII 2.0 questionnaire in CF. Following a forward and backward translation process, and feedback from 30 patients, a final version of the BPII 2.0-CF was established (see Supplemental Material, available separately). The psychometric properties were then assessed in a sample of 45 patients with PFI who completed the final BPII 2.0-CF questionnaires at 3 time points. The main finding of this study is that the CF version of the BPII 2.0 demonstrates construct validity comparable with that of the original English version. Construct validity was confirmed by comparing the CF translation of the BPII 2.0 with the French version of the Kujala score, revealing a fair correlation (r = 0.56; P < .001). Similar correlations were reported in the German and Turkish translations of the BPII 2.0, with coefficients of 0.58 and 0.53, respectively.4,30 In the original BPII 2.0 validation by its developers, Hiemstra et al 10 reported good construct validity against the English version of the Kujala score (r = 0.50), likewise supporting a significant correlation. These findings indicate that the BPII 2.0-CF meaningfully captures the essential elements needed to assess knee function in the context of PFI.
The internal consistency of the translated version was strong, with a Cronbach alpha of .87, considered acceptable and reliable. The original English version and other translations have consistently demonstrated high internal consistency, with Cronbach alpha coefficients of .91 (English), 17 .93 (German), 4 .98 (Indonesian), 23 and .96 (Swedish). 27 These results underscore a robust interconnection among BPII 2.0 items and highlight the structural integrity of the questionnaire in evaluating the effect of PFI symptoms, without significant redundancy.
Test-retest reliability showed a good correlation (ICC = 0.75). This result is lower than that of the original English BPII 2.0 (ICC = 0.97) 17 and other translations, which reported high ICC values ranging from 0.90 to 0.98.2,19,27 This discrepancy may stem from the smaller sample size (n = 34), which could have introduced some instability in the ICC calculation compared with the original version. The response rate between t0 and t1 was 76% (34/45), surpassing the minimal threshold (n = 33) required for meaningful validation of patient-reported outcome measures such as the BPII 2.0,6,28 but potentially introducing additional variability. The data collection method may have also contributed: responses at time point t0 were collected in person, while those at t1 were mostly submitted via the REDCap online platform.
A significant difference between the 2 means, with a large effect size (r = 0.79; P < .001),8,12 supports the good responsiveness of the CF version of the BPII 2.0. The clinically expected functional improvement 6 months after nonoperative or operative treatment was effectively captured by this measurement tool. In the original version of the BPII 2.0, a medium to large effect size (r = 0.40) also supported its ability to detect improvement across different evaluation periods. 17 Using a distribution-based method, an MCID of 6.77 points was identified, which aligns closely with the anchor-based MCID of 7 points established in the original BPII 2.0 scale validation paper.
The mean ± SD BPII 2.0-CF score at t0 in our sample was 38.46 ± 14.48, comparable with results at similar time points in the German (39.58 ± 18.49) 4 and original English versions (30.14 ± 15.24). 17 With a median age of 21 years and a median symptom duration of 4 years, our participants had been experiencing symptoms for several years, which may explain the low self-reported quality of life related to knee symptoms. Similar scores were also observed in other translations, even in adolescent-dominant populations, reinforcing the conclusion that PFI has a substantial impact on patients’ quality of life, regardless of age or symptom duration.
Limitations
This study presents a few limitations. The use of nonrandom convenience sampling limits the representativeness of the sample for the broader population with PFI. However, the impact is minimal, given that the study’s objectives are centered not on clinical outcomes, but on changes in scores over time and their correlation with other questionnaires. Participant attrition during follow-up led to smaller sample sizes in some assessments, nearing the lower limit of the recommended 50 participants typically required to assess floor and ceiling effects with acceptable validity. 25
Responsiveness to change was assessed over 2 time points spaced 6 months apart. Only the surgically treated patients completed all questionnaires at the designated intervals, and therefore, the analysis was conducted on this subgroup. While the original English version of the BPII 2.0 is intended for all PFI types, surgically or nonsurgically treated, 17 our findings apply specifically to the postsurgical population.
The questionnaire was translated and validated within the cultural and linguistic context of French-speaking Canadians. Certain terms or expressions may reflect region-specific communication habits, which could limit the optimal use of the tool in other francophone populations. Cultural or linguistic adaptations may be necessary in other French-speaking contexts.
Conclusion
The assessment criteria used throughout the translation and validation process confirmed appropriate construct validity and internal consistency, satisfactory test-retest reliability, and strong responsiveness to change. The CF version of the BPII 2.0 can now be employed to evaluate quality of care, functional outcomes, and quality of life in patients with PFI.
Supplemental Material
Supplemental material, sj-pdf-1-ojs-10.1177_23259671251413262 for Translation and Validation of the Canadian-French Version of the Banff Patellofemoral Instability Instrument (BPII 2.0-CF) by Laurent Désiré Ndzié Essomba, Yoan Bourgeault-Gagnon, Catherine Fleury, Evelyne Dumas, Sonia Bedard, Thierry Pauyo, Laurie A. Hiemstra and François Vézina in The Orthopaedic Journal of Sports Medicine
Footnotes
Final revision submitted August 24, 2025; accepted November 17, 2025.
One or more of the authors has declared the following potential conflict of interest or source of funding: F.V., Y.B.-G., L.D.N.E., C.F., E.D., and S.B. receive unrestricted educational and research grants through the Sherbrooke Orthopedic Research and Teaching Foundation from the following organizations: DePuy, a Johnson & Johnson company, Conmed, and Medtronic. L.A.H. receives unrestricted grants for orthopaedic research from Smith & Nephew and Conmed and is on the board of the Banff Sport Medicine Foundation and the Patellofemoral Foundation. AOSSM checks author disclosures against the Open Payments Database (OPD). AOSSM has not conducted an independent investigation on the OPD and disclaims any liability or responsibility relating thereto.
Study approval was obtained through the research ethics board of the Centre de recherche du Centre intégré universitaire de santé et de services sociaux de l’Estrie–Centre hospitalier universitaire de Sherbrooke (reference No. 2022-4329).
