Abstract
Background:
The Norwich Patellofemoral Instability (NPI) score is a disease-specific questionnaire for patients with this condition; however, it has not been validated in Spanish-speaking patients.
Purpose:
To (1) translate the NPI score into Spanish and validate its psychometric properties and (2) correlate it with the Spanish versions of the Kujala score and the Banff Patellofemoral Instability Instrument (BPII) 2.0.
Study Design:
Cohort study (Diagnosis); Level of evidence, 2.
Methods:
The translation and validation of the NPI score into Spanish were conducted following the Consensus-Based Standards for the Selection of Health Measurement Instruments guidelines. This included forward and backward translations, conciliation, and a pilot study in 5 patients. Internal consistency was evaluated using the Cronbach alpha and test-retest reliability using intraclass correlation coefficients (ICCs). Pearson correlation coefficients were used to calculate the correlation of the NPI score with Kujala and BPII scores.
Results:
The study included 58 voluntary participants, predominantly female (71%), with a mean ± SD age of 17.4 ± 6.6 years. The Spanish version of the NPI showed high internal consistency (Cronbach alpha, .93) and an ICC of 0.88 (95% CI, 0.80-0.92). Additionally, it exhibited a moderate negative correlation with the Kujala score (–0.57), and high negative correlation with the BPII score (–0.70). There was no ceiling or floor effect.
Conclusion:
Our study demonstrated that the Spanish version of the NPI score has high internal consistency and test-retest reliability, enabling its use in Spanish-speaking patients with patellar instability. It also demonstrated high correlation with another disease-specific score for patellofemoral instability (BPII 2.0) and moderate correlation with the Kujala score.
Patellofemoral instability refers to the dislocation or subluxation of the patella in relation to its trajectory over the femoral trochlea. Its incidence ranges from 5.8 to 77.8 per 100,000 inhabitants, with girls aged 10 to 17 years being the most affected population.4,9,18 After the initial episode, a 15% rate of redislocation is reported. Patients with patellofemoral instability can present symptoms such as anterior knee pain, feelings of patellofemoral instability, and patellar locking.6,15 Treatment can be either conservative or surgical depending on recurrence.4,8,12
Although multiple scales to evaluate knee pathologies have been developed and validated in recent years,11,13,14 there are only 2 disease-specific scores for patients with patellofemoral instability: the Banff Patellofemoral Instability Instrument (BPII) 2.0 and the Norwich Patellofemoral Instability (NPI) score.7,17,21 These patient-reported outcome measures are useful in different contexts, such as evaluating patients’ progress in clinical practice, deciding who may need a surgical procedure, or evaluating the outcomes of different treatments in research. The NPI score has been translated and validated into several languages, such as Turkish, 25 Dutch, 24 and Brazilian Portuguese, 1 showing excellent psychometric properties. However, it has not been translated to or validated in Spanish.
The purpose of this study was to translate the original English version of the NPI to Spanish and validate its psychometric properties. The secondary purpose was to correlate the Spanish version of the NPI with the Spanish versions of the Kujala score and the BPII 2.0. The hypothesis was that the Spanish version of the NPI would be valid and reliable in patients with patellofemoral instability.
Methods
A validation study of the Spanish version of the NPI score was conducted following the Consensus-Based Standards for the Selection of Health Measurement Instruments guidelines.19,20 The study was conducted in 2 phases: (1) scale translation into Spanish and (2) evaluation of measurement properties.
The study population comprised patients diagnosed with patellofemoral instability who were treated between January 2022 and June 2024 at Fundación Valle del Lili, Cali, Colombia. Patients of either sex aged between 10 and 40 years were included, provided they could read and understand Spanish. The study was reviewed and approved by the institutional ethics committee; participants or their parents supplied written informed consent. Permission to translate and validate was obtained from the authors of the original NPI score before beginning the study.
NPI Score
The original score consists of 19 items representing scenarios in which the patient may perceive patellofemoral instability, each with 6 possible answers (“always,” “frequently,” “sometimes,” “rarely,” “never,” and “does not do”). Depending on the item, each response has a different value. To compute the final score, the values obtained for each item were added to obtain the numerator, and the denominator was derived by adding each item’s maximal score, with a maximum of 250. If the participant selected “does not do,” the maximal value of that item was excluded from the denominator, ensuring that the scale was applied only to activities performed by the patient. The numerator was then divided by the denominator and multiplied by 100, resulting in a final score expressed as a percentage. The higher the score, the greater the instability presented by the patient. 21
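The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' scoring tool: the function name `npi_percentage` and the three-item example are hypothetical, and the real per-item point values are those on the score sheet (Figure 2).

```python
from typing import Optional, Sequence


def npi_percentage(points: Sequence[Optional[float]],
                   item_max: Sequence[float]) -> float:
    """Return the NPI score as a percentage.

    points[i] is the value earned on item i, or None when the patient
    answered "does not do". item_max[i] is the maximal value of item i.
    """
    if len(points) != len(item_max):
        raise ValueError("points and item_max must have the same length")

    numerator = 0.0
    denominator = 0.0
    for earned, maximum in zip(points, item_max):
        if earned is None:
            # "does not do": the item is left out of the denominator
            continue
        numerator += earned
        denominator += maximum

    if denominator == 0:
        raise ValueError("no scorable items: every answer was 'does not do'")
    return 100.0 * numerator / denominator


# Hypothetical 3-item example (NOT the real NPI point values):
# item 2 is answered "does not do", so only items 1 and 3 count.
example = npi_percentage([5, None, 20], [10, 15, 25])
print(round(example, 1))  # 71.4, i.e. 100 * 25 / 35
```

Dropping skipped items from the denominator keeps the percentage comparable between patients who perform different sets of activities.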
Translation
The translation process followed the recommendations by Beaton et al. 2 The score was initially translated from its English version into Spanish by 2 independent, bilingual orthopaedic surgeons, both of whom were native Spanish speakers and had >10 years of practice in knee surgery (J.P.M.-C. and S.M.). Each surgeon produced an independent translation, and these forward translations were conciliated into a single version. Only minimal differences in some terms were found; the translators resolved them by choosing the wording closest to the original. This version was then back-translated into English by a paid native English medical translator. The same translator compared the original version with the back-translation, and no inconsistencies were identified. This version of the Spanish translation of the NPI score was then evaluated during outpatient consultation in a pilot study in 5 patients diagnosed with patellofemoral instability. These patients evaluated the overall comprehensibility of the Spanish version and gave feedback when they deemed it necessary; no changes were required after piloting. Figure 1 shows the final Spanish version of the score and Figure 2 shows the score sheet for the NPI.

Spanish version of the Norwich Patellofemoral Instability score.

Score sheet for the Spanish version of the Norwich Patellofemoral Instability score with the number of points received according to the option selected.
Measurement Property Evaluation and Statistical Analysis
Means, standard deviations, and frequencies were used to summarize participants’ baseline characteristics. The psychometric properties evaluated included internal consistency, test-retest reproducibility, measurement error, and convergent validity. A sample size of 50 patients was calculated as the minimum sample necessary to detect an intraclass correlation coefficient (ICC) as low as 0.4, with 90% power. RStudio Version 4.4.1 was used for analysis.
Following the recommendations of Terwee et al, 23 internal consistency was evaluated using the Cronbach alpha coefficient, which was considered adequate with values between .7 and .95. Lower values would indicate poor correlation between scale items, limiting interpretability, while higher values would suggest redundancy. To evaluate test-retest reproducibility, the ICC was used; participants were asked to complete the questionnaire twice, with an interval of 7 to 14 days between the first and second measurements. ICC values ≥0.7 were considered adequate. 23 Measurement error was analyzed using the standard error of measurement (SEM), 5 and a Bland-Altman plot was used to evaluate bias between the NPI test and retest measurements. 3 We defined floor and ceiling effects as ≥15% of the sample scoring the lowest or highest possible score on the NPI.
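Two of the checks described above can be illustrated with a short Python sketch: the Cronbach alpha computed from a respondent-by-item matrix, and the ≥15% rule for floor and ceiling effects. The function names and toy data are assumptions made for illustration only; the study's own analysis was run in R. The alpha uses the standard formula based on sample variances.

```python
from statistics import variance
from typing import Dict, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha; each row holds one respondent's item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def floor_ceiling_effects(scores: Sequence[float],
                          lowest: float = 0.0,
                          highest: float = 100.0,
                          threshold: float = 0.15) -> Dict[str, object]:
    """Share of respondents at the lowest/highest possible score,
    flagged as an effect when the share reaches the threshold."""
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share >= threshold,
        "ceiling_effect": ceiling_share >= threshold,
    }


# Toy data (hypothetical): 3 respondents, 2 items
print(round(cronbach_alpha([[1, 1], [2, 3], [3, 2]]), 3))  # 0.667

# Toy percentage scores (hypothetical): 2 of 10 at the floor, 1 of 10 at the ceiling
print(floor_ceiling_effects([0, 0, 10, 50, 100, 30, 40, 60, 70, 80]))
```

With the interpretation used in this study, an alpha between .7 and .95 would be read as adequate, and a share below 15% at either extreme as no floor or ceiling effect.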
Finally, the correlation of the Spanish version of the NPI with both the Spanish version of the BPII 2.0 score 17 and the Spanish version of the Kujala Patellofemoral Disorder Score 16 was evaluated. The BPII consists of 23 items divided into 5 categories: (1) symptoms and physical discomfort, (2) problems related to work and/or school, (3) recreation/sport/activity, (4) lifestyle, and (5) social and emotional. The final score is a mean of the 23 items, ranging from 0 to 100, where a score of 100 indicates no impact on quality of life. The Kujala score consists of 13 items related to anterior knee pain, with some items scoring between 0 and 5 points and others between 0 and 10 points. The final score ranges from 0 to 100, with 100 representing the best health status. Pearson correlation coefficients were calculated between the NPI score and the Kujala score and between the NPI score and the BPII 2.0 to assess convergent validity. Correlational values between 0.1 and 0.3, 0.3 and 0.6, and >0.6 indicated small, medium, and large association, respectively. The significance level was P < .05.
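The convergent validity step can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the function names are hypothetical, and the handling of values that fall exactly on a boundary (0.3 or 0.6), as well as the "negligible" label for values below 0.1, are assumptions, since the thresholds above do not specify them.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and have at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def association_size(r: float) -> str:
    """Label the magnitude of r using the thresholds stated in the text.

    Boundary handling and the 'negligible' label are assumptions.
    """
    magnitude = abs(r)
    if magnitude > 0.6:
        return "large"
    if magnitude >= 0.3:
        return "medium"
    if magnitude >= 0.1:
        return "small"
    return "negligible"


# Toy paired scores (hypothetical): higher NPI goes with lower Kujala
npi = [10, 20, 30, 40]
kujala = [80, 60, 40, 20]
r = pearson_r(npi, kujala)
print(round(r, 2), association_size(r))  # -1.0 large
```

The sign carries the direction of the relationship and the absolute value carries its size, which is why the classification works on `abs(r)`.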
Results
A total of 58 patients diagnosed with patellofemoral instability completed the questionnaires. The mean ± SD age was 17.4 ± 6.6 years; the population was mainly female (71%), and most patients had unilateral patellofemoral instability (88%) (Table 1). Of this group, 22% (n = 13) had been surgically treated and 78% (n = 45) had been treated nonsurgically or not yet treated. The internal consistency was very high, with a Cronbach alpha of .93. There was no ceiling effect (0%) or floor effect (10.4%) for the NPI.
Baseline Characteristics of Participants
Test-Retest Reliability
All patients completed the questionnaire twice, 7 to 14 days apart, and an ICC of 0.88 (95% CI, 0.80-0.92) was obtained, along with an SEM of 1.22. The Bland-Altman plot (Figure 3) showed a random distribution of the measurements, meaning that there was no systematic error.

Bland-Altman plot for the agreement between test-retest measurements of the Norwich Patellofemoral Instability (NPI) score. The x-axis shows the mean of the 2 NPI score measurements for each patient; the y-axis shows the difference between the first and second NPI score measurements. The continuous line (black) marks the measured bias of −1.931, indicating a small negative bias. The dotted lines (red) show the 95% concordance limits for the test-retest differences (Nt1-Nt2).
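The quantities drawn in a Bland-Altman plot can be computed as in the Python sketch below. It is offered only as an illustration under stated assumptions: the function name and toy data are hypothetical, and the limits use the conventional bias ± 1.96 × SD of the differences, which the text does not explicitly confirm as the formula used in this study.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(first: Sequence[float],
                 second: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of (first - second); the limits are the
    conventional bias +/- 1.96 * SD of the differences (an assumption).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least 2 paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the differences
    return bias, bias - spread, bias + spread


# Toy test-retest scores (hypothetical), first minus second measurement
test_scores = [10, 20, 30, 40]
retest_scores = [12, 19, 33, 42]
bias, lower, upper = bland_altman(test_scores, retest_scores)
print(round(bias, 2), round(lower, 2), round(upper, 2))  # -1.5 -4.89 1.89
```

A bias close to zero with differences scattered evenly between the limits is the pattern read as absence of systematic error.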
Convergent Validity
The NPI scale showed a moderate linear negative correlation with the Spanish version of the Kujala score (r = −0.57; P < .0001) (Figure 4), and a high linear negative correlation with the BPII 2.0 (r = −0.70; P < .0001) (Figure 5).

Scatterplot showing the correlation between Norwich Patellofemoral Instability and Kujala scores.

Scatterplot showing the correlation between Norwich Patellofemoral Instability score and Banff Patellofemoral Instability Instrument 2.0.
Discussion
The major finding of our study was that the Spanish version of the NPI score has very good psychometric properties, with high internal consistency (α = .93) and test-retest reliability (ICC = 0.88). These results are similar to those obtained in the original development article of the scale. 22
Our findings are similar to those of other translations of the NPI. The Norwegian version (N = 107) found good internal consistency (α = .88) and moderate test-retest reliability (ICC = 0.65; 95% CI, 0.47-0.77). 10 The Dutch version (N = 97) had excellent internal consistency (α = .97). 24 The Portuguese version (N = 60) reported high internal consistency (α = .93). 1 The Turkish version (N = 51) found good internal consistency (α = .88) and test-retest reliability (ICC = 0.915; P < .05). 25
Similar to the Turkish study and the baseline results of the Norwegian study, we found no ceiling or floor effect. 25 The Dutch study did find a floor effect (17%). 24 The Norwegian study found a floor effect of 28% only in the postoperative cases, which could be expected if surgery improves patellar stability, and thus function, to that extent. 10 The original NPI score from Smith et al 21 found a floor effect in 13 of the 19 items. In general, other versions of the NPI may have difficulty differentiating among patients with very low scores (better function). This was not the case for the Spanish version, probably because most patients had not undergone surgery when they filled out the questionnaire.
The NPI correlated negatively with both the Kujala score and the BPII 2.0. A negative correlation was expected, as the best scenario for the NPI is 0, while for the Kujala score and the BPII 2.0 the best scenario is 100. The Spanish version of the NPI showed a stronger negative correlation with the BPII 2.0 (r = −0.70) than with the Kujala score (r = −0.57). The Portuguese study reported exactly the same correlation with Kujala (r = −0.57); the original English score showed a similar correlation with Kujala (r = −0.66); the Norwegian study found a moderate negative correlation with BPII (r = −0.43); the Dutch study showed a strong negative correlation with Kujala (r = −0.78); and the Turkish study found a stronger negative correlation with Kujala (r = −0.82) than with BPII 2.0 (r = −0.55), contrary to our findings.1,10,21,24,25 As both the BPII and the NPI are patellar instability–specific patient-reported outcome measures, a better correlation would be expected between them than with the Kujala score, which is specific to the patellofemoral joint but has only 1 of 13 questions that directly assesses instability symptoms. 13
It is important to mention the differences between the NPI score and the BPII 2.0. Both scores are disease-specific for patellofemoral instability, but the NPI score focuses on clinical scenarios during sports and daily activities, whereas the BPII 2.0 centers on the quality of life of patients with patellar instability. The 2 scales complement each other and have an additive effect in understanding outcomes for this group of patients. We recommend that both be used together to obtain a complete picture of the patient’s condition.
Limitations
This study has some potential limitations. The NPI score was translated into Colombian Spanish, which might have some subtle variations compared with the Spanish spoken in other regions of Latin America or Spain. However, we believe that dialectal differences are minimal and that this translation may be used in all Spanish-speaking countries.
Conclusion
Our study demonstrated that the Spanish version of the NPI score has high internal consistency and test-retest reliability, enabling its use in Spanish-speaking patients with patellar instability. It also demonstrated high correlation with another disease-specific score for patellofemoral instability (BPII) and moderate correlation with the Kujala score.
Footnotes
Acknowledgements
The authors would like to thank Andres Felipe Plaza-Hernandez for his help with the statistical analysis.
Final revision submitted April 11, 2025; accepted May 6, 2025.
The authors declared that there are no conflicts of interest in the authorship and publication of this contribution. AOSSM checks author disclosures against the Open Payments Database (OPD). AOSSM has not conducted an independent investigation on the OPD and disclaims any liability or responsibility relating thereto.
Ethical approval for this study was obtained from Fundación Valle del Lili (reference No. 299-2020).
