Abstract
Purpose
The signs for clubfoot relapse are poorly defined in the literature and there is a lack of a scoring system that allows assessment of clubfeet in ambulatory children. The aim of this study is to develop an easy to use, reliable and validated evaluation tool for ambulatory children with a history of clubfoot.
Methods
A total of 52 feet (26 children, 41 clubfeet, 11 unaffected feet) were assessed. Three surgeons used the seven-item PBS Score to rate hindfoot varus, standing and walking supination, early heel rise, active/passive ankle dorsiflexion and subtalar abduction blinded to the other examiners. All parents answered the modified Roye score questionnaire prior to the clinical assessment. Correlation between the mean PBS Score and the Roye score was evaluated using Spearman's rank correlation coefficient. Interobserver reliability was tested using weighted and unweighted Cohen's Kappa coefficients.
Results
The Spearman's rank correlation coefficient for correlation between mean PBS Score and Roye score was 0.73 (moderate to good correlation). The interobserver agreement for the total PBS Score resulted in an intraclass correlation coefficient of 0.93 (almost perfect agreement).
Conclusion
The PBS score is an easy to use, clinical assessment tool for walking age children with clubfoot deformity. It includes passive and active criteria with a very good interobserver reliability and moderate to good validity.
Level of Evidence
Level I - Diagnostic study
Introduction
The treatment concept for congenital talipes has experienced a major shift from a surgical 1,2 towards a more conservative, non-surgical approach as a result of long-term outcome studies following the Ponseti method 3–5 and others. 6 As a result, our understanding of what constitutes a good result has equally shifted to more function-based outcome measures. 7–10 It has become clear that post-treatment radiographic parameters are a poor measure of function. 11 The increasing popularity of the simple, descriptive and easy to apply Pirani score 12 over the Dimeglio score 12 in most contemporary publications supports this notion. The Roye score, a patient-based assessment tool, has been shown to have excellent validity and is increasingly represented in the literature. 13 Scores that include radiographic readings, on the other hand, such as the Laaveg/Ponseti score, 3 are hardly used in clinical practice.
Although the Pirani score allows us to reliably evaluate a clubfoot in a newborn and non-ambulatory toddler, the authors believe that it does not allow for functional evaluation in an ambulatory child, in whom early signs of recurrence and subsequent functional limitations are the main concern to the examining clinician. Furthermore, descriptive signs such as medial and posterior skin crease or softness of the heel are less relevant in treated, ambulatory children. Signs of clubfoot relapse are poorly explored in the literature but are generally seen as a decreasing range of movement in the ankle joint and subtalar complex or as an increase in deformity, whether functional or fixed. The aim of this study is to develop an easy to use, reliable and validated evaluation tool for ambulatory children with a history of clubfoot.
In order to identify and validate clinical signs for the evaluation of clubfoot in the walking child, the two authors travelled with a team of paediatric orthopaedic surgeons and a neurophysiologist to a test centre where 19 clinical signs were assessed by one of the authors (MFS) in 100 children (200 feet) and validated against the Roye score as well as dynamic pedobarographic measurements. Herd et al 14 developed a pedobarographic assessment tool for surgically treated clubfoot using a specific mask measuring peak pressure distributions. They concluded that abnormalities are more easily detected in dynamic pedobarography, a conclusion with which the authors concur. The authors evaluated the clinical outcome measures, which showed that not all signs were significant in determining the outcome of ambulatory clubfeet. The number of signs was reduced to a total of seven relevant items, which form the basis of our current proposed scoring system. No interobserver reliability measurement was performed in this first trial centre.
In order to assess for interobserver reliability a second trial run was performed with the two authors and a third experienced paediatric orthopaedic surgeon. Unfortunately, the interobserver reliability for the seven remaining signs remained unsatisfactory and a clearer definition on how to evaluate each sign seemed necessary.
Clear guidelines on how to assess for the remaining seven signs were developed and a final study with the same three paediatric orthopaedic surgeons took place to assess interobserver reliability as well as validate the final version of the score against the modified Roye score.
Materials and methods
A total of 52 feet (26 children) were assessed. Among them 41 feet were clubfeet and 11 unaffected feet. In all, 18 patients were male, eight female. The three surgeons used the seven-item Pirani/Böhm/Sinclair (PBS) Score to rate all feet blinded to the other examiners.
A total of 41 clubfeet were recruited from a dedicated clubfoot clinic for evaluation by three paediatric orthopaedic surgeons in order to verify interobserver reliability of the seven remaining signs following a clear standardization of the assessment protocol. All 52 feet were assessed by all three examiners.
The children aged two to 14 years (mean seven years) were diagnosed with idiopathic, non-syndromic clubfoot and presented with no comorbidities. All children were treated at a single centre.
Two children had to be excluded from our study due to an incomplete Roye score questionnaire.
All parents were asked to answer the modified Roye score prior to the clinical assessment in order to assess validity of the PBS Score towards the questionnaire.
The seven-item PBS score was assessed by the three paediatric orthopaedic surgeons individually and in the same predefined approach. The seven signs evaluate the standing, walking and sitting child.
The first two signs are assessed in a standing child. The child stands barefoot on an even surface. The surface should be firm; a soft examination couch is not suitable. Existing leg-length discrepancies are addressed by an appropriately sized foot lift under the shorter leg. Both legs are positioned neutrally with the patellae facing forward and the feet placed hip-width apart.
Hindfoot varus (HV)
The examiner evaluates the child from the back. The insertion of the tendoachilles to the calcaneus is identified and serves as the first reference point, marking the starting point of a vertical line toward the standing surface (blue line). The second reference point is the centre of the heel, which can be found medial or lateral to the vertical reference line. If the centre of the heel is on the vertical line or lateral to it, no varus deformity is present. No varus deformity is scored with one point, varus deformity with two points (Fig. 1).

Fig. 1. Heel varus.
Standing supination (SS)
The examiner evaluates the child from the front. In a full weight-bearing child all metatarsal heads should be in contact with the underlying surface. A lack of floor contact of the first metatarsal head suggests a fixed supination deformity of the foot. If SS is present, the foot is scored with two points; if SS is absent, it is scored with one point.
The following signs are assessed in the unassisted, walking child. The child should be assessed when walking barefoot on even ground over a walking distance of at least ten steps. A more accurate assessment is gained by lowering the examiner's viewing point, either by kneeling or by sitting on a low chair (Fig. 2).

Fig. 2. Standing supination.
Walking supination (WS)
WS is a dynamic sign evaluated when the child is walking towards the examiner. If forefoot supination is observed at the time of active dorsiflexion in swing phase, the foot is scored with two points; if WS is absent, it is scored with one point (Fig. 3).

Fig. 3. Swing phase supination.
Heel rise (HR)
An early HR is best seen when examining the walking child from the side. In a normal gait cycle, HR occurs only after the opposite foot has achieved heel strike. An early HR is present when the involved heel lifts off the ground before the contralateral foot achieves heel strike. If early HR is observed, the foot is scored with two points; if early HR is absent, it is scored with one point (Fig. 4).

Fig. 4. Early heel rise.
The remaining three signs, passive/active ankle dorsiflexion (pAD/aAD) and subtalar abduction (SA), are evaluated in the sitting child. The child sits on the examination table with hips and knees flexed to 90°.
pAD
pAD of the ankle is measured with the heel in neutral alignment and the knee in 90° of flexion. The examiner passively dorsiflexes the ankle until resistance is felt. No excessive force should be exerted, to avoid a rocker bottom deformation of the foot during the assessment. The long axis of the fibula serves as the first reference line, the lateral border of the foot as the second. If both lines are perpendicular, the ankle is considered to be in a neutral position. A dorsiflexion angle of more than 10° above neutral is assigned a score of one, an angle between 6° and 10° a score of two, and an angle of 0° to 5° a score of three. Passive dorsiflexion below neutral, confirming an equinus deformity, is assigned a score of four (Fig. 5).

Fig. 5. Passive ankle dorsiflexion.
aAD
With the knee in 90° of flexion, the patient is asked to move the toes upwards, bringing the ankle into maximum dorsiflexion. The examiner supports the hindfoot, maintaining a neutral heel alignment while avoiding any supporting pressure from the plantar side. A goniometer is positioned with one arm parallel to the posterior fibular cortex and the second arm parallel to the lateral border of the foot. If the resulting angle is below 90°, aAD is scored with one point. If the resulting angle is above 90°, a score of two is assigned to the foot (Fig. 6).

Fig. 6. Active ankle dorsiflexion.
SA
Subtalar movement is evaluated as described in the Pirani classification for clubfoot, specifically the movement between the head of the talus and the navicular. The examiner abducts the subtalar joint against counterpressure over the lateral aspect of the head of the talus until resistance is felt. The angle between the long axis of the tibia and the long axis of the first metatarsal is measured, and the abduction beyond the neutral foot position is recorded. If abduction exceeds 10°, one point is assigned. If maximum abduction is between 6° and 10°, two points are given. An SA of 0° to 5° is rated with three points, and no SA, defined as an abduction of less than 0°, is rated with four points (Fig. 7).

Fig. 7. Subtalar abduction.
The points for all seven signs are added, resulting in a total score between seven and 18 (Fig. 8).

Fig. 8. PBS Data Collection Tool.
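The scoring rules described above can be summarized in a short sketch. This is an illustration only, not the authors' data collection tool: the names `PbsExam`, `banded_score` and `pbs_total` are hypothetical, angles are assumed to be recorded in whole degrees, and an active goniometer reading of exactly 90° is rejected because the text does not define that case.

```python
from dataclasses import dataclass


@dataclass
class PbsExam:
    """One examined foot (hypothetical container, not from the paper)."""
    hindfoot_varus: bool
    standing_supination: bool
    walking_supination: bool
    early_heel_rise: bool
    passive_dorsiflexion_deg: int   # degrees above neutral; negative = equinus
    active_goniometer_deg: int      # angle between fibula and lateral foot border
    subtalar_abduction_deg: int     # degrees of abduction beyond neutral


def binary_score(sign_present: bool) -> int:
    """Binary signs: one point if absent, two points if present."""
    return 2 if sign_present else 1


def banded_score(degrees_beyond_neutral: int) -> int:
    """Four-level banding used for pAD and SA (assumes whole degrees)."""
    if degrees_beyond_neutral > 10:
        return 1
    if degrees_beyond_neutral >= 6:
        return 2
    if degrees_beyond_neutral >= 0:
        return 3
    return 4  # below neutral


def active_dorsiflexion_score(goniometer_deg: int) -> int:
    """aAD: below 90 degrees scores one point, above 90 degrees two points."""
    if goniometer_deg < 90:
        return 1
    if goniometer_deg > 90:
        return 2
    # The text does not say how exactly 90 degrees is scored.
    raise ValueError("aAD score for exactly 90 degrees is not defined in the text")


def pbs_total(exam: PbsExam) -> int:
    """Sum of the seven item scores; ranges from 7 to 18."""
    return (
        binary_score(exam.hindfoot_varus)
        + binary_score(exam.standing_supination)
        + binary_score(exam.walking_supination)
        + binary_score(exam.early_heel_rise)
        + banded_score(exam.passive_dorsiflexion_deg)
        + active_dorsiflexion_score(exam.active_goniometer_deg)
        + banded_score(exam.subtalar_abduction_deg)
    )


best_foot = PbsExam(False, False, False, False, 15, 80, 15)
worst_foot = PbsExam(True, True, True, True, -5, 100, -5)
print(pbs_total(best_foot), pbs_total(worst_foot))
```

The two example feet reproduce the extremes of the range stated above: four binary items and aAD contribute one or two points each, while pAD and SA contribute one to four points each.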
Statistical analysis
Descriptive statistics are presented as mean, sd, median, range, frequency and percentage. For pairs of raters, weighted (using equal-spacing weights) and unweighted Cohen's Kappa coefficients were calculated using function Kappa in R package vcd (R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/). For all three raters combined, unweighted Fleiss’ Kappa coefficients were calculated using function kappam.fleiss in R package irr. Intraclass correlation coefficients (ICCs) were calculated using function icc in R package irr. All calculations were made using R version 3.5.0.
The Kappa statistic is a measure of agreement between two raters and takes values less than or equal to 1. In the unweighted statistic, any disagreement between the raters is weighted equally, which can be problematic when the rating variable is ordinal: the level of disagreement between two raters is lower when they choose adjacent categories than when they choose very different categories. The weighted Kappa statistic takes the ordinal scale of the variable into account. A rule of thumb for interpreting the Kappa statistic is provided by Landis and Koch: 15
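For readers who want to see what the three-rater statistic measures, the following is a minimal sketch of unweighted Fleiss' Kappa in plain Python. It is not the irr implementation used in the study; the function name `fleiss_kappa` and the input layout (one list of ratings per foot, one entry per rater) are assumptions made for illustration, and the example ratings are invented.

```python
from collections import Counter


def fleiss_kappa(ratings_per_subject):
    """Unweighted Fleiss' Kappa for a fixed number of raters per subject.

    ratings_per_subject: list of lists, one inner list per subject,
    holding one category label per rater.
    Raises ZeroDivisionError if every rating falls in a single category.
    """
    n_subjects = len(ratings_per_subject)
    n_raters = len(ratings_per_subject[0])
    category_totals = Counter()
    agreement_sum = 0.0
    for ratings in ratings_per_subject:
        if len(ratings) != n_raters:
            raise ValueError("every subject needs the same number of ratings")
        counts = Counter(ratings)
        category_totals.update(counts)
        # Share of rater pairs that agree on this subject.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        agreement_sum += agreeing_pairs / (n_raters * (n_raters - 1))
    p_observed = agreement_sum / n_subjects
    total_ratings = n_subjects * n_raters
    # Chance agreement from the overall category proportions.
    p_expected = sum((c / total_ratings) ** 2 for c in category_totals.values())
    return (p_observed - p_expected) / (1.0 - p_expected)


# Invented example: three raters scoring a binary sign (1 or 2) on four feet.
example = [[1, 1, 1], [2, 2, 2], [1, 1, 2], [2, 2, 1]]
print(round(fleiss_kappa(example), 3))
```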
- < 0.00 poor level of agreement;
- 0.00 to 0.20 slight level of agreement;
- 0.21 to 0.40 fair level of agreement;
- 0.41 to 0.60 moderate level of agreement;
- 0.61 to 0.80 substantial level of agreement;
- 0.81 to 1.00 almost perfect level of agreement.
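The difference between unweighted and equal-spacing weighted Kappa, and the rule of thumb above, can be sketched as follows. This is an illustrative plain-Python version rather than the vcd code used in the study; `cohen_kappa` and `landis_koch_label` are hypothetical names, the example ratings are invented, and values that fall between two listed bands (for example 0.205) are assigned to the lower band as an assumption.

```python
def cohen_kappa(ratings_a, ratings_b, categories, weighted=False):
    """Cohen's Kappa for two raters over ordered categories (at least two).

    With weighted=True, equal-spacing weights 1 - |i - j| / (k - 1) are used,
    so disagreement on adjacent categories is penalized less.
    """
    n = len(ratings_a)
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weighted:
            return 1.0 - abs(i - j) / (k - 1)
        return 1.0 if i == j else 0.0

    p_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    p_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return (p_obs - p_exp) / (1.0 - p_exp)


def landis_koch_label(kappa):
    """Rule-of-thumb label after Landis and Koch."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Invented example: two raters scoring a four-level item on four feet,
# disagreeing by one adjacent category on the first foot.
rater_a = [1, 2, 3, 4]
rater_b = [2, 2, 3, 4]
levels = [1, 2, 3, 4]
unweighted = cohen_kappa(rater_a, rater_b, levels)
equal_spacing = cohen_kappa(rater_a, rater_b, levels, weighted=True)
print(round(unweighted, 3), landis_koch_label(unweighted))
print(round(equal_spacing, 3), landis_koch_label(equal_spacing))
```

Because the single disagreement is between adjacent categories, the weighted coefficient is higher than the unweighted one, which is the behaviour the paragraph above describes for ordinal items such as pAD and SA.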
Correlation between the mean PBS score across the three raters and the Roye score was evaluated using Spearman's rank correlation coefficient with cluster robust standard errors, taking into account the dependencies between left and right feet for children with bilaterally affected feet. Only affected feet were included.
Spearman's rank correlation coefficient takes values between -1 and 1, where values close to -1 indicate high negative correlation, values close to 1 indicate high positive correlation, and values close to 0 indicate no or very weak correlation. A rule of thumb for interpreting the coefficient is provided by Colton: 16
- 0 to 0.25 (0 to -0.25) little or no relationship;
- 0.25 to 0.50 (-0.25 to -0.50) fair degree of relationship;
- 0.50 to 0.75 (-0.50 to -0.75) moderate to good relationship;
- 0.75 to 1.00 (-0.75 to -1.00) very good to excellent relationship.
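As an illustration of the coefficient and of the rule of thumb above, the sketch below computes Spearman's rank correlation in plain Python (ties receive average ranks) and maps the result to Colton's bands. It omits the cluster robust standard errors used in the study; `average_ranks`, `spearman_rho` and `colton_label` are hypothetical names, the example scores are invented, and boundary values shared by two bands (0.25, 0.50, 0.75) are assigned to the lower band as an assumption.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def colton_label(rho):
    """Rule-of-thumb label after Colton, applied to the absolute value."""
    size = abs(rho)
    if size <= 0.25:
        return "little or no relationship"
    if size <= 0.50:
        return "fair degree of relationship"
    if size <= 0.75:
        return "moderate to good relationship"
    return "very good to excellent relationship"


# Invented example: mean clinical scores and questionnaire scores for five feet.
clinical = [7.0, 9.3, 9.3, 12.0, 15.7]
questionnaire = [10, 14, 12, 20, 25]
rho = spearman_rho(clinical, questionnaire)
print(round(rho, 3), colton_label(rho))
print(colton_label(0.73))
```

The last line applies the rule to the coefficient reported in this study, which falls in the moderate to good band.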
Results
The Spearman's rank correlation coefficient for correlation between mean PBS Score and Roye score was 0.73 (p < 0.0001). This indicates moderate to good correlation according to the rule of thumb provided by Colton. 16 A scatterplot of mean PBS Score and Roye score is shown in Figure 9. The figure shows that feet with a high mean PBS Score are more likely to have a high Roye score, and vice versa, although there is some variation.

Fig. 9. Scatterplot of Roye score and mean PBS Score. Dot size indicates number of observations.
The interobserver agreement for the total PBS Score resulted in an ICC of 0.93, with coefficients ranging from 0.81 to 0.99, confirming an almost perfect agreement. Looking at the signs individually, pAD reached an ICC of 0.81, suggesting a similar level of agreement. Substantial agreement (ICC from 0.61 to 0.80) was found for heel varus (0.79), SS (0.74) and aAD (0.74). Moderate agreement (ICC from 0.41 to 0.60) was found for WS (0.45), early HR (0.58) and passive SA (0.52) (Tables 1 to 3).
Table 1. Frequencies of the different PBS signs
Table 2. PBS Score, depending on foot status
Table 3. Kappa coefficients and intraclass correlation coefficients for PBS signs
Table notes: Cohen's weighted Kappa, using equal-spacing weights; intraclass correlation coefficients used to assess agreement between raters; NA, not applicable
Discussion
As much as our approach to clubfoot and its outcome measures have changed, there is no scoring system that allows the assessment of the clubfoot in an ambulatory child in the same way as the Pirani score 17 does for the first few months of life. Given the lack of functional components in the Pirani scoring system, it seems ill equipped for the older child with more complex requirements for good foot function. 9,10
The PBS score is a new, simple to apply scoring system focusing on the two main functional units defining outcome: the ankle joint and the subtalar movement as described by Ponseti. 3 pAD has been found to correlate with foot function. Cooper and Dietz 18 suggest that ankle dorsiflexion of less than 5° resulted in poorer foot function. The amount of SA as described in the Pirani classification 19 corresponds with talo-calcaneal alignment. Both are therefore considered in the PBS score, with moderate (0.52 SA) and almost perfect (0.81 pAD) interobserver reliability. Documenting passive range of movement alone, however, is a poor predictor of early stages of recurrence, which reportedly occurs in 18% of feet after successful conservative treatment. 20 A dynamic forefoot supination during swing phase, an early HR and loss of aAD are common early indicators of relapse. 3
In more advanced recurrences, structural changes to the foot anatomy can be increasingly observed. A more functional supination of the first ray during the swing phase can develop into a rigid deformity, visible not only during a gait cycle but also in a neutral weight-bearing position. The PBS score accounts for this in one of its seven criteria with an interobserver reliability of 0.74. The subtalar anatomy defined by its capacity to invert and evert significantly influences the heel alignment and is the basis for the required over correction during the final stages of the Ponseti treatment for clubfoot. 3
A loss of eversion or SA always results in an alignment shift of the heel from valgus to varus. 3 HV appears to be easily assessed, with substantial interobserver agreement (0.79).
It is no surprise to the authors that the dynamic signs of early HR and WS only show moderate agreement between examiners. Although a gait lab setting or video recording would certainly improve accuracy, assessing these signs remains a challenge in a normal clinic setting, and the authors feel it is important for the child to walk several laps before a decision on these signs is made. Strict adherence to the protocol described earlier can help to minimize misinterpretation of these signs, which occur within a very short interval of the gait cycle. Nevertheless, moderate agreement was found for these signs.
Interobserver reliability of the PBS score compares well with the reliability reported for the most commonly used scores. In comparing the results from trained physiotherapy assistants with the scoring of an orthopaedic surgeon, a mean agreement percentage of 83% was stated, with Kappa scores for the individual components ranging from 0.50 to 0.72. 21 Agreement between physicians has been found to be as high as 89%. Jain et al 17 report a total Kappa score of 0.71 when comparing interobserver reliability between five orthopaedic surgeons using the Pirani classification. In a blinded trial with two orthopaedic surgeons, Flynn et al 21 state correlation coefficients of 0.83 for the Dimeglio score and as much as 0.90 for the Pirani score.
More recently, a plantaris, adductus, varus, equinus of the ankle and rotation around the talar head in the frontal plane (PAVER) severity score was introduced by Nunn et al, 22 selecting variables from the Pirani and Dimeglio scores with good inter- and intraobserver agreement. The novel score proposes five signs: plantaris deformity, adductus, varus, equinus and rotation around the talar head. The angles are measured and given a number according to predefined sectors, which is then multiplied by an age-specific multiplier. The explicit aim of this score is to predict the number of casts required for correction in a resource-poor environment with late presentation of clubfoot. With an intraobserver variation of 0.89 and interobserver variation of 0.92, the authors suggest PAVER to be a valid tool in the assessment of delayed presenting clubfoot.
In contrast to the PAVER score, however, the PBS score has been developed, not to identify the amount of casting required for a late presenting clubfoot, but rather to score functional outcome after treatment and identify any signs of relapse at an early stage. The authors feel that by adding functional components (active dorsiflexion, early HR, WS) rather than relying solely on passive range of movement assessment it allows for a more complete evaluation of foot function.
Recent literature has shown that a good functional outcome needs to be seen in the socioeconomic and cultural context of the sample group. 23 A good result in one country might not necessarily reflect a perceived good result in another country, depending on expectations and requirements within the prevalent culture. The Roye score has been proposed by Dietz et al 13 as an outcome assessment tool. As a patient-based outcome score it takes these regional expectations of foot function into consideration. Validating the PBS score against the Roye score in a country with high functional expectations would, if anything, lower the measured validity of the PBS score. Conversely, a plantigrade, pain free and shoe-able foot might be considered a good result in many places, 24,25 but this might not be the case everywhere. If anything, we therefore see a bias towards lower validity and expect a more global validation to take place with increased use of the PBS and Roye scores in different environments.
Pedobarographic studies have shown 14,26 that specific changes in pressure patterns can be used to assess foot function after clubfoot treatment. Validation of the PBS score against objective pedobarographic protocols as suggested by Herd et al 14 or against the Oxford foot model 27 is necessary to further validate the proposed score in the future.
Conclusions
The PBS score is an easy to use, clinical assessment tool for walking age children with clubfoot deformity. It includes passive and active criteria with a very good interobserver reliability (ICC 0.93) and moderate to good validity. The score is currently being used at the Swedish national clubfoot registry and, with further validation by pedobarography or gait analysis, will hopefully find its way into common use, facilitating the early detection of clubfoot relapse.
