Abstract
Introduction
Reimers migration percentage (MP) is the gold standard for measuring hip displacement in children with cerebral palsy (CP). Hip surveillance registries proposed using the top of the Gothic arch (GA) as a modification in patients with acetabular dysplasia because the classical method (CM) described by Reimers may underestimate hip migration. The aim of this study is to assess the inter- and intra-observer reliability of the modified method (MM) versus the CM and identify their effect on the MP.
Methods
We performed a retrospective review of 50 children with CP who had a hip radiograph at our institution between 1st April 2014 and 28th February 2018. All hip radiographs were selected to show the presence of a GA. Four observers measured the MP using the CM and MM for each patient. The intraclass correlation coefficient (ICC) was used to estimate inter- and intra-observer reliability.
Results
Inter-observer reliability was excellent for the CM, with ICC 0.96 (95% CI 0.94 to 0.97), and good for the MM, with ICC 0.78 (95% CI 0.51 to 0.89) (p < 0.001). Intra-observer reliability was excellent for both methods, ranging from ICC 0.94 to 0.99 for the CM and ICC 0.89 to 0.95 for the MM. The mean MP was 19% for the CM and 28% for the MM (p < 0.001).
Conclusion
The CM is more reliable than the MM for measuring hip migration in children with CP. If the CM is used and acetabular dysplasia with a GA is present on the hip radiograph, then a 9% underestimation of hip migration should be considered in decisions on both referral and surgical management.
Level of evidence
II
Introduction
Hip dislocation is a significant but preventable cause of morbidity in children with cerebral palsy (CP) and is often insidious in onset. The prevalence of hip dislocation is reported in the literature as between 10% and 20% in a population of children with CP.1,2 The risk of hip dislocation increases with increasing Gross Motor Function Classification System (GMFCS) Level. 3 Hip dislocation may cause pain and impacts upon the patient's health-related quality of life. 4
Hip displacement in children with CP is difficult to assess clinically due to the development of contractures that may affect hip range of motion. Regular clinical examinations should be combined with series of pelvic radiographs to record progressive hip displacement over time, enabling documentation of pathological changes through comparison with previous measures, provided standard positioning and radiographic measurements are used. 5 In 1994, a hip surveillance programme was established in Southern Sweden with the aim of preventing hip dislocation in children with CP. 6 Ten years following the introduction of the programme, the dislocation rate was reported to have dropped from 8% to 0.5%. Similar findings were reported in the Australian hip surveillance programme, which commenced in 1997. 7
Reimers migration percentage (MP) is the gold standard measure in the radiographic assessment of hip displacement. 8 This method describes the lateral displacement of the femoral head relative to the edge of the acetabulum as denoted by Perkins line (Fig. 1). Using this measure, hip displacement is defined as an MP greater than 30% or 33% in most published studies. 7

Graphical depiction of Reimers migration percentage.8 Hilgenreiner's line (H) is a horizontal line that enables orientation of the pelvis in the horizontal plane. Perkins line (P) is a line which runs from the lateral edge of the roof of the acetabulum. The migration percentage is the part of the femoral head that extends beyond this line.
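The definition above can be written compactly. As a sketch, let A denote the width of the femoral head lying lateral to Perkins line and B the total width of the femoral head, both measured parallel to Hilgenreiner's line. The symbols A and B are labels chosen here for illustration and are not taken from the figure.

```latex
\[
  \mathrm{MP} = \frac{A}{B} \times 100\%
\]
```

For example, a femoral head 30 mm wide with 9 mm lying lateral to Perkins line has an MP of 9/30 × 100% = 30%.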
The definition of hip dislocation is also controversial amongst surgeons, with thresholds ranging from a migration of 80% to 100%.9,10
Controversy also exists in hip surveillance studies on the timing and type of preventative surgery for hip dislocation. Some tertiary hip surveillance centres use an MP over 30% to proceed to hip soft-tissue surgery, while others wait until the migration reaches 40% before they perform hip bony surgery. 11
Reimers proposed that his MP was more reliable than other validated measures of hip displacement and he estimated a standard error of +/–10%. 12 However, some studies have suggested that the inter-observer measurement differences are actually greater and range from 0% to 18%. 13 Factors that may influence the interpretation of radiographs include pelvic obliquity and acetabular dysplasia. 14 In neuromuscular patients, acetabular deficiency is found on computerized tomography (CT) scans to be global with significant posterior deficiency. 15 Reimers MP is based on an antero-posterior (AP) radiograph of the hips and does not provide any information on the presence of an anterior or posterior hip dislocation.
Nevertheless, it has been confirmed that the MP has a strong linear relationship with acetabular dysplasia, as expressed by 3D acetabular indices. 16 The same study showed that an increased MP is the best criterion for determining the need for hip reconstructive surgery, and 3D evaluation using techniques such as 3D CT has been suggested for preoperative hip evaluation, particularly in patients with a higher GMFCS level.
Standardized hip surveillance radiographic protocols have therefore been developed in tertiary centres to reduce variability in measurement. 5
The Gothic Arch (GA) is an important abnormal radiographic feature found on AP pelvic radiographs in children with CP with acetabular dysplasia, which affects the accuracy of hip migration measurements. 17 Unfortunately, it is a confusing term in the international literature as it has also been used to describe a characteristic feature of AP hips radiographs in healthy individuals without CP or hip dysplasia.18–20
Bombelli described the GA in a normal AP pelvic radiograph with the apex of the arch lying directly above the centre of the femoral head on otherwise fit and healthy children without CP (Fig. 2). 19 He hypothesized that hips with an abnormal GA are mechanically jeopardized due to different loading stresses (Wolff's Law) on the acetabulum and, therefore, predisposed to developing osteoarthritis. He reported that in abnormal hips, the apex of the arch lies medial or lateral to a vertical line drawn through the centre of the femoral head, resulting in craniomedial or craniolateral orientation of the GA. Bombelli's description of the GA has been described as a reliable radiographic measurement for developmental dysplasia of the hip. 20

Roach et al used 3D CT analysis to describe acetabular insufficiency and reported that, in a dysplastic-subluxated hip, the displaced femoral head produced constant contact pressure on the central, superior acetabulum, causing severe erosions that consequently produced the shape of a GA. 21 Roach's description of a GA was different to that of Bombelli. Roach's GA is found on the superolateral margin of the acetabulum and it is always associated with acetabular dysplasia.
Roach et al reported that the anterior acetabular column was underdeveloped and did not extend far laterally, causing the acetabulum to be shallow anteriorly. Acetabular anteversion was excluded as an additional cause for the appearance of Roach's GA, as none of their observed acetabular fossae pointed excessively anteriorly. They concluded that the true acetabular rim was evident just below the tip of the arch. Roach's observations were made on non-CP patients.
Cooke et al were the first to notice that when measuring the Reimers hip migration in children with CP, the apex of the acetabulum was not always at its most lateral margin. 22 They stated that acetabular anteversion was common in children with CP and suggested a modified method (MM) to mark the apex of the acetabulum when drawing the Perkins line for Reimers migration (Fig. 3).

Graphical representation of the MM for calculating the migration percentage. In the case of a dysplastic acetabulum, the MM uses the top of the ‘Gothic arch’ (B), so the P (Perkins) line runs more medially than with the CM on the right (A). H represents Hilgenreiner's line.
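The effect of moving Perkins line medially can be illustrated numerically. The following is a minimal sketch, assuming landmark positions expressed as x-coordinates along Hilgenreiner's line that increase from medial to lateral. The function name `migration_percentage` and all coordinate values are hypothetical and chosen for illustration; they are not patient data and do not reproduce the TraumaCad implementation.

```python
def migration_percentage(head_medial_x, head_lateral_x, perkins_x):
    """Reimers MP for one hip, with x increasing from medial to lateral.

    All three arguments are hypothetical landmark positions (e.g. in mm)
    measured along Hilgenreiner's line.
    """
    head_width = head_lateral_x - head_medial_x
    if head_width <= 0:
        raise ValueError("lateral head margin must lie lateral to the medial margin")
    uncovered = head_lateral_x - perkins_x
    # Clamp so a fully covered head gives 0% and a fully uncovered head gives 100%.
    uncovered = max(0.0, min(uncovered, head_width))
    return 100.0 * uncovered / head_width


# Illustrative landmarks only, not patient data.
head_medial, head_lateral = 0.0, 40.0
perkins_cm = 30.0  # lateral edge of the acetabulum (classical method)
perkins_mm = 26.0  # top of the Gothic arch, lying more medially (modified method)

mp_cm = migration_percentage(head_medial, head_lateral, perkins_cm)
mp_mm = migration_percentage(head_medial, head_lateral, perkins_mm)
print(f"CM: {mp_cm:.0f}%  MM: {mp_mm:.0f}%  difference: {mp_mm - mp_cm:.0f} points")
```

Because the femoral head width is unchanged, any medial shift of Perkins line increases the uncovered width and therefore the MP, which is why the MM yields the higher value for the same hip.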
In a more recent study, Chang et al analysed 3D CT scans of children with CP and displaced hips and reported that, although the direction of spastic hip subluxation is generally agreed to be posterolateral, the acetabular dysplasia in spastic hip subluxation is, by contrast, global and more apparent in the anterior aspect. 23
Parrot et al published their results on inter- and intra-rater reliability of Reimers MP and acetabular index on pelvic radiographs of children with CP and displaced hips. 24 They used the definition of GA, as described by Roach et al, as the result of eccentric pressure from a subluxated femoral capital epiphysis inhibiting the ossification of the superolateral aspect of the cartilaginous anlage. They used the midpoint of the GA instead of at the lateral margin to place the Perkins line for their measurements (Fig. 4).

The red arrow shows the top of Gothic arch, where the apex of acetabulum is located in case of dysplasia in children with CP.
Both the Swedish and the Australian hip surveillance groups have adopted the GA concept and proposed a modification of Perkins line in which the ‘top of the GA is used’.24,6
However, difficulty in identifying the top of the GA has led to many clinicians persisting with the classical method (CM) as described by Reimers and routinely using the lateral edge of the acetabulum to draw the Perkins line when documenting the MP.
When the CM is used, the femoral head migration can be underestimated, and this may result in delayed referral of children with at-risk hips to a specialist referral centre. Utilization of the MM for hip surveillance may result in more timely referrals for children with CP but might also be associated with reduced measurement reliability between different observers.
For this reason, Parrot et al suggested using the same observer for assessing hip displacement and any decision making regarding appropriate intervention. 24 Their results indicated that with good attention to detail, such as patient position, standard measurement protocol and using the same experienced clinician who regularly measures radiographs, the amount of variability can be reduced.
The aim of this study is to assess the inter-observer and intra-observer reliability of the modified MP which utilizes the GA versus the classic measure described by Reimers using digital templating software. If the reliability is found to be higher with the CM, which is our hypothesis, we will subsequently calculate the average degree of underestimation of the hip migration as this is a very important factor that hip surveillance groups around the world would need to take into account for both referral as well as surgical management purposes.
Materials and methods
Study design
We performed a retrospective radiographic review of children with CP who attended the Evelina London Children's Hospital between 1st April 2014 and 28th February 2018. This cohort consisted of children with CP who were enrolled into our regional hip surveillance programme with a catchment area of 2.5 million paediatric patients. We identified 50 AP pelvic radiographs with the presence of an abnormal GA as described in the literature with regard to hip surveillance for children with CP.6,24 We did not use the hip radiographs of the same patient more than once. The radiographs were anonymized and retrieved from the hospital Picture Archiving and Communication System (PACS). AP hip radiographs were taken in accordance with a documented local hip surveillance protocol to avoid variability in patient positioning. Our institution is the tertiary paediatric referral centre for the South East of England and has led an established regional hip surveillance programme since 1997. A hip surveillance radiographic positioning protocol is strictly followed to ensure appropriate handling of children with contractures, resulting in the highest possible imaging quality. 5
We excluded all radiographs in patients with metalwork or previous bony surgery and radiographs with pelvic obliquity and/or rotation.
The MP was measured using the CM described by Reimers and the MM as used in hip surveillance registries.6,12 Measurements were made independently by four observers with varying degrees of experience in paediatric orthopaedics. Both the right and left hips on the AP radiographs were measured. The observers were a senior paediatric physiotherapist specialized in neurodisability, Anita Patel (AP); the superintendent paediatric radiographer, Lucy Clough (LC); a specialist registrar in orthopaedics (CW); and a final year medical student (PC). The lead author (MK), a senior paediatric orthopaedic surgeon, provided training in the measurement of the MP using the CM and MM for all the observers at the same sitting using the same criteria.
The radiographs were analysed using the paediatric section of the TraumaCad software (TraumaCad version 2.0, Orthocrat™, Westchester, USA), which is a PACS integrated computer software program (Fig. 5). To evaluate intra-rater reliability, the measurements were documented two weeks apart by the observers. This amounted to a total of 200 measurements per observer.

Hip radiograph in patient with CP with the datum points and lines (shown in green) using the TraumaCad software. The Amber line represents Hilgenreiner's line.
The primary outcome measure was the inter-rater agreement between the observers in the use of the CM and MM in calculating the MP, together with the intra-rater agreement and the standard error between the two measures. We also hypothesized that there would be a statistically significant difference in the average MP values between the CM and MM.
Statistical analysis
The statistician Lucy Sayer (LS) was consulted prior to the study to determine the sample size required for statistical significance. Inter- and intra-rater reliability were evaluated using the intraclass correlation coefficient (ICC). ICC values range from 0 to 1, with higher values reflecting more reliable measurements. An ICC value of 0.0 suggests that the variance is entirely due to measurement error and that none is the result of interpatient variability, whereas a value of 1.0 implies that the variance is due entirely to inter-subject variability. 25
ICC estimates for the inter-observer reliability and their 95% confidence intervals (CI) were calculated using SPSS statistical package version 25 (IBM SPSS Inc., Chicago, Illinois) based on a single measure, absolute agreement, two-way random effects model. For the intra-observer reliability, ICC estimates and their 95% CI were based on a single measure, absolute agreement, two-way mixed effects model. As reported in the literature, an ICC value greater than 0.75 suggests good measurement reliability and a value greater than 0.90 excellent reliability. 26
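For readers unfamiliar with this form of the ICC, the point estimate for a single measure with absolute agreement can be sketched from the two-way ANOVA mean squares. The code below is an illustrative pure-Python sketch, not the SPSS routine used in the study, and it does not compute confidence intervals. The function name `icc_absolute_agreement_single` and the example ratings are hypothetical. To our understanding, the same point-estimate formula applies to the two-way random and two-way mixed models, which differ in interpretation rather than in calculation.

```python
def icc_absolute_agreement_single(ratings):
    """Single-measure, absolute-agreement ICC from a two-way layout.

    `ratings` is a list of rows, one per hip, each holding one
    measurement per observer (no missing values).
    """
    n = len(ratings)  # number of subjects (hips)
    k = len(ratings[0]) if n else 0  # number of observers (or sessions)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with at least 2 rows and 2 columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into subjects, observers and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denominator


# Hypothetical MP readings (%) for three hips by two observers, where the
# second observer reads every hip 5 points higher than the first.
example = [[10.0, 15.0], [20.0, 25.0], [30.0, 35.0]]
print(round(icc_absolute_agreement_single(example), 3))
```

In this example the two observers rank the hips identically, yet the estimate falls below 1 because the absolute-agreement form penalises the systematic offset between them. This is the property that makes it suitable for comparing observers who must produce interchangeable MP values.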
We also quantified the difference in average MP values between CM and MM using an analysis of variance (ANOVA) test.
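The ANOVA design is not specified further in the text. As a sketch only, assuming a simple one-way layout with the CM and MM readings treated as two groups, the F statistic can be computed as follows. The function name `one_way_anova_f` and the sample values are hypothetical, and converting F to a p value requires the F distribution from a statistics package, which is omitted here.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n_total > k")

    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variance")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical MP readings (%), not study data.
cm_readings = [15.0, 18.0, 20.0, 23.0]
mm_readings = [24.0, 27.0, 29.0, 32.0]
f_stat, df_b, df_w = one_way_anova_f(cm_readings, mm_readings)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With only two groups, this F statistic equals the square of the independent-samples t statistic. Because each hip was measured by both methods, a paired or repeated-measures analysis would also be a reasonable design choice.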
Ethics
Ethical approval was obtained from the Clinical Academic Research and Development.
Results
A total of 100 hips (50 children) were analysed. There were 18 children with GMFCS level II, 18 children with GMFCS level III, 12 children with GMFCS level IV and two children with GMFCS level V. There were 30 male and 20 female patients with a mean age of 8.4 years (age range 3 to 13). All four observers (AP, LC, CW and PC) performed 100 measurements twice on the study patients. This provided 800 measurements for analysis.
Analysis of the inter-observer reliability for the CM of hip migration measurement provided an ICC value of 0.963 (95% CI 0.944 to 0.976). However, for the MM of hip migration measurement, the inter-observer reliability had an ICC of 0.777 (95% CI 0.510 to 0.885). The difference in the ICC values between the classical and modified measures of hip migration was highly statistically significant (p < 0.001).
There was also variability in the intra-observer reliability between the four observers. For the CM, the ICC values for the intra-observer reliability were AP 0.993 (0.990 to 0.995); CW 0.980 (0.971 to 0.987); LC 0.939 (0.910 to 0.958); PC 0.982 (0.973 to 0.988). For the MM, the ICC values for the intra-observer reliability were AP 0.993 (0.990 to 0.996); CW 0.948 (0.920 to 0.965); LC 0.900 (0.806 to 0.943); PC 0.894 (0.823 to 0.934) (Table 1).
Table of the intra-rater reliability between four observers
The difference in average MP values between the CM and MM was also analysed. For the same data set, the mean MP was 19% for the CM and 28% for the MM. This difference was also highly statistically significant on ANOVA (p < 0.001).
Discussion
The use of hip surveillance programmes in children with CP is becoming more prevalent. In the United Kingdom, the National Institute for Health and Care Excellence (NICE) recommends the establishment of regional hip surveillance programmes to screen children with CP for displacement. The longest established programme, in Sweden, uses hip displacement greater than 30% as the threshold for referral for follow-up and intervention. In patients with hip dysplasia, the top of the GA has been used in these programmes to correspond to the lateral edge of the acetabulum. This threshold value is also utilized by other national registries.27,28
In the United Kingdom, regional hip migration programmes use a network of highly specialized physiotherapists to perform the hip surveillance measures in the community. 29 Our study is the first to use a multidisciplinary cohort of observers with different experience levels to estimate the ICC. We found that the inter-observer agreement for the CM was significantly higher than that for the MM using the GA (ICC 0.963 versus 0.777). However, there was no significant difference in the intra-observer reliability between the observers in the calculation of the MP.
We also found that the use of the CM to calculate hip migration in the presence of a GA may underestimate the MP by 9% (the mean MP was 19% for the CM and 28% for the MM; p < 0.001). This finding is very important for both specialist referral and surgical management, and it has to be taken into account when the GA is used for measurements (Figs. 6 and 7).

This is a hip radiograph of a child with CP, found to have a left dislocated hip and a right migrated hip. This is an example where decision making for surgical treatment will be influenced by the method chosen to measure Reimers migration, as the MM shows a right hip migration of 48% while the CM shows a right hip migration of 39% (underestimation of 9%). Some surgeons would not choose to operate on the right hip unless there is migration over 40%. Green: Hilgenreiner's line; red: Perkins line; blue: inner and outer margins of ossific nucleus; yellow: Gothic arch.

This is an example where decision making for referral to a specialist centre for further management will be determined by the method chosen to measure Reimers migration. The MM shows a right hip migration of 37% while the CM shows a right hip migration of only 28% (9% migration underestimation), which means that using the CM in this case could delay referral and appropriate management. Green: Hilgenreiner's line; red: Perkins line; blue: inner and outer margins of ossific nucleus; yellow: Gothic arch.
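The two figure examples can be restated as a simple threshold check. The sketch below uses the thresholds and readings quoted in the text. The helper `exceeds` and the idea of adding the mean 9-point difference to a CM reading are illustrative assumptions, not a validated clinical rule, since the 9% figure is a group average and individual hips will vary.

```python
# Thresholds quoted in the text; other centres may use different values.
REFERRAL_THRESHOLD = 30.0  # referral when MP is greater than 30%
SURGERY_THRESHOLD = 40.0   # some surgeons operate only when MP is over 40%

# Mean CM-versus-MM difference reported in this study (28% minus 19%).
MEAN_UNDERESTIMATION = 9.0


def exceeds(mp, threshold):
    """True when a migration percentage is strictly above the threshold."""
    return mp > threshold


# (description, CM reading, MM reading, threshold) from the two figure examples.
examples = [
    ("surgical decision", 39.0, 48.0, SURGERY_THRESHOLD),
    ("referral decision", 28.0, 37.0, REFERRAL_THRESHOLD),
]

for label, mp_cm, mp_mm, threshold in examples:
    # Hypothetical adjustment for illustration, not a validated rule.
    adjusted = mp_cm + MEAN_UNDERESTIMATION
    print(
        f"{label}: CM {mp_cm:.0f}% -> {exceeds(mp_cm, threshold)}, "
        f"MM {mp_mm:.0f}% -> {exceeds(mp_mm, threshold)}, "
        f"CM + 9 -> {exceeds(adjusted, threshold)} (threshold {threshold:.0f}%)"
    )
```

In both examples the CM reading falls below the threshold while the MM reading exceeds it, so the choice of method alone changes the decision.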
The observers in the study had varying degrees of experience with all four observers having been provided the same training prior to the commencement of the project. A study by Faraj et al, in which the MP was measured using the CM and manual measures, found a standard error of measurement of 12.9% for one observer and 22% between both observers. 13 In this study the standard error was greater between observers when the MM was used to estimate the MP (0.73 vs 0.65).
The use of digital templating software such as the program utilized in this study has been shown to improve the accuracy of the hip migration measure. 26 Segev et al utilized five paediatric orthopaedic consultants as observers using digital templating software to measure the MP using the CM. 26 The ICC for the observers ranged from 0.83 to 0.92, but the sample size was only ten patients with CP. Another analysis compared the reliability of the MP between an experienced and an inexperienced physiotherapist and found an excellent ICC of 0.94. 30 Our findings support the use of digital templating software, as the intra-observer ICC for the four observers in this study ranged between 0.939 and 0.993 for the CM and between 0.894 and 0.993 for the MM.
Conclusion
The CM is a more reliable method to measure hip migration in digital templating software compared to the MM, but it does underestimate the hip migration by 9% when a GA is present on the AP hips radiograph. We recommend that health professionals involved in hip surveillance programmes for children with CP consider this when making decisions about specialist referral as well as surgical management.
Both the CM and MM are fit for purpose. Hip surveillance groups with a small number of experienced observers could use the MM, as the variability could be kept low. Large hip surveillance observer groups should probably use the CM for their measurements, as it is more reliable and less confusing for the observers involved, and they could note the possibility of a 9% migration underestimation in their final report. This underestimation of hip migration should also be considered by subspecialty surgeons for decision making at the tertiary referral centre.
Footnotes
Acknowledgements
We would like to thank Senior Specialist Paediatric Physiotherapist Anita Patel (AP) and Superintendent Radiographer Lucy Clough (LC) for all their work with the radiographic measurements as well as statistician Lucy Sayer for her help with the statistical analysis.
PC: Data collection, Analysis.
CS: Data collection, Analysis.
MK: Senior author, Study design.
