
Abstract

Study design

Retrospective, mono-centric cohort research study.

Objectives

The purpose of this study is to validate a novel artificial intelligence (AI)-based algorithm against human-generated ground truth for radiographic parameters of adolescent idiopathic scoliosis (AIS).

Methods

An AI-algorithm was developed that is capable of detecting anatomical structures of interest (clavicles, cervical, thoracic, and lumbar spine, and sacrum) and calculating essential radiographic parameters in AP spine X-rays fully automatically. The evaluated parameters included T1-tilt, clavicle angle (CA), coronal balance (CB), lumbar modifier, and Cobb angles in the proximal thoracic (C-PT), thoracic, and thoracolumbar regions. Measurements from 2 experienced physicians on 100 preoperative AP full spine X-rays of AIS patients were used as ground truth and to evaluate inter-rater and intra-rater reliability. The agreement between human raters and AI was compared by means of single measure Intra-class Correlation Coefficients (ICC; absolute agreement; >.75 rated as excellent), mean error and additional statistical metrics.

Results

The comparison between human raters resulted in excellent ICC values for intra- (range: .97-1) and inter-rater (.85-.99) reliability. The algorithm was able to determine all parameters in 100% of images with excellent ICC values (.78-.98). Consistent with the human raters, ICC values were typically smallest for C-PT (eg, rater 1A vs AI: .78, mean error: 4.7°) and largest for CB (.96, -.5 mm) as well as CA (.98, .2°).

Conclusions

The AI-algorithm shows excellent reliability and agreement with human raters for coronal parameters in preoperative full spine images. The reliability and speed offered by the AI-algorithm could contribute to the efficient analysis of large datasets (eg, registry studies) and measurements in clinical practice.

Keywords

coronal balance artificial intelligence deep learning adolescent idiopathic scoliosis X-ray spinal deformity surgical planning coronal alignment

Introduction

Adolescent idiopathic scoliosis (AIS) is a complex two- or even three-dimensional spinal deformity.¹ In both conservative therapies using customized braces and surgical interventions, AIS patients undergo regular radiological examinations. Additional follow-up evaluations require the estimation of coronal curve characteristics as well as the progression of the coronal deformity. However, currently employed manual methods for the radiological image analysis are time-consuming and dependent on physician experience and therefore hinder the efficient and objective assessment of relevant coronal radiographic parameters in clinical routine and the analysis of large databases for research purposes.²

Standing full-spine X-ray images and lateral bending images are the most common imaging modalities for the radiographic analysis of AIS. These images are used to assess the severity of the deformity, for which Lenke et al have developed a widely adopted classification system.³ The Cobb angle is fundamental for the Lenke et al classification and thus the evaluation of AIS patients and their therapeutic decision-making process.^4-6 Therefore, a reliable measurement of the Cobb angle is critical for clinical routine. Due to its significance in the characterization and treatment of AIS patients, the Cobb angle furthermore serves as a central parameter in studies analyzing inter-rater reliability in the radiographic assessment of coronal parameters. A recent study rated the inter-rater reliability (quantified by intra-class-correlation coefficients, ICCs) for the determination of the Cobb angle as excellent, with ICCs greater than .98 and an average variability of 3°.⁷ These findings correspond to those in earlier studies demonstrating a correlation of .98 for repeated measurements with a standard deviation of the differences in primary Cobb angle of 2.5°-3.2°.^8,9 However, excellent agreement did not apply to smaller curves (<20°) in the secondary Cobb angle, with a correlation of only .52.⁹ In addition, only very few studies reported reliability on further essential coronal parameters (eg, shoulder balance).¹⁰ Discrepancy in the assessment of manual measurements as well as the required time for the assessment of multiple radiographic parameters by physicians highlight the need for the objective, observer-independent, automated assessment of coronal radiographic parameters in clinical routine and research.

Automatic determination of radiographic parameters of the coronal balance could provide the efficiency and accuracy required in clinical assessment and treatment of AIS. Algorithms based on artificial intelligence (AI) represent an approach for the rapid and independent computation of essential radiographic parameters, which could improve measurement validity and streamline radiology workflows.¹¹ In 2013, Langensiepen et al conducted a review of the results of eleven promising studies using novel, semi-automatic, app-controlled, and automatic measurement approaches.¹² However, all of the methods required the manual placement of anatomical landmarks or identification of anatomical regions of interest. Although a few recent publications have presented the results of fully automated assessment procedures,^13,14 they are limited to Cobb angles in the coronal plane and thus omit other essential parameters, such as coronal and shoulder balance or T1-tilt. Furthermore, they lack a comprehensive comparison of inter- and intra-rater reliability in larger patient cohorts with different physicians, and the absence of high-quality reference measurements by experienced raters impedes the statistical interpretation of their findings.

Therefore, the goal of the present research study is to develop and scientifically validate a fully automated AI-based algorithm able to determine several essential coronal parameters by comparing its predictions with assessments of 2 experienced physicians in 100 AIS patients.

Methods

This study hypothesized that a novel AI-based algorithm can determine essential coronal parameters fully automatically with excellent reliability (ICC >.75) compared to experienced physicians from 2 different scoliosis centers.

Study Design and Patient Selection

In this study, 100 prospectively collected preoperative AP whole spine X-rays of AIS patients from a database of a single scoliosis center were used for retrospective measurements of coronal parameters. The images were taken preoperatively in a neutral standing position and recorded between May 2019 and September 2021 by experienced technicians using the EOS® system (sterEOS imaging, Paris, France, version: 1.8.7.66 R; 2D postural assessment). At the time of surgery, the average age of the patients (80♀/20♂) was 14.6 years (range: 12-17 years) and the mean BMI was 20.4 kg/m² (13.8-34.6 kg/m²).

The de-identified radiographs in DICOM format were made available to 2 physicians (rater 1 and rater 2) experienced in the radiographic evaluation of AIS patients and performing measurements in daily routine. Using a standardized digital imaging tool (Surgimap® Version 2.3.2.1), both raters manually measured coronal parameters separately and independently from each other. Rater 1 conducted the measurements independently twice (rater 1A, B) to allow the assessment of intra-rater reliability in addition to inter-rater reliability between both raters. All measurements from both human raters were blinded to each other, as these values served as the ground truth to validate the algorithm’s predictions.

For this research study, no X-ray imaging beyond the clinical standard was performed. This study adhered to legal data protection regulations and to the 1964 Helsinki Declaration, its amendments, and other equivalent ethical standards. Patients signed written informed consent and approval was granted by the Ethics Committee of the doctors’ chamber of Schleswig Holstein (registry number 037/18 m).

Evaluated Coronal Parameters

The following coronal radiographic parameters were evaluated in accordance with scientific literature.^15-19

• T1-tilt (°): angle between a line along the superior endplate of T1 and the horizontal line (positive sign: patient’s right side of the T1-vertebral body (VB) is located more cranially than the left side).

• Clavicle angle (CA, °): angle between the line passing through the most cranial points of both clavicles and the horizontal plane (positive sign: patient’s right shoulder is located more cranially than the left side).

• Coronal balance (CB, mm): distance between the central sacral vertical line (CSVL) and the C7 plumb line (C7PL; positive sign: the C7PL was located on the right side of the CSVL).

• Cobb angles (C, °): angles between the superior endplate of a cranial VB and the inferior endplate of a caudal VB. In accordance with the Lenke classification,³ Cobb angles were classified as:

o proximal thoracic (C-PT; apex between T2 and T6),

o thoracic (C-T; apex between vertebra T6 and disc T11/T12),

o thoracolumbar/lumbar (C-TL; apex between disc T12/L1 and L4).

• Lumbar modifier (LM): classification in grades A, B or C depending on the CSVL intersection with the lumbar apex³:

o ‘A’: CSVL oriented between the pedicles,

o ‘B’: CSVL touched a pedicle, and

o ‘C’: CSVL lateral to the lumbar apex.
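The angular and distance definitions above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes landmark points given as (x, y) pairs with y increasing downward (image convention), points passed explicitly by patient side, and, for CB, an x-axis oriented so that larger x corresponds to the side the definition calls "right". The function names and the `mm_per_pixel` parameter are hypothetical.

```python
import math


def tilt_angle_deg(right_pt, left_pt):
    """Angle (degrees) between the line through two landmarks and the horizontal.

    right_pt / left_pt: (x, y) of the landmark on the patient's right / left side,
    with y increasing downward (assumed image convention).
    Positive when the patient's right landmark is more cranial (smaller y).
    Usable for T1-tilt (endplate corners) and clavicle angle (most cranial points).
    """
    dx = abs(left_pt[0] - right_pt[0])   # horizontal separation
    dy = left_pt[1] - right_pt[1]        # > 0 when the right point is higher
    return math.degrees(math.atan2(dy, dx))


def coronal_balance_mm(c7_x, csvl_x, mm_per_pixel=1.0):
    """Signed horizontal offset of the C7 plumb line from the CSVL in mm.

    Assumes the x-axis is oriented so that larger x means "right" in the
    sense of the sign convention above (an assumption, not given in the text).
    """
    return (c7_x - csvl_x) * mm_per_pixel


# Example: right clavicle point 100 px higher over a 100 px horizontal span.
example_angle = tilt_angle_deg(right_pt=(0, 0), left_pt=(100, 100))
example_cb = coronal_balance_mm(c7_x=120, csvl_x=100, mm_per_pixel=0.5)
```

Passing the points by patient side keeps the sign convention independent of how the radiograph is displayed.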

AI-Algorithm

A fully automated algorithm was developed, containing a deep learning convolutional neural network that first identifies different anatomical structures in AP spine X-rays (“phase 1”); the parameters are subsequently computed from the network’s output (“phase 2”). As a result, given an AP full spine X-ray image as input, the algorithm computes all aforementioned coronal parameters fully automatically and visualizes them in a proving image as the output (Figure 1).

Figure 1.

Schematic representation of the pipeline of the presented algorithm showing an AP full spine X-ray as input (left); segmentation of anatomical structures of interest (middle left); computation of parameters based on the segmentation results (middle right); visualization of the coronal parameters (right).

Phase 1: Segmentation

Preprocessing and Data Enhancement

The goal of the developed segmentation model was to localize the sacrum, both clavicles, and all VBs and to classify each VB by its region (cervical, thoracic, lumbar; Figure 1 – phase 1). In a first preprocessing step, to improve the visibility of bony structures, brightness and contrast were enhanced on all training, validation, and test images using the window width and center information from the respective DICOM tags and adaptive histogram equalization.
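The windowing step can be sketched as follows. The text does not specify the exact transform, so this example assumes the linear window function described in the DICOM standard, applied with the values of the Window Center (0028,1050) and Window Width (0028,1051) tags. The function names are hypothetical, pure Python lists stand in for an image array, and the subsequent adaptive histogram equalization is not shown.

```python
def apply_window(value, center, width, out_min=0.0, out_max=255.0):
    """Map one stored pixel value to the display range with a linear window."""
    if width < 1:
        raise ValueError("window width must be >= 1")
    lower = center - 0.5 - (width - 1) / 2.0
    upper = center - 0.5 + (width - 1) / 2.0
    if value <= lower:          # below the window: clip to darkest value
        return out_min
    if value > upper:           # above the window: clip to brightest value
        return out_max
    # inside the window: linear ramp between out_min and out_max
    return ((value - (center - 0.5)) / (width - 1) + 0.5) * (out_max - out_min) + out_min


def window_image(pixels, center, width):
    """Apply the window to an image given as a list of pixel rows."""
    return [[apply_window(v, center, width) for v in row] for row in pixels]


# Example with a hypothetical window (center 128, width 256).
windowed = window_image([[0, 127.5, 1000]], center=128, width=256)
```

In practice an array library would replace the nested lists; the mapping itself stays the same.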

Both training and test datasets comprised anonymized preoperative and postoperative, full spine and full body AP images that were obtained from 3 different clinical sites. The majority of patients in both datasets suffered from spinal coronal deformities. Ground truth segmentation information for all visible anatomical entities in the training and test datasets was created by trained medical professionals using a web-based annotator. Annotations consisted of bounding polygon masks around each anatomical structure and the respective labels (cervical/thoracic/lumbar VB, sacrum, clavicle, first rib, femoral head). The final training dataset consisted of 271 full spine and 161 full body images.

Training

A Mask Region-Based Convolutional Neural Network (Mask R-CNN) was trained for 370 epochs with a constant learning rate of .002 using the PyTorch framework on an NVIDIA GeForce GTX 1080 GPU.²⁰ The training was initialized with pretrained weights from an instance segmentation model trained on the publicly available ImageNet model zoo²¹ to improve the robustness of the model.^22,23 During training, flipping augmentation was applied to a randomly selected half of the training images. Inference results of the model consisted of structure localization (segmentation mask, bounding box), assigned category and certainty score.
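The flipping augmentation can be illustrated with a short sketch. The text does not state the flip direction or how the half was drawn, so this example assumes a horizontal flip, polygon vertices given in pixel-center coordinates, and a fixed random draw of exactly half of the samples. All names are hypothetical, and plain lists stand in for image tensors.

```python
import random


def hflip_sample(image, polygons):
    """Mirror an image (list of pixel rows) and its polygon annotations left-right."""
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # x' = width - 1 - x keeps vertices aligned with the mirrored pixels
    flipped_polygons = [[(width - 1 - x, y) for (x, y) in poly] for poly in polygons]
    return flipped_image, flipped_polygons


def augment(dataset, rng):
    """Flip a randomly chosen half of the samples and leave the rest unchanged.

    dataset: list of (image, polygons) pairs; rng: a random.Random instance.
    """
    n = len(dataset)
    chosen = set(rng.sample(range(n), n // 2))
    augmented = []
    for i, (image, polygons) in enumerate(dataset):
        if i in chosen:
            image, polygons = hflip_sample(image, polygons)
        augmented.append((image, polygons))
    return augmented


# Example: four copies of a tiny 2x3 image with one two-point "polygon".
sample = ([[1, 2, 3], [4, 5, 6]], [[(0, 0), (2, 1)]])
augmented = augment([sample] * 4, random.Random(0))
```

Masks and bounding boxes have to be mirrored together with the image, otherwise the annotations no longer match the anatomy.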

Phase 2: Parameter Determination

In phase 2, predictions from the trained model were used to compute coronal parameters using calculus and geometrical interrelations. Initially, as the model only distinguishes between cervical, thoracic, and lumbar VBs, the algorithm labelled all VBs from bottom to top assuming 5 lumbar, 12 thoracic and 7 cervical VBs. Subsequently, a spline curve was fitted through the centers of mass of all VBs as shown in Figure 2.
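The bottom-to-top labelling can be sketched as below. This is an illustration under stated assumptions, not the authors' code: it assumes that the lowest detected VB is L5, that VB centers are given as (x, y) with y increasing downward, and that the predicted region class is not needed for the counting. The names `LEVELS` and `label_vertebrae` are hypothetical.

```python
# Level names from caudal to cranial: L5..L1, T12..T1, C7..C1 (24 in total).
LEVELS = (
    [f"L{i}" for i in range(5, 0, -1)]
    + [f"T{i}" for i in range(12, 0, -1)]
    + [f"C{i}" for i in range(7, 0, -1)]
)


def label_vertebrae(centers):
    """Assign anatomical levels to vertebral body centers, counting upward from L5.

    centers: list of (x, y) with y increasing downward (image convention).
    Returns a dict mapping level name to center. If fewer than 24 VBs are
    visible, the most cranial levels are simply not assigned.
    """
    if len(centers) > len(LEVELS):
        raise ValueError("more vertebral bodies than the assumed 5 + 12 + 7")
    bottom_to_top = sorted(centers, key=lambda p: p[1], reverse=True)
    return dict(zip(LEVELS, bottom_to_top))


# Example: three detections in arbitrary order.
labels = label_vertebrae([(0, 100), (0, 300), (0, 200)])
```

A fixed 5/12/7 count cannot represent anatomical variants such as a lumbarized sacral segment, which is a known limitation of this kind of counting.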

Figure 2.

(a) Midpoints of all VB segmentations; (b) spline fit through midpoints; (c) extremum (blue) and points of inflection (yellow) derived from the spline fit and considered end vertebrae according to the Lenke Cobb area classification (green).

For Cobb angles, apices and end vertebrae were identified by computing the extrema and points of inflection on the spline, respectively. If fewer than 4 inflection points were detected, additional end vertebra locations were defined using the Lenke classification for Cobb angle areas³ (Figure 2(c)).
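The idea of locating apices and end vertebrae from the shape of the centerline can be sketched with a simplified stand-in. The sketch replaces the spline fit with finite differences on the VB centers, and it approximates the Cobb angle by the difference between the centerline tangent angles at the two end vertebrae instead of measuring endplates. Both substitutions are assumptions made to keep the example dependency-free. The fallback for fewer than 4 inflection points is not shown, and all names are hypothetical.

```python
import math


def tangent_angles(centers):
    """Centerline tangent angle (degrees from vertical) at each vertebra.

    centers: (x, y) points ordered cranial to caudal, y increasing downward.
    Central differences inside, one-sided differences at both ends.
    """
    n = len(centers)
    angles = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        dx = centers[hi][0] - centers[lo][0]
        dy = centers[hi][1] - centers[lo][1]
        angles.append(math.degrees(math.atan2(dx, dy)))
    return angles


def local_extrema(values):
    """Indices of interior points where the sequence changes direction."""
    return [
        i for i in range(1, len(values) - 1)
        if (values[i] - values[i - 1]) * (values[i + 1] - values[i]) < 0
    ]


def find_apices(centers):
    """Apex candidates: extrema of the lateral deviation."""
    return local_extrema([c[0] for c in centers])


def find_end_vertebrae(centers):
    """End vertebra candidates: most tilted points, i.e. inflections of the curve."""
    return local_extrema(tangent_angles(centers))


def cobb_angle_deg(centers, upper, lower):
    """Approximate Cobb angle between two end vertebra indices."""
    angles = tangent_angles(centers)
    return abs(angles[upper] - angles[lower])


# Synthetic S-shaped centerline: 25 centers, 10 mm apart, 20 mm amplitude.
centers = [(20 * math.sin(2 * math.pi * i / 24), 10.0 * i) for i in range(25)]
apices = find_apices(centers)
ends = find_end_vertebrae(centers)
upper_curve = cobb_angle_deg(centers, 0, ends[0])
```

On this synthetic curve the apices fall at the two lateral extremes and the single interior end vertebra lies where the curve changes direction between them.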

Statistical Analysis

To ensure comparability to similar studies,^13,24,25 the degree of absolute agreement between raters was determined using two-way mixed (intra-rater reliability) or random (inter-rater) single-measure Intra-class Correlation Coefficients²⁶ in addition to the Pearson correlation coefficient (“r”). Furthermore, the mean error and its 95% confidence interval (CI), the standard deviation (SD) and the root-mean-square-error (RMSE) were evaluated. The LM was examined with overall accuracy and a Cohen’s kappa statistical measure²⁷ for inter-rater agreement using quadratic weights. ICC values and Cohen’s kappa above .75 were considered excellent.^28,29 All statistical evaluations were performed in Python 3 programming language.³⁰
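The agreement metrics named above can be sketched in plain Python. This is a minimal illustration rather than the study's analysis code. It assumes complete data (every rater measured every image), the sign convention "second rater minus first rater" for errors, and lumbar modifier grades ordered A < B < C. It uses the standard two-way ANOVA estimator of the single-measure absolute-agreement ICC, which is numerically the same for the random and mixed models. The function names are hypothetical.

```python
import math
import statistics


def icc_absolute_single(ratings):
    """Single-measure, absolute-agreement ICC from a two-way ANOVA.

    ratings: list of n rows (images), each with k values (one per rater).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    msr = ss_rows / (n - 1)                                   # between images
    msc = ss_cols / (k - 1)                                   # between raters
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def error_metrics(first, second):
    """Mean error, sample SD of the errors, and RMSE (second minus first)."""
    diffs = [b - a for a, b in zip(first, second)]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return statistics.mean(diffs), statistics.stdev(diffs), rmse


def quadratic_weighted_kappa(first, second, categories=("A", "B", "C")):
    """Cohen's kappa with quadratic disagreement weights for ordered grades."""
    index = {c: i for i, c in enumerate(categories)}
    k, n = len(categories), len(first)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[index[a]][index[b]] += 1
    row_tot = [sum(observed[i]) for i in range(k)]
    col_tot = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_obs = weighted_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2
            weighted_obs += w * observed[i][j]
            weighted_exp += w * row_tot[i] * col_tot[j] / n
    return 1.0 - weighted_obs / weighted_exp
```

Because absolute agreement is used, a constant offset between two raters lowers the ICC even when their measurements are perfectly correlated, which is why the Pearson coefficient is reported separately.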

Results

All 100 X-ray images could be evaluated successfully by human raters and AI, which allowed a comparison of 100 radiographic measurements for each radiographic parameter.

The results of the intra- and inter-rater reliability analysis between human raters are presented in Tables 1 and 2. The intra-rater reliability analysis (rater 1A vs rater 1B) yielded ICCs ranging from .97 for CA to 1.0 for CB and a Cohen’s kappa value of .94 for LM. The smallest RMSE was observed for CA (.8°) and the largest for C-T and C-TL (2.2°).

Table 1.

Intra-rater reliability for rater 1A vs rater 1B (n = 100).

Statistical Method	CB	T1-tilt	CA	C-PT	C-T	C-TL	LM
ICC (95% CI)	1.0 (1.0-1.0)	.99 (.98-.99)	.97 (.95-.98)	.98 (.97-.99)	.99 (.99-.99)	.99 (.98-.99)	-
Mean error (95% CI)	-.1 mm (-.4 mm-.1 mm)	.1° (-.2°-.3°)	.1° (-.1°-.2°)	.7° (.3°-1.1°)	.0° (-.4°-.5°)	.1° (-.4°-.5°)	-
SD	1.3 mm	1.3°	.8°	2.0°	2.2°	2.2°	-
RMSE	1.3 mm	1.3°	.8°	2.1°	2.2°	2.2°	-
Pearson correlation (r) (P-value)	1.0 (<.001)	.99 (<.001)	.97 (<.001)	.99 (<.001)	.99 (<.001)	.99 (<.001)	-
Cohen’s κ (95% CI) (P-value)	-	-	-	-	-	-	.94 (.88-.99) (<.001)
Accuracy	-	-	-	-	-	-	.93

Table 2.

Inter-rater reliability of rater 1A, rater 1B vs rater 2 (n = 100).

Rater 1A vs Rater 2
Statistical method	CB	T1-tilt	CA	C-PT	C-T	C-TL	LM
ICC (95% CI)	.99 (.99-.99)	.95 (.93-.97)	.98 (.98-.99)	.85 (.20-.95)	.92 (.68-.97)	.91 (.84-.95)	-
Mean error (95% CI)	.0 mm (-.4 mm-.5 mm)	-.1° (-.6°-.5°)	.0° (-.2°-.1°)	5.5° (4.6°-6.5°)	4.1° (3.2°-5.1°)	2.6° (1.6°-3.7°)	-
SD	2.2 mm	2.7°	.6°	4.7°	5.0°	5.5°	-
RMSE	2.2 mm	2.7°	.6°	7.2°	6.5°	6.1°	-
Pearson correlation (r) (P-value)	.99 (<.001)	.95 (<.001)	.98 (<.001)	.94 (<.001)	.95 (<.001)	.93 (<.001)	-
Cohen’s κ (95% CI) (P-value)	-	-	-	-	-	-	.87 (.79-.94) (<.001)
Accuracy	-	-	-	-	-	-	.84
Rater 1B vs Rater 2
Statistical method	CB	T1-tilt	CA	C-PT	C-T	C-TL	LM
ICC (95% CI)	.99 (.99-.99)	.95 (.93-.97)	.96 (.94-.97)	.88 (.34-.96)	.93 (.65-.97)	.93 (.85-.96)	-
Mean error (95% CI)	.2 mm (-.3 mm-.6 mm)	-.1° (-.7°-.4°)	-.1° (-.3°-.1°)	4.8° (3.9°-5.7°)	4.1° (3.2°-5.0°)	2.6° (1.6°-3.6°)	-
SD	2.1 mm	2.7°	1.0°	4.4°	4.6°	5.0°	-
RMSE	2.1 mm	2.7°	1.0°	6.6°	6.2°	5.6°	-
Pearson correlation (r) (P-value)	.99 (<.001)	.95 (<.001)	.96 (<.001)	.94 (<.001)	.96 (<.001)	.94 (<.001)	-
Cohen’s κ (95% CI) (P-value)	-	-	-	-	-	-	.84 (.75-.93) (<.001)
Accuracy	-	-	-	-	-	-	.83

The inter-rater reliability analysis (shown here for rater 1A vs rater 2) shows the highest agreement for CB (ICC: .99), the lowest agreement for C-PT (ICC: .85) and a Cohen’s kappa of .87 for LM. Mean errors and RMSEs are lowest for CA (mean error: .0°, RMSE: .6°) and highest for C-PT (mean error: 5.5°, RMSE: 7.2°).

ICC values for the agreement between the AI and the human raters range from .78 (C-PT, rater 1A vs AI) to .98 (CA, rater 1A vs AI; Table 3). Consistent with the results of the inter-rater reliability analysis between human raters, the mean errors and RMSEs are lowest for the CA (RMSE: .7°-1.1°) and highest for the C-PT (7.3°-8.3°). As an example, scatter diagrams showing the agreement between rater 1A and the AI-algorithm are displayed in Figure 3. The proposed algorithm takes on average 15 seconds to analyze 1 image, whereas human raters reported a required measurement time of 3 to 7 minutes for the same analysis.

Table 3.

Inter-rater reliability of rater 1A, 1B and rater 2 vs the algorithm (n = 100).

Statistical Method	CB	T1-tilt	CA	C-PT	C-T	C-TL	LM
Rater 1A vs AI
ICC (95% CI)	.96 (.94-.97)	.94 (.90-.96)	.98 (.96-.98)	.78 (.5-.89)	.93 (.89-.95)	.83 (.76-.88)	-
Mean error (95% CI)	-.5 mm (-1.4 mm-.4 mm)	-.9° (-1.4° to -.3°)	.2° (.1°-.3°)	4.7° (3.3°-6.0°)	-.8° (-2.0°-.5°)	1.0° (-.7°-2.7°)	-
SD	4.6 mm	2.8°	.7°	6.9°	6.2°	8.5°	-
RMSE	4.6 mm	3.0°	.7°	8.3°	6.3°	8.6°	-
Pearson correlation (r) (P-value)	.96 (<.001)	.96 (<.001)	.98 (<.001)	.84 (<.001)	.93 (<.001)	.83 (<.001)	-
Cohen’s κ (95% CI) (P-value)	-	-	-	-	-	-	.84 (.75-.93) (<.001)
Accuracy	-	-	-	-	-	-	.82
Rater 1B vs AI
ICC (95% CI)	.97 (.95-.98)	.94 (.90-.96)	.95 (.93-.97)	.80 (.60-.88)	.92 (.88-.95)	.82 (.75-.88)	-
Mean error (95% CI)	-.4 mm (-1.2 mm-.5 mm)	-.9° (-1.5° to -.3°)	.1° (-.1°-.3°)	3.9° (2.5°-5.3°)	-.8° (-2.1°-.5°)	1.0° (-.8°-2.7°)	-
SD	4.3 mm	2.8°	1.1°	7.0°	6.5°	8.6°	-
RMSE	4.3 mm	2.9°	1.1°	8.1°	6.5°	8.7°	-
Pearson correlation (r) (P-value)	.97 (<.001)	.96 (<.001)	.95 (<.001)	.84 (<.001)	.92 (<.001)	.83 (<.001)	-
Cohen’s κ (95% CI) (P-value)	-	-	-	-	-	-	.82 (.72-.92) (<.001)
Accuracy	-	-	-	-	-	-	.81
Rater 2 vs AI
ICC (95% CI)	.96 (.94-.97)	.90 (.86-.94)	.96 (.94-.97)	.84 (.77-.89)	.86 (.64-.93)	.82 (.74-.87)	-
Mean error (95% CI)	-.5 mm (-1.5 mm-.4 mm)	-.8° (-1.5° to -.1°)	.2° (.1°-.4°)	-.9° (-2.3°-.6°)	-4.9° (-6.3° to -3.5°)	-1.6° (-3.3°-.0°)	-
SD	4.8 mm	3.5°	.9°	7.3°	7.1°	8.2°	-
RMSE	4.8 mm	3.6°	1.0°	7.3°	8.7°	8.4°	-
Pearson correlation (r) (P-value)	.96 (<.001)	.92 (<.001)	.96 (<.001)	.84 (<.001)	.90 (<.001)	.82 (<.001)	-
Cohen’s κ (95% CI) (P-value)	-	-	-	-	-	-	.83 (.73-.93) (<.001)
Accuracy	-	-	-	-	-	-	.83

Figure 3.

Scatterplots displaying, as an example, the correlation between the measurements of rater 1A and the AI-algorithm for all 100 evaluations.

Discussion

The reliable evaluation of AP X-ray images is essential for the diagnosis, surgical planning, and post-interventional assessment of AIS patients. The presented results confirm the study hypothesis and demonstrate that the novel AI-based algorithm is able to compute essential preoperative radiological parameters with excellent ICC values (>.75;^27,28) and RMSE values similar to those of experienced physicians. The reliability and speed with which AI is able to conduct measurements support its application in clinical routine as well as in the analysis of large datasets for research purposes.

Previous studies have favored Cobb angles as the primary object of investigation for determining the reliability of AI-generated and human-generated measurements. For example, the investigation by Pan et al¹³ on the reliability of a Mask R-CNN model against manual measurements by radiologists yielded excellent ICC values of .85. However, their model did not differentiate between primary and secondary Cobb angles, recording only 14 cases of double-curve scoliosis. A further limitation was the misclassification of 3 scoliotic curves as 1 single combined curve. In line with the present study, Galbusera et al²⁴ also used EOS images. However, their evaluation of a single Cobb angle (the most severe curve) relied on a smaller, 50-patient cohort, and Cobb angle measurements were based on pre-defined end vertebrae rather than on end vertebrae identified independently by the human raters and the AI-algorithm. Furthermore, they did not evaluate intra- and inter-rater reliability between different human raters (1 rater approach), and quantified agreement between human assessment and AI with Pearson’s correlation coefficient (r = .83) rather than with an ICC.²⁴ Sun et al presented a promising CNN-based approach to automatically determine Cobb angles on a smaller cohort of 36 images, resulting in ICC values of .99.³¹ However, the specific type of ICC used for the analysis was not reported. Horng et al²⁵ investigated the automatic measurement of whole-spine radiographs in a smaller sample of 35 images. The ICCs for Cobb angles were above .93, similar to the present study for C-T.²⁵ The use of different statistical approaches (eg, not using ICCs as recommended, or not clearly defining the ICC model used [absolute agreement vs consistency]) and different measurement methodologies complicates a rigorous comparison across the scientific literature. The present study addresses these limitations in 3 ways. First, it conducts an in-depth intra- and inter-rater reliability analysis using ICCs in particular, with the measurements of 2 independent physicians as ground truth values. Second, the study evaluates a comparatively large cohort of 100 patients. Lastly, it evaluates additional parameters relevant to the characterization of scoliotic deformities, including a subdivision of Cobb angles. For all selected parameters, ICC values for inter- and intra-rater reliability of the human raters ranged between .85 and 1.0, substantiating the quality of the applied ground truth and facilitating a more rigorous statistical comparison than in preceding studies. Based on this robust human-measured reference, the AI analysis resulted in excellent agreement (>.75;^28,29).

For the inter-rater reliability both between the human raters and between the humans and the AI-algorithm, the highest ICC values were typically calculated for CA and CB (ICCs between .95 and .99; Tables 2 and 3). The high reliability between human raters and between physicians and AI for these parameters is due to the visibility and identifiability of the requisite anatomical entities (clavicles, C7 vertebra, sacrum) in the images. The smallest ICC values were found for the C-PT, both between physicians (.85-.88, Table 2) and between AI and human raters (.78-.84, Table 3). As demonstrated by Goldberg et al,⁹ the proximal curve is often short and not pronounced, which complicates an unambiguous identification of the end vertebrae of the scoliotic curve. In addition, in the proximal thoracic region, the proximity and/or potential superimposition of adjacent anatomical structures such as the scapulae, clavicles, ribs, sternum, and the mediastinum make the identification of anatomical entities more difficult and thus impede measurements. Standardized and deterministic AI-based analyses could therefore in the future also enable the efficient, consistent assessment of even these more demanding measurements in day-to-day clinical practice.

Despite the advances the study makes in using a fully automatic method for the reliable determination of several coronal parameters, it has some limitations. The AI-algorithm was evaluated exclusively on coronal parameters in this study, although parameters of sagittal spinal balance are also instrumental in the treatment of AIS patients. In fact, the algorithm has already been shown to reliably determine sagittal parameters in mainly non-scoliotic patients.^11,32 However, a validation study on sagittal parameters in AIS patients with pronounced coronal deformities is more challenging and necessitates further investigation in a future study. A further limitation of the current scientific literature and the current study is the analysis of exclusively preoperative images. In postoperative images, spinal implants (eg, cages, screws) can obscure spinal bony structures and thus might complicate automated analysis, which requires further development and validation. Furthermore, the study relied on a mono-centric approach, evaluating EOS images from a single clinical site. Future studies should source images from more spine centers, with images from different X-ray machines and a more heterogeneous patient cohort (eg, de novo and other secondary scoliosis or spinal deformities), to account for the diversity of clinical routine.

In conclusion, the study thoroughly evaluated a newly developed AI-based algorithm for coronal radiographic parameters that relieves physicians of time-consuming routine work and error-prone measurements, thereby enabling the analysis of large datasets for research purposes (eg, in large registry studies), which could ultimately improve the quality of care for patients suffering from spinal deformities.

Footnotes

Acknowledgments

The authors greatly appreciate the support of the AO Spine International (AO Spine Start-up Grant) for this research project. The authors would also like to thank Preston Melchert for his excellent editing and proofreading of the manuscript.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Clara Berlin

References

1. Bullmann, Liljenqvist. Die idiopathische Skoliose. ortho-unfall-u2d. 2019;14(06):571-585. doi:10.1055/a-0734-5344.

2. Somoskeöy, Tunyogi-Csapó, Bogyó, Illés. Accuracy and reliability of coronal and sagittal spinal curvature data based on patient-specific three-dimensional models created by the EOS 2D/3D imaging system. Spine J. 2012;12(11):1052-1059. doi:10.1016/j.spinee.2012.10.002.

3. Lenke, Betz, Harms, Bridwell, Clements, Lowe, Blanke. Adolescent idiopathic scoliosis: a new classification to determine extent of spinal arthrodesis. J Bone Joint Surg Am. 2001;83(8):1169-1181.

4. Kuklo, Lenke, Graham, Won, Sweet, Blanke, Bridwell. Correlation of radiographic, clinical, and patient assessment of shoulder balance following fusion versus nonfusion of the proximal thoracic curve in adolescent idiopathic scoliosis. Spine (Phila Pa 1976). 2002;27(18):2013-2020. doi:10.1097/01.BRS.0000024162.02138.

5. Sielatycki, Cerpa, Beauchamp, Shimizu, Wei, Pongmanee, Wang, Xue, Zhou, Liu, Yang, Suomao, Lenke, Harms Study. The Amount of Relative Curve Correction Is More Important Than Upper Instrumented Vertebra Selection for Ensuring Postoperative Shoulder Balance in Lenke Type 1 and Type 2 Adolescent Idiopathic Scoliosis. Spine (Phila Pa 1976). 2019;44(17):E1031-E1037. doi:10.1097/BRS.0000000000003088.

6. Berlin, Quante, Freifrau von Richthofen, Halm. Analysis of Preoperative and Operative Factors Influencing Postoperative Shoulder Imbalance in Lenke Type 1 Adolescent Idiopathic Scoliosis. Z Orthop Unfall. 2021;160:307-316. doi:10.1055/a-1337-3435.

7. Prestigiacomo, Hulsbosch, Bruls VEJ, Nieuwenhuis. Intra- and inter-observer reliability of Cobb angle measurements in patients with adolescent idiopathic scoliosis. Spine Deform. 2022;10(1):79-86. doi:10.1007/s43390-021-00398-0.

8. Pruijs, Hageman, Keessen, van der Meer, van Wieringen. Variation in Cobb angle measurements in scoliosis. Skeletal Radiol. 1994;23(7):517-520. doi:10.1007/bf00223081.

9. Goldberg, Poitras, Mayo, Labelle, Bourassa, Cloutier. Observer variation in assessing spinal curvature and skeletal development in adolescent idiopathic scoliosis. Spine (Phila Pa 1976). 1988;13(12):1371-1377. doi:10.1097/00007632-198812000-00008.

10. Clement, Anari, Bartley, Bastrom, Shah, Talwar, Upasani. What are normal radiographic spine and shoulder balance parameters among adolescent patients? Spine Deform. 2020;8(4):621-627. doi:10.1007/s43390-020-00074-9.

11. Grover, Siebenwirth, Caspari, Drange, Dreischarf, Le Huec, Putzier, Franke. Can artificial intelligence support or even replace physicians in measuring sagittal balance? A validation study on preoperative and postoperative full spine images of 170 patients. Eur Spine J. 2022;31(8):1943-1951. doi:10.1007/s00586-022-07309-5.

12. Langensiepen, Semler, Sobottke, Fricke, Franklin, Schönau, Eysel. Measuring procedures to determine the Cobb angle in idiopathic scoliosis: a systematic review. Eur Spine J. 2013;22(11):2360-2371. doi:10.1007/s00586-013-2693-9.

13. Pan, Chen, Wang, Zhu, Fang. Evaluation of a computer-aided method for measuring the Cobb angle on chest X-rays. Eur Spine J. 2019;28(12):3035-3043. doi:10.1007/s00586-019-06115-w.

14. Papaliodis, Bonanni, Roberts, Hesham, Richardson, Cheney, Lawrence, Carl, Lavelle. Computer Assisted Cobb Angle Measurements: A novel algorithm. Int J Spine Surg. 2017;11:21. doi:10.14444/4021.

15. Kim, Moon, Yoon, Chung, Song, Suh, Lee, Kim. Scoliosis imaging: what radiologists should know. Radiographics. 2010;30(7):1823-1842. doi:10.1148/rg.307105061.

16. Lewis, Keshen, Kato, Dear, Gazendam. Risk Factors for Postoperative Coronal Balance in Adult Spinal Deformity Surgery. Global Spine J. 2018;8(7):690-697. doi:10.1177/2192568218764904.

17. Malfair, Flemming, Dvorak, Munk, Vertinsky, Heran, Graeb. Radiographic evaluation of scoliosis: review. AJR Am J Roentgenol. 2010;194(3 suppl l):S8-S22. doi:10.2214/ajr.07.7145.

18. Menezes, Lima, Falcon, Souza. The Importance of Clavicle Angle and Height of the Coracoid Process in Idiopathic Scoliosis. Coluna/Columna. 2019;18(3):196-199. doi:10.1590/s1808-185120191803196866.

19. Zhang, Wang, Chi. Coronal T1 Pelvic Tilt, a Novel Predictive Index for Global Coronal Alignment in Adult Spinal Deformity. Spine (Phila Pa 1976). 2020;45(19):1335-1340. doi:10.1097/brs.0000000000003522.

20. Gkioxari, Dollar, Girshick. Mask R-CNN. Paper presented at: 2017 IEEE International Conference on Computer Vision (ICCV); 2017.

21. Contributors. Model Zoo; 2020. https://pytorch.org/serve/model_zoo.html.

22. Desautels, Calvert, Hoffman, Mao, Jay, Fletcher, Barton, Chettipally, Kerem, Das. Using Transfer Learning for Improved Mortality Prediction in a Data-Scarce Hospital Setting. Biomed Inform Insights. 2017;9:1178222617712994. doi:10.1177/1178222617712994.

23. Tan, Sun, Kong, Zhang, Yang, Liu. A Survey on Deep Transfer Learning. In: The 27th International Conference on Artificial Neural Networks (ICANN 2018). Springer International Publishing; 2018:270-279.

24. Galbusera, Niemeyer, Wilke, Bassani, Casaroli, Anania, Costa, Brayda-Bruno, Sconfienza. Fully automated radiological analysis of spinal disorders and deformities: a deep learning approach. Eur Spine J. 2019;28(5):951-960. doi:10.1007/s00586-019-05944-z.

25. Horng, Kuok, Lin, Sun. Cobb Angle Measurement of Spine from X-Ray Images Using Convolutional Neural Network. Comput Math Methods Med. 2019;2019:6357171-6357218. doi:10.1155/2019/6357171.

26. Koo. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med. 2016;15(2):155-163. doi:10.1016/j.jcm.2016.02.012.

27. Cohen. A Coefficient of Agreement for Nominal Scales. Educ Psychol Meas. 1960;20(1):37-46. doi:10.1177/001316446002000104.

28. Cicchetti. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess. 1994;6(4):284-290.

29. Landis, Koch. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.

30. Van Rossum. Python Reference Manual. Amsterdam. Centrum voor Wiskunde en Informatica; 1995.

31. Sun, Xing, Zhao, Meng, Hai. Comparison of manual versus automated measurement of Cobb angle in idiopathic scoliosis based on a deep learning keypoint detection technology. Eur Spine J. 2022;31(8):1969-1978. doi:10.1007/s00586-021-07025-6.

32. Orosz, Bhatt, Jazini, Dreischarf, Grover, Grigorian, Roy, Schuler, Good, Haines. Novel artificial intelligence algorithm: an accurate and independent measure of spinopelvic parameters. J Neurosurg Spine. 2022;37:1-9. doi:10.3171/2022.5.Spine22109.

Novel AI-Based Algorithm for the Automated Computation of Coronal Parameters in Adolescent Idiopathic Scoliosis Patients: A Validation Study on 100 Preoperative Full Spine X-Rays
