Abstract
Study Design
Retrospective validation study.
Objectives
To develop and validate VerteRo, a fully automated deep learning-based tool that estimates axial vertebral body rotation (VBR) from standard posteroanterior (PA) scoliosis radiographs with high accuracy, overcoming the limitations of categorical grading systems.
Methods
A multi-stage convolutional neural network pipeline incorporating automated region-of-interest (ROI) extraction, vertebral segmentation, pedicle localization, and geometric rotation estimation was trained on 97 AIS radiographs. Three object detection architectures (Faster R-CNN, YOLOv11 L, YOLOv12 L) were evaluated. The mean absolute error (MAE) relative to Stokes-derived reference angles was measured as the primary outcome and concordance with Nash–Moe grading was measured as the secondary outcome.
Results
Faster R-CNN demonstrated superior pedicle detection (F1 = 0.93) compared with YOLOv11 L and YOLOv12 L. VerteRo achieved a mean absolute error of 8.239° relative to the Stokes-derived reference angles. Exact grade agreement with the Nash–Moe grading system was observed in 71.34% of vertebrae.
Conclusions
This pilot study provides proof of concept for a fully automated, end-to-end solution for quantitative axial rotation assessment from PA radiographs, offering improved objectivity over categorical grading methods. Because it generates quantitative axial rotation measures rather than coarse categorical grades, it has potential to support research applications and, following further validation, decision support in clinical assessment.
Introduction
Adolescent idiopathic scoliosis (AIS) is a complex three-dimensional spinal deformity defined by a lateral deviation of the spine greater than 10° in the coronal plane, accompanied by axial vertebral rotation (AVR).1,2 Occurring primarily in children over the age of 10 until skeletal maturity, AIS presents with external manifestations such as rib humps and shoulder asymmetry, often leading to back pain, respiratory restriction, and body image dissatisfaction.1,2,3 Diagnosis of AIS is made through assessing the coronal spinal curvature, most commonly using the Cobb angle, on a full-spine standing posteroanterior (PA) plain film radiograph. Treatment options for AIS range from non-operative approaches, such as observation and bracing, to operative interventions, including posterior spinal fusion and vertebral body tethering (VBT).1,2,3
While coronal curvature dictates the primary diagnosis, axial vertebral rotation is a critical prognostic factor and a necessary component for both pre-operative planning and assessing the overall severity of the three-dimensional deformity.3,4 The gold standard for measuring vertebral body rotation (VBR) is axial imaging via computed tomography (CT) or magnetic resonance imaging (MRI).2 However, routine use of CT or MRI is severely limited by several factors: substantially higher cost than plain film radiography5; equipment availability constraints leading to prolonged waiting times; and, most importantly in the case of CT, substantially higher radiation exposure than plain radiography.6,7 Given the heightened susceptibility of children to radiation-induced malignancy and the need for repeated imaging during growth, minimizing unnecessary CT utilization aligns with the ALARA (As Low As Reasonably Achievable) principle.7,8
Consequently, clinicians often rely on estimating rotation from PA radiographs using methods such as the Nash–Moe or Perdriolle techniques.9-12 These manual methods are subjective, labor-intensive, and inherently limited by their categorical nature (e.g., Grade 0-4), which fails to capture granular changes in rotational deformity.2
Machine learning (ML) has recently transformed scoliosis assessment, with numerous tools developed for automated Cobb angle measurement.13,14 However, automated quantification of axial rotation remains under-explored. Early efforts, such as those by Ebrahimi et al (2019), demonstrated feasibility but required manual inputs for landmarking.15 Furthermore, most existing ML tools output categorical Nash–Moe grades rather than continuous angular measurements, limiting their precision.16-19
To address these gaps, we present VerteRo, a fully automated deep neural network pipeline. This study aims to develop and validate a system capable of: (1) automatically extracting the vertebral column from PA radiographs; (2) precisely localizing pedicles using advanced object detection models (Faster R-CNN, YOLOv11 L, YOLOv12 L); and (3) computing a continuous axial rotation angle without human intervention.
Materials and Methods
Study Design and Ethics
A retrospective chart review was conducted on 105 patients diagnosed with AIS between December 2023 and November 2025. This pilot proof-of-concept study adheres to the CLAIM (Checklist for Artificial Intelligence in Medical Imaging) and MINIMAR guidelines.20,21 Ethical approval was obtained from the Institutional Review Board (IRB), and the study complies with the Declaration of Helsinki.
Dataset Composition
The distribution of rotation severity based on Nash–Moe grading across all analyzed vertebrae (n = 1649 from 97 patients) was as follows: Grade 0, 1048 vertebrae (63.6%); Grade 1, 398 vertebrae (24.1%); Grade 2, 113 vertebrae (6.9%); and Grade 3, 41 vertebrae (2.5%). In addition, 2 vertebrae (0.1%) had both pedicles obstructed by gastrointestinal structures, and 47 vertebrae (2.9%) could not be visualized due to poor radiographic quality. After excluding these 49 vertebrae, the remaining 1600 of 1649 vertebrae, in which both pedicles could be visualized, were included in the final rotation analysis.
Data Preprocessing and ROI Extraction
To ensure privacy and focus the model, the vertebral column was automatically extracted using a structure-based algorithm (Figure 1):
(1) Vertical Projection: The image was vertically divided into regions, isolating the central spinal column.
(2) Horizontal Projection: An intensity histogram was generated along image rows to identify the superior border of the region of interest (ROI), defined by the anatomical narrowing of the neck.
(3) CLAHE Enhancement: The extracted ROIs were enhanced using Contrast Limited Adaptive Histogram Equalization (CLAHE) to improve local contrast and bone edge visibility. We used a tile grid size of 8 × 8 and a clip limit of 2.0. The equalization formula used was: [equation not preserved in this text]
(4) Bilinear Interpolation: Applied post-enhancement to soften artifacts and blend tile boundaries.
Figure 1. Example of a PA view radiograph being processed using the vertical and horizontal projection techniques for vertebral column extraction.
Figure 2. Side-by-side comparison of (A) an original vertebral column ROI and (B) the same ROI enhanced using CLAHE.
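As an illustration of the clip-limited equalization described in step (3), the following minimal sketch equalizes a single tile in pure Python. It is not the authors' implementation: the function name `clipped_equalize_tile`, the convention that the clip ceiling equals the clip limit times the mean bin height (with a floor of 1), and the way leftover counts are redistributed are all assumptions made for illustration. A full CLAHE implementation would also blend neighbouring tile mappings by bilinear interpolation, which is omitted here.

```python
def clipped_equalize_tile(tile, clip_limit=2.0, levels=256):
    """Equalize one image tile with histogram clipping (illustrative only).

    tile: list of rows, each a list of integer intensities in [0, levels - 1].
    clip_limit: multiple of the mean bin height used as the clip ceiling
                (assumed convention, not taken from the source).
    """
    pixels = [p for row in tile for p in row]
    n = len(pixels)
    if n == 0:
        raise ValueError("tile must contain at least one pixel")

    # Build the intensity histogram of the tile.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Clip every bin at the ceiling and collect the excess counts.
    ceiling = max(1, int(clip_limit * n / levels))
    excess = 0
    for i in range(levels):
        if hist[i] > ceiling:
            excess += hist[i] - ceiling
            hist[i] = ceiling

    # Redistribute the excess evenly; the remainder goes to the first bins
    # (a simplification chosen for brevity).
    share, remainder = divmod(excess, levels)
    hist = [h + share for h in hist]
    for i in range(remainder):
        hist[i] += 1

    # Map each intensity through the cumulative distribution of the
    # clipped histogram.
    lookup = []
    running = 0
    for h in hist:
        running += h
        lookup.append(round((levels - 1) * running / n))

    return [[lookup[p] for p in row] for row in tile]
```

Because the mapping is a cumulative sum, it is monotonic: brighter input pixels never become darker than dimmer ones, while clipping limits how strongly any single dominant intensity can stretch the contrast.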


Vertebral Column Annotation and Augmentation
Of the initial 105 patients identified, 97 posteroanterior whole-spine radiographs met the image quality and field-of-view requirements for automated region-of-interest (ROI) extraction and were included in the final analysis. Eight radiographs were excluded due to incomplete visualization of the thoracolumbar spine or insufficient image quality for reliable automated segmentation. After the region of interest was extracted from each X-ray image, all 97 ROI images were used to evaluate the proposed methodology. All images in our dataset were annotated using the Roboflow platform.22 Each image was annotated into 17 classes consisting of all vertebrae in the thoracolumbar segment (T1-L5) (Figure 3). To increase the robustness and generalizability of the model, the dataset was augmented (rotation ±15°, zoom 0-20%, Gaussian blur, and salt-and-pepper noise, probability 0.1%), resulting in an expanded dataset of 276 images. Segmentation and classification of each vertebra were performed using Faster R-CNN, YOLOv11 L, and YOLOv12 L.
Figure 3. An example of a thoracolumbar PA view radiograph annotated using the Roboflow platform.
Pedicle Detection and Identification
Pedicle detection and identification were performed using three different deep learning models: Faster R-CNN, YOLOv11 L, and YOLOv12 L, utilizing the dataset obtained from the previous step.
First, the left and right pedicles of each vertebral image were annotated using the Roboflow platform (Figure 4). Next, the image data were augmented by rotating each image by up to ±15° to generate a final dataset of 4080 images. This dataset of 4080 vertebral images was divided into three subsets: 3860 images for the training set, 147 images for the validation set, and 73 images for the test set.
Figure 4. An example of pedicle annotation in a single vertebra segment using the Roboflow platform.
Deep Learning Architecture
Three state-of-the-art computer vision models were used to detect and identify the pedicles: Faster R-CNN, YOLOv11 L, and YOLOv12 L. The architecture and hyperparameter settings for each model were as follows:
(1) Faster R-CNN: Faster R-CNN used ResNet-50 FPN as the backbone, pretrained on the COCO train2017 dataset.23 The optimizer was Stochastic Gradient Descent (SGD) with a learning rate (LR) of 0.005, momentum of 0.9, and weight decay of 0.0005. Early stopping was implemented to halt training if the validation loss did not improve within 100 epochs, and the learning rate was adjusted automatically using ReduceLROnPlateau. The model was set to train for 300 epochs with a batch size of 2; training stopped early at epoch 112.
(2) YOLOv11 L: A single-stage detector optimized for real-time inference, pretrained on the COCO dataset.24 The vertebral images were resized to 640 × 640. The hyperparameters were a learning rate of 0.01, momentum of 0.937, weight decay of 0.0005, batch size of 32, and 300 epochs; training stopped early at epoch 164.
(3) YOLOv12 L: The latest large-scale model in the YOLO series, incorporating attention-centric architecture and optimized feature aggregation for improved detection efficiency.24 For direct performance comparison with YOLOv11 L, the YOLOv12 L model was trained with exactly the same hyperparameter settings; training stopped early at epoch 205.
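The patience-based early stopping described above can be sketched as follows. This is a minimal, framework-independent illustration rather than the training code used in the study: the class `EarlyStopper`, the helper `run_training`, and the `min_delta` parameter are hypothetical names, and the list of validation losses stands in for a real training loop.

```python
class EarlyStopper:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=100, min_delta=0.0):
        self.patience = patience          # epochs to wait (100 in the text)
        self.min_delta = min_delta        # hypothetical improvement margin
        self.best_loss = float("inf")
        self.best_epoch = 0
        self.epochs_without_improvement = 0

    def update(self, epoch, val_loss):
        """Record one epoch and return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


def run_training(val_losses, max_epochs=300, patience=100):
    """Walk through precomputed validation losses (stand-in for training).

    Returns (last_epoch_run, best_epoch).
    """
    stopper = EarlyStopper(patience=patience)
    last_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        last_epoch = epoch
        if stopper.update(epoch, loss):
            break
    return last_epoch, stopper.best_epoch
```

In a real run, the model weights from the best epoch would typically be restored after the stop; that step is left out of this sketch.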
The results obtained from the three models were classified into three cases for analysis: both pedicles were detected, only one pedicle was detected, and no pedicle was detected.
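The three-way case assignment can be illustrated with a short sketch. The function name `classify_detection_case` and the input format, a list of (label, confidence) pairs, are assumptions for illustration; the labels follow the L-pedicle and R-pedicle classes reported in the Results, and the default threshold simply reuses the 0.95 confidence threshold reported there for Faster R-CNN.

```python
PEDICLE_LABELS = ("L-pedicle", "R-pedicle")


def classify_detection_case(detections, threshold=0.95):
    """Return "both", "one", or "none" for a single vertebra.

    detections: list of (label, confidence) pairs from a detector
                (hypothetical format chosen for this sketch).
    threshold:  minimum confidence for a detection to count.
    """
    found = {
        label
        for label, confidence in detections
        if label in PEDICLE_LABELS and confidence >= threshold
    }
    if len(found) == 2:
        return "both"
    if len(found) == 1:
        return "one"
    return "none"
```

Only vertebrae in the "both" case can proceed to the rotation calculation, since the geometric step depends on two pedicle centroids.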
Image Enhancement With BSRGAN
Given the small size of pedicles relative to the full spine, we applied a Blind Super-Resolution Generative Adversarial Network (BSRGAN) to upsample and sharpen individual vertebral crops before pedicle detection. This step was critical for resolving subtle osseous boundaries in low-quality radiographs (Figure 5).25
Figure 5. Side-by-side comparison of (A) an original image of a single vertebra segment and (B) the same vertebra segment enhanced using BSRGAN.
Vertebral Rotation Measurement
The first step in orientation estimation was isolating the target vertebra by removing irrelevant image components. The image was first converted to a binary format using Otsu’s thresholding method. All connected components were then identified, and only the largest component, the vertebra, was preserved (Figure 6). Next, the boundary of the isolated vertebra was detected: an erosion operation with a 3 × 3 structuring element shrank the preserved vertebra, and the eroded image was subtracted from the original to highlight the vertebral outline. The orientation estimation of the vertebra (θ) relied exclusively on the successful detection of both pedicles.
(1) Orientation Estimation: To correct for vertebral tilt, the orientation angle (θ) was determined as the angle between the horizontal axis and the line connecting the centroids of the left (xl, yl) and right (xr, yr) pedicles. This angle was computed using the arccosine function to ensure rotational invariance and numerical stability across variable pedicle separations.
(2) Rotation Calculation: Using the corrected pedicle positions, the axial rotation angle (θ) was estimated using the Stokes method,10 a plain film radiographic surrogate for the vertebral body rotation angle that relies on calculating the offset of the pedicles from the vertebral body center.
Figure 6. Boundary detection of a vertebra using erosion with a 3 × 3 structuring element.
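The two geometric steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation. The tilt angle uses the arccosine of the horizontal component of the unit vector between pedicle centroids, which is one formula consistent with the description above. The rotation uses a commonly cited form of the Stokes relation, tan θ = ((a − b)/(a + b)) · (w/(2d)), where a and b are the distances from the vertebral body center to each pedicle measured along the inter-pedicle line; the `width_to_depth` ratio (w/d) is a level-specific constant that is treated here as a caller-supplied placeholder, and the sign convention is arbitrary.

```python
import math


def tilt_angle_deg(left, right):
    """Unsigned angle (degrees) between the horizontal axis and the line
    joining the left and right pedicle centroids, via arccosine."""
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("pedicle centroids coincide")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dx / length))
    return math.degrees(math.acos(cosine))


def stokes_rotation_deg(left, right, center, width_to_depth):
    """Axial rotation estimate (degrees) from pedicle offsets.

    Offsets are measured along the inter-pedicle line, so the result does
    not depend on in-plane tilt. width_to_depth is a hypothetical
    per-level constant supplied by the caller.
    """
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("pedicle centroids coincide")
    ux, uy = dx / length, dy / length
    # Projected distances from the vertebral body center to each pedicle.
    a = (center[0] - left[0]) * ux + (center[1] - left[1]) * uy
    b = (right[0] - center[0]) * ux + (right[1] - center[1]) * uy
    ratio = (a - b) / (a + b)  # a + b equals the pedicle separation
    return math.degrees(math.atan(ratio * width_to_depth / 2.0))
```

Note that the arccosine returns an unsigned angle, so the direction of tilt would have to be taken from the sign of the vertical offset between the two centroids if it is needed.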

Results
Vertebra Segmentation and Classification
The dataset was split into training (207 images), validation (60 images), and testing (9 images) sets. The data were split at the patient level prior to augmentation, ensuring that no augmented images from the same patient were present across the training, validation, and test sets. The Faster R-CNN model performed the best for vertebra segmentation and classification, with an F1 score of 0.96 at a confidence threshold of 0.8. YOLOv12 L obtained an F1 score of 0.93 at a confidence threshold of 0.469, and YOLOv11 L obtained an F1 score of 0.91 at a confidence threshold of 0.339. The detailed performance and class-wise misclassification of the Faster R-CNN model are illustrated in Figure 7, which shows the confusion matrix for vertebra segmentation and classification.
Figure 7. Confusion matrix of vertebra segmentation and classification using the Faster R-CNN model.
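Splitting at the patient level before augmentation can be sketched as follows. The function names, the split fractions, and the random seed are hypothetical and do not reproduce the study's actual partition; the point of the sketch is only the ordering, in which patients are assigned to a subset first and augmented copies are generated afterwards within each subset.

```python
import random


def split_by_patient(patient_ids, fractions=(0.75, 0.20, 0.05), seed=0):
    """Assign each patient to exactly one subset.

    fractions: (train, validation, test) shares, hypothetical values.
    """
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_val = round(fractions[1] * len(ids))
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


def augment_split(split, copies_per_image):
    """Create augmented sample identifiers only after the split, so every
    copy inherits the subset of its source patient."""
    return {
        name: [(pid, k) for pid in ids for k in range(copies_per_image)]
        for name, ids in split.items()
    }
```

Augmenting first and splitting afterwards would allow near-duplicate images of one patient to land in both the training and test sets, which tends to inflate test metrics.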
Pedicle Detection and Identification
The Faster R-CNN model performed the best for pedicle detection and identification, with an F1 score of 0.93 at a confidence threshold of 0.95. YOLOv11 L obtained an F1 score of 0.84 at a confidence threshold of 0.413, and YOLOv12 L obtained an F1 score of 0.86 at a confidence threshold of 0.618. The confusion matrix for pedicle detection and identification using the top-performing Faster R-CNN model is presented in Figure 8. The model achieved high classification accuracy for both L-pedicle (0.94) and R-pedicle (0.96).
Figure 8. Confusion matrix of pedicle detection and identification using the Faster R-CNN model.
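Reporting an F1 score together with a confidence threshold suggests a threshold sweep, which can be sketched as below. This is an assumed procedure for illustration, not a description of how the reported values were obtained. The function names are hypothetical, and each detection is assumed to have already been matched against ground truth, so it arrives as a (confidence, is_true_positive) pair.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def best_threshold(scored, n_positives, thresholds):
    """Return (threshold, f1) with the highest F1 over the candidates.

    scored: list of (confidence, is_true_positive) pairs.
    n_positives: number of ground-truth objects.
    """
    best = (None, -1.0)
    for t in thresholds:
        kept = [is_tp for confidence, is_tp in scored if confidence >= t]
        tp = sum(1 for is_tp in kept if is_tp)
        fp = len(kept) - tp
        fn = n_positives - tp
        f1 = precision_recall_f1(tp, fp, fn)[2]
        if f1 > best[1]:
            best = (t, f1)
    return best
```

Raising the threshold removes low-confidence false positives at the cost of missed objects, so the F1-optimal threshold can differ considerably between detectors.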
Estimation of Vertebral Body Rotation
The primary outcome, the mean absolute error (MAE) of VerteRo’s geometric calculation module relative to the Stokes-derived radiographic surrogate, was 8.239° (calculated from the 1600 of 1649 vertebral body levels in which both pedicles could be clearly visualized). The secondary outcome, VerteRo’s concordance with Nash–Moe grading, was 71.34% (exact grade agreement).
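Both outcome measures are simple aggregates, sketched below with hypothetical function names. The inputs are assumed to be paired per-vertebra values: predicted and reference angles in degrees for the MAE, and predicted and reference Nash–Moe grades for the agreement percentage. How a continuous angle is converted to a grade is not specified here and is left outside the sketch.

```python
def mean_absolute_error(predicted, reference):
    """Mean of |predicted - reference| over paired angle measurements."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def exact_agreement_percent(predicted_grades, reference_grades):
    """Percentage of vertebrae whose predicted grade equals the reference."""
    if len(predicted_grades) != len(reference_grades) or not predicted_grades:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        1 for p, r in zip(predicted_grades, reference_grades) if p == r
    )
    return 100.0 * matches / len(predicted_grades)
```

Exact agreement treats a one-grade miss the same as a three-grade miss, which is one reason it can understate how closely a continuous estimate tracks an ordinal scale.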
Discussion
This study introduces VerteRo, a fully automated deep neural network–based tool for estimating axial vertebral body rotation (VBR) from standing posteroanterior whole-spine radiographs in patients with adolescent idiopathic scoliosis (AIS). Axial vertebral rotation represents a fundamental component of the three-dimensional deformity in AIS and contributes to clinical progression, rib prominence, and surgical planning.1-4 While computed tomography (CT) remains the gold standard for rotation assessment, its routine use is limited by cost and radiation exposure concerns, particularly in pediatric populations.2,6,7 VerteRo achieved a mean absolute error (MAE) of 8.239° relative to the Stokes method10 and demonstrated 71.34% concordance with Nash–Moe grading, supporting the feasibility of fully automated axial rotation estimation from plain radiographs. Importantly, the Stokes method represents a radiographic surrogate of vertebral rotation derived from two-dimensional projections and does not reflect true three-dimensional axial rotation as measured by CT or MRI. Therefore, the present study validates agreement with a radiographic estimation method rather than a true anatomical gold standard.
The first major contribution of VerteRo is its fully automated, end-to-end workflow. Unlike prior quasi-automated approaches that required manual landmarking or vertebral level labeling,15 VerteRo integrates vertebral segmentation and classification, pedicle detection using Faster R-CNN,23 contrast enhancement with CLAHE, super-resolution refinement using BSRGAN,25 and geometric computation of rotation based on pedicle centroid alignment derived from established radiographic principles.9,10 By modularizing pedicle detection and geometric estimation into separate stages, the system reduces compounded localization errors and simplifies task-specific optimization. This architecture builds upon earlier machine learning efforts in scoliosis imaging,13,14,17,18 while eliminating the need for manual intervention, thereby enhancing reproducibility and scalability in clinical workflows.
The second important advancement of VerteRo is its generation of continuous angular rotation measurements rather than categorical outputs alone. Traditional systems such as the Nash–Moe classification remain widely used but are inherently ordinal and limited in precision.10-12 Recent deep learning approaches have replicated Nash–Moe grading,16 yet categorical classification cannot detect subtle interval changes that may precede visible coronal curve progression. Continuous angular measurement allows more granular monitoring of deformity severity and longitudinal change, particularly in non-operative management such as bracing, where small increments in rotation may be clinically relevant.1-4 The observed 71.34% concordance with Nash–Moe grading, analyzed as a secondary outcome, should be interpreted in the context of comparing a continuous variable with an ordinal classification system. Small angular deviations near grade thresholds may produce categorical disagreement despite minimal clinical difference. Thus, percentage agreement may underestimate the practical alignment between continuous rotation estimation and categorical grading systems.
The third technical contribution of VerteRo lies in its targeted image enhancement strategy. Pedicles are small anatomical structures frequently obscured by rib overlap, mediastinal shadows, and radiographic noise, especially in the upper thoracic spine. Contrast Limited Adaptive Histogram Equalization improves local contrast, while BSRGAN-based super-resolution enhances boundary sharpness prior to pedicle detection.25 These preprocessing steps improve landmark visibility and may contribute to the superior performance of the Faster R-CNN model for small-object detection.23 This tailored enhancement approach addresses a key limitation in applying generic object detection frameworks to spinal radiographs and strengthens detection robustness in anatomically complex regions.
VerteRo also has potential implications for radiation reduction strategies in pediatric spine care. By extracting rotational information from standard PA radiographs, this approach aligns with the ALARA (As Low As Reasonably Achievable) principle2,6,7 and may reduce the need for CT imaging performed solely for rotational assessment.8 Although the MAE was 8.239°, this reflects the inherent limitations of estimating three-dimensional vertebral rotation from two-dimensional projections.2,9 Minor deviations in pedicle localization or projection geometry may disproportionately influence angular estimation, particularly in low-grade rotations. Nevertheless, the error magnitude is consistent with the technical challenges historically associated with radiographic rotation measurement.
Several limitations should be acknowledged. First, this was a retrospective study involving 97 patients from a limited institutional setting, which may restrict generalizability. External validation using multi-center datasets and varied radiographic equipment is necessary to confirm robustness. Second, upper thoracic vertebrae are particularly susceptible to pedicle obscuration by mediastinal structures, which may reduce detection reliability. Third, axial rotation estimated from frontal projection remains a two-dimensional proxy for a three-dimensional deformity, and anatomical factors such as vertebral wedging or projection variability may influence pedicle geometry relative to true rotation.2,9 Fourth, 63.6% of the vertebral body levels in the dataset were classified as grade 0 using the Nash–Moe method, so vertebrae with higher-grade rotation are underrepresented. This distribution may bias model performance toward lower rotation cases and potentially overestimate overall accuracy, particularly when the model is applied to more severe deformities. Future studies should explore additional techniques such as complementary imaging planes, including lateral radiographs, to further refine three-dimensional estimation. A larger and more diverse dataset with radiographs from multiple hospitals may also help improve accuracy, reduce overfitting, and improve the generalizability of VerteRo’s estimation before broader clinical application.
Conclusions
VerteRo demonstrates that a deep learning–based framework can achieve fully automated, quantitative estimation of axial vertebral rotation from standard posteroanterior radiographs in patients with adolescent idiopathic scoliosis. By integrating end-to-end automation, continuous angular output, and targeted image enhancement, this approach addresses key limitations of manual and categorical rotation assessment methods. With further external validation and broader dataset expansion, VerteRo may provide a reproducible and radiation-conscious approach for objective monitoring.
Footnotes
Acknowledgements
This research project is supported by the Quick Win project of the Ratchadapiseksompoch Fund, Chulalongkorn University. Furthermore, the authors would like to thank the AASCE-MICCAI 2019 dataset for providing posteroanterior (PA) view whole spine radiographs for this study.
Author Contributions
Conceptualization: VK, RL, SS.
Formal Analysis: SS, RL, KP, JT.
Methodology: VK, RL, SS, KP, JT.
Project Administration: VK, WL, WS, RL.
Manuscript Writing: SS, SJ, KS, RL.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research project was supported by the Quick Win Project of the Ratchadapiseksompoch Fund, Chulalongkorn University (No. 767006).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
