Abstract
Objective
Deep learning is an advanced machine-learning approach that is used in several medical fields. Here, we developed a deep learning model using an object detection algorithm to identify the L5 vertebra on anteroposterior lumbar spine radiographs, and assessed its detection accuracy.
Methods
We retrospectively recruited 150 participants for whom both anteroposterior whole-spine and lumbar spine radiographs were available. The anteroposterior lumbar spine radiographs of these patients were used as the input data. Of the 150 images, 105 (70%) were randomly selected as the training set, and the remaining 45 (30%) were assigned to the validation set. YOLOv5x, a model in the YOLOv5 family, was used to detect the L5 vertebra area.
Results
The mean average precisions at intersection-over-union thresholds of 0.5 and 0.75 of the trained L5 detection model were 99.2% and 96.9%, respectively. The model’s precision was 95.7% and its recall was 97.8%. Furthermore, the L5 vertebra was correctly detected in 93.3% of the validation images.
Conclusion
Our deep learning model showed an outstanding ability to identify L5 vertebrae.
Introduction
Accurate and reliable spine numbering is critical for pathological diagnosis and pre-procedural or pre-operative planning. However, it can be difficult in patients with lumbosacral transitional vertebrae, especially when whole-spine images are not available. 1 There are two types of lumbosacral transitional vertebrae: 1) the sacralization of L5, a congenital spinal anomaly in which an elongated transverse process of the last lumbar vertebra fuses with the “first” sacral segment in varying degrees;1–3 and 2) the lumbarization of S1, a spinal anomaly in which the first and second segments of the sacrum are not fused, and the lumbar spine consequently appears to have six (rather than five) vertebrae.1–4
The lumbar vertebrae can be accurately numbered using whole-spine radiography or whole-spine magnetic resonance imaging (MRI) by counting down from C2 to the sacrum.5,6 When full-spine imaging is not available, it is frequently difficult to accurately number the lumbar vertebrae. Up to 32% of neurosurgeons report that they have performed spinal surgery at the wrong level at least once in their careers. 7 Therefore, prior to a spinal procedure or operation, clinicians should perform whole-spine imaging in patients with spine disorders to accurately identify the lumbar vertebral level.
Deep learning techniques have recently emerged as powerful methods to automatically learn feature representations from data.8–10 In particular, these techniques can substantially improve object detection. 11 Object detection refers to determining whether there are any instances of objects from specified categories in an image; if there are, the spatial location and extent of each object instance are shown via a bounding box. 11 Object detection, which is the foundation of image comprehension and computer vision, also serves as the basis for solving complex or high-level vision tasks such as segmentation, object tracking, image captioning, activity recognition, and event detection. 12 In the medical field, object detection is widely used to identify various pathologies, such as malignancy, pneumonia, and bony fractures, on medical images.13,14 We hypothesized that object detection may similarly be applied to identify lumbar vertebral levels.
In the current study, we developed a deep learning model with an object detection algorithm to identify the L5 vertebra on lumbar spine radiographs, and evaluated its accuracy.
Methods
Participants
Participants aged ≥20 years who visited the spine center of Yeungnam University Hospital (Daegu, Republic of Korea) between June 2016 and December 2021 were retrospectively recruited. Both anteroposterior whole-spine and lumbar spine radiographs from each included participant were stored in the hospital picture archiving and communication system. The study protocol was approved by the Institutional Review Board of Yeungnam University Medical Center (IRB no. 2023-02-005) and followed the Helsinki Declaration of 1975 as revised in 2013. The requirement for written informed consent was waived because of the study’s retrospective nature. The authors have de-identified all patient details, and the reporting of this study conforms to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines. 15
Input data
Each anteroposterior lumbar spine radiograph was randomly assigned to the training or validation set using the scikit-learn package (French Institute for Research in Computer Science and Automation, Rocquencourt, France) (Figure 1). Table 1 lists the dataset details.

Figure 1. Diagram of the process of the L5 detection model training. mAP, mean average precision; SGD, stochastic gradient descent.
Table 1. Dataset configuration.
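The 70/30 random assignment described above can be illustrated with a minimal sketch. The study used scikit-learn for this step; the version below uses only the Python standard library to stay dependency-free, and the function name `split_train_val`, the file names, and the seed are hypothetical placeholders rather than details from the study.

```python
import random


def split_train_val(items, val_fraction=0.3, seed=0):
    """Shuffle a copy of `items` and split it into (train, val) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical file names standing in for the 150 lumbar spine radiographs.
image_names = [f"lumbar_ap_{i:03d}.png" for i in range(150)]
train_set, val_set = split_train_val(image_names, val_fraction=0.3, seed=42)
print(len(train_set), len(val_set))  # 105 45
```

Fixing the seed makes the split repeatable, which matters when the validation set is as small as 45 images.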
To label the L5 vertebra, we used LabelImg (Heartex, San Francisco, CA, USA; https://github.com/heartexlabs/labelImg), an open-source tool that is widely used to annotate bounding boxes on images. On each patient’s anteroposterior lumbar spine radiograph, the L5 vertebra was identified by counting down from C2 to the sacrum using the anteroposterior whole-spine radiograph as a guide. Rectangular bounding boxes were drawn to cover the entire L5 area and were assigned to a single class named “L5.” All annotation files were exported in YOLO format.
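For readers unfamiliar with the YOLO annotation format, the following sketch shows how a pixel-coordinate bounding box maps to one label line: a class index followed by the box center, width, and height, each normalized by the image dimensions. The helper name `to_yolo_line`, the image size, and the box coordinates are illustrative assumptions, not values from the study.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical 2000 x 2500 pixel radiograph; class 0 stands for "L5".
line = to_yolo_line(0, (800, 1500, 1200, 1800), 2000, 2500)
print(line)  # 0 0.500000 0.660000 0.200000 0.120000
```

Because the coordinates are normalized, the same label remains valid if the image is resized before training.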
Deep learning model development
Python 3.9.12 (https://www.python.org/) and PyTorch 1.12.0 (https://pytorch.org/) were used to develop a deep learning model for detecting L5 vertebrae on lumbar spine radiographs. YOLOv5x, a model in the YOLOv5 family (Ultralytics, Los Angeles, CA, USA), was used to detect the L5 vertebra area (Figure 2). YOLOv5 is a family of state-of-the-art object-detection architectures that have been pretrained on the COCO dataset. Because our L5 detection model detected only one class of objects and accuracy was more important than detection speed, we used YOLOv5x, the family member with the highest accuracy. 15

Figure 2. Overview of the YOLOv5 model.
We used a stochastic gradient descent optimizer and a generalized intersection over union loss function to optimize our model. Data augmentation techniques, such as image flip, mixup, and mosaic, were applied to improve the model’s generalization performance. Table 2 lists the details of the model’s experimental hyperparameters.
Table 2. Model hyperparameters for training.
GIOU, generalized intersection over union; HSV, hue, saturation, value; lr, learning rate; lr0, initial learning rate; lrf, final learning rate; SGD, stochastic gradient descent.
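As an illustration of the generalized intersection over union (GIoU) loss named above, the sketch below computes GIoU for two axis-aligned boxes as the IoU minus the fraction of the smallest enclosing box that is not covered by the union, and defines the loss as one minus that value. This is a plain-Python sketch of the standard definition under the assumption of non-degenerate boxes; it is not the training code used in the study, and the function names are hypothetical.

```python
def box_area(box):
    """Area of an (x_min, y_min, x_max, y_max) box; zero if the box is empty."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def giou(box_a, box_b):
    """Generalized IoU of two non-degenerate boxes, in the range (-1, 1]."""
    inter = box_area((max(box_a[0], box_b[0]), max(box_a[1], box_b[1]),
                      min(box_a[2], box_b[2]), min(box_a[3], box_b[3])))
    union = box_area(box_a) + box_area(box_b) - inter
    # Smallest axis-aligned box that encloses both inputs.
    enclosing = box_area((min(box_a[0], box_b[0]), min(box_a[1], box_b[1]),
                          max(box_a[2], box_b[2]), max(box_a[3], box_b[3])))
    return inter / union - (enclosing - union) / enclosing


def giou_loss(predicted_box, target_box):
    """Loss is 0 for a perfect match and grows as the boxes move apart."""
    return 1.0 - giou(predicted_box, target_box)


print(giou_loss((0, 0, 1, 1), (0, 0, 1, 1)))  # identical boxes
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))  # disjoint boxes
```

Unlike a plain IoU loss, the GIoU loss still changes with the distance between two non-overlapping boxes, so it provides a training signal even when the predicted box misses the target entirely.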
Performance analysis
The mean average precisions (mAPs) at intersection over union (IoU) thresholds of 0.5 and 0.75 (mAP_0.5 and mAP_0.75) were used to measure the performance of the L5 detection model. The IoU indicates the degree of overlap between the detected and ground-truth bounding boxes. For mAP_0.5 and mAP_0.75, a detection was considered a “good detection” when its IoU was ≥50% and ≥75%, respectively.
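The following sketch illustrates how average precision at a given IoU threshold can be computed for a single-class detector, in which case mAP equals the average precision of that one class. It assumes one ground-truth box per image and uses the area under the monotone precision-recall envelope; the evaluation code shipped with YOLOv5 may interpolate the curve differently. All names and example boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold):
    """detections: list of (image_id, confidence, box); ground_truth: {image_id: box}."""
    matched, hits = set(), []
    # Visit detections from most to least confident.
    for image_id, _, box in sorted(detections, key=lambda d: d[1], reverse=True):
        gt = ground_truth.get(image_id)
        hit = (gt is not None and image_id not in matched
               and iou(box, gt) >= iou_threshold)
        if hit:
            matched.add(image_id)  # each ground-truth box can be matched once
        hits.append(hit)
    precisions, recalls, true_positives = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))
    # Make precision non-increasing in recall (the precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    area, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        area += (r - previous_recall) * p
        previous_recall = r
    return area


ground_truth = {"a": (0, 0, 10, 10), "b": (0, 0, 10, 10)}
detections = [("a", 0.9, (0, 0, 10, 10)),  # IoU 1.0 with its ground truth
              ("b", 0.8, (0, 0, 10, 6))]   # IoU 0.6 with its ground truth
print(average_precision(detections, ground_truth, 0.5))   # 1.0
print(average_precision(detections, ground_truth, 0.75))  # 0.5
```

The second detection overlaps its target with an IoU of 0.6, so it counts as correct at the 0.5 threshold but not at 0.75, which is why the stricter metric is lower.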
Failure analysis was also performed to better understand the causes of incorrectly detected cases as well as the strengths and weaknesses of the deep learning algorithm. 16
Results
We recruited 150 patients (mean age, 66.86 ± 12.97 years; 84 men and 66 women) with anteroposterior whole-spine and lumbar spine radiographs. Of the 150 anteroposterior lumbar spine images, 105 (70%) were randomly selected as the training set, whereas the remaining 45 (30%) were assigned to the validation set.
The mAP_0.5 and mAP_0.75 metrics of the trained L5 detection model were 99.2% and 96.9%, respectively, indicating its outstanding performance. 17 The precision and recall were 95.7% and 97.8%, respectively. Figure 3 shows the changes in mAP_0.5 and mAP_0.75 metrics by training epoch. Of the 45 validation data points, the L5 vertebra was correctly detected with ≥90% confidence in 42 (93.3%; Figure 4).
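To make the definitions of precision and recall concrete, the sketch below applies them to a set of hypothetical counts. The study does not report true-positive, false-positive, or false-negative counts; the values used here (44, 2, and 1) are an assumption chosen only because they round to the reported percentages, and other counts or evaluation settings could produce the same figures.

```python
def precision(true_positives, false_positives):
    """Fraction of detections that were correct."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Fraction of ground-truth objects that were detected."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical counts, not reported in the study.
tp, fp, fn = 44, 2, 1
print(f"precision = {precision(tp, fp):.1%}")  # precision = 95.7%
print(f"recall    = {recall(tp, fn):.1%}")     # recall    = 97.8%

# Share of validation images detected with >=90% confidence (42 of 45).
print(f"detected  = {42 / 45:.1%}")            # detected  = 93.3%
```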

Figure 3. Mean average precision (mAP) changes according to training epoch.

Figure 4. Examples of correctly detected cases.
Failure analysis was performed to aid our understanding of the incorrectly detected cases. Detection of the L5 vertebra failed completely in two (4.4%) of the 45 validation images (Figure 5a, b). The metal implants in Figure 5b seem to have caused this failure. Figure 5c shows an incorrectly detected case, in which our model selected two vertebrae as L5. The vertebra in the upper bounding box was presented as L5 with 87% confidence, whereas that in the lower bounding box was presented as L5 with 53% confidence; however, the vertebra in the lower bounding box was the actual L5 vertebra.

Figure 5. Three examples of incorrectly detected cases. (a, b) L5 detection failure. (c) Incorrect detection of the L5 vertebra. Our model presented two vertebrae as L5. The upper bounding box presented the L5 vertebra with 87% confidence, whereas the lower bounding box presented the L5 vertebra with 53% confidence. However, the vertebra in the lower bounding box was the actual L5 vertebra in this case.
Discussion
In the current study, we developed a deep learning model to detect the L5 vertebra using YOLOv5x. The mAP_0.5 metric of our developed model was 99.2%. The results of the failure analysis showed that 42 cases (93.3%) in the validation dataset were correctly identified, whereas only three (6.7%) were incorrectly detected. Given that ≥90% accuracy in a deep learning classification model is considered outstanding, our L5 vertebra detection model appears to have outstanding performance. 17
We aimed to confirm the exact location of L5 using deep learning. Accurate determination of the lumbar spine segment is of great importance. 18 Lumbar spine pathology and related clinical symptoms are strongly associated with specific lumbar nerve roots, which can be localized to the lumbar spine level. 19 The selection of an incorrect lumbar spine level increases the likelihood of identifying the wrong lumbar nerve root, which can result in the treatment of an incorrect lumbar segment level during interventional or surgical procedures. 20 Historically, several methods for identifying the lumbar spine level have been reported, including methods using the 12th rib, iliolumbar ligament, kidney level, and craniocaudal level on whole-spine radiographs. However, conventional methods are not accurate;21–23 spine physicians have therefore endeavored to develop further methods for accurately identifying lumbar spine segment numbers.
The accurate selection of the L5 level is of particular importance in clinical settings because most lumbar pathologies that require intervention or surgery are located at L4 to L5 and L5 to S1. 24 However, it is difficult to identify the L5 vertebra using conventional methods for the following reasons. First, L5 is easily miscounted when sacralization or lumbarization occurs. Second, in the case of spondylolisthesis, it is easy to erroneously number the lumbar vertebrae when only an anteroposterior radiograph is examined. Third, in cases of severe osteoporosis, spinal or pelvic deformity, or congenital spinal pathologies, an accurate L5 selection is more difficult than usual. When spinal computed tomography or MRI is performed together with lumbar spinal radiography, the determination of the L5 vertebra is slightly easier; however, in clinical practice, spinal radiography is typically performed without spinal computed tomography or MRI. Therefore, during spinal surgery or intervention, the L5 vertebra can sometimes be incorrectly determined by clinicians.
Regarding the determination of the L5 vertebra, many studies have examined patients with lumbosacral transitional vertebrae. There are two main methods for determining the location of the L5 vertebra. The first method identifies the L5 vertebra on the basis of characteristic vertebrae or structures. For example, clinicians can accurately identify L5 by counting down the vertebral bodies from C2. Moreover, the T12 vertebra can be identified as the vertebra to which the last rib is attached; this reveals the L1 vertebra, and the fourth vertebra below L1 is L5. The L5 vertebra can also be identified by its transverse process using the Ferguson view. However, in typical clinical situations, whole-spine radiography is not performed. Furthermore, the method used to initially identify T12 is relatively inaccurate because of the existence of dysplastic ribs, and the method using the Ferguson view has a sensitivity of just 76% to 84%.
The second method involves determining the L5 vertebra by identifying the iliac crest tangent sign in the coronal planes of MRI scans. However, this method has a sensitivity of just 81%. Compared with these traditional methods for numbering vertebrae, our deep learning model showed high accuracy (93.3%) for detecting the L5 vertebra on anteroposterior lumbar spine radiographs. Deep learning models are characterized by multilayer structures with multiple hidden layers; they thus have higher detection accuracy than traditional shallow learning models. 8 We therefore believe that our deep learning model extracts valuable features that differentiate the L5 vertebra from other vertebral levels.
In the present study, we developed a deep learning model for detecting the L5 vertebra using YOLOv5x that showed outstanding performance. We believe that our model may be useful in clinical practice. To our knowledge, our study offers the first deep learning model to detect the L5 vertebra. However, the present investigation had some limitations. First, it included a relatively small amount of imaging data. Second, we used a dataset from a single hospital. Third, we did not compare the accuracy of our developed model with that of humans. Fourth, we did not evaluate the performance of our developed model using data from external clinics or hospitals. To increase the accuracy and versatility of our deep learning model and measure its performance more accurately, these limitations should be addressed in future studies.
Footnotes
Declaration of conflict of interest
The authors declare that there is no conflict of interest.
Funding
This work was supported by a National Research Foundation of Korea grant funded by the Korean government (MSIT; NO.RS-2023-00219725), and by a 2021 Yeungnam University Research Grant.
