Abstract
Study Design
Systematic review.
Objectives
Lumbar degenerative disc disease (DDD) poses a significant global health care challenge, with accurate diagnosis being difficult using conventional methods. Artificial intelligence (AI), particularly machine learning and deep learning, offers promising tools for improving diagnostic accuracy and workflow in lumbar DDD. This study aims to review AI-assisted magnetic resonance imaging (MRI) diagnosis in lumbar DDD and discuss current research for clinical use.
Methods
A systematic search of electronic databases identified studies on AI applications in MRI-based lumbar DDD diagnosis, following Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. Search terms included combinations of “Artificial Intelligence,” “Machine Learning,” “Deep Learning,” “Low Back Pain,” “Lumbar,” “Disc,” “Degeneration,” and “MRI,” targeting studies in English from January 1, 2010, to January 1, 2024. Inclusion criteria encompassed experimental and observational studies in peer-reviewed journals. Data extraction focused on study characteristics, AI techniques, performance metrics, and diagnostic outcomes, with quality assessed using predefined criteria.
Results
Twenty studies met the inclusion criteria, employing various AI methodologies, including machine learning and deep learning, to diagnose lumbar DDD manifestations such as disc degeneration, herniation, and bulging. AI models consistently outperformed conventional methods in accuracy, sensitivity, and specificity, with reported accuracy ranging from 71.5% to 99% across different diagnostic objectives.
Conclusion
The algorithm model provides a structured framework for integrating AI into routine clinical practice, enhancing diagnostic precision and patient outcomes in lumbar DDD management. Further research and validation are needed to refine AI algorithms for real-world application in lumbar DDD diagnosis.
Introduction
Lumbar degenerative disc disease (DDD) stands as a prevalent musculoskeletal disorder with significant implications for global public health and individual well-being. Manifesting as progressive structural deterioration and functional impairment of intervertebral discs within the lumbar spine, DDD constitutes a leading cause of chronic low back pain, functional disability, and diminished quality of life among affected individuals.1,2 The accurate and timely diagnosis of lumbar DDD holds paramount importance in guiding appropriate therapeutic interventions, ranging from conservative management strategies to surgical interventions in symptomatic cases.2-4
Magnetic resonance imaging (MRI) is the cornerstone imaging modality for the evaluation and characterization of lumbar spine pathology, offering unparalleled soft tissue contrast and spatial resolution.4,5 However, the interpretation of MRI findings for lumbar DDD diagnosis remains inherently subjective and susceptible to interobserver variability, contingent upon the expertise and experience of radiologists. Such variability in diagnostic interpretation may engender inconsistencies in clinical decision-making and treatment outcomes, underscoring the need for objective and standardized diagnostic approaches to enhance diagnostic accuracy and clinical efficacy.
Against this backdrop, the advent of artificial intelligence (AI) heralds a paradigm shift in medical imaging diagnostics, offering transformative potential in augmenting the precision and efficiency of diagnostic processes across diverse medical domains. Through the application of advanced machine learning and deep learning algorithms, AI systems are poised to revolutionize the analysis and interpretation of medical imaging data, transcending human limitations in data processing speed and pattern recognition capabilities.4,6
This systematic review endeavors to critically evaluate the efficacy and utility of AI-assisted MRI diagnosis in lumbar DDD. By synthesizing the existing literature, we aim to delineate the landscape of AI applications in lumbar DDD diagnosis, elucidate the methodological nuances and performance metrics underpinning AI-driven diagnostic paradigms, and discern the potential implications of AI integration for clinical practice and patient care. Furthermore, we endeavor to proffer a structured algorithmic framework for the seamless integration of AI technology into routine clinical workflows, with a view towards optimizing diagnostic accuracy, facilitating early disease detection, and informing personalized therapeutic interventions in lumbar DDD management.
Methods
Literature Search Strategy
The authors conducted an extensive literature search on the PubMed and Scopus databases to investigate existing research regarding artificial intelligence-assisted MRI diagnosis in lumbar DDD. This systematic review was carried out utilizing PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) criteria7 (Figure 1). The search spanned articles published between January 1, 2010, and January 1, 2024, utilizing MeSH (Medical Subject Headings) terms. Keywords such as “Artificial Intelligence,” “Machine Learning,” “Deep Learning,” “Low Back Pain,” “Lumbar,” “Disc,” “Degeneration,” and “MRI” were included in the title or abstract search. Additionally, the authors manually checked reference lists for relevant articles. Inclusion criteria stipulated that only articles written in English were considered.
Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) flow diagram in this study.
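The review lists its search terms but does not give the exact query string. As a minimal sketch, the terms could be combined into a PubMed-style boolean query as shown below. The grouping of terms into three concept blocks, the use of the `[Title/Abstract]` field tag, and the function names `or_block` and `build_query` are assumptions made for illustration, not the authors' documented strategy; Scopus uses a different syntax and would need its own string.

```python
# Search terms as listed in the review. Grouping them into three
# concept blocks is an assumption made for this illustration.
AI_TERMS = ["Artificial Intelligence", "Machine Learning", "Deep Learning"]
CONDITION_TERMS = ["Low Back Pain", "Lumbar", "Disc", "Degeneration"]
IMAGING_TERMS = ["MRI"]


def or_block(terms, field="Title/Abstract"):
    """Join terms with OR, tagging each one with a PubMed-style field tag."""
    tagged = [f'"{term}"[{field}]' for term in terms]
    return "(" + " OR ".join(tagged) + ")"


def build_query(start="2010/01/01", end="2024/01/01"):
    """Combine the concept blocks with AND, then add date and language filters."""
    blocks = [
        or_block(AI_TERMS),
        or_block(CONDITION_TERMS),
        or_block(IMAGING_TERMS),
    ]
    date_filter = f'("{start}"[Date - Publication] : "{end}"[Date - Publication])'
    language_filter = "English[Language]"
    return " AND ".join(blocks + [date_filter, language_filter])


# Hypothetical query string; the authors' actual query may differ.
query = build_query()
```

Keeping the term lists separate from the query builder makes it easy to report the exact string used, which supports reproducibility of the search.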
Inclusion and Exclusion Criteria
This systematic review encompassed a wide range of research types, including randomized controlled trials, observational cohorts, and experimental studies, without imposing constraints on research design, such as retrospective or prospective approaches. Exclusions comprised research from unrelated fields, case reports, reviews, meta-analyses, proceedings, and studies lacking accessible abstracts or full texts. Two independent reviewers conducted the literature search, resolving any discrepancies through mutual agreement.
Data Extraction
Data extraction was systematically conducted to gather relevant information from the included studies identified through the literature search. A standardized data extraction form was developed a priori to capture key variables of interest across all eligible studies. The following information was extracted from each study: study characteristics (authors, publication year, study design, sample size, country of origin, type of AI model, specific AI algorithms or techniques employed), diagnostic objectives (nature of lumbar DDD condition(s) investigated, classification or detection task), performance metrics (accuracy, sensitivity, and specificity), main objectives, objective of findings, and other study details.
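The variables listed above can be pictured as one record per included study. The sketch below is a hypothetical representation of such a form; the class name `ExtractionRecord`, the field names, and the placeholder values are illustrative assumptions and do not reproduce the authors' actual form or any study data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractionRecord:
    """One row of a hypothetical extraction form; field names are illustrative."""

    authors: str
    publication_year: int
    study_design: str
    sample_size: int
    country: str
    ai_model_type: str  # e.g. "machine learning" or "deep learning"
    algorithms: List[str] = field(default_factory=list)
    condition: str = ""  # lumbar DDD condition investigated
    task: str = ""  # "classification" or "detection"
    accuracy: Optional[float] = None  # percent; None when not reported
    sensitivity: Optional[float] = None
    specificity: Optional[float] = None

    def missing_metrics(self) -> List[str]:
        """Return the names of performance metrics the study did not report."""
        names = ("accuracy", "sensitivity", "specificity")
        return [name for name in names if getattr(self, name) is None]


# Placeholder values only; this is not data from any included study.
example = ExtractionRecord(
    authors="Example et al.",
    publication_year=2020,
    study_design="experimental",
    sample_size=100,
    country="Exampleland",
    ai_model_type="deep learning",
    algorithms=["CNN"],
    condition="disc herniation",
    task="classification",
    accuracy=90.0,
)
```

Recording unreported metrics as `None` rather than zero keeps missing values distinguishable from genuinely poor performance when the results are summarized.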
Assessment of the Risk of Bias in This Systematic Review
This study used the Prediction model Risk Of Bias ASsessment Tool (PROBAST).8,9 PROBAST is a tool for assessing the risk of bias (ROB) and the applicability of diagnostic and prognostic prediction model studies. Risk of bias was categorized as high, low, or unclear. Two researchers independently evaluated each study's risk of bias before comparing their ratings. Any discrepancies were resolved through consensus, and a third reviewer resolved any discrepancies that remained.
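As a minimal sketch of how per-domain ratings are commonly combined, the example below follows the general PROBAST guidance: overall risk of bias is low only if all four domains (participants, predictors, outcome, analysis) are low, high if any domain is high, and unclear otherwise. The function names are hypothetical, and the sketch omits further PROBAST considerations, such as downgrading models that lack external validation. It is not a description of the exact procedure used in this review.

```python
from typing import Dict, List

PROBAST_DOMAINS = ("participants", "predictors", "outcome", "analysis")
VALID_RATINGS = {"low", "high", "unclear"}


def overall_risk_of_bias(ratings: Dict[str, str]) -> str:
    """Combine the four domain ratings into one overall judgment."""
    missing = [d for d in PROBAST_DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    values = [ratings[d] for d in PROBAST_DOMAINS]
    invalid = [v for v in values if v not in VALID_RATINGS]
    if invalid:
        raise ValueError(f"invalid ratings: {invalid}")
    if "high" in values:  # any high-risk domain makes the study high risk
        return "high"
    if "unclear" in values:  # no high, but at least one unclear domain
        return "unclear"
    return "low"  # every domain was rated low


def disagreements(reviewer_a: Dict[str, str], reviewer_b: Dict[str, str]) -> List[str]:
    """List the domains on which two independent reviewers differ."""
    return [d for d in PROBAST_DOMAINS if reviewer_a[d] != reviewer_b[d]]
```

The `disagreements` helper mirrors the two-reviewer workflow described above: any domain it returns would go to consensus discussion or to the third reviewer.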
Results
Demographic data of the studies in this systematic review.
Publication information (Authors, Year, Sample Size and Nationality)
The included studies were conducted by authors from many countries and span a wide range of publication years and sample sizes. Authors from the USA,1,10,11 Turkey,3 Spain,12 Iran,13 Canada,14 India,15,16 China,2,17 Germany,18 Taiwan,19 Switzerland,20 Hong Kong,5 Finland,21 and Thailand4 have contributed to this research. Sample sizes range from 63 to 17,800, reflecting the variability in study populations and research objectives. The multinational participation in research on this topic underscores the widespread interest and concerted efforts in leveraging AI techniques to advance the diagnosis and classification of lumbar disc pathology across different health care settings and patient populations.
Type of Study and AI-Algorithm/Techniques
The included studies contribute to the diagnosis and classification of lumbar disc pathology using a variety of AI techniques (Table 1), ranging from classical methods to state-of-the-art deep learning architectures. Classical machine learning algorithms such as support vector machines, random forests, perceptron classifiers, k-nearest neighbor, and logistic regression are widely utilized across experimental studies. Additionally, deep learning approaches, including convolutional neural networks (CNNs), generative adversarial networks (GANs), recurrent neural networks (RNNs), and semantic segmentation networks, are prominently featured in the reviewed literature. These AI techniques enable researchers to extract meaningful patterns and features from complex medical imaging data, facilitating accurate and efficient diagnosis and classification of lumbar disc pathology. Through a combination of experimental validation studies and observational analyses, authors demonstrate the effectiveness and generalizability of AI-driven approaches in enhancing clinical decision-making and patient care in the context of lumbar disc pathology.
Risk of Bias Analysis
The prediction model study risk of bias assessment tool (PROBAST) guidelines for assessing the risk of bias in this systematic review.
+ indicates low risk of bias/low concern regarding applicability; - indicates high risk of bias/high concern regarding applicability; ? indicates unclear risk of bias/unclear concern regarding applicability.

Summary of the risk of bias assessment and applicability using prediction model risk of bias assessment tool (PROBAST).
Dataset of Deep Learning Model and Base of Classification
Summary of object detection and performance of artificial intelligence models.
Performance of Artificial Intelligence
Table 3 lists the studies that focused on AI-assisted object detection for disc degeneration using MRI images. From Alomari's study1 in 2010 to Liawrungreang's work4 in 2023, each study utilized different datasets and classification bases to detect abnormalities in spinal discs. Object detection tasks ranged from identifying disc degeneration, herniation, bulging, and stenosis to classifying discs as normal or abnormal based on their appearance. Performance metrics varied across studies, with accuracy ranging from 71.5% to 99%, sensitivity from 47.4% to 99.0%, and specificity from 59.6% to 100.0%. Some studies employed classifiers such as SVM, Perceptron, LMS, k-Means, and ensemble methods, while others utilized deep learning models like VGG-M, VGG-16, GoogLeNet, and ResNet-34. The diversity in methodologies and outcomes underscores the ongoing efforts to develop accurate and reliable AI models for diagnosing pathologies through MRI imaging.
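For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from the four cells of a binary confusion matrix (for example, abnormal versus normal disc). The function name `diagnostic_metrics` and the counts in the usage example are made up for illustration and do not come from any included study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: abnormal discs correctly flagged as abnormal
    fp: normal discs incorrectly flagged as abnormal
    tn: normal discs correctly reported as normal
    fn: abnormal discs missed by the model
    """
    positives = tp + fn  # all truly abnormal discs
    negatives = tn + fp  # all truly normal discs
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,  # true positive rate
        "specificity": tn / negatives,  # true negative rate
    }


# Made-up counts for illustration only.
example_metrics = diagnostic_metrics(tp=45, fp=10, tn=40, fn=5)
```

Because accuracy depends on the balance of normal and abnormal discs in a dataset, two studies with similar accuracy can differ markedly in sensitivity and specificity, which is one reason the ranges above are reported separately.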
Application and Outcome
Summary of a systematic review of the application and outcome of artificial intelligence-assisted MRI diagnosis in lumbar degenerative disc disease.
Discussion
The findings of this systematic review underscore the transformative potential of AI technologies in enhancing the diagnosis of lumbar DDD from MRI scans. Across the included studies, a diverse array of AI algorithms, ranging from classical machine learning methods to sophisticated deep learning architectures, demonstrated impressive performance metrics in identifying and characterizing various manifestations of lumbar disc pathology.4,6 These AI models consistently outperformed conventional diagnostic approaches, exhibiting high accuracy, sensitivity, and specificity.4,11 The high accuracies reported across different studies underscore the robustness and generalizability of AI-driven diagnostic paradigms in the realm of lumbar DDD. Such accuracy is particularly noteworthy given the inherent subjectivity and variability associated with traditional radiological interpretations of MRI findings in lumbar spine pathology. By leveraging advanced pattern recognition capabilities and data-driven decision-making, AI systems offer a standardized and objective approach to lumbar DDD diagnosis, thereby mitigating the risk of interobserver variability and enhancing diagnostic precision. The use of AI in spine surgery and diagnosis is growing, with reported improvements in feasibility, accuracy, and safety.6
Moreover, the integration of AI technology into routine clinical workflows holds promise for streamlining diagnostic processes and optimizing patient care in lumbar DDD management. The framework delineates a structured approach for harnessing AI capabilities to inform clinical decision-making, facilitate early disease detection, and tailor personalized therapeutic interventions. By automating labor-intensive tasks such as image analysis and classification, AI systems empower health care providers to allocate their time and expertise more efficiently, ultimately improving workflow efficiency and patient outcomes. However, despite the promising strides made in AI-assisted MRI diagnosis of lumbar DDD, several challenges and considerations warrant attention. The heterogeneity in AI methodologies and datasets across the reviewed studies highlights the need for standardized protocols and benchmark datasets to facilitate comparability and reproducibility. Additionally, the ethical implications surrounding AI deployment in clinical practice, including issues of algorithm transparency, data privacy, and bias mitigation, necessitate careful consideration and proactive measures to ensure responsible and equitable implementation.
With respect to advances in understanding DDD and AI's role in patient care, DDD poses a diagnostic challenge because its imaging features resemble asymptomatic ageing-related changes. Distinguishing between them relies on correlating clinical symptoms like chronic back pain and reduced mobility with specific MRI findings such as disc height loss and herniation, which are more pronounced in symptomatic DDD. MRI characteristics associated with DDD are identified through studies comparing symptomatic and asymptomatic individuals, establishing criteria for disease diagnosis. These criteria are validated by sensitivity and specificity analyses to differentiate normal variations from pathological changes.6,11 AI technology has transformed DDD diagnosis and management by accurately detecting subtle MRI changes that human radiologists might overlook.4,6 AI algorithms analyze imaging data to quantify disc degeneration, predict treatment outcomes, and guide personalized therapy plans based on individual patient profiles. Practically, integrating AI-driven diagnostics into clinical practice improves accuracy and efficiency. Clinicians can utilize AI insights to tailor treatment strategies, optimize patient care, and potentially mitigate disease progression early on, enhancing overall outcomes for patients with DDD.4,6
Summary of advancements and current artificial intelligence (AI) algorithms for lumbar degenerative disc disease (DDD).

An example of a web-based application for artificial intelligence-assisted diagnosis in lumbar degenerative disc disease.
Conclusion
This systematic review underscores the transformative potential of AI technology in revolutionizing lumbar DDD diagnosis and management. By capitalizing on advanced machine learning and deep learning algorithms, AI-driven approaches offer a pathway towards standardized, objective, and efficient diagnostic processes, ultimately enhancing patient care and outcomes in the realm of lumbar spine pathology. Continued research, validation, and collaborative efforts are essential to further refine and optimize AI algorithms for real-world applications in lumbar DDD diagnosis and surgery.
Footnotes
Author Contributions
Conceptualization, WL., PS., JBP. and KRD.; methodology, WC., WL.; validation, WC. and WL.; formal analysis, PS., WC. and WL.; investigation, WL., WC. and PS.; resources, WL., WC. and PS.; data curation, WL. and PS.; writing – original draft preparation, WL., JBP. and KRD.; writing – review and editing, WL., PS., JBP. and KRD.; visualization, WL.; supervision, WL., JBP. and KRD.; project administration, WL.; funding acquisition, WL. All authors have read and agreed to the published version of the manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
IRB Statement
This study was conducted in accordance with the Declaration of Helsinki and with approval from the Ethics Committee and Institutional Review Board of University of Phayao (Institutional Review Board (IRB) approval, IRB Number: HREC-UP-HSST 1.1/032/67).
Data Availability Statement
The data used in this research were acquired from a public resource.
