Abstract

Study Design

Systematic review.

Objectives

Lumbar degenerative disc disease (DDD) poses a significant global health care challenge, with accurate diagnosis being difficult using conventional methods. Artificial intelligence (AI), particularly machine learning and deep learning, offers promising tools for improving diagnostic accuracy and workflow in lumbar DDD. This study aims to review AI-assisted magnetic resonance imaging (MRI) diagnosis in lumbar DDD and discuss current research for clinical use.

Methods

A systematic search of electronic databases identified studies on AI applications in MRI-based lumbar DDD diagnosis, following Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. Search terms included combinations of “Artificial Intelligence,” “Machine Learning,” “Deep Learning,” “Low Back Pain,” “Lumbar,” “Disc,” “Degeneration,” and “MRI,” targeting studies in English from January 1, 2010, to January 1, 2024. Inclusion criteria encompassed experimental and observational studies in peer-reviewed journals. Data extraction focused on study characteristics, AI techniques, performance metrics, and diagnostic outcomes, with quality assessed using predefined criteria.

Results

Twenty studies met the inclusion criteria, employing various AI methodologies, including machine learning and deep learning, to diagnose lumbar DDD manifestations such as disc degeneration, herniation, and bulging. AI models consistently outperformed conventional methods in accuracy, sensitivity, and specificity, with performance metrics ranging from 71.5% to 99% across different diagnostic objectives.

Conclusion

The proposed algorithm model provides a structured framework for integrating AI into routine clinical practice, enhancing diagnostic precision and patient outcomes in lumbar DDD management. Further research and validation are needed to refine AI algorithms for real-world application in lumbar DDD diagnosis.

Keywords

artificial intelligence, machine learning, deep learning, lumbar degenerative disc disease, MRI diagnosis, systematic review, algorithm model

Introduction

Lumbar degenerative disc disease (DDD) stands as a prevalent musculoskeletal disorder with significant implications for global public health and individual well-being. Manifesting as progressive structural deterioration and functional impairment of intervertebral discs within the lumbar spine, DDD constitutes a leading cause of chronic low back pain, functional disability, and diminished quality of life among affected individuals.^1,2 The accurate and timely diagnosis of lumbar DDD holds paramount importance in guiding appropriate therapeutic interventions, ranging from conservative management strategies to surgical interventions in symptomatic cases.^2-4

Magnetic resonance imaging (MRI) is the cornerstone imaging modality for evaluating and characterizing lumbar spine pathology. It offers unparalleled soft tissue contrast and spatial resolution.^4,5 However, the interpretation of MRI findings for lumbar DDD diagnosis remains inherently subjective and susceptible to interobserver variability, contingent upon the expertise and experience of radiologists. Such variability in diagnostic interpretation may lead to inconsistencies in clinical decision-making and treatment outcomes, underscoring the need for objective and standardized diagnostic approaches to enhance diagnostic accuracy and clinical efficacy.

Against this backdrop, the advent of artificial intelligence (AI) heralds a paradigm shift in medical imaging diagnostics, offering transformative potential in augmenting the precision and efficiency of diagnostic processes across diverse medical domains. Through the application of advanced machine learning and deep learning algorithms, AI systems are poised to revolutionize the analysis and interpretation of medical imaging data, transcending human limitations in data processing speed and pattern recognition capabilities.^4,6

This systematic review endeavors to critically evaluate the efficacy and utility of AI-assisted MRI diagnosis in lumbar DDD. By synthesizing the existing literature, we aim to delineate the landscape of AI applications in lumbar DDD diagnosis, elucidate the methodological nuances and performance metrics underpinning AI-driven diagnostic paradigms, and discern the potential implications of AI integration for clinical practice and patient care. Furthermore, we endeavor to proffer a structured algorithmic framework for the seamless integration of AI technology into routine clinical workflows, with a view towards optimizing diagnostic accuracy, facilitating early disease detection, and informing personalized therapeutic interventions in lumbar DDD management.

Methods

Literature Search Strategy

The authors conducted an extensive literature search on the PubMed and Scopus databases to investigate existing research regarding artificial intelligence-assisted MRI diagnosis in lumbar DDD. This systematic review was carried out utilizing PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) criteria⁷ (Figure 1). The search spanned articles published between January 1, 2010, and January 1, 2024, utilizing MeSH (Medical Subject Headings) terms. Keywords such as “Artificial Intelligence,” “Machine Learning,” “Deep Learning,” “Low Back Pain,” “Lumbar,” “Disc,” “Degeneration,” and “MRI” were included in the title or abstract search. Additionally, the authors manually checked reference lists for relevant articles. Inclusion criteria stipulated that only articles written in English were considered.
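As an illustration of how the listed keywords could be combined into a single Boolean search string, the following is a minimal sketch. The grouping of terms into three concept blocks and the use of a `[Title/Abstract]` field tag are assumptions made for illustration; the exact search string used in this review is not reproduced here.

```python
# Hypothetical sketch: the concept grouping below is an assumption,
# not the search string actually used in this review.
AI_TERMS = ["Artificial Intelligence", "Machine Learning", "Deep Learning"]
CONDITION_TERMS = ["Low Back Pain", "Lumbar", "Disc", "Degeneration"]
IMAGING_TERMS = ["MRI"]


def or_block(terms, field="Title/Abstract"):
    """Join terms with OR, tagging each one with a PubMed-style field tag."""
    tagged = [f'"{term}"[{field}]' for term in terms]
    return "(" + " OR ".join(tagged) + ")"


def build_query(*groups):
    """AND together one OR-block per concept group."""
    return " AND ".join(or_block(group) for group in groups)


query = build_query(AI_TERMS, CONDITION_TERMS, IMAGING_TERMS)
print(query)
```

Grouping synonyms with OR and concepts with AND is a common pattern for title/abstract searches; date and language limits would be applied separately in the database interface.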

Figure 1.

Preferred reporting items for systematic reviews and meta-analyses (PRISMA) flow diagram in this study.

Inclusion and Exclusion Criteria

This systematic review encompassed a wide range of research types, including randomized controlled trials, observational cohorts, and experimental studies, without imposing constraints on research design, such as retrospective or prospective approaches. Exclusions comprised research from unrelated fields, case reports, reviews, meta-analyses, proceedings, and studies lacking accessible abstracts or full texts. Two independent reviewers conducted the literature search, resolving any discrepancies through mutual agreement.

Data Extraction

Data extraction was systematically conducted to gather relevant information from the included studies identified through the literature search. A standardized data extraction form was developed a priori to capture key variables of interest across all eligible studies. The following information was extracted from each study: study characteristics (authors, publication year, study design, sample size, country of origin, type of AI model, specific AI algorithms or techniques employed), diagnostic objectives (nature of lumbar DDD condition(s) investigated, classification or detection task), performance metrics (accuracy, sensitivity, and specificity), main objectives, key findings, and other study details.
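The three extracted performance metrics follow from the counts of a binary confusion matrix. The sketch below uses their standard definitions; the function name and the example counts are hypothetical and are not taken from any included study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: diseased discs labelled abnormal/normal by the model.
    tn/fp: healthy discs labelled normal/abnormal by the model.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one case is required")
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: share of diseased discs that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of healthy discs that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical counts for 100 discs (50 degenerated, 50 normal).
example = binary_metrics(tp=45, fp=10, tn=40, fn=5)
print(example)
```

Because accuracy depends on the balance between diseased and healthy discs in a dataset, sensitivity and specificity are reported alongside it wherever the source studies provide them.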

Assessment of the Risk of Bias in This Systematic Review

This study used the prediction model risk of bias assessment tool (PROBAST).^8,9 PROBAST is a tool for assessing the risk of bias (ROB) and the applicability of diagnostic and prognostic prediction model studies. Each rating was categorized as high, low, or unclear. Two researchers independently evaluated each study's risk of bias before comparing their ratings. Discrepancies were resolved through consensus, and a third reviewer resolved any that remained.

Results

A total of 362 articles meeting the initial screening criteria were reviewed across the PubMed and Scopus databases. After a thorough assessment, 57 papers meeting specific criteria were identified, of which 20 were considered appropriate for inclusion after the full inclusion and exclusion criteria were applied. The review focused on the objectives of the AI models and the study types, and also recorded the nationality of each study. Table 1 summarizes the demographic data of the included studies by author and publication year.
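The selection counts reported above imply the number of records excluded at each step. The sketch below makes that arithmetic explicit; labelling the two steps as "screening" and "full assessment" is an assumption made for illustration.

```python
def selection_summary(reviewed, shortlisted, included):
    """Derive per-stage exclusions from the three reported counts."""
    if not (reviewed >= shortlisted >= included >= 0):
        raise ValueError("counts must be non-increasing across stages")
    return {
        "excluded_at_screening": reviewed - shortlisted,
        "excluded_at_full_assessment": shortlisted - included,
        "inclusion_rate_percent": round(100 * included / reviewed, 1),
    }


# Counts reported in this review: 362 reviewed, 57 shortlisted, 20 included.
summary = selection_summary(reviewed=362, shortlisted=57, included=20)
print(summary)
```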

Table 1.

Demographic data of the studies in this systematic review.

Authors	Published year	Sample size (N)	Nationality	Title of study	Type of study	Based on artificial intelligence (AI) technique
Raja’ S Alomari et al¹	2010	80	USA	Computer-aided diagnosis of lumbar disc pathology from clinical lower spine MRI	Experimental study with cross-validation experiment	Gaussian model/Gibbs model
Jaehan Koh et al¹⁰	2012	70	USA	Disc herniation diagnosis in MRI using a CAD framework and a two-level classifier	Experimental study	Heterogeneous classifiers: a Perceptron classifier, a least mean square Classifier, a support vector machine classifier, and a k-means Classifier
Ayse Betul Oktay et al³	2014	102	Turkey	Computer aided diagnosis of degenerative intervertebral disc diseases from lumbar MR images	Experimental study	Machine learning with support Vector machines (SVMs)
Silvia Ruiz-España et al¹²	2015	67	Spain	Semiautomatic computer-aided classification of degenerative lumbar spine disease in magnetic resonance imaging	Experimental study	Machine learning model with Gradient vector flow algorithm
Isaac Castro-Mateos et al²²	2016	240	United Kingdom	Intervertebral disc classification by its degree of degeneration from T2-weighted magnetic resonance images	Experimental study	Support vector machine (SVM) and logistic regression
Amir Jamaludin et al²³	2017	2009	United Kingdom	SpineNet: Automated classification and evidence visualization in spinal MRIs	Experimental study	Convolutional neural network (CNN)
Elias Ebrahimzadeh et al¹³	2018	210	Iran	Towards an automatic diagnosis system for lumbar disc herniation: The significance of local subset feature selection	Experimental study	Three classifiers: Multilayer Perceptron (MLP), K-nearest neighbor (KNN) and support vector machine (SVM)
Zhongyi Han et al¹⁴	2018	1818	Canada	Spine-GAN: Semantic segmentation of multiple spinal structures	Experimental study	Recurrent generative adversarial network (Spine-GAN)
Kai-Uwe Lewandrowski et al¹¹	2020	17 800	USA	Feasibility of deep learning algorithms for reporting in routine spine magnetic resonance imaging	Experimental study	Neural network models with convolutional neural network by segmentation algorithms
Shirly Sundarsingh et al¹⁵	2020	63	India	Diagnosis of disc bulge and disc desiccation in lumbar MRI using concatenated shape and texture features with random Forest classifier	Experimental study	Random forest (RF) with Histogram of Oriented Gradients (HOG) and Local structure robust Binary robust patterns (LS-RBRP)
A. Beulah et al¹⁶	2021	93	India	Degenerative disc disease diagnosis from lumbar MR images using hybrid features	Experimental study	Support vector machine (SVM) classifier and Gabor features
Fei Gao et al²⁴	2021	500	China	Automated grading of lumbar disc degeneration using a Push-Pull Regularization network based on MRI	Experimental study	Convolutional neural networks (CNNs): VGG-M (Visual Geometry Group-Medium), VGG-16 (Visual Geometry Group-16), GoogLeNet (Inception-v1), and ResNet-34 (Residual Network-34)
Frank Niemeyer et al¹⁸	2021	7948	Germany	A deep learning model for the accurate and reliable classification of disc degeneration based on MRI data	Experimental study	Deep convolutional neural network
Jen-Yung Tsai et al¹⁹	2021	168	Taiwan	Lumbar disc herniation automatic detection in magnetic resonance imaging based on deep learning	Experimental study	Convolutional neural network (CNN) backbone: a single-stage detection (YOLOv3)
Qiong Pan et al²	2021	500	China	Automatically diagnosing Disk bulge and Disk herniation with lumbar magnetic resonance images by using deep convolutional neural networks: Method Development study	Experimental study	Faster Region-based convolutional neural network (Faster R-CNN)
Alexandra Grob et al²⁰	2022	4410	Switzerland	External validation of the deep learning system “SpineNet” for grading radiological features of degeneration on MRIs of the lumbar spine	Experimental study	SpineNet (SN) with convolutional neural networks (CNNs)
Hua-Dong Zheng et al¹⁷	2022	5255	China	Deep learning-based high-accuracy quantitation for lumbar intervertebral disc degeneration from MRI	Experimental study	BianqueNet (semantic segmentation network based on the Deeplabv3+architecture)
Jason Pui Yin Cheung et al⁵	2022	1343	Hong Kong	Learning-based fully automated prediction of lumbar disc degeneration progression with specified clinical parameters and preliminary validation	Experimental study	MRI-SegFlow (convolutional neural network (CNN) architecture) and Visual Geometry Group-Medium (VGG-M)
Terence P. McSweeney et al²¹	2023	1331	Finland	External validation of SpineNet, an Open-Source deep learning model for grading lumbar Disk degeneration MRI features, using the Northern Finland Birth cohort 1966	Observational with experimental study	SpineNet (deep learning models)
Wongthawat Liawrungrueang et al⁴	2023	1000	Thailand	Automatic detection, classification, and grading of lumbar intervertebral disc degeneration using an artificial neural network model	Experimental study	Convolutional neural network (CNN) backbone: a single-stage detection (YOLOv5)

Publication information (Authors, Year, Sample Size and Nationality)

The systematic review includes studies conducted by authors from diverse nationalities spanning multiple years and sample sizes. Authors from countries such as the USA,^1,10,11 Turkey,³ Spain,¹² the United Kingdom,^22,23 Iran,¹³ Canada,¹⁴ India,^15,16 China,^2,17,24 Germany,¹⁸ Taiwan,¹⁹ Switzerland,²⁰ Hong Kong,⁵ Finland,²¹ and Thailand⁴ have contributed to this research. The sample sizes range from 63 to 17 800, reflecting the variability in study populations and research objectives. The multinational participation in research on this topic underscores the widespread interest in leveraging AI techniques to advance the diagnosis and classification of lumbar disc pathology across different health care settings and patient populations.

Type of Study and AI-Algorithm/Techniques

The systematic review encompasses a wide array of studies conducted by authors from diverse nationalities, each contributing to the advancement of lumbar disc pathology diagnosis and classification using various AI techniques (Table 1). These studies employ a spectrum of AI techniques, ranging from classical methods to state-of-the-art deep learning architectures. Classical machine learning algorithms such as support vector machines, random forests, perceptron classifiers, k-nearest neighbor, and logistic regression are widely utilized across experimental studies. Additionally, deep learning approaches, including convolutional neural networks (CNNs), recurrent generative adversarial networks (GANs), recurrent neural networks (RNNs), and semantic segmentation networks, are prominently featured in the reviewed literature. These AI techniques enable researchers to extract meaningful patterns and features from complex medical imaging data, facilitating accurate and efficient diagnosis and classification of lumbar disc pathology. Through a combination of experimental validation studies and observational analyses, authors demonstrate the effectiveness and generalizability of AI-driven approaches in enhancing clinical decision-making and patient care in the context of lumbar disc pathology.

Risk of Bias Analysis

This systematic review assessed the risk of bias and applicability of diagnostic and prognostic prediction model studies using the PROBAST tool.^8,9 The studies by Ayse Betul Oktay et al,³ Zhongyi Han et al,¹⁴ Kai-Uwe Lewandrowski et al,¹¹ Alexandra Grob et al,²⁰ Jason Pui Yin Cheung et al,⁵ Terence P. McSweeney et al,²¹ and Wongthawat Liawrungrueang et al⁴ were identified as having a low risk of bias and low concerns regarding applicability. In contrast, Raja’ S Alomari et al¹ presented a high risk of bias and high concerns regarding applicability. A significant number of studies, including those by Jaehan Koh et al,¹⁰ Silvia Ruiz-España et al,¹² Isaac Castro-Mateos et al,²² Elias Ebrahimzadeh et al,¹³ Shirly Sundarsingh et al,¹⁵ A. Beulah et al,¹⁶ and Jen-Yung Tsai et al,¹⁹ exhibited unclear risk of bias and unclear concerns regarding applicability. Additionally, studies by Amir Jamaludin et al,²³ Fei Gao et al,²⁴ Frank Niemeyer et al,¹⁸ Qiong Pan et al,² and Hua-Dong Zheng et al¹⁷ displayed unclear risk of bias but low concerns regarding applicability. A summary of the risk of bias assessment and applicability is reported in Table 2 and Figure 2.

Table 2.

The prediction model study risk of bias assessment tool (PROBAST) guidelines for assessing the risk of bias in this systematic review.

Author, Year	Risk of Bias				Applicability			Overall
Author, Year	1. Participants	2. Predictors	3. Outcome	4. Analysis	1. Participants	2. Predictors	3. Outcome	Risk of Bias	Applicability
Raja’ S Alomari et al¹	?	+	−	?	?	+	−	−	−
Jaehan Koh et al¹⁰	?	+	?	?	?	+	?	?	?
Ayse Betul Oktay et al³	+	+	+	+	+	+	+	+	+
Silvia Ruiz-España et al¹²	?	+	+	?	?	+	+	?	?
Isaac Castro-Mateos et al²²	?	+	+	?	?	+	+	?	?
Amir Jamaludin et al²³	+	+	?	?	+	+	+	?	+
Elias Ebrahimzadeh et al¹³	?	+	+	?	?	+	+	?	?
Zhongyi Han et al¹⁴	+	+	+	+	+	+	+	+	+
Kai-Uwe Lewandrowski et al¹¹	+	+	+	+	+	+	+	+	+
Shirly Sundarsingh et al¹⁵	?	+	+	?	?	+	+	?	?
A. Beulah et al¹⁶	?	+	+	?	?	+	+	?	?
Fei Gao et al²⁴	+	+	?	?	+	+	+	?	+
Frank Niemeyer et al¹⁸	+	+	?	?	+	+	+	?	+
Jen-Yung Tsai et al¹⁹	?	+	+	?	?	+	+	?	?
Qiong Pan et al²	+	+	?	?	+	+	+	?	+
Alexandra Grob et al²⁰	+	+	+	+	+	+	+	+	+
Hua-Dong Zheng et al¹⁷	+	+	?	?	+	+	+	?	+
Jason Pui Yin Cheung et al⁵	+	+	+	+	+	+	+	+	+
Terence P. McSweeney et al²¹	+	+	+	+	+	+	+	+	+
Wongthawat Liawrungrueang et al⁴	+	+	+	+	+	+	+	+	+

+ indicates low risk of bias/low concern regarding applicability; − indicates high risk of bias/high concern regarding applicability; ? indicates unclear risk of bias/unclear concern regarding applicability.
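The overall columns of Table 2 are consistent with the usual PROBAST aggregation rule: any high-rated domain gives an overall high rating, all-low domains give an overall low rating, and anything else is unclear. The sketch below encodes that rule as we understand it and checks it against three rows of Table 2. The function name is hypothetical, and the ASCII hyphen stands in for the minus sign used in the table.

```python
LOW, HIGH, UNCLEAR = "+", "-", "?"


def overall_judgement(domain_ratings):
    """Aggregate per-domain ratings into one overall rating."""
    if not domain_ratings:
        raise ValueError("at least one domain rating is required")
    if HIGH in domain_ratings:
        return HIGH  # one high-risk domain is enough
    if all(rating == LOW for rating in domain_ratings):
        return LOW  # every domain must be low
    return UNCLEAR  # no high domain, but at least one unclear


# (risk-of-bias domains, applicability domains) copied from Table 2.
rows = {
    "Alomari et al": (["?", "+", "-", "?"], ["?", "+", "-"]),
    "Oktay et al": (["+", "+", "+", "+"], ["+", "+", "+"]),
    "Jamaludin et al": (["+", "+", "?", "?"], ["+", "+", "+"]),
}
for name, (rob, applicability) in rows.items():
    print(name, overall_judgement(rob), overall_judgement(applicability))
```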

Figure 2.

Summary of the risk of bias assessment and applicability using prediction model risk of bias assessment tool (PROBAST).

Dataset of Deep Learning Model and Base of Classification

The dataset utilized across the studies predominantly consists of MRI scans, providing comprehensive representations of spinal anatomy and pathology. MRI is chosen for its superior soft tissue contrast, making it an ideal modality for assessing disc abnormalities. Researchers conducted various classification tasks within this dataset framework, often leveraging deep learning models to distinguish between different categories of disc conditions. These classification tasks are based on criteria such as identifying normal vs abnormal disc appearances or employing established grading systems such as the Pfirrmann grading system. Object detection, a crucial aspect of these studies, involves identifying and localizing specific abnormalities within the MRI scans. This includes detecting pathologies such as disc degeneration, herniation, bulging, and stenosis. By combining these approaches, researchers aim to develop robust AI models that accurately identify and characterize spinal abnormalities, thereby aiding in clinical diagnosis and treatment planning for patients with spinal pathologies (Table 3).
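For multi-class tasks such as Pfirrmann grading (grades 1 to 5), sensitivity and specificity are commonly computed per grade in a one-vs-rest fashion. The sketch below illustrates this with hypothetical labels; it is not the evaluation code of any included study, and individual studies may have aggregated their metrics differently.

```python
def per_grade_metrics(y_true, y_pred, grades):
    """One-vs-rest sensitivity and specificity for each grade."""
    pairs = list(zip(y_true, y_pred))
    results = {}
    for grade in grades:
        tp = sum(t == grade and p == grade for t, p in pairs)
        fn = sum(t == grade and p != grade for t, p in pairs)
        fp = sum(t != grade and p == grade for t, p in pairs)
        tn = sum(t != grade and p != grade for t, p in pairs)
        results[grade] = {
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        }
    return results


# Hypothetical reference grades and model predictions for eight discs.
reference = [1, 2, 2, 3, 3, 3, 4, 5]
predicted = [1, 2, 3, 3, 3, 4, 4, 5]

metrics = per_grade_metrics(reference, predicted, grades=range(1, 6))
accuracy = sum(t == p for t, p in zip(reference, predicted)) / len(reference)
print(accuracy, metrics[3])
```

In this toy example the overall accuracy is 0.75, while grade 3 has a sensitivity of 2/3 and a specificity of 0.8, which shows how a single accuracy figure can hide weaker performance on individual grades.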

Table 3.

Summary of object detection and performance of artificial intelligence models.

Authors	Published year	Dataset of deep learning model	Base of classification	Object detection	Accuracy, %	Sensitivity, %	Specificity, %
Raja’ S Alomari et al¹	2010	Magnetic resonance imaging	Classify the discs as normal or abnormal disc appearance	Disc degeneration/pathology	91.3%	-	-
Jaehan Koh et al¹⁰	2012	Magnetic resonance imaging	Classify the discs as normal or abnormal disc appearance	Disk herniation (overall model)	99%	-	-
				SVM classifier	53.5%	47.4%	59.6%
				Perceptron classifier	85.3%	81.7%	88.9%
				LMS classifier	92.8%	92.3%	93.2%
				k-Means classifier	61.6%	26.7%	96.4%
				Ensemble classifier	99.5%	99.0%	100.0%
Ayse Betul Oktay et al³	2014	Magnetic resonance imaging	Classify the discs as normal or degeneration	Disc degeneration	92.81%	94.6%	89.8%
Silvia Ruiz-España et al¹²	2015	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration	>95%	95.8%	92.6%
				Disc herniation	>95%	60%	87.1%
				Stenosis	>95%	70%	81.7%
Isaac Castro-Mateos et al²²	2016	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration	∼91%	87.3%	95.5%
Amir Jamaludin et al²³	2017	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration	71.5% to 82.5%	-	-
Elias Ebrahimzadeh et al¹³	2018	Magnetic resonance imaging	Classify the degenerative disc herniation diagnosis	Disc classifiers (overall model)	93.17%	78.11%	97.36%
				Disc classifier (SVM)	95.23%	82.60%	98.78%
				Disc classifier (KNN)	92.38%	77.27%	96.38%
				Disc classifier (MLP)	91.90%	74.46%	96.93%
Zhongyi Han et al¹⁴	2018	Magnetic resonance imaging	Spatial pathological correlations between both normal and abnormal structure	Disc degeneration	96.2%	86%	89.1%
Kai-Uwe Lewandrowski et al¹¹	2020	Magnetic resonance imaging	Segmentation of vertebrae, intervertebral discs, dural sac on sagittal, axial MRI images and a segmented 3-dimensional anatomical model	Disc bulging	84.9%	89.4%	81.2%
				Canal stenosis	86.2%	91.1%	82.5%
				Disc herniation	85.2%	81.8%	87.4%
				Foraminal stenosis	81.0%	72.4%	83.1%
Shirly Sundarsingh et al¹⁵	2020	Magnetic resonance imaging	Categorizing the normal IVD, disc bulge and disc desiccation	Disc degeneration (random forest with HOG + LS-RBRP features)	93.92%	90.59%	98.39%
A. Beulah et al¹⁶	2021	Magnetic resonance imaging	Classify the discs as non-degenerative and degenerative disc	Disc degeneration	92.47%	90.45%	93.8%
Fei Gao et al²⁴	2021	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration (overall)	86.0%	-	-
				VGG-M	72% to 80%	-	-
				VGG-16	74.5% to 83%	-	-
				GoogLeNet	74.5% to 84.5%	-	-
				ResNet-34	76% to 86%	-	-
Frank Niemeyer et al¹⁸	2021	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration	92%	90.2%	92.5%
Jen-Yung Tsai et al¹⁹	2021	Magnetic resonance imaging	Classify the discs as normal or abnormal in different lumbar vertebrae regions	Disc degeneration/Herniation	81.1%	91.7%	87%
Qiong Pan et al²	2021	Magnetic resonance imaging	3-Class classification as normal disk, disk bulge and disk herniation	Intervertebral Disk (IVD) classification	84.2% to 92.7%	-	-
Alexandra Grob et al²⁰	2022	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration	>55%	>50%	>88%
Hua-Dong Zheng et al¹⁷	2022	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration	89.43% to 94.38%	-	-
Jason Pui Yin Cheung et al⁵	2022	Magnetic resonance imaging	Pfirrmann grading system and Schneiderman score	Disc degeneration (Pfirrmann grading)	89.9%	60.4%	94.6%
				Disc degeneration (Schneiderman score)	90.2%	96.0%	79.7%
Terence P. McSweeney et al²¹	2023	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration	79%	79%	93%
Wongthawat Liawrungrueang et al⁴	2023	Magnetic resonance imaging	Pfirrmann grading system	Disc degeneration	∼95.0%	∼95.0%	∼98.0%

Performance of Artificial Intelligence

Studies focusing on AI-assisted object detection for disc degeneration using MRI are listed in Table 3. From Alomari’s study¹ in 2010 to Liawrungrueang’s work⁴ in 2023, each study utilized different datasets and classification bases to detect abnormalities in spinal discs. Object detection tasks ranged from identifying disc degeneration, herniation, bulging, and stenosis to classifying discs as normal or abnormal based on their appearance. Performance metrics varied across studies, with accuracy ranging from 71.5% to 99%, sensitivity from 47.4% to 99.0%, and specificity from 59.6% to 100.0%. Some studies employed classifiers such as SVM, Perceptron, LMS, k-Means, and ensemble methods, while others utilized deep learning models like VGG-M, VGG-16, GoogLeNet, and ResNet-34. The diversity in methodologies and outcomes underscores the ongoing efforts to develop accurate and reliable AI models for diagnosing pathologies through MRI (Table 3).

Application and Outcome

Several studies have investigated the application of AI in diagnosing lumbar DDD from MRI scans, employing diverse algorithms and methods. These include Gaussian and Gibbs models, Support Vector Machines (SVMs), CNNs, and ensemble classifiers, among others. Results indicate high accuracies ranging from approximately 81% to 99.5% in tasks such as classifying lumbar discs as normal or abnormal, detecting disc herniation, diagnosing disc degeneration, and segmenting spinal structures. Some studies achieved moderate accuracy in tasks like automatic detection of disc herniation or grading radiological features of degeneration. The use of AI models such as CNNs, Spine-GAN, and Faster R-CNN facilitated automated detection, classification, and grading of lumbar intervertebral disc degeneration, with accuracies reaching approximately 95%. These findings suggest promising potential for AI-driven MRI diagnosis in effectively assessing and managing lumbar DDD. Table 4 summarizes the application and outcome of artificial intelligence-assisted MRI diagnosis in lumbar DDD.

Table 4.

Summary of a systematic review of the application and outcome of artificial intelligence-assisted MRI diagnosis in lumbar degenerative disc disease.

Authors	Based on Application	Algorithms and Methods of Study	Outcome and Summary
Raja’ S Alomari et al¹	Lumbar DDD diagnosis from MRI	Gaussian model, Gibbs model	High accuracy in classifying lumbar discs as normal or abnormal (91.3%)
Jaehan Koh et al¹⁰	Disc herniation diagnosis in MRI using CAD framework	Perceptron classifier, least mean square, SVM, k-means	Ensemble classifier achieved high accuracy (99.5%) in disc herniation detection
Ayse Betul Oktay et al³	Lumbar intervertebral disc diseases diagnosis	Support vector machines (SVMs)	Achieved high accuracy (92.81%) in classifying discs as normal or degenerated
Silvia Ruiz-España et al¹²	Computer-aided classification of lumbar spine disease	Gradient vector flow algorithm	High accuracy (>95%) in semiautomatic classification of lumbar spine disease
Isaac Castro-Mateos et al²²	Intervertebral disc classification	Support vector machine (SVM), logistic regression	SVM and logistic regression achieved high accuracy in classifying disc degeneration (≈91%)
Amir Jamaludin et al²³	Automated classification in spinal MRIs	Convolutional neural network (CNN)	CNN-based model demonstrated automated classification with high accuracy
Elias Ebrahimzadeh et al¹³	Automatic diagnosis system for disc herniation	Multilayer perceptron (MLP), K-nearest neighbor (KNN), SVM	Achieved high accuracy (≈93.17%) in disc herniation diagnosis using various classifiers
Zhongyi Han et al¹⁴	Semantic segmentation of spinal structures	Recurrent generative adversarial network (Spine-GAN)	Achieved high accuracy (96.2%) in semantic segmentation of multiple spinal structures
Kai-Uwe Lewandrowski et al¹¹	Feasibility of deep learning algorithms	Convolutional neural network (CNN) by segmentation algorithms	Achieved moderate to high accuracy in automated grading of lumbar disc degeneration using various CNN models
Shirly Sundarsingh et al¹⁵	Diagnosis of disc bulge and desiccation	Random forest (RF) with Histogram of Oriented Gradients (HOG)	RF model achieved high accuracy (93.92%) in diagnosing disc bulge and desiccation
A. Beulah et al¹⁶	Degenerative disc disease diagnosis	Support vector machine (SVM) classifier, Gabor features	SVM classifier with Gabor features achieved high accuracy (92.47%) in diagnosing degenerative disc disease
Fei Gao et al²⁴	Automated grading of disc degeneration	Push-Pull Regularization network based on MRI	Deep learning model demonstrated high accuracy (≈92%) in grading lumbar disc degeneration
Frank Niemeyer et al¹⁸	Accurate and reliable classification of disc degeneration	Deep convolutional neural network	Deep CNN model achieved high accuracy (92%) in classifying disc degeneration severity
Jen-Yung Tsai et al¹⁹	Lumbar disc herniation automatic detection	Convolutional neural network (YOLOv3)	Achieved moderate accuracy (81.1%) in automatic detection of lumbar disc herniation
Qiong Pan et al²	Automatically diagnosing Disk bulge and herniation	Faster Region-based convolutional neural network (Faster R-CNN)	Achieved moderate to high accuracy (84.2% to 92.7%) in diagnosing disk bulge and herniation
Alexandra Grob et al²⁰	External validation of SpineNet	Convolutional neural networks (CNNs)	SpineNet demonstrated moderate accuracy (>55%) in grading radiological features of degeneration on MRIs of the lumbar spine
Hua-Dong Zheng et al¹⁷	High-accuracy quantitation for disc degeneration	BianqueNet (semantic segmentation network based on Deeplabv3+)	Achieved high accuracy (89.43% to 94.38%) in quantitating lumbar intervertebral disc degeneration from MRI
Jason Pui Yin Cheung et al⁵	Fully automated prediction of disc degeneration	MRI-SegFlow, Visual Geometry Group-Medium (VGG-M)	Achieved high accuracy (≈90%) in predicting lumbar disc degeneration progression with specified clinical parameters
Terence P. McSweeney et al²¹	External validation of SpineNet	SpineNet (deep learning models)	SpineNet demonstrated moderate accuracy (79%) in grading lumbar disc degeneration MRI features
Wongthawat Liawrungrueang et al⁴	Automatic detection, classification, and grading	Convolutional neural network (YOLOv5)	Achieved high accuracy (≈95.0%) in detecting, classifying, and grading lumbar intervertebral disc degeneration using AI model

Discussion

The findings of this systematic review underscore the transformative potential of AI technologies in enhancing the diagnosis of lumbar DDD from MRI scans. Across the included studies, a diverse array of AI algorithms, ranging from classical machine learning methods to sophisticated deep learning architectures, demonstrated impressive performance metrics in identifying and characterizing various manifestations of lumbar disc pathology.^4,6 These AI models consistently outperformed conventional diagnostic approaches, exhibiting high accuracy, sensitivity, and specificity.^4,11 The high accuracies reported across different studies underscore the robustness and generalizability of AI-driven diagnostic paradigms in the realm of lumbar DDD. Such accuracy is particularly noteworthy given the inherent subjectivity and variability associated with traditional radiological interpretations of MRI findings in lumbar spine pathology. By leveraging advanced pattern recognition capabilities and data-driven decision-making, AI systems offer a standardized and objective approach to lumbar DDD diagnosis, thereby mitigating the risk of interobserver variability and enhancing diagnostic precision. The use of AI in spine surgery and diagnosis is growing, with reported improvements in feasibility, accuracy, and safety.⁶

Moreover, the integration of AI technology into routine clinical workflows holds promise for streamlining diagnostic processes and optimizing patient care in lumbar DDD management. The proposed framework delineates a structured approach for harnessing AI capabilities to inform clinical decision-making, facilitate early disease detection, and tailor personalized therapeutic interventions. By automating labor-intensive tasks such as image analysis and classification, AI systems allow health care providers to allocate their time and expertise more efficiently, ultimately improving workflow efficiency and patient outcomes.

However, despite the promising strides made in AI-assisted MRI diagnosis of lumbar DDD, several challenges and considerations warrant attention. The heterogeneity in AI methodologies and datasets across the reviewed studies highlights the need for standardized protocols and benchmark datasets to facilitate comparability and reproducibility. Additionally, the ethical implications surrounding AI deployment in clinical practice, including issues of algorithm transparency, data privacy, and bias mitigation, necessitate careful consideration and proactive measures to ensure responsible and equitable implementation.

Regarding advances in understanding DDD and the role of AI in patient care, DDD poses a diagnostic challenge because it resembles asymptomatic ageing-related changes. Distinguishing between the two relies on correlating clinical symptoms, such as chronic back pain and reduced mobility, with specific MRI findings, such as disc height loss and herniation, which are more pronounced in symptomatic DDD. The MRI characteristics associated with DDD have been identified through studies comparing symptomatic and asymptomatic individuals, which establish criteria for diagnosis. These criteria are validated by sensitivity and specificity analyses to differentiate normal variation from pathological change.^6,11 AI technology has advanced DDD diagnosis and management by accurately detecting subtle MRI changes that human radiologists might overlook.^4,6 AI algorithms analyze imaging data to quantify disc degeneration, predict treatment outcomes, and guide personalized therapy plans based on individual patient profiles. In practice, integrating AI-driven diagnostics into clinical workflows improves accuracy and efficiency. Clinicians can use AI insights to tailor treatment strategies, optimize patient care, and potentially mitigate disease progression early, improving overall outcomes for patients with DDD.^4,6
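The sensitivity and specificity analyses mentioned above reduce to simple ratios over a confusion matrix. The following minimal sketch shows the standard definitions; the function name and the counts are hypothetical and are not taken from any study in this review.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    positives = tp + fn  # cases that truly have the finding
    negatives = tn + fp  # cases that truly lack the finding
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Hypothetical counts for 100 discs: 50 degenerated, 50 normal (assumed data).
metrics = diagnostic_metrics(tp=45, fp=10, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting sensitivity and specificity alongside accuracy matters because accuracy alone can look high when one class dominates the dataset.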

This study has limitations. First, the heterogeneity among the included studies in sample sizes, methodologies, and study designs may limit direct comparisons and generalizability. Second, although AI models demonstrated high accuracy, challenges related to clinical implementation, interpretability, and explainability persist. Third, the reliance on retrospective data and potential biases within these datasets may affect the generalizability of AI algorithms. Lastly, while AI technology shows promise, it should supplement rather than replace clinical judgment. Addressing these limitations requires further research, validation, and interdisciplinary collaboration to maximize the benefits of AI in lumbar DDD diagnosis.

Despite these limitations, AI methodologies, including machine learning and deep learning techniques such as convolutional neural networks (CNNs), have shown significant promise in enhancing the diagnosis of lumbar DDD using MRI. We summarize the advancements and current AI algorithms for lumbar DDD diagnosis in Table 5. These AI models offer several key features, including automated feature extraction, precise segmentation of anatomical structures, and classification of disc pathology severity. Support vector machines (SVMs), random forests (RF), ensemble learning, semantic segmentation networks, recurrent generative adversarial networks (GANs), transfer learning, and explainable AI further augment diagnostic capabilities by addressing specific challenges in lumbar DDD diagnosis. While these methods offer advantages such as improved accuracy and robustness, their interpretability, computational complexity, and data limitations warrant careful consideration. Overall, AI holds great potential to improve lumbar DDD diagnosis, offering more efficient and accurate methods for clinical assessment and patient management.

The proposed algorithm model provides a structured framework for integrating AI technology into routine clinical practice, enhancing diagnostic precision and improving patient outcomes in lumbar DDD management. Further research and validation are warranted to refine and optimize AI algorithms for real-world applications in lumbar DDD diagnosis. Finally, we applied the AI algorithms for lumbar DDD diagnosis to a web-based application within the hospital's computer-assisted diagnosis system (Figure 3).

Table 5.

Summary of advancements and current artificial intelligence (AI) algorithms for lumbar degenerative disc disease (DDD).

AI methodology	Description and clinical application in lumbar DDD diagnosis	Advantages	Limitations
Machine learning	Uses algorithms that learn from data to make predictions; classifies lumbar discs as normal or abnormal and predicts the severity of disc degeneration	Versatile and adaptable to different diagnostic tasks	Limited ability to handle complex, high-dimensional data; reliance on feature engineering may require domain expertise
Deep learning	Neural networks with multiple layers for complex pattern recognition; automated feature extraction from MRI images and segmentation of lumbar structures and pathologies	Capable of learning complex patterns from raw data; well suited to image-based diagnostic tasks	Prone to overfitting with insufficient training data; computational complexity may require substantial resources
Convolutional neural networks (CNNs)	Designed specifically for processing spatial data such as images; identification of disc herniation and degeneration patterns	High accuracy in image analysis and object detection	Interpretability may be challenging due to black-box nature
Support vector machines (SVMs)	Supervised learning models that analyze data for classification and regression tasks; discrimination between normal and abnormal lumbar discs	Effective in handling high-dimensional data	Sensitivity to the choice of kernel function and hyperparameters
Random forest (RF)	Ensemble learning method consisting of multiple decision trees; classification of lumbar disc degeneration severity	Robust against overfitting and noise in data	Interpretability may be compromised due to ensemble nature
Ensemble learning	Integrates predictions from multiple AI models to improve overall diagnostic performance; enhances accuracy and robustness in lumbar DDD diagnosis	Mitigates weaknesses of individual models	Increased computational complexity with model integration
Semantic segmentation networks	Focus on accurately delineating structures or regions of interest within images; precise segmentation of lumbar spine anatomy in MRI scans	Enables detailed assessment of disc morphology and pathology	Performance may degrade with anatomical variations or artifacts
Recurrent generative adversarial network (GAN)	Architecture consisting of a generator and a discriminator network; generation of synthetic MRI images for data augmentation	Augments training data and enhances model generalization	Ensuring realism and diversity of synthetic data may be challenging
Transfer learning	Technique that leverages models pre-trained on large datasets to improve performance on specific tasks; adaptation of pre-trained models for lumbar DDD diagnosis	Reduces need for extensive training data and computation	Domain shift between pre-training and target task may impact performance
Explainable AI	Focuses on transparency and interpretability of AI models' decision-making processes; provides insights into AI diagnostic decisions for lumbar DDD	Enhances trust and understanding of AI-assisted diagnoses	Interpretability techniques may add computational overhead
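As an illustration of the ensemble learning row in Table 5, predictions from several models can be combined by a simple plurality vote. This is a minimal sketch of one common ensembling scheme, not the method of any particular reviewed study; the function name, the tie-breaking rule, and the per-model predictions are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per case by plurality vote."""
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_cases = len(predictions[0])
    if any(len(p) != n_cases for p in predictions):
        raise ValueError("every model must predict the same number of cases")
    combined = []
    for case_labels in zip(*predictions):
        counts = Counter(case_labels)
        top = max(counts.values())
        # Assumed tie-break: pick the alphabetically first of the tied labels.
        combined.append(min(label for label, c in counts.items() if c == top))
    return combined


# Hypothetical outputs of three classifiers on four discs (assumed data).
svm_pred = ["normal", "herniation", "bulge", "normal"]
rf_pred = ["normal", "bulge", "bulge", "herniation"]
cnn_pred = ["normal", "herniation", "bulge", "bulge"]
ensemble_pred = majority_vote([svm_pred, rf_pred, cnn_pred])
print(ensemble_pred)
```

In the fourth case all three models disagree, so the outcome depends entirely on the tie-breaking rule. In practice a weighted or probability-averaging scheme is usually preferred when models differ in reliability.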

Figure 3.

An example of a web-based application for artificial intelligence-assisted diagnosis in lumbar degenerative disc disease.

Conclusion

This systematic review underscores the potential of AI technology to transform lumbar DDD diagnosis and management. By capitalizing on advanced machine learning and deep learning algorithms, AI-driven approaches offer a pathway towards standardized, objective, and efficient diagnostic processes, ultimately enhancing patient care and outcomes in lumbar spine pathology. Continued research, validation, and collaboration are essential to refine and optimize AI algorithms for real-world applications in lumbar DDD diagnosis and surgery.

Footnotes

Author Contributions

Conceptualization, WL., PS., JBP. and KRD.; methodology, WC., WL.; validation, WC. and WL.; formal analysis, PS., WC. and WL.; investigation, WL., WC. and PS.; resources, WL., WC. and PS.; data curation, WL. and PS.; writing—original draft preparation, WL., JBP. and KRD.; writing—review and editing, WL., PS., JBP. and KRD.; visualization, WL.; supervision, WL, JBP. and KRD.; project administration, WL.; funding acquisition, WL. All authors have read and agreed to the published version of the manuscript.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

IRB Statement

This study was conducted in accordance with the Declaration of Helsinki and with approval from the Ethics Committee and Institutional Review Board of University of Phayao (Institutional Review Board (IRB) approval, IRB Number: HREC-UP-HSST 1.1/032/67).

ORCID iDs

Wongthawat Liawrungrueang

Peem Sarasombath

K. Daniel Riew

Data Availability Statement

The data used in this research were acquired from a public resource.

References

1. Alomari, Corso, Chaudhary, Dhillon. Computer-aided diagnosis of lumbar disc pathology from clinical lower spine MRI. Int J Comput Assist Radiol Surg. 2010;5:287-293.

2. Pan, Zhang, et al. Automatically diagnosing disk bulge and disk herniation with lumbar magnetic resonance images by using deep convolutional neural networks: method development study. JMIR Med Inform. 2021;9:e14755.

3. Oktay, Albayrak, Akgul. Computer aided diagnosis of degenerative intervertebral disc diseases from lumbar MR images. Comput Med Imag Graph. 2014;38:613-619.

4. Liawrungrueang, Kim, Kotheeranurak, Jitpakdee, Sarasombath. Automatic detection, classification, and grading of lumbar intervertebral disc degeneration using an artificial neural network model. Diagnostics. 2023;13:663.

5. Cheung JPY, Kuang, Lai MKL, et al. Learning-based fully automated prediction of lumbar disc degeneration progression with specified clinical parameters and preliminary validation. Eur Spine J. 2022;31:1960-1968.

6. Liawrungrueang, Cho, Sarasombath, Kim. Current trends in artificial intelligence-assisted spine surgery: a systematic review. Asian Spine J. 2024;18:146-157.

7. Moher, Liberati, Tetzlaff, Altman; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151:264.

8. Wolff, Moons KGM, Riley, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170:51-58.

9. Fernandez-Felix, López-Alcalde, Roqué, Muriel, Zamora. CHARMS and PROBAST at your fingertips: a template for data extraction and risk of bias assessment in systematic reviews of predictive models. BMC Med Res Methodol. 2023;23:44.

10. Koh, Chaudhary, Dhillon. Disc herniation diagnosis in MRI using a CAD framework and a two-level classifier. Int J Comput Assist Radiol Surg. 2012;7:861-869.

11. Lewandrowski K-U, Muraleedharan, Eddy, et al. Feasibility of deep learning algorithms for reporting in routine spine magnetic resonance imaging. Internet J Spine Surg. 2020;14:S86-S97.

12. Ruiz-España, Arana, Moratal. Semiautomatic computer-aided classification of degenerative lumbar spine disease in magnetic resonance imaging. Comput Biol Med. 2015;62:196-205.

13. Ebrahimzadeh, Fayaz, Nikravan, Ahmadi, Dolatabad. Towards an automatic diagnosis system for lumbar disc herniation: the significance of local subset feature selection. Biomed Eng Appl Basis Commun. 2018;30:1850044. doi:10.4015/S1016237218500448.

14. Han, Wei, Mercado, Leung. Spine-GAN: semantic segmentation of multiple spinal structures. Med Image Anal. 2018;50:23-35.

15. Sundarsingh, Kesavan. Diagnosis of disc bulge and disc desiccation in lumbar MRI using concatenated shape and texture features with random forest classifier. Int J Imag Syst Technol. 2020;30:340-347.

16. Beulah, Sharmila, Pramod. Degenerative disc disease diagnosis from lumbar MR images using hybrid features. Vis Comput. 2022;38:2771-2783.

17. Zheng H-D, Sun Y-L, Kong D-W, et al. Deep learning-based high-accuracy quantitation for lumbar intervertebral disc degeneration from MRI. Nat Commun. 2022;13:841.

18. Niemeyer, Galbusera, Tao, Kienle, Beer, Wilke. A deep learning model for the accurate and reliable classification of disc degeneration based on MRI data. Invest Radiol. 2021;56:78-85.

19. Tsai J-Y, Hung IY-J, Guo, et al. Lumbar disc herniation automatic detection in magnetic resonance imaging based on deep learning. Front Bioeng Biotechnol. 2021;9:708137.

20. Grob, Loibl, Jamaludin, et al. External validation of the deep learning system ‘SpineNet’ for grading radiological features of degeneration on MRIs of the lumbar spine. Eur Spine J. 2022;31:2137-2148.

21. McSweeney, Tiulpin, Saarakkala, et al. External validation of SpineNet, an open-source deep learning model for grading lumbar disk degeneration MRI features, using the northern Finland birth cohort 1966. Spine. 2023;48:484-491.

22. Castro-Mateos, Hua, Pozo, Lazary, Frangi. Intervertebral disc classification by its degree of degeneration from T2-weighted magnetic resonance images. Eur Spine J. 2016;25:2721-2727.

23. Jamaludin, Kadir, Zisserman. SpineNet: automated classification and evidence visualization in spinal MRIs. Med Image Anal. 2017;41:63-73.

24. Gao, Liu, Zhang, Wang, Zhang. Automated grading of lumbar disc degeneration using a push-pull regularization network based on MRI. J Magn Reson Imag. 2021;53:799-806.

Artificial Intelligence-Assisted MRI Diagnosis in Lumbar Degenerative Disc Disease: A Systematic Review
