Abstract
Objective
To assess the composition of the lumbar multifidus muscle in patients with unilateral lumbar disc herniation causing nerve compression, using quantitative and qualitative magnetic resonance imaging (MRI) measurement methods.
Methods
Two radiologists retrospectively measured MRI signal intensity of the multifidus muscle, as high intensity represents more fat, and visually graded the fat content using a 5-point grading system in patients with unilateral subarticular lumbar disc herniation. Findings from the herniated and contralateral sides were compared. The associations between fat content and both severity of nerve compression and symptom duration were also evaluated.
Results
Ninety patients (aged 24–70 years) were included. Signal intensity of the affected multifidus muscle was significantly higher versus the contralateral muscle for quantitative measurements and qualitative scoring for both investigators. Significant correlations were observed between the severity of nerve compression and symptom duration and the degree of fat content in the affected multifidus muscle.
Conclusions
Higher fat composition was observed in the multifidus muscle ipsilateral to the lumbar disc herniation versus the contralateral side. Straightforward visual grading of muscle composition regarding fat infiltration appeared to be as useful as quantitative measurement.
Introduction
Lumbar disc herniation is a common disease of the lumbar spine and a frequent cause of back pain, muscle spasms, and movement restrictions.1–3 Besides resulting in radiculopathy-related signs and symptoms by compressing nerve roots, the presence of lumbar disc herniation also affects the paraspinal muscles, which unfortunately is often overlooked in clinical practice.3–8 The paraspinal muscles play a principal role in the functional and structural stabilization of the lumbar spine, and the most medial of these muscles, the multifidus muscle, has received particular attention due to its unique innervation. 9 In contrast to other components of the paraspinal musculature, the multifidus muscle has unilateral innervation that stems from the medial branch of the posterior root of the segmental nerve. 9 A growing body of literature has described the size and composition of the multifidus muscle, measured using magnetic resonance imaging (MRI), in patients with lumbar disc herniation.3–8,10–14 The value of measuring the multifidus cross-sectional area has been a subject of debate, with many disparities between published works.6,10–15 However, indirect assessment of the lumbar multifidus muscle by measuring MRI signal intensity has been shown to be superior to cross-sectional area measurements in depicting the effect of herniation on the muscle.8,12,16–19 Nevertheless, these measurements are generally complex and time-consuming, and require additional software programs, which prevents them from entering daily reporting algorithms.
The aim of the present study was to evaluate the lumbar multifidus muscle composition in patients with unilateral disc herniation at L4/L5 or L5/S1, using an MRI visual qualitative grading system compared with quantitative findings derived from MRI signal intensity measurements. In addition, quantitative and qualitative findings were compared with symptom duration and the severity of herniation. The present authors hypothesized that an experienced and well-trained radiologist would be able to interpret changes in the multifidus muscle without the need for additional software or complex measurements.
Patients and methods
Study population
Patients diagnosed with unilateral posterolateral (also named subarticular) disc herniation causing nerve root compression at L4/L5 or L5/S1 level on MRI were sequentially enrolled into this retrospective study, conducted at Mehmet Akif Ersoy Training and Research Hospital between January 2017 and October 2018. All included patients were aged > 18 years and had clinically diagnosed unilateral leg symptoms concordant with the herniation side and level. Patients with bilateral or multi-level disease, with a history of prior spinal surgery, malignancy, and/or history of acute trauma < 1 month prior to the MRI were excluded from the study. Patients with clinical symptoms that were discordant with MRI were excluded, and patients with oedema in the paraspinal muscles, according to short tau inversion recovery (STIR) sequences (described below), were also excluded from the study.
The study was approved by the local ethics committee of Mehmet Akif Ersoy Research and Training Hospital. All procedures involving human participants were performed in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The need for informed consent was waived by the ethics committee, as the study involved the retrospective review of anonymized DICOM files.
MRI acquisition and interpretation
All MRI acquisitions were conducted in the same 1.5 Tesla unit with a 12-channel spine coil (Magnetom Avanto; Siemens Medical Solutions, Erlangen, Germany). The lumbar spinal MRI protocol comprised axial T2-weighted images (repetition time [TR], 3370 ms; echo time [TE], 116 ms; matrix size, 208 × 320; and slice thickness, 4 mm), sagittal T2-weighted images (TR, 3370 ms; TE, 116 ms; matrix size, 208 × 320; and slice thickness, 4 mm) and a sagittal STIR sequence (TR, 1990 ms; TE, 66 ms; inversion time, 160 ms; flip angle, 150°; matrix size, 208 × 320; and slice thickness, 4 mm). All patients were diagnosed with a single-level unilateral posterolateral herniation at L4/L5 or L5/S1 causing nerve compression. Herniation was defined as a focal outpouring of disc material to the outside of the normal intervertebral disc space. 20 The severity of nerve compression was graded using the MRI-based nerve compression grading system proposed by Pfirrmann et al., 21 as follows: grade 1, contact of the herniation with the nerve; grade 2, deviation of the nerve; and grade 3, compression of the nerve by the herniation. Identification of the lumbar disc herniation, preparation of the first data set and grading of the herniation were performed by an observer (OLU) with 20 years of experience in musculoskeletal radiology. STIR sequences were used to identify the presence of oedema in the paraspinal muscles, and if the observer identified high signal intensity in the paraspinal muscles, then the patient was excluded from the study, as stated above. Next, single-slice T2-weighted images from the L4/L5 or L5/S1 levels were captured, using the lower end of the spinous process of the upper vertebra as a landmark to obtain images parallel to the muscle plane. Obtained images were then anonymized and stored for further evaluation.
The anonymized patient data set was then assessed by two investigators (DA and BC) with > 5 years of spinal MRI experience, who were blinded to symptom duration and the severity of nerve compression. Furthermore, prior acquisition of the single-slice image passing through the lower spinous process substantially prevented the investigators from identifying the side of the herniation, a factor that has not been mentioned in previous studies. First, the two investigators (DA and BC) were instructed and trained together for approximately 10 h, including a self-assessment period for qualitative staging of lumbar multifidus fat, with a grading system initially proposed using computed tomography for rotator cuff muscles, 22 then subsequently modified and used for evaluation of the lumbar multifidus muscle. 10 The system comprised five grades: grade 0, normal muscle tissue; grade 1, fat streaks; grade 2, presence of prominent fat, yet less than muscle tissue; grade 3, prominent fat equivalent to muscle; and grade 4, abundant fat exceeding muscle tissue. During the training period, the two investigators evaluated the images and self-assessed their skills in communication with each other to enhance the consistency of measurements. This original grading system was proposed on T1-weighted axial images, but in the present study, the system was applied to non-fat-saturated axial T2-weighted images, since T1-weighted axial sequence images are not used in the routine lumbar MRI protocol at the Mehmet Akif Ersoy Training and Research Hospital. Subsequently, the investigators were instructed and trained in manual segmentation of the multifidus muscle at the L4/L5 or L5/S1 levels, using ImageJ imaging software, version 1.43 (National Institutes of Health, Bethesda, MD, USA) as previously described.8,17 The investigators also received approximately 10 h of training for quantitative measurements.
Once trained, the investigators independently rated the anonymized images. In the first session, images were qualitatively graded and scored using the qualitative scoring system, 22 then in the next session, images were re-evaluated in a random order for quantitative measurements. For quantitative measurements, the investigators manually drew the border of the lumbar multifidus muscle at the L4/L5 or L5/S1 levels, and calculated signal intensity using ImageJ imaging software, in which higher signal intensity equated to more fat inside the region of interest (ROI). The investigators also drew a circular ROI of 10 mm diameter onto the ipsilateral psoas major, and the ratio of multifidus signal intensity to psoas major signal intensity was calculated as previously described. 23 For the quantitative measurements, the investigators calculated the mean signal values for each ROI, without any threshold to separate fatty composition of the muscles from the lean muscle mass. An example of muscle segmentation in a representative patient is shown in Figure 1.

Figure 1. Representative magnetic resonance images from a 55-year-old male with a 36-month history of back pain and right lower extremity radiculopathy: (a) sagittal T2-weighted image showing right posterolateral herniation (arrow) compressing the right L5 nerve, and (b) axial image created by the initial observer using the parallel line passing through the lower spinous process of L4 (green line on a). The two independent investigators manually segmented the lumbar multifidus muscles and placed a circular region of interest (ROI) with a 10 mm diameter onto the psoas muscles on the axial image (b). The affected multifidus (green ROI) was visually graded as 3 by both observers and the contralateral muscle (blue ROI) was graded as 1.
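The multifidus to psoas signal intensity ratio described in the measurement protocol can be illustrated with a minimal sketch. It assumes that the pixel values inside each manually drawn ROI have already been exported (for example, from ImageJ) as plain lists of numbers. The function names and all pixel values below are hypothetical and are not study data; as in the protocol, no threshold is applied to separate fat from lean muscle.

```python
from statistics import mean


def roi_mean(pixels):
    """Mean signal intensity over all pixel values inside one ROI."""
    if not pixels:
        raise ValueError("ROI contains no pixels")
    return mean(pixels)


def multifidus_to_psoas_ratio(multifidus_pixels, psoas_pixels):
    """Ratio of mean multifidus signal to mean ipsilateral psoas signal."""
    psoas = roi_mean(psoas_pixels)
    if psoas == 0:
        raise ValueError("psoas ROI mean is zero; ratio is undefined")
    return roi_mean(multifidus_pixels) / psoas


# Hypothetical pixel values for illustration only (not study data).
affected_multifidus = [52, 60, 58, 66, 64]
affected_psoas = [30, 32, 31, 33, 29]
contralateral_multifidus = [40, 44, 42, 46, 38]
contralateral_psoas = [31, 30, 32, 33, 29]

affected_ratio = multifidus_to_psoas_ratio(affected_multifidus, affected_psoas)
contralateral_ratio = multifidus_to_psoas_ratio(
    contralateral_multifidus, contralateral_psoas
)
print(f"affected: {affected_ratio:.2f}, contralateral: {contralateral_ratio:.2f}")
```

Dividing by the psoas signal on the same side expresses the multifidus signal relative to a reference muscle, so a higher ratio suggests a higher fat content in the multifidus.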
Statistical analyses
Continuous variables are presented as mean ± SD, unless otherwise specified, and categorical variables are presented as n (%) prevalence. Statistical analyses were performed using SPSS software, version 21 (IBM, Armonk, NY, USA). Visual histograms and the Kolmogorov–Smirnov test were used to assess normality of data distribution. Paired Student's t-test was used to compare signal intensity of the multifidus muscle on the affected versus contralateral side, and to compare the signal ratio of the multifidus to psoas muscle between the affected and contralateral sides. Visually graded fat content of the multifidus muscle was treated as an ordinal variable, and the nonparametric Wilcoxon signed–rank test was used to compare the results of visual muscle fat content assessment between sides. Correlations between symptom duration and the ratio of multifidus signal intensity to psoas signal intensity on the ipsilateral side with herniation, and between severity of herniation and the ratio of multifidus signal intensity to psoas signal intensity, were assessed using Pearson’s correlation coefficient. Spearman’s rank correlation coefficient was used to investigate the association between visual fat content scores and the severity of herniation and duration of symptoms. A P value < 0.05 (5% type-I error level) was considered to be statistically significant. Interobserver agreement for quantitative signal-intensity results for muscle on the affected and contralateral side was calculated using intra-class correlation coefficients (ICCs). A 95% confidence interval (CI) was constructed for each ICC. An ICC value > 0.80 was considered to show excellent agreement. Cohen’s kappa coefficient was used to assess interobserver reliability of visual grading scores, and values > 0.80 were considered to show excellent agreement.
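For readers less familiar with the paired comparison, the following is a minimal sketch of how a paired Student's t statistic is obtained from per-patient differences between the affected and contralateral sides. It is not the SPSS procedure used in the study; the function name and the input ratios are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(affected, contralateral):
    """Return (t, degrees of freedom) for paired measurements.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the per-patient
    differences (affected minus contralateral).
    """
    if len(affected) != len(contralateral):
        raise ValueError("paired samples must have the same length")
    diffs = [a - c for a, c in zip(affected, contralateral)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Hypothetical multifidus-to-psoas ratios for five patients (not study data).
affected = [1.9, 1.7, 2.1, 1.6, 1.8]
contralateral = [1.4, 1.3, 1.5, 1.3, 1.4]

t_value, dof = paired_t_statistic(affected, contralateral)
print(f"t = {t_value:.2f}, degrees of freedom = {dof}")
```

The P value would then be read from the t distribution with n − 1 degrees of freedom, which statistical packages report directly.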
Results
A total of 90 patients, 50 female (55.6%) and 40 male (44.4%), were enrolled in the final study cohort (Table 1), with a mean age of 44.88 ± 12.46 years (range, 24–70 years). The majority of herniations were observed on the right side (61.1%).
Table 1. Demographic, clinical and magnetic resonance imaging findings in patients with unilateral lumbar disc herniation.
Data presented as mean ± SD, range or n (%) prevalence.
Signal intensity was significantly higher in the affected multifidus muscle (indicating higher fat content) compared with the contralateral muscle, for both investigators (P < 0.0001; Table 2). No statistically significant difference was identified in terms of psoas signal intensities by either investigator (P > 0.05; Table 2); mean psoas muscle signal intensities for investigator (observer) 1 were 31.63 ± 1.10 for the affected side and 31.57 ± 1.12 for the contralateral side (P = 0.098), and for investigator (observer) 2 were 31.68 ± 1.17 for the affected side and 31.11 ± 1.24 for the contralateral side (P = 0.076). The ratios of multifidus muscle to psoas muscle signal intensity on the affected side were 1.83 ± 0.36 and 1.89 ± 0.37 for the two investigators, respectively. The multifidus to psoas ratios for the contralateral side were 1.35 ± 0.16 and 1.37 ± 0.23 for the two investigators, respectively. There were statistically significant differences in ratios between the two sides for both investigators (P < 0.0001; Table 2). Wilcoxon signed–rank test to assess qualitative visual grading scores revealed that scores were significantly higher on the ipsilateral multifidus muscle compared with the contralateral side for both investigators (P < 0.0001; Table 2).
Table 2. Quantitative signal intensity and visual fat content qualitative grading scores of the multifidus muscle in 90 patients with unilateral lumbar disc herniation.
Data presented as mean ± SD for signal intensity measurements, or n (%) prevalence for visual grading.
OB, observer (investigator); VG, visual grading.
NS, no statistically significant between-group difference (P > 0.05; Paired Student's t-test or Wilcoxon signed–rank test).
Moderate positive correlations were identified for both investigators between patient age and the signal intensity ratios of the multifidus to psoas muscle on the contralateral side of the herniation, indicating an increase in muscle fat composition with increasing age (r = 0.32, P = 0.002 and r = 0.41, P = 0.001 for investigators 1 and 2, respectively). Significant correlations were observed between the severity of nerve compression and the affected multifidus to psoas muscle ratio for both investigators (P < 0.0001 for both, r = 0.835 and 0.875, respectively). Significant correlations were also observed between the duration of symptoms and the ratio of the affected multifidus to the psoas muscle for both investigators (P < 0.0001 for both, r = 0.657 and 0.700, respectively). Spearman’s rank correlation coefficient revealed statistically significant correlations between symptom duration and visually graded muscle fat content for both investigators (P < 0.0001, r = 0.729 and r = 0.72, respectively). Moreover, there were statistically significant correlations between the severity of nerve compression and visually graded fat content of the muscle (P < 0.0001, r = 0.885 and r = 0.888, respectively).
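Because both the nerve compression grade and the visual fat grade are ordinal and contain many tied values, Spearman's coefficient is computed on ranks, with tied observations sharing their average rank. The sketch below shows this under those assumptions; the function names and the grades are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical grades for six patients (not study data).
compression_grade = [1, 1, 2, 2, 3, 3]  # nerve compression grade, 1-3
visual_fat_grade = [0, 1, 1, 2, 3, 4]   # visual fat grade, 0-4

rho = spearman(compression_grade, visual_fat_grade)
print(f"Spearman rho = {rho:.3f}")
```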
Cohen’s kappa value was 0.81 for qualitative visual grading of the affected muscle, and the ICC value was 0.96 (95% CI 0.94, 0.97) for quantitative measurements of the affected muscle.
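As an illustration of how agreement on the five-point visual grade can be summarized, the following minimal sketch computes an unweighted Cohen's kappa for two raters. Whether the study used the unweighted or a weighted form is not stated, so the unweighted form is an assumption here; the function name and the grades are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same cases.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion
    of agreement and p_e is the agreement expected by chance.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must grade the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical visual grades (0-4) for ten patients (not study data).
observer_1 = [0, 1, 1, 2, 2, 3, 3, 4, 4, 2]
observer_2 = [0, 1, 2, 2, 2, 3, 3, 4, 3, 2]

kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the observers agree on 8 of 10 cases, but kappa is lower than 0.80 because part of that agreement would be expected by chance alone.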
Discussion
In line with previous work,7,12,17 the present study demonstrated that fatty composition of the multifidus muscle at the same level and on the same side as lumbar herniation was considerably higher than on the contralateral side, and visual grading of the increased multifidus fat content was a reliable method with excellent inter-observer reliability scores. The present work also identified substantial correlations between muscle fat composition and symptom duration, and fatty changes to the muscle and severity of nerve compression.
The lumbar multifidus is well recognised to play a notable role in lumbar spine function. Indeed, much evidence has suggested that multifidus muscle dysfunction in lumbar disc herniation negatively affects the patient’s functional outcome following lumbar disc surgery, and knowledge of the presence and degree of muscle dysfunction would greatly aid physicians in tailoring the most suitable rehabilitation program to restore the functions of the lumbar spine.24–28 Accordingly, the composition and morphology of the multifidus muscle in patients with lumbar disc herniation or nerve root injury has gained much attention. One of the first studies concerning the effect of lumbar disc herniation on paraspinal muscles found that in a porcine model of nerve root injury, the cross-sectional area of the multifidus was decreased and the fat content was increased as a consequence of nerve injury. 29
Several further studies based on a theoretical framework of the experimental work of Hodges et al. 29 have explored the potential association between nerve root compression associated with lumbar disc herniation and fatty infiltration of the lumbar multifidus muscle. In line with the present results, a study that measured signal intensity of the multifidus muscle demonstrated increased fat content of the multifidus at the same level and same side as the herniation, in patients with L4/L5 herniation and concordant leg pain. 17 However, structural changes to the muscle in this study were prominent on the same side, yet at one level below the herniation. 17 Using a more straightforward semi-quantitative measurement method involving distance of the muscle to the lamina, 12 the muscle to lamina distance was reported to be increased on the same side and at the same level as the herniation, and the increased muscle to lamina distance was associated with fatty atrophy. 12 However, in contrast to the present findings, no association between increased fatty infiltration and duration of symptoms, or increased fatty infiltration and severity of nerve compression was found in either of these published studies.12,17 In line with the present results, Wan et al. 30 and Kim et al. 7 found an inverse correlation between symptom duration and atrophy of the multifidus muscle in patients with lumbar disc herniation. Despite the main focus of the study by Wan et al. 30 being the assessment of patients with lumbar back pain, rather than disc herniation, the authors stated that symptom duration was closely associated with fatty infiltration in addition to muscular atrophy.
A meta-analysis of 28 studies that evaluated the impact of lumbar disc herniation, facet degeneration, or canal stenosis on lumbar paraspinal muscle morphology concluded that disc herniation and severe facet degeneration were associated with altered paraspinal muscle morphology at, or below, the pathology level. 31 However, the authors highlighted the different measurement techniques in these studies and the need for a consensus on measurement methods, and recommended measurement techniques for standardization. 31 The authors also noted that normalization of brightness for pixel intensity assessment is important to reduce variability between equipment and facilities. 31 Pixel intensity normalization techniques were not employed in the present study; however, this issue may not have affected the reliability of the present study, as the same measurement protocol and MRI unit were used for the whole study population. The present authors do note that the lack of pixel normalization will hamper the comparability of the present results with any future studies. Hence, pixel intensity normalization should be employed in future work to provide standardization for the fat content measurement in paraspinal muscles.
A study that evaluated the multifidus muscle in patients with unilateral L4/L5 disc herniation, using the same visual grading system as the present study, 10 found significantly higher scores for the muscle on the same side as lumbar disc herniation, with an excellent ICC of 0.84, which is comparable with the present Cohen’s kappa value of 0.81. However, the potential association between symptom duration or herniation severity and the muscle composition was not explored. 10 In addition to this published work, 10 the present study utilized both quantitative and qualitative methods to measure multifidus muscle composition in patients with lumbar disc herniation and demonstrated that qualitative measurement of the multifidus muscle was comparable to the quantitative method, despite the slightly lower inter-rater reliability (Cohen’s kappa of 0.81 for visual grading versus ICC of 0.96 for quantitative measurement).
The feasibility and reliability of the Goutallier scoring system for assessing fat content of the lumbar multifidus muscle have been investigated previously. 32 Unlike the present study population, which consisted solely of patients with unilateral lumbar disc herniation, the previous investigation randomly included patients with different degrees of fatty infiltration in the lumbar multifidus muscle. 32 Similar to the present results, the Goutallier scoring system was found to yield comparable results to quantitative measurements obtained using ImageJ software, and the authors also stated that visual grading of the fat content had excellent interobserver reliability. 32 Therefore, the present authors suggest that, as the straightforward and practical visual scoring system appears to be as efficient as quantitative methods, it may be unnecessary to apply quantitative analysis, which can be inconvenient to integrate into daily interpretation algorithms, particularly since radiologists worldwide are increasingly overwhelmed by the rapidly growing number of examinations to report daily.
Several limitations of the present results should be acknowledged. First, histological samples were not obtained to compare with the imaging findings; however, previously published work has shown both the increased fatty infiltration of multifidus muscle fibres in the setting of nerve root injury, and the ability of MRI to depict multifidus composition alterations in nerve injury settings. 33 Thus, the present authors believe that this limitation should not affect the reliability of the present results. Secondly, the lumbar multifidus muscle was only assessed at the same level as the herniation since the segmental nerve directly innervates the muscle at the same level; furthermore, side- and level-specific changes are now well recognized, and the authors believe that they do not need to be explored further. Thirdly, unilateral radiculopathy was assessed with clinical examination, and electrodiagnostic work-up findings for the patients were not available. Fourthly, although inter-observer reliability was calculated, and yielded excellent agreement between observers for both quantitative and visual fat analyses, the intra-reader reliability was not assessed. Notwithstanding, previous work has demonstrated that pixel measurements using ImageJ software as a surrogate marker for fat infiltration of the muscles have excellent intra-reader and inter-reader reliability. 23 Finally, the investigators in the present study were trained and instructed in the visual and quantitative measurements; hence one might suggest that inter-observer reliability or diagnostic accuracy of the measurements, particularly visual grading, might decrease in daily settings. However, the investigators were only trained for approximately 20 hours, or possibly even less, and this is an acceptable duration for a musculoskeletal radiologist, since the volume of lumbar spinal imaging is high in the daily workflow.
In conclusion, both quantitative measurement methods and visual grading of the multifidus muscle, in patients with unilateral lumbar disc herniation, demonstrated increased fatty infiltration of the multifidus muscle on the same side and at the same level as the lumbar disc herniation, and fatty infiltration of the muscle correlated with symptom duration and the severity of nerve compression. Furthermore, the straightforward and practical visual grading of muscle composition regarding fat infiltration was as useful as quantitative measurement; hence, the present authors suggest that visually grading fatty infiltration of the multifidus muscle may be preferred in daily practice, and should be included in the reporting algorithms.
Footnotes
Acknowledgments
The authors would like to thank Dr Onur Levent Ulusoy for his contribution in the interpretation of patient magnetic resonance images.
Declaration of conflicting interest
The authors declare that there is no conflict of interest.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
