Abstract
Study Design
Retrospective study.
Objective
To evaluate diagnostic concordance between CT-derived Hounsfield Units (HU) and DXA T-scores in spinal surgery candidates, and to identify factors related to discordance.
Methods
We analyzed 180 patients (mean age 72.4 ± 8.2 years) who had lumbar CT and DXA. DXA osteoporosis was defined by the lowest T-score (lumbar spine or hip) ≤ −2.5, and CT osteoporosis was defined as HU ≤ 100. Patients were classified into concordant positive (DXA+/HU+), concordant negative (DXA−/HU−), and two discordant groups (DXA+/HU−, DXA−/HU+). Analyses included correlation, ROC analysis, and comparisons of age, BMI, sex, and the DXA site yielding the lowest T-score.
Results
HU and T-scores showed a moderate correlation (Spearman’s ρ = 0.467, P < 0.001) with moderate discrimination (AUC = 0.700; 95% CI 0.614-0.781). Concordance was 72.2% (130/180; DXA−/HU− = 94, DXA+/HU+ = 36); discordance was 27.8% (50/180; DXA+/HU− = 24; DXA−/HU+ = 26). DXA+/HU+ patients were older than DXA−/HU− patients (75.9 ± 6.5 vs 71.1 ± 9.1 years; P = 0.003), and both DXA+ groups had lower BMI than the DXA−/HU− group (22.9 ± 3.8 and 22.7 ± 4.8 vs 24.8 ± 3.8 kg/m²; P = 0.009 and 0.029). The HU 100-150 “gray zone” was not associated with discordance (23.3% vs 30.8%, P = 0.311).
Conclusions
HU values show moderate agreement with DXA and are useful for opportunistic screening. Because approximately 28% of patients were discordant, HU should be interpreted alongside DXA for comprehensive assessment, particularly in older or lower-BMI patients.
Introduction
Osteoporosis is a common comorbidity among patients undergoing spinal surgery, particularly in the elderly population.1,2 Accurate preoperative assessment of bone quality is essential to guide surgical planning and implant selection, and to minimize complications such as pedicle screw loosening and cage subsidence.3-5
Dual-energy X-ray absorptiometry (DXA) remains the gold standard for diagnosing osteoporosis based on T-scores.6 However, in spine surgery patients, its reliability may be compromised by degenerative changes or osteophytes in the lumbar spine, which can falsely elevate bone mineral density (BMD) readings. Furthermore, because it requires specialized equipment, DXA is not available in all facilities. Recent real-world data have shown that only about 25% of patients undergo DXA preoperatively, and as few as 10% postoperatively.7
Recently, Hounsfield Unit (HU) values derived from routine computed tomography (CT) scans have gained attention as a surrogate marker for BMD. Several studies have reported moderate to strong correlations between vertebral HU values and DXA-derived T-scores, supporting its use as a practical tool for opportunistic osteoporosis screening.8-10
Despite these advances, diagnostic discordance between HU-based assessments and DXA-derived T-scores is frequently observed: some patients present with high HU values but low T-scores, and vice versa. Such discrepancies may lead to misclassification of bone quality and suboptimal surgical decision-making, yet the extent of this discordance and the clinical factors that contribute to it remain unclear. Prior HU–DXA studies have largely focused on correlation and the performance of a single HU cutoff. In clinical practice, osteoporosis is defined using the lowest T-score across DXA sites, and discordance between lumbar CT HU and DXA classification may directly influence perioperative planning in spine surgery. Accordingly, we aimed to quantify clinically relevant HU–DXA discordance patterns, evaluate pragmatic HU thresholds for preoperative triage, and identify patient factors associated with mismatch. We hypothesized that age, sex, BMI, and the DXA site yielding the lowest T-score would contribute to HU–T-score discordance.
Methods
IRB Statement
This study was conducted in accordance with the ethical standards of the institutional and national research committees. Approval was obtained from the Institutional Review Board and the Committee for Conflict-of-Interest Management (25R-049). Given the study’s retrospective nature, the requirement for written informed consent was waived.
Included Patients
A total of 180 patients who underwent preoperative evaluation for degenerative spinal disorders at our institution between November 2018 and July 2025 were retrospectively reviewed. Inclusion criteria were age ≥ 50 years and the availability of both lumbar spine CT and DXA within three months prior to surgery. Patients were excluded if they had a history of spinal instrumentation or kyphoplasty, spinal tumors, or inflammatory spinal diseases, or if multiple vertebral fractures precluded accurate HU measurement between L1 and L4. Demographic data including age, sex, height, weight, and body mass index (BMI) were collected. Information on prior total hip arthroplasty (THA) was not captured as a structured variable in the database; therefore, patients with a history of unilateral THA were not explicitly excluded a priori, whereas patients with bilateral THA, for whom hip DXA could not be obtained, were excluded during data cleaning.
Hounsfield Unit Measurement
Axial HU values were measured on preoperative lumbar spine CT scans.11 As in previous studies, measurements followed the method reported by Schreiber et al.12-14
Three axial slices were selected for each vertebral body from L1 to L4: just below the superior endplate, at the mid-vertebral body, and just above the inferior endplate. At each level, an elliptical region of interest (ROI) was placed within the trabecular bone, carefully avoiding cortical bone and artifact areas. The mean HU value was calculated across all 12 slices (3 per vertebra × 4 vertebrae), yielding the Axial L1-4 average HU (Ax L1-4 Ave HU).
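The averaging step described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the ROI readings are hypothetical and are not taken from the study data.

```python
from statistics import mean

LEVELS = ("L1", "L2", "L3", "L4")
SLICES = ("superior", "mid", "inferior")  # three axial slices per vertebra


def axial_l1_4_average_hu(roi_hu):
    """Average the trabecular ROI means over 3 slices x 4 vertebrae.

    roi_hu maps vertebral level -> slice position -> mean HU of the ROI.
    A KeyError is raised if any of the 12 expected readings is missing.
    """
    values = [roi_hu[level][position] for level in LEVELS for position in SLICES]
    return mean(values)


# Hypothetical ROI readings for one patient (illustrative values only).
example_patient = {
    "L1": {"superior": 112.0, "mid": 104.0, "inferior": 108.0},
    "L2": {"superior": 101.0, "mid": 95.0, "inferior": 99.0},
    "L3": {"superior": 96.0, "mid": 90.0, "inferior": 94.0},
    "L4": {"superior": 102.0, "mid": 97.0, "inferior": 102.0},
}

ax_l1_4_ave_hu = axial_l1_4_average_hu(example_patient)
print(f"Ax L1-4 Ave HU = {ax_l1_4_ave_hu:.1f}")
```

Averaging all 12 readings with equal weight mirrors the description in the text; a per-vertebra mean followed by a mean across levels would give the same number because each vertebra contributes exactly three slices.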
All examinations were performed at a single tertiary-care center on one of four Siemens helical scanners (Siemens Healthineers, Erlangen, Germany) under standardized parameters (tube voltage 120 kVp, tube current 280 mA, slice thickness 3.0 mm, identical reconstruction interval) and reconstructed with the same bone reconstruction algorithm. Each scanner underwent routine calibration with a manufacturer-provided phantom in accordance with quality-assurance procedures to ensure longitudinal stability and reproducibility of HU measurements across devices. Cases with significant image artifacts or vertebral deformities that prevented consistent ROI placement were excluded. For all cases, a multi-density calibration phantom was included during acquisition to standardize HU measurements across patients and sessions. The phantom was positioned beneath the lumbar region according to the manufacturer’s instructions. HU values were extracted from trabecular ROIs and interpreted as phantom-referenced axial L1-4 averages. This approach reduces the influence of scanner-specific and reconstruction-related factors and enhances between-scan comparability within the cohort.
To ensure reproducibility, intra-rater and inter-rater reliability were assessed using the intraclass correlation coefficient (ICC [3,1]) based on a two-way mixed-effects model with absolute agreement.
Dual-energy X-ray Absorptiometry
DXA was performed using the Horizon W® system (Hologic Inc., USA). T-scores were obtained for both the lumbar spine and femoral neck. The lowest T-score (Low T-score) from either site was used for analysis for each patient. This approach reflects clinical practice and WHO-based diagnostic criteria, in which osteoporosis is diagnosed if the T-score is ≤ −2.5 at any measured site. In addition, lumbar spine T-scores in patients with degenerative lumbar disease are often artificially elevated by osteophytes and facet arthrosis; using the lowest T-score therefore provides a more conservative estimate of systemic bone fragility than relying on lumbar values alone. Osteoporosis was defined as a Low T-score ≤ −2.5.15 The interval between CT and DXA was less than three months in all cases, with most performed within four weeks of each other.
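As a minimal sketch of the Low T-score rule, the following Python snippet takes the lower of the two site-specific T-scores, records which site supplied it, and applies the ≤ −2.5 cutoff. The names are hypothetical, and the tie-breaking rule (lumbar spine when both sites are equal) is an assumption of this sketch, not something stated in the study.

```python
from typing import NamedTuple


class LowTScore(NamedTuple):
    value: float
    site: str


def low_t_score(lumbar_t: float, femoral_neck_t: float) -> LowTScore:
    """Return the lower T-score and the site that produced it."""
    # Assumption: ties are attributed to the lumbar spine (arbitrary choice).
    if lumbar_t <= femoral_neck_t:
        return LowTScore(lumbar_t, "lumbar spine")
    return LowTScore(femoral_neck_t, "femoral neck")


def is_dxa_osteoporosis(lumbar_t: float, femoral_neck_t: float,
                        cutoff: float = -2.5) -> bool:
    """DXA osteoporosis: Low T-score at or below the cutoff."""
    return low_t_score(lumbar_t, femoral_neck_t).value <= cutoff


# Illustrative values: a lumbar T-score that looks normal, with a low hip value.
result = low_t_score(lumbar_t=-1.2, femoral_neck_t=-2.7)
print(result.site, result.value, is_dxa_osteoporosis(-1.2, -2.7))
```

Keeping the site alongside the value makes it possible to tabulate which DXA site drove the classification, which is one of the factors compared across diagnostic groups.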
Statistical Analysis
All statistical analyses were conducted using SPSS software (version 23.0; IBM Corp., Armonk, NY, USA). Continuous variables were reported as mean ± standard deviation (SD), and categorical variables as frequencies and percentages. The relationship between Ax L1-4 Ave HU and Low T-score was assessed using linear regression and Spearman’s rank correlation.
For diagnostic classification, patients were categorized into four groups based on a threshold of 100 HU (Ax L1-4 Ave) and the DXA-based definition of osteoporosis (T-score ≤ −2.5). This 100 HU threshold was based on a previous report.16 To minimize analytic bias, we retained this prespecified cutoff for the main analyses and used cohort-specific optimization only as a secondary check: in our data, ROC analysis with Youden’s index yielded an optimal cut-point of ≈99 HU, which is concordant with the chosen threshold. Because any single cutoff trades sensitivity for specificity, we further conducted a threshold sensitivity analysis at 90, 100, 110, 120, and 130 HU, reporting concordance with DXA, sensitivity, specificity, PPV, NPV, and Cohen’s κ. This range of 90-130 HU was selected a priori based on prior opportunistic CT studies, which have reported lumbar thresholds of approximately 90-135 HU for identifying patients with DXA-defined osteoporosis. In addition, our HU measurements were phantom-referenced, which tends to shift absolute HU values slightly downward compared with non-calibrated scans, and our spine surgery cohort included many patients with advanced degenerative disease and poor bone quality. Therefore, we anticipated that the optimal HU cut-off in this population would fall toward the lower end of the published range. Detailed findings, including the Youden-optimal cut-point and operating characteristics at each threshold, are provided in the Results.
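The threshold sensitivity analysis can be illustrated with a short Python sketch that reclassifies a cohort at each prespecified HU threshold and reports sensitivity, specificity, and Youden’s J (sensitivity + specificity − 1). The function name and the eight-patient toy cohort are hypothetical and serve only to show the mechanics; they are not study data, and the analysis in this paper was run in SPSS.

```python
def sweep_thresholds(cohort, thresholds, t_cutoff=-2.5):
    """Reclassify (axial_hu, low_t_score) pairs at each HU threshold.

    HU <= threshold is treated as HU-positive; Low T-score <= t_cutoff
    is treated as DXA-positive (the reference standard).
    """
    rows = []
    for thr in thresholds:
        tp = fp = tn = fn = 0
        for hu, low_t in cohort:
            dxa_pos = low_t <= t_cutoff
            hu_pos = hu <= thr
            if dxa_pos and hu_pos:
                tp += 1
            elif dxa_pos:
                fn += 1
            elif hu_pos:
                fp += 1
            else:
                tn += 1
        if tp + fn == 0 or tn + fp == 0:
            raise ValueError("cohort needs both DXA-positive and DXA-negative cases")
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        rows.append({"threshold": thr, "tp": tp, "fp": fp, "tn": tn, "fn": fn,
                     "sensitivity": sens, "specificity": spec,
                     "youden_j": sens + spec - 1})
    return rows


# Hypothetical toy cohort of (axial HU, Low T-score) pairs, for illustration only.
toy_cohort = [(80, -3.1), (95, -2.6), (105, -2.8), (115, -2.5),
              (98, -1.9), (120, -1.5), (135, -2.0), (150, -0.8)]

table = sweep_thresholds(toy_cohort, thresholds=(90, 100, 110, 120, 130))
for row in table:
    print(row["threshold"], row["sensitivity"], row["specificity"], row["youden_j"])
```

In the toy data, sensitivity rises and specificity falls as the threshold increases, which is the trade-off the prespecified 90-130 HU range is meant to expose.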
Diagnostic Group Definitions Based on HU and DXA
Concordant osteoporosis (DXA+/HU+): T-score ≤ −2.5 and HU ≤ 100 (both HU and DXA indicate osteoporosis)
Concordant non-osteoporosis (DXA−/HU−): T-score > −2.5 and HU > 100 (both HU and DXA normal)
DXA-positive/HU-negative (DXA+/HU−): T-score ≤ −2.5 and HU > 100
DXA-negative/HU-positive (DXA−/HU+): T-score > −2.5 and HU ≤ 100
Patients were categorized into four diagnostic groups according to axial L1-4 average Hounsfield units (HU) obtained from lumbar spine CT and the Low T-score (lumbar spine or femoral neck) derived from DXA. A threshold of ≤ 100 HU was used to define osteoporosis based on CT, and a T-score of ≤ −2.5 was used to define osteoporosis based on DXA. Concordant groups reflect agreement between HU and DXA classifications, while discordant groups indicate a mismatch between the two modalities.
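The four-group rule can be written as a small Python function. This is a sketch under the thresholds stated above (HU ≤ 100 and Low T-score ≤ −2.5); the function name is hypothetical, and the labels use an ASCII hyphen in place of the minus sign used in the text.

```python
def classify_patient(axial_hu: float, low_t: float,
                     hu_cutoff: float = 100.0, t_cutoff: float = -2.5) -> str:
    """Return the diagnostic group label for one patient."""
    dxa_pos = low_t <= t_cutoff      # DXA-defined osteoporosis
    hu_pos = axial_hu <= hu_cutoff   # CT (HU)-defined osteoporosis
    dxa_label = "DXA+" if dxa_pos else "DXA-"
    hu_label = "HU+" if hu_pos else "HU-"
    return f"{dxa_label}/{hu_label}"


def is_discordant(group: str) -> bool:
    """Discordant groups are those in which the two modalities disagree."""
    return group in ("DXA+/HU-", "DXA-/HU+")


# Illustrative inputs only (not study data).
for hu, t in [(85, -3.0), (140, -1.0), (130, -2.8), (90, -1.5)]:
    group = classify_patient(hu, t)
    print(hu, t, group, "discordant" if is_discordant(group) else "concordant")
```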
Group differences in age and BMI were assessed using the Mann–Whitney U test. Sensitivity analyses using independent t-tests were conducted when normality assumptions appeared reasonable. Diagnostic accuracy of HU was evaluated via receiver operating characteristic (ROC) curve analysis with area under the curve (AUC) calculated. Residuals from the HU–T-score regression model were plotted to evaluate the magnitude and distribution of discordance. Sex and DXA site distributions were compared across discordant groups. A two-tailed P < 0.05 was considered statistically significant.
Results
Summary of Patient Characteristics and Diagnostic Indicators for Osteoporosis
Values are expressed as mean ± SD or n (%), as appropriate. Where applicable, “Low T-score” denotes the minimum of lumbar spine and femoral neck T-scores for each patient.
BMI, Body Mass Index; HU, Hounsfield Unit; BMD, Bone Mineral Density; YAM, Young Adult Mean; DXA, dual-energy X-ray absorptiometry.
Across the 180 patients, axial L1-4 HU showed a clear positive association with DXA-derived bone status (Figure 1). Axial HU was moderately correlated with the DXA-derived Low T-score (Pearson’s r = 0.505, Spearman’s ρ = 0.467; both P < 0.001). Correlations with site-specific T-scores were also significant (lumbar T-score: Pearson’s r = 0.664, Spearman’s ρ = 0.617; femoral-neck T-score: Spearman’s ρ = 0.450; all P < 0.001).
Figure 1. Correlation between axial L1-4 HU and DXA T-score
ROC analysis for detecting osteoporosis defined by a Low T-score ≤ −2.5 demonstrated moderate diagnostic accuracy (AUC = 0.700, 95% CI 0.614-0.781; Figure 2). The Youden index identified an optimal cutoff of 98.7 HU (i.e., approximately 100 HU), consistent with pragmatic thresholds reported in spine practice. At HU ≤ 100, sensitivity was 0.600 and specificity was 0.783 (PPV 0.581; NPV 0.797; Table 3). As the HU threshold increased (e.g., 110-130 HU), sensitivity increased at the expense of specificity, with overall concordance changing modestly and Cohen’s κ remaining in the fair–moderate range (Table 3). Accordingly, we highlight HU ≤ 100 as a clinically pragmatic triage threshold, while presenting the full range of prespecified thresholds in Table 3.
Figure 2. ROC curve of axial HU in predicting osteoporosis
Table 3. Threshold sensitivity analysis of axial L1-4 HU for detecting osteoporosis
Receiver operating characteristic (ROC) analysis using DXA osteoporosis defined as Low T-score ≤ −2.5 (minimum of lumbar spine and femoral neck). Patients were reclassified at prespecified HU thresholds (90, 100, 110, 120, 130 HU) and at the Youden-optimal cutoff (≈99 HU) derived from this cohort (n = 180; AUC = 0.700; 95% CI, 0.614-0.781). For each threshold, we report concordance with DXA (%, n), sensitivity, specificity, PPV, NPV, Cohen’s κ, and the confusion-matrix counts (TP/FP/TN/FN). Classification rule: HU ≤ threshold ⇒ osteoporosis (HU+).
HU, Hounsfield units; DXA, dual-energy X-ray absorptiometry; PPV, positive predictive value; NPV, negative predictive value.
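The operating characteristics at HU ≤ 100 follow directly from the four group counts given in the Abstract (DXA+/HU+ = 36, DXA−/HU+ = 26, DXA−/HU− = 94, DXA+/HU− = 24). The Python sketch below reproduces that arithmetic, treating DXA as the reference standard. The function name is hypothetical, and the Cohen’s κ it prints is computed here from those counts rather than quoted from Table 3.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard 2x2 metrics with DXA as reference and HU as index test."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n                        # overall concordance
    # Chance agreement expected from the marginal totals (for Cohen's kappa).
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "concordance": observed,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Counts at HU <= 100: TP = DXA+/HU+, FP = DXA-/HU+, TN = DXA-/HU-, FN = DXA+/HU-.
metrics = diagnostic_metrics(tp=36, fp=26, tn=94, fn=24)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, sensitivity is 36/60, specificity 94/120, PPV 36/62, and NPV 94/118, matching the values reported in the text.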
Using HU ≤ 100 and Low T-score ≤ −2.5 to define osteoporosis, 130/180 (72.2%) cases were concordant (DXA+/HU+ or DXA−/HU−) and 50/180 (27.8%) were discordant (DXA+/HU− or DXA−/HU+). Concordant cases tended to cluster near the ends of the HU spectrum, with clearly low values (<100 HU) corresponding to osteoporosis and clearly high values (>150 HU) corresponding to normal bone density, whereas discordant cases were commonly observed in the intermediate HU range around 100-150 HU (Figure 3). However, the discordance rate did not differ significantly inside vs outside this “gray zone” (100-150 HU: 17/73 [23.3%] vs outside: 33/107 [30.8%]; Fisher’s exact P = 0.311).
Figure 3. Distribution of axial L1-4 average HU by concordance with DXA classification
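The gray-zone comparison is a 2 × 2 table (17 discordant and 56 concordant cases inside 100-150 HU; 33 discordant and 74 concordant outside). As a hedged illustration, the Python sketch below implements a two-sided Fisher’s exact test from the hypergeometric distribution using only the standard library. It follows the common convention of summing all tables that are no more probable than the observed one; statistical packages may use slightly different conventions, so the result should be compared against, not substituted for, the reported P value.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as extreme (no more probable) than the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))


inside_discordant, inside_total = 17, 73     # 100-150 HU
outside_discordant, outside_total = 33, 107  # below 100 or above 150 HU

p_value = fisher_exact_two_sided(
    inside_discordant, inside_total - inside_discordant,
    outside_discordant, outside_total - outside_discordant,
)
print(f"inside: {inside_discordant / inside_total:.1%}, "
      f"outside: {outside_discordant / outside_total:.1%}, P = {p_value:.3f}")
```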
Model diagnostics were consistent with an overall acceptable fit but meaningful case-level variability. Residuals were centered around zero without major systematic bias (Figure 4). However, residual dispersion was greater in discordant cases, indicating that HU-based estimation of DXA status is less reliable at the individual level when HU and DXA disagree.
Figure 4. Histogram of residuals from HU–T-score regression
Baseline Characteristics Across Diagnostic Groups Defined by DXA and CT-Derived HU
Values are shown as mean ± SD unless otherwise indicated. “Sex (Male/Female)” is presented as counts. Minimum T-score and minimum YAM denote the lowest value between lumbar spine and femoral neck for each patient. Axial L1-4 HU indicates the average HU measured on axial CT slices at L1–L4 (three slices per vertebra; trabecular ROI avoiding cortex and artifacts).
Diagnostic groups are defined by DXA osteoporosis (T-score ≤−2.5 at lumbar spine or femoral neck) and HU threshold (Axial L1-4 HU ≤100): DXA−/HU−, DXA+/HU+, DXA+/HU−, and DXA−/HU+.
HU, Hounsfield Unit; DXA, dual-energy X-ray absorptiometry; BMD, bone mineral density; YAM, young adult mean; SD, standard deviation; n, number.

Comparison of Age and BMI Across Diagnostic Groups
Figure 6 illustrates the distribution of sex and the anatomical site of the Low T-score across the four diagnostic groups. Both DXA+ groups showed a marked female predominance, whereas the DXA−/HU− group had a higher proportion of males. The hip was the most frequent site of the Low T-score across all groups and was universal in the DXA+/HU− group, while lumbar spine–dominant Low T-scores were relatively uncommon. These patterns suggest that sex composition and hip-dominant fragility may contribute to discordance between HU- and DXA-based classifications.
Figure 6. Distribution of sex and DXA site across diagnostic groups
Discussion
This study evaluated the diagnostic concordance between axial lumbar HU values obtained from preoperative CT scans and DXA-derived T-scores in a cohort of 180 patients undergoing spinal surgery. HU and T-scores showed a significant positive correlation, yet more than one-quarter of patients (27.8%) exhibited diagnostic discordance between the two modalities despite the use of standardized thresholds (HU ≤ 100 and T-score ≤ −2.5). While prior studies have primarily emphasized HU–DXA correlation and single-threshold performance, our analysis focuses on clinically relevant discordance patterns using WHO-consistent practice. This distinction is particularly important in a spine-surgery cohort, where preoperative lumbar CT is routinely available for opportunistic assessment and where missed osteoporosis may directly influence fixation strategy, complication risk, and the timing of bone health optimization. Therefore, beyond reporting association, we aimed to quantify the real-world misclassification risk and to frame HU thresholds as pragmatic triage tools rather than definitive diagnostic substitutes for DXA.
ROC analysis revealed that axial HU values demonstrated moderate diagnostic performance for identifying osteoporosis (AUC = 0.700, 95% CI 0.614-0.781). These findings suggest that HU values are useful as a screening tool; however, the notable discordance rate indicates that HU should be interpreted as an adjunct to DXA rather than a stand-alone diagnostic test. At the commonly used threshold of HU ≤ 100, HU identified DXA-defined osteoporosis with a sensitivity of 0.600 and a specificity of 0.783 (PPV 0.581; NPV 0.797), meaning that approximately 40% of DXA-osteoporosis cases (24/60) would be missed by HU alone. This underscores the potential risk of underestimating bone fragility if CT HU is used as the sole determinant. Our threshold sensitivity analysis (Table 3) further showed that raising the HU cutoff (110-130 HU) improves sensitivity but reduces specificity, modestly changing net concordance and leaving Cohen’s κ in the fair–moderate range. Accordingly, HU thresholds should be viewed as decision aids rather than absolute diagnostic rules, and surgical decisions should integrate HU with DXA findings, patient characteristics, and procedural risk.
The subgroup analyses provide insight into which patients are most prone to HU–DXA discordance. As shown in Figure 5, the DXA+/HU+ group (low T-score and low HU) had a significantly higher mean age than the DXA−/HU− group, and both DXA+ groups (DXA+/HU+ and DXA+/HU−) exhibited lower BMI than the DXA−/HU− group. These patterns suggest that older age and lower BMI are associated with poorer bone status and may influence discordance. Figure 6 further demonstrated that the DXA+/HU− group had a strong female predominance and that all patients in this group had their lowest T-score at the hip rather than the lumbar spine. This indicates that lumbar DXA itself was not the cause of discordance, and that factors such as regional HU overestimation or under-recognition of hip fragility may be involved. In contrast, the DXA−/HU+ group showed a more balanced sex distribution and predominantly had the hip as the site of the lowest T-score, suggesting that DXA may not fully capture fragility in patients with low trabecular bone density when measurements are limited to cortical-dominant sites like the hip. These findings are consistent with previous reports suggesting that vertebral HU values reflect trabecular bone density and are less affected by degenerative changes than lumbar DXA measurements.11,16-18
Several mechanisms may underlie these discordant patterns. First, DXA provides a two-dimensional areal BMD measure that can be influenced by body size and overlying structures, whereas CT-derived HU reflects attenuation within predefined vertebral ROIs and more directly samples vertebral trabecular bone. Second, vertebral HU is predominantly trabecular, whereas femoral-neck DXA may be more influenced by cortical bone; discordance may occur when trabecular loss and cortical preservation (or the reverse) coexist. Third, degenerative changes such as osteophytes and endplate sclerosis can artifactually elevate lumbar DXA measurements, supporting reliance on the lowest T-score across sites in clinical practice. Finally, because our CT assessment was limited to lumbar HU, discordance, particularly in cases driven by a hip lowest T-score, may also reflect regional fragility and site mismatch between lumbar CT and hip DXA.
To explore whether diagnostic uncertainty was concentrated in the intermediate HU range, we examined discordance rates within the 100-150 HU “gray zone.” Discordance rates were similar inside and outside this range (23.3% [17/73] vs 30.8% [33/107], P = 0.311), indicating that intermediate HU values do not independently predict discordance. Thus, borderline HU values still warrant cautious interpretation, particularly in older or low-BMI individuals, but the 100-150 HU range should not be treated as a strict diagnostic boundary.
Taken together, these data suggest a potential mismatch between bone quantity, assessed by DXA, and bone quality, as reflected by HU. While DXA quantifies areal BMD, it may not adequately capture microarchitectural integrity or regional fragility, especially when cortical bone dominates, as in hip measurements. In contrast, HU measurements from CT better represent vertebral trabecular attenuation, which may correlate more closely with mechanical strength. Our single-center surgical cohort skews toward older patients with spine pathology, which can inflate the prevalence of low DXA T-scores and lower HU, potentially limiting generalizability to younger, healthier populations. Age-stratified analyses suggested higher negative concordance in younger patients and a greater proportion of discordance in the 60-69-year band, which is consistent with the expectation that younger adults more commonly show concordant negatives on both HU and DXA. However, these subgroup findings are exploratory and derived from a surgical cohort; prospective, multi-center studies including younger, community-dwelling controls will be necessary to confirm the performance and calibration of HU for general screening beyond preoperative settings.
In addition to DXA and CT-derived HU, magnetic resonance imaging (MRI)–based scoring systems for vertebral bone quality (VBQ) have also been proposed. MRI can provide complementary information on bone marrow composition, trabecular structure, and fat infiltration, potentially capturing aspects of bone quality that are not fully reflected by areal BMD or HU attenuation alone. Because MRI is frequently obtained in patients with degenerative spine disease, MRI-based VBQ scores may serve as an additional opportunistic tool, analogous to HU assessment on CT. Although MRI parameters were not evaluated in the present study, future work integrating DXA, CT-based HU, and MRI-derived bone quality indices may help establish a more comprehensive, multimodal framework for osteoporosis and fragility risk assessment in spine surgery candidates.
Clinically, these findings support the utility of HU as an adjunctive tool for assessing bone quality when DXA is unavailable or potentially inaccurate. 19 Prior studies have also supported the diagnostic value of HU.16-18,20 Given its reproducibility, routine availability on preoperative CT, and ability to reflect trabecular structure, HU assessment is well suited for preoperative bone quality evaluation and surgical risk stratification. From a practical standpoint, a risk-stratified workflow may be useful. When HU and DXA are concordant, management is straightforward (bone health optimization for concordant osteoporosis; routine management for concordant non-osteoporosis). When results are discordant, HU thresholds should be interpreted as decision aids rather than definitive diagnostic rules and integrated with DXA findings, clinical risk factors, and procedural risk. For DXA-defined osteoporosis with HU above the threshold (DXA+/HU−), hip-predominant fragility and/or site mismatch is likely; bone health optimization should be prioritized, and adjunct evaluation (e.g., QCT where available, MRI-based VBQ, or metabolic work-up including endocrinology referral) may be considered when uncertainty persists or surgical risk is high. For low HU with non-osteoporotic DXA (DXA−/HU+), HU may capture trabecular deficits not fully reflected by DXA; in higher-risk constructs, a more cautious strategy and additional evaluation can be considered. Because CT is more widely available than DXA in many care settings, opportunistic assessment of axial L1-4 HU from existing scans may help reduce access barriers and support earlier triage (e.g., prompting DXA referral or preoperative bone health optimization). At the same time, the moderate AUC and the non-trivial false-negative proportion at the 100-HU threshold emphasize that HU is best used as an adjunct to DXA.
Beyond bone density alone, there is growing recognition that muscle mass and quality also influence clinical outcomes and treatment decisions in spine surgery candidates. The emerging concept of “sarcoporosis” highlights the interplay between osteoporosis and sarcopenia, and recent work has demonstrated crosstalk between reduced vertebral bone density and fat-infiltrated psoas at the upper lumbar levels. Özcan-Ekşi et al. reported that fat-infiltrated psoas is closely linked to osteoporotic changes and may represent an additional marker of frailty in this population.21 Integrating CT-based bone metrics with quantitative assessments of sarcopenia or muscle fat infiltration may therefore further refine risk stratification and guide future therapeutic strategies.
In recent years, the importance of bone health optimization (BHO) in spine surgery has been increasingly recognized.22 CT-derived HU values have been proposed as a tool to predict complications such as screw loosening or cage subsidence, and opportunistic HU assessment offers a practical strategy for identifying at-risk patients, especially among the elderly or those suspected of osteoporosis. However, the discordance observed in our cohort underscores the need for integrated assessment, particularly in high-risk populations or cases near diagnostic thresholds. Accordingly, HU is best positioned as an opportunistic triage tool that complements DXA and clinical risk assessment, informing the urgency of bone health optimization and perioperative planning.
This study has several limitations. First, its retrospective design precludes causal inference and introduces potential selection bias. Second, although axial HU values from L1 to L4 were measured using a standardized, phantom-calibrated protocol, anatomical variation and image quality may still affect consistency, and our results may not be directly generalizable to other scanners, reconstruction kernels, or acquisition parameters without local calibration. Third, while the sample size of 180 patients is comparable to prior HU–DXA correlation studies, it may still be underpowered to detect subtle subgroup differences or to establish generalizable cutoffs across diverse populations. A larger, multicenter cohort would provide more robust statistical power. Fourth, we used DXA T-scores as the reference standard for diagnosing osteoporosis, consistent with WHO criteria. Given the well-known limitations of DXA, particularly in patients with spinal degenerative changes, using DXA as the gold standard may introduce classification bias when evaluating CT-derived HU values. In addition, because CT HU was assessed only in the lumbar spine (L1–L4) and we did not obtain proximal femur HU, discordance may partly reflect a site mismatch when osteoporosis is defined by the lowest T-score at any DXA site (often the hip). Concordance might be higher with same-site comparisons (e.g., femoral HU vs femoral-neck DXA), which should be evaluated in future studies. Moreover, a history of THA was not available as a structured variable in our database. Patients with bilateral THA were excluded because hip DXA could not be obtained; however, patients with unilateral THA were not systematically excluded, so residual misclassification of hip T-scores cannot be completely ruled out. Nonetheless, DXA remains the clinically accepted reference method, and our analysis was designed to reflect current diagnostic practice. 
In addition, we did not assess postoperative clinical outcomes (e.g., screw loosening, cage subsidence, or proximal junctional failure), which limits direct translational inference regarding the clinical consequences of HU–DXA discordance for surgical decision-making.
Fifth, HU values are influenced by technical factors including CT scanner manufacturer, reconstruction kernel, and acquisition parameters. Although all scans in this study were performed using a uniform protocol (120 kVp, 280 mA) at a single center, external validation across different institutions and vendors—ideally with harmonization using QCT or standardized phantoms—is needed to establish scanner-independent thresholds. In addition, several potentially important covariates were not uniformly available, including osteoporosis treatment history, vitamin D status, hormone replacement therapy, glucocorticoid exposure, and other metabolic factors, which may confound discordance patterns. Lastly, although we adopted an HU cutoff of 100 based on prior reports, other studies have proposed higher thresholds (e.g., 110 HU), 23 and a recent systematic review suggested a diagnostic interval of 90.9-138 HU. 10 Thus, the candidate thresholds evaluated in our study (90-130 HU) lie toward the lower end of this previously reported range and may appear numerically low, but they remain within the expected interval for CT-based identification of DXA-defined osteoporosis, particularly in a phantom-calibrated, high-risk surgical cohort. Our cohort’s ROC-optimal value (∼99 HU) aligns closely with these pragmatic thresholds yet still yields non-trivial false negatives and only moderate PPV, reinforcing that HU-based criteria should be locally validated and interpreted in conjunction with DXA and the clinical context.
Conclusions
Axial lumbar HU values derived from CT and DXA T-scores show only moderate concordance for the diagnosis of osteoporosis.
Approximately 27.8% of patients exhibited diagnostic discordance, which was associated with age, sex, BMI, and the anatomical site of the lowest DXA T-score.
Axial L1-4 HU values moderately correlate with DXA T-scores and may serve as an adjunctive tool for opportunistic osteoporosis screening in spine surgery candidates.
CT-derived HU values should be used as an adjunct—not a replacement—for DXA in preoperative bone quality assessment and surgical decision-making.
In discordant or borderline cases, perioperative planning should adopt an integrative approach that combines HU, DXA, and clinical risk factors to guide implant selection and optimize bone health.
Footnotes
Authors’ Note
This study, including any part of it, has no prior or duplicate submissions or publications elsewhere.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
