Predicting health facility deliveries using explainable machine learning in Sidama Region, Ethiopia: A prospective cohort study

Abstract

Background:

Health facility delivery (HFD) is a key intervention for reducing maternal and neonatal morbidity and mortality. However, a substantial proportion of women in Ethiopia continue to give birth at home. Early identification of women at risk of home delivery is essential to support targeted maternal health interventions.

Objective:

This study aimed to predict HFD utilization using machine learning (ML) models and to identify key determinants of delivery service uptake in the Sidama Region of Ethiopia.

Design:

A prospective cohort study was conducted among 3855 pregnant women who initiated antenatal care (ANC) in public health facilities across 4 districts of the Sidama Region between January 2021 and January 2025.

Methods:

Data were analyzed using R software version 4.3.1 (R Core Team, R Foundation for Statistical Computing, Vienna, Austria). Predictive models, including logistic regression, random forest, gradient boosting, and extreme gradient boosting (XGBoost), were developed to predict HFD utilization. Model performance was assessed using accuracy, sensitivity, specificity, F1-score, and the area under the receiver operating characteristic curve (AUC-ROC). SHapley Additive exPlanations (SHAP) were used to identify the most influential predictors.

Results:

Overall, 59.2% of women delivered in health facilities, while 40.8% delivered at home. Among the evaluated models, XGBoost demonstrated the highest predictive performance, achieving an accuracy of 86.9% (95% confidence interval (CI): 85.6–88.1) and an AUC-ROC of 0.91 (95% CI: 0.90–0.93). SHAP analysis identified place of residence, maternal education, timing of ANC initiation, parity, and distance to the nearest health facility as the most influential predictors. Rural residence, lower educational attainment, late ANC initiation, higher parity, and greater distance to health facilities were associated with a lower likelihood of HFD utilization.

Conclusion:

Despite improvements in HFD utilization, a substantial proportion of women in the Sidama Region continue to deliver at home. ML models offer a robust approach for identifying women at high risk of home delivery and supporting targeted, data-driven interventions. Strategies that promote early ANC engagement, maternal education, improved geographic access to health facilities, and integration of predictive analytics into health systems may enhance HFD utilization.

Keywords

health facility delivery utilization, machine learning, maternal health, Sidama Region, Ethiopia

Introduction

Maternal mortality remains a critical global public health challenge, with an estimated 295,000 deaths occurring annually, the vast majority in low- and middle-income countries.¹ In Ethiopia, maternal mortality remains unacceptably high at 401 deaths per 100,000 live births, reflecting persistent inequities in access to skilled maternal health services.² Health facility delivery (HFD) is widely recognized as one of the most effective interventions to reduce maternal and neonatal morbidity and mortality.³ However, a substantial proportion of women, particularly those in rural and underserved areas, continue to give birth at home without skilled attendants.^4,5 The identification of women at risk of home delivery and an understanding of the factors influencing HFD are therefore essential for achieving Sustainable Development Goal 3, which aims to reduce maternal mortality substantially.⁶

Previous studies in Ethiopia and across sub-Saharan Africa have consistently identified key determinants of HFD, including maternal education, socioeconomic status, parity, cultural norms, physical accessibility, and antenatal care (ANC) quality.^7–10 While these studies have provided valuable insights, most relied on conventional regression methods, which are limited in their ability to capture complex, nonlinear interactions, account for correlated predictors, or generate individualized risk predictions.¹¹ Consequently, predictive frameworks capable of integrating diverse sociodemographic, obstetric, and health system–related factors remain scarce.^12,13

Machine learning (ML) has emerged as a powerful approach in maternal and child health research, offering improved predictive performance over traditional regression techniques for outcomes such as ANC dropout, childhood immunization uptake, and maternal complications.^14–17 In the context of HFD, ML enables individual-level risk prediction, which is crucial for actionable risk stratification, targeted interventions, and optimized resource allocation. Unlike conventional regression, ML can model complex, nonlinear interactions and provide interpretable explanations of individual predictions when combined with techniques such as SHapley Additive exPlanations (SHAP), allowing policymakers and health planners to prioritize high-risk women and implement context-specific interventions.

Despite this promise, ML applications for predicting HFD in Ethiopia remain limited; to our knowledge, explainable ML approaches have not yet been applied in the Sidama Region, which features a heterogeneous population, uneven health infrastructure, and culturally diverse communities.^18,19 National surveys indicate substantial urban–rural disparities, with rural women disproportionately affected by long travel distances, transportation barriers, and traditional practices.^2,20 Developing context-specific, interpretable predictive models could therefore improve targeted interventions, support data-driven policy, and enhance maternal and neonatal outcomes.

Against this backdrop, the present study applies explainable machine learning (extreme gradient boosting (XGBoost) with SHAP) to predict HFD among pregnant women in Sidama Region, Ethiopia. The study aimed to identify key sociodemographic, obstetric, and health system-related predictors of HFD service utilization. By integrating predictive modeling with interpretable outputs, the study not only confirms known predictors but also provides novel, actionable insights to guide interventions, support risk-flagging in ANC services, and ultimately reduce maternal and neonatal morbidity and mortality.

Methods

Study design and setting

A prospective cohort study was conducted to examine HFD utilization among pregnant women who initiated ANC in four districts of the Northern Zone of the Sidama Region, Ethiopia. Sidama Region is characterized by a heterogeneous mix of urban and rural populations, diverse socioeconomic conditions, and variable healthcare infrastructure, making it an appropriate setting for examining determinants of HFD utilization.^18,19 Maternal health services in the region are delivered through a tiered health system comprising primary healthcare units, health centers, and referral hospitals, which collectively provide ANC, HFD, and postnatal care services to women and newborns.

Reporting guidelines

This prospective cohort study is reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines²¹ for observational research and the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis–Machine Learning (TRIPOD-ML) guidelines²² for prediction model studies. Key items from each checklist were followed to ensure transparent reporting of participant selection, predictor definition, outcome measures, model development, validation, and clinical utility. The STROBE and TRIPOD-ML checklists are provided as Supplemental Material 1 and 2, respectively.

Study period

Data were collected from January 2021 to January 2025 using routinely recorded maternal health service data within a prospective cohort framework. This extended period provided adequate temporal coverage to capture seasonal, demographic, and service-related variations in HFD utilization.

Eligibility criteria

The study included all pregnant women who initiated ANC at the selected public health facilities within the four districts of the Northern Zone of Sidama Region between January 2021 and January 2025. Inclusion criteria were women with complete documentation of ANC visits, HFD outcomes, and relevant sociodemographic and obstetric information. Exclusion criteria were women who transferred out before delivery, those with missing key variables, and women with multiple pregnancies for whom the delivery outcome could not be clearly ascertained.

Sample size calculation

The minimum required sample size was calculated using the single population proportion formula, based on the 2019 Ethiopian Demographic and Health Survey estimate of 50% for HFD,² which yields the maximum sample size for conservative planning. At a 95% confidence level and a margin of error of 2.5%, the estimated minimum sample size was 1537 women. Allowing for up to 10% incomplete or missing records, the final minimum required sample size was 1708 women. Our study included 3855 eligible records, substantially exceeding this threshold, thereby ensuring sufficient statistical power for both descriptive analyses and machine learning-based prediction of HFD utilization.
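The arithmetic behind these figures can be reproduced with a short sketch. It is written in Python for illustration only (the study itself used R), and it assumes a z-value of 1.96 and an allowance for missing records applied by dividing by (1 − 0.10); the text reports only the resulting numbers, so both choices are assumptions that happen to reproduce them.

```python
import math

def single_proportion_sample_size(p, margin, z=1.96):
    """Minimum n to estimate a proportion p within +/- margin (z assumed to be 1.96)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def inflate_for_missing(n, missing_rate):
    """Enlarge n so that n usable records remain after losing a share missing_rate."""
    return math.ceil(n / (1 - missing_rate))

n_min = single_proportion_sample_size(p=0.50, margin=0.025)
n_final = inflate_for_missing(n_min, missing_rate=0.10)
print(n_min, n_final)
```

Using p = 0.50 maximizes p(1 − p), which is why it gives the most conservative (largest) sample size for a given margin of error.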

Data sources and variables

Data were extracted from routinely collected facility-based maternal health records, including ANC registers, delivery logs, and individual maternal health cards, and were supplemented by interviews to obtain data on selected variables. The primary outcome variable was HFD utilization, defined as childbirth occurring in a recognized health facility under the supervision of skilled health personnel.

Predictor variables were selected based on prior literature and data availability and were categorized as follows:

Sociodemographic factors: maternal age, marital status, educational attainment, occupation, place of residence (urban/rural), and household income.

Obstetric factors: parity, gravidity, history of obstetric complications, timing of ANC initiation, and number of ANC visits.

Health care–related factors: distance to the nearest health facility, availability of transportation, previous experience with health services, and type of health facility attended.

All variables were systematically cleaned, coded, and standardized to ensure consistency and suitability for predictive modeling.

Machine learning procedures

We developed predictive models using four supervised algorithms: logistic regression, random forest, gradient boosting, and XGBoost.²³ The analytic dataset was partitioned at the individual level into a training set (70%) and an independent testing set (30%) using stratified random sampling to preserve outcome incidence. Each observation represented a unique pregnancy, and no repeated measurements were present.
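A stratified 70/30 partition of this kind can be sketched as follows. This is a minimal standard-library Python illustration, not the study's R code; the function name, the seed, and the toy outcome vector (built to mimic the reported 59.2% / 40.8% split) are assumptions introduced for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.30, seed=42):
    """Return (train_idx, test_idx), drawing test_fraction of the records within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def event_share(labels, idx):
    """Proportion of records in idx with outcome 1 (facility delivery)."""
    return sum(labels[i] for i in idx) / len(idx)

# Toy outcome vector (not study data): 1 = facility delivery, 0 = home delivery.
labels = [1] * 592 + [0] * 408
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print(round(event_share(labels, train_idx), 3), round(event_share(labels, test_idx), 3))
```

Because sampling is done within each outcome class, the facility-delivery share is nearly identical in the two partitions, which is the property the stratification is meant to preserve.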

Hyperparameter optimization was performed using a grid search strategy tailored to each algorithm. Tuned parameters included tree depth, number of estimators, learning rate, subsampling ratios, and regularization terms, depending on model architecture. Model selection was guided by the mean area under the receiver operating characteristic curve (AUC-ROC) estimated across stratified fivefold cross-validation, chosen to balance bias reduction and computational efficiency.
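The selection rule described here (choose the configuration with the highest mean cross-validated AUC) can be illustrated with a small Python sketch. The configuration names and fold scores below are hypothetical toy values, and the AUC is computed with the pairwise-ranking definition rather than any particular library routine.

```python
from statistics import mean

def auc_roc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def select_by_mean_auc(cv_results):
    """cv_results maps a configuration name to a list of (y_true, scores) validation folds."""
    mean_auc = {name: mean(auc_roc(y, s) for y, s in folds)
                for name, folds in cv_results.items()}
    best = max(mean_auc, key=mean_auc.get)
    return best, mean_auc

# Hypothetical validation-fold predictions for two candidate settings (two folds each).
y = [1, 1, 0, 0]
cv_results = {
    "depth=3": [(y, [0.9, 0.4, 0.6, 0.2]), (y, [0.8, 0.7, 0.3, 0.1])],
    "depth=6": [(y, [0.9, 0.8, 0.3, 0.2]), (y, [0.7, 0.6, 0.4, 0.1])],
}
best, mean_auc = select_by_mean_auc(cv_results)
print(best, mean_auc)
```

In a full grid search, each configuration would be refit on four of the five training folds and scored on the fifth, so that the held-out testing set plays no part in the choice.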

The dataset contained no missing values, and class distributions were sufficiently balanced; therefore, no imputation or resampling procedures were applied. Final model performance was assessed on the held-out testing set using accuracy, sensitivity, specificity, F1-score, and AUC, with 95% confidence intervals (CI), ensuring unbiased evaluation and reproducibility.
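The threshold-based metrics listed above follow directly from the confusion matrix. The Python sketch below uses hypothetical counts, and the Wilson score interval is an assumption made for illustration: the text does not state which method was used to obtain the 95% CIs.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, and F1 from confusion-matrix counts."""
    n = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

# Hypothetical test-set counts (not study data); positive class = facility delivery.
tp, fn, tn, fp = 80, 20, 90, 10
m = classification_metrics(tp, fp, tn, fn)
acc_lo, acc_hi = wilson_ci(tp + tn, tp + fp + tn + fn)
print(m)
print(round(acc_lo, 3), round(acc_hi, 3))
```

The interval narrows as the number of test records grows, so the width of a reported CI depends on the size of the held-out set as well as on the point estimate.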

Model validation and leakage control

To prevent information leakage, all model training, cross-validation, and hyperparameter tuning steps were confined to the training data. The testing set was not accessed at any stage of model fitting or optimization. Categorical predictors were encoded using dummy variables prior to model training. Feature scaling or normalization was not applied, as the primary models were tree-based and invariant to feature scale.

No automated feature selection or dimensionality reduction techniques were used. All predictors were prespecified based on clinical relevance and prior literature, ensuring interpretability and alignment with real-world maternal health decision-making. Comparable performance between training cross-validation and independent testing indicated minimal overfitting and stable generalization.

Model interpretability using SHAP

To enable transparent interpretation of model predictions, SHAP was applied to the final tuned XGBoost model, selected for its superior discriminative performance. SHAP values quantify the marginal contribution of each predictor to individual risk estimates, grounded in cooperative game theory and additive feature attribution.
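The game-theoretic idea can be made concrete with a brute-force Shapley calculation on a toy scoring function. This Python sketch is illustrative only: the three feature names and their contributions are hypothetical, and SHAP implementations for tree ensembles use far more efficient algorithms than the full enumeration shown here.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values; value(coalition) returns the score for a frozenset of features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

def toy_value(coalition):
    """Hypothetical log-odds score with one interaction (urban x secondary education)."""
    score = 0.0
    if "urban" in coalition:
        score += 1.2
    if "early_anc" in coalition:
        score += 0.6
    if "urban" in coalition and "secondary_edu" in coalition:
        score += 0.4
    return score

features = ["urban", "early_anc", "secondary_edu"]
phi = shapley_values(features, toy_value)
print(phi)
```

Two properties are visible in the output: the attributions sum to the difference between the full score and the empty-coalition score (additivity), and the interaction term is split equally between the two features that produce it.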

Global feature importance was examined using SHAP summary plots, which display both the magnitude and direction of each predictor’s influence across the population. Local interpretability was explored through force plots for representative cases, illustrating how specific features increased or decreased the model-predicted probability of HFD. Feature interactions were further assessed using SHAP dependence plots to identify nonlinear effects and conditional relationships between predictors.

In the presence of correlated predictors, SHAP appropriately distributed attribution across related features, consistent with its theoretical framework. This approach ensured that interpretation reflected the joint structure of the model rather than isolated variable effects. By integrating explainable ML with rigorous validation, the study provides clinically meaningful insights while maintaining methodological transparency.

Data quality and missing data

All data were prospectively extracted from routine maternal health records, ensuring reliability and reducing recall bias. Data completeness was high. After applying the eligibility criteria, no missing values were present in the variables used for model development, and therefore, no imputation was required. Data completeness and the absence of missing values were explicitly verified prior to model training. These measures strengthen transparency and enhance adherence to the STROBE guidelines, providing a clear and rigorous basis for observational inference.

Statistical analysis

Data were entered and managed using Microsoft Excel and subsequently imported into the R software version 4.3.1 (R Core Team, R Foundation for Statistical Computing, Vienna, Austria) for statistical and machine learning analyses. Descriptive statistics were used to summarize participant characteristics and HFD outcomes. For predictive modeling, the dataset was randomly partitioned into a training set (70%) and a testing set (30%), following standard machine learning practice.

Multiple supervised learning algorithms were implemented, including logistic regression, random forest, gradient boosting, and XGBoost, to develop prediction models for HFD utilization. Hyperparameter tuning and fivefold cross-validation were applied to optimize model performance and minimize overfitting. Model performance was evaluated using accuracy, sensitivity, specificity, F1-score, and the AUC-ROC, which are widely recommended metrics for classification tasks in health research.

To enhance model interpretability and identify the most influential predictors, SHAP values were computed for the final XGBoost model, providing consistent and transparent estimates of variable importance. All analyses were conducted using a 5% significance level for descriptive and inferential reporting. Results were reported using estimates with their 95% confidence intervals.

Results

Characteristics of study participants

Figure 1 illustrates the flow of participants through the study. A total of 4120 maternal health records were initially screened from public health facilities across 4 districts in the Northern Zone of the Sidama Region between January 2021 and January 2025. Records were excluded for the following reasons: missing key predictors (n = 165), incomplete HFD outcome documentation (n = 80), or duplicate entries (n = 20). After applying these criteria, 3855 women were retained for analysis and included in the machine learning models.

Figure 1.

Flow of participants through study inclusion and exclusion for health facility delivery prediction modeling in Sidama, Ethiopia.

Sociodemographic and obstetric characteristics

A total of 3855 pregnant women who initiated ANC in the 4 districts of the Northern Zone of Sidama Region were included in the analysis. The mean age of participants was 27.4 ± 5.6 years. Most women resided in rural areas (62.3%). Regarding educational attainment, 40.8% had completed primary education, while 20.9% had attained secondary education or higher. Most participants were married (91.0%) and primarily engaged in household work (68.0%).

Among obstetric characteristics, the median parity was 3, and 46.0% of women were multiparous. Late initiation of ANC, defined as the first visit occurring after 16 weeks of gestation, was observed in 38.0% of participants. Only 54.0% of women completed four or more ANC visits, indicating suboptimal continuity of maternal health care. Detailed sociodemographic and obstetric characteristics are presented in Table 1.

Table 1.

Sociodemographic and obstetric characteristics of pregnant women attending antenatal care in Northern Sidama, Ethiopia, January 2021–January 2025 (N = 3855).

Variable	Category	Facility delivery, n (%)	Home delivery, n (%)	Total, n (%)
Age (years)	<20	198 (63.5)	114 (36.5)	312 (8.1)
Age (years)	20–29	1265 (59.3)	869 (40.7)	2134 (55.4)
Age (years)	30–39	680 (59.4)	465 (40.6)	1145 (29.7)
Age (years)	⩾40	152 (57.6)	112 (42.4)	264 (6.8)
Residence	Urban	1220 (83.9)	235 (16.1)	1455 (37.7)
Residence	Rural	1075 (44.8)	1325 (55.2)	2400 (62.3)
Maternal education	No formal	589 (51.2)	561 (48.8)	1150 (29.8)
Maternal education	Primary	949 (60.3)	624 (39.7)	1573 (40.8)
Maternal education	Secondary+	757 (72.1)	50 (27.9)	807 (20.9)
Marital status	Married	2073 (59.1)	1433 (40.9)	3506 (91.0)
Marital status	Single/divorced	222 (63.6)	127 (36.4)	349 (9.0)
Occupation	Housewife	1568 (59.8)	1053 (40.2)	2621 (68.0)
Occupation	Merchant	360 (58.2)	258 (41.8)	618 (16.0)
Occupation	Government employee	367 (58.8)	256 (41.2)	623 (16.2)
Household income	Lowest	690 (55.0)	565 (45.0)	1255 (32.6)
Household income	Middle	800 (60.5)	522 (39.5)	1322 (34.3)
Household income	Highest	805 (64.1)	451 (35.9)	1256 (32.6)
Mass media use	Yes	1251 (59.6)	847 (40.4)	2098 (54.4)
Mass media use	No	1044 (58.6)	738 (41.4)	1780 (45.6)
Gravidity	1–5	1847 (61.2)	1171 (38.8)	3018 (78.3)
Gravidity	>5	448 (46.2)	522 (53.8)	970 (21.7)
Parity	1–5	1875 (60.2)	1243 (39.8)	3118 (80.9)
Parity	>5	420 (46.5)	484 (53.5)	904 (19.1)
Previous obstetric danger signs	Yes	195 (33.6)	385 (66.4)	580 (15.0)
Previous obstetric danger signs	No	2100 (62.7)	1080 (37.3)	3180 (85.0)
ANC initiation (weeks)	⩽16	1562 (65.4)	828 (34.6)	2390 (62.0)
ANC initiation (weeks)	>16	733 (50.0)	732 (50.0)	1465 (38.0)
ANC visits	⩾4	1369 (65.7)	714 (34.3)	2083 (54.0)
ANC visits	<4	926 (52.3)	846 (47.7)	1772 (46.0)
Distance to facility (km)	⩽5	1228 (74.8)	414 (25.2)	1642 (42.6)
Distance to facility (km)	>5	1067 (38.0)	1745 (62.0)	2812 (57.4)

Categories were developed based on previous literature, the Ethiopian Demographic and Health Survey, and the Ethiopian Ministry of Health guidelines. ANC: antenatal care.

Health facility delivery utilization

Overall, 59.2% of women delivered in health facilities, while 40.8% gave birth at home. Marked disparities in delivery place were observed across key characteristics. HFD was substantially higher among urban residents (83.9%) than among rural residents (44.8%). Educational status showed a pronounced gradient, with HFD reported by 72.1% of women with secondary education or higher, compared with 51.2% among women with no formal education.

Utilization of HFD was also associated with access-related predictors. Women residing within 5 km of a health facility were considerably more likely to deliver in a facility (74.8%) than those living farther away (38.0%). Early initiation of ANC was associated with higher facility delivery utilization (65.4%) compared with late initiation (50.0%). The distribution of place of delivery by selected characteristics is shown in Table 1.

Machine learning model performance

We developed and evaluated four supervised machine learning algorithms (logistic regression, random forest, gradient boosting, and XGBoost) to predict HFD utilization among pregnant women in the Sidama Region. The dataset was split into a training set (70%) and an independent testing set (30%), allowing an independent assessment of model generalizability.

During model development, stratified fivefold cross-validation was conducted on the training set to optimize hyperparameters and assess model stability. The XGBoost model consistently demonstrated superior predictive performance. On the training set, XGBoost achieved an AUC of 0.92, while evaluation on the independent testing set yielded an AUC of 0.91 (95% CI: 0.90–0.93), indicating minimal discrepancy between training and testing performance and suggesting little overfitting. Cross-validation yielded a mean AUC of 0.91 ± 0.01, indicating consistent performance across folds.

Performance metrics for all models are summarized in Table 2 and Figure 2. Among the four algorithms, XGBoost attained the highest accuracy (86.9%, 95% CI: 85.6–88.1), sensitivity (87.5%, 95% CI: 86.3–88.7), and specificity (86.2%, 95% CI: 84.9–87.4); its F1-score (0.85, 95% CI: 0.84–0.86) equaled that of gradient boosting. Logistic regression, random forest, and gradient boosting models exhibited slightly lower predictive performance but still demonstrated acceptable discrimination, with AUCs ranging from 0.83 to 0.89 on the testing set.

Table 2.

Performance of machine learning models predicting health facility delivery utilization among pregnant women in Sidama Region, 2021–2025.

Model	Training AUC	Testing AUC (95% CI)	Accuracy % (95% CI)	Sensitivity % (95% CI)	Specificity % (95% CI)	F1-score (95% CI)	Brier score
Logistic regression	0.84	0.83 (0.81–0.85)	78.4 (76.9–79.9)	80.2 (78.7–81.6)	76.1 (74.5–77.6)	0.79 (0.78–0.81)	0.18
Random forest	0.87	0.88 (0.86–0.89)	84.5 (83.2–85.8)	85.1 (83.9–86.4)	83.8 (82.5–85.0)	0.84 (0.83–0.85)	0.13
Gradient boosting	0.88	0.89 (0.87–0.90)	85.7 (84.4–87.0)	86.0 (84.8–87.2)	85.3 (84.1–86.5)	0.85 (0.84–0.86)	0.12
XGBoost	0.92	0.91 (0.90–0.93)	86.9 (85.6–88.1)	87.5 (86.3–88.7)	86.2 (84.9–87.4)	0.85 (0.84–0.86)	0.11

XGBoost: extreme gradient boosting; AUC: area under the curve; CI: confidence interval.

Figure 2.

ROC curves showing the predictive performance of four machine learning models for health facility delivery utilization. The XGBoost model achieved the highest discrimination (AUC = 0.91).

To evaluate the reliability of predicted probabilities, we assessed model calibration using calibration plots and Brier scores. The XGBoost model showed good agreement between predicted and observed outcomes, with a Brier score of 0.11, indicating that the model’s probability estimates are both accurate and interpretable for clinical or programmatic risk stratification.
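For readers less familiar with these measures, the following Python sketch shows how a Brier score and the binned quantities behind a calibration plot are computed. The outcome and probability vectors are toy values, and the equal-width binning is one common choice assumed here for illustration; the reference value is what a model would score if it assigned every woman the cohort prevalence of 0.592.

```python
def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and observed outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)

def calibration_bins(y_true, probs, n_bins=5):
    """(mean predicted probability, observed event rate, count) for each non-empty equal-width bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b), len(b))
        for b in bins if b
    ]

# Toy predictions (not study data): 1 = facility delivery.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.3, 0.7, 0.2, 0.4, 0.6, 0.1]
score = brier_score(y_true, probs)
table = calibration_bins(y_true, probs)

# Reference: predicting the cohort prevalence for everyone gives p * (1 - p).
prevalence = 0.592
reference = prevalence * (1 - prevalence)
print(round(score, 3), round(reference, 3))
```

Lower is better: a score of 0 means perfect probabilistic predictions, and values should be read against the prevalence-only reference of about 0.24 for this cohort.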

Overall, the minimal differences observed between training and testing AUC values, coupled with stable cross-validation performance and favorable calibration, indicate that the machine learning models, particularly XGBoost, generalize well to unseen data and are not subject to overfitting.

Explainable machine learning (SHAP) analysis

To interpret the contributions of individual predictors to HFD, we applied SHAP to the final tuned XGBoost model. SHAP values quantify the effect of each feature on model predictions, providing transparent and consistent estimates of feature importance.

The SHAP summary plot (Figure 3) highlights the overall influence of predictors across all observations, while force plots illustrate the contribution of top predictors for selected individual cases. Among all features, residence (urban versus rural) emerged as the most influential determinant, followed by maternal education, timing of ANC initiation, distance to the nearest health facility, and parity. Rural residence, late ANC initiation, higher parity, and greater distance to health facilities were associated with a higher probability of home delivery, whereas urban residence, higher education, and early ANC initiation increased the likelihood of HFD.

Figure 3.

SHAP summary plot showing the impact of the top 7 predictors on the likelihood of health facility delivery. Each dot represents a single observation, with color indicating feature value (blue = low, red = high) and horizontal position showing the SHAP value (impact on model output). Predictors are ranked by overall importance from top (most influential) to bottom (least influential).

We also examined feature interactions using SHAP dependence plots, showing how the effect of one variable may change depending on another (e.g., the influence of maternal education varied slightly with place of residence). Correlated predictors shared contributions, as SHAP distributes the effect across collinear features, ensuring the interpretability reflects the combined effect without over-attributing importance to any single correlated variable (Table 3).

Table 3.

SHAP-based feature importance showing top predictors of facility delivery utilization.

Predictor	SHAP importance rank	Direction of association	Notes on interaction/interpretation
Residence (urban vs rural)	1	Urban → higher likelihood of facility delivery	Most influential predictor; interacts slightly with maternal education
Maternal education	2	Higher education → higher likelihood	Effect moderated by place of residence
ANC initiation timing	3	Early initiation → higher likelihood	Strong predictor across rural and urban settings
Distance to facility	4	Greater distance → lower likelihood	Interaction observed with parity and residence
Parity	5	Higher parity → lower likelihood	Shares some contribution with maternal age
Previous obstetric complications	6	Presence → higher likelihood of facility delivery	Contribution partly shared with parity
Household income	7	Higher income → higher likelihood	Less influential but supports equity interpretation

ANC: antenatal care; SHAP: SHapley Additive exPlanations.

Using the tuned XGBoost model, SHAP analysis revealed the relative contributions of key predictors to HFD. Residence (urban), maternal education, and early ANC initiation were the strongest positive contributors, while greater distance to the facility and higher parity reduced the likelihood of HFD. Previous obstetric complications and higher household income had relatively smaller positive effects. The force plot for an individual prediction (Figure 4) illustrates how these features jointly influence the model output, highlighting both the magnitude and direction of each predictor’s contribution.

Figure 4.

SHAP force plot illustrating the contributions of the top 7 predictors to an individual prediction of health facility delivery using the XGBoost model. The plot shows the base value (average model output) on the left and the final predicted probability on the right. Red bars indicate features pushing the prediction toward higher likelihood of facility delivery, while blue bars indicate features pushing it toward lower likelihood. Feature importance and directionality reflect the ranking and associations reported in Table 3.

Maternal and neonatal outcomes by place of delivery

Women who delivered at home experienced higher rates of adverse maternal outcomes, including preeclampsia (11.8%), postpartum hemorrhage (7.9%), and unplanned obstetric complications requiring emergency referral. Neonatal outcomes were also less favorable among home deliveries, with higher proportions of low birth weight (12.0%) and neonatal complications (9.0%).

In contrast, women who delivered in health facilities benefited from timely identification and management of obstetric and neonatal complications, resulting in comparatively improved maternal and neonatal outcomes. A detailed comparison of outcomes by place of delivery is presented in Table 4.

Table 4.

Maternal and neonatal outcomes by place of delivery among pregnant women in Sidama Region, 2021–2025 (N = 3855).

Outcome	Facility delivery, n (%)	Home delivery, n (%)
Preeclampsia	134 (5.7)	173 (11.8)
Postpartum hemorrhage	89 (3.8)	116 (7.9)
Emergency referrals	76 (3.2)	102 (6.3)
Neonatal complications	96 (4.1)	117 (9.0)
Low birth weight (<2.5 kg)	125 (5.3)	156 (12.0)

Discussion

This study provides a comprehensive assessment of HFD utilization among pregnant women in the Northern Zone of the Sidama Region, Ethiopia, and demonstrates the added value of machine learning approaches in predicting service utilization. Despite national and regional efforts to improve maternal health services, only 59.2% of women delivered in health facilities, while 40.8% continued to give birth at home. This finding highlights the persistent risk of preventable maternal and neonatal morbidity and mortality associated with non-facility deliveries in low-resource settings.

Our findings confirm previously reported predictors of HFD in Ethiopia. Maternal education and urban residence were consistently associated with higher utilization, aligning with studies from Sidama, Southern and other regions in Ethiopia.^24,25 Early initiation of ANC also increased the likelihood of HFD, consistent with evidence from Debre Markos and Tigray.^8,26 Conversely, rural residence, longer distance to health facilities, and higher parity were associated with home delivery, reinforcing the role of geographic and infrastructural barriers in limiting access to skilled maternity care.^27,28 Comparisons with other sub-Saharan African countries, such as Nigeria, Uganda, and Kenya, suggest that socioeconomic and structural barriers to HFD are widespread, though contextual differences in health infrastructure and community programs can substantially affect HFD rates.^29–32

Beyond confirming known patterns, this study makes novel contributions through the application of explainable machine learning approaches. The XGBoost model achieved superior predictive performance (accuracy 86.9%, AUC-ROC 0.91) compared with logistic regression, random forest, and gradient boosting, demonstrating its ability to capture complex, nonlinear interactions among predictors. SHAP analysis provided interpretable insights at both global and individual levels, highlighting rural residence, delayed ANC initiation, lower maternal education, greater distance to facilities, and higher parity as the most influential predictors. These insights complement individual-level risk prediction with case-specific explanations, which coefficient summaries from conventional regression do not readily provide.

Policy and implementation implications

The use of explainable ML offers practical opportunities for maternal health programs. By integrating risk predictions into ANC services, health workers could identify women at elevated risk of home delivery and prioritize targeted interventions, such as home visits, transportation support, and health education. Furthermore, interpretable outputs from SHAP can inform policymakers’ decisions on resource allocation, program planning, and monitoring, helping tailor interventions to the specific needs of rural and underserved populations. In this way, predictive analytics can support data-driven, context-specific strategies to increase HFD and reduce maternal and neonatal morbidity and mortality.

Clinical and public health use

In practice, the predictive tool can be applied from the first ANC visit, using routinely collected data to generate individualized risk scores for home delivery. These scores enable healthcare providers to stratify women by risk level and allocate resources efficiently. For high-risk individuals, tailored strategies such as scheduled follow-up visits, targeted health education, or transportation assistance can be implemented promptly.

The system is designed to integrate seamlessly with existing health information platforms, minimizing additional workload for frontline staff. Risk thresholds can be adjusted to balance sensitivity and specificity according to local capacity, ensuring that interventions remain both feasible and effective.
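As a sketch of how a threshold might be tuned to local capacity, the example below computes sensitivity and specificity at several candidate thresholds and selects the most sensitive one that does not flag more women than a facility can follow up. The outcome labels, scores, candidate thresholds, and capacity value are invented for illustration, and the selection rule in pick_threshold is one possible choice, not a procedure reported in the study.

```python
def confusion_at(threshold, y_true, y_score):
    """Confusion counts, treating home delivery (1) as the positive class."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        flagged = s >= threshold
        if flagged and y == 1:
            tp += 1
        elif flagged and y == 0:
            fp += 1
        elif not flagged and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def sens_spec(threshold, y_true, y_score):
    tp, fp, tn, fn = confusion_at(threshold, y_true, y_score)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

def pick_threshold(candidates, y_true, y_score, capacity):
    """Most sensitive threshold flagging at most `capacity` women.
    Ties are broken by higher specificity, then by higher threshold."""
    feasible = []
    for t in candidates:
        tp, fp, tn, fn = confusion_at(t, y_true, y_score)
        if tp + fp <= capacity:
            sens, spec = sens_spec(t, y_true, y_score)
            feasible.append((sens, spec, t))
    return max(feasible)[2] if feasible else None

# Made-up validation data: 1 = delivered at home, 0 = delivered in a facility.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.65, 0.4, 0.55, 0.3, 0.2, 0.1, 0.7, 0.45]

for t in (0.35, 0.5, 0.6):
    print(t, sens_spec(t, y_true, y_score))

chosen = pick_threshold((0.35, 0.5, 0.6), y_true, y_score, capacity=6)
print("chosen threshold:", chosen)
```

Lowering the threshold raises sensitivity at the cost of more false positives, so a facility with more outreach capacity can afford a lower cut-off and miss fewer women.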

Importantly, the tool considers the implications of prediction errors. False positives may lead to additional outreach, which carries minimal risk and can reinforce maternal health promotion, whereas false negatives underline the importance of routine monitoring and universal health education to safeguard against missed opportunities.

By providing a structured workflow for early identification, triage, and follow-up, this approach supports practical, data-driven decision-making in low-resource settings, enhancing timely access to HFD while optimizing the use of limited maternal health resources.

While this study was conducted in the Sidama Region, the findings may have relevance for other low-resource settings with similar socioeconomic and healthcare characteristics. Key predictors of HFD, such as maternal education, residence, ANC engagement, parity, and geographic accessibility, are common challenges across many sub-Saharan African contexts. Nevertheless, we acknowledge that variations in health system infrastructure, cultural practices, and local policy environments may influence the direct applicability of our results. Future research in diverse settings is warranted to assess the broader generalizability of machine learning–based predictive approaches for maternal health service utilization.

Strengths and limitations

This study has several strengths. The large sample size enhances statistical power and supports robust predictive modeling. Inclusion of multiple districts encompassing both urban and rural populations enables a nuanced examination of regional disparities. The integration of advanced machine learning models with SHAP-based interpretability strengthens the reliability and practical relevance of the findings. Additionally, the prospective cohort design allowed assessment of temporal relationships between ANC engagement and HFD outcomes, consistent with STROBE recommendations for observational research.

Nevertheless, certain limitations should be acknowledged. Reliance on routine facility records may introduce information bias due to incomplete or inaccurate documentation. Important psychosocial, cultural, and service quality–related predictors influencing delivery choices were not captured in the available data. Furthermore, the study was conducted in four districts of the Sidama Region, which may limit generalizability to other regions of Ethiopia or sub-Saharan Africa with different health system characteristics.

Conclusion

In conclusion, although HFD utilization has improved in the Sidama Region, a substantial proportion of women continue to give birth at home, placing themselves and their newborns at avoidable risk. The key predictors identified in this study (residence, maternal education, timing of ANC initiation, parity, and distance to health facilities) represent actionable targets for intervention. The successful application of machine learning models, particularly XGBoost combined with SHAP interpretation, highlights the potential of predictive analytics to inform targeted maternal health strategies. By integrating such approaches into routine health systems, policymakers and healthcare providers can prioritize high-risk populations, optimize resource allocation, and design context-specific interventions. Moreover, the findings offer insights that may be generalizable to similar low-resource settings beyond Ethiopia, highlighting the broader relevance of data-driven approaches for improving maternal and neonatal outcomes globally.

Supplemental Material

Supplemental material, sj-docx-1-whe-10.1177_17455057261442711 for Predicting health facility deliveries using explainable machine learning in Sidama Region, Ethiopia: A prospective cohort study by Mehretu Belayneh, Yohannes Seifu Berego, Francisco Guillen-Grima and Amanuel Yoseph in Women's Health

Supplemental material, sj-docx-2-whe-10.1177_17455057261442711 for Predicting health facility deliveries using explainable machine learning in Sidama Region, Ethiopia: A prospective cohort study by Mehretu Belayneh, Yohannes Seifu Berego, Francisco Guillen-Grima and Amanuel Yoseph in Women's Health

Footnotes

Acknowledgements

We gratefully acknowledge the Sidama President’s Office and Hawassa University for providing financial support essential to the successful completion of this research. We also extend our sincere appreciation to the Sidama Regional Health Bureau and the respective District Health Offices for their collaboration and logistical support. Our heartfelt thanks go to the study participants, data collectors, field assistants, and supervisors for their invaluable contributions. Finally, we acknowledge the School of Public Health at Hawassa University for providing technical guidance during the study design and data analysis phases.

ORCID iD

Amanuel Yoseph

Ethical considerations

Ethical approval for this study was obtained from the Institutional Review Board (IRB) of the College of Medicine and Health Sciences, Hawassa University (Approval No. IRB/076/15). The study was based on routinely collected health facility records, and no direct interaction with participants took place. All data were fully anonymized prior to analysis, and strict confidentiality was maintained throughout the study to ensure ethical compliance. For individuals under 20 years included in the dataset, data were handled in line with IRB-approved protocols governing the use of health records.

Consent to participate

This study relied exclusively on secondary health facility data, with no direct involvement of participants. In accordance with IRB approval, the requirement for individual informed consent was waived.

Consent for publication

Not applicable.

Author contributions

Mehretu Belayneh: Writing – original draft; Writing – review & editing; Validation; Visualization; Formal analysis; Supervision; Resources; Software.

Yohannes Seifu Berego: Data curation; Formal analysis; Writing – original draft; Writing – review & editing.

Francisco Guillen-Grima: Investigation; Writing – original draft; Validation; Visualization; Writing – review & editing; Software.

Amanuel Yoseph: Conceptualization; Investigation; Funding acquisition; Writing – original draft; Methodology; Validation; Visualization; Writing – review & editing; Software; Formal analysis; Project administration; Data curation; Supervision; Resources.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Sidama President’s Office and Hawassa University. The funders were not involved in the study design, data collection, management, analysis, or interpretation; in the preparation, review, or approval of the article; or in the decision to submit the article for publication.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The datasets used and/or analyzed during the current study are included within the article and its Supplemental Material.

Supplemental material

Supplemental material for this article is available online.

References

1. World Health Organization. Trends in maternal mortality 2000 to 2017: estimates by WHO, UNICEF, UNFPA, World Bank Group and the United Nations Population Division. World Health Organization, 2019.

2. Central Statistical Agency and ICF International. Ethiopia demographic and health survey 2011. Central Statistical Agency and ICF International, 2012.

3. Yoseph, Teklesilasie, Guillen-Grima, et al. Individual- and community-level determinants of maternal health service utilization in southern Ethiopia: a multilevel analysis. Womens Health (Lond) 2023; 19: 17455057231218195.

4. Yoseph, Teklesilasie, Guillen-Grima, et al. Perceptions, barriers, and facilitators of maternal health service utilization in southern Ethiopia: a qualitative exploration of community members’ and health care providers’ views. PLoS One 2024; 19(12): e0312484.

5. Barredo, Agyepong, Liu, et al. Goal 3: ensure healthy lives and promote well-being for all at all ages. UN Chronicle 2015; 51(4): 9–11.

6. Hassen, Jemal, Bambo, et al. Multilevel analysis of factors associated with utilization of institutional delivery in Ethiopia. Womens Health (Lond) 2022; 18: 17455057221099505.

7. Gebremichael, Fenta. Determinants of institutional delivery in Sub-Saharan Africa: findings from Demographic and Health Survey (2013–2017) from nine countries. Trop Med Health 2021; 49(1): 45.

8. Denu, Defar, Persson, et al. Socio-economic and geographic equity in maternal health services utilization in Ethiopia: a community-based cross-sectional study. BMC Health Serv Res 2025; 25(1): 610.

9. Yoseph, Teklesilasie, Guillen-Grima, et al. Community-based health education led by women’s groups significantly improved maternal health service utilization in Southern Ethiopia: a cluster randomized controlled trial. Healthcare 2024; 12(10): 1045.

10. Tsegaye, Shudura, Yoseph, et al. Predictors of skilled maternal health services utilizations: a case of rural women in Ethiopia. PLoS One 2021; 16(2): e0246237.

11. Kassaw, Kidie, Tesfa, et al. Optimizing machine learning models for predicting health service access and determinants among pregnant women in rural Ethiopia. Sci Rep 2025; 15(1): 40559.

12. Freeman, Schleiff, Sacks, et al. Comprehensive review of the evidence regarding the effectiveness of community-based primary health care in improving maternal, neonatal and child health: 4. child health findings. J Glob Health 2017; 7(1): 010904.

13. Rittenhouse, Vwalika, Keil, et al. Improving preterm newborn identification in low-resource settings with machine learning. PLoS One 2019; 14(2): e0198919.

14. Mwaura, Kamanu, Kulohoma, et al. Bridging data gaps: predicting sub-national maternal mortality rates in Kenya using machine learning models. Cureus 2024; 16(10): e72476.

15. Mwaura, Kamanu, Kulohoma. Sub-national disparities in indicators of maternal mortality in Kenya: insights from demographic health surveys towards attaining SDG 3. J Women’s Health Dev 2024; 7: 29–40.

16. Mamun, Chowdhury, Faruq, et al. Identification of maternal health risk from optimal features using explainable machine learning. Eng Rep 2025; 7(11): e70491.

17. Chowdhury, Mamun, Hossain, et al. Newborn weight prediction and interpretation utilizing explainable machine learning. In: 2024 3rd International Conference on Advancement in Electrical and Electronic Engineering (ICAEEE), 25 April 2024, pp. 1–6. IEEE.

18. Sidama Regional Health Bureau. Annual health service report 2025. Unpublished reports. SRHB, 2025.

19. Shudura, Yoseph, Tamiso. Utilization and predictors of maternal health care services among women of reproductive age in Hawassa University Health and Demographic Surveillance System Site, South Ethiopia: a cross-sectional study. Adv Public Health 2020; 2020(1): 5865928.

20. Amaha. Ethiopian progress towards achieving the global nutrition targets of 2025: analysis of sub-national trends and progress inequalities. BMC Res Notes 2020; 13(1): 559.

21. Von Elm, Altman, Egger, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet 2007; 370(9596): 1453–1457.

22. Collins, Moons. Reporting of artificial intelligence prediction models. Lancet 2019; 393(10181): 1577–1579.

23. Chen. XGBoost: a scalable tree boosting system. Cornell University, 2016.

24. Tadele, Tariku. Factors associated with institutional delivery in Boricha district of Sidama zone, southern Ethiopia. Int J Public Health Sci 2014; 3(4): 224–230.

25. Hunegnaw, Goddard, Bekele, et al. Estimates and determinants of health facility delivery in the Birhan cohort in Ethiopia. PLoS One 2024; 19(7): e0306581.

26. Bayu, Adefris, Amano, et al. Pregnant women’s preference and factors associated with institutional delivery service utilization in Debra Markos Town, North West Ethiopia: a community based follow up study. BMC Pregnancy Childbirth 2015; 15(1): 15.

27. Kea, Tulloch, Datiko, et al. Exploring barriers to the use of formal maternal health services and priority areas for action in Sidama zone, southern Ethiopia. BMC Pregnancy Childbirth 2018; 18(1): 96.

28. Moyer, Mustafa. Drivers and deterrents of facility delivery in sub-Saharan Africa: a systematic review. Reprod Health 2013; 10(1): 40.

29. Mshelia, Analo, Booth. Factors influencing the utilisation of facility-based delivery in Nigeria: a qualitative evidence synthesis. J Glob Health Rep 2020; 2020(4): e2020100.

30. Anyait, Mukanga, Oundo, et al. Predictors for health facility delivery in Busia district of Uganda: a cross sectional study. BMC Pregnancy Childbirth 2012; 12(1): 132.

31. Gitonga, Muiruri. Determinants of health facility delivery among women in Tharaka Nithi county, Kenya. Pan Afr Med J 2016; 25(Suppl. 2): 9.

32. Mugweni, Ehlers, Roos. Factors contributing to low institutional deliveries in the Marondera district of Zimbabwe. Curationis 2008; 31(2): 5–13.
