Machine learning-based algorithms to identify factors associated with inadequate meal frequency among children aged 6–23 months in Somalia: Evidence from the Somalia Demographic and Health Survey 2020

Abstract

Objectives

Inadequate meal frequency (IMF) among children aged 6–23 months remains a pressing public health issue in Somalia, contributing to widespread malnutrition and hindering progress toward Sustainable Development Goals 2 (Zero Hunger) and 3 (Good Health and Well-being). This study investigates the most influential factors associated with IMF to inform targeted public health interventions.

Methods

Data on 4066 children were extracted from the 2020 Somalia Demographic and Health Survey. Five machine learning algorithms (Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, and Gradient Boosting) were trained and assessed for predictive performance using accuracy and area under the receiver operating characteristic curve (AUC-ROC). Feature importance was analyzed to identify key predictors of IMF.

Results

The prevalence of IMF was alarmingly high at 78.51%. The Gradient Boosting model outperformed other models with an accuracy of 89.55% and an AUC-ROC of 92.77%. Birth order emerged as the most dominant predictor across all models, accounting for 74.07% of the Gini importance in the Gradient Boosting model. Other significant predictors included child age, breastfeeding status, maternal education, household wealth, and region of residence.

Conclusion

The high prevalence of IMF highlights an urgent need for targeted interventions. Strategies focusing on families with higher birth order children, maternal education, and poverty reduction may be crucial for improving child nutrition in Somalia. These findings demonstrate the potential of machine learning approaches in informing public health strategies and predictive screening in resource-limited settings.

Keywords

Machine learning, inadequate meal frequency, child nutrition, Somalia, SDHS, sustainable development

Introduction

Inadequate meal frequency (IMF) among children aged 6–23 months, characterized by consuming fewer meals than the World Health Organization (WHO) minimum recommendations, critically impedes optimal child health, growth, and development.¹ This developmental window is paramount for nutritional intervention, as deficiencies during this stage can lead to irreversible consequences, including stunting, impaired cognitive development, and increased susceptibility to infectious diseases.^2–4 Addressing IMF is essential for achieving global health objectives, notably Sustainable Development Goal (SDG) 2 (Zero Hunger) and SDG 3 (Good Health and Well-being).⁵ The WHO and UNICEF consistently emphasize appropriate complementary feeding, which includes adequate meal frequency, as a fundamental strategy to reduce child morbidity and mortality.^6,7

Globally, a significant number of young children fail to meet the recommended minimum meal frequency.^8,9 Sub-Saharan Africa (SSA) faces a particularly acute challenge, with consistently lower adherence to appropriate complementary feeding practices.^10,11 A recent analysis of Demographic and Health Survey (DHS) data from 35 SSA countries indicated that only 38.47% of children met the Minimum Meal Frequency criteria.¹¹ Studies in other SSA countries have reported IMF rates varying between 30.6% and 55.9% in Ethiopia^12–15 and 57.95% in Gambia,¹⁶ while in Tanzania, context-specific factors also play a significant role.¹⁷ These figures underscore the complex socio-cultural, economic, and environmental factors influencing feeding practices across diverse African contexts.¹⁸

Somalia confronts one of the world's most severe and protracted humanitarian crises, characterized by decades of conflict, political instability, recurrent climatic shocks (e.g., droughts, floods), and widespread poverty, all of which critically undermine food security and nutritional outcomes for its children aged 6–23 months.¹⁹ The landmark Somalia Health and Demographic Survey (SHDS) 2020 revealed a dire nutritional landscape: a staggering 74.6% of children aged 6–23 months were not given any solid, semi-solid, or soft foods in the preceding 24 h.²⁰ This points to a pervasive pattern of IMF, a direct contributor to the high national rates of stunting (18%) and wasting (11%) among children under 5 years old.^20–22

The etiology of IMF is multifactorial, with determinants operating at individual, household, and community levels, as consistently highlighted by WHO guidelines and numerous DHS-based studies across various settings, primarily in other Sub-Saharan African countries such as Ethiopia, Tanzania, and Ghana.^{11,17,23–27} Child characteristics (age, breastfeeding status, birth order), maternal attributes (education, age, parity), household factors (socioeconomic status, media exposure), and community context (type of residence, region of residence) are all recognized as influential.^{8,13,16,28–30} While traditional statistical methods, such as logistic regression, have been instrumental in identifying these general determinants, understanding their relative importance and potential complex correlation, especially in data-rich environments or contexts with intricate underlying patterns, remains a challenge. Notably, while determinants have been explored elsewhere, there is a significant gap in the literature regarding the specific drivers of IMF in the Somali context, a region with a unique and protracted humanitarian crisis.

The novelty of this study lies in its pioneering application of a comprehensive suite of machine learning (ML) algorithms to identify and rank factors associated with IMF among children aged 6–23 months in Somalia, using the nationally representative SHDS 2020 data set. ML techniques, including Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, and Support Vector Machine (SVM), offer powerful tools for analyzing complex data sets, discerning non-linear relationships, and prioritizing predictors based on their contribution to model performance.^31,32 Recent applications of ML in nutritional epidemiology and child health have demonstrated its utility in predicting outcomes such as micronutrient deficiencies and stunting, and in identifying key risk factors with potentially improved accuracy and more nuanced insights than conventional methods alone.^33–37 To our knowledge, this study represents the first application of such a diverse range of ML models to investigate the determinants of IMF specifically within Somalia, and potentially one of the first such comprehensive ML-driven analyses in the broader East African region, if not Africa, focusing on this critical Infant and Young Child Feeding (IYCF) indicator.

By leveraging the predictive power and feature importance capabilities of these five distinct ML models, this study aims to move beyond traditional associative analysis. The rationale for employing five different algorithms is threefold: first, to ensure the robustness of our findings by triangulating results from models with different underlying assumptions; second, to compare their predictive performance in a real-world public health context in Somalia, providing methodological insights; and third, to identify a potential best-performing model for future predictive screening applications in similar resource-limited settings. Our objective is to identify the most influential predictors of IMF in the Somali context, potentially uncovering patterns that might be less apparent with standard regression techniques and providing a data-driven prioritization of risk factors. The anticipated findings are expected to provide robust, evidence-based insights that will be invaluable for national and international stakeholders, including policymakers, public health practitioners, and humanitarian organizations, in designing and targeting contextually appropriate, high-impact interventions to improve IYCF practices, specifically meal frequency, reduce malnutrition, and ultimately enhance child survival and development in Somalia.

Materials and methods

Data source

This study utilized secondary data from the 2020 Somalia Demographic and Health Survey (SDHS), the first nationally representative health survey in the country. Conducted by the Somali National Bureau of Statistics (SNBS) in partnership with the Federal Ministry of Health, Federal Member State Ministries, the United Nations Population Fund (UNFPA) Somalia, and international partners, data were collected between February 2018 and January 2019 using Computer-Assisted Personal Interviewing (CAPI). A multistage stratified cluster sampling method was employed, with a three-stage design used in urban and rural areas and a two-stage design in nomadic areas. Each region was stratified into urban, rural, and nomadic categories except for Banadir, which is entirely urban, yielding 55 initial strata. Due to security concerns, eight strata were excluded (three from Lower Shabelle, three from Middle Juba, and two additional unspecified strata), resulting in a final total of 47 sampling strata. For this analysis, information was extracted from the Kids Record (KR), which includes comprehensive data on children aged 0–59 months. Permission to use the data set was obtained through a formal online request submitted via the SNBS Microdata Portal (https://microdata.nbs.gov.so) following user registration and approval. The SDHS is a vital source of demographic and health-related information in Somalia, playing a key role in guiding evidence-based policymaking, program development, and progress tracking toward national and international goals, including the SDGs, thereby supporting informed decision-making and enhancing the effectiveness of health and development interventions across the country.

Sample size

The study focused on children aged 6–23 months, the critical window for complementary feeding. From the initial SDHS 2020 data set, records for 4139 children aged 0–23 months were identified from the KR file. After data cleaning, which involved excluding cases with missing information on the outcome variable (meal frequency) or key covariates essential for the analysis, the final analytical sample comprised 4066 children aged 6–23 months. To ensure the findings are representative of the target population and account for the complex survey design (stratification and clustering), sample weights provided within the SDHS 2020 data set were applied throughout all descriptive analyses to adjust for the non-proportional allocation of the sample to different regions and to urban and rural areas during the survey's sampling stage.

Study variables

Outcome variable

The primary outcome variable was the meal frequency status of children aged 6–23 months in the 24 h preceding the survey interview. The variable was constructed from maternal recall of the types and number of meals the child consumed during that period, as captured in the DHS questionnaire's IYCF module, and was assessed according to the WHO 2008 guidelines for Minimum Meal Frequency, which vary with the child's age and breastfeeding status.⁶ The frequency of solid feeds was derived from the question: “How many times did the child eat solid, semi-solid, or soft foods yesterday?” For non-breastfed children, for whom milk feeds count toward the minimum frequency, the total frequency additionally included the summed responses to the following questions: “How many times did the child drink powdered, tinned, or fresh milk?”, “How many times did the child drink infant formula?”, and “How many times did the child eat yogurt?” The minimum requirements are as follows:

Breastfed infants aged 6–8 months are expected to receive solid, semi-solid, or soft foods at least twice daily.

Breastfed children aged 9–23 months are expected to receive such foods at least three times daily.

Non-breastfed children aged 6–23 months should receive a minimum of four feedings per day. For non-breastfed children, these four feedings may include milk feeds (such as infant formula or animal milk), provided that at least one feeding consists of solid, semi-solid, or soft food.

As maternal interview questions in DHS typically capture only solid, semi-solid, or soft food intake, milk feedings (for non-breastfed children) were computed separately (if applicable and data permitted) and combined with reported food feedings to obtain the total daily feeding count. Children who met or exceeded the minimum number of required feedings for their age and breastfeeding status were categorized as having “Adequate” meal frequency (coded as 0). Those who did not meet the requirement were categorized as having “Inadequate” meal frequency (IMF, coded as 1).
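The classification rule described above can be sketched as a small function. This is a minimal illustration, not the authors' code: the function name `meal_frequency_status` and its argument names are hypothetical, and the inputs are assumed to be already-cleaned counts from the 24 h recall.

```python
def meal_frequency_status(age_months, breastfed, solid_feeds, milk_feeds=0):
    """Return 0 (adequate) or 1 (inadequate, IMF) under the rule in the text.

    All argument names are hypothetical; they are not SDHS variable names.
    """
    if not 6 <= age_months <= 23:
        raise ValueError("the indicator is defined for ages 6-23 months only")

    if breastfed:
        # Breastfed: 2 solid/semi-solid/soft feeds at 6-8 months, 3 at 9-23 months.
        required = 2 if age_months <= 8 else 3
        adequate = solid_feeds >= required
    else:
        # Non-breastfed: 4 feeds in total; milk feeds count,
        # but at least one feed must be solid, semi-solid, or soft food.
        adequate = (solid_feeds + milk_feeds) >= 4 and solid_feeds >= 1

    return 0 if adequate else 1


# A non-breastfed 12-month-old with 1 solid feed and 3 milk feeds is adequate.
print(meal_frequency_status(12, breastfed=False, solid_feeds=1, milk_feeds=3))
```

Note that a non-breastfed child with five milk feeds and no solid food is still classified as inadequate, because the rule requires at least one solid, semi-solid, or soft feed.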

Independent variables

All independent variables selected for this study were categorical in nature. The selection of independent variables was guided by the UNICEF conceptual framework for malnutrition,⁷ prior literature on determinants of IMF,^{11,13,14,17,25,28–30,38–40} and the availability of relevant indicators within the SDHS 2020 data set. These variables were categorized into three analytical levels: individual, household, and community.

Individual-level factors: Child characteristics comprised age in months (categorized as 6–8, 9–11, 12–17, and 18–23 months), sex (male or female), birth order (first, second to third, and fourth or higher), perceived size at birth (normal, smaller than average, larger than average; recoded into normal or underweight based on maternal report for analysis), and current breastfeeding status (breastfeeding or not breastfeeding). Maternal characteristics included age (categorized as 15–19, 20–24, 25–29, 30–34, 35–39, 40–44, and 45–49 years), educational attainment (no education, primary, secondary, or higher), current marital status (married, divorced/separated, or widowed), parity (categorized as 0–3, 4–7, or 8 or more), and place of delivery (health facility or home).

Household-level factors: Household size (≤4 or >4 members), sex of the household head (male or female) (household head is defined by the survey respondent as the primary decision-maker in the household), age of the household head (<30, 30–40, 41–54, or >54 years), and maternal exposure to media (exposed if reads newspaper/magazine, listens to radio, or watches television at least once a week; not exposed otherwise).

Community-level factors: Type of residence (urban, rural, or nomadic), region of residence (classified across Somalia's 17 administrative regions existing at the time of the survey), and household wealth index (derived by DHS using principal component analysis of household assets and amenities, divided into quintiles: poorest, poorer, middle, richer, and richest).

Data preprocessing

Data preprocessing was conducted using Python (Version 3.9) with Pandas and Scikit-learn libraries. Missing values for predictor variables were minimal after initial cleaning but were handled by mode imputation for categorical features. Categorical variables were numerically encoded using label encoding for ordinal features (e.g., education level, wealth index) and one-hot encoding for nominal features (e.g., region) to prepare them for ML algorithms. The outcome variable (IMF) was already binary (0/1).
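The preprocessing steps described above might look roughly as follows in Pandas. This is a sketch on made-up records: the column names and values are illustrative and are not SDHS variable names. As an assumption, ordinal features are encoded with an explicit order mapping rather than Scikit-learn's `LabelEncoder`, which assigns codes alphabetically and would not preserve the category order.

```python
import pandas as pd

# Hypothetical toy records; column names are illustrative only.
df = pd.DataFrame({
    "education": ["no education", "primary", None, "higher", "no education"],
    "wealth": ["poorest", "richest", "middle", None, "poorest"],
    "region": ["Bari", "Banadir", "Bay", "Bari", None],
})

# Mode imputation: replace missing values with each column's most frequent category.
for col in df.columns:
    df[col] = df[col].fillna(df[col].mode().iloc[0])

# Ordinal features: map categories to integers in their natural order.
education_order = ["no education", "primary", "secondary", "higher"]
wealth_order = ["poorest", "poorer", "middle", "richer", "richest"]
df["education"] = df["education"].map({c: i for i, c in enumerate(education_order)})
df["wealth"] = df["wealth"].map({c: i for i, c in enumerate(wealth_order)})

# Nominal features: one-hot encoding, one indicator column per region.
encoded = pd.get_dummies(df, columns=["region"], prefix="region")
print(encoded)
```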

Statistical analysis

Descriptive statistics (weighted frequencies and percentages) were calculated for all variables. Bivariable associations between independent variables and IMF were assessed using the Chi-Square test with survey weights applied, so that the results were representative of the entire population of Somali children aged 6–23 months. A p-value < .05 was considered statistically significant. Feature importance was derived from the trained models (coefficients for Logistic Regression; Gini importance for Random Forest, Decision Tree, and Gradient Boosting; permutation importance for SVM) to identify key predictors. All statistical analyses were performed using STATA (Version 16) and Python (Version 3.9) with libraries including Pandas, NumPy, Matplotlib, Seaborn, and Scikit-learn (Version 1.2.0 or as per notebook).
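The two kinds of feature importance mentioned above can be illustrated with Scikit-learn on synthetic data. This is a sketch under stated assumptions: the data are randomly generated with the outcome driven entirely by the first feature, the hyperparameters are library defaults, and permutation importance is computed on the training data for brevity; none of this reproduces the study's actual configuration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for encoded predictors: three features with levels 0, 1, 2.
X = rng.integers(0, 3, size=(400, 3)).astype(float)
# Hypothetical outcome that depends only on feature 0.
y = (X[:, 0] > 0).astype(int)

# Impurity-based importance (the quantity the text calls Gini importance).
gb = GradientBoostingClassifier(random_state=0).fit(X, y)
gini_importance = gb.feature_importances_

# Permutation importance: mean drop in accuracy when one feature is shuffled.
svm = SVC(kernel="linear").fit(X, y)
perm = permutation_importance(svm, X, y, n_repeats=10, random_state=0)
perm_importance = perm.importances_mean

print("impurity-based:", gini_importance)
print("permutation:   ", perm_importance)
```

Impurity-based importances are normalized to sum to 1 within a model, whereas permutation importances are expressed in units of the scoring metric, so the two scales are not directly comparable across models.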

Machine learning models

Given the nature of the data set (complex interrelations) and the research objective (binary classification of IMF and identification of important factors), we selected five ML algorithms: Logistic Regression, Decision Tree, Random Forest, and Gradient Boosting, chosen for their robustness and interpretability, and SVM, included for its powerful classification capabilities. Each algorithm was chosen for its suitability for binary classification tasks and its unique approach to modeling relationships between features and the outcome variable.

Logistic Regression: A linear model that estimates the probability of a binary outcome based on a linear combination of input features, using the logistic function to map the output to a probability between 0 and 1.⁴¹ It provides interpretable coefficients for feature importance.

Decision Tree: A non-parametric model that partitions the data into subsets based on the values of input features, creating a tree-like structure of decisions to predict the outcome. It is highly interpretable and can capture non-linear relationships.⁴²

Random Forest: An ensemble learning method that constructs multiple decision trees, each trained on a random subset of the training data and a random subset of the input features. The final prediction is made by averaging (for regression) or voting (for classification) the predictions of all individual trees, improving accuracy and reducing overfitting.⁴³ It also provides feature importance scores.

Gradient Boosting: An ensemble learning technique that builds models sequentially, where each new model corrects errors made by previous models. It typically uses decision trees as base learners and is known for its high predictive accuracy by combining many weak learners into a strong learner.^42,44

SVM: A powerful algorithm that finds an optimal hyperplane to separate data points belonging to different classes in a high-dimensional space. A linear kernel was primarily used for efficiency and interpretability, although other kernels (e.g., radial basis function) were explored during hyperparameter tuning.^45,46

Model development and training

For model development and training, the data set was partitioned into an 80% training set and a 20% testing set. Stratified sampling, based on the outcome variable (IMF status), was employed to ensure proportional representation of the outcome variable in both the training and testing sets, thereby ensuring that models were trained and subsequently evaluated on samples representative of the study population.⁴⁴ The designated training set was utilized for the development of ML algorithms. The testing set was reserved for an independent and unbiased evaluation of the final models’ predictive performance. To ensure the generalizability of the model outputs to the national population and to appropriately account for the complex survey design inherent in the SDHS, sample weights were integrated during the training of ML algorithms amenable to such weighting, including Logistic Regression and tree-based ensembles like Random Forest and Gradient Boosting.
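The partitioning and weighted training described above could be sketched as follows. The data, the weights, and all hyperparameters here are assumptions made for illustration: the predictors and survey weights are randomly generated, and the weights are passed to all five estimators simply because each Scikit-learn `fit` method shown accepts `sample_weight`, which may differ from the set of models weighted in the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 1000
X = rng.integers(0, 4, size=(n, 5)).astype(float)  # synthetic encoded predictors
y = (rng.random(n) < 0.785).astype(int)            # roughly 78.5% positives (IMF = 1)
w = rng.uniform(0.5, 2.0, size=n)                  # hypothetical survey weights

# 80/20 split, stratified on the outcome so both sets keep the same class mix.
X_tr, X_te, y_tr, y_te, w_tr, w_te = train_test_split(
    X, y, w, test_size=0.20, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": SVC(kernel="linear"),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Fit each model on the training set only, weighting observations by w_tr.
for name, model in models.items():
    model.fit(X_tr, y_tr, sample_weight=w_tr)
    print(f"{name}: test accuracy = {model.score(X_te, y_te):.3f}")
```

Splitting the weights together with `X` and `y` keeps each observation aligned with its own weight after shuffling.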

Model evaluation

To assess the predictive performance of the developed ML models on the held-out testing data set, several key evaluation metrics were employed. These metrics included:

Accuracy represents the proportion of correctly classified instances among the total number of cases examined. It is calculated as follows:

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)

where TP denotes true positives, TN represents true negatives, FP refers to false positives, and FN indicates false negatives.

Precision measures the proportion of correctly predicted positive cases out of all instances classified as positive. It is defined as:

\text{Precision} = \frac{TP}{TP + FP} \qquad (2)

Recall (Sensitivity) quantifies the model's ability to correctly identify true positive cases and is expressed as:

\text{Recall (Sensitivity)} = \frac{TP}{TP + FN} \qquad (3)

Specificity measures the ability of the model to correctly identify true negative cases and is given by:

\text{Specificity} = \frac{TN}{TN + FP} \qquad (4)

F1-Score is the harmonic mean of precision and recall, providing a balanced measure that accounts for both false positives and false negatives. It is computed as follows:

\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (5)

Area Under the Receiver Operating Characteristic Curve (AUC-ROC): A comprehensive measure of a classifier's ability to distinguish between classes across various probability thresholds. A higher AUC-ROC value indicates superior discriminatory power.⁴⁷

Classification reports, including precision, recall, and F1-score for each class (Adequate Meal Frequency and IMF), and confusion matrices were also generated to provide a detailed understanding of model performance. Feature importance was extracted from the models (Logistic Regression coefficients; Decision Tree, Random Forest, and Gradient Boosting feature importances; and permutation importance for SVM) to identify key determinants.
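As an illustration of how the metrics in Equations (1) to (5) and the AUC-ROC relate to a model's outputs, the sketch below computes them with Scikit-learn for ten made-up observations. The labels, the predicted probabilities, and the 0.5 decision threshold are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Hypothetical labels (1 = IMF, 0 = adequate) and predicted probabilities of IMF.
y_true = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
y_prob = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.65, 0.3, 0.2, 0.1])
y_pred = (y_prob >= 0.5).astype(int)  # assumed decision threshold of 0.5

# With labels=[0, 1], ravel() returns the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)          # Equation (1)
precision = tp / (tp + fp)                          # Equation (2)
recall = tp / (tp + fn)                             # Equation (3)
specificity = tn / (tn + fp)                        # Equation (4)
f1 = 2 * precision * recall / (precision + recall)  # Equation (5)

# AUC-ROC uses the probabilities, not the thresholded predictions.
auc = roc_auc_score(y_true, y_prob)

print(accuracy, precision, recall, specificity, f1, auc)
```

Unlike the other metrics, the AUC-ROC does not depend on the chosen threshold, which is why a model can rank differently on accuracy and on AUC-ROC.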

Ethical considerations

The SDHS 2020 data collection protocol was approved by the Institutional Review Board (IRB) of the SNBS. Informed consent was obtained from parents or legal guardians for children's participation in the original survey. This study involves secondary analysis of publicly available, anonymized data, thus not requiring separate ethical approval. Data confidentiality and anonymity were maintained throughout the analysis.

Results

Prevalence of IMF

The overall weighted prevalence of IMF among children aged 6–23 months in Somalia was alarmingly high at 78.51% (95% CI [77.1%–79.9%]) (Figure 1). This indicates that over three-quarters of the children did not meet the WHO minimum requirements for meal frequency.

Figure 1.

Prevalence of inadequate meal frequency.

Sociodemographic characteristics of the study participants

The analysis included a weighted sample of 4066 children aged 6–23 months. The socio-demographic characteristics of these children, their mothers, and their households are presented in Table 1.

Table 1.

Characteristics of study participants (weighted n = 4066).

Characteristic	Category	Weighted frequency	Weighted percentage (%)
Child characteristics
Age of child (months)	6–8	762.10	18.74
	9–11	553.87	13.62
	12–17	1838.37	45.21
	18–23	912.14	22.43
Sex of child	Male	2132.49	52.44
	Female	1933.98	47.56
Current breastfeeding status	Still breastfeeding	1969.72	48.44
	Not breastfeeding	2096.75	51.56
Size of child at birth	Normal	2845.94	69.99
	Underweight	1220.54	30.01
Birth order	1	881.99	21.69
	2–3	2728.61	67.10
	≥4	455.87	11.21
Maternal characteristics
Maternal age (years)	15–19	321.64	7.91
	20–24	1032.03	25.38
	25–29	1196.94	29.43
	30–34	777.59	19.12
	35–39	544.11	13.38
	40–44	169.47	4.17
	45–49	24.69	0.61
Maternal educational status	No education	3335.76	82.03
	Primary	546.45	13.44
	Secondary	132.28	3.25
	Higher	51.99	1.28
Marital status	Married	3792.05	93.25
	Divorced	213.59	5.25
	Widowed	60.84	1.50
Place of delivery	Health facility	978.56	24.06
	Home	3087.92	75.94
Total children ever born	0–3 children	1814.08	44.61
	4–7 children	1733.53	42.63
	8+ children	518.86	12.76
Household characteristics
Sex of household head	Male	2840.67	69.86
	Female	1225.81	30.14
Age of household head (years)	<30	3116.59	76.64
	30–40	28.10	0.69
	41–54	93.84	2.31
	>54	827.95	20.36
Family size	≤4	1585.67	38.99
	>4	2480.81	61.01
Media exposure	Exposed	1323.10	32.54
	Not exposed	2743.37	67.46
Community characteristics
Wealth index	Poorest	926.54	22.78
	Poorer	861.91	21.20
	Middle	747.73	18.39
	Richer	870.16	21.40
	Richest	660.14	16.23
Residence	Rural	1098.50	27.01
	Urban	2495.99	61.38
	Nomadic	471.99	11.61
Outcome variable
Meal frequency status	Adequate	873.93	21.49
	Inadequate	3192.55	78.51

Slightly over half (52.44%) of the children were male. The largest proportion of children belonged to the 12–17 months age group (45.21%), followed by the 18–23 months group (22.43%), 6–8 months (18.74%), and 9–11 months (13.62%). Nearly half (48.44%) were reported as currently breastfeeding, while 51.56% were not. The majority of children (69.99%) were reported as being of normal size at birth, with 30.01% reported as underweight. First-born children constituted 21.69% of the sample, while 67.10% were of birth order 2–3, and 11.21% were of birth order four or higher.

Maternal characteristics highlighted significant challenges. A vast majority of mothers (82.03%) had received no formal education, with only 13.44% having primary education and a mere 4.53% having secondary or higher education. Most mothers (93.25%) were married. The predominant maternal age group was 20–29 years, accounting for 54.81% (25.38% for 20–24 years and 29.43% for 25–29 years). Mothers aged 15–19 years comprised 7.91% of the sample. Parity data showed that 44.61% of mothers had 0–3 children ever born, 42.63% had 4–7, and 12.76% had 8 or more. Home births were common (75.94%), compared to 24.06% delivering in a health facility. A large proportion of mothers (67.46%) reported having no exposure to mass media (radio, television, or newspapers/magazines).

The community and household context revealed that the majority of children resided in urban areas (61.38%), with substantial proportions in rural (27.01%) and nomadic settings (11.61%). Households were predominantly headed by males (69.86%). Regarding household head's age, 76.64% were under 30 years. Most households (61.01%) had more than four members. Household wealth distribution showed significant disparities, with 43.98% of children living in households classified within the poorest or poorer wealth quintiles (22.78% poorest, 21.20% poorer), while 16.23% were in the richest quintile.

To assess potential multicollinearity among predictor variables, a correlation heatmap was generated (Figure 2). The heatmap indicated that most inter-variable correlations were weak to moderate (absolute values generally <0.4, with a few exceptions like Age group and Children ever Born showing a correlation of −0.64), suggesting that multicollinearity was not a major concern for the subsequent modeling.
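A correlation screen of this kind can be sketched in a few lines of Pandas. The data frame below is invented for illustration, the 0.6 flagging threshold is an arbitrary choice for the example, and the Pearson coefficient is an assumption, since the text does not state which correlation measure underlies the heatmap.

```python
import pandas as pd

# Hypothetical encoded predictors; values are illustrative only.
df = pd.DataFrame({
    "age_group": [0, 1, 2, 3, 4, 5, 6, 2],
    "parity":    [0, 0, 1, 1, 2, 2, 2, 1],
    "wealth":    [4, 0, 2, 3, 1, 0, 4, 2],
})

# Pairwise correlation matrix (the quantity a heatmap would display).
corr = df.corr(method="pearson")

# Flag each pair of distinct variables whose absolute correlation is large.
threshold = 0.6
flagged = [
    (a, b, round(float(corr.loc[a, b]), 2))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) >= threshold
]
print(flagged)
```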

Figure 2.

Correlation heatmap of predictor variables.

Bivariable analysis of factors associated with IMF

Table 2 presents the results of the bivariable analysis examining the association between various characteristics and IMF status, using the Chi-Square test.

Table 2.

Bivariable associations between characteristics and meal frequency status.

		Meal frequency status
Characteristic	Category	Adequate % [95% CI]	Inadequate % [95% CI]	P
Child factors
Sex of child	Male	21.4 [18.4–24.7]	78.6 [75.3–81.6]	0.918
	Female	21.6 [19.2–24.2]	78.4 [75.8–80.8]
Current breastfeeding	Not breastfeeding	26.3 [23.2–29.6]	73.7 [70.4–76.8]	<0.001
	Still breastfeeding	16.4 [14.3–18.8]	83.6 [81.2–85.7]
Child age (months)	6–8	22.5 [18.5–27.1]	77.5 [72.9–81.5]	<0.001
	9–11	22.7 [18.1–28.1]	77.3 [71.9–81.9]
	12–17	17.9 [15.6–20.5]	82.1 [79.5–84.4]
	18–23	27.1 [23.1–31.6]	72.9 [68.4–76.9]
Birth order	1	72.4 [68.0–76.4]	27.6 [23.6–32.0]	<0.001
	2–3	8.2 [6.7–10.0]	91.8 [90.0–93.3]
	≥4	2.6 [1.1–6.1]	97.4 [93.9–98.9]
Child size at birth	Normal	20.7 [18.6–23.1]	79.3 [76.9–81.4]	0.195
	Underweight	23.2 [19.7–27.1]	76.8 [72.9–80.3]
Maternal factors
Maternal age (years)	15–19	48.2 [41.2–55.3]	51.8 [44.7–58.8]	<0.001
	20–24	23.7 [20.3–27.4]	76.3 [72.6–79.7]
	25–29	17.9 [15.1–21.0]	82.1 [79.0–84.9]
	30–34	15.3 [11.9–19.4]	84.7 [80.6–88.1]
	35–39	19.4 [14.8–25.0]	80.6 [75.0–85.2]
	40–44	18.7 [11.8–28.2]	81.3 [71.8–88.2]
	45–49	16.0 [5.8–37.0]	84.0 [63.0–94.2]
Educational level	No education	20.5 [18.4–22.7]	79.5 [77.3–81.6]	<0.001
	Primary	21.1 [16.2–27.0]	78.9 [73.0–83.8]
	Secondary	35.6 [27.1–45.2]	64.4 [54.8–72.9]
	Higher	55.1 [40.3–69.1]	44.9 [30.9–59.7]
Maternal marital status	Married	21.3 [19.1–23.5]	78.7 [76.5–80.9]	0.327
	Divorced	26.8 [19.6–35.6]	73.2 [64.4–80.4]
	Widowed	17.7 [7.9–35.1]	82.3 [64.9–92.1]
Place of delivery	Health facility	29.2 [25.0–33.8]	70.8 [66.2–75.0]	<0.001
	Home	19.1 [17.0–21.3]	80.9 [78.7–83.0]
Parity	0–3 children	32.5 [29.4–35.9]	67.5 [64.1–70.6]	<0.001
	4–7 children	12.2 [10.3–14.4]	87.8 [85.6–89.7]
	8+ children	13.8 [10.1–18.6]	86.2 [81.4–89.9]
Media exposure	Exposed	20.0 [16.4–24.2]	80.0 [75.8–83.6]	0.303
	Not exposed	22.2 [20.1–24.5]	77.8 [75.5–79.9]
Household factors
Sex of household head	Male	22.5 [20.1–25.1]	77.5 [74.9–79.9]	0.069
	Female	19.2 [16.4–22.3]	80.8 [77.7–83.6]
Age of household head (years)	<30	19.8 [17.4–22.5]	80.2 [77.5–82.6]	<0.001
	30–40	14.1 [5.6–31.5]	85.9 [68.5–94.4]
	41–54	7.7 [3.2–17.6]	92.3 [82.4–96.8]
	>54	29.6 [26.0–33.4]	70.4 [66.6–74.0]
Family size category	≤4	19.0 [15.9–22.4]	81.0 [77.6–84.1]	0.040
	>4	23.1 [20.7–25.7]	76.9 [74.3–79.3]
Community factors
Region	Awdal	29.2 [20.5–39.8]	70.8 [60.2–79.5]	<0.001
	Woqooyi Galbeed	46.3 [32.2–61.0]	53.7 [39.0–67.8]
	Togdheer	32.2 [26.0–39.1]	67.8 [60.9–74.0]
	Sool	39.8 [34.1–45.7]	60.2 [54.3–65.9]
	Sanaag	45.8 [39.4–52.3]	54.2 [47.7–60.6]
	Bari	10.9 [7.8–15.2]	89.1 [84.8–92.2]
	Nugaal	17.6 [13.7–22.2]	82.4 [77.8–86.3]
	Mudug	17.5 [11.8–25.1]	82.5 [74.9–88.2]
	Galgaduud	23.8 [17.9–31.0]	76.2 [69.0–82.1]
	Hiraan	23.4 [16.0–33.0]	76.6 [67.0–84.0]
	Middle Shabelle	23.8 [18.7–29.7]	76.2 [70.3–81.3]
	Banadir	14.5 [11.1–18.7]	85.5 [81.3–88.9]
	Bay	16.7 [7.7–32.3]	83.3 [67.7–92.3]
	Bakool	11.3 [6.7–18.4]	88.7 [81.6–93.3]
	Gedo	14.0 [10.5–18.5]	86.0 [81.5–89.5]
	Lower Juba	17.4 [13.6–21.9]	82.6 [78.1–86.4]
Residence	Rural	21.2 [18.4–24.3]	78.8 [75.7–81.6]	<0.001
	Urban	18.4 [15.7–21.6]	81.6 [78.4–84.3]
	Nomadic	38.2 [31.8–45.1]	61.8 [54.9–68.2]
Wealth quintile	Poorest	16.7 [13.3–20.9]	83.3 [79.1–86.7]	<0.001
	Poorer	18.1 [14.9–21.8]	81.9 [78.2–85.1]
	Middle	20.2 [15.7–25.6]	79.8 [74.4–84.3]
	Richer	23.0 [19.3–27.1]	77.0 [72.9–80.7]
	Richest	32.1 [26.7–38.0]	67.9 [62.0–73.3]

Several child characteristics were significantly associated with IMF (p < .001). IMF was highest among children aged 12–17 months (82.1%) and those still breastfeeding (83.6%). Birth order demonstrated an exceptionally strong association (p < .001); IMF was dramatically lower for first-born children (27.6%) compared to the extremely high rates observed for children of birth order 2–3 (91.8%) and ≥4 (97.4%). Child's sex and size at birth were not significantly associated with IMF in the bivariable analysis (p = .918 and .195, respectively).

Significant associations were also found between IMF and several maternal factors (p < .001). Children of the youngest mothers (15–19 years) had the lowest IMF prevalence (51.8%), which increased significantly for mothers in older age groups. Maternal education showed a strong inverse relationship (p < .001), with IMF prevalence falling sharply from 79.5% for children of mothers with no education to 44.9% for those whose mothers had higher education. Place of delivery was also significant (p < .001), with children born at home experiencing higher IMF (80.9%) compared to those born in a health facility (70.8%). High parity (total children ever born) showed a strong positive association with IMF (p < .001). Marital status was not significantly associated (p = .327).

Among household factors, the age of the household head was significant (p < .001), with a notably high IMF prevalence (92.3%) for households headed by individuals aged 41–54 years. Family size was also significant (p = .040), with slightly higher IMF in smaller families (≤4 members: 81.0%) compared to larger families (>4 members: 76.9%). Sex of the household head (p = .069) and media exposure (p = .303) were not significantly associated with IMF.

All examined community-level factors demonstrated highly significant associations with IMF (p < .001). Strong regional disparities were evident, and residence type was critical, with nomadic children having the lowest IMF (61.8%) compared to rural (78.8%) and urban (81.6%) children. Household wealth quintile showed a clear gradient (p < .001), with IMF prevalence decreasing significantly from the poorest (83.3%) to the richest (67.9%) quintile.

Machine learning algorithms performance

Five ML models were trained and evaluated for their ability to classify IMF. The performance metrics on the test set are summarized in Table 3.

Table 3.

Predictive performance of machine learning algorithms for IMF.

Metric	Logistic regression	Random forest	Support vector machine (SVM)	Decision tree	Gradient boosting
Accuracy (%)	88.66	88.79	89.43	82.55	89.55
Precision (%)	92.18	91.91	93.25	89.23	93.12
Recall (%)	92.95	93.46	92.79	87.58	93.12
F1-score (%)	92.56	92.68	93.02	88.40	93.12
AUC-ROC (%)	91.16	91.70	90.47	77.24	92.77

The Gradient Boosting algorithm demonstrated superior discriminatory power, achieving the highest AUC-ROC of 92.77%, closely followed by Random Forest (91.70%) and Logistic Regression (91.16%). The SVM also exhibited strong performance (AUC-ROC: 90.47%), whereas the Decision Tree model yielded a substantially lower AUC-ROC (77.24%). These comparative performances are visually depicted by the Receiver Operating Characteristic (ROC) curves in Figure 3. In terms of overall accuracy, Gradient Boosting (89.55%) and SVM (89.43%) were the top-performing models. The F1-score for the primary class of interest, “Inadequate Meal Frequency” (Class 1), was highest for Gradient Boosting (93.12%) and SVM (93.02%), indicating a robust balance between precision and recall for identifying instances of IMF.

Figure 3.

Receiver operating characteristic (ROC) curves of the five models.
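
For readers less familiar with the metric, AUC-ROC can be read as the probability that a randomly chosen child with IMF receives a higher predicted score than a randomly chosen child without IMF. The sketch below computes it directly from that definition. The labels and scores are made-up illustrative values, not SDHS data, and `auc_roc` is a hypothetical helper rather than the code used in this study.

```python
from itertools import product


def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = inadequate meal frequency, 0 = adequate.
toy_labels = [1, 1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2]
toy_auc = auc_roc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")  # 7 of 8 pairs ranked correctly -> 0.875
```

Because this quantity does not depend on any single classification threshold, a model can rank slightly higher on accuracy yet lower on AUC-ROC, as the SVM does relative to Logistic Regression in Table 3.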

Figure 4 shows that all models identified inadequate meal frequency (Class 1) well, with consistently high true positive (TP) counts (522–557), reflecting strong sensitivity to the majority class. Performance in correctly classifying adequate meal frequency (Class 0) varied more, with true negatives (TN) ranging from 126 to 149. The Decision Tree made the most errors of both kinds, false positives (FP = 63; adequate cases misclassified as inadequate) and false negatives (FN = 74; inadequate cases misclassified as adequate), indicating the weakest discrimination, particularly for the minority class. In contrast, the SVM achieved the highest TN (149) and lowest FP (40), giving the best specificity for adequate cases, while Random Forest achieved the highest TP (557) and therefore the highest sensitivity to inadequacy. Gradient Boosting and Logistic Regression balanced these metrics, with Logistic Regression missing only 42 inadequate cases (FN = 42). The class imbalance in the test set (596 inadequate vs. 189 adequate samples) underscores the difficulty of recognizing the minority class and suggests context-dependent model selection: prioritizing FP reduction (e.g., SVM, to conserve resources) or FN minimization (e.g., Random Forest, where missing an inadequately fed child is the costlier error).

Figure 4.

Confusion matrices for (a) Decision Tree, (b) Logistic Regression, (c) Random Forest, (d) SVM, and (e) Gradient Boosting models.
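
The link between these confusion-matrix counts and the metrics in Table 3 can be checked by hand. The following is a minimal sketch using the Decision Tree counts: FP and FN are taken from the text, while TP and TN are derived here from the 596 inadequate and 189 adequate cases. The helper `classification_metrics` is a hypothetical name introduced for illustration, not part of the study's code.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, specificity and F1 as percentages."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": 100 * (tp + tn) / total,
        "precision": 100 * precision,
        "recall": 100 * recall,
        "specificity": 100 * tn / (tn + fp),
        "f1": 100 * 2 * precision * recall / (precision + recall),
    }


# Decision Tree: FP = 63 and FN = 74 as reported; TP = 596 - 74 and
# TN = 189 - 63 are derived from the class totals (an inference, not a quote).
dt = classification_metrics(tp=522, fp=63, fn=74, tn=126)
for name, value in dt.items():
    print(f"{name}: {value:.2f}%")

# Accuracy of a trivial rule that labels every child as "inadequate".
majority_baseline = 100 * 596 / (596 + 189)
print(f"majority-class baseline accuracy: {majority_baseline:.2f}%")
```

The computed accuracy, precision, recall, and F1-score match the Decision Tree column of Table 3. The majority-class baseline of about 75.9% is a useful reference point when reading the accuracy figures under this degree of class imbalance.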

Feature Importance

To identify the principal determinants of IMF, feature importance was assessed for each of the ML models. The methods for deriving importance varied by model: absolute magnitude of coefficients for Logistic Regression, Gini importance for tree-based models (Random Forest, Decision Tree, Gradient Boosting), and permutation importance for the SVM.
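
Of these three methods, permutation importance is the least model-specific: it measures how much a fitted model's accuracy drops when the values of one feature are shuffled across rows. The sketch below is a minimal illustration, assuming a hypothetical rule-based classifier and made-up rows rather than the study's SVM or SDHS data. It also shows why a feature the model never uses receives an importance of exactly zero.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Copy each row, replacing only the shuffled feature.
        shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return sum(drops) / n_repeats


# Hypothetical stand-in model: predicts IMF (1) for any child who is not first-born.
def toy_model(row):
    return 1 if row["birth_order"] > 1 else 0


# Made-up rows for illustration only.
perm_rows = [
    {"birth_order": 1, "child_sex": 0},
    {"birth_order": 1, "child_sex": 1},
    {"birth_order": 2, "child_sex": 0},
    {"birth_order": 3, "child_sex": 1},
    {"birth_order": 4, "child_sex": 0},
    {"birth_order": 5, "child_sex": 1},
]
perm_labels = [0, 0, 1, 1, 1, 1]

imp_birth_order = permutation_importance(toy_model, perm_rows, perm_labels, "birth_order")
imp_child_sex = permutation_importance(toy_model, perm_rows, perm_labels, "child_sex")
print(f"birth_order: {imp_birth_order:.3f}, child_sex: {imp_child_sex:.3f}")
```

Note that coefficient magnitudes, impurity-based scores, and permutation scores are on different scales, so values are comparable within a model but not across methods.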

For the Gradient Boosting model, which demonstrated superior predictive performance in the evaluation, Birth Order emerged as by far the most influential feature, with an importance score of 0.741, as depicted in Figure 5. Subsequent features, while considerably less impactful, included Breastfeeding status (0.053), Region (0.033), Child age (0.031), and Maternal age group (0.027).

Figure 5.

Most important factors using Gradient Boosting (in descending importance).
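
As a reminder of what impurity-based (Gini) importance scores measure, the following is the standard textbook definition for classification trees, not a derivation specific to this study. The symbols $p_t$, $G$, $n_t$, $N$, and $v(t)$ (the feature used to split node $t$) are notation introduced here for illustration.

```latex
% Gini impurity of node t for a binary outcome; p_t is the share of IMF cases in the node
G(t) = 1 - p_t^{2} - (1 - p_t)^{2} = 2\,p_t\,(1 - p_t)

% Impurity decrease from splitting node t (n_t samples) into children t_L and t_R
\Delta G(t) = G(t) - \frac{n_{t_L}}{n_t}\,G(t_L) - \frac{n_{t_R}}{n_t}\,G(t_R)

% Importance of feature j: weighted impurity decreases over all nodes that split on j,
% normalised so that the importances of all features sum to 1
I_j = \frac{\sum_{t:\,v(t)=j} \frac{n_t}{N}\,\Delta G(t)}{\sum_{t} \frac{n_t}{N}\,\Delta G(t)}

% Illustration: a node holding the full sample at the reported prevalence p = 0.7851
G = 2 \times 0.7851 \times 0.2149 \approx 0.337
```

Because the scores are normalised to sum to 1, an importance of 0.741 means that splits on Birth Order account for roughly 74% of the total impurity reduction achieved by the model. In some gradient boosting implementations the individual trees are regression trees, in which case the same construction applies with a squared-error impurity in place of $G$.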

Figure 6(a–d) presents the five most important factors associated with IMF for the remaining four models. In the Logistic Regression model, analysis of coefficient magnitudes also identified Birth Order as the strongest predictor (coefficient magnitude: 3.214). Other features with notable importance were Breastfeeding status (0.753), Marital status (0.458), Child Size at birth (0.345), and Delivery (place of delivery). The Random Forest model similarly highlighted Birth Order as the primary predictor (importance: 0.347), followed by Region (0.120), Maternal age (0.073), Wealth Quintile (0.068), and Child age (0.057). For the Decision Tree model, Birth Order again held the highest importance (0.419); other key predictors were Region (0.114), Maternal age group (0.081), Wealth Quintile (0.074), and Child age (0.033). Using permutation importance, the SVM also identified Birth Order as the most critical feature (importance: 0.263). Notably, many other features exhibited zero permutation importance in this model, suggesting a strong reliance on Birth Order for its predictions.

Figure 6.

Top five feature importances across models. Subplots (a), (b), (c), and (d) display the top five feature importances for Logistic Regression, Random Forest, Decision Tree, and SVM (permutation importance), respectively. In all subplots, features are ranked in ascending order.

Feature importance analysis (Figure 6a–d) consistently identified Birth Order as the leading predictor of IMF across all models, though its quantification varied by method: the largest coefficient magnitude in Logistic Regression (3.214), the greatest importance weight in Random Forest (0.347) and Decision Tree (0.419), and the dominant permutation importance in the SVM (0.263). Secondary predictors diverged across models: Breastfeeding status, Marital status, and Child Size were prominent in Logistic Regression, whereas Region, Maternal Age, and Wealth Quintile recurred in the tree-based models. Notably, the SVM relied almost exclusively on Birth Order, with zero importance assigned to many other features. Despite methodological differences, key determinants such as Region, Maternal Age, Child Age, and Wealth Quintile surfaced across multiple models, reinforcing their association with feeding outcomes and their value for targeted interventions.

Discussion

This study, leveraging ML approaches on data from the inaugural SDHS 2020, provides novel insights into the determinants of IMF among children aged 6–23 months in Somalia. The overall weighted prevalence of IMF was an alarming 78.51%, placing Somalia among the countries with the poorest IYCF practices globally. This rate substantially exceeds those reported in neighboring Ethiopia (ranging from 30.6% to 47.0%)^13,14 and the pooled estimate for SSA, where approximately 38.5% of children meet Minimum Meal Frequency (i.e., roughly 61.5% do not).^11 To our knowledge, this research represents the first application of diverse ML models (Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, and SVM) to analyze IMF-related outcomes in Somalia, and potentially the first such comprehensive ML-driven analysis in the broader East African region for this specific IYCF indicator.

The application of five distinct machine-learning algorithms enabled a robust identification and ranking of factors associated with IMF. The comparative performance of these models is presented in Table 3 and further illustrated using ROC curves (Figure 3) and confusion matrices (Figure 4a–e). Ensemble-based approaches, such as Gradient Boosting and Random Forest, which aggregate multiple decision trees, demonstrated superior predictive performance compared with the single Decision Tree model. This finding suggests that complex and potentially non-linear relationships between the covariates and IMF outcomes may be more effectively captured by these algorithms, although their performance may also reflect advantages such as variance reduction and improved generalization. In contrast, the relative performance of Logistic Regression indicates that additive and approximately linear associations between specific predictors and the probability of IMF remain informative, implying that simpler parametric relationships continue to contribute meaningfully to prediction alongside more flexible machine-learning models.

The feature importance analysis across all ML models consistently identified Birth order as the strongest predictor of IMF. As illustrated by the Gradient Boosting model, our best performer (Figure 5), birth order accounted for an overwhelming 74.07% of its Gini importance. This dramatic finding, where higher birth order substantially increases the risk of IMF, suggests severe intra-household resource dilution and caregiver burden, potentially exacerbated in the Somali context by large family sizes and socio-economic fragility (Ministry of Planning, 2020). While higher birth order is often linked to poorer IYCF outcomes,^13,48 the extreme magnitude identified by our ML models underscores a critical vulnerability for later-born Somali children, demanding highly targeted interventions.

This primacy of Birth order was further corroborated by the feature importance analyses of the other four models, Logistic Regression, Random Forest, Decision Tree, and SVM, as summarized in the combined visualization presented in Figure 6(a–d). Across these diverse algorithms, Birth order consistently ranked as the top predictor. For instance, in the Logistic Regression model, Birth order exhibited the largest coefficient magnitude. Similarly, the Random Forest and Decision Tree models highlighted Birth order with the highest Gini importance scores among their respective top five features. The SVM, using permutation importance, also identified Birth order as the most critical feature, with many other features showing negligible importance, suggesting a strong reliance of the SVM on this single predictor.

Beyond Birth order, other factors appearing with notable importance across several models (as seen in Figures 5 and 6) included Breastfeeding status (prominent in Gradient Boosting and Logistic Regression), Region (important in Random Forest, Decision Tree, and Gradient Boosting), and Maternal Age group (featuring in Random Forest, Decision Tree, and Gradient Boosting). The common paradoxical finding in IMF analyses, where currently breastfed children show higher IMF,^14,25,49,50 was evident in our bivariable results, and its continued predictive importance in the ML models emphasizes the need for nuanced messaging around complementary feeding alongside breastfeeding.

Socio-economic determinants, namely maternal education and household wealth, were also identified by our ML models as important factors. For example, Wealth Quintile was among the top predictors for both the Random Forest and Decision Tree models, while educational level or related proxies like Maternal Age group also showed relevance. This aligns with a vast body of literature from SSA and beyond,^8,11,13,15,16,51 including ML-driven studies on micronutrient deficiencies in Ethiopia.^34 Given that 82.03% of mothers in our sample had no formal education and 43.98% of children lived in the poorest/poorer households, these findings underscore the profound impact of socio-economic disparities on child nutrition in Somalia.

Child age was another consistently important predictor, as demonstrated by the Gradient Boosting, Random Forest, and Decision Tree models. The bivariable analysis showed higher IMF in the 12–17 months age group, a pattern also observed in Ethiopia and SSA.^11,13 This suggests that the transitional period, when breastfeeding patterns change and caregivers may be less vigilant about active feeding than at the initiation of complementary feeding, is a high-risk window in Somalia; the predictive importance of child age in our ML models is consistent with this interpretation. The importance of Region in the Gradient Boosting, Random Forest, and Decision Tree models points to localized contexts of conflict, food security, service access, and cultural norms heavily influencing feeding practices in Somalia, consistent with spatial heterogeneity findings in Ethiopia.^13,14,34

The application of ML in this study offers distinct advantages. Methodologically, the use of five distinct algorithms served as a form of sensitivity analysis; the consistent identification of Birth Order as the primary predictor across models with fundamentally different approaches (e.g., the linear Logistic Regression, the tree-based Random Forest, and the hyperplane-based SVM) strengthens the validity of this finding. Beyond confirming known risk factors, the consistently high ranking of predictors like Birth Order across diverse algorithms provides robust evidence for prioritization. Furthermore, ML models can implicitly capture non-linear relationships and complex associations between variables that traditional regression models might not fully elucidate,^36 which is crucial for a multifaceted issue like IMF in a complex humanitarian setting. The high predictive accuracy of models like Gradient Boosting (accuracy: 89.55%) suggests their potential utility in identifying high-risk children or communities for targeted interventions, a key aim of ML in public health.^35,37 This could be particularly valuable in a data-scarce context like Somalia, where such models could be developed into screening tools for community health workers, allowing them to prioritize households for nutritional support even when direct measurement of dietary intake is not feasible. However, some factors that were significant in our bivariable analyses or in traditional regressions in other studies (e.g., media exposure and place of delivery, as seen in Harindintwari et al. (2024) or Kang et al. (2019)) showed lower or inconsistent importance in our ML models for Somalia, possibly due to the overwhelming influence of dominant factors like birth order in this specific context or to algorithmic differences in predictor weighting.^51,52

Strengths and limitations

Key strengths of this study include its use of the first nationally representative SDHS data for Somalia, the application of a suite of robust ML algorithms for predictor identification, and a substantial sample size. This ML-centric approach provides a novel and timely lens on IMF determinants in a data-scarce, fragile context, representing a pioneering effort in this specific research area for Somalia. However, the study is subject to limitations inherent in cross-sectional DHS data, such as the potential for recall bias in dietary intake reporting and the inability to definitively establish causality. The exclusion of eight insecure strata from the original SDHS sampling frame may also affect the generalizability of findings to the most vulnerable and hard-to-reach populations. While ML models are powerful in identifying complex associations and achieving high predictive accuracy, the “black box” nature of some algorithms can make the direct interpretation of specific relationship effects challenging, although feature importance metrics provide valuable insights into predictor contributions. Furthermore, the meal frequency indicator itself is quantitative and does not capture the qualitative aspects of dietary intake, such as nutrient density or diversity.

Future research

Building on these findings, future research should aim to incorporate longitudinal data to explore causal pathways and the temporal dynamics of IMF. Qualitative studies are warranted to delve deeper into the socio-cultural drivers of the observed feeding patterns, particularly the extreme risk associated with high birth order and the potential protective factors in nomadic communities. Expanding the ML approach to include a wider array of potential predictors, such as detailed household food security indicators, paternal characteristics, and access to specific health and nutrition services, could further refine predictive models. Investigating the application of more advanced ML techniques, including deep learning or hybrid models, might also yield additional insights, particularly if richer data sets become available. Finally, exploring the operationalization of these ML models into practical screening tools for community health workers or program planners could translate these research findings into tangible public health impact.

Conclusion

This ML-driven analysis of the SDHS 2020 data reveals an extremely high prevalence of IMF among young Somali children. Birth order emerged as an exceptionally dominant predictor, alongside child age, breastfeeding status, maternal education, household wealth, and regional context. These findings, underscored by the robust performance of ML models like Gradient Boosting, call for urgent, multi-sectoral interventions. Priorities should include targeted support for families with high birth order children, initiatives to empower women through education and economic opportunities, poverty reduction strategies, and geographically tailored programs. The insights gained from this ML approach can help refine targeting strategies and improve the effectiveness of nutritional interventions in this critical humanitarian context.

Footnotes

ORCID iDs

Mohamed Abdirahim Omar

Omran Salih

Ethical approval

This study is based on secondary analysis of the SDHS 2020 data set. The original data collection protocol was approved by the IRB of the SNBS. Informed consent was obtained from parents or legal guardians for all child participants in accordance with established ethical standards. As this research utilizes publicly available, anonymized data, no additional ethical approval was required for this study. Confidentiality and anonymity were strictly maintained throughout the analytical process.

Consent for publication

This study uses anonymized secondary data where individual consent for publication is not required.

Authors' contributions

Mohamed Abdirahim Omar was responsible for authoring the paper, conducting the analysis, composing the results section, and preparing the manuscript. Omran Salih contributed to editing, analyzing, reviewing, and correcting the manuscript for scientific integrity. Omran Salih also served as Mohamed Abdirahim Omar's supervisor throughout the study. Both authors contributed to conceiving the research topic, exploring the idea, performing the analysis, and drafting the manuscript. The authors have read and agreed to the published version of the manuscript.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Availability of data and material

The data analyzed in this study are secondary data obtained from the 2020 Somali Demographic and Health Survey. The data set is publicly available to registered users through the DHS Program website and the Somalia National Bureau of Statistics microdata platform.

References

1. WHO. World Health Organization—WHO. In: The Europa Directory of International Organizations 2021. Routledge; 2021, pp.370–384.

2. Black, Victora, Walker, et al. Maternal and child undernutrition and overweight in low-income and middle-income countries. Lancet 2013; 382: 427–451.

3. Dewey, Adu-Afarwuah. Systematic review of the efficacy and effectiveness of complementary feeding interventions in developing countries. Matern Child Nutr 2008; 4: 24–85.

4. Victora, Bahl, Barros AJD, et al. Breastfeeding in the 21st century: epidemiology, mechanisms, and lifelong effect. Lancet 2016; 387: 475–490.

5. UN. Transforming Our World: The 2030 Agenda for Sustainable Development. In: A New Era in Global Health, https://digitallibrary.un.org/record/3923923?v=pdf (2018, accessed 7 May 2025).

6. WHO. WHO Guideline for complementary feeding of infants and young children 6–23 months of age. National Center for Biotechnology Information; 2023.

7. UNICEF. Children Food and Nutrition; 24, https://www.unicef.org/media/60811/file/SOWC-2019-Exec-summary.pdf (2019, accessed 7 May 2025).

8. Gatica-Domínguez, Neves PAR, Barros AJD, et al. Complementary feeding practices in 80 low-and middle-income countries: prevalence of and socioeconomic inequalities in dietary diversity, meal frequency, and dietary adequacy. J Nutr 2021; 151: 1956–1964.

9. White, Bégin, Kumapley, et al. Complementary feeding practices: current global and regional estimates. Matern Child Nutr 2017; 13: e12505.

10. Issaka, Paradies, Stevenson. Modifiable and emerging risk factors for type 2 diabetes in Africa: a systematic review and meta-analysis protocol. Syst Rev 2018; 7: 1–10.

11. Tebeje, Seboka, Lemma, et al. Minimum meal frequency and associated factors among children aged 6-23 months in Sub-Saharan Africa: a multilevel analysis of the demographic and health survey data. Front Public Heal 2024; 12: 1468701.

12. Belay, Taddese, Gelaye. Does socioeconomic inequality exist in minimum acceptable diet intake among children aged 6–23 months in Sub-Saharan Africa? Evidence from 33 Sub-Saharan African countries’ demographic and health surveys from 2010 to 2020. BMC Nutr 2022; 8: 30.

13. Tekeba, Gonete, Dessie, et al. Spatial variation and determinants of inadequate Minimum meal frequency among children aged 6–23 months in Ethiopia: spatial and multilevel analysis using Ethiopian Mini demographic and health survey (EMDHS) 2019. Ann Glob Heal 2024; 90: 1–37.

14. Tesfie, Endalew, Birhanu, et al. Spatial distribution of inadequate meal frequency and its associated factors among children aged 6-23 months in Ethiopia: multilevel and spatial analysis. PLOS ONE 2024; 19: 1–30.

15. Wake. Prevalence of minimum meal frequency practice and its associated factors among children aged 6 to 23 months in Ethiopia: a systematic review and meta-analysis. Glob Pediatr Heal 2021; 8: 1–12.

16. Terefe, Jembere, Abie Mekonnen. Minimum meal frequency practice and associated factors among children aged 6–23 months old in the Gambia: a multilevel mixed effect analysis. Sci Rep 2023; 13: 22607.

17. Manzione, Kriser, Gamboa, et al. Maternal employment status and minimum meal frequency in children 6-23 months in Tanzania. Int J Environ Res Public Health 2019; 16: 1137.

18. Derseh, Shewaye, Agimas, et al. Spatial variation and determinants of inappropriate complementary feeding practice and its effect on the undernutrition of infants and young children aged 6 to 23 months in Ethiopia by using the Ethiopian Mini-demographic and health survey, 2019: spatial. Front Public Heal 2023; 11: 1158397.

19. Ministry of Planning I and ED 2020. Somali National Development Plan 9-2020-2024. Ministry of Planning, Investment and Development, https://mop.gov.so/somali-national-development-plan-9-2020-2024/ (2020, accessed 7 May 2025).

20. SNBS. Somali Health and demographic survey 2020. Somali Natl Bur Stat 2020; 1: 1–11.

21. Croft, Marshall AMJ, Allen, et al. Guide to DHS statistics. Rockville, Maryland, USA: ICF, 2018.

22. SHDS. Somali Health and Demographic Survey 2020. SHD Surv 2020 Somalia. 2020;2020.

23. Dadzie, Amo-Adjei, Esia-Donkoh. Women empowerment and minimum daily meal frequency among infants and young children in Ghana: analysis of Ghana demographic and health survey. BMC Public Health 2021; 21: –9.

24. Khan, Zaki, Ghenimi, et al. Predicting preterm birth using explainable machine learning in a prospective cohort of nulliparous and multiparous pregnant women. PLOS ONE 2023; 18: 1–17.

25. Melak, Abeje, Bayou, et al. Individual and community level determinants of minimum meal frequency among breastfeeding children aged 6–23 months in Ethiopia: a multilevel analysis of 2019 Ethiopian demographic health survey data. Front Public Heal 2024; 12: 1445370.

26. Tamirat, Nigatu, Tesema, et al. Spatial and multilevel analysis of unscheduled contraceptive discontinuation in Ethiopia: further analysis of 2005 and 2016 Ethiopia demography and health surveys. Front Glob Women’s Heal 2023; 4: 895700.

27. World Health Organization. Country cooperation strategy for WHO and Somalia 2021–2025, 2022.

28. Beyene, Worku, Wassie. Dietary diversity, meal frequency and associated factors among infant and young children in Northwest Ethiopia: a cross-sectional study. BMC Public Health 2015; 15: –9.

29. Hirvonen, Hoddinott. Agricultural production and children’s diets: evidence from rural Ethiopia. Agric Econ 2017; 48: 469–480.

30. Ogbo, Page, Idoko, et al. Trends in complementary feeding indicators in Nigeria, 2003–2013. BMJ Open 2015; 5: e008467.

31. Christodoulou, Collins, et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol 2019; 110: 12–22.

32. Sidey-Gibbons JAM, Sidey-Gibbons. Machine learning in medicine: a practical introduction. BMC Med Res Methodol 2019; 19: 1–18.

33. Fayyaz, Phan TLT, Bunnell, et al. Who will leave a pediatric weight management program and when? A machine learning approach for predicting attrition patterns. 2020;5112.

34. Gebeye, Dessie, Yimam. Predictors of micronutrient deficiency among children aged 6–23 months in Ethiopia: a machine learning approach. Front Nutr 2023; 10: 1–13.

35. Hassan, Muse, Chesneau. Machine learning study using 2020 SDHS data to determine poverty determinants in Somalia. Sci Rep 2024; 14: 1–19.

36. Kloska, Harmoza, Kloska, et al. Predicting preterm birth using machine learning methods. Sci Rep 2025; 15: 5683.

37. Zemariam, Abey, Kassaw, et al. Comparative analysis of machine learning algorithms for predicting diarrhea among under-five children in Ethiopia: Evidence from 2016 EDHS. Health Inf J 2024; 30: 1–29.

38. Marriott, White, Hadden, et al. World health organization (WHO) infant and young child feeding indicators: associations with growth measures in 14 low-income countries. Matern Child Nutr 2012; 8: 354–370.

39. Nguyen, Sanghvi, Kim, et al. Factors influencing maternal nutrition practices in a large scale maternal, newborn and child health program in Bangladesh. PLOS ONE 2017; 12: e0179873.

40. World Health Organization. WHO/PAHO. Guiding principles for complementary feeding of the breastfed child. World Heal Organ UNICEF, https://www.who.int/publications/i/item/9275124604 (2000, accessed 7 May 2025).

41. Hosmer, Lemeshow, Sturdivant. Applied logistic regression. Hoboken, NJ: John Wiley & Sons, 2013.

42. Breiman, Friedman, Olshen, et al. Classification and regression trees. Abingdon: Routledge, 2017.

43. Breiman. Random forests. Mach Learn 2001; 45: 5–32.

44. Hastie, Tibshirani, Friedman. The elements of statistical learning. New York, NY: Citeseer, 2009.

45. Cortes, Vapnik. Support-vector networks. Mach Learn 1995; 20: 273–297.

46. Abbafati, Machado, Cislaghi, et al. Global age-sex-specific fertility, mortality, healthy life expectancy (HALE), and population estimates in 204 countries and territories, 1950–2019: a comprehensive demographic analysis for the global burden of disease study 2019. Lancet 2020; 396: 1160–1203.

47. Fawcett. An introduction to ROC analysis. Pattern Recognit Lett 2006; 27: 861–874.

48. Mitchodigni, Amoussa Hounkpatin, Ntandou-Bouzitou, et al. Complementary feeding practices: determinants of dietary diversity and meal frequency among children aged 6–23 months in southern Benin. Food Secur 2017; 9: 1117–1130.

49. Issaka, Agho, Renzaho AMN. Prevalence of key breastfeeding indicators in 29 Sub-Saharan African countries: a meta-analysis of demographic and health surveys (2010–2015). BMJ Open 2017; 7: e014145.

50. Terefe, Jembere, Abie Mekonnen. Minimum meal frequency practice and associated factors among children aged 6–23 months old in the Gambia: a multilevel mixed effect analysis. Sci Rep 2023; 13: 1–12.

51. Kang, Chimanya, Matji, et al. Determinants of Minimum dietary diversity among children aged 6–23 months in 7 countries in east and Southern Africa (P10-035-19). Curr Dev Nutr 2019; 3: nzz034.

52. Harindintwari, Mochama, Nsanzabera, et al. Factors associated with Minimum acceptable diet among children aged 6 to 23 months in Rwanda. Rwanda J Med Heal Sci 2024; 7: 445–453.