Abstract
The Sustainable Development Goals (SDGs), established by the United Nations in 2015, offer a comprehensive framework for addressing major global challenges, including poverty, inequality and environmental degradation, with the overarching aim of achieving prosperity and well-being for all by 2030. Reliable predictions of SDG indicators are crucial for proactive policy-making and for optimizing resource allocation, so that interventions are targeted at the areas of greatest need. This paper introduces a two-step process for constructing machine learning models to forecast SDG indicators. In the first step, we apply a shape-based clustering method to group countries with similar underlying characteristics, thereby forming more homogeneous clusters for analysis. In the second step, machine learning models based on XGBoost and LSTM are trained for each cluster, tailored to the specific characteristics of the countries within it. For comparison, models are also trained on the full, unclustered dataset. We apply this approach to SDG indicator 9.2.1, which tracks manufacturing value added per capita. Our results show that the cluster-specific machine learning models consistently outperform traditional time series forecasting methods such as ARIMA and Holt’s damped trend model, underscoring the potential of this method to enhance the accuracy of SDG forecasting. Furthermore, we use the machine learning-based forecasts to conduct an outlook assessment of SDG 9.2.1, which reveals that the majority of countries remain significantly off track to achieve the 2030 target, emphasizing the urgent need for more targeted and timely policy interventions.
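The two-step process described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the shape distance (one minus the Pearson correlation of z-normalized series), the k-medoids-style grouping, and the pooled one-lag linear model standing in for XGBoost and LSTM are all simplifying assumptions, and the function names and toy country series are hypothetical.

```python
from statistics import mean, pstdev

def z_normalize(series):
    """Rescale to zero mean and unit variance so that only the shape matters."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]

def shape_distance(a, b):
    """Assumed shape distance: 1 - Pearson correlation (0 means same shape)."""
    za, zb = z_normalize(a), z_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(za, zb)) / len(za)

def cluster_by_shape(series_by_country, k, n_iter=20):
    """Step 1 (sketch): k-medoids-style grouping under the shape distance."""
    names = sorted(series_by_country)
    dist = lambda p, q: shape_distance(series_by_country[p], series_by_country[q])
    # Deterministic farthest-first choice of the initial medoids.
    medoids = [names[0]]
    while len(medoids) < k:
        medoids.append(max(names, key=lambda n: min(dist(n, m) for m in medoids)))
    clusters = {}
    for _ in range(n_iter):
        # Assign every country to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for n in names:
            clusters[min(medoids, key=lambda m: dist(n, m))].append(n)
        # Move each medoid to the member closest to all other members.
        new_medoids = [
            min(members, key=lambda c: sum(dist(c, o) for o in members))
            for members in clusters.values() if members
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return [members for members in clusters.values() if members]

def fit_pooled_ar1(series_list):
    """Step 2 (stand-in model): least squares for y[t] = a + b * y[t-1],
    pooled over every series in one cluster."""
    xs = [s[t - 1] for s in series_list for t in range(1, len(s))]
    ys = [s[t] for s in series_list for t in range(1, len(s))]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return my - b * mx, b

# Toy, invented series (not SDG data): two rising and two falling countries.
toy = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [10, 20, 30, 40, 50, 60],
    "C": [6, 5, 4, 3, 2, 1],
    "D": [60, 50, 40, 30, 20, 10],
}
clusters = cluster_by_shape(toy, k=2)
# One model per cluster, keyed by the cluster's member names.
models = {tuple(c): fit_pooled_ar1([toy[n] for n in c]) for c in clusters}
# Baseline for comparison: one model on the full, unclustered data.
global_model = fit_pooled_ar1(list(toy.values()))
```

In this sketch the raw levels are pooled within a cluster for brevity. In practice the series would likely be rescaled first, and the stand-in model replaced by a gradient-boosted or recurrent model trained on lagged features.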
