Abstract
Precise forecasting of the Pavement Condition Index (PCI) is essential for efficient pavement maintenance planning within budget constraints. This research formulates and evaluates three models, Support Vector Machine (SVM), Back Propagation Neural Network (BPNN), and Particle Swarm Optimization-optimized SVM (PSO-SVM), using field data from two roads in China: one ordinary road and one expressway. Five influential factors (road age, annual average daily traffic per lane, average annual temperature, annual precipitation, and relative humidity) serve as inputs, with PCI as the output. PSO is used to tune the SVM hyperparameters, specifically the penalty parameter c and the kernel parameter γ. Model performance was assessed with a 70/30 hold-out split and 5-fold cross-validation to mitigate partition bias in a small-sample context. The results indicate that PSO-SVM, as an optimization-enhanced model, achieves higher prediction accuracy and more stable comparative performance than the baseline SVM and BPNN under the present small-sample conditions. A Random Forest (RF) analysis indicates that road age, traffic volume, and temperature are the primary determinants of PCI. The method provides practical guidance for pavement repair decision-making in cold regions.
Keywords
1. Introduction
China’s road network has developed rapidly in the last forty years. By the end of 2024, the total length of roads exceeded 5.49 million kilometers, with expressways surpassing 190,700 kilometers.1 While this expansive network supports economic development, it faces significant challenges. The continuous increase in traffic volume and vehicle overloading, coupled with complex environmental factors such as fluctuating temperature and precipitation, significantly accelerates pavement degradation. These stressors lead to common distresses, such as cracking and rutting, which compromise driving safety and increase maintenance costs. With limited funds but ever-increasing management requirements, the gap between infrastructure needs and available resources is widening. To address this, systematic assessment and accurate forecasting using indicators such as the PCI, Ride Quality Index (RQI), Structural Strength Index (SSI), and Side Force Coefficient (SFC) or British Pendulum Number (BPN) are vital for identifying vulnerable road sections and optimizing maintenance strategies. As demonstrated by the Long-Term Pavement Performance (LTPP) program and other research, a robust Pavement Management System (PMS) facilitates a better understanding of future conditions, allowing for proactive maintenance that effectively extends the service life of pavement infrastructure.
Performance prediction models in PMS fall mainly into three categories: deterministic, stochastic, and machine learning.2 Deterministic models produce a single pavement condition forecast from a given set of conditions and time span. Stochastic models account for the randomness of pavement condition evolution and generally provide the probability distribution of pavement status at any given time, with the Markov model being the most representative.3,4 With advances in mathematics and computing, various machine learning models have emerged, including support vector machines (SVMs), k-nearest neighbors (KNNs), and artificial neural networks (ANNs).5–7 Machine learning captures the correlations and underlying structures in data, enabling reasoning and prediction for complex problems, especially those involving high-dimensional, nonlinear data.
Although machine learning methods have shown strong potential for pavement performance prediction, single models still have inherent limitations.8–10 ANNs may suffer from instability during training and local optima problems; SVMs are highly sensitive to kernel and parameter selection; and KNN-based methods can be affected by feature scaling, data distribution, and the choice of k. As a result, single-model approaches may not always yield stable, reliable predictions in complex pavement engineering scenarios. To address these limitations, pavement performance prediction research has increasingly moved from single models to hybrid modeling frameworks that combine optimization algorithms, feature selection methods, or multiple learners with conventional machine learning models to improve prediction accuracy, robustness, and generalization. In particular, hybrid frameworks combining neural networks, SVMs, and ensemble learning have emerged as prominent research directions. Yang et al.11 developed a machine-learning-based framework for pavement performance prediction and found that the PSO-BP neural network achieved the best predictive performance among the compared models. Xiao et al.12 developed a PSO-BPNN model, demonstrating the effectiveness of combining metaheuristic optimization with neural networks. More directly related to this study are SVM-based hybrid approaches. Yan et al.13 used PSO to optimize SVM parameters for PCI evaluation, achieving better results than empirical parameter selection. Li et al.14 applied a Particle Swarm Optimization–Support Vector Regression (PSO-SVR) framework to highway inspection data, achieving faster convergence and lower prediction error. Li et al.15 combined an improved Firefly Algorithm with SVM to enhance prediction stability and generalization. Wang et al.16 integrated Grey Relational Analysis with SVR and achieved robust prediction performance under small-sample conditions.
In recent years, machine learning methods have been widely applied to the prediction of asphalt pavement performance. Deep learning and interpretable learning frameworks have shown strong potential for network-level and multi-indicator prediction, while also enhancing model transparency and supporting maintenance decision-making.6,17,18 Meanwhile, studies based on the LTPP database and field investigation data have demonstrated that SVR, RF, gradient boosting (GB), and stacked ensemble models can effectively predict International Roughness Index (IRI), PCI, rutting, cracking, and related distress indicators.19–21 In addition, optimization-based hybrid models, such as Improved Firefly Algorithm–Support Vector Machine (IFA-SVM) and Support Vector Machine–Firefly Algorithm (SVM-FFA), have further improved the accuracy, stability, and robustness of pavement performance prediction, particularly under nonlinear and small-sample conditions.15,22 Overall, these studies indicate that integrating deep learning, ensemble learning, and optimization strategies can effectively enhance pavement condition prediction, especially for complex, high-dimensional datasets.
Despite recent advances in machine learning for pavement performance prediction, two major limitations remain in the existing literature. First, most previous studies have relied on large datasets, typically comprising hundreds or even thousands of samples from databases such as the LTPP program. In practical pavement management at the provincial or regional level, however, constraints such as infrequent inspections, limited funding, and incomplete historical records often restrict highway agencies to only a few dozen valid samples for specific road types or regions. Under such small-sample conditions, conventional machine learning models are more likely to exhibit unstable evaluation results, overfitting, and sensitivity to data partitioning. Second, existing studies have primarily focused on improving prediction accuracy, while paying insufficient attention to model stability, reliability, and practical applicability in small-sample scenarios. From an engineering perspective, models with consistent performance and clear interpretability are often more valuable than those that achieve only marginally higher accuracy but suffer from poor generalization.
This study investigates PCI prediction under small-sample conditions using limited field data collected from one expressway and one ordinary road in China. Three representative models—SVM, BPNN, and PSO-SVM—are compared based on predictive stability, generalization consistency, and resistance to overfitting. In addition, a random-forest-based feature importance analysis is conducted to identify the key factors influencing PCI degradation, thereby enhancing the interpretability of the results and supporting pavement maintenance decision-making at both the provincial and network levels.
2. Evaluation of asphalt pavement performance
Pavement performance assessment indicators are generally divided into two categories: single indicators and composite indicators. A single-indicator evaluation assesses one specific aspect of performance, such as pavement condition, based on the type, degree, and distribution of distress. A composite evaluation combines and summarizes the results for the individual aspects to produce an overall assessment of the pavement’s service level. In other words, a single evaluation gives one point of view, while a composite evaluation covers the whole. According to current maintenance standards in China, the commonly used individual indicators are PCI, RQI, SSI, and SFC or BPN. Their roles are as follows: the SSI characterizes load-bearing capacity; the RQI represents ride smoothness; the SFC or BPN evaluates skid resistance; and the PCI quantifies the extent of visible surface distress. The entire indicator system and process flow are shown in Figure 1.
Figure 1. Pavement performance evaluation system.
As per the “Highway Technical Condition Evaluation Standard”, the assessment of asphalt pavements comprises five indicators: pavement distress, rutting, smoothness, pavement structural strength, and pavement skid resistance (Figure 2). Each indicator is normalized to a 0-100 scale from raw inspection data using standard methods, with higher scores indicating better condition. The individual results are then combined by weighting to form the PQI.
Figure 2. Highway technical condition indicators diagram.
Although PCI does not capture specific distress types or damage patterns in asphalt pavements, it serves as a comprehensive index of the overall level of road surface damage. Among the available pavement performance indicators, PCI was therefore selected as the target variable in this study and as the basis for preventive maintenance decision-making, because it directly reflects visible surface deterioration and is widely used in engineering practice for pavement condition assessment and maintenance prioritization. Its calculation method and rating standards comply with the “Highway Technical Condition Evaluation Standard”.
DR—Pavement overall damage rate;
Pavement distress condition evaluation standards.
Preventive maintenance standards for asphalt pavements of ordinary roads.
3. Methodology
3.1 Dataset description
Pavement inspection data of the expressway in Gansu Province, China. 23
Pavement inspection data of the ordinary highway in Guizhou Province, China. 24
A total of 62 valid sample records were formed by combining data from both regions, with PCI values ranging from 74.88 to 100, road ages spanning from 1 to 17 years, and AADT ranging from 4,484 to 37,731 vehicles per day. The distribution of road age, AADT, average annual temperature, annual precipitation, and annual relative humidity in relation to PCI is shown in Figure 3. The overall trend indicates a negative correlation between PCI and road age. Under the same road age conditions, ordinary roads with higher traffic volumes show a faster decline in PCI, while expressways, benefiting from better maintenance conditions, exhibit greater stability. This difference highlights the combined impact of regional environment, traffic load, and road grade on pavement performance. It provides multi-level data support for the subsequent development of predictive models based on SVM, BPNN, and PSO-SVM.
Figure 3. Data distribution of PCI and various influencing factors.
To further examine the comparability between the expressway and ordinary-road datasets, descriptive statistics and Mann–Whitney U tests were conducted for the main variables, including road age, AADT, climatic factors, and PCI. The results showed significant differences between the two road categories in all examined variables (p<0.05), indicating substantial heterogeneity in the pooled dataset. Therefore, the pooled analysis in this study should be interpreted primarily as a methodological comparison under mixed data conditions, rather than as evidence of full comparability between the two road classes.
3.2 Preprocessing
The raw data must be processed systematically before training machine learning models. Because the features differ significantly in range and units, feeding them directly into a model can slow convergence and degrade predictive performance. All indicators are therefore rescaled to a common scale, which makes the features comparable and improves the reliability and accuracy of model training.
Specifically, min-max normalization is used to linearly transform each original indicator to the [0,1] range. This transformation preserves the relative spacing of the values within each feature while removing the influence of unit differences, so that all features are dimensionless and contribute on the same scale. The normalization formula is as follows:
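Assuming the standard min-max form, with $x$ an original feature value and $x_{\min}$ and $x_{\max}$ the minimum and maximum of that feature (the symbols are chosen here for illustration), the transformation can be written as:

```latex
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \qquad x' \in [0, 1]
```

For example, under this form a road age of 9 years on the observed 1 to 17 year range maps to (9 - 1)/(17 - 1) = 0.5.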
Furthermore, the dataset was checked for missing values and potential outliers before model training. No missing entries were detected in the collected field inspection data. Pavement performance data inherently reflect varied service conditions, including differences in traffic loading, maintenance history, and environmental exposure, which may produce extreme values. Such extreme values may therefore reflect genuine engineering situations rather than measurement errors. Given the limited sample size (n=62), removing or indiscriminately adjusting extreme observations could lead to information loss and biased model estimation. Consequently, no explicit outlier removal was performed in this study. Min-max normalization was used to address scale differences among input features, and mean absolute error (MAE) was reported alongside root mean square error (RMSE), since MAE is less sensitive to isolated extreme values while RMSE emphasizes large errors.
In model training, it is standard practice to divide the dataset into a training set for model development and a test set for evaluation. The training set reveals underlying patterns and attributes within the data and refines the model’s parameters; the test set evaluates the model’s ability to predict new samples, hence assessing the accuracy of its predictions. In this study, the 62 pavement inspection records collected from an expressway in Gansu Province and an ordinary road in Guizhou Province were randomly partitioned in a 7:3 ratio while preserving sample integrity. Of the total, 44 samples were assigned to the training set and 18 to the test set. This split ensured sufficient data for model training and enabled an independent evaluation of predictive performance. Additionally, to reduce the uncertainty arising from the limited sample size, a 5-fold cross-validation method was employed as a supplementary validation strategy. The 70/30 train–test split was used to provide an intuitive evaluation of model performance on an independent test set, whereas 5-fold cross-validation was adopted to reduce the influence of a single random partition and to obtain a more stable assessment under small-sample conditions. Because these two validation strategies use different data partitioning mechanisms, some discrepancy in their results is expected.
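The two partitioning schemes described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's MATLAB implementation: the function names `holdout_split` and `k_fold_indices`, the random seed, and the use of plain index lists are assumptions made here.

```python
import random

def holdout_split(n_samples, train_size, seed=0):
    """Shuffle sample indices and return (train_idx, test_idx)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return idx[:train_size], idx[train_size:]

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) for each of k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    base, extra = divmod(n_samples, k)  # the first `extra` folds get one more sample
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = idx[start:start + size]
        train_idx = idx[:start] + idx[start + size:]
        start += size
        yield train_idx, test_idx

# 62 records: 44 for training and 18 for testing, as in the hold-out split
train_idx, test_idx = holdout_split(62, 44)

# Test-fold sizes for 5-fold cross-validation on the same 62 records
fold_sizes = [len(test) for _, test in k_fold_indices(62, k=5)]
```

With 62 records, each cross-validation test fold holds 12 or 13 samples, which is why per-fold metrics vary noticeably and are better reported as a mean with a standard deviation.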
3.3 Modeling methods
3.3.1 BP neural network
The BPNN is a typical feedforward artificial neural network consisting of an input layer, one or more hidden layers, and an output layer, and it can solve complex problems through nonlinear mapping. The PCI of asphalt pavement is affected by factors such as road age, traffic load, structural strength, climate, and pavement thickness. Traditional approaches such as empirical formulas or linear regression models often fail to capture the nonlinear relationships among these factors, whereas the BPNN can learn the relationships between the input variables and the PCI automatically through its multi-layer structure and nonlinear activation functions. However, BPNN training relies on gradient-based backpropagation, which may converge to a local rather than a global optimum. Moreover, because the established BPNN has a relatively large number of trainable parameters compared with the limited number of training samples (44 in this study), it is more susceptible to overfitting in the present small-sample setting, which may compromise its generalization performance on unseen data.
Based on the feature selection of factors influencing asphalt pavement performance, this paper selects five key indicators (road age, AADT, average annual temperature, annual precipitation, and annual relative humidity) as input variables for the BPNN, and establishes a three-layer structured prediction model, as shown in Figure 4. The input layer contains 5 neurons, and the output layer consists of 1 neuron, corresponding to the predicted value of the PCI.
Figure 4. BP neural network architecture diagram.
As a baseline model in this study, the BPNN parameters were selected using empirical formulas and repeated trials. For the established BPNN model, based on the empirical formula:
3.3.2 Support vector machine (SVM)
SVM is a widely used machine learning algorithm for regression analysis. Based on statistical learning theory,16,25 it can effectively handle complex nonlinear problems. The core idea of SVM is to use a kernel function to map data into a higher-dimensional feature space, thereby enabling nonlinear regression. For asphalt pavement performance prediction, particularly for the PCI, SVM can effectively capture the nonlinear relationships between pavement performance and influencing factors such as road age, traffic volume, and climate. This makes it more advantageous in prediction accuracy and applicability than traditional linear regression or empirical models. In this study, the radial basis function (RBF), polynomial, linear, and sigmoid kernels were considered, and the detailed procedure of the SVM prediction model is shown in Figure 5.
Figure 5. Flowchart of the SVM prediction model.
Given the strong nonlinearity of asphalt pavement data, the RBF kernel was selected for the SVM model. The RBF kernel has strong nonlinear mapping capabilities and requires few parameters, handling complex multidimensional data effectively and capturing the patterns of pavement performance change in a higher-dimensional space. The same input and output features as for the BPNN were used. The SVM regression model was implemented in MATLAB and its toolbox. As a baseline model in this study, its parameters were determined using conventional empirical settings and repeated trials, with the penalty factor c set to 10.0 and the kernel parameter γ set to 0.1.
3.3.3 Implementation process of PSO-Optimized SVM
The standard SVM was enhanced by applying PSO to globally optimize the penalty parameter c and RBF kernel parameter γ. PSO is a typical swarm intelligence evolutionary algorithm that simulates the process of group cooperation in nature, such as searching for food, to optimize complex functions. Unlike traditional gradient-based optimization methods, PSO does not rely on the differentiability of the objective function. Instead, it approximates the optimal solution through the interactions among multiple “particles” in the swarm. Each particle represents a candidate solution and has two attributes: position and velocity. During the iterative process, particles adjust their direction and movement magnitude based on their own historical best experience (personal best) and the best shared experience within the swarm (global best). This “dual memory mechanism” ensures a balance between global exploration and local exploitation. Meanwhile, the individual best and global best solutions are updated based on the fitness comparison of the current solution and the historical optimal solutions. The velocity update formula is as follows14,26:
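A commonly used inertia-weight form of the PSO update is given below as a sketch; the exact variant and notation in the cited sources may differ. Here $\omega$ is the inertia weight, $c_1$ and $c_2$ are the learning factors (not to be confused with the SVM penalty parameter c), $r_1$ and $r_2$ are random numbers drawn uniformly from [0,1], $p_i$ is the personal best position of particle $i$, and $g$ is the global best position:

```latex
v_i^{t+1} = \omega\, v_i^{t} + c_1 r_1 \left(p_i - x_i^{t}\right) + c_2 r_2 \left(g - x_i^{t}\right),
\qquad
x_i^{t+1} = x_i^{t} + v_i^{t+1}
```

In the PSO-SVM setting, each position $x_i$ is a candidate pair (c, γ).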
PSO is used to optimize the SVM’s parameters and thereby reduce the risk of overfitting or underfitting. It uses population-based search and iterative updates to find the best solution across a larger parameter space, improving the accuracy and speed of the SVM model’s predictions. The basic flowchart of the algorithm is shown in Figure 6.
Figure 6. Flowchart of the PSO-SVM prediction model.
The PSO algorithm is used to optimize the SVM model. The specific details are as follows: (1) Initialization: Set the particle swarm size, maximum number of iterations, learning factors, and other parameters. Randomly generate the initial positions and velocities of the particles. (2) Fitness Calculation: Substitute the particle’s corresponding parameters (c, γ) into the SVM model, calculate the prediction error as the fitness, and update the individual best and global best solutions. (3) Update Position and Velocity: Adjust the particle’s velocity and position based on the PSO update formula to generate new parameter combinations. (4) Termination Condition: If the maximum number of iterations is reached or the error meets the accuracy requirements, stop; otherwise, continue iterating. (5) Normalization: Normalize the input data to avoid the influence of dimensionality. (6) Train the Optimal Model: Use the optimal parameters obtained from PSO to train the SVM model and perform prediction and validation.
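The search loop in steps (1) to (4) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the study's MATLAB code: the function `pso_minimize`, its default settings (20 particles, 50 iterations, inertia weight 0.7, learning factors 1.5), the search bounds, and the placeholder `toy_fitness` are all hypothetical. In particular, `toy_fitness` is a made-up smooth surface standing in for the SVM prediction error at a given (c, γ); it does not train an SVM.

```python
import random

def pso_minimize(fitness, bounds, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over box `bounds` with a basic inertia-weight PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # (1) Initialization: random positions inside the bounds, zero velocities
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # (2) Fitness calculation and initial personal/global bests
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    # (4) Termination: stop after the maximum number of iterations
    for _ in range(n_iter):
        for i in range(n_particles):
            # (3) Update velocity and position, clamping to the bounds
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_fitness(params):
    """Hypothetical stand-in for the SVM prediction error at (c, gamma)."""
    c, gamma = params
    return ((c - 30.0) / 100.0) ** 2 + (gamma - 0.5) ** 2

# Assumed search ranges for c and gamma (illustrative only)
search_bounds = [(0.1, 100.0), (0.001, 1.0)]
best_params, best_error = pso_minimize(toy_fitness, search_bounds)
```

In an actual workflow, `toy_fitness` would be replaced by a function that trains an RBF-kernel SVM on normalized training data with the candidate (c, γ) and returns its prediction error, and the best pair returned would then be used to train the final model.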
PSO-SVM parameters.

Fitness curve diagram.
3.4 Model evaluation metrics
The coefficient of determination (R2) measures the goodness of fit of a regression model. It represents the proportion of the total variation in the dependent variable that the independent variables explain. Its value typically ranges from 0 to 1, although it can be negative on held-out data when the model predicts worse than the mean of the observations; values closer to 1 indicate a better model fit and a higher proportion of explained variation.27
The MAE is used to measure the average deviation of model predictions. It is calculated by averaging the absolute differences between the predicted and actual values for each sample, providing an intuitive error scale. Since MAE directly reflects the magnitude of the prediction bias, it is negatively correlated with prediction accuracy: the smaller the MAE, the lower the model’s deviation from the target variable, indicating better overall prediction performance. Compared to squared-error metrics, MAE is less sensitive to outliers, providing a more objective measure of the average error across most samples.
RMSE converts the prediction bias into a value with the same units as the original data, making it easier to interpret in practical applications. Compared to MAE, RMSE gives more weight to larger errors because it is calculated as the square root of the mean squared error, thereby highlighting the impact of prediction points that deviate significantly from the true values on the overall error. The lower the RMSE value, the higher the model’s prediction accuracy.
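The three metrics described above can be computed directly from their standard definitions, as in the following minimal sketch. The function names and the four PCI values are made up for illustration and are not taken from the study's data.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error (same units as the target)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Made-up PCI values for illustration only
y_true = [90.0, 92.0, 94.0, 96.0]
y_pred = [91.0, 92.0, 93.0, 98.0]

r2_value = r2_score(y_true, y_pred)
mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
```

For these values the absolute errors are 1, 0, 1, and 2, so the MAE is 1.0 while the RMSE is about 1.22: the single error of 2 points pulls the RMSE above the MAE, which illustrates the greater weight RMSE places on larger errors.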
3.5 Feature importance analysis based on Random Forest
An RF regression model was used to assess the relative importance of the selected input factors, thereby improving the interpretability of the PCI prediction task. The RF model used the same five predictors as the previous models (road age, AADT per lane, mean annual temperature, annual precipitation, and relative humidity), with PCI as the target variable. The RF was trained on the entire normalized dataset (n=62). The number of trees was set to 50 and the minimum leaf size to 1, a setting chosen to balance model flexibility and computational efficiency for this limited dataset.
Feature importance was calculated using a permutation-based method on the out-of-bag (OOB) samples. In this approach, one predictor is randomly permuted at a time while the others are kept unchanged, and the resulting increase in OOB prediction error is used to quantify its relative importance. Therefore, the reported importance values reflect the relative change in prediction error when each predictor is perturbed. These values should not be interpreted as percentage contributions, nor should they be regarded as evidence of causal relationships. Rather, they provide a relative ranking of predictor importance within the current dataset and modeling framework.
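The permutation idea can be illustrated with a small, model-agnostic sketch. It is a simplification: it permutes columns of a fixed evaluation set rather than the per-tree OOB samples of a random forest, and the function names, the toy predictor, and the data are all hypothetical.

```python
import random

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # permute one predictor, keep the others unchanged
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted_error = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted_error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def toy_predict(row):
    """Hypothetical model: uses feature 0 (e.g. age) and ignores feature 1."""
    return 100.0 - 1.5 * row[0]

# Made-up data: feature 0 drives the target, feature 1 is irrelevant
X = [[float(age), float(age % 3)] for age in range(1, 13)]
y = [toy_predict(row) for row in X]
scores = permutation_importance(toy_predict, X, y)
```

Here shuffling feature 0 raises the error while shuffling feature 1 leaves it unchanged, so the scores rank the features without expressing percentage contributions.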
To quantify the uncertainty of the RF-based feature importance ranking, bootstrap resampling was additionally performed. The RF model was repeatedly trained on 200 bootstrap samples, and the mean importance value and standard deviation of each predictor were calculated and presented with error bars. Because the present study is based on a limited sample size and restricted geographic coverage, and because correlations may exist among predictors, the RF-based importance ranking should be interpreted with caution. Accordingly, this analysis is intended to provide supportive interpretive evidence for PCI prediction rather than a definitive explanation of the causal mechanisms of pavement deterioration.
3.6 Model validation strategy in small-sample contexts
Because the number of pavement inspection samples is limited (n=62), a single train–test split may yield unreliable or skewed performance assessments. A 5-fold cross-validation approach was therefore used as a supplementary validation strategy to provide a more robust assessment of model performance in small-sample settings. In each fold, roughly 80% of the samples were allocated to model training and the remaining 20% to testing. The min-max normalization parameters for the [0,1] interval were fitted on the training subset only and then applied to the corresponding test subset, preventing information leakage between the training and test phases. Model performance was evaluated using the coefficient of determination (R2), MAE, and RMSE. The cross-validation outcomes are presented as mean ± standard deviation across the five folds, reflecting both the average predictive accuracy and the robustness of the models.
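The leakage-free scaling step within one fold can be sketched as follows. The helper names `fit_min_max` and `apply_min_max` and the sample rows (road age, AADT) are assumptions made for illustration, not the study's code or data.

```python
def fit_min_max(rows):
    """Compute per-feature minima and maxima from the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, mins, maxs):
    """Scale rows with previously fitted minima and maxima."""
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # guard constant features
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return scaled

# Made-up (road age, AADT) rows for one fold
train = [[1.0, 5000.0], [9.0, 20000.0], [17.0, 35000.0]]
test = [[5.0, 38000.0]]

mins, maxs = fit_min_max(train)           # statistics come from the training fold
train_scaled = apply_min_max(train, mins, maxs)
test_scaled = apply_min_max(test, mins, maxs)  # reuse them on the test fold
```

Because the test rows are scaled with training-fold statistics, a test value outside the training range (the AADT of 38,000 here) maps slightly outside [0,1]; this is expected and is the price of keeping test information out of the preprocessing.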
3.7 Sensitivity analysis of PSO hyperparameters
To further examine the reasonableness of the adopted PSO parameter settings, a simplified sensitivity analysis was conducted for the population size and the maximum number of iterations. In this analysis, one parameter was varied at a time while the remaining PSO parameters were kept unchanged. Considering the limited sample size and the practical scope of the present study, only two representative PSO hyperparameters were examined. For each parameter setting, the PSO-SVM model was repeatedly run under random train–test partitions, and the results were reported as mean ± standard deviation. The performance of PSO-SVM under different parameter settings was evaluated using the test-set R2, MAE, and RMSE.
4. Results and discussion
4.1 Model performance comparison
To evaluate the effectiveness of the PSO-optimized SVM model, this study compares the predictive performance of the baseline SVM, BPNN, and PSO-SVM models on both the training and test sets. The comparison is intended to evaluate the performance gain from optimization-enhanced modeling relative to commonly used baseline SVM and BPNN configurations, rather than to conduct a fully symmetric hyperparameter-tuning comparison across all candidate models. The training and testing results are shown in Figure 8. On the training set, the predictions of all three models follow a clear linear trend. The SVM and PSO-SVM predictions lie close to the diagonal, indicating that both models capture the central relationship between PCI and the input variables. The BPNN also performs well on the training set, though it slightly overestimates in the lower score range (approximately 80-87); overall, its error is small and it fits the training data well. The PSO-SVM model’s training performance is similar to that of the SVM, with the PSO-tuned parameters giving a slightly closer fit to the nonlinear relationships in the data.
Figure 8. Model training and testing.
On the test set, however, the SVM model’s prediction accuracy drops relative to the training set, and the scatter points lie farther from the diagonal. The predictions show considerable variance; notably, one sample with an actual value of about 97 is underestimated at about 85, so the model shows weaker predictive stability on the independent test set. The BPNN performs reasonably on the test set, with relatively small prediction errors overall and only a few points deviating from the reference line, indicating that it generalizes only partially to the test data. Compared with the SVM and BPNN, the PSO-SVM model performs better on the test set, with test points more concentrated and smaller differences between predicted and actual values. This indicates that PSO-SVM not only fits the training data well but also generalizes and remains stable when applied to new data. Therefore, the superior performance of PSO-SVM in this study should be interpreted primarily as evidence of the benefit of PSO-based hyperparameter optimization under small-sample conditions.
Overall, PSO-SVM provides more accurate and stable predictions for nonlinear PCI data and shows better comparative performance on the test set than the baseline models under the present small-sample conditions. While both BPNNs and SVMs can already fit the training set well, their generalization performance on test sets is relatively weak. PSO-SVM, because it has an optimization capability, not only fits the training data better but also provides more accurate predictions for new data, suggesting that the optimized SVM provides more stable predictive performance under the present validation setting.
The prediction performance of the BPNN, SVM, and PSO-SVM models under a single training–test split is presented in Figure 9 and evaluated in terms of R2, MAE, and RMSE. On the training set, the SVM achieves an R2 of 0.93, but on the test set it reaches only 0.75, indicating some overfitting. The BPNN achieves a training R2 of 0.90, which is relatively good; however, its test R2 decreases to 0.71, suggesting only moderate generalization ability. The PSO-SVM model achieves the best results, with R2 values of 0.95 on the training set and 0.84 on the test set, giving both the best fit and the smallest drop in R2 (0.11) among the three models.
Figure 9. Evaluation of model performance measures (R2, MAE, and RMSE) for SVM, BPNN, and PSO-SVM models on both training and testing datasets.
The baseline SVM model’s MAE was 1.02 on the training set and 1.62 on the test set, indicating a discernible drop in performance on unseen data. The BPNN showed MAEs of 1.24 and 1.58 on the training and test sets, respectively; although its train–test gap (0.34) is the smallest of the three models, its errors are higher than those of PSO-SVM on both sets. The PSO-SVM model recorded the lowest MAEs, 0.88 (training) and 1.30 (test), with a gap (0.42) smaller than that of the baseline SVM (0.60). These findings indicate PSO-SVM’s higher predictive accuracy and robustness relative to the baselines, especially for a small dataset.
In terms of RMSE, the SVM model records 1.35 on the training set but 3.02 on the test set, which implies a poor fit to the test data. The BPNN's RMSE is 1.63 on the training set and 3.19 on the test set, also a substantial error. The PSO-SVM model has an RMSE of 1.35 on the training set and 1.6 on the test set, the smallest test error and the most stable prediction of the three.
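For readers who wish to reproduce these measures, the sketch below computes R2, MAE, and RMSE from paired actual and predicted values using their standard definitions. The function name `regression_metrics` and the five sample values are illustrative assumptions; only the 97-versus-85 pair echoes the underestimated test sample described above, and the remaining values are made up.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, and RMSE for paired actual and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Made-up values: one large miss (97 predicted as 85) among small errors.
actual = [97, 92, 88, 90, 85]
predicted = [85, 91, 89, 90, 86]
metrics = regression_metrics(actual, predicted)
```

Because RMSE squares each residual before averaging, a single large miss raises it much more than it raises MAE. A test RMSE well above the test MAE, as reported for the SVM (3.02 versus 1.62), is therefore consistent with a few large errors rather than uniformly poor predictions.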
Performance comparison based on 5-fold cross-validation.
It should be noted that the results obtained from the 70/30 train–test split and the 5-fold cross-validation are not expected to be identical. The former is based on a single random partition and may therefore be more sensitive to the composition of the training and test subsets, especially under limited-sample conditions. By contrast, the 5-fold cross-validation results are obtained by averaging model performance across multiple folds, and thus provide a more robust estimate of predictive stability and generalization. The discrepancy between the two sets of results mainly reflects the sensitivity of small-sample learning to data partitioning, rather than inconsistency in the comparative conclusions.
According to the 5-fold cross-validation results, PSO-SVM attains the highest average R2 and the lowest MAE and RMSE under the small-sample condition, indicating superior predictive accuracy and generalization stability compared with the other models. Unlike the single-split results shown in Figure 9, Table 6 provides a more robust evaluation by summarizing both the mean performance and the variation across multiple folds. Furthermore, multi-fold validation reduces dependence on a single data partition, thereby offering a more reliable assessment of model performance in small-sample contexts. Therefore, in this study, the 5-fold cross-validation results serve as the primary basis for comparative interpretation, while the 70/30 train–test split provides complementary supporting evidence.
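The difference between the two validation schemes can be illustrated with a short sketch. It partitions 62 record indices (the sample size reported in the Conclusion) into a 70/30 hold-out split and into five folds, then summarizes a per-fold error as mean and standard deviation. The helper names, the rounding of the test-set size, the synthetic PCI values, and the training-mean predictor are assumptions for illustration only and do not reproduce the study's models or results.

```python
import random
import statistics


def holdout_split(n, test_fraction=0.3, seed=0):
    """Shuffle indices once and cut them into (train, test) lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def kfold_splits(n, k=5, seed=0):
    """Yield (train, test) index lists; each index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


n_records = 62
train_idx, test_idx = holdout_split(n_records)
splits = list(kfold_splits(n_records, k=5))

# Synthetic PCI values and a trivial training-mean predictor (placeholders).
rng = random.Random(1)
pci = [rng.uniform(70.0, 100.0) for _ in range(n_records)]
fold_mae = []
for tr, te in splits:
    train_mean = statistics.mean([pci[i] for i in tr])
    fold_mae.append(statistics.mean([abs(pci[i] - train_mean) for i in te]))

# Cross-validation reports the average and the spread across folds.
cv_mean, cv_sd = statistics.mean(fold_mae), statistics.stdev(fold_mae)
```

With 62 records, the hold-out estimate rests on a single test subset of about 19 samples, whereas 5-fold cross-validation tests every record once in folds of 12 or 13 samples and also exposes the fold-to-fold spread.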
4.2 Sensitivity analysis of PSO hyperparameters
Sensitivity analysis of population size in PSO-SVM.
Sensitivity analysis of maximum iterations in PSO-SVM.
4.3 Feature importance and influencing factors
According to the importance analysis of influencing factors shown in Figure 10, the PCI is affected by multiple factors. The factors presented in the figure are X1 (road age), X2 (annual average daily traffic, AADT), X3 (average annual temperature), X4 (annual precipitation), and X5 (annual relative humidity). Among these variables, road age has the highest relative importance, indicating that the RF model relies most on this factor when predicting PCI in the current dataset. AADT and average annual temperature also exhibit comparatively high importance values, suggesting that traffic loading and thermal conditions are closely associated with PCI variation. By contrast, annual precipitation and relative humidity are comparatively less important.
Random forest feature importance with error bars (mean ± SD based on 200 bootstrap resamples).
It should be noted that the RF importance values represent relative predictive relevance within the current dataset, rather than percentage contributions or causal effects. The error bars in Figure 10 show that road age remains the most consistently influential factor, while the importance values of the remaining variables display some overlap under bootstrap resampling. Therefore, the feature importance results should be interpreted mainly as an indication of broad relative importance rather than as a precise ranking of all predictors.
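The bootstrap logic behind such error bars can be sketched as follows. To keep the example self-contained, the absolute Pearson correlation is used as a simple stand-in importance score; it is not the Random Forest importance used in this study, and the data, variable names, and the overlap rule (intervals of mean ± SD) are hypothetical.

```python
import random
import statistics


def abs_pearson(x, y):
    """Absolute Pearson correlation (stand-in importance score)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def bootstrap_importance(columns, target, n_boot=200, seed=0):
    """Return {feature: (mean, SD)} of the score over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(target)
    samples = {name: [] for name in columns}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        y = [target[i] for i in idx]
        for name, col in columns.items():
            samples[name].append(abs_pearson([col[i] for i in idx], y))
    return {name: (statistics.mean(v), statistics.stdev(v))
            for name, v in samples.items()}


def overlaps(a, b):
    """True if the mean ± SD intervals of two features overlap."""
    (mean_a, sd_a), (mean_b, sd_b) = a, b
    return abs(mean_a - mean_b) <= sd_a + sd_b


# Synthetic data: PCI declines with age; humidity is unrelated noise.
rng = random.Random(42)
n = 62
age = [rng.uniform(1, 15) for _ in range(n)]
humidity = [rng.uniform(40, 80) for _ in range(n)]
pci = [100 - 1.5 * a + rng.gauss(0, 2) for a in age]
result = bootstrap_importance({"road_age": age, "humidity": humidity}, pci)
```

When two features' intervals overlap under this kind of check, their relative order is not well supported by the data, which is the reason for reading the ranking only in broad terms.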
From an engineering perspective, the results suggest that road age, traffic loading, and temperature are more strongly associated with PCI variation in the present dataset. By contrast, precipitation and relative humidity appear to have relatively lower importance. These findings may provide useful support for pavement condition assessment and maintenance prioritization under limited-data conditions.
5. Conclusion
This study utilized 62 pavement inspection records from an expressway in Gansu Province, China, and an ordinary road in Guizhou Province, China, to develop and compare three models: SVM, BPNN, and PSO-SVM. The objective was to predict the PCI using five readily available variables: road age, annual average daily traffic (AADT) per lane, mean annual temperature, annual precipitation, and relative humidity. The model’s performance was evaluated using a 70/30 hold-out test and 5-fold cross-validation to mitigate assessment uncertainty associated with a limited number of engineering samples. The primary conclusions are as follows:
(1) A uniform small-sample modeling workflow was developed by incorporating inspection, traffic, and climatic variables, and implementing normalization to reduce unit- and scale-induced bias, facilitating equitable comparisons among models trained on diverse inputs.
(2) Based primarily on 5-fold cross-validation results and supported by a 70/30 train–test split, PSO-SVM demonstrated superior overall accuracy and generalization compared with SVM and BPNN under the present small-sample conditions. This advantage mainly reflects the performance gain achieved through PSO-based hyperparameter optimization relative to the baseline model settings. In addition, the sensitivity analysis indicated that the adopted PSO settings were practically reasonable for the present study.
(3) Random Forest-based feature analysis suggests that road age, AADT, and average annual temperature are more strongly associated with PCI variation in the present dataset than precipitation and relative humidity. Given the limited sample size, restricted geographic scope, and pooled data from different road classes, these findings should be regarded as supportive interpretive evidence rather than definitive causal hierarchies.
(4) The proposed PSO-SVM framework demonstrates strong potential for PCI prediction under data-constrained conditions and provides a practical modeling approach for pavement performance assessment where inspection data are limited. The results highlight its ability to support decision-making by identifying segments with higher deterioration risk while maintaining robustness in small-sample settings.
(5) This study provides a structured methodological comparison for PCI prediction under small-sample conditions, but the results should be interpreted with caution because the pooled dataset includes both expressways and ordinary roads, and the compared models did not adopt fully symmetric hyperparameter-tuning strategies. Future research should expand the dataset, develop road-type-specific models, apply comparable tuning strategies across candidate models, and further examine model robustness using leave-one-out cross-validation or bootstrap-based validation methods, so as to improve model transferability and the rigor of comparative evaluation.
Supplemental material
Supplemental material for Pavement condition prediction under small-sample conditions using a particle swarm optimization-based support vector machine by Wenyuan Xu, Zehao Yang, Yongcheng Ji in Science Progress
Footnotes
Ethical considerations
This article does not contain any studies with human participants or animals performed by any of the authors.
Consent to participate
This research does not involve any human participants or animals.
Author contributions
Wenyuan Xu and Zehao Yang jointly designed the study and developed the methodology, implemented the model and performed the computational analysis, organized and visualized the results, and drafted and revised the manuscript.
Yongcheng Ji supervised the overall research, provided key technical guidance, and critically reviewed and revised the manuscript.
Ping Huang contributed to data acquisition and result validation, provided engineering context, and participated in manuscript review and editing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Research on the Development of a Technical System for the Evaluation of Road and Bridge Structural Conditions based on Damage Detection Results (HJK2023B009-5), Research and Application of Key Technologies for Smart Construction Site Monitoring and Inspection during Highway Construction (HJK2023B009) and Development Program of Heilongjiang (GZ2024009).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Data will be made available on request.
Supplemental material
Supplemental material for this article is available online.
References
