
Abstract

Developing a pavement resilience framework requires understanding and predicting the moisture content variation of the unbound layers, which significantly affects pavement response and performance. Utilizing data from the Long-Term Pavement Performance (LTPP) program, two models were developed and compared: a linear regression model and a random forest model. The linear regression model explained 59.5% of moisture content variation, identifying key factors such as maximum single-day precipitation, time since maximum rainfall, and depth of the water table, as well as surface and unbound layers’ material types. The random forest model demonstrated superior performance, explaining 92.3% of moisture content variation. A case study for an LTPP section in Texas demonstrated the models’ ability to simulate moisture distribution over time and depth after a significant rainfall event, providing insights into the drainage behavior of different pavement layers and subgrade materials. The case study also demonstrated the random forest model’s capability to capture different moisture behaviors after precipitation, which the linear model fails to account for. While detailed trends in subterranean layers’ moisture content level can be site-specific, coarser materials tend to handle the excessive water from rainfall better than finer ones, as they experience a shorter duration with elevated moisture concentration. While this was expected, the model allows this aspect to be quantified.

Keywords

pavement moisture, resilient modulus, flood resilience, machine learning

The response and performance of pavement infrastructure are critical for maintaining efficient and safe transportation networks, yet they are subjected to a variety of environmental stressors. As highlighted by Zapata et al. ( 1 ), key environmental factors including precipitation, temperature fluctuations, freeze–thaw cycles, and the depth of the water table interact with internal characteristics of the pavement, such as the materials’ susceptibility to moisture and freeze–thaw damage, the drainage capabilities of paving layers, and the pavement’s potential for water infiltration. Among the external factors, moisture and temperature stand out as the two most significant environmentally driven variables affecting the properties of the pavement layers and subgrade, ultimately affecting the pavement’s load-carrying capacity. Recent hurricanes in Texas (i.e., Harvey and Beryl) highlighted the significant effect of flooding on pavements. As flooding induces elevated moisture levels within pavements, understanding its impacts on pavement performance is becoming increasingly critical. Floods can significantly compromise the structural integrity and functional performance of pavements, leading to accelerated deterioration, reduced service life, and increased maintenance costs ( 2 ). Immediate physical damage to pavements occurs through reducing structural capacity with the saturation of subgrade and base layers, differential swelling in expansive subgrade soils, and stripping of asphalt binders ( 3 ). Moreover, the prolonged exposure to water can lead to long-term deterioration of pavement materials, reducing their service life and performance as the residual moisture content in pavement layers post-flooding can continue to affect pavement performance long after the floodwaters have receded ( 4 ).

Qiao et al. ( 5 ) emphasized that pavements are highly sensitive to climate variations. Climatic conditions directly influence their deterioration rate, maintenance needs, and lifecycle costs. According to Knott et al. ( 6 ), various climatic changes such as sea-level rise (SLR), SLR-induced groundwater rise, changes in season duration, seasonal average temperatures, the number of freeze–thaw cycles, and extreme temperatures significantly threaten pavements. The increasing frequency and intensity of flood events pose a significant and growing threat to road infrastructure worldwide. Flexible pavements, constituting the majority of road surfaces in the U.S.A., are particularly vulnerable to climate-induced stresses. Almeida and Picado-Santos ( 7 ) highlight asphalt pavements’ vulnerability to water, citing Pregnolato et al. ( 8 ), Sultana et al. ( 9 ), Lu et al. ( 10 ), Cong et al. ( 11 ), Asadi et al. ( 12 ), and Mateos et al. ( 13 ), to demonstrate the weakening effect on flexible pavement structures by moisture intrusion. Gudipudi et al. ( 14 ) found that climate change could increase asphalt pavement fatigue cracking by 2%–9% and rutting by 9%–40% over a 20-year period. As climate change projections indicate an increased likelihood of intense precipitation events in many regions ( 15 ), a better understanding and quantification of pavement flooding and its impact is crucial for developing more resilient pavement designs, implementing effective maintenance strategies, and ensuring the long-term sustainability of road infrastructure in flood-prone areas ( 8 ). Khan et al. ( 16 ) conducted a case study on the effect of extreme moisture intrusion in untreated layers on pavement performance, highlighting the detrimental effects of flooding on pavement structure.

Despite the critical nature of this issue, there remains a gap in our understanding of how flood exposure affects pavement performance over time, particularly in the context of changing climate patterns. This knowledge gap hinders the development of effective strategies for enhancing the resilience of road infrastructure in flood-prone areas ( 17 ). As such, comprehensive research is needed to quantify the impacts of increasing flood frequency on road infrastructure and inform adaptive design and maintenance practices. As highlighted by Hedayati and Hossain ( 18 ), variations in subgrade moisture content significantly alter geotechnical properties, such as the resilient modulus and shear strength, leading to premature pavement failure. Predicting these variations helps optimize drainage system design and material selection to reduce moisture retention and improve long-term serviceability. Qiao et al. ( 19 ) acknowledged the risk of increased lifecycle costs caused by cumulative damage from repeated flood inundation, which can occur when the pavement is inadequately designed. Knowledge of the probability of exposure to elevated moisture content levels provides critical insights early in the design process, enabling the development of more informed and resilient pavement designs. By understanding potential moisture-related risks, engineers can proactively incorporate drainage systems, select appropriate materials, and implement structural adjustments that mitigate moisture-induced damage.

One critical aspect of moisture impact on pavement performance is the volumetric moisture content (VMC) of the subgrade and other unbound layers, such as the base and subbase. Excessive moisture can weaken these layers, reducing shear strength and increasing deformation, and ultimately causing premature pavement failure ( 20 ). Several factors contribute to the variability of moisture content within pavement layers, including the soil type, drainage conditions, climatic conditions, and presence of groundwater.

Literature Review

Moisture can enter the pavement through surface infiltration, via joints and cracks, and by moving in subsurface regions ( 21 ). The presence of subsurface moisture may be because of high water tables, interrupted groundwater systems, subsurface flow, and capillary action. This excessive moisture can lead to a reduction in the shear strength of unbound materials (leading to the reduction of the resilient modulus), weak layers from the migration of fines, frost heave, reduced strength during frost melt, durability cracking, loss of support from pumping fines, and asphalt stripping. One of the appendices to the Guide for Mechanistic-Empirical Design of New and Rehabilitated Pavement Structures explored a variety of models seeking to improve moisture prediction accuracy for the integrated climate model ( 22 ), as summarized in Table 1.

Table 1.

Summary of Soil Water Retention Models (22)

Reference	Equation	Unknowns
Fredlund and Xing ( 23 )	$θ_{w} = C (h) \times [\frac{θ_{sat}}{{\{\ln [\exp (1) + {(\frac{h}{a})}^{b}]\}}^{c}}]$	$θ_{w}$ = volumetric water content; $h$ = soil matric suction in kPa; $a$ = a soil parameter that is primarily a function of the air-entry value of the soil in kPa; $b$ = a soil parameter that is primarily a function of the rate of water extraction from the soil, once the air-entry value has been exceeded; $c$ = a soil parameter that is primarily a function of the residual water content; $h_{r}$ = a soil parameter that is primarily a function of the suction at which residual water content occurs in kPa
Fredlund and Xing ( 23 )	$C (h) = [1 - \frac{\ln (1 + \frac{h}{h_{r}})}{\ln (1 + \frac{10^{6}}{h_{r}})}]$
van Genuchten ( 24 )	$θ_{w} = θ_{r} + \frac{θ_{s} - θ_{r}}{{[1 + {(\frac{h}{a})}^{b}]}^{c}}$	$θ_{r}$ = residual volumetric water content; $a$ = a soil parameter that is primarily a function of the air-entry value of the soil in kPa; $b$ = a soil parameter that is primarily a function of the rate of water extraction from the soil, once the air-entry value has been exceeded; $c$ = a soil parameter that is primarily a function of the residual water content
McKee and Bumb ( 25 )	$θ_{w} = θ_{r} + \frac{θ_{s} - θ_{r}}{1 + \exp [\frac{(h - a)}{b}]}$	$θ_{r}$ = residual volumetric water content; $a$ = curve-fitting parameter; $b$ = curve-fitting parameter
van Genuchten ( 24 ) and Mualem ( 26 )	$θ_{w} = θ_{r} + \frac{θ_{s} - θ_{r}}{{[1 + {(\frac{h}{a})}^{b_{m}}]}^{(1 - \frac{1}{b_{m}})}}$	$θ_{r}$ = residual volumetric water content; $a$ = a soil parameter that is primarily a function of the air-entry value of the soil in kPa; $b_{m}$ = a soil parameter that controls the slope at the inflection point in the soil–water characteristic curve
van Genuchten ( 24 ) and Burdine ( 27 )	$θ_{w} = θ_{r} + \frac{θ_{s} - θ_{r}}{{[1 + {(\frac{h}{a})}^{b}]}^{(1 - \frac{2}{b})}}$	$θ_{r}$ = residual volumetric water content; $a$ = a soil parameter that is primarily a function of the air-entry value of the soil in kPa; $b$ = a soil parameter that is primarily a function of the rate of water extraction from the soil, once the air-entry value has been exceeded
Gardner ( 28 )	$θ_{w} = θ_{r} + \frac{θ_{s} - θ_{r}}{1 + {(\frac{h}{a})}^{b}}$	$θ_{r}$ = residual volumetric water content; $a$ = a soil parameter that is primarily a function of the air-entry value of the soil in kPa; $b$ = a soil parameter that is primarily a function of the rate of water extraction from the soil, once the air-entry value has been exceeded
Brooks and Corey ( 29 )	$θ_{w} = θ_{r} + (θ_{s} - θ_{r}) {(\frac{a_{b}}{h})}^{b_{b}}$	$θ_{r}$ = residual volumetric water content; $a_{b}$ = bubbling pressure in kPa; $b_{b}$ = pore size index
Williams et al. ( 30 )	$\ln Θ_{e} = A + B \ln h$	$A$ = fitting parameter; $B$ = fitting parameter
Farrel and Larson ( 31 )	$h = {(u_{a} - u_{w})}_{b} \exp [α (θ_{s} - θ_{w})]$	$α$ = empirical constant; ${(u_{a} - u_{w})}_{b}$ = air-entry value; $Ψ$ = capillary head
Assouline et al. ( 32 )	$θ_{w} = θ_{L} + (θ_{s} - θ_{L}) [1 - \exp [- ξ {(\frac{1}{Ψ} - \frac{1}{Ψ_{L}})}^{η}]]$	$Ψ_{L}$ = capillary head that corresponds to a very low water content, at which the hydraulic conductivity is negligible; $θ_{L}$ = volumetric water content at capillary head $Ψ_{L}$; $η$ = fitting parameter; $ξ$ = fitting parameter
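
To illustrate how the retention models in Table 1 are evaluated, the following minimal Python sketch implements the three-parameter van Genuchten form and the Fredlund and Xing form, with Euler's number inside the logarithm as in the original Fredlund and Xing formulation. The function names and all parameter values are hypothetical and chosen only for illustration; they are not calibrated values from the cited studies.

```python
import math


def van_genuchten(h, theta_r, theta_s, a, b, c):
    """Volumetric water content from the three-parameter van Genuchten form.

    h and a are suctions in kPa; theta_r and theta_s are the residual and
    saturated volumetric water contents.
    """
    return theta_r + (theta_s - theta_r) / (1.0 + (h / a) ** b) ** c


def fredlund_xing(h, theta_sat, a, b, c, h_r):
    """Volumetric water content from the Fredlund and Xing form."""
    # Correction factor C(h): equals 1 at zero suction and 0 at 1e6 kPa.
    correction = 1.0 - math.log(1.0 + h / h_r) / math.log(1.0 + 1.0e6 / h_r)
    return correction * theta_sat / math.log(math.e + (h / a) ** b) ** c


# Hypothetical parameters for illustration only (not calibrated to any soil).
params_vg = dict(theta_r=0.05, theta_s=0.40, a=10.0, b=2.0, c=0.5)
params_fx = dict(theta_sat=0.40, a=10.0, b=2.0, c=1.0, h_r=1500.0)

for suction in (0.0, 1.0, 10.0, 100.0, 1000.0):
    print(suction,
          round(van_genuchten(suction, **params_vg), 4),
          round(fredlund_xing(suction, **params_fx), 4))
```

Both curves start at the saturated water content at zero suction and decrease as suction grows, which is the sigmoidal behavior the table's models are meant to capture.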

Johari et al. ( 33 ) highlighted the importance of the soil–water characteristic curve (SWCC), a sigmoidal function describing how the soil’s water storage capacity varies with suction, in understanding the mechanical behavior of unsaturated soils. The authors classified existing models into four categories: a combination of regression and curve-fitting based on soil properties and porosity ( 34 , 35 ); analytical regression equations with basic soil properties ( 36 , 37 ); physico-empirical modeling ( 38 , 39 ); and artificial intelligence methods. As the last approach had still largely not been attempted at the time the study was conducted, the authors proposed the genetic programming (GP) approach, taking as inputs a list of basic soil properties including the initial void ratio, initial gravimetric water content, logarithm of suction normalized by atmospheric air pressure, clay content, and silt content. The model was developed with a database comprised of 186 pressure plate test results spanning a range of soil properties, such as the void ratio, suction, specific gravity, water content, and grain size distribution.

Hedayati and Hossain ( 18 ) conducted a case study to analyze real-time subgrade moisture variation as a function of precipitation. The site was chosen along a low-volume hot-mix asphalt (HMA) pavement section in a rural area in North Texas, where the moisture sensor and rain gauge were installed. The data collection process lasted over 2 years, monitoring hourly precipitation and VMC measurement at different soil depths along the pavement centerline. A pattern with combined seasonal and temporal variation was identified. While seasonal variation accounted for limited contribution within ±5% of the average value, temporary increase in moisture content caused by rainfall was as high as 12%.

Huang et al. ( 40 ) analyzed the seasonal variation of moisture content in the pavement subgrade, applying machine learning models to predict the VMC along the Integrated Road Research Facility test road in Edmonton, Alberta. Using only three input variables (pavement temperature, day of the year, and depth), the machine learning models achieved significantly higher accuracy than traditional models.

Mousavi et al. ( 41 ) developed a system dynamics model (SDM) to study the performance of flexible pavement systems under moisture variations in real-time. Among the three components of the SDM, the first is a hydrological structure that acquires current or forecasted climatic conditions and soil hydraulic properties to model water flow in unsaturated and saturated pavement layers. The model simulates moisture flux in and out of the pavement layers caused by various factors, such as precipitation, evaporation, surface water flow, and groundwater level (GWL) fluctuations, providing a time-dependent water content profile at various depths in pavement layers as the output. Richards’ ( 42 ) unsaturated flow equation was used to estimate water flux in and out of unsaturated subgrade layers:

\frac{\partial θ}{\partial t} = \frac{\partial}{\partial z} [K (θ) (\frac{\partial h}{\partial z} + 1)]

(1)

where $θ$ is the volumetric water content (VWC), $z$ is depth, $h$ is the soil pressure head, $t$ is time, and $K (θ)$ is the moisture-dependent hydraulic conductivity.

Van Genuchten’s ( 24 ) soil water retention curve (SWRC) formula was used to obtain the initial VWC profile in subgrade soil, and calculation of moisture-dependent hydraulic conductivity at each soil layer was according to Mualem ( 26 ).
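
The source does not reproduce the conductivity expression itself. As a hedged sketch, the commonly used closed-form Mualem and van Genuchten relation is assumed below, written with the $a$ and $b_{m}$ parameters of Table 1. The saturated conductivity `k_sat`, the function names, and the numerical values are assumptions introduced for illustration, not values from Mousavi et al.

```python
def effective_saturation(h, a, b_m):
    """Effective saturation for the van Genuchten and Mualem form of Table 1."""
    m = 1.0 - 1.0 / b_m
    return (1.0 + (h / a) ** b_m) ** (-m)


def mualem_conductivity(h, k_sat, a, b_m):
    """Moisture-dependent hydraulic conductivity (assumed standard closed form).

    K = k_sat * Se**0.5 * (1 - (1 - Se**(1/m))**m)**2, with m = 1 - 1/b_m.
    """
    m = 1.0 - 1.0 / b_m
    s_e = effective_saturation(h, a, b_m)
    return k_sat * s_e ** 0.5 * (1.0 - (1.0 - s_e ** (1.0 / m)) ** m) ** 2


# Hypothetical soil: k_sat in m/day, a in kPa (illustration only).
for suction in (0.0, 10.0, 100.0):
    print(suction, mualem_conductivity(suction, k_sat=0.5, a=10.0, b_m=2.0))
```

At zero suction the expression returns the saturated conductivity, and it falls off rapidly as the soil dries, which is why the conductivity must be updated at every step of a Richards-type simulation.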

Qiao et al. ( 19 ) applied a finite element model (FEM) to simulate water movement after flooding events in three different pavements under three flood scenarios. The accuracy of the FEM was validated by comparison with laboratory testing, as it captured post-flooding weakening and recovery behavior in pavements. The recovery phase, whose start is typically marked by a sudden drop in VWC or an increase in layer stiffness, was found to last much longer than the weakening period, primarily because of the moisture hysteresis of the soil materials. Pavements built on a sandy subgrade accumulate damage more quickly but experience much less total damage than those built on silty or clayey soils. During the weakening phase, the base layer was found to lose more stiffness than any of the subgrade materials.

Sun et al. ( 43 ) conducted finite element method based hydraulic simulations considering different pavement structures, soil subgrade types, groundwater table (GWT) levels, and flooding scenarios. The study generated saturation profiles within the pavement during inundation events, extracted the VWC within the entire pavement structure, analyzed its changing patterns over time, and developed a predictive model for forecasting the short-term temporal evolution of saturation levels within pavement structures during flooding events. The study found that a second-degree polynomial function was most suitable to describe the water content changing pattern within the vadose zone:

VWC = β_{0} + β_{1} D + β_{2} t + β_{3} Dt + β_{4} D^{2} + β_{5} D^{2} t

(2)

where $D$ is depth, $t$ is simulation time, and $β_{i}$ are the fitting parameters.
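
Equation 2 can be evaluated directly once the six fitting parameters are known. The short Python sketch below does so; the function name and the coefficient values are hypothetical placeholders, since the fitted values from Sun et al. are not reproduced here.

```python
def vwc_polynomial(depth, time, beta):
    """Evaluate Equation 2 for a depth D and simulation time t.

    beta is a sequence (beta_0, ..., beta_5) of fitting parameters.
    """
    b0, b1, b2, b3, b4, b5 = beta
    return (b0 + b1 * depth + b2 * time + b3 * depth * time
            + b4 * depth ** 2 + b5 * depth ** 2 * time)


# Placeholder coefficients for illustration only (not fitted values).
beta_example = (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
print(vwc_polynomial(depth=2.0, time=3.0, beta=beta_example))
```

The model is quadratic in depth but only linear in time, so at a fixed depth the predicted water content changes at a constant rate.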

The inputs taken by the model include asphalt concrete (AC) layer thickness (in.), base layer thickness (in.), GWT depth from the pavement surface (in.), subgrade type in American Association of State Highway and Transportation Officials (AASHTO) classification, and peak inundation time (h). The model also allows the user to select the location at either the pavement edge or the wheelpath. The outputs generated are the prediction of time-descriptive indicators (peak saturation time and restoration time), as well as VWC curve versus depth at each defined time point.

While extensive studies have been devoted to understanding soil moisture content variation both within and beyond the context of pavement performance modeling, most of the developed models either take detailed inputs such as laboratory tested properties that are not necessarily available in situ and largely rely on assumptions, or have limited applicability because of calibration at specific sites. This knowledge gap thus motivates the need for a methodology to translate generic, readily available site information into temporal and spatial soil moisture distribution with sufficient accuracy.

Data Description

Established in 1987, the Long-Term Pavement Performance (LTPP) database offers a comprehensive long-term record of pavement performance across diverse climatic regions and under various loading conditions ( 44 ). This extensive dataset enables practitioners and researchers to study the effects of environmental factors, including precipitation and flooding, on pavement deterioration over time ( 45 ), allowing for the analysis of these factors by providing detailed information on subgrade soil classification, measured moisture content at various depths, and corresponding environmental conditions, such as temperature and precipitation. The LTPP Seasonal Monitoring Program (SMP) has extensively utilized time domain reflectometry (TDR) for in situ moisture content measurements, providing valuable data for modeling efforts ( 46 ). Apart from moisture content in the unbound layers, data collection is also devoted to keeping track of the subsurface temperature, GWT depth, air temperature, and precipitation, among others.

SMP Dataset Preparation

For this study, climate, moisture, and material data were extracted from the LTPP InfoPave Standard Data Release (SDR) version 37 available since August 2023 ( 47 ). The daily average air temperature and total daily precipitation stored in the SMP dataset "SMP_ATEMP_RAIN_DAY" were used as climate data. "SMP_TDR_AUTO_MOISTURE," also part of the SMP dataset, supplied the moisture content data. Although moisture content is available as both gravimetric and volumetric measurements, this analysis was limited to the VMC. Most of the sections have 10 or slightly fewer TDR probes installed for moisture content measurement at subterranean depths ranging from 0.15 to 2.35 m. To monitor moisture fluctuation over time while ensuring the relevance of the temperature and precipitation information used, only climatic conditions observed within the 30 days preceding each moisture reading were kept. GWT depth data were also obtained from the SMP dataset and filtered to keep only readings taken no more than 3 days before the corresponding moisture measurement. After merging and filtering, the available data covered a total of 32 LTPP sections across 24 states and provinces, with details listed in Table 2.

Table 2.

Section ID by Number of Days with Climate Measurement

Unique ID	State/province	Moisture measurement days	TDR count
50-1002	Vermont	47	6
27-6251	Minnesota	36	8
23-1026	Maine	30	10
89-3015	Quebec	28	10
33-1001	New Hampshire	26	10
09-1803	Connecticut	24	10
90-6405	Saskatchewan	21	10
37-1028	North Carolina	19	10
48-3739	Texas	19	10
48-4143	Texas	19	9
24-1634	Maryland	18	10
36-4018	New York	17	10
08-1053	Colorado	16	9
83-1801	Manitoba	16	10
83-3802	Manitoba	16	10
48-1068	Texas	14	3
40-4165	Oklahoma	12	10
31-3018	Nebraska	11	10
48-4142	Texas	10	10
16-1010	Idaho	7	10
06-3042	California	5	10
28-1802	Mississippi	5	10
46-9187	South Dakota	5	10
49-3011	Utah	4	8
13-1005	Georgia	2	9
13-1031	Georgia	2	7
30-8129	Montana	2	9
48-1077	Texas	2	10
49-1001	Utah	2	7
28-1016	Mississippi	1	8
42-1606	Pennsylvania	1	10
56-1007	Wyoming	1	10

Note: TDR = time domain reflectometry.

Compilation of Subgrade and Other Unbound Layer Information

Subgrade properties are stored in the “SUBGRADE_LAYER_PROP_EXP” table ( 47 ). This table contains information including the plasticity indices, soil classification, soil strength, laboratory moisture–density relationships, in situ properties, soil suction, expansion index, frost susceptibility, and key gradation properties ( 48 ). Analogously, "UNBOUND_LAYER_PROP_EXP" contains similar material information for the unbound or stabilized base or subbase layers. Per the LTPP definition, the "LAYER_NO" field is a "unique sequential number assigned to pavement layers, starting with layer 1 as the deepest layer (subgrade)" ( 47 ). For sections containing multiple AASHTO soil classifications for the same layer, each entry is weighted by the inverse of the number of classes. Joined by the unique layer sequential number, the material properties are combined with the layer thickness information stored in the "TST_L05B" table. The cumulative thickness from the top down translates into the depth of each layer below the surface. From these depths, it can be deduced which layer each TDR probe is measuring and which layer the water table has reached. It should be noted that the LTPP database stores layer thickness in inches, while the TDR probe depth and GWT depth are stored as distances from the surface in meters. Conversion to SI units was conducted to ensure consistency. The final compiled dataset for model estimation contained 6774 entries.
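
The depth bookkeeping described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the example thicknesses, and the choice to treat a layer with no recorded thickness (typically the subgrade) as extending indefinitely downward are all assumptions.

```python
INCH_TO_M = 0.0254


def layer_depth_ranges(layers):
    """Convert layer thicknesses to depth ranges below the surface.

    layers: list of (layer_no, thickness_in); layer 1 is the deepest layer.
    Returns (layer_no, top_m, bottom_m) tuples ordered from the surface down.
    """
    ordered = sorted(layers, key=lambda item: item[0], reverse=True)
    ranges = []
    top = 0.0
    for layer_no, thickness_in in ordered:
        # Assumption: a missing thickness means the layer is semi-infinite.
        if thickness_in is None:
            bottom = float("inf")
        else:
            bottom = top + thickness_in * INCH_TO_M
        ranges.append((layer_no, top, bottom))
        top = bottom
    return ranges


def layer_at_depth(ranges, depth_m):
    """Return the layer number containing depth_m, or None if not found."""
    for layer_no, top, bottom in ranges:
        if top <= depth_m < bottom:
            return layer_no
    return None


# Hypothetical section: 2 in. surface, 10 in. base, 8 in. subbase, subgrade.
example_layers = [(1, None), (2, 8.0), (3, 10.0), (4, 2.0)]
example_ranges = layer_depth_ranges(example_layers)
print(layer_at_depth(example_ranges, 0.15))  # probe depth in meters
```

The same lookup applies to the GWT depth, which identifies the layer the water table has reached on a given day.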

Linear Regression

As one of the simplest and most widely used statistical methods for modeling, an ordinary least squares (OLS) linear regression approach was first attempted. While VMC is used as the dependent variable, the independent variables include numerical factors such as precipitation, days elapsed, and average daily air temperature, as well as categorical factors such as the surface type, soil type, and layer measured:

y = β_{0} + β_{1} x_{1} + \dots + β_{i} x_{i} + β_{j} x_{j} + \dots + β_{k} x_{k} + ε

where $y$ is the VMC, $β_{0}$ is the intercept, $β_{1}$ to $β_{i}$ are the coefficients of the numerical variables, $x_{1}$ to $x_{i}$ are the numerical variables (precipitation, days elapsed, average daily air temperature, water table depth, TDR depth), $β_{j}$ to $β_{k}$ are the coefficients of the categorical variables, $x_{j}$ to $x_{k}$ are the categorical variables (surface type, unbound material soil type, layer measured), and $ε$ is the error term.

Table 3 shows the numerical variables used in the model with some descriptive statistics on their distribution. Figure 1 illustrates the correlations among the independent variables. Notably, all pairwise correlations have absolute values below 0.15, suggesting minimal collinearity and supporting the appropriateness of proceeding with linear regression.

Table 3.

Descriptive Statistics of the Variables Used for the Linear Model

Variable name	Min.	1st Q	Median	Mean	3rd Q	Max.	SD
Volumetric moisture content (%)	1.50	10.80	19.70	20.50	28.00	49.70	11.50
Max. single-day rainfall (mm)	0.10	9.65	17.75	22.54	30.15	144.40	19.51
Days since max. rainfall	0.27	6.43	13.72	14.25	21.69	29.66	8.87
Average daily air temperature (°C)	−26.70	3.30	10.70	10.39	18.30	30.40	10.41
Water table depth (m)	0.73	1.92	2.83	2.74	3.48	4.73	0.98
TDR depth (m)	0.15	0.68	1.09	1.10	1.44	2.35	0.52

Note: TDR = time domain reflectometry; Min. = minimum; Max. = maximum; Q = quartile; SD = standard deviation.

Figure 1.

Correlation across numeric variables as input to the linear regression model.

The regression results are shown in Table 4. The selected base case was a section with an HMA surface layer and a gravel subgrade. The model can explain 59.5% of the variation in moisture content, with a residual standard error of 6.4 on 6760 degrees of freedom. Two significant variables are (i) the maximum single-day precipitation within the 30 days before the moisture measurement was taken and (ii) the number of days elapsed between that maximum rainfall event and the day the moisture was measured. These two variables were calculated as follows: record the measurement date for each moisture reading, extract 30 consecutive days of daily precipitation data leading up to it, identify the maximum daily rainfall within the 30 days, and then calculate the number of days from that day until the date of moisture measurement. After these two variables are accounted for, other precipitation indicators, such as daily or monthly cumulative rainfall, were no longer significant at the 5% level and thus were eliminated from the model. These two variables describe the moisture input from rainwater accumulated within a relatively short timeframe and the draining process that ensues.
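
The four-step calculation of the two rainfall variables can be sketched in Python as below. The function name and the sample precipitation record are hypothetical. The sketch also assumes that the 30-day window excludes the measurement day itself and that elapsed time is counted in whole days, whereas the statistics in Table 3 indicate that the study used fractional days.

```python
from datetime import date, timedelta


def rainfall_features(measure_date, daily_precip_mm, window_days=30):
    """Return (max single-day rainfall, days since that day), or None.

    daily_precip_mm maps a date to the total precipitation (mm) on that day.
    Only the window_days days before measure_date are considered.
    """
    window = [measure_date - timedelta(days=k)
              for k in range(1, window_days + 1)]
    observed = [(daily_precip_mm[d], d) for d in window if d in daily_precip_mm]
    if not observed:
        return None
    # Tuples compare by rainfall first, then date, so ties go to the latest day.
    max_mm, max_day = max(observed)
    return max_mm, (measure_date - max_day).days


# Hypothetical daily record (mm) around a moisture reading on October 20, 2004.
precip = {
    date(2004, 9, 1): 60.0,    # outside the 30-day window, ignored
    date(2004, 10, 4): 39.0,
    date(2004, 10, 10): 5.0,
    date(2004, 10, 19): 1.2,
}
print(rainfall_features(date(2004, 10, 20), precip))
```

Because the window moves with each moisture reading, a large storm stops influencing the features once it is more than 30 days old, as the September entry in the example shows.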

Table 4.

Regression Results

Predictor variable	Estimate	SE	t value
(Intercept)	2.975	0.514	5.79
Max. single-day rainfall (mm)	0.043	0.005	9.09
Days since max. rainfall	−0.061	0.010	−6.08
Average daily air temperature (°C)	0.074	0.009	8.06
Water table depth (m)	−0.939	0.104	−9.02
Fine sand	7.357	0.374	19.66
Silty or clayey gravel sand	10.409	0.367	28.36
Silty soils	16.206	0.345	46.95
Clayey soils	28.939	0.401	72.11
Portland cement concrete	−2.274	0.223	−10.21
Surface treatment	−5.066	0.388	−13.06
TDR depth (m)	5.650	0.200	28.27
Subbase layer	9.627	0.412	23.35
Base layer	12.001	0.372	32.28

Note: TDR = time domain reflectometry; SE = standard error; Max. = maximum.

The coefficients suggest that after a rainy day with 22.5 mm of precipitation, the mean value observed from the data shown in Table 3, it takes approximately 15.9 days for a subsurface location to return to its pre-event moisture level on average, assuming all other conditions remain constant. This was consistent with the 16-day draining period concluded by Ismail et al. ( 49 ). The water table depth, per the LTPP definition, is the distance of the water table from the pavement surface. The negative coefficient indicates that the farther the water table retreats below the surface, the lower the moisture content measured at a specific depth is expected to be, with an average 0.94% drop in VMC as the water table drops by 1.0 m farther from the surface.
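
The 15.9-day figure follows directly from the two rainfall coefficients in Table 4: the moisture added by the rainfall term is offset once the days-elapsed term has grown to the same magnitude. Writing $β_{rain}$ and $β_{days}$ for those two coefficients and $P_{max}$ for the maximum single-day rainfall (symbols introduced here for illustration only):

```latex
t_{drain} = \frac{β_{rain} P_{max}}{| β_{days} |} = \frac{0.043 \times 22.5}{0.061} \approx 15.9 \text{ days}
```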

It can also be seen that, compared to the coarsest gravel and sand (A-1-a and A-1-b), subgrades of all other types of materials tend to hold more moisture when other conditions are held constant, especially the clayey soils (A-6, A-7-5, and A-7-6), demonstrating a general trend of an increasing tendency to hold moisture as the unbound layer material becomes finer. Compared to sections with HMA surface layers subjected to the same conditions, those with portland cement concrete (PCC) and surface treatments have an expected 2.27% and 5.07% lower moisture content, respectively, at a fixed depth below. Although the base and subbase layers tend to experience a higher moisture content when other factors are held constant, these layers are located at shallower depths, which counterbalances the overall expected moisture content level.
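
As a worked illustration of how the coefficients in Table 4 combine, the sketch below evaluates the fitted linear model for a given set of inputs. The dictionary keys, category labels, and function name are assumptions made for this example; the base case (HMA surface, gravel, subgrade layer) carries a zero offset.

```python
# Coefficients copied from Table 4; key names are illustrative assumptions.
NUMERIC_COEF = {
    "max_rain_mm": 0.043,
    "days_since_max_rain": -0.061,
    "air_temp_c": 0.074,
    "water_table_depth_m": -0.939,
    "tdr_depth_m": 5.650,
}
INTERCEPT = 2.975
SOIL_OFFSET = {
    "gravel": 0.0,  # base case
    "fine sand": 7.357,
    "silty or clayey gravel sand": 10.409,
    "silty": 16.206,
    "clayey": 28.939,
}
SURFACE_OFFSET = {"HMA": 0.0, "PCC": -2.274, "surface treatment": -5.066}
LAYER_OFFSET = {"subgrade": 0.0, "subbase": 9.627, "base": 12.001}


def predict_vmc(numeric, soil="gravel", surface="HMA", layer="subgrade"):
    """Predicted volumetric moisture content (%) from the linear model."""
    total = INTERCEPT
    for name, coef in NUMERIC_COEF.items():
        total += coef * numeric[name]
    return total + SOIL_OFFSET[soil] + SURFACE_OFFSET[surface] + LAYER_OFFSET[layer]


# Mean values of the numerical variables from Table 3.
mean_inputs = {
    "max_rain_mm": 22.54,
    "days_since_max_rain": 14.25,
    "air_temp_c": 10.39,
    "water_table_depth_m": 2.74,
    "tdr_depth_m": 1.10,
}
print(round(predict_vmc(mean_inputs), 2))
print(round(predict_vmc(mean_inputs, soil="clayey"), 2))
```

At the mean inputs, the base case gives a predicted VMC of roughly 7.5%, and switching only the soil type to clayey raises the prediction by the full 28.9 percentage points of the clayey coefficient, which shows how strongly the categorical terms dominate the linear model.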

While linear regression is simple and clear to interpret, it often fails to capture some nonlinear relationships, considering the complexity of the interplay of various factors that may be driving the soil moisture level. This over-simplicity motivated the exploration of an alternative model capable of better handling the nonlinear relationships and interactions.

Random Forest

Proposed by Breiman ( 50 ), random forests have shown promise in handling high-dimensional data and capturing complex interactions. A random forest is an ensemble of decision trees $h (x; θ_{k})$ , with $k = 1, \dots, K$ , where $x$ represents the $p$ -dimensional input vector of observed covariates, corresponding to a random vector $X$ . Each tree is characterized by its own random vector $θ_{k}$ , and the $θ_{k}$ are independent and identically distributed (i.i.d.) across all trees. The training data $(x_{1}, y_{1}), \dots, (x_{n}, y_{n})$ are assumed to be a random sample of size $n$ , with each member being a $(p + 1)$ -tuple, independently drawn from the joint distribution of $(X, Y)$ . When applied to regression, the tree predictors $h (x; θ_{k})$ , $k = 1, \dots, K$ , take on numerical values, and the forest prediction is their unweighted average:

\bar{h} (x) = \frac{1}{K} \sum_{k = 1}^{K} h (x; θ_{k})

(3)

where $\bar{h} (x)$ is the random forest prediction and $h (x; θ_{k})$ is the prediction of the $k$ th tree. By the law of large numbers, as the number of trees $K \to \infty$ :

E_{X, Y} {(Y - \bar{h} (X))}^{2} \to E_{X, Y} {(Y - E_{θ} h (X; θ))}^{2}

(4)

the convergence of which ensures protection against overfitting ( 51 ).

At the construction of each tree, a different bootstrap sample is drawn from the training data, leaving out about one-third of the cases. The mean squared error (MSE) calculated across these left-out entries is defined as the out-of-bag (oob) error. A forest size of 200 and a minimum node size of four were selected based on the stabilizing trend toward oob error minimization. This means that the random forest consists of 200 individual decision trees, and each tree continues to split until no terminal node contains fewer than four observations. With a random split of 70/30, the model is trained on 70% of the dataset and tested on the remaining 30%, with the prediction versus observation for the testing dataset shown in Figure 2. Each dark point represents a prediction–observation pair from the 30% testing dataset. The dashed red line denotes the 1:1 line (perfect agreement), while the surrounding gray band indicates the 95% confidence interval. With a mean of squared residuals of 7.7, the model is capable of explaining 92.3% of the variation within the target.
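
Three pieces of the procedure above can be checked with a few lines of standard-library Python: the share of cases a bootstrap sample leaves out, the averaging in Equation 3, and the share of variation explained. This is an illustrative sketch with hypothetical function names, not the authors' implementation, and it assumes that "variation explained" is computed as one minus the ratio of the mean of squared residuals to the variance of the observations.

```python
import random


def out_of_bag_fraction(n_cases, seed=0):
    """Share of cases missing from one bootstrap sample of size n_cases."""
    rng = random.Random(seed)
    in_bag = {rng.randrange(n_cases) for _ in range(n_cases)}
    return 1.0 - len(in_bag) / n_cases


def forest_predict(tree_predictions):
    """Equation 3: the forest prediction is the mean of the tree predictions."""
    return sum(tree_predictions) / len(tree_predictions)


def variance_explained(observed, predicted):
    """1 - MSE / Var(observed), an assumed definition of variation explained."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    var = sum((o - mean_obs) ** 2 for o in observed) / n
    return 1.0 - mse / var


# With 6774 entries, each bootstrap sample omits a bit more than a third.
print(out_of_bag_fraction(6774))
print(forest_predict([18.0, 20.0, 22.0]))
```

The left-out share approaches $1/e \approx 0.368$ for large samples, which is the "about one-third" quoted above, and it is what makes the oob error an internal validation measure that needs no separate holdout set.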

Figure 2.

Predicted versus actual volumetric moisture content across the test dataset.

While the specific soil classification per the AASHTO standard (A-1-a, A-3, A-4, A-7-5, etc.) did not provide much additional information compared to the broader categories of gravel, sand, silt, and clay, the random forest shows better performance when the specific classification is known. Detailed surface layer material information also enhanced model performance compared to only more generic categories being provided. The features used in the model are shown in descending order of importance in Figure 3.

Figure 3.

Feature importance of the random forest model predicting volumetric moisture content.

Note that the only input variable not readily available, once the site location and timeframe are known, is the water table depth. Therefore, an additional step was taken to model the depth of the water table with respect to spatial and temporal parameters. The modeled value can then be used as a surrogate input when direct measurement is not available. The resulting water table model is a separate random forest that uses an ensemble of 200 trees with a node size of five. Taking latitude, longitude, elevation, monthly cumulative precipitation, and date of measurement as features, the model captures 94.6% of the variation in water table depth with a mean of squared residuals of 0.048.

Case Study

Constructed on a subgrade and subbase of class A-7-6 clayey soils and a base layer of class A-2-4 gravel or sand, section 48-1068 along State Highway 19 (SH 19) was chosen for the case study. It was first assigned as a General Pavement Study (GPS) experiment section in 1987 with an original surface layer of dense-graded HMA. While monitored as an LTPP section until December 2005, it received multiple surface treatments, including fog seals and chip seals over the surface course. Precipitation and temperature information were retrieved for its location from August 2004 to December 2005. For consistency with the structural information, the climate inputs entered into the model cover the duration of the section’s last construction number. It should be noted that these are the best available inputs for observing the responses from the model; they are not intended to support conclusions about the current condition of the section itself. The location of the pavement received 39 mm of rainfall on October 4, 2004, the maximum level throughout its last construction number. The daily temperature and precipitation from 1 day before to 1 month after that day are shown in Figure 4. As shown in Figure 4a, the single-day precipitation of 39 mm on October 4 is treated as the maximum precipitation event for all subsequent days. This precipitation pattern, specific to the selected site and timeframe, is particularly well-suited for analyzing how moisture levels rise following an isolated rainfall event and gradually return to pre-event levels over time.

Figure 4.

Precipitation and temperature at the case study section: (a) precipitation before and after the October 4, 2004, rainfall at section 48-1068 and (b) temperature before and after the October 4, 2004, rainfall at section 48-1068.
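Two of the model inputs, maximum single-day precipitation and time since maximum rainfall, can be derived from a daily precipitation record. The sketch below shows one way to do so and is not the authors' procedure. It assumes a running maximum over the record, counts elapsed days from zero on the event day, and keeps the earliest day when two days tie. Only the 39 mm on October 4, 2004, comes from the text; every other value in the series is made up.

```python
from datetime import date, timedelta


def max_event_features(daily_precip_mm):
    """Map each day to (max single-day precipitation so far, days since it)."""
    features = {}
    max_mm, max_day = None, None
    for day in sorted(daily_precip_mm):
        rain = daily_precip_mm[day]
        # Strict comparison keeps the earliest day when two days tie.
        if max_mm is None or rain > max_mm:
            max_mm, max_day = rain, day
        features[day] = (max_mm, (day - max_day).days)
    return features


# Hypothetical six-day series starting one day before the event.
start = date(2004, 10, 3)
precip = {start + timedelta(days=i): 0.0 for i in range(6)}
precip[date(2004, 10, 4)] = 39.0  # value reported in the text
precip[date(2004, 10, 6)] = 5.0   # made-up smaller shower

features = max_event_features(precip)
for day, (max_mm, elapsed) in features.items():
    print(day, max_mm, elapsed)
```

A smaller shower after the event leaves both features unchanged, which matches the statement that the October 4 rainfall is treated as the maximum event for all subsequent days.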

As water table depth was not monitored at the location after 1997, a set of readings over time was first simulated based on the coordinates, elevation, temperature, and monthly cumulative precipitation at the site. With the simulated water table depth values as input along with other site-specific information already known, moisture distribution over time and depth after the single-day precipitation was simulated at a spatial resolution of 0.001 m and a time interval of 1 day. Figure 5 shows the simulated results. The first run was conducted under the existing structural conditions, with subgrade and subbase layers of class A-7-6 clayey soil and a base layer of class A-2-4. For comparison purposes, a second run was conducted with all other variables held constant but with the subgrade and subbase layers replaced by class A-4 silty soil. The two cases can be compared side by side in Figure 5a and b.

Figure 5.

Simulated moisture distribution over time and depth after the October 4, 2004 rainfall at section 48-1068: (a) clayey soil subgrade and subbase and (b) silty soil subgrade and subbase.

To better illustrate the trends in average moisture level over time, the VMC percentages were averaged within each unbound layer, namely the subgrade, subbase, and base, as shown in Table 5. The two simulated scenarios are tabulated side by side. Days are counted from the day of maximum rainfall, so that October 4 is day 1. With the existing condition of clayey soil layers, moisture levels reach their maximum on the day of the precipitation event for all unbound layers. The base layer and subgrade return to the pre-event moisture level on day 13, and the subbase layer on day 14. A delay in elevated moisture content was observed in the alternative scenario with silty soil layers, as the maximum is not reached until day 9 in the subgrade and base layer and day 10 in the subbase layer. Removal of excess moisture was nevertheless more rapid, with all layers reaching the pre-event level by day 14 despite the much later peak, which shortens the period of structural weakening caused by moisture intrusion.
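The layer averaging described above can be sketched as follows, assuming predictions are available on a regular grid of 0.001 m in depth and 1 day in time. The predictor `toy_predict_vmc` and the layer boundaries are hypothetical placeholders introduced only so that the sketch runs; they are not the fitted model or the actual structure of section 48-1068.

```python
def layer_averages(predict_vmc, layers_mm, n_days, step_mm=1):
    """Average predicted VMC (%) per layer and day on a regular depth grid.

    layers_mm: list of (name, top_mm, bottom_mm), with depths in whole
    millimetres so that the 0.001 m grid is exact.
    Returns {(name, day): mean VMC over the grid points in that layer}.
    """
    averages = {}
    for name, top_mm, bottom_mm in layers_mm:
        depths_mm = range(top_mm + step_mm, bottom_mm + 1, step_mm)
        for day in range(n_days):
            values = [predict_vmc(d / 1000.0, day) for d in depths_mm]
            averages[(name, day)] = sum(values) / len(values)
    return averages


# Hypothetical stand-in for the fitted model: VMC rises with depth and the
# event-related excess decays linearly with time.
def toy_predict_vmc(depth_m, day):
    return 20.0 + 2.0 * depth_m + max(0.0, 3.0 - 0.25 * day)


# Hypothetical layer boundaries in millimetres below the surface.
layers = [("base", 100, 300), ("subbase", 300, 500), ("subgrade", 500, 1500)]

averages = layer_averages(toy_predict_vmc, layers, n_days=30)
print(round(averages[("base", 0)], 2), round(averages[("subgrade", 29)], 2))
```

Working in integer millimetres avoids floating-point drift at layer boundaries, so each grid point is assigned to exactly one layer.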

Table 5.

Average Post-Rainfall Volumetric Moisture Content over Time by Layer

Date	Clayey soil: subgrade (%)	Clayey soil: subbase layer (%)	Clayey soil: base layer (%)	Silty soil: subgrade (%)	Silty soil: subbase layer (%)	Silty soil: base layer (%)
2004-10-03	24.0	20.2	23.4	22.3	19.4	22.0
2004-10-04	25.4	22.6	24.6	21.5	18.9	21.1
2004-10-05	25.1	22.1	24.4	21.5	18.9	21.2
2004-10-06	24.6	21.4	23.9	21.5	18.8	21.2
2004-10-07	24.6	21.2	23.9	22.3	19.2	21.9
2004-10-08	24.4	21.0	23.7	22.2	19.1	21.8
2004-10-09	25.0	21.7	24.3	22.5	19.0	22.0
2004-10-10	25.2	21.7	24.4	22.8	19.4	22.3
2004-10-11	25.0	21.4	24.3	23.3	20.1	22.9
2004-10-12	25.1	22.1	24.5	23.6	20.3	23.1
2004-10-13	25.0	22.0	24.4	23.6	20.5	23.1
2004-10-14	24.9	21.9	24.3	23.2	19.8	22.7
2004-10-15	24.1	21.1	23.6	22.9	20.1	22.5
2004-10-16	24.0	20.9	23.4	22.8	19.9	22.4
2004-10-17	23.1	19.7	22.5	21.2	18.5	20.9
2004-10-18	22.7	19.1	22.1	21.1	18.3	20.8
2004-10-19	22.6	18.8	22.0	21.1	18.4	20.7
2004-10-20	22.8	19.0	22.1	20.9	18.0	20.5
2004-10-21	22.5	18.8	21.9	21.0	18.3	20.7
2004-10-22	23.1	19.3	22.5	21.8	18.8	21.5
2004-10-23	23.3	19.4	22.7	21.5	18.5	21.2
2004-10-24	23.3	19.5	22.7	21.4	18.5	21.1
2004-10-25	23.1	19.1	22.5	21.3	18.5	21.0
2004-10-26	23.1	19.1	22.5	21.3	18.5	21.0
2004-10-27	23.0	19.1	22.4	21.2	18.2	20.9
2004-10-28	23.0	19.2	22.4	21.3	18.2	21.0
2004-10-29	22.6	18.9	22.1	21.2	18.5	21.0
2004-10-30	22.9	19.3	22.4	21.3	18.5	21.0
2004-10-31	22.9	19.3	22.4	21.4	18.7	21.2
2004-11-01	23.3	19.6	22.6	21.5	18.6	21.1
2004-11-02	23.6	19.9	22.9	21.6	18.4	21.2
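The peak and recovery days quoted in the text can be read off Table 5 mechanically. The sketch below copies the rows from October 3 (pre-event) through October 17 and, for each layer, finds the first day of maximum moisture and the first later day at or below the pre-event level, counting October 4 as day 1. The recovery criterion (at or below the October 3 value, searched only after the peak) is an assumption about how the day counts were obtained.

```python
# Layer-averaged VMC (%) from Table 5, 2004-10-03 through 2004-10-17.
# Index 0 is the pre-event day; index 1 is the rainfall day (day 1).
TABLE5 = {
    ("clayey", "subgrade"): [24.0, 25.4, 25.1, 24.6, 24.6, 24.4, 25.0, 25.2,
                             25.0, 25.1, 25.0, 24.9, 24.1, 24.0, 23.1],
    ("clayey", "subbase"): [20.2, 22.6, 22.1, 21.4, 21.2, 21.0, 21.7, 21.7,
                            21.4, 22.1, 22.0, 21.9, 21.1, 20.9, 19.7],
    ("clayey", "base"): [23.4, 24.6, 24.4, 23.9, 23.9, 23.7, 24.3, 24.4,
                         24.3, 24.5, 24.4, 24.3, 23.6, 23.4, 22.5],
    ("silty", "subgrade"): [22.3, 21.5, 21.5, 21.5, 22.3, 22.2, 22.5, 22.8,
                            23.3, 23.6, 23.6, 23.2, 22.9, 22.8, 21.2],
    ("silty", "subbase"): [19.4, 18.9, 18.9, 18.8, 19.2, 19.1, 19.0, 19.4,
                           20.1, 20.3, 20.5, 19.8, 20.1, 19.9, 18.5],
    ("silty", "base"): [22.0, 21.1, 21.2, 21.2, 21.9, 21.8, 22.0, 22.3,
                        22.9, 23.1, 23.1, 22.7, 22.5, 22.4, 20.9],
}


def peak_and_recovery(series):
    """Return (peak_day, recovery_day), counting the rainfall day as day 1."""
    baseline, post = series[0], series[1:]
    peak_idx = post.index(max(post))  # first occurrence of the maximum
    for idx in range(peak_idx + 1, len(post)):
        if post[idx] <= baseline:
            return peak_idx + 1, idx + 1
    return peak_idx + 1, None  # no recovery within the tabulated window


results = {key: peak_and_recovery(values) for key, values in TABLE5.items()}
for (soil, layer), (peak_day, recovery_day) in results.items():
    print(f"{soil:6s} {layer:8s} peak day {peak_day:2d}, recovery day {recovery_day}")
```

The search for recovery starts after the peak because the silty soil values on October 4 are already below the October 3 values, so a search from day 1 would report recovery before any rise had occurred.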

Conclusion

Understanding and quantifying the moisture variation in the unbound layers is crucial for pavement resilience to environmental stressors such as flooding. As the inputs for many existing models are not readily available without laboratory testing, the temporal and spatial distribution of subsurface moisture after precipitation events can be difficult to determine. This study attempted to bridge this gap by developing and calibrating two models using generic pavement and unbound layer information and in situ conditions observed at some of the LTPP sections. The linear regression model explained 59.5% of the moisture content variation, identifying significant factors such as maximum single-day precipitation, time since maximum rainfall, water table depth, and material types of surface and unbound layers. The random forest model demonstrated superior performance, explaining 92.3% of the moisture content variation.

To account for cases in which a water table depth reading is not readily available, a separate random forest model was developed to predict the water table depth, explaining 94.6% of its variation based on the coordinates, elevation, measurement date, and monthly cumulative precipitation at the site location. The case study on section 48-1068 in Texas demonstrated the ability of the model to simulate moisture distribution over time and depth after a significant rainfall event, providing insights into the drainage behavior of different pavement layers. Both the linear regression model and the scenario comparison using the random forest models suggest that coarser subgrade and unbound layer materials handle moisture better.

These findings have important implications for pavement planning, design, maintenance, and management. The developed models can help engineers and planners predict moisture-related pavement issues, enabling more proactive maintenance strategies. Furthermore, the insights gained into the moisture retention characteristics of various subgrade materials can inform more effective material selection in pavement design. In addition, the capability to simulate moisture distribution over time provides a valuable tool for assessing the potential impact of extreme weather events on pavement structures, allowing for better preparedness and resilience in infrastructure planning.

While this study represents a significant advancement in pavement moisture modeling, there are several areas for future research and model enhancement. First, while the random forest model provides robust predictive performance, its interpretability is inherently limited compared to linear regression. Although the model hyperparameters are documented, feature importance is ranked, and a sensitivity case study illustrates the effect of material properties, the detailed model configuration is specific to the data it was trained on. The training dataset had a reasonable number of entries, but the limited range of the input variables restricts the model’s applicability to data within the training range. Second, the case study presented on section 48-1068 is illustrative but lacks directly measured VMC for validation. While the case was selected for its ideal precipitation pattern and available input data, the absence of ground-truth SMP measurements means that the model’s realism can only be inferred through comparative behavior and consistency with empirical findings. This limits the ability to rigorously quantify predictive accuracy at the site level. Last, the observed variation in VMC was relatively narrow in magnitude. While effective saturation calculations demonstrate that even small volumetric differences can correspond to significant shifts in saturation, the model’s implications for structural response (e.g., resilient modulus variation) need further investigation. Future work should couple the moisture predictions with mechanistic pavement response models to assess performance impacts more directly.

These limitations point toward several opportunities for future research, including improved temporal data harmonization, integration with mechanistic models, and broader validation using sections with complete SMP data. Furthermore, expanding the framework with machine learning tools capable of uncertainty quantification could offer greater insight into model reliability under extreme conditions.

Footnotes

Author Contributions

The authors confirm contribution to the paper as follows: study conception and design: R. Li; data collection: R. Li; analysis and interpretation of results: R. Li, J.A. Prozzi, F. Hong; draft manuscript preparation: R. Li. All authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Ruohan Li

Jorge A. Prozzi

Feng Hong

References

1. Zapata C. E., Andrei, Witczak M. W., Houston. Incorporation of Environmental Effects in Pavement Design. Road Materials and Pavement Design, Vol. 8, No. 4, 2007, pp. 667–693.

2. Mallick R. B., Tao, Daniel J. S., Jacobs, Veeraragavan. Development of a Methodology and a Tool for the Assessment of Vulnerability of Roadways to Flood-Induced Damage. Journal of Flood Risk Management, Vol. 10, No. 3, 2017, pp. 301–313.

3. Rokade, Agarwal, Shrivastava. Drainage and Flexible Pavement Performance. International Journal of Engineering Science and Technology (IJEST), Vol. 4, No. 4, 2012, pp. 1308–1311.

4. Elshaer, Ghayoomi, Daniel J. S. Impact of Subsurface Water on Structural Performance of Inundated Flexible Pavements. International Journal of Pavement Engineering, Vol. 20, No. 8, 2019, pp. 947–957.

5. Qiao, Dawson A. R., Parry, Flintsch, Wang. Flexible Pavements and Climate Change: A Comprehensive Review and Implications. Sustainability, Vol. 12, No. 3, 2020, p. 1057.

6. Knott J. F., Elshaer, Daniel J. S., Jacobs J. M., Kirshen. Assessing the Effects of Rising Groundwater from Sea Level Rise on the Service Life of Pavements in Coastal Road Infrastructure. Transportation Research Record: Journal of the Transportation Research Board, 2017. 2639(1): 1–10.

7. Almeida, Picado-Santos. Asphalt Road Pavements to Address Climate Change Challenges—An Overview. Applied Sciences, Vol. 12, No. 24, 2022, p. 12515.

8. Pregnolato, Ford, Wilkinson S. M., Dawson R. J. The Impact of Flooding on Road Transport: A Depth-Disruption Function. Transportation Research Part D: Transport and Environment, Vol. 55, 2017, pp. 67–81.

9. Sultana, Chai, Chowdhury, Martin, Anissimov, Rahman. Rutting and Roughness of Flood-Affected Pavements: Literature Review and Deterioration Models. Journal of Infrastructure Systems, Vol. 24, No. 2, 2018, p. 04018006.

10. Tighe S. L., Xie W.-C. Impact of Flood Hazards on Pavement Performance. International Journal of Pavement Engineering, Vol. 21, No. 6, 2020, pp. 746–752.

11. Cong, Guo. Effects of Moisture on the Bonding Performance of Asphalt-Aggregate System. Construction and Building Materials, Vol. 295, 2021, p. 123667.

12. Asadi, Mallick, Nazarian. Numerical Modeling of Post-Flood Water Flow in Pavement Structures. Transportation Geotechnics, Vol. 27, 2021, p. 100468.

13. Mateos, Harvey, Millan, Paniagua. Full-Scale Experimental Evaluation of the Flood Resiliency of Thin Concrete Overlay on Asphalt Pavements. Transportation Research Record: Journal of the Transportation Research Board, 2022. 2676(4): 461–472.

14. Gudipudi P. P., Underwood B. S., Zalghout. Impact of Climate Change on Pavement Structural Performance in the United States. Transportation Research Part D: Transport and Environment, Vol. 57, 2017, pp. 172–184.

15. IPCC. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, UK and New York, NY, 2021.

16. Khan M. U., Mesbah, Ferreira, Williams D. J. A Case Study on Pavement Performance Due to Extreme Moisture Intrusion at Untreated Layers. International Journal of Pavement Engineering, Vol. 20, No. 11, 2019, pp. 1309–1322.

17. Douglas, Garvin, Lawson, Richards, Tippett, White. Urban Pluvial Flooding: A Qualitative Case Study of Cause, Effect and Nonstructural Mitigation. Journal of Flood Risk Management, Vol. 3, No. 2, 2010, pp. 112–125.

18. Hedayati, Hossain. Data Based Model to Estimate Subgrade Moisture Variation Case Study: Low Volume Pavement in North Texas. Transportation Geotechnics, Vol. 3, 2015, pp. 48–57.

19. Qiao, Zhang, Wang, Dawson, Wake. Simulating Floodwater Movement in Pavements for Developing Post-Flooding Time-Depth-Damage Functions. Construction and Building Materials, Vol. 396, 2023, p. 132408.

20. Mehrotra. Evaluating the Influence of Moisture Variation on Resilient Modulus for Unsaturated Pavement Subgrades. LSU Master's Theses, Louisiana State University and Agricultural & Mechanical College, 2014.

21. Diefenderfer B. K., Galal, Mokarem D. W. Effect of Subsurface Drainage on the Structural Capacity of Flexible Pavement. Technical Report. Virginia Transportation Research Council (VTRC), 2005.

22. ARA Inc. ERES Division. Guide for Mechanistic-Empirical Design of New and Rehabilitated Pavement Structures: Final Document Appendix DD-4: Improvement of the Integrated Climatic Model for Moisture Content Predictions. Technical Report. Prepared for National Cooperative Highway Research Program. Champaign, Illinois: National Cooperative Highway Research Program, Transportation Research Board, National Research Council, 2000.

23. Fredlund D. G., Xing. Equations for the Soil-Water Characteristic Curve. Canadian Geotechnical Journal, Vol. 31, No. 4, 1994, pp. 521–532.

24. Van Genuchten M. T. A Closed-Form Equation for Predicting the Hydraulic Conductivity of Unsaturated Soils. Soil Science Society of America Journal, Vol. 44, No. 5, 1980, pp. 892–898.

25. McKee, Bumb. Flow-Testing Coalbed Methane Production Wells in the Presence of Water and Gas. SPE Formation Evaluation, Vol. 2, No. 4, 1987, pp. 599–608.

26. Mualem. A New Model for Predicting the Hydraulic Conductivity of Unsaturated Porous Media. Water Resources Research, Vol. 12, No. 3, 1976, pp. 513–522.

27. Burdine. Relative Permeability Calculations from Pore Size Distribution Data. Journal of Petroleum Technology, Vol. 5, No. 3, 1953, pp. 71–78.

28. Gardner. Some Steady-State Solutions of the Unsaturated Moisture Flow Equation with Application to Evaporation from a Water Table. Soil Science, Vol. 85, No. 4, 1958, pp. 228–232.

29. Brooks, Corey. Hydraulic Properties of Porous Media. Colorado State University, Fort Collins, CO, 1964.

30. Williams, Prebble, Williams, Hignett. The Influence of Texture, Structure and Clay Mineralogy on the Soil Moisture Characteristic. Soil Research, Vol. 21, No. 1, 1983, pp. 15–32.

31. Farrell, Larson. Modeling the Pore Structure of Porous Media. Water Resources Research, Vol. 8, No. 3, 1972, pp. 699–706.

32. Assouline, Tessier, Bruand. A Conceptual Model of the Soil Water Retention Curve. Water Resources Research, Vol. 34, No. 2, 1998, pp. 223–231.

33. Johari, Habibagahi, Ghahramani. Prediction of Soil–Water Characteristic Curve Using Genetic Programming. Journal of Geotechnical and Geoenvironmental Engineering, Vol. 132, No. 5, 2006, pp. 661–665.

34. Hutson, Cass. A Retentivity Function for Use in Soil–Water Simulation Models. Journal of Soil Science, Vol. 38, No. 1, 1987, pp. 105–113.

35. Aubertin, Ricard J.-F., Chapuis R. P. A Predictive Model for the Water Retention Curve: Application to Tailings from Hard-Rock Mines. Canadian Geotechnical Journal, Vol. 35, No. 1, 1998, pp. 55–69.

36. Cresswell H. P., Paydar. Water Retention in Australian Soils. I. Description and Prediction Using Parametric Functions. Soil Research, Vol. 34, No. 2, 1996, pp. 195–212.

37. Tomasella, Hodnett M. G. Estimating Soil Water Retention Characteristics from Limited Data in Brazilian Amazonia. Soil Science, Vol. 163, No. 3, 1998, pp. 190–202.

38. Fredlund D. G., Rahardjo. Soil Mechanics for Unsaturated Soils. John Wiley & Sons, Hoboken, NJ, 1993.

39. Zapata C. E., Houston W. N., Houston S. L., Walsh K. D. Soil–Water Characteristic Curve Variability. Advances in Unsaturated Geotechnics, Vol. 287, No. 99, 2000, pp. 84–124.

40. Huang, Molavi Nojumi, Ansari, Hashemian, Bayat. Evaluating the Use of Machine Learning for Moisture Content Prediction in Base and Subgrade Layers. Road Materials and Pavement Design, Vol. 24, No. 12, 2023, pp. 2910–2928.

41. Mousavi, Ghayoomi, Dave E. V. A System Dynamics Framework for Mechanistic Analysis of Flexible Pavement Systems Under Moisture Variations. Transportation Geotechnics, Vol. 30, 2021, p. 100619.

42. Richards L. A. Capillary Conduction of Liquids Through Porous Mediums. Physics, Vol. 1, No. 5, 1931, pp. 318–333.

43. Sun, Sias, Dave. Predictive Model for Determining Saturation Profiles under Pavements during Flood Events. Transportation Research Record: Journal of the Transportation Research Board, 2024. 2678(10): 1523–1535.

44. Elkins G. E., Schmalzer P. N., Thompson, Simpson, et al. Long-Term Pavement Performance Information Management System: Pavement Performance Database User Reference Guide. Technical Report. Turner-Fairbank Highway Research Center, 2003.

45. Schwartz C. W., Elkins G. E., Visintine B. A., Forman, Rada G. R., Groeger, et al. Evaluation of Long-Term Pavement Performance (LTPP) Climatic Data for Use in Mechanistic-Empirical Pavement Design Guide (MEPDG) Calibration and Other Pavement Analysis. Technical Report. Turner-Fairbank Highway Research Center, 2015.

46. Federal Highway Administration. Long-Term Pavement Performance (LTPP) Program. https://www.fhwa.dot.gov/research/tfhrc/programs/infrastructure/pavements/ltpp/. 2021. Accessed 12 June 2024.

47. Long Term Pavement Performance InfoPave, “SDR 37 – August, 2023”, Standard Data Release. Version: SDR 37. https://infopave.fhwa.dot.gov/Data/StandardDataRelease. Federal Highway Administration (FHWA), August 2023. Accessed 12 June 2024.

48. Elkins G. E., Ostrom. Long-Term Pavement Performance Information Management System User Guide. Technical Report. Federal Highway Administration Office of Infrastructure, 2021.

49. Ismail M. S. N., Ghani A. N. A., Ghazaly Z. M., Dafalla. A Study on the Effect of Flooding Depths and Duration on Soil Subgrade Performance and Stability. Geomate Journal, Vol. 19, 2020, pp. 182–187.

50. Breiman. Random Forests. Machine Learning, Vol. 45, 2001, pp. 5–32.

51. Segal M. R. Machine Learning Benchmarks and Random Forest Regression. Center for Bioinformatics and Molecular Biostatistics, University of California, San Francisco, CA, 2004.

Quantification of Post-Rainfall Moisture Content in Pavement Unbound Layers Using Long-Term Pavement Performance Data
