
Abstract

The creep behavior of concrete, a critical long-term property in construction, significantly impacts material selection and structural design in civil engineering. Conventional methods for predicting creep usually require many parameters, which presents two significant challenges: substantial consumption of sensor resources and the inability of a limited set of static parameters to capture the material’s complexity. This study introduces Creepformer, a parameter-free approach that predicts long-term creep using only early concrete creep data. It is a simple but effective hybrid model combining convolutional neural networks (CNN) and Transformers within an encoder–decoder framework. The encoder is a residual neural network responsible for noise reduction, local correlation learning, and embedding feature compression. The decoder is a Transformer decoder that captures the long-distance dependencies and global features of the time series. The model achieved a root mean squared error (RMSE) of 8.28 and a mean absolute error (MAE) of 6.85 on the Northwestern University database, and an RMSE of 10.09 and an MAE of 7.10 on the real-world Yihui dataset. In addition, we propose a data completion tool based on random forest to ensure the consistency of data intervals and to enrich the data samples across different environments and situations compared with other studies. The data completion tool was tested with an RMSE of 3.2 and an MAE of 2.4. Furthermore, an ablation experiment was conducted to demonstrate the effectiveness of the encoder. Finally, the model is applied in two case studies on real-world ordinary concrete and ultrahigh-performance concrete experiments.

Keywords

Concrete creep, creep prediction, time series process, deep learning, ultra-high performance concrete

Introduction

Concrete creep is the time-dependent deformation of concrete under sustained load. Unlike instantaneous elastic deformation, creep occurs gradually and can continue for years, significantly affecting the long-term performance and stability of concrete structures.^1–6 This phenomenon is influenced by several factors, including the composition of the concrete, environmental conditions, and the magnitude and duration of the applied load.^2,5–7,8 Concrete creep can lead to various hazards; one of the primary concerns is the potential for excessive deformation, which can lead to structural instability and, in severe cases, collapse.⁹ Over time, creep can cause concrete to sag or distort, leading to misalignment and uneven settling of structures. This can compromise the functionality and safety of bridges, buildings, and other infrastructure elements.^3,5,10,11 Additionally, creep-induced stress can result in cracking, which not only weakens the structure but also allows ingress of harmful substances like water and chemicals, accelerating deterioration and reducing the lifespan of the concrete.^4,11 These issues are not limited to ordinary concrete; high-performance concrete and ultrahigh-performance concrete (UHPC) also suffer from them.^12,13

Predicting concrete creep is essential for planning, designing, and maintaining infrastructure. Accurate predictions allow engineers to account for long-term deformations in their designs, ensuring that structures can withstand sustained loads without compromising safety and functionality.^14,15 In engineering practice, infrastructure is subject to various complex environments and load conditions. By incorporating creep predictions into their designs, engineers can improve the durability and resilience of infrastructure, reducing maintenance costs and extending service life. Conventional methods for predicting concrete creep include empirical methods, statistical machine learning methods, and deep learning methods.

Empirical methods rely on physical and empirical models to predict concrete creep. These models are typically based on established theories of material behavior and use parameters such as stress, temperature, humidity, and age of the concrete to make predictions. One of the earlier models is the CEB-FIP Model Code 1990,¹⁶ which laid the foundation for subsequent research by providing detailed equations for creep and shrinkage deformations. Because it is a basic model, it may not accurately capture the effects of various factors in unusual environments. To address this lack of comprehensiveness, the FIB Model Code 2010, developed by the International Federation for Structural Concrete, analyzes more data and takes into account the influence of environmental conditions and material properties to improve performance.¹⁷ Motivated by CEB-FIP Model Code 1990, Bazant proposed a more scientific approach, the B3 model, which addresses limitations in capturing both basic and drying creep and incorporates a more comprehensive range of influencing factors.¹⁸ Building on the success of B3, the B4 model was developed. B4 uses a larger database of experimental results to refine predictions and better explain the influence of environmental conditions and material properties.¹⁹ Unfortunately, although B3 and B4 are considerably accurate, they require mathematical calculation and material knowledge, making them less accessible for actual design work without specialized software. In addition, while these models are currently widely used in industry, their accuracy is limited by the complexity of concrete material behavior.²⁰

Machine learning methods use statistical techniques to develop predictive models based on historical data. These methods can capture complex relationships between variables without relying on explicit physical models and have been used in many fields of civil engineering.^21–24 Yunze et al.²⁵ developed a model incorporating supplementary cementitious materials, which includes 17 different internal parameters for comprehensive analysis. The model also considers the impact of environmental temperature and humidity on concrete. Through Shapley Additive Explanations analysis, it was found that creep is negatively correlated with cement content and humidity, while factors such as age and water-cement ratio show a positive correlation. Minfei et al.²⁶ developed an ensemble machine learning model based on Bayesian optimization, achieving high accuracy while considering the impact of admixture dosage. However, the creep compliance curves were not smooth and lacked robustness because tree-based models are prone to overfitting. Additionally, significant data noise affected the model’s accuracy and robustness. Other studies further discussed the performance variations of creep under different conditions or environments and developed a series of models for specific scenarios, providing various solutions in this field.^21,22,27–29 However, these methods are still limited by the quality and completeness of the input parameters, and their robustness and generalization require improvement.

Deep learning methods, a subset of machine learning, utilize neural networks with multiple layers to model complex relationships in data. These methods can automatically learn features from raw data, reducing the need for manual feature engineering. Previously, many researchers developed various neural network models to provide new solutions and ideas for predicting concrete creep.^30–36 However, these approaches were limited by the feature extraction capabilities of conventional neural networks and did not show significant advantages in accuracy compared to ensemble learning. In 2021, Jingsong et al.³⁷ were the first to apply convolutional neural networks (CNNs) to unify the tasks of predicting creep and shrinkage. This method greatly improved prediction performance because of the feature extraction advantages provided by convolutional downsampling. However, the approach had several shortcomings: first, it did not effectively filter noise caused by errors in data collection; second, it trained two separate models for the two tasks, preventing weight sharing and a deeper understanding of features. The denoising residual neural network (DRNN)³⁸ was proposed to address the noise errors in data collection. This model employs an encoder–decoder structure, where the encoder performs soft sorting of tabular features, denoising, and global feature sampling. This approach significantly improved generalization and accuracy, achieving state-of-the-art performance.

These approaches rely heavily on predefined parameters to make predictions. However, parameter-based methods present several significant challenges:

Many sensors are required to collect the necessary data, which can be impractical and costly in some scenarios.

The inherent complexity of concrete materials means that the limited parameters used in these models cannot capture the high-dimensional information needed for accurate predictions.

Conventional methods depend on external factors such as temperature and humidity, which are difficult to predict accurately in advance, further complicating the prediction of concrete creep.

Beyond traditional creep prediction models, many methods such as Informer and Autoformer have been developed primarily to address the long sequence time-series forecasting problem, with a particular focus on enhancing long-range dependency modeling.^39–42 For instance, Autoformer introduces a Seasonal-Trend decomposition mechanism that effectively captures long-term patterns,⁴⁰ while Informer proposes the ProbSparse self-attention mechanism to reduce the computational complexity typically encountered in ultralong time series handled by traditional Transformer architectures.³⁹

However, these advancements are not directly applicable to our task. Creep prediction data do not exhibit the characteristics of extremely long sequences: in most cases, the observation period spans up to 3 years, with time steps recorded on a daily basis. Given the moderate sequence length and the scarcity of high-quality data in this domain, overly complex architectures such as Informer or Autoformer are prone to overfitting and may fail to generalize effectively. Moreover, many state-of-the-art time series forecasting models are developed and validated on large-scale benchmark datasets, making them less suitable for specialized tasks such as creep prediction, where data acquisition is costly and time-consuming. Our model strikes a balance between complexity and generalization, ensuring reliable performance under data-scarce conditions.

We propose a method, Creepformer, that addresses these challenges by making predictions based solely on early creep data and observed trends, without relying on any external or internal parameters. This approach simplifies the prediction process and achieves state-of-the-art performance compared to existing methods. We use the Northwestern University (NU) creep database⁴³ to train and validate our model, and the experiments show that it reaches a mean absolute error (MAE) of 6.85 and a root mean squared error (RMSE) of 8.28. In addition to the experimental dataset, this study also uses industrial data collected by Yihui New Materials Co., Ltd. during concrete mix ratio and quality inspection experiments in three provinces of China (Shandong, Sichuan, and Yunnan) from 2013 to 2018. On the Yihui dataset, the model reaches an MAE of 7.10 and an RMSE of 10.09. These experimental results demonstrate that the model can effectively predict concrete creep in real-world environments that lack sensors for collecting material parameters.

Our model is built upon a CNN-Transformer encoder–decoder architecture and is specifically designed to exclude material-specific parameters entirely. It focuses on capturing the creep behavior of ordinary concrete and UHPC, relying solely on early-age creep data for prediction. It consists of a residual encoder for feature compression and noise reduction and a decoder composed of a transformer encoder for learning global features. Given the need to perform creep prediction and feature noise reduction simultaneously, our model employs a multiobjective learning approach, using a designed objective function and backpropagation to learn the two tasks under the same framework. To address missing data and inconsistent data collection standards, we designed a random forest (RF)-based data completion method that ensures consistency of the time series data during the preprocessing stage. In addition, we conducted ablation experiments on the encoder to validate its effectiveness by removing it and observing the resultant increase in model complexity. Finally, two case studies with real-world scenarios and UHPC samples showed that Creepformer outperformed the baseline models and demonstrated its predictive ability in UHPC. Figure 1 shows the process of the proposed method. Table 1 lists the definitions of the symbols used in this paper.

Figure 1.

Concrete creep prediction process. The figure illustrates that the model uses the creep values from previous time steps to predict future creep values. The sequence of blue items represents the historical creep values used as input data. The orange items indicate the predicted future values.

Table 1.

Symbol notations in this article.

Notations	Descriptions
$\hat{y}$	Predicted label data
$y$	Actual observed label data
$n$	Number of observations
$\bar{y}$	Average of the actual observed label data
$\bar{\hat{y}}$	Average of the predicted label data
$t$	Time step at prediction
$L$	Objective function
$W$	Weights of the CNN
$x$	Actual input data
$\ell(\cdot)$	Loss function
$\tilde{x}$	Denoised input data
$\hat{x}$	Predicted values
$\lambda$	Regularization parameter
$\Omega(\cdot)$	Regularization function
$\Theta$	Parameters of the model
w/c	The water-cement ratio by weight
a/c	The aggregate-cement ratio by weight
f/c ₂₈	Mean compressive strength of concrete at 28 days of age in MPa.
T	Temperature, in °C
$\sigma$	Sustained stress during the test, in MPa

CNN: convolutional neural network.

In summary, the contributions of this study are as follows:

For the first time, long-term creep is predicted from its early stage alone, without any other parameters, using the proposed Creepformer.

The encoder achieves effective noise reduction, local feature extraction, and feature compression.

A new data completion method is proposed to overcome inconsistent data intervals.

The effectiveness of the method is demonstrated in real scenarios.

The method’s predictive ability in UHPC and its adaptability to dynamic environments and stress levels are demonstrated.

Methodology

To address the problems of conventional concrete creep prediction models, this work proposes a hybrid model architecture based on a CNN and a Transformer. It is divided into three modules. First, a residual CNN performs preliminary feature processing and reduces the noise caused by changes in uncontrollable external environmental factors. Second, the Transformer serves as the core architecture to understand and abstract the time series data. Finally, a fully connected network generalizes the model and produces the final creep prediction value. The model framework is shown in Figure 2, and the pipeline is shown in Figure 3.

Figure 2.

Model structure. The model structure consists of an encoder–decoder architecture for parameter-free concrete creep prediction. The raw data are windowed, and Gaussian noise is added to improve robustness. The left side is a CNN-based encoder designed to reduce noise, learn local correlations, and reduce feature dimensionality. The right side is a Transformer-based decoder that captures the time series’ long-distance dependencies and global features. CNN: convolutional neural network.

Figure 3.

Method pipeline. The pipeline includes three main stages: data completion, model training, and transfer learning. First, the data are processed by removing missing samples and standardizing intervals to ensure consistency. The model training phase involves constructing data windows, adding Gaussian noise, training, and evaluating the model. In the transfer learning stage, pretrained weights are loaded and fine-tuned on the Yihui dataset to optimize the model’s accuracy.

Because data sampling in the time series is irregular, we use the data completion method to fill in the data during the preprocessing stage. As we do not consider data loss and distortion of different sensors, we can use 1061 experiments with 14,024 data points in the NU database and 137 experiments with 2205 data points in the Yihui dataset. Each experiment represents a complete creep experiment cycle with dozens to hundreds of data points. Compared with other works, our model is trained on more experiments from a wider variety of environments, which improves its robustness.

The encoder in Creepformer is mainly responsible for feature extraction, local correlation learning, feature dimensionality reduction, and noise reduction. Specifically, the encoder is trained with a multiobjective function and uses residual connections for noise reduction. The CNN encoder produces a more compact sequence, which effectively shortens the sequence length and makes the representation easier for the decoder to learn. The ratio of the decoder input dimension to the encoder input dimension is 1:2. Moreover, the encoder captures local patterns and correlations in the data, while the decoder captures global factors and long-distance dependencies, so the two complement each other.

Because the decoder relies on the attention mechanism, the model is sensitive to small changes in the time series data. We therefore add Gaussian noise to the training data and use the residual neural network in the encoder, together with the objective function, to reduce noise and improve the generalization ability of the model. The objective function combines a loss on the predicted creep value against the actual value with a loss on the noise. In this way, the two objectives are optimized simultaneously.
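The window construction and noise injection described here can be sketched as follows. This is a minimal illustration rather than the authors' code: the helper names `make_windows` and `add_gaussian_noise` are hypothetical, and the noise standard deviation `sigma` is an assumed value, since the text does not state the noise level. The window length of 32 matches the window size used in this work.

```python
import random

def make_windows(series, window=32):
    """Split a daily creep series into (input window, next value) pairs."""
    pairs = []
    for start in range(len(series) - window):
        x = series[start:start + window]   # `window` consecutive observations
        y = series[start + window]         # the value right after the window
        pairs.append((x, y))
    return pairs

def add_gaussian_noise(window_values, sigma=0.5, rng=None):
    """Return a noisy copy of a window; `sigma` is an assumed noise level."""
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, sigma) for v in window_values]

# Toy series of 40 daily creep readings (illustrative values only).
series = [10.0 + 0.8 * day for day in range(40)]
pairs = make_windows(series, window=32)
clean_x, target = pairs[0]
noisy_x = add_gaussian_noise(clean_x, sigma=0.5, rng=random.Random(0))
```

During training, the noisy window would be the encoder input while the clean window and the next value serve as targets for the denoising and prediction objectives, respectively.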

As the backbone of this approach, the Transformer has three layers of multihead attention encoding. In the input part, we use a fixed positional encoding instead of a floating one because the window size is fixed. Because the Transformer models long-distance dependencies, our model can readily capture the relationship between concrete creep values that are far apart in time.

In our structure, the kernel size of every one-dimensional convolution is set to 3, because it can capture local change patterns in the sequence (such as trends and abrupt change points), especially for short sequences (such as length = 32). Residual blocks are mainly used to introduce more complex feature expressions. In the Transformer part, because the input dimension is 32, we set the embedding size to only 16 for feature compression and use 8 attention heads, so that each attention head is assigned a smaller model dimension and the attention can capture more fine-grained changes in the time series data. The model settings are listed in Table 2.

Table 2.

Network architecture details.

No.	Layer type	Kernel/units	Activation	Output	Padding
1	Conv1d	Kernel size = 3	ReLU	(b, 64, 32)	Yes
2	Residual block	2 Conv1d (k = 3)	ReLU	(b, 64, 32)	Yes
3	Conv1d	Kernel size = 3	ReLU	(b, 64, 32)	Yes
4	AdaptiveAvgPool1d	Output size = 1	None	(b, 64, 1)	Yes
5	Dense (linear)	In = 64, out = 16	None	(b, 16)	None
6	Dense (linear)	In = 16, out = 32	None	(b, 32)	None
7	TransformerEncoder	Head num = 4	ReLU	(b, 1, 16)	None
8	TransformerEncoder	Head num = 4	ReLU	(b, 1, 16)	None
9	TransformerEncoder	Head num = 8	ReLU	(b, 1, 16)	None
10	Dense (linear)	In = 16, out = 1	None	(b, 1)	None

In the output column, b denotes the batch size. Nos. 1–6 form the denoising encoder, and nos. 7–10 form the Transformer decoder.
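The shapes in Table 2 can be checked with a little arithmetic. The sketch below uses the standard output-length formula for a one-dimensional convolution and assumes that "Padding = Yes" means one zero on each side with stride 1 (an assumption; the table does not give the padding width). The function names are illustrative.

```python
def conv1d_out_len(length, kernel, padding=0, stride=1, dilation=1):
    """Output length of a 1D convolution (standard formula)."""
    return (length + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def head_dim(embed_dim, num_heads):
    """Per-head dimension in multi-head attention; must divide evenly."""
    if embed_dim % num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    return embed_dim // num_heads

same_len = conv1d_out_len(32, kernel=3, padding=1)   # padded conv keeps length 32
valid_len = conv1d_out_len(32, kernel=3, padding=0)  # unpadded conv, for contrast
dims = {h: head_dim(16, h) for h in (4, 8)}          # embedding size 16
```

With an embedding size of 16, four heads give a per-head dimension of 4 and eight heads give 2, which is the "smaller dimension per head" effect described in the text.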

Prediction process

The whole prediction process uses a sliding (rolling) window. The idea is to always use the latest window of $n$ values to predict the next value. After the next value is predicted, the window “rolls” forward one step: the oldest value in the window is deleted and the newly predicted value is added, so that long-term performance is predicted from the earlier creep data alone. Specifically, in this work, the window size $n$ is 32 days; because the preprocessed dataset has a 1-day time interval, the model can predict the long-term creep trend from the data of the first 32 days of the creep experiment. The process can be summarized as follows:

$$\hat{x}_{t + k} = f(\hat{x}_{t + k - 1}, \hat{x}_{t + k - 2}, \dots, \hat{x}_{t + 1}, x_{t}, x_{t - 1}, \dots, x_{t + k - n}) \tag{1}$$

where $f (\cdot)$ is the proposed model and ${\hat{x}}_{t + k}$ is the predicted value at time $t + k$. The terms ${\hat{x}}_{t + k - 1}, {\hat{x}}_{t + k - 2}, \dots, {\hat{x}}_{t + 1}$ are the predicted values from previous steps in the rolling window, and $x_{t}, x_{t - 1}, \dots, x_{t + k - n}$ are the actual observed values from the time series. Initially, when $k$ is small, most inputs are actual observed values. As we predict further ahead and $k$ increases, the inputs increasingly consist of predicted values. The prediction process is illustrated in Figure 1.
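The rolling-window procedure of Equation (1) can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: `rolling_forecast` is a hypothetical helper, and `toy_model` is a stand-in for the trained model $f(\cdot)$ that simply extrapolates the last increment.

```python
from collections import deque

def rolling_forecast(history, model, horizon, window=32):
    """Autoregressive forecast: feed each prediction back into the window."""
    if len(history) < window:
        raise ValueError("need at least `window` observed values")
    buf = deque(history[-window:], maxlen=window)  # latest `window` values
    predictions = []
    for _ in range(horizon):
        nxt = model(list(buf))   # predict the next value from the current window
        predictions.append(nxt)
        buf.append(nxt)          # maxlen drops the oldest value automatically
    return predictions

def toy_model(window_values):
    """Hypothetical stand-in for f(.): extrapolate the last observed increment."""
    return window_values[-1] + (window_values[-1] - window_values[-2])

observed = [float(day) for day in range(32)]   # 32 days of toy creep data
forecast = rolling_forecast(observed, toy_model, horizon=5)
```

After 32 forecast steps the window contains only predicted values, which is why errors in early predictions can accumulate over long horizons.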

Denoising encoder

In actual construction and experimental scenarios, the measurement of concrete creep is not always accurate, and various factors can produce random changes or fluctuations in the measured data, which can distort the actual underlying behavior of the data.³⁸ The sources of noise are varied, such as differences in the proficiency of the experimentalists, imperfect accuracy of experimental equipment, and differences in measurement methods. At the same time, attention mechanisms are very good at capturing different types of dependencies in the data. In other words, they are sensitive to small changes in the time series data, and even small noise can affect long-term creep predictions. Therefore, after the data are input, our first task is to reduce the noise in the data.

According to DRNN, residual networks have been notably successful in reducing noise in concrete creep data.³⁸ We use a five-layer residual neural network with the objective of noise reduction. Residual neural networks alleviate the vanishing gradient problem through simple skip connections, allowing the network to learn more efficiently and retain more of the raw, noise-free information. The skip connection is expressed as:

$$H(x) = F(x, W) + x \tag{2}$$

In this equation, $H (x)$ is the desired underlying mapping or output of the network, and $F (x, W)$ is the learned residual function, where $x$ is the input to the layer and $W$ represents the weights of the network. The key to skip connections is that it is easier for the model to drive the residual function $F (x, W)$ toward zero than to fit the underlying mapping $H (x)$ directly, so the network can focus on learning the noise and subtracting it from the input.
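One way to read the skip connection in a denoising setting is the short derivation below. It rests on an illustrative assumption that is not stated in the paper: the noisy input decomposes additively as $x = s + \varepsilon$, with clean signal $s$ and noise $\varepsilon$ (both symbols are introduced here only for this sketch).

```latex
x = s + \varepsilon, \qquad H(x) = s
\;\Longrightarrow\;
F(x, W) = H(x) - x = -\varepsilon .
```

Under this assumption the residual branch only has to estimate the negated noise, and it tends to zero when the input is already clean.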

Another benefit of using a CNN encoder is dimensionality reduction of the embedded features. The CNN encoder produces a more compact sequence that effectively shortens the sequence length and simplifies the task for the decoder. The ratio of the decoder input dimension to the encoder input dimension is 1:2. This reduction in sequence length is crucial as it lowers the complexity and computational burden on the decoder, allowing it to focus on generating more accurate outputs. In addition, CNNs are particularly adept at capturing local patterns and correlations within the data due to their convolutional filters.⁴⁴ In contrast, Transformer decoders are proficient at capturing global features and long-distance dependencies through self-attention mechanisms. This complementary relationship between CNNs and Transformer decoders enhances the model’s overall performance.

The advantage in capturing local patterns and correlations stems primarily from the local receptive field and weight-sharing mechanisms inherent in convolutional layers. By applying convolutional filters over small and contiguous time windows, CNNs can efficiently learn short-term trends, abrupt changes, or local oscillations that are common in temporal data.⁴⁴ Because the same filters are applied across all positions in the sequence, the model becomes adept at detecting similar patterns regardless of their position in time, which enhances generalization.
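Weight sharing can be illustrated with a hand-written size-3 filter. The sketch below is only a toy: the difference filter `[-1, 0, 1]` is chosen for illustration (a trained CNN learns its filters), and `conv1d_same` is a hypothetical helper using zero padding.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding so output length equals input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]

# A size-3 difference filter responds to a local jump wherever it occurs.
kernel = [-1.0, 0.0, 1.0]
early_jump = [0.0] * 3 + [5.0] * 7    # step between index 2 and 3
late_jump = [0.0] * 7 + [5.0] * 3     # same step between index 6 and 7
resp_early = conv1d_same(early_jump, kernel)
resp_late = conv1d_same(late_jump, kernel)

# Position of the strongest positive response in each sequence.
peak_early = resp_early.index(max(resp_early))
peak_late = resp_late.index(max(resp_late))
```

The same filter produces the same peak response for both sequences, shifted by exactly the amount the step was shifted. The negative value at the last position is an edge effect of zero padding, not a detected pattern.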

Objective function

In the processing pipeline, we add Gaussian noise to input data and apply a multiobjective function to give the encoder noise reduction capability.

In order to make the model realize multiobjective learning of noise reduction and creep prediction simultaneously, mean squared error (MSE) is applied to the objective function. The function design is as follows:

$$L = \sum_{t = T + 1}^{n} \| x_{t} - \hat{x}_{t} \|_{2}^{2} + \sum_{i = 1}^{N} \ell(\tilde{x}_{i}, \hat{x}_{i}) + \lambda \Omega(\Theta) \tag{3}$$

This objective function consists of three terms. The data fidelity term measures the squared difference between the actual values $x_{t}$ and the predicted values ${\hat{x}}_{t}$ for each time step $t$ from $T + 1$ to $n$. In the context of noise reduction, the values $x_{t}$ are the observations with the Gaussian noise removed, and ${\tilde{x}}_{i}$ are the denoised estimates produced by the model; the aim is to make the denoised estimates as close as possible to the actual observations. The loss term applies $\ell$ to the denoised values ${\tilde{x}}_{i}$ and the predicted values ${\hat{x}}_{i}$. The regularization term prevents overfitting by penalizing the complexity of the model: $\Omega(\Theta)$ is a regularization function applied to the parameters $\Theta$ of the model, and $\lambda$ is a hyperparameter that controls the amount of regularization.
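The three terms can be made concrete with a small numerical sketch. Several choices here are assumptions rather than details given in the paper: the loss $\ell$ is taken to be a per-sample squared error (in line with the use of MSE), the regularizer is taken to be a squared L2 norm, and the value of the regularization weight `lam` is arbitrary.

```python
def sq_err(a, b):
    """Squared error between two scalars."""
    return (a - b) ** 2

def objective(x_actual, x_pred, x_denoised, params, lam):
    """Data fidelity + loss on denoised vs. predicted + lam * regularizer."""
    fidelity = sum(sq_err(x, p) for x, p in zip(x_actual, x_pred))
    loss = sum(sq_err(d, p) for d, p in zip(x_denoised, x_pred))
    penalty = sum(w ** 2 for w in params)   # assumed squared-L2 regularizer
    return fidelity + loss + lam * penalty

value = objective(
    x_actual=[1.0, 2.0, 3.0],     # clean observations
    x_pred=[1.0, 2.0, 4.0],       # model predictions
    x_denoised=[1.0, 2.0, 3.0],   # denoised estimates
    params=[0.5, -0.5],           # toy model parameters
    lam=0.1,
)
```

In this toy case the fidelity and loss terms each contribute 1.0 and the penalty contributes 0.1 × 0.5 = 0.05, so the objective is about 2.05.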

Transformer

Transformers were initially introduced in natural language processing and have revolutionized many areas of machine learning due to their ability to model complex patterns in data.⁴⁵ The main idea of the Transformer is the self-attention mechanism, which allows the model to measure the importance of different inputs. Conventional methods for predicting long-term creep require large amounts of empirical data and often fail to capture the nonlinear, time-dependent nature of the process accurately. The Transformer offers an alternative with its ability to model time dependencies.⁴⁶ By training on early creep data, we can use self-attention mechanisms to identify the most relevant factors affecting long-term creep. The Transformer can then use this information to predict future creep behavior accurately. As shown in Figure 2, the decoder mainly consists of a position encoding, a multi-head attention layer, and a feedforward network (FFN).

The Transformer uses a combination of sine and cosine functions of varying frequencies to generate positional encodings that capture the relative positions in the input data sequence. The positional encoding is added to the creep embeddings before they are processed by the encoder, allowing the model to understand the relationships within the time series. Each position is mapped to a vector, resulting in a matrix where each row represents an encoded sequence element and its positional information.
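A minimal sketch of such a sinusoidal encoding is given below, following the formulation of the original Transformer (sine on even indices, cosine on odd indices, with base 10000). The sequence length of 32 and embedding size of 16 are taken from the settings discussed earlier and are used here only as illustrative values; `positional_encoding` is a hypothetical helper name.

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sin on even indices, cos on odd indices."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each (even, odd) index pair shares one frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

pe = positional_encoding(seq_len=32, d_model=16)
```

Each row of `pe` would be added element-wise to the embedding at the corresponding time step.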

The multihead self-attention mechanism enables the Transformer to focus on different sequence parts. For creep prediction, this means weighing the importance of different sub-sequences of creep to improve long-term creep forecasts. The multihead attention mechanism helps the model capture complex, time-varying patterns in early creep data, enhancing prediction accuracy.
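The weighting of time steps can be illustrated with single-head scaled dot-product attention, the building block of multihead attention. The sketch below is deliberately simplified: it omits the learned query, key, and value projections and the parallel heads that a real layer uses, and the input numbers are toy values.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Single-head scaled dot-product attention on lists of vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)                      # one weight per time step
        weights.append(w)
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights

# Three time steps with 2-dimensional embeddings (toy numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = attention(x, x, x)   # self-attention: Q = K = V = x
```

Each row of `attn` sums to one and shows how strongly one time step attends to every other step; for the first step, the weights on steps 0 and 2 are equal and larger than the weight on step 1.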

The FFN in the Transformer model introduces nonlinearity, allowing it to learn complex patterns and handle the intricate nature of concrete creep influenced by various factors. The FFN processes each position independently, which is advantageous for analyzing time-series data where each time point can be affected by different factors. The FFN consists of two linear transformations with a Rectified Linear Unit activation function in between, further transforming features learned by the attention mechanism into higher-level abstract features for better prediction accuracy.
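In the notation of the original Transformer formulation, the two linear transformations with a ReLU in between can be written as below. The symbols $W_1$, $b_1$, $W_2$, and $b_2$ (weights and biases of the two linear layers) are introduced here for illustration and are not part of the notation in Table 1.

```latex
\mathrm{FFN}(x) = \max(0,\; x W_1 + b_1)\, W_2 + b_2
```

The same weights are applied at every position of the sequence, which is what "processes each position independently" means in practice.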

Experiment

Datasets

The experiments use two data sources: the NU database and the Yihui dataset.

The NU database is a comprehensive resource for concrete creep and shrinkage testing.⁴³ It covers a long measurement period and includes the effects of admixtures in modern concrete mixtures. The database contains approximately 1400 creep and 1800 shrinkage curves, with significant effects of admixtures on creep and shrinkage behavior observed. It provides information on various concrete compositions, including different cement types, aggregate sizes, admixtures, and various environmental conditions, such as temperature and humidity levels, which affect creep behavior. The data in this dataset were obtained through experiments using standardized testing methods, ensuring the consistency and reliability of the data.

The Yihui dataset was collected during the infrastructure construction phase by Shandong Yihui New Materials Co., Ltd. from 2013 to 2018 for the purpose of retaining test samples and providing product reports. This dataset contains commercial data on infrastructure projects in three Chinese provinces: Shandong, Sichuan, and Yunnan, including projects such as the Yibi Expressway, Xubi Railway, and Qinglan Expressway. It comprises 197 creep experiments, mainly from expressway projects, but still covers a wide range of conditions and scenarios, including tunnels, high-speed railways, roads, and other infrastructures. However, due to the lack of uniformity in data collection standards, many data points are missing key internal parameters of concrete, making them not directly comparable. Additionally, inconsistent collection intervals pose a challenge, as consistent intervals are essential for understanding trends and patterns over time in time series forecasting. Despite these issues, the real-world sampling of the data makes it highly instructive for the training and evaluation of the model.

Data preprocessing

In the study of concrete creep, field data collection is a critical component. However, it is often observed that the data collection process does not adhere to a fixed time interval. This irregularity can be attributed to factors such as project-specific requirements or constraints inherent to the experimental sites. Consequently, the datasets obtained from different projects or locations may exhibit inconsistencies in their temporal resolution.

To address this challenge and ensure the uniformity of the data input into our model’s sliding window, we standardize the data interval to one sample per day across all experimental scenes using an RF-based data completion method.

The RF algorithm operates by constructing multiple decision trees during the training phase and outputting the mode of the classes or the mean prediction of the individual trees.⁴⁷ It is particularly well suited for our purpose due to its ability to handle large datasets with numerous input variables and its inherent mechanism for estimating missing data. By utilizing this method, we can impute the missing values based on the patterns learned from the available data, thereby preserving the integrity of the dataset and maintaining the consistency required for accurate creep analysis. We also compared it with other popular data completion algorithms. As shown in Table 3, RF achieves the best performance.

Table 3.

Performance comparison of data completion methods.

Metric	KNN	CNN	XGBoost	RF
MAE	3.34	3.37	3.12	2.55
RMSE	4.66	4.40	4.91	4.07

KNN: k-nearest neighbors; CNN: convolutional neural network; XGBoost: extreme gradient boosting; RF: random forest; MAE: mean absolute error; RMSE: root mean squared error.

Although RF is not inherently designed for sequential data, it can capture temporal patterns when appropriate features are engineered. Since our data are time series with fixed intervals, we explicitly incorporated temporal information into the input features by encoding the timestamp associated with the loading time. This allows the RF to learn correlations that are implicitly related to temporal progression, thereby improving the quality of the imputation. The empirical results in Table 3 indicate that this time-aware feature encoding enables RF to produce relatively accurate imputations, demonstrating its viability as a data supplementation strategy for structured time series data.

The data completion method presented in Algorithm 1 processes a creep dataset $D$ that contains multiple experiments. For each creep experiment $T$ in the dataset, the algorithm first removes records after 120 days and then calculates the mean time interval between the data points. The experiment is discarded if the mean time interval is less than 1 day or at least 15 days, or if any single interval between consecutive data points is 60 days or longer. For experiments that pass these criteria, the algorithm then examines each data point $d_{i}$ and the interval to the following data point $d_{i + 1}$. If this interval is either an integer greater than or equal to 2 days or a decimal greater than 1.7 days, the data completion method $RF (d_{i})$ is applied to the data point. If neither condition is met, no action is taken on that data point. This approach ensures that only experiments with appropriate time intervals are preserved. Examples comparing raw and processed data are shown in Figure 4.

Algorithm 1.

Dataset completion algorithm.

Require: Dataset $D$, data completion method $RF (d)$
1:	for each experiment $T \in D$ do
2:		Delete creep records after 120 days
3:		Calculate the mean time interval $μ_{T}$ for $T$
4:		if $μ_{T} < 1$ or $μ_{T} \geq 15$ or $\exists$ time interval $t_{i} \geq 60$ in $T$ then
5:			Delete experiment $T$
6:		else
7:			for each data point $d_{i}$ in experiment $T$ do
8:				if interval between $d_{i}$ and $d_{i + 1}$ is an integer $\geq 2$ or a decimal $> 1.7$ then
9:					Apply $RF (d_{i})$ to complete the data
10:				else
11:					No operation is performed
12:				end if
13:			end for
14:		end if
15:	end for
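
The filtering and gap-detection logic of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the `impute` callback signature, the rule that an experiment with fewer than two records is discarded, and the choice to insert one value per missing whole day are all assumptions introduced here. A linear-interpolation placeholder stands in for the trained RF so that the sketch stays self-contained.

```python
from statistics import mean


def needs_completion(gap_days):
    """Gap rule of Algorithm 1: an integer gap >= 2 days or a fractional gap > 1.7 days."""
    if float(gap_days).is_integer():
        return gap_days >= 2
    return gap_days > 1.7


def linear_stand_in(t, left, right):
    """Placeholder for RF(d): linear interpolation between the neighbouring records.

    Assumption for illustration only; the paper uses a trained random forest here.
    """
    (t0, v0), (t1, v1) = left, right
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def complete_experiment(times, values, impute=linear_stand_in, max_day=120):
    """Return a list of (day, creep) pairs, or None if the experiment is discarded."""
    # Step 2: delete creep records after 120 days.
    records = [(t, v) for t, v in zip(times, values) if t <= max_day]
    if len(records) < 2:  # assumption: too short to define an interval
        return None

    gaps = [b[0] - a[0] for a, b in zip(records, records[1:])]
    mu = mean(gaps)
    # Steps 4-5: discard experiments whose mean interval is below 1 day or at
    # least 15 days, or that contain a single interval of 60 days or more.
    if mu < 1 or mu >= 15 or any(g >= 60 for g in gaps):
        return None

    completed = []
    for left, right in zip(records, records[1:]):
        completed.append(left)
        if needs_completion(right[0] - left[0]):
            # Assumption: fill one value per missing day strictly inside the gap.
            day = left[0] + 1
            while day < right[0]:
                completed.append((day, impute(day, left, right)))
                day += 1
    completed.append(records[-1])
    return completed
```

In practice, `impute` would wrap the fitted RF regressor and receive whatever time-aware features it was trained on; the two-neighbour signature used here is only a convenient simplification.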

Figure 4.

Completion samples by RF. RF: random forest.

It is important to note that our data completion method uses both preceding and subsequent context to predict and fill a gap. In creep prediction, however, future values are unknown at prediction time; the proposed data completion method is therefore not applicable to creep prediction itself. The data completion process is illustrated in Figure 5.

Figure 5.

Data completion. The diagram illustrates predicting and filling the missing value using the surrounding values. The example consists of known values (t0, t1, t2, t4, and t5) and a missing value at t3.

Evaluation metrics

To assess the performance of our concrete creep prediction model, we use three evaluation metrics: R-squared ($R^{2}$), MAE, and RMSE. These metrics evaluate the model’s accuracy and reliability from different perspectives.⁴⁸ $R^{2}$, also known as the coefficient of determination, measures how well the model explains the variability of the observed data. MAE is the average absolute difference between the predicted and actual values. RMSE is the square root of the average squared difference between the predicted and actual values, so it penalizes large errors more heavily than MAE:

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(\hat{y_{i}} - y_{i})}^{2}}

(4)

MAE = \frac{1}{n} \sum_{i = 1}^{n} | \hat{y_{i}} - y_{i} |

(5)

R = \frac{\sum_{i = 1}^{n} (\hat{y_{i}} - \bar{\hat{y}}) (y_{i} - \bar{y})}{\sqrt{\sum_{i = 1}^{n} {(\hat{y_{i}} - \bar{\hat{y}})}^{2}} \sqrt{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}}

(6)

where $y_{i}$ is the actual observed value, $\hat{y_{i}}$ is the value predicted by the model, $n$ is the number of observations, and $\bar{y}$ and $\bar{\hat{y}}$ are the means of the observed and predicted values, respectively.
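
Equations (4) to (6) translate directly into a few lines of Python. The sketch below uses only the standard library; the function names are assumptions made for illustration. Note that Equation (6) as written is the correlation coefficient $R$, and the text does not state how $R^{2}$ is derived from it, so the sketch returns $R$ only.

```python
import math


def rmse(y_true, y_pred):
    """Equation (4): root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Equation (5): mean absolute error."""
    n = len(y_true)
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n


def correlation_r(y_true, y_pred):
    """Equation (6): correlation between predicted and observed values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((p - mean_p) * (t - mean_t) for t, p in zip(y_true, y_pred))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    return cov / (spread_p * spread_t)
```

Because RMSE squares each residual before averaging, a single large miss raises RMSE more than it raises MAE, which is why the two are reported together.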

Parameter selection

To improve generalization and predictive performance, the model’s hyperparameters must be tuned. We therefore use grid search, a standard technique for hyperparameter optimization in machine learning.⁴⁹ Grid search performs an exhaustive search over a specified range of hyperparameter values and selects the combination that gives the best model performance within that grid. The hyperparameters tuned by grid search include the learning rate, the number of multihead attention layers, the level of Gaussian noise, and the weight decay. The model is trained using the Adam optimization algorithm with MSE as the loss function.

The final hyperparameters are: learning rate 0.001, multihead attention layers 3, noise level 0.1, dropout 0, and weight decay 5e-5.
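
The exhaustive search described above can be sketched as follows. Only the selected values (0.001, 3, 0.1, and 5e-5) come from the text; the other candidate values in `GRID`, the key names, and the `evaluate` callback (which would train the model with a given configuration and return a validation error) are hypothetical and used purely for illustration.

```python
from itertools import product

# Candidate values are illustrative assumptions; only the selected values
# (0.001, 3, 0.1, 5e-5) are reported in the text.
GRID = {
    "learning_rate": [0.01, 0.001, 0.0001],
    "attention_layers": [1, 2, 3, 4],
    "noise_level": [0.0, 0.1, 0.2],
    "weight_decay": [0.0, 5e-5, 5e-4],
}


def grid_search(grid, evaluate):
    """Evaluate every combination in `grid` and return the one with the lowest score."""
    names = list(grid)
    best_config, best_score = None, float("inf")
    for combo in product(*(grid[name] for name in names)):
        config = dict(zip(names, combo))
        score = evaluate(config)  # e.g. validation MSE after training with `config`
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

The number of training runs is the product of the list lengths (108 for the grid above), which is why grid search is usually restricted to a handful of values per hyperparameter.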

Transfer learning

In concrete creep data collection, continuous observation over the loading period is a long-term and complex task. In addition, reporting and evaluating creep tests is labor-intensive. As a result, the Yihui dataset, which comes from a real industrial environment, is much smaller than the NU database. Training the model from scratch on the Yihui dataset may lead to reduced accuracy, overfitting, and other problems. To address these problems, we introduce transfer learning in this study.

Transfer learning addresses a critical challenge in machine learning: when there is not enough labeled data to train on, it allows a pretrained model to be reused, thereby reducing the need for large amounts of labeled data.⁵⁰ Specifically, in this study, the creep prediction model was first pretrained on the NU database. The Yihui dataset was then used to fine-tune the pretrained model, which improved the model’s performance on the Yihui dataset. The performance of the fine-tuned model is reported in Table 4.

Table 4.

Performance metrics for different models on two datasets.

		NU database				Yihui dataset
Metric	Model	60 days	90 days	120 days	Overall	60 days	90 days	120 days	Overall
R²	B4¹⁵	0.62	0.64	0.64	0.70	0.60	0.63	0.66	0.68
	SVR¹⁸	0.89	0.89	0.89	0.91	0.84	0.86	0.86	0.87
	LGBM¹⁷	0.92	0.93	0.93	0.94	0.85	0.88	0.89	0.90
	XGBoost¹⁷	0.93	0.93	0.94	0.95	0.88	0.89	0.91	0.91
	C&S CNN²³	0.91	0.92	0.92	0.93	0.84	0.85	0.86	0.89
	DRNN²⁴	0.94	0.94	0.94	0.95	0.88	0.88	0.89	0.91
	Transformer³⁰	0.94	0.94	0.95	0.95	0.91	0.92	0.92	0.92
	Creepformer	0.95	0.95	0.95	0.95	0.92	0.92	0.92	0.92
	Creepformer_transfer	-	-	-	-	0.93	0.93	0.93	0.93
MAE	B4¹⁵	14.26	14.92	16.33	19.14	16.37	18.79	21.82	25.81
	SVR¹⁸	5.77	5.94	6.22	6.94	7.42	7.87	8.05	9.36
	LGBM¹⁷	5.13	5.20	5.21	5.41	6.10	7.36	7.39	7.89
	XGBoost¹⁷	4.61	4.60	4.65	4.83	6.20	7.13	7.82	7.87
	C&S CNN²³	5.41	5.44	5.87	6.48	7.72	7.89	8.22	9.45
	DRNN²⁴	4.96	5.16	5.37	5.83	5.90	6.29	6.86	7.07
	Transformer³⁰	3.02	5.18	7.10	7.10	3.54	5.61	8.03	8.03
	Creepformer	2.63	4.12	6.85	6.85	3.15	5.89	7.22	7.22
	Creepformer_transfer	-	-	-	-	3.04	5.41	7.10	7.10
RMSE	B4¹⁵	15.29	18.66	21.41	38.17	33.73	37.65	39.37	44.29
	SVR¹⁸	10.83	10.95	11.64	13.25	10.86	13.47	14.20	17.85
	LGBM¹⁷	10.24	12.15	12.81	14.81	10.04	12.15	13.08	17.60
	XGBoost¹⁷	8.63	8.85	11.98	13.90	9.67	12.30	12.59	14.26
	C&S CNN²³	10.50	11.58	12.09	13.64	9.68	12.55	13.63	16.07
	DRNN²⁴	6.23	7.58	8.70	8.88	8.68	8.80	9.39	10.89
	Transformer³⁰	3.68	5.80	9.66	9.66	5.82	7.60	11.52	11.52
	Creepformer	3.04	4.77	8.28	8.28	4.89	7.31	10.16	10.16
	Creepformer_transfer	-	-	-	-	4.38	6.53	10.09	10.09

NU: Northwestern University; RMSE: root mean squared error; MAE: mean absolute error; R²: R-squared; CNN: convolutional neural network; DRNN: denoising residual neural network.

Result analysis

Overall result

The experimental results of our proposed method and the baseline models are shown in Table 4. The method was verified on two datasets: the NU database and the Yihui dataset. Validation on the NU database provides evidence of the Creepformer’s effectiveness on a standardized benchmark, while validation on the Yihui dataset demonstrates the method’s practical significance for real-world creep prediction.

Table 5.

Performance metrics for long sequence time-series forecasting baselines on the NU database.

Model	Metric	60 days	90 days	120 days	Overall
Informer⁵⁰	R²	0.92	0.92	0.92	0.92
	MAE	6.93	8.75	8.84	9.20
	RMSE	8.62	9.95	10.37	11.04
Autoformer⁵¹	R²	0.91	0.91	0.91	0.91
	MAE	4.36	7.83	9.09	10.87
	RMSE	5.77	8.72	11.88	11.97
Creepformer	R²	0.95	0.95	0.95	0.95
	MAE	2.63	4.12	6.85	6.85
	RMSE	3.04	4.77	8.28	8.28

NU: Northwestern University; RMSE: root mean squared error; MAE: mean absolute error; R²: R-squared.

We examined the performance of the baseline and proposed models over different periods, dividing the experiments into 30–60, 90, and 120-day creep predictions. The overall performance indicates the models’ performance over the entire experimental cycle. For the parameter-free approach, which relies on time series, the duration of each experiment is fixed at 120 days. According to Table 4, the Creepformer and Transformer initially exhibit very low errors, but their errors increase noticeably as the prediction horizon extends (for example, the Creepformer’s MAE on the NU database rises from 2.63 at 60 days to 6.85 at 120 days). This occurs because of the accumulation of errors inherent in time series prediction: each prediction step depends on the previous ones, so any error in earlier predictions can propagate and amplify in subsequent predictions, leading to a cumulative effect.⁵¹

In contrast, parameter-based models have high errors from the start, and their results degrade slowly as the amount of test data increases. This is because, as the data volume grows, the test data contain more varied creep states, adding complexity that these models cannot capture. These phenomena suggest that parameter-based models do not effectively exploit the time series characteristics of the creep process. Despite the error accumulation, the Creepformer shows state-of-the-art performance compared with the baseline methods on both benchmarks, with the lowest overall RMSE in Table 4, especially on the Yihui dataset.

The Yihui dataset, collected from actual experimental scenes, contains more noise than the standardized samples of the NU database. Consequently, models usually perform worse on this dataset. However, the Creepformer and the DRNN demonstrate strong noise resistance, which reduces the impact of noise to a certain extent. This highlights the role of the encoder in noise reduction, which improves the model’s robustness and overall predictive accuracy.

Comparison with time series baseline

As shown in Table 5, our model achieves the best overall performance. Although Informer and Autoformer are specifically designed for time series forecasting, their relatively complex architectures are not well-suited to the characteristics of creep prediction data. In particular, the limited sequence length and small dataset size reduce the effectiveness of such models, which are typically optimized for large-scale, long-sequence scenarios. In contrast, our model demonstrates better adaptability and accuracy under these constraints.

Tenfold cross validation

Tenfold cross validation (CV) usually gives a more reliable performance estimate on small datasets.²³ We therefore selected DRNN and XGBoost, which perform well under random train-test splits, as baselines for a 10-fold CV comparison. The results are shown in Figure 6. Compared with random train-test splits, 10-fold CV led to a notable performance drop, suggesting that it is a more rigorous and reliable evaluation method. Additionally, as illustrated in the figure, our model consistently outperforms the baselines across all folds, unlike in random splits. This further supports the robustness and effectiveness of the proposed model.
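
For readers unfamiliar with the procedure, a 10-fold split can be sketched as below. The function name and the fixed seed are assumptions, and so is the choice to split at the level of whole creep experiments (rather than individual sliding windows), which keeps windows from the same experiment out of both the training and test sets; the text does not state which unit was used.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

Each sample is used for testing exactly once, and the reported score is the average over the k test folds, which makes the estimate less sensitive to one lucky or unlucky random split.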

Figure 6.

Tenfold experimental performance for baselines.

Ablation experiment

In the ablation experiments, we systematically evaluated the role of the encoder by removing it from the model’s structure and measuring the impact on predictive performance. The results in Figure 7 show that performance decreases significantly when the encoder is removed. This demonstrates the critical role of the encoder in feature compression and noise reduction, which allows the decoder to learn highly abstract temporal features.

Figure 7.

Effect of encoder.

In addition, model complexity may also affect the results of the ablation experiments. Removing the encoder reduces the complexity of the model, so the reduced performance could be caused by underfitting. Therefore, to further demonstrate the effectiveness of the encoder, we removed the encoder and increased the number of multihead attention layers to compensate for the loss of complexity. The results, also shown in Figure 7, indicate that increasing the model’s complexity does not consistently improve the results. This further demonstrates the encoder’s effectiveness in the model’s overall structure.

Case study

In the case study, we evaluated the proposed model’s performance in predicting the creep behavior of concrete using real-world samples. Specifically, we selected six concrete samples from the Yihui dataset, each with dimensions of 150 × 400 mm, which were tested under constant temperature and humidity conditions. The parameters of these cases are shown in Table 6; the parameter symbols are explained in Table 1.

Table 6.

Parameters of cases.

Case	w/c	a/c	f/c ₂₈	T	Sigma
1	0.53	5.5	25	20	10.00
2	0.53	5.5	25	20	10.00
3	0.40	5.3	41	20	18.03
4	0.53	7.9	20	20	7.99
5	0.33	2.7	80	20	15.00
6	0.27	6.0	61	20	40.00

The proposed model and the baselines are evaluated in Figure 8. During the early prediction phase in particular, our model is consistent with the observed creep behavior. However, as the observation period extends, deviations between the model predictions and the actual creep behavior gradually emerge. Despite these deviations in the later stages, our model predicts the overall creep trend robustly across all six cases.

Figure 8.

Case study for baselines. The red dashed line on the left is the input 32-day creep data; the ground truth and the prediction results of the different baselines are shown on the right.

Compared with parameter-based baselines, our model shows advantages in accuracy and robustness. Parameter-based models struggle to capture time series features, leading to unstable deviations between predicted values and actual creep trends. While a naive Transformer shows considerable capability in learning time series information, case 3 in Figure 8 reveals that it is susceptible to noise interference. In contrast, our model shows excellent noise resistance in case 3, which suggests that it can predict real-world concrete creep behavior accurately.

In addition, the proposed model generally exhibits better noise resistance than the other models. Several ensemble learning-based baselines show significant overfitting. This overfitting occurs because the regression functions of these models in the high-dimensional feature space are not smooth enough; as a result, they fit the training data too closely, including noise and outliers. In contrast, the proposed model incorporates noise during the training stage, and the encoder learns to handle this noise, significantly enhancing the model’s robustness.

Application in UHPC

UHPC represents a breakthrough in concrete technology, offering considerably greater strength, durability, and longevity than conventional concrete materials.^52,53 Because of its dense microstructure, high compressive strength, and low permeability, UHPC is increasingly being used in critical infrastructure projects, including bridges, high-rise buildings, and defense facilities. Nevertheless, as a concrete material, UHPC is still vulnerable to creep, which can compromise its long-term performance and structural integrity. Therefore, accurately predicting the creep behavior of UHPC is critical to ensuring the durability and reliability of the structures that use it.

The UHPC case study used the University of Adelaide’s UHPC test dataset, which includes basic and drying compressive creep and shrinkage data from creep experiments on 16 concrete samples over 1 year.⁵⁴ Note that the dataset reports compressive strain under specific load conditions, whereas our model works with creep compliance. To align the data, we use the following conversions:

Stress (MPa) = \frac{Force (kN) \times 1000}{Area ({mm}^{2})}

(7)

Creep compliance (μ ε / MPa) = \frac{Compressive strain (μ ε)}{Applied stress (MPa)}

(8)

Creep compliance, expressed as $10^{- 6} / MPa$ or $μ ε / MPa$ , is calculated as outlined in Equation (8). Moreover, in this dataset, the applied load on the UHPC samples is measured in kilonewtons (kN), necessitating a conversion using Equation (7).
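
As a worked illustration of Equations (7) and (8), the two conversions can be written as small Python helpers. The function names and the numerical values in the example (load, cross-sectional area, and strain) are hypothetical and are not taken from the dataset.

```python
def stress_mpa(force_kn, area_mm2):
    """Equation (7): convert an applied force in kN to stress in MPa (N/mm^2)."""
    return force_kn * 1000.0 / area_mm2


def creep_compliance(strain_microstrain, stress):
    """Equation (8): creep compliance in microstrain per MPa."""
    return strain_microstrain / stress


# Hypothetical reading: 120 kN on a 4000 mm^2 cross-section, 600 microstrain measured.
sigma = stress_mpa(120.0, 4000.0)             # 30.0 MPa
compliance = creep_compliance(600.0, sigma)   # 20.0 microstrain/MPa
```

The factor of 1000 converts kilonewtons to newtons, so that force divided by area in square millimetres yields megapascals directly.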

After fine-tuning, the final results of these tests are illustrated in Figure 9. We compared four UHPC samples; each sample’s name indicates its size and stress level, as detailed in Table 7. All samples were tested in the same laboratory environment, with temperatures varying between 16 and 24°C and humidity between 45% and 70%. Because of missing data, we use only the first 280 days of data in the experiment.

Figure 9.

Case prediction for UHPC samples. UHPC: ultra-high-performance concrete.

Table 7.

UHPC samples and their parameters.

Sample	Size (mm)	Load range (kN)
75–150–20%	75–150	115–125
75–150–40%	75–150	195–205
100–200–20%	100–200	235–245
100–200–40%	100–200	395–405

UHPC: ultra-high-performance concrete; kN: kilonewtons.

As shown in Figure 9, the results indicate that the proposed method not only predicts ordinary concrete well but also achieves high accuracy and robustness for UHPC. In addition, because ambient temperature, humidity, and stress levels changed continuously during the tests, the experiment also demonstrates the model’s adaptability to UHPC under dynamic environmental conditions and stress levels. This adaptability is essential for practical applications, as materials must often function under dynamic stresses and environmental influences.

Conclusion

In this study, we proposed a novel CNN-Transformer hybrid model, Creepformer, and applied it for the first time to the parameterless prediction of concrete creep, complemented by an RF-based data completion tool. Our model achieved strong prediction results on the NU and Yihui datasets. Additionally, we performed ablation experiments to verify the effectiveness of the encoder and conducted predictions and analyses on a set of real-world creep experiments. Finally, we applied our model to UHPC experiments and evaluated its performance, which showed high accuracy and adaptability to dynamic environments and stress levels.

However, our work still has several limitations:

Cumulative errors with increased prediction time: Our model tends to accumulate errors as the prediction horizon extends. This is a common issue in long-term time series prediction, where initial inaccuracies can propagate and magnify over time, leading to significant deviations from actual values. This cumulative error effect can undermine the reliability of long-term creep predictions. One possible solution is to build a multimodal prediction model that uses a small number of key static parameters of the concrete material.

Inaccuracies in data completion: While effective to a degree, the RF-based data completion tool introduces errors that prevent a perfect representation of actual creep changes. Incomplete or unevenly spaced data points are a challenge, and our current method does not fully capture the intricate variations of the actual creep behavior. These inaccuracies can lead to less precise predictions, especially in sparse or highly variable data scenarios. An adaptive data completion method may solve the related problems well.

Exclusion of key static parameters: Our model currently excludes some key static parameters, such as the water-cement ratio and aggregate-to-cement ratio, which are crucial for accurately characterizing the material properties of concrete. This omission can result in large prediction variances, particularly in cases where these static parameters significantly influence the creep behavior. The absence of these parameters limits the model’s ability to fully capture the material’s complexity and variability.

Based on these shortcomings, we plan to conduct the following research in the future:

Use multimodal models for prediction: To address the issue of cumulative errors and enhance prediction accuracy, our next aim is to develop multimodal models that integrate internal static parameters (such as water-cement ratio, aggregate-to-cement ratio) with early creep data. By combining these diverse sources of information, we hope to improve the consistency and reliability of our predictions over both time and space, providing a more holistic understanding of the material behavior.

Develop adaptive data completion models: To overcome the inaccuracies in data completion, we plan to develop a new adaptive model that can handle varying data intervals. This model will be designed to function effectively even when the time intervals in the data are not uniformly equal to 1 day. By accommodating floating intervals, we aim to ensure that our predictions remain robust and accurate and overcome the irregularities in the data sampling process.

Footnotes

Acknowledgements

The authors would like to express their sincere gratitude to Shandong Yihui New Material Co., Ltd. for providing access to the data used in this study.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the Ministry of Higher Education, Malaysia, under the Fundamental Research Grant Scheme (ref: FRGS/1/2020/ICT02/MUSM/03/1).

ORCID iDs

Conghui Li

Xin Wang

References

1. Feldman RF. Mechanism of creep of hydrated portland cement paste. Cem Concr Res 1972; 2(5): 521–540.

2. Hannant DJ. The mechanism of creep in concrete. Mater Struct 1968; 1(5): 403–410.

3. Bažant, Jirásek. Creep and hygrothermal effects in concrete structures. Dordrecht: Springer, 2018.

4. Bazant, L’Hermite. Mathematical modeling of creep and shrinkage of concrete. Nuclear Engineering and Design: Wiley, 2018.

5. Mutungi WN. Creep in concrete. Reinforced concrete structures. IntechOpen, https://doi.org/10.5772/intechopen.1000870 (2023).

6. Zouaoui, Miled, Limam. Prediction of the basic creep of normal and high-strength concretes based on an analytical micromechanical model. J Adv Concr Technol 2021; 19(8): 913–923.

7. Lim, Yang. Laboratory evaluation of tensile creep behavior of concrete at early ages. Appl Sci 2024; 14(3): 1275.

8. Hong S-H, Choi J-S, Yuan T-F, et al. A review on concrete creep characteristics and its evaluation on high-strength lightweight concrete. J Mater Res Technol 2023; 22: 230–251.

9. Bie, et al. A coupled creep and damage model of concrete considering rate effect. J Build Eng 2022; 45: 103621.

10. Zhu, Wang J-J, et al. Experimental and numerical study on creep and shrinkage effects of ultrahigh-performance concrete beam. Compos B Eng 2020; 184: 107713.

11. Bazant, Chern J-C. Bayesian statistical prediction of concrete creep and shrinkage. ACI J Proc 1984; 81(4): 319–330.

12. Comité Euro-International du Béton. CEB-FIP model code 1990: design code. Thomas Telford Publishing, 1993.

13. Taerwe, Matthys. Fib model code for concrete structures 2010. Ernst & Sohn, Wiley, 2013.

14. Bazant, Baweja. Creep and shrinkage prediction model for analysis and design of concrete structures: model B3. Materials and Structures: ACI Special Publications, 2000, pp. 1941–1984.

15. Hubler, Wendner, Bažant ZP. Statistical justification of model B4 for drying and autogenous shrinkage of concrete and comparisons to other models. Mater Struct 2015; 48(4): 797–814.

16. Breiman. Random forests. Mach Learn 2001; 45(1): 5–32.

17. Liu, et al. Modeling and analysis of creep in concrete containing supplementary cementitious materials based on machine learning. Constr Build Mater 2023; 392: 131911.

18. Liang, Chang, Wan, et al. Interpretable ensemble-machine-learning models for predicting creep behavior of concrete. Cem Concr Compos 2022; 125: 104295.

19. Long, Wang, et al. Modeling and sensitivity analysis of concrete creep with machine learning methods. J Mater Civ Eng 2021; 33(8): 04021206.

20. Bouras. Prediction of high-temperature creep in concrete using supervised machine learning algorithms. Constr Build Mater 2023; 400: 132828.

21. Tošić, de la Fuente, Marinković. Creep of recycled aggregate concrete: experimental database and creep prediction model according to the fib Model Code 2010. Constr Build Mater 2019; 195: 590–599.

22. Feng, Zhang, Gao, et al. Efficient creep prediction of recycled aggregate concrete via machine learning algorithms. Constr Build Mater 2022; 360: 129497.

23. Gandomi, Sajedi, Kiani, et al. Genetic programming for experimental big data mining: a case study on concrete creep formulation. Autom Constr 2016; 70: 89–97.

24. Zhu, Wang. Convolutional neural networks for predicting creep and shrinkage of concrete. Constr Build Mater 2021; 306: 124868.

25. Zhang. Enhancing concrete creep prediction with deep learning: a soft-sorted one-dimensional CNN approach. IEEE Access 2023; 11: 139314–139325.

26. Taha MMR, Noureldin, El-Sheimy, et al. Artificial neural networks for predicting creep with an example application to structural masonry. Can J Civ Eng 2023; 30(3): 523–532.

27. El-Shafie, Aminah. Dynamic versus static artificial neural network model for masonry creep deformation. Proc Inst Civ Eng Struct Build 2013; 166(7): 355–366.

28. Abed, El-Shafie, Bt Osman SA. Creep predicting model in masonry structure utilizing dynamic neural network. J Comput Sci 2010; 6(5): 597–605.

29. Hubler, Wan-Wendner, Bazant. Comprehensive database for concrete creep and shrinkage: analysis and recommendations for testing and recording. ACI Mater J 2015; 112: 547–558.

30. Vaswani, Shazeer, Parmar, et al. Attention is all you need. In: Proceedings of the 31st international conference on neural information processing systems. Curran Associates Inc., https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf (2017).

31. Vig, Belinkov. Analyzing the structure of attention in a transformer language model. In: Proceedings of the 2019 ACL workshop blackboxNLP: analyzing and interpreting neural networks for NLP, 2019, pp. 63–76. Association for Computational Linguistics.

32. Emmert-Streib, Dehmer. Evaluation of regression models: model assessment, model selection and generalization error. Mach Learn Knowl Extr 2019; 1: 521–551.

33. O’Shea, Nash. An introduction to convolutional neural networks. 2015. arXiv preprint arXiv:1511.08458.

34. Liashchynskyi. Grid search, random search, genetic algorithm: a big comparison for NAS. 2019. arXiv preprint arXiv:1912.06059.

35. Ing C-K. Accumulated prediction errors, information criteria and optimal forecasting for autoregressive time series. Ann Stat 2007; 35(3): 1238–1277.

36. Weiss, Khoshgoftaar, Wang DD. A survey of transfer learning. J Big Data 2016; 3(1): 9.

37. Bažant ZP. Prediction of concrete creep and shrinkage: past, present and future. Nucl Eng Des 2001; 203(1): 27–38.

38. Bažant, Prasannan. Solidification theory for concrete creep. I: formulation. J Eng Mech 1989; 115(8): 1691–1703.

39. Karthikeyan, Upadhyay, Bhandari NM. Artificial neural network for predicting creep and shrinkage of high performance concrete. J Adv Concr Technol 2008; 6(1): 135–142.

40. Bal, Buyle-Bodin. Artificial neural network for predicting creep of concrete. Neural Comput Appl 2014; 25(6): 1359–1367.

41. Ahmed El-Shafie, Noureldin. Neural network modeling of time-dependent creep deformations in masonry structures. Neural Comput Appl 2010; 19(4): 583–594.

42. Savita Maru, AKN. Neural network for creep and shrinkage deflections in reinforced concrete frames. J Comput Civ Eng 2004; 18(4): 350–359.

43. Sun, Bennett, Visintin. Creep data of UHPC. https://doi.org/10.25909/20011574.v1 (2022).

44. Huang, Wang, Wei, et al. Creep behaviour of ultra-high-performance concrete (UHPC): a review. J Build Eng 2023; 69: 106187.

45. Sun, Visintin, Bennett. Basic and drying creep of ultra-high performance concrete. Aust J Civ Eng 2023; 0(0): 1–11.

46. Graybeal BA. Material property characterization of ultra-high performance concrete. Washington DC: Federal Highway Administration, 2006.

47. Azmee, Shafiq. Ultra-high performance concrete: from fundamental to applications. Case Stud Constr Mater 2018; 9: e00197.

48. Wendner, Hubler, Bažant ZP. Statistical justification of model B4 for multi-decade concrete creep using laboratory and bridge databases and comparisons to other models. Mater Struct 2015; 48(4): 815–833.

49. Zhou, Zhang, Peng, et al. Informer: beyond efficient transformer for long sequence time-series forecasting. Proc AAAI Conf Artif Intell 2021; 35(12): 11106–11115.

50. Wang, et al. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. In: Proceedings of the 35th international conference on neural information processing systems (NeurIPS 2021), article no. 1717, 2021, p. 12. Red Hook, NY: Curran Associates Inc.

51. Vakharia, Gujar. Prediction of compressive strength and portland cement composition using cross-validation and feature ranking techniques. Constr Build Mater 2019; 225: 292–301.

52. Gujar, Vakharia. Prediction and validation of alternative fillers used in micro surfacing mix-design using machine learning techniques. Constr Build Mater 2019; 207: 519–527.

53. Han, Zhang, Yin. EEG emotion recognition based on the TimesNet fusion model. Appl Soft Comput 2024; 159: 111635.

54. Kitaev, Kaiser, Levskaya. Reformer: the efficient transformer. arXiv preprint arXiv:2001.04451, 2020.

A parameterless approach for long-term creep prediction in concrete using hybrid CNN-transformer model
