
Abstract

Civil and maritime engineering systems must be managed efficiently to keep failure risk at an acceptable level, because fatigue and corrosion gradually degrade their performance throughout the operational life. Structural health monitoring provides a timely capability to assess structural condition and performance metrics. However, using actual long-term monitoring data to guide life-cycle management under stochastic environments has not been sufficiently studied. To realize an optimal maintenance strategy within the service life, an integrated monitoring-based optimal management framework is developed on the basis of partially observable Markov decision processes (POMDPs) and Bayesian forecasting. In the proposed framework, the stochastic fatigue processes are quantified by the state transition matrix. The Bayesian dynamic linear model is embedded in the POMDP as a continuous observation part to forecast the cycling impacts and estimate the deterioration rate using long-term dynamic strain responses. In addition, making use of the special features of the problem considered in this paper, an adaptive discretization strategy is proposed to alleviate the complexity of large discrete observation spaces in the POMDP. The applicability and feasibility of the framework are evaluated through intelligent maintenance of fatigue-sensitive components with real-world monitoring data. After solving the POMDP with an efficient offline solver, the results demonstrate that structural interventions to extend the service life are uneconomical when a welded detail is approaching its end of life under normal service. Furthermore, if multiple interventions are available, the framework can find optimal maintenance actions based on the trade-off between long-term utility and the corresponding cost. As a prototype, this framework could also be adapted to aid life-cycle intelligent maintenance of other types of components under different deterioration scenarios.

Keywords

Structural health monitoring (SHM), fatigue life assessment, Bayesian dynamic linear models (BDLMs), partially observable Markov decision processes (POMDPs), life-cycle cost optimization

Introduction

Within the service life of an engineered structure, it could be subjected to different deterioration scenarios (e.g., corrosion, fatigue), which could affect the serviceability and functionality of the structure. For instance, Wardhana and Hadipriono¹ summarized failures of bridge structures in the United States between 1989 and 2000 and found 11 bridges failed due to fatigue. In addition, fatigue of ship critical details has always been the main concern for ship operation and is one of the deterioration mechanisms that can affect structural performance. Under these circumstances, decision-makers are faced with fatigue failure risk and must distribute limited budgets so that the component can be operated safely during the life cycle. These facts highlight the importance of an effective decision-making system for the vulnerable infrastructure in the years to come, and underscore the imperative need for developing versatile and intelligent decision-support frameworks that can serve multi-purpose real-time and life-cycle objectives. Motivated by the evolution of cutting-edge monitoring technologies, reliable transducers, and effective information processing algorithms, structural health monitoring (SHM) systems develop a real-time data-driven performance-based assessment ability which is the precondition of rational maintenance. Accordingly, developing decision-support systems for infrastructure management based on real-world monitoring data becomes possible and necessary.

In this paper, fatigue assessment based on long-term monitoring data is conducted. In successful SHM applications, the fatigue of steel components is usually assessed with the rainflow counting algorithm and linear cumulative damage theory. Ni et al.² proposed a fatigue reliability model which integrates the distribution of the T-joint stress range with Miner’s cumulative damage rule to estimate the fatigue life with long-term monitoring data. At the structural level, Ye et al.³ proposed a monitoring-based method for the fatigue assessment of the Tsing Ma Bridge by using the continuously measured dynamic strain responses. Furthermore, Farreras-Alcover et al.⁴ incorporated monitoring data with the S–N model to estimate the residual service life of welded joints in the Great Belt Bridge. Herein, to enhance compatibility with different types of data, the Markov chain model is proposed to assess the fatigue process. Rooted in stochastic process theory, this model can express non-stationary state deterioration based on inspection data as well as dynamic fatigue degradation based on monitoring data.

SHM is recognized as a feasible field facilitating to secure the operational safety of structures and prognosis of the damage in time to prevent additional maintenance costs or catastrophic failure.⁵ This is attributed to the capability of the SHM system in monitoring the impacts actually acting on the vulnerable components. SHM provides an insight into the structures undergoing gradual changes in condition. For fatigue component management, one of the underlying requirements is to rationally forecast cycling impacts based on the long-term strain responses. However, it poses a challenge in the practical application of prediction when facing the uncertainty and non-stationary features inherent in the monitoring data. To solve this aspect, different forecasting algorithms have been widely studied and attempted, for instance, the statistical models,⁶ time series models,⁷ gray system theory,⁸ and singular spectrum model.⁹ More recently, the Bayesian forecasting model has been applied in real-world data prediction due to its merits. Bayesian dynamic linear model (BDLM) can automatically accommodate non-stationary time series data through time-varying model parameters,¹⁰ which increases the robustness to tackle data missing. In addition, BDLM can adjust the model parameters through newly obtained observations and provide ahead forecasts without reconstructing models, and it can consider the uncertainty in monitoring data and quantifying the probabilistic distribution in forecasting.¹¹ These advantages motivate the BDLM favorably for forecasting the performance of an in-service structure based on long-term monitoring data. For instance, Ni et al.¹² employed this model in expansion joints assessment and damage alarm on the basis of the long-term displacement and temperature data. Wang and Ni¹³ also used BDLMs to predict the structural strain response. 
Although establishing a BDLM using SHM data is achievable, life-cycle maintenance of fatigue-sensitive components incorporating the long-term SHM data and predicted dynamic strain responses resulting from BDLM has not been well investigated. In this paper, BDLM is constructed to estimate the daily cumulative damage based on real-world dynamic strains and delineate the fatigue development by using different hidden blocks, such as overall trend, seasonal, and auto-regressive (AR). Moreover, predictions made available by BDLM are essential for the decision-maker to make more informed and rational maintenance actions on a time scale.

Another key step in infrastructure management is how to exploit prior information and long-term monitoring data to guide optimal maintenance. Some difficulties are associated with (i) incomplete information about the structural condition, (ii) state stochastic degradation and random outcome of maintenance, and (iii) long-term objectives. The majority of the models^14–17 in infrastructure management cannot simultaneously account for the above three paramount factors. The frameworks that satisfy three requirements^18–20 are solely feasible in simulation models without being compatible with the actual data (e.g., inspection, monitoring data). More realistic decision models are needed to develop in infrastructure management facing real data and uncertainty. These sequential decision-making models can be classified as genetic algorithm (GA), decision tree, and dynamic programming. The GA in optimal maintenance has been systematically studied.^21–23 Through setting genetic operators and mutation, the GA can search for the optimal strategy under multi-objectives. However, heavier computation leads to the low convergence speed of GA in complex decision problems. The decision tree²⁴ reduces the computation cost by setting a series of criteria to determine maintenance actions. This method is realized in high efficiency but may result in a feasible scheme rather than the optimal solution. In recent studies, dynamic programming methods in the robotic control field have been applied in infrastructure management. An approach referred to as the partially observable Markov decision processes (POMDPs) in dynamic programming provides a well-suited framework for maintenance in structural stochastic degradation under imperfect observation. 
Due to the sound life-cycle optimality guarantees and good convergence property,²⁵ POMDPs are employed in highway pavement maintenance,¹⁹ deck structure,²⁶ and bridge component management.²⁷ It is also employed to consider the value of information for the SHM system.^28–30 However, the aforementioned studies are restricted to numerical studies and existing research lacks sufficient attention to how to verify and update some assumptions, such as discrete observation value and stationary state transition matrix in traditional POMDPs,³¹ by incorporating the long-term health monitoring data. To accommodate the dynamic deterioration process and actual monitoring data, the solver for non-stationary³² and continuous observation POMDP³³ should be utilized. However, most online solvers use limiting steps lookahead exploration methods or artificially discretize the observation space to reduce the computation to a tractable level.³⁴ This simplification may induce the policy to fall into the sub-optimal solution.³⁵ Lim et al.³⁶ pointed out that there are no theoretically guaranteed online solvers that can find optimal solutions with the limit of finite computational resources. Until now, solving POMDPs with continuous observation spaces in an uncertain environment remains challenging.³⁷ In summary, to address the optimal decision-making problem for welded details, the model should address two issues: (1) establish a quantified model for structural informatics, probabilistic performance prediction, risk management, and control by incorporating real-world long-term health monitoring data. (2) Use a highly efficient algorithm to compute optimal or near-optimal solutions.

In this paper, an intelligent decision-support framework is proposed for welded details management based on long-term real-world monitoring data and POMDPs. The primary functions of the framework include a database from SHM, structural condition assessment, degradation prediction, life-cycle cost, and decision-making. The fatigue deterioration model is developed by using long-term health monitoring data and quantified by the Markov chain. The dynamic strain response of welded details is embedded into the observation part of POMDPs which successively update the prior information on structural conditions using the monitoring data to guide the decision-making. Furthermore, this study develops an efficient discretization strategy on the basis of the features of the state transition matrix and reward matrix in infrastructure management to tackle continuous observation spaces. The remainder of the paper is organized as follows: Methodology presents the Markov chain-based approach for fatigue assessment, the BDLM, and the continuous observation POMDPs. Subsequently, Illustrative example describes the application of the proposed model in life-cycle optimal maintenance of welded details under partially observable stochastic environments. The maintenance results are discussed in Results and discussions, before drawing conclusions. In summary, this study aims to: (i) use the Markov chain to describe the probabilistic fatigue damage process; (ii) establish the BDLMs with different regressive blocks incorporating the long-term monitoring data; (iii) design an adaptive discretization strategy for POMDPs with continuous observation spaces based on the features of infrastructure management; and (iv) establish an optimal maintenance policy-making framework using long-term monitoring data. 
The proposed framework can (1) avoid the theoretical inconsistencies of existing studies, (2) remain compatible with various statistical models and advanced sequential decision-making algorithms, and (3) provide decision-makers with an optimal maintenance policy.

Methodology

This section presents the algorithms integrated in the framework, whose computational flowchart is illustrated in Figure 1: the Markov chain for fatigue assessment, BDLMs for fatigue deterioration prediction, and continuous-observation POMDPs for formulating infrastructure management.

Figure 1.

Optimal life-cycle management framework to consider stochastic deterioration components based on long-term monitoring data. The bold words denote the algorithms introduced in this paper.

Markov-chain-based fatigue assessment

This part introduces the assessment of the cumulative fatigue damage via the Markov chain. The model uses discrete states and stochastic state-transition probabilities to describe the fatigue process, which matches the concept of state transition in the POMDP. Generally, there are two conventional methods for fatigue assessment. The crack growth model describes the possible crack propagation during the stress cycles, whereas the S–N curve model focuses on the failure probability after the structure has sustained stress cycles. The Markov chain model retains the properties of both models, since its physical states can be defined to express either the crack extension or the failure probability.

Fatigue model

Fatigue describes the progressive advance of a crack front, which blunts under increasing stress and re-sharpens under decreasing stress during load cycles. One characteristic of fatigue is the uncertainty of crack development, as depicted in Figure 2, which uses different colors to distinguish confidence intervals. The expected crack growth function (black line in Figure 2) in the thickness direction of the welded detail is given as³⁸:

\begin{matrix} N = \frac{1}{C} \int_{a_{0}}^{a_{c}} \frac{d a}{{(Δ K M_{k} F)}^{m}} \\ = \frac{1}{C} \int_{a_{0}}^{a_{c}} \frac{d a}{{(Δ σ \sqrt{π a} M_{k} F)}^{m}} \end{matrix}

(1)

Figure 2.

Crack development curve. In fatigue, the crack grows slowly in the early stage and propagates rapidly close to failure.

Equation (1) gives the expected number of cycles N for the crack to grow from the lower integration limit $a_{0}$ to the upper integration limit $a_{c}$ . C and m are experimentally obtained constants. $Δ K$ denotes the stress intensity factor range associated with the stress range $Δ σ$ and crack depth a. The geometry modification coefficient F accounts for the geometrical deviation from a central through-thickness crack in an infinite plate to a semi-elliptical surface crack on the weld side of the plate.³⁹ In addition, a semi-empirical formula proposed by Bowness and Lee⁴⁰ is used to estimate the non-uniform stress effect $M_{k}$ at the weld notch. Because the fatigue assessment process is uncertain, as Figure 2 depicts, a probabilistic model should be embedded in the crack growth process. In this study, the log-normal distribution is employed to consider the uncertainty of fatigue, in which the standard deviation $S_{\log N}$ is regarded as constant during crack development.⁴¹
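As a rough numerical illustration of Equation (1), the sketch below integrates the crack growth law with a midpoint rule. It is only a sketch under stated assumptions: the correction factors $M_{k}$ and F are set to 1 (the Bowness and Lee formula is not reproduced), and the crack depths, stress range, and constants C and m are hypothetical values chosen to exercise the formula, not data from this study.

```python
import math


def cycles_to_grow(a0, ac, delta_sigma, C, m,
                   Mk=lambda a: 1.0, F=lambda a: 1.0, steps=20000):
    """Midpoint-rule estimate of Equation (1): cycles to grow a crack from a0 to ac."""
    h = (ac - a0) / steps
    integral = 0.0
    for i in range(steps):
        a = a0 + (i + 0.5) * h                          # midpoint of the i-th slice
        delta_k = delta_sigma * math.sqrt(math.pi * a)  # stress intensity factor range
        integral += h / (delta_k * Mk(a) * F(a)) ** m
    return integral / C


# Hypothetical inputs (assumptions, not values from the paper).
A0, AC = 0.0005, 0.02          # initial and critical crack depth (m)
DELTA_SIGMA = 60.0             # stress range (MPa)
C_PARIS, M_PARIS = 2.0e-12, 3.0

n_numeric = cycles_to_grow(A0, AC, DELTA_SIGMA, C_PARIS, M_PARIS)

# Closed form for Mk = F = 1 and m != 2, used only as a cross-check.
k = 1.0 - M_PARIS / 2.0
n_closed = (AC ** k - A0 ** k) / (
    k * C_PARIS * (DELTA_SIGMA * math.sqrt(math.pi)) ** M_PARIS)
print(round(n_numeric), round(n_closed))
```

Passing depth-dependent callables for `Mk` and `F` would let the same routine handle non-trivial correction factors, at which point no closed form is available and the numerical integral is the only check.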

To simplify Equation (1), the S–N curve model is introduced, which presents a bilinear relation between the stress amplitude $Δ σ$ and the expected number of cycles to failure $μ_{\log N}$ on a double-logarithmic scale. The inflection point of the bilinear curve is the fatigue limit $Δ K_{th}$ . The equation of the S–N curve is given as:

μ_{\log N_{0}} = \log A_{0} - m \log Δ σ

(2)

where m is the slope of a linear function, and $A_{0}$ is the material property determined by the corresponding fatigue experimental results. The subscript number of $μ_{\log N_{0}}$ and $A_{0}$ refer to the reliability. For variable amplitude stress spectrum $Δ σ_{i}$ , the corresponding failure cycling number $N_{0}^{i}$ is calculated by Equation (2). The damage is assumed as a linear cumulative process $n_{i} / N_{0}^{i}$ , where $n_{i}$ is cumulative cycles in stress range $Δ σ_{i}$ .
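The S–N relation of Equation (2) and the linear cumulative damage $n_{i} / N_{0}^{i}$ can be sketched in a few lines. The example assumes base-10 logarithms and a single-slope curve (the bilinear knee is not modeled); `LOG_A0`, `M_SLOPE`, and the stress spectrum are hypothetical values for illustration only.

```python
import math


def cycles_to_failure(delta_sigma, log_a0, m):
    """Equation (2): log10(N) = log10(A0) - m * log10(delta_sigma), returned as N."""
    return 10.0 ** (log_a0 - m * math.log10(delta_sigma))


def miner_damage(spectrum, log_a0, m):
    """Sum n_i / N_0^i over (stress range, cycle count) pairs."""
    return sum(n_i / cycles_to_failure(ds, log_a0, m) for ds, n_i in spectrum)


# Hypothetical material constants and variable-amplitude spectrum (assumptions).
LOG_A0 = 12.0
M_SLOPE = 3.0
spectrum = [(20.0, 2.0e5), (40.0, 5.0e4), (80.0, 1.0e3)]  # (MPa, cycles)

damage = miner_damage(spectrum, LOG_A0, M_SLOPE)
print(damage)
```

Under this rule, failure is conventionally associated with a cumulative damage of 1, so the reciprocal of the damage accumulated per unit time gives a simple life estimate.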

For fatigue analysis, corrosion is an essential factor, since the fatigue life of welded details in the marine environment is significantly shorter than in dry air. The corrosion of welded details can be approximately divided into three phases.⁴² In phase 1, corrosion starts at the welded area, where stress concentration can cause the protective coating to fail. In phase 2, moderate corrosion weakens the cross-section and increases the principal stress. In phase 3, the weld toe loses mass, which severely damages the detail. To quantify corrosion over the component life cycle, an approximate estimation method is proposed⁴³:

M = ν t^{λ}

(3)

where M refers to the mass loss. In welded details, as the size in the thickness direction is much smaller than the other two dimensions, M is treated as equivalent to the thickness reduction. $ν$ denotes the mass loss after the first year of exposure, while $λ$ indicates the corrosion diffusion rate. Notably, Equation (3) is a general statistical model which does not consider nonuniform corrosion or the coupling effects of erosion and stress. Nevertheless, applications^42,44,45 in structural details indicate that these approximations are acceptable. Because the cross-section is reduced, the equivalent number of cycles $N_{e}$ after corrosion is derived from Equation (2):

N_{e} = Λ^{m} N = \frac{N}{{(1 - 2 ν t^{λ} / T)}^{m}}

(4)

where $Λ$ is the stress amplification factor, and T is the thickness of the detail. According to erosion experiments⁴⁶ and practical statistics,⁴⁷ the coating is conservatively assumed to protect welded details from corrosion for 4 years.
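Equations (3) and (4) can be combined into a small helper, sketched below. How the 4-year coating protection enters the time variable is not spelled out in the text; the sketch assumes corrosion time is counted from the end of the protection period. The values of `NU`, `LAM`, `THICKNESS`, and `M_SLOPE` are hypothetical.

```python
def thickness_loss(t_years, nu, lam, coating_life=4.0):
    """Equation (3), M = nu * t**lam, with t counted after coating failure (assumption)."""
    exposed = max(0.0, t_years - coating_life)
    return nu * exposed ** lam


def stress_amplification(t_years, nu, lam, thickness, coating_life=4.0):
    """Factor Lambda = 1 / (1 - 2 * M / T) implied by Equation (4)."""
    remaining = 1.0 - 2.0 * thickness_loss(t_years, nu, lam, coating_life) / thickness
    if remaining <= 0.0:
        raise ValueError("section fully corroded; Equation (4) no longer applies")
    return 1.0 / remaining


def equivalent_cycles(n_cycles, t_years, nu, lam, thickness, m, coating_life=4.0):
    """Equation (4): N_e = Lambda**m * N."""
    return stress_amplification(t_years, nu, lam, thickness, coating_life) ** m * n_cycles


# Hypothetical parameters (assumptions): loss in mm, thickness in mm.
NU, LAM, THICKNESS, M_SLOPE = 0.03, 0.8, 20.0, 3.0

for year in (3, 10, 24):
    print(year, equivalent_cycles(1000.0, year, NU, LAM, THICKNESS, M_SLOPE))
```

While the coating is intact the amplification factor is exactly 1, so the equivalent cycle count equals the measured one; afterwards each measured cycle counts as slightly more than one.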

Markov Chain model

The Markov chain model is employed because it not only represents the stochastic degradation process but also avoids complex mathematical operations, such as the integration in Equation (1). The Markov chain is built on state transitions: a probability transition matrix describes the degradation of a component from state $s_{t}$ to state $s_{t + 1}$ after cycling impacts, as indicated in Equation (5). Herein, the duty cycle (DC) is introduced to quantify the cumulative cycling during a fixed time interval (e.g., a year). To connect fatigue with the Markov chain model, the discrete states s need to be defined as specific physical quantities, such as crack depth or failure probability. The extent of fatigue damage is then approximately characterized by a finite number of discrete states. Because of the uncertainty in the fatigue process, the component condition at any moment is not a single definite state but is represented probabilistically by a d-sized vector $b_{t}$ (belief state), $(p_{s_{1}} (t), p_{s_{2}} (t), \dots, p_{s_{d}} (t))$ , where d denotes the number of discrete states. The expected state $E (s_{t})$ is obtained by summing the products of each state and its probability, and the variance $σ (s_{t})^{2}$ is calculated as $E (s_{t}^{2}) - E (s_{t})^{2}$ . The two parameters $E (s_{t})$ and $σ (s_{t})$ correspond to N and $S_{\log N}$ in the S–N model. In summary, the Markov chain model approximately represents the uncertainty in fatigue by finite discrete states $s_{j}$ and their probabilities $p_{s_{j}}$ , as shown at the bottom right of Figure 2.

The following step is to build the numerical expression between failure probability and DC in the Markov chain model. Different from the S–N model, the fatigue process in the Markov chain is expressed by state vector transform in the matrix, where the state is defined as reliability. The unit-jump matrix is assumed to denote the state transition in a DC based on the statistical inspection results,⁴⁸ defined as⁴⁹:

P_{DC} = {(\begin{matrix} \frac{r_{1}}{r_{1} + 1} & \frac{1}{r_{1} + 1} & \dots & 0 & 0 \\ 0 & \frac{r_{2}}{r_{2} + 1} & \frac{1}{r_{2} + 1} & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & \frac{r_{d - 1}}{r_{d - 1} + 1} & \frac{1}{r_{d - 1} + 1} \\ 0 & 0 & 0 & 0 & 1 \end{matrix})}_{(d \times d)}

(5)

The unit-jump assumption means that the current state can only degrade to the adjacent higher state within a DC. In Equation (5), the argument $r_{j} (j \in [1, d - 1])$ in each row is defined as the state transition resistance (STR), which represents the expected number of DCs needed to deteriorate by a unit state, as indicated in Figure 3. Using the STRs, multiple lines with slopes $(Δ S_{j} / (r_{j} + 1))$ are plotted to approximate the black curve in Figure 2. Once the component states are defined, the expected DCs $(r_{j} + 1)$ are obtained from the curve. Another parameter that needs to be determined is the matrix size d. The standard deviation of the fatigue failure distribution, the number of discrete states d, and the expected number of DCs before failure $N_{f}$ follow the relationship⁵⁰:

σ (s_{t}) = \sqrt{N_{f} (\frac{N_{f}}{d - 1} - 1)}

(6)

The standard deviation $σ (s_{t})$ and the expected number of DCs to failure $N_{f}$ can be obtained from fatigue experimental statistics. Finally, the belief state $b_{t}$ reached from $b_{t - 1}$ after $N_{t}$ DCs is expressed by the Markov chain:

b_{t} = b_{t - 1} (P_{DC})^{N_{t}} = b_{0} Π_{i = 1}^{t} (P_{DC})^{N_{i}}

(7)

It should be noted that the last expression in Equation (7) assumes that $P_{DC}$ is constant over the life cycle. To consider the non-homogeneous deterioration induced by erosion and changing material properties, the equivalent cycling number $N_{t}$ needs to be adjusted by Equation (4). In summary, the merits of the Markov chain model in fatigue are:

- Initial damage can be incorporated into the model via setting the initial condition vector $b_{0}$ .

- The evolution of damage can be traced at any moment.

- The accelerated fatigue damage process is approximated by different STRs $r_{j}$ .

- Uncertainty is quantified as $E (s_{t})$ and $σ (s_{t})^{2}$ in discrete states.

The Markov chain model plays an important role in connecting the different algorithms in the integrated management framework. Through Equation (7), the monitoring data are converted into the probabilistic distribution of the belief state $b_{t}$ , which is used as the observation vector to update the prior state vector in the POMDP model.
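The unit-jump matrix of Equation (5) and the belief propagation of Equation (7) can be sketched as follows. The sketch assumes states labeled 1 to d, integer numbers of DCs, and equal STRs (a hypothetical choice made so that the simulated failure-time spread can be compared with Equation (6)); the corrosion adjustment of Equation (4) is not included.

```python
import math


def unit_jump_matrix(strs):
    """Build P_DC of Equation (5) from STRs r_1..r_{d-1}; the last state is absorbing."""
    d = len(strs) + 1
    P = [[0.0] * d for _ in range(d)]
    for j, r in enumerate(strs):
        P[j][j] = r / (r + 1.0)          # stay in state j during one DC
        P[j][j + 1] = 1.0 / (r + 1.0)    # jump to the adjacent higher state
    P[d - 1][d - 1] = 1.0
    return P


def step(belief, P):
    """One DC: row vector times matrix."""
    d = len(belief)
    return [sum(belief[i] * P[i][j] for i in range(d)) for j in range(d)]


def propagate(belief, P, n_dc):
    """Equation (7): b_t = b_{t-1} (P_DC)^{N_t} for an integer number of DCs."""
    for _ in range(n_dc):
        belief = step(belief, P)
    return belief


def state_moments(belief):
    """Expected state and variance, with states labeled 1..d (assumption)."""
    mean = sum((j + 1) * p for j, p in enumerate(belief))
    second = sum((j + 1) ** 2 * p for j, p in enumerate(belief))
    return mean, second - mean ** 2


def failure_time_moments(P, horizon=2000):
    """Mean and standard deviation of the DC at which the last state is first reached."""
    belief = [1.0] + [0.0] * (len(P) - 1)
    failed_before, mean, second = 0.0, 0.0, 0.0
    for n in range(1, horizon + 1):
        belief = step(belief, P)
        p_n = belief[-1] - failed_before   # probability of failing exactly at DC n
        failed_before = belief[-1]
        mean += n * p_n
        second += n * n * p_n
    return mean, math.sqrt(second - mean ** 2)


STRS = [4.0, 4.0, 4.0, 4.0]              # hypothetical equal STRs, d = 5
P_DC = unit_jump_matrix(STRS)
b10 = propagate([1.0, 0.0, 0.0, 0.0, 0.0], P_DC, 10)
mean10, var10 = state_moments(b10)
n_f, sigma_f = failure_time_moments(P_DC)
sigma_eq6 = math.sqrt(n_f * (n_f / (len(P_DC) - 1) - 1.0))
print(b10, mean10, var10, n_f, sigma_f, sigma_eq6)
```

Setting the first entries of the initial belief to values other than (1, 0, ..., 0) is how initial damage would enter this sketch.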

Figure 3.

The fatigue process described by the Markov chain model, where $r_{j}$ denotes the expected number of DCs in a state jump.

Bayesian dynamic linear models

An appropriate maintenance strategy should consider both the current component condition and the future deterioration rate. Thus, a prediction method named BDLM is employed in this study. The background theories of AR and Kalman filtering (KF), which are embedded within the BDLM, are not discussed for brevity. The BDLM has shown promising merits in the field of SHM. Its flexibility allows it to incorporate various regressive blocks to reflect the characteristics of the data. In addition, it considers the relationship between different blocks through covariance. Since KF underlies the updating process, the BDLM is robust in cases with distorted or incomplete data.

BDLM contains two parts: the dynamic linear model (DLM) and the Bayesian method used to update the model parameters. For the DLM, an observation equation and a system evolution equation are implemented to describe the dynamic change of the monitoring data. Define Y as the dynamic response of the structure sampled by SHM, $F$ as the regressor (the superscript T denotes the transpose), X as the system variable, and v as the noise term obeying a normal distribution. Then, the time-dependent observation equation is written as⁵¹:

Y_{t} = F_{t}^{T} X_{t} + v_{t}, v_{t} ~ N [0, V_{t}]

(8)

Since $X_{t}$ evolves over time, it changes from a static value into a dynamic variable governed by an evolution equation⁵¹:

X_{t} = G_{t} X_{t - 1} + ω_{t}, ω_{t} ~ N [0, W_{t}]

(9)

where $G_{t}$ indicates the evolution matrix which correlates system variables from $X_{t - 1}$ to $X_{t}$ . $ω_{t}$ denotes a stochastic error vector that quantifies the information loss over systematic variables evolution. It is expressed by multivariate Gaussian distributions with zero mean and covariance matrix $W_{t}$ . It should be noted that the observation noise $v_{t}$ and variable evolution error $ω_{t}$ are assumed independent in DLM.

The recursive process in the DLM means that the initial variable $X_{0}$ affects subsequent variables. To consider the uncertainty of the initial value, $X_{0}$ is assumed to follow a Gaussian distribution $N [z_{0}, C_{0}]$ , where $z_{0}$ and $C_{0}$ represent the mean value and covariance matrix. Starting with the initial variable $X_{0}$ , the evolution of $X_{t}$ is obtained via Equation (9). Meanwhile, the corresponding response $Y_{t}$ is estimated by Equation (8). If the prediction $Y_{t}$ is identical to the observation $Y'_{t}$ , the current model parameters match the monitoring data. Otherwise, when the discrepancy between prediction and observation is not negligible, Bayesian updating is necessary to modify the model variables. Before detailing the updating procedures, the following part briefly reviews the regressive blocks in BDLM.

Block component

Theoretically, BDLM can combine any mathematical functions that describe a pattern of change, such as linear growth, accelerated growth, periodic fluctuation in Fourier form, periodic oscillation in kernel regression form, and AR. For stress cycling prediction, three block components are employed: (a) a trend component, (b) a seasonal component, and (c) an AR component.

The trend is the most fundamental component in the forecast model and describes a steady variation in the time series. This component can be expressed using polynomial functions. Normally, second-order polynomials are recommended to avoid the overfitting that higher-order polynomials may cause. The matrix pattern is given as¹¹:

\begin{matrix} F_{T}^{T} = (1, 0); X_{T_{t}} = (X_{L}, X_{T})_{t} & G_{T} = (\begin{matrix} 1 & Δ t \\ 0 & 1 \end{matrix}); W_{T} = σ_{T}^{2} (\begin{matrix} Δ t^{4} / 4 & Δ t^{3} / 2 \\ Δ t^{3} / 2 & Δ t^{2} \end{matrix}) \end{matrix}

(10)

where subscript T denotes that the above parameters are related to the trend component. Additionally, any arguments evolving with the time series are marked with subscript t. To avoid repetitiveness, the observation noise $v_{t}$ is individually included in the final observation equation.

Periodicity is another feature of structural monitoring data. The periodic fluctuation of strain is affected by freight transport, monsoons, and temperature. To consider these factors, the seasonal component is introduced in the DLM. A comparison of the likelihood values of the Fourier form and the kernel form showed the Fourier form to be more suitable for fatigue cycling monitoring data. The seasonal block is expressed as¹¹:

\begin{matrix} {F_{S}}^{T} = & (1, 0, 1, 0, \dots, 1, 0) \\ X_{S_{t}} = & (X_{S 1}^{1}, X_{S 1}^{2}, X_{S 2}^{1}, X_{S 2}^{2}, \dots, X_{Sn}^{1}, X_{Sn}^{2})_{t}^{T} \\ G_{S} = & diag (G_{S 1}, G_{S 2}, \dots, G_{Sn}); G_{Si} = (\begin{matrix} \cos p_{i} & \sin p_{i} \\ - \sin p_{i} & \cos p_{i} \end{matrix}) \\ W_{S} = & diag (ω_{S 1}, ω_{S 2}, \dots, ω_{Sn}); ω_{Si} = σ_{Si}^{2} (\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}) \end{matrix}

(11)

In Equation (11), the subscript i indexes the periodic components considered. Each oscillation is determined by its amplitude $\sqrt{(X_{Si}^{1})^{2} + (X_{Si}^{2})^{2}}$ and frequency $p_{i}$ .

The AR component describes the relationship between the current output and the n-order historical outputs. The previous n-order $X_{{AR}_{t - j}}$ and coefficient $ϕ_{j}$ are used to estimate the current variable $X_{A R_{t}}$ . The matrix pattern is written as¹¹

\begin{matrix} F_{AR}^{T} = & (1, 0, \dots, 0); X_{A R_{t}} = (X_{AR 1}, X_{AR 2}, \dots, X_{AR n})_{t} \\ G_{AR} = & (\begin{matrix} ϕ_{1} & ϕ_{2} & \dots & ϕ_{n - 1} & ϕ_{n} \\ 1 & 0 & \dots & 0 & 0 \\ 0 & 1 & \dots & 0 & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋮ \\ 0 & 0 & \dots & 1 & 0 \end{matrix}); \\ W_{AR} = & diag (σ_{AR}^{2}, 0, 0, \dots, 0) \end{matrix}

(12)

In Equation (12), the subscript AR refers to the auto-regressive feature. Once the three block components are integrated, the combined observation equation is obtained as:

Y_{t} = Y_{T} + Y_{S} + Y_{AR} + v_{t}, v_{t} ~ N [0, V_{t}]

(13)
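The assembly of the trend, seasonal, and AR blocks of Equations (10) to (12) into one regressor and one pair of system matrices can be sketched as below. This is a minimal illustration that assumes the blocks are stacked block-diagonally with no cross-block process noise; the function names and all numerical values (time step, angle $p_{i}$, AR coefficient, standard deviations) are hypothetical.

```python
import math


def block_diag(*blocks):
    """Place square matrices along the diagonal of a larger zero matrix."""
    n = sum(len(b) for b in blocks)
    out = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out


def trend_block(dt, sigma_t):
    """Equation (10): level and trend states."""
    s2 = sigma_t ** 2
    F = [1.0, 0.0]
    G = [[1.0, dt], [0.0, 1.0]]
    W = [[s2 * dt ** 4 / 4, s2 * dt ** 3 / 2], [s2 * dt ** 3 / 2, s2 * dt ** 2]]
    return F, G, W


def seasonal_block(angles, sigmas):
    """Equation (11): one 2x2 rotation per periodic component with angle p_i."""
    F, Gs, Ws = [], [], []
    for p, s in zip(angles, sigmas):
        F += [1.0, 0.0]
        Gs.append([[math.cos(p), math.sin(p)], [-math.sin(p), math.cos(p)]])
        Ws.append([[s ** 2, 0.0], [0.0, s ** 2]])
    return F, block_diag(*Gs), block_diag(*Ws)


def ar_block(phis, sigma_ar):
    """Equation (12): companion-form AR(n) block."""
    n = len(phis)
    F = [1.0] + [0.0] * (n - 1)
    G = [list(phis)] + [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n - 1)]
    W = [[0.0] * n for _ in range(n)]
    W[0][0] = sigma_ar ** 2
    return F, G, W


def assemble(*components):
    """Concatenate regressors and stack G and W block-diagonally."""
    F = [x for c in components for x in c[0]]
    G = block_diag(*[c[1] for c in components])
    W = block_diag(*[c[2] for c in components])
    return F, G, W


# Hypothetical configuration: daily step, one annual harmonic, AR(1).
F_all, G_all, W_all = assemble(
    trend_block(dt=1.0, sigma_t=0.2),
    seasonal_block(angles=[2 * math.pi / 365], sigmas=[0.1]),
    ar_block(phis=[0.8], sigma_ar=0.5),
)
print(F_all)
```

With this layout the response in Equation (13) is simply the inner product of `F_all` with the stacked state vector plus the observation noise.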

Sequential updating

In this part, the Bayesian updating process for the system variables $X_{t}$ is introduced. The essential model arguments in the Bayesian process are the regressor $F_{t}$ , evolution matrix $G_{t}$ , noise term $v_{t}$ , and error term $ω_{t}$ , which remain constant in the updating process. The values of these arguments are calculated by the optimization algorithm described in the Model parameters estimation part. Starting with the initial variables $X_{0}$ , the Bayesian updating process is illustrated in Figure 4. Suppose the posterior distribution of the variables $X_{t}$ at time t has been obtained as $P (X_{t} | D_{t}) ~ N [z_{t}, C_{t}]$ , where $D_{t}$ is the information from the initial distribution $N [z_{0}, C_{0}]$ and the observation sequence $[Y'_{1}, Y'_{2}, \dots, Y'_{t}]$ . Using Equation (9), the prior distribution $P (X_{t + 1} | D_{t})$ at time step $t + 1$ is estimated as:

\begin{matrix} E [X_{t + 1} | D_{t}] = G_{t + 1} E [X_{t} | D_{t}] + E [ω_{t + 1}] \\ = G_{t + 1} z_{t} = μ_{t + 1} \end{matrix}

(14)

\begin{matrix} Var [X_{t + 1} | D_{t}] = G_{t + 1} Var [X_{t} | D_{t}] G_{t + 1}^{T} + Var [ω_{t + 1}] \\ = G_{t + 1} C_{t} G_{t + 1}^{T} + W_{t + 1} = R_{t + 1} \end{matrix}

(15)

Once the variables $X_{t + 1}$ are obtained, the prior distribution of response $P (Y_{t + 1} | D_{t})$ can be calculated by Equation (8). This process is referred to as the 1 step-ahead forecast in BDLM:

\begin{matrix} E [Y_{t + 1} | D_{t}] = F_{t + 1}^{T} E [X_{t + 1} | D_{t}] + E [v_{t + 1}] \\ = F_{t + 1}^{T} μ_{t + 1} = f_{t + 1} \end{matrix}

(16)

\begin{matrix} Var [Y_{t + 1} | D_{t}] = F_{t + 1}^{T} Var [X_{t + 1} | D_{t}] F_{t + 1} + Var [v_{t + 1}] \\ = F_{t + 1}^{T} R_{t + 1} F_{t + 1} + V_{t + 1} = Q_{t + 1} \end{matrix}

(17)

Another key step of BDLM is to update the variables $X_{t + 1}$ with the actual observation $Y_{t + 1}^{'}$ . The forecast error is $e_{t + 1} = Y_{t + 1}^{'} - f_{t + 1}$ . According to the KF algorithm, the prior and the observation are combined through the Kalman gain $K_{t + 1} = R_{t + 1} F_{t + 1} / Q_{t + 1}$ . The posterior distribution $P (X_{t + 1} | D_{t + 1}) ~ N [z_{t + 1}, C_{t + 1}]$ is given as:

z_{t + 1} = μ_{t + 1} + K_{t + 1} e_{t + 1}

(18)

C_{t + 1} = R_{t + 1} - K_{t + 1} Q_{t + 1} K_{t + 1}^{T}

(19)

The entire process is shown in Figure 4. If the 1 step-ahead forecast procedure is repeated k times using only the current information $D_{t}$ , without the Bayesian updating of Equations (18) and (19), the k step-ahead prediction is obtained.
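The prediction, forecast, and update steps of Equations (14) to (19) can be sketched as plain functions. The sketch assumes a scalar observation, a column-vector regressor F stored as a list, and time-invariant G, W, F, and V; the function names and the local-level example values are hypothetical.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def predict_state(z, C, G, W):
    """Equations (14)-(15): prior mean mu and covariance R of the next state."""
    mu = matvec(G, z)
    GCGt = matmul(matmul(G, C), transpose(G))
    n = len(W)
    R = [[GCGt[i][j] + W[i][j] for j in range(n)] for i in range(n)]
    return mu, R


def forecast_response(mu, R, F, V):
    """Equations (16)-(17): forecast mean f and variance Q of the response."""
    return dot(F, mu), dot(F, matvec(R, F)) + V


def update_state(mu, R, F, V, y_obs):
    """Equations (18)-(19): posterior mean z and covariance C after observing y_obs."""
    f, Q = forecast_response(mu, R, F, V)
    K = [rf / Q for rf in matvec(R, F)]      # Kalman gain R F / Q
    e = y_obs - f                            # forecast error
    z = [m + k * e for m, k in zip(mu, K)]
    n = len(mu)
    C = [[R[i][j] - K[i] * Q * K[j] for j in range(n)] for i in range(n)]
    return z, C


def k_step_forecast(z, C, G, W, F, V, k):
    """Repeat the prediction k times without updating."""
    forecasts = []
    for _ in range(k):
        z, C = predict_state(z, C, G, W)
        forecasts.append(forecast_response(z, C, F, V))
    return forecasts


# Hypothetical one-state (local level) example.
G, W, F, V = [[1.0]], [[1.0]], [1.0], 2.0
mu1, R1 = predict_state([0.0], [[1.0]], G, W)
z1, C1 = update_state(mu1, R1, F, V, y_obs=4.0)
ahead = k_step_forecast(z1, C1, G, W, F, V, k=2)
print(z1, C1, ahead)
```

In the k step-ahead loop the forecast variance grows with each step because W is added again without any observation to reduce it.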

Figure 4.

BDLMs updating flow chart.

Model parameters estimation

The final problem is how to calculate the arguments $Ω = \{F_{t}^{T}, V_{t}, G_{t}, W_{t}\}$ in BDLM, which are chosen by maximum log-likelihood estimation:

Ω^{*} = \underset{Ω}{\arg max} [\log (P (Y_{1}^{'} ~ Y_{t}^{'} | Ω))]

(20)

P (Y_{1}^{'} ~ Y_{t}^{'} | Ω) = Π_{i = 1}^{t} P (Y_{i}^{'} | Y_{1}^{'} ~ Y_{i - 1}^{'}, Ω)

(21)

Equation (20) is the objective function, which aims to maximize the joint probability density of observations (Equation (21)). Any optimization algorithm,⁵² such as batch gradient descent, can be implemented to search for the optimal solution. The termination criterion is based on the log-likelihood value as follows:

L (n) > L (n - 1) \cap | \frac{L (n) - L (n - 1)}{L (n - 1)} | < τ

(22)

where $τ$ is the termination tolerance, set to $10^{-6}$, and n indexes the nth iterate $Ω_{n}$ of the model parameters. Once the arguments of the BDLM are determined, the Kalman smoother algorithm can be employed to recalculate the variables from $X_{t}$ back to $X_{0}$ to maximize the expected log-likelihood value. Motivated by the advantages of the BDLM, the prediction model is established to estimate daily load cycles based on dynamic strain data.
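The factorization in Equation (21) means the log-likelihood can be accumulated one forecast at a time while the filter runs, and Equation (22) then decides when to stop iterating. The sketch below illustrates both for a scalar model, assuming constant scalar arguments and Gaussian one-step forecasts; `log_likelihood` and `converged` are hypothetical helper names, and the optimizer that proposes new parameters is not shown.

```python
import math


def log_likelihood(ys, F, G, V, W, z0, C0):
    """Sum of log one-step-ahead forecast densities (Equations (20)-(21)).

    Scalar-state sketch: every argument is a plain float.
    """
    z, C = z0, C0
    total = 0.0
    for y in ys:
        mu = G * z                  # Equation (14)
        R = G * C * G + W           # Equation (15)
        f = F * mu                  # Equation (16)
        Q = F * R * F + V           # Equation (17)
        # log N(y; f, Q), the term P(Y_i | Y_1..Y_{i-1}, Omega)
        total += -0.5 * (math.log(2.0 * math.pi * Q) + (y - f) ** 2 / Q)
        K = R * F / Q               # Kalman gain
        z = mu + K * (y - f)        # Equation (18)
        C = R - K * Q * K           # Equation (19)
    return total


def converged(L_new, L_old, tau=1e-6):
    """Equation (22): the likelihood improved, but by a relative amount below tau."""
    return L_new > L_old and abs((L_new - L_old) / L_old) < tau
```

A parameter search would call `log_likelihood` once per candidate $Ω_{n}$ and stop as soon as `converged` returns true for two successive iterates.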

Partially observable Markov decision processes

This section aims to establish a sound decision framework to plan optimal maintenance actions based on the information from the SHM. The POMDP model is employed, which is a well-suited model for sequential decision-making problems operating under partial observation and an uncertain environment.

Discrete model

A traditional POMDP can be formulated as the 7-tuple $\langle \mathbb{S}, \mathbb{A}, \mathbb{O}, \mathbf{T}, \mathbf{O}, \mathbf{R}, γ \rangle$, where the first three hollow capital letters denote the set of structural states $\mathbb{S}$, the space of available maintenance actions $\mathbb{A}$, and the set of observable values $\mathbb{O}$. The corresponding lowercase letters represent discrete elements of these sets: $s_{i} \in \mathbb{S}$, $a_{m} \in \mathbb{A}$, and $o_{n} \in \mathbb{O}$. The scalars $| \mathbb{S} |$, $| \mathbb{A} |$, and $| \mathbb{O} |$ denote the number of elements in each set. The middle three bold capital letters are the three matrices that form the fundamental operators of the POMDP. The three-dimensional state transition matrix $\mathbf{T}$ $(| \mathbb{S} | \times | \mathbb{S} | \times | \mathbb{A} |)$ characterizes the state evolution through the conditional probability $p (s_{t + 1} | s_{t}, a_{t})$. The observation probability matrix $\mathbf{O}$ $(| \mathbb{S} | \times | \mathbb{O} | \times | \mathbb{A} |)$ quantifies the probability of the observation $p (o_{t + 1} | s_{t + 1}, a_{t})$ after the action $a_{t}$ is executed and the state moves to $s_{t + 1}$. The reward matrix $\mathbf{R}$ $(| \mathbb{S} | \times | \mathbb{A} |)$ defines the immediate cost $r (a_{t}, s_{t})$ to the agent after conducting an action. The last parameter is the reward discounting factor $γ \in [0, 1)$, which guarantees the convergence of POMDPs over an infinite horizon. Within this framework, an episode in infrastructure management is defined as follows: a structure starts in the initial state $s_{0}$ and degrades stochastically, while the agent implements a series of maintenance actions based on the observation system (SHM). The episode terminates once the component fails or reaches the end of its service life. The goal of the manager is to establish an optimal policy $π^{*}$ that minimizes the cost by weighing the long-term utility of maintenance actions in the light of observations, formulated as:

π^{*} = \underset{π}{\arg max} E [\sum_{i = 0}^{t} γ^{i} r (s_{i}, a_{i}) | π (a_{0}, o_{1}, \dots, a_{t}, o_{t + 1})]

(23)

The challenge in a POMDP is that the structural state is hidden. To handle this, the state distribution must be inferred in accordance with Bayesian theory. A dedicated quantity, the belief state vector b, is introduced to fuse the historical information into a probability distribution over all possible discrete states. The POMDP with hidden states is thereby converted into a belief-MDP, as shown in Figure 5. The posterior belief state $b_{t + 1}$ is derived from a Bayesian update based on the prior state transition matrix T and the observation matrix O, given as⁵³:

\begin{matrix} b_{t + 1} (s_{t + 1}) = p (s_{t + 1} | o_{t + 1}, a_{t}, b_{t}) \\ = \frac{p (o_{t + 1} | s_{t + 1}, a_{t})}{p (o_{t + 1} | a_{t}, b_{t})} \sum_{s_{t} \in S} p (s_{t + 1} | s_{t}, a_{t}) b (s_{t}) \end{matrix}

(24)

where the denominator $p (o_{t + 1} | a_{t}, b_{t})$ is the normalization coefficient which integrates all possible cases, given as:

p (o_{t + 1} | a_{t}, b_{t}) = \sum_{s_{t + 1} \in S} p (o_{t + 1} | a_{t}, s_{t + 1}) \sum_{s_{t} \in S} p (s_{t + 1} | s_{t}, a_{t}) b (s_{t})

(25)
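Equations (24) and (25) can be read as "predict with T, weight by O, normalize". The following Python sketch shows this for a single fixed action, assuming the matrices are stored as nested lists with `T_a[s][s2]` for $p (s_{t + 1} | s_{t}, a_{t})$ and `O_a[s2][o]` for $p (o_{t + 1} | s_{t + 1}, a_{t})$; the function name and data layout are illustrative choices, not the paper's implementation.

```python
def belief_update(b, T_a, O_a, obs):
    """Discrete Bayesian belief update for one action (Equations (24)-(25)).

    b    : list of state probabilities b_t(s)
    T_a  : T_a[s][s2] = p(s2 | s, a)
    O_a  : O_a[s2][o] = p(o | s2, a)
    obs  : index of the received observation o_{t+1}
    Returns the posterior belief and the normalization term p(o | a, b).
    """
    n = len(b)
    # Inner sum of Equation (24): predicted probability of each next state
    predicted = [sum(b[s] * T_a[s][s2] for s in range(n)) for s2 in range(n)]
    # Weight the prediction by the likelihood of the observation
    unnorm = [O_a[s2][obs] * predicted[s2] for s2 in range(n)]
    evidence = sum(unnorm)  # Equation (25)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / evidence for u in unnorm], evidence
```

Returning the normalization term alongside the belief is a convenience in this sketch, since the same quantity weights the future value in the Bellman recursion.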

Additionally, the hidden state $s_{i}$ in POMDPs needs to be replaced by the belief state $b (s_{i})$; for example, the reward becomes $\sum_{s_{i} = 1}^{S} r (a_{t}, s_{i}) b (s_{i})$. In an MDP, the policy decides the optimal action based on the current state and observation, $π (s, o) \Rightarrow a$. Since the belief state b contains all the historical information, the action decision can instead rely on the belief state mapping $π (b) \Rightarrow a$, as the dashed line in Figure 5 shows. The expected maximum cumulative reward (Equation (23)) in an episode then becomes:

V^{π^{*}} (b_{0}) = E^{π^{*}} [\sum_{t = 0}^{n} γ^{t} r (π (b_{t}), b_{t}) | b_{0} = b]

(26)

where Equation (26) defines the value function of belief point b under the optimal policy $π^{*}$. The objective of the POMDP model is to determine the optimal action at any belief point so as to maximize the value function. In dynamic programming, Equation (26) can be rewritten compactly as a one-step-ahead value function, known as the Bellman equation,⁵⁴ based on the transition between two belief states $(b_{t} \to b_{t + 1})$:

\begin{matrix} V^{π^{*}} (b_{t}) = max_{a_{t} \in A} [r (a_{t}, b_{t}) \\ + γ \sum_{o_{t + 1} \in O} p (o_{t + 1} | a_{t}, b_{t}) V^{*} (b_{t + 1} | o_{t + 1}, a_{t}, b_{t})] \end{matrix}

(27)

Equation (27) contains two parts: the instantaneous reward and the expected value of the belief point after the transition $(b_{t} \to b_{t + 1})$. An important feature of the POMDP model is that the optimal value functions are piecewise linear and convex, so they can be approximated by the maximum over a set of linear functions,⁵⁵ expressed as:

V (b) = max_{α \in Γ} \sum_{s_{j} = 1}^{S} α (s_{j}) b (s_{j})

(28)

Through Equation (28), the belief point value is simplified as the probability of state $b (s)$ times the value coefficient $α (s)$ . To visualize the physical meaning of Equation (28), a geometric model is constructed. Assuming the structural state number equals 3, the belief state vector is written as $(b (s_{1}), b (s_{2}), 1 - b (s_{1}) - b (s_{2}))$ . If the $b (s_{1})$ , $b (s_{2})$ , and belief value $V (b)$ are the coordinate values in Cartesian space, as shown in Figure 6, the adjacent belief points which share an identical value coefficient can be represented by a plane in belief domain. Afterwards, using the family of value coefficients $Γ = (α_{1}, α_{2}, \dots, α_{n})$ , multiple planes can be constructed in the belief domain. Finally, the upper boundary of these multiple planes represents the maximum value of Equation (28), which is the geometrical solution of POMDP guided by a policy. The optimal policy $π^{*}$ of POMDP is equivalent to finding the value coefficient $Γ = (α_{1}, α_{2}, \dots, α_{n})$ satisfying Equation (27) in the entire belief domain.
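The upper boundary described here can be evaluated directly from Equation (28). The sketch below assumes each value coefficient is stored as a plain list and that a separate list records the action attached to each coefficient; the three-state numbers in the usage note are made up purely for illustration.

```python
def value(b, alphas):
    """Equation (28): V(b) is the largest inner product alpha . b."""
    return max(sum(a_s * b_s for a_s, b_s in zip(alpha, b)) for alpha in alphas)


def best_alpha_index(b, alphas):
    """Index of the plane that forms the upper boundary at belief b."""
    scores = [sum(a_s * b_s for a_s, b_s in zip(alpha, b)) for alpha in alphas]
    return scores.index(max(scores))


def greedy_action(b, alphas, actions):
    """Action attached to the maximizing plane (actions[i] belongs to alphas[i])."""
    return actions[best_alpha_index(b, alphas)]
```

For example, with the hypothetical coefficients `[[0, -2, -10], [-3, -3, -3]]`, the belief `[0.8, 0.2, 0.0]` lies under the first plane and the belief `[0.0, 0.2, 0.8]` under the second, so the two beliefs would be mapped to different actions.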

Figure 5.

A schematic illustrating the POMDPs transference to belief-MDPs. The additional consideration is the probability of observation in each possible state.

Figure 6.

The solution process for three-dimensional POMDPs. The policy is represented by the multiple planes. The belief point value changes from $V (b_{t})$ to $V (b_{t + 1})$ after the stochastic transition of the belief state.

Importing Equation (28) into Equation (27), the belief value updating function (29) is derived as:

V (b_{t}) = max_{a_{t} \in A} [\sum_{s_{t} \in S} r (s_{t}, a_{t}) b (s_{t}) + γ \sum_{o_{t + 1} \in O} max_{α \in Γ} \sum_{s_{t} \in S} b (s_{t}) \sum_{s_{t + 1} \in S} p (o_{t + 1} | s_{t + 1}, a_{t}) p (s_{t + 1} | s_{t}, a_{t}) α_{t + 1} (s_{t + 1})]

(29)

In optimization, the above equation is named the BACKUP operator. An alternative form that is more convenient to implement in code is written as:

\begin{matrix} \underset{α \in Γ}{\arg max} α b_{t} = \underset{a \in A}{\arg max} [r (a_{t}, b_{t}) \\ + γ \sum_{o_{t + 1} \in O} p (o_{t + 1} | a_{t}, b_{t}) max_{α \in Γ} α b_{t + 1}] \end{matrix}

(30)
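A point-based BACKUP at a single belief point can be sketched as follows. This is the standard point-based form, in which the maximization over $α$ is taken separately for each observation before the results are summed; it assumes small discrete sets stored as nested lists with `T[a][s][s2]`, `O[a][s2][o]`, and `R[a][s]`. All names are illustrative and this is not the solver used in the paper.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def backup(b, alphas, T, O, R, gamma):
    """Point-based backup at belief b.

    T[a][s][s2] = p(s2 | s, a), O[a][s2][o] = p(o | s2, a), R[a][s] = r(s, a).
    Returns (new alpha vector, index of the maximizing action).
    """
    n_states = len(b)
    n_obs = len(O[0][0])
    best = None
    for a in range(len(T)):
        new_alpha = list(R[a])  # immediate reward term
        for o in range(n_obs):
            # One candidate per existing alpha:
            # g(s) = sum_{s2} p(o | s2, a) p(s2 | s, a) alpha(s2)
            candidates = [
                [sum(O[a][s2][o] * T[a][s][s2] * alpha[s2]
                     for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in alphas
            ]
            # Keep the candidate that is best at this belief point
            g_best = max(candidates, key=lambda g: dot(g, b))
            new_alpha = [x + gamma * g for x, g in zip(new_alpha, g_best)]
        if best is None or dot(new_alpha, b) > dot(best[0], b):
            best = (new_alpha, a)
    return best
```

Because the function returns the action together with the new vector, the mapping from value coefficients to actions is obtained as a by-product of the backup.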

Notably, Equations (29) and (30) imply that each value coefficient $α$ is associated with a specific action that maximizes the current belief value. Hence, a mapping from value vector $α$ to action a can be established once the entire set of optimal vectors $Γ^{*}$ is obtained. In summary, computing an optimal policy $π^{*}$ in a POMDP amounts to constructing the maximum value boundary of multiple hyperplanes in the belief domain. However, limitations remain before continuous POMDPs can be implemented in bridge management incorporating an SHM system. Since the distributions generated by BDLMs are arbitrary, they need to be expressed as a probability density function (PDF) to serve as the observation part of the POMDP. A tailored Gaussian-based methodology is therefore introduced to overcome this problem.

Continuous observation model

To embed BDLMs into the continuous observation POMDP, the Gaussian mixture model (GMM) is employed to express the distribution associated with the prediction. An arbitrary distribution can be approximated by a finite number of Gaussian distributions $g (o | μ, σ)$ with corresponding weight factors w, fitted through the expectation-maximization algorithm.⁵⁶ Thus, the PDF of the monitoring data can be written as:

f_{p} (o) = \sum_{i = 1}^{K} w_{i} g (o | μ_{i}, σ_{i})

(31)
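Evaluating Equation (31) is a weighted sum of normal densities. The sketch below assumes the weights, means, and standard deviations have already been fitted (the expectation-maximization step is not shown); `gaussian_pdf` and `gmm_pdf` are illustrative names.

```python
import math


def gaussian_pdf(o, mu, sigma):
    """Density g(o | mu, sigma) of a univariate normal distribution."""
    z = (o - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gmm_pdf(o, weights, mus, sigmas):
    """Equation (31): f_p(o) = sum_i w_i g(o | mu_i, sigma_i)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * gaussian_pdf(o, m, s)
               for w, m, s in zip(weights, mus, sigmas))
```

With a single component (K = 1) the mixture reduces to an ordinary Gaussian, which is the simplest special case of Equation (31).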

where K is the number of distributions in GMM. Once the PDF of the fatigue cycles $f_{p} (o)$ is established from the monitoring data, the number of DC $(N_{t})$ is obtained during the time interval $t \to t + 1$ . The component condition after DC is given by Equation (7):

p ({b'}_{t + 1} | o_{t + 1}, a_{t}, b_{t}) = p (b_{t}) (P_{DC})^{N_{t}}

(32)

Then, the observation item $p (o_{t + 1} | s_{t + 1}, a_{t})$ in Equation (29) for continuous observation is derived by Bayes’ theorem:

\begin{matrix} p (o_{t + 1} | s_{i}, a_{t}) = \frac{p (s_{i} | o_{t + 1}, a_{t}) f_{p} (o_{t + 1})}{\int_{- \infty}^{+ \infty} p (s_{i} | o_{t + 1}, a_{t}) f_{p} (o_{t + 1}) d o_{t + 1}} \\ i \in (1, 2, \dots, d) \end{matrix}

(33)

Another discrepancy between the standard POMDP and the model in this study lies in the hidden state transition. Generally, a POMDP assumes that state degradation relies only on the prior state transition probability matrix, and that the observation serves only to detect the current hidden state because measurements do not affect the outcome. These two assumptions are not consistent with practical component-level management incorporating an SHM system, for the following reasons. Because the fatigue problem intrinsically contains high uncertainty, the component condition estimated from monitoring data is a probability distribution rather than a definite state. In addition, the prior state transition matrix established from an inspection database reflects the general deterioration behavior of similar components. For structural-level management, this rough statistical model may be acceptable, but in component-level management the approximation could deviate substantially from the deterioration of a specific component. Under such a model, all components in the bridge would share the same maintenance strategy, which contradicts the fact that components differ in location, load cases, and environment. Thus, the concept that the state transition depends on a posterior matrix is proposed. The confidence factor $η \in [0, 1]$ is introduced to combine the prior state transition matrix with the observation. This coefficient reflects the agent’s confidence in the structural state determined by the prior transition matrix versus the observation: zero indicates that the agent trusts only the prior state transition, and one indicates that the updated belief state depends only on the observation. The posterior belief distribution over the defined states 1–d is given as:

\begin{matrix} p (b_{t + 1} | o_{t + 1}, a_{t}, b_{t}) = p (b_{t}) T (a_{t}) \\ + η p (b_{t}) ((P_{DC})^{N_{t}} - T (a_{t})) \end{matrix}

(34)
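Equation (34) is a linear blend between the belief predicted by the prior matrix and the belief implied by the observed cycles. A minimal sketch, assuming $(P_{DC})^{N_{t}}$ has already been computed and is passed in as a matrix `P_obs` (the function names are illustrative):

```python
def vec_mat(b, M):
    """Row vector times matrix: (b M)[j] = sum_i b[i] * M[i][j]."""
    n = len(b)
    return [sum(b[i] * M[i][j] for i in range(n)) for j in range(n)]


def posterior_belief(b, T_a, P_obs, eta):
    """Equation (34): b T(a) + eta * b ((P_DC)^N - T(a)).

    eta = 0 keeps only the prior transition, eta = 1 only the observation.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    prior = vec_mat(b, T_a)
    observed = vec_mat(b, P_obs)
    return [p + eta * (o - p) for p, o in zip(prior, observed)]
```

Since both `prior` and `observed` are probability vectors, any blend with $η$ between 0 and 1 also sums to one, so no renormalization is needed in this sketch.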

Point-based algorithms in solving POMDP

To enhance computational efficiency, a point-based algorithm is adopted, which alleviates the complexity of the POMDP by avoiding the exponential growth of $α$ vectors at every time step. The core of this algorithm is to restrict value iteration to a meaningful collection of discrete belief points, rather than calculating the maximum value $V (b)$ at every belief point. In infrastructure management, only a finite number of belief states are actually visited. For instance, a component belief state follows specific paths determined by the transition parameters T and Equation (34), rather than moving arbitrarily in the belief domain. These visited points are defined as the set of reachable belief points B. To further reduce this set, the subset of belief points $B^{*}$ reachable under optimal sequences of actions is considered. However, no method can directly determine the optimal action at an arbitrary belief point. The recommended sampling approach is successive approximations of the reachable space under optimal policies (SARSOP). This solver avoids redundant searching among suboptimal solutions through heuristic exploration and information gathered from earlier samples. The detailed algorithm is introduced by Kurniawati et al.⁵⁷

However, the continuous observation space raises further issues when point-based algorithms are used to sample representative belief points. A lossless partitioning of the observation space could produce large observation spaces that must be considered in the POMDP. To alleviate this complexity, various online solvers³³,³⁷,³⁸ have been developed, which use heuristic search in conjunction with branch-and-bound pruning to construct a look-ahead decision tree. By interleaving planning and plan execution, online solvers use forward-sampled scenarios to estimate the current belief value and decide the single best action for the current state. Although an online solver can dramatically decrease the computation, the solution achieved is often suboptimal. To obtain an optimal or near-optimal solution, an offline solver needs to be employed in conjunction with an algorithm that can “slim” the set of representative belief points. Therefore, considering the features of POMDPs in infrastructure management, an efficient discretization strategy is proposed in this paper. Two characteristics of structural maintenance are utilized: (1) the consequence of structural failure is generally unacceptable, so the cost of the failure state is orders of magnitude higher than that of the other states; and (2) all state transition matrices are unit-jump upper triangular matrices, as indicated in Equation (5), except for the replacing action. The mathematical property that an arbitrary value function $V (b)$ is concave and decreases monotonically with the belief state transition is proved in Appendix A. Motivated by this property, the large set of discrete observations that lead to the same action can be aggregated. The core of the proposed method is to find the belief areas in which different observations lead to different actions, such as the belief states $b_{i} ~ b_{i + 1}$ and $b_{i + 3} ~ b_{i + 4}$ in Figure 7. The observations in these areas then need to be finely discretized, which in turn defines the boundary. The detailed derivation and computational process are introduced in Appendix A, steps 1–7.

Figure 7.

Adaptive discretization strategy in representative belief points sampling.

For the welded details, the observation in SHM is the number of stress cycles. Since the effect of stress cycles is quantified by a unit-jump upper triangular matrix, the belief state also shifts monotonically toward more deteriorated states as the number of cycles increases. Hence, once the belief state range corresponding to the same maintenance action in the optimal policy is defined, the continuous observations that lead to belief state distributions within this range can be aggregated.

Illustrative example

The fatigue details of a bridge are studied to demonstrate how to plan optimal maintenance strategies based on long-term monitoring data. To track the real-time condition of the bridge, a dynamic SHM system was installed, which includes weldable foil-type strain gauges. The main strain transducers are attached to components prone to fatigue. Since the methodology concentrates on individual component management, the most vulnerable fatigue joint, which experiences the maximum stress range, is selected. Figure 8 marks the sensor site, which is under the top flange of the box girder. This detail connects to the top flange plate through a U-welded joint. The parameters of the welded detail are listed in Table 1. The material arguments C and m (F2 class steel) are determined from the BS-7910 code and fatigue experimental results.³⁸

Figure 8.

Welded detail monitored in the box girder. (a) Marks the welded joint location and traffic and (b) Shows the geometry of welded details.

Table 1.

Parameters of welded detail.

Geometry parameters								Material parameters
$a_{0}$	$a_{c}$	T	$a / c$	$θ$	$L / T$	W	$φ$	C	$Δ σ_{0}$	m ( $\geq Δ σ_{0}$ )	m ( $< Δ σ_{0}$ )
0 mm	9 mm	10 mm	0.2	$π / 4$	2	$\infty$	$π / 2$	$3.60 e - 13$	35MPa	3	5

It should be noted that the rate of crack propagation approaches infinity at the end of the failure phase. Since brittle behavior is unacceptable at the serviceability limit state, conservative state thresholds need to be set. Herein, the states are defined by the reliability levels infinite, 4, 3, 2, and 1, based on the proposed standard.⁴⁹ State 1 refers to the intact state, in which no cracks can be detected, while state 5 implies the potentially hazardous state at the onset of fracture. The fatigue properties $A_{i}$ of F2 class steel in Table 2 are proposed by the standard.⁵⁸ Afterwards, the expected number of cycles $N_{i}$ corresponding to each reliability level can be calculated by Equations (1) and (2). If DC is defined as 100 stress cycles (with 35 MPa as the stress range), the STR $r_{i}$ in the unit-jump matrix is obtained via Equations (2) and (5). All the parameters of the Markov chain model are given in Table 2. As mentioned before, corrosion reduces the section area of the component and increases the cycling impacts according to Equation (4). For the marine atmosphere, the proposed parameters⁴³ in Equation (3) are listed in Table 3.

Table 2.

Markov chain model parameters.

Reliability (state)	4	3	2	1	Fatigue parameters in Equation (6)
Material fatigue property $A_{i}$	$0.15 e 12$	$0.25 e 12$	$0.43 e 12$	$0.64 e 12$	Variance $V (s_{t})$	$2.58 e + 14$
Expected cycles number $N_{i}$	$3.5 e 6$	$5.9 e 6$	$1.0 e 7$	$1.5 e 7$	Failure cycles $N_{f}$	$2.87 e + 07$
STR $r_{i}$	$3.5 e 4$	$2.4 e 4$	$4.1 e 4$	$4.8 e 4$	Matrix size d	5

Table 3.

Corrosion parameters.

Atmosphere	Site	Time of wetness (h/year)	$ν (μ m)$	$λ$
Light	Barcelona	3200	49.9	0.6–0.9
Severe	Alicante	4300	92.6	0.6–0.9
Severe	Hong Kong	4152	92.6	0.75

Data processing

The intensive sampling reproduces the original waveforms of the measurands, especially at extreme points. Representative measurements are shown in Figure 9(a) and have three characteristics. First, the strain pulses decrease significantly from 1:30 am to 5:30 am because the train ceases service during this period. Second, comparing strain amplitudes in the daily strain time history indicates that the strain pulses caused by trains are much larger than those caused by road vehicles. Third, the overall drift of strain is strongly synchronized with the temperature variation. However, the temperature strain does not produce stress in this welded detail because there are no constraints in the strain direction. To separate the temperature strain from the original signal, the maximal overlap discrete wavelet transform is used. This algorithm decomposes the signal into different frequency bands indexed by k. Each band contains only the frequencies in $[1 / (2^{k + 1} Δ t), 1 / (2^{k} Δ t)]$, where $Δ t$ is the sampling interval and k is an integer from 0 to 17. After filtering the signal, a smooth curve is extracted from the frequency band $[0, 1 / (2^{18} Δ t)]$, which is represented by the dark blue line in Figure 9(a). The strain response to the live load is then obtained. The effective stress range $Δ σ$ and the number of cycles N are calculated by the rain-flow cycle counting technique. Cycles with $Δ σ$ less than 2 MPa are not considered because their effect on fatigue life can be neglected. The statistical results of the stress cycles are shown in the histograms of Figure 10. Different stress intervals are adopted in Figure 10 according to BS5400: the low-stress ranges use a small interval to reduce the conservatism in fatigue assessment. Figure 10 also reveals the daily traffic conditions. The motor vehicles induce high-frequency, low-stress (2–16 MPa) cycles, while the low-frequency, high-stress cycles (16–36 MPa) represent the railway impacts. Due to the combined effect of highway traffic, railway traffic, and wind loads, occasional impacts can reach the range of 36–50 MPa. Finally, the daily cumulative cycles N are calculated by the linear superposition principle using the ratios $n_{i} / N_{0}^{i}$, where $n_{i}$ is the number of cycles in stress range $Δ σ_{i}$. Figure 11 displays the cycling impacts for 2017, which will be used to build the BDLMs to predict the fatigue cycles. If missing or distorted data account for 30% of the daily sampling, the dynamic response data for that day are not used in the fatigue analysis. About 20% of the data are missing due to temporary electricity interruptions or periodic maintenance of the SHM system, as Figure 11 shows.

Figure 9.

Typical daily strain spectra. The upper figure displays the temperature strain hidden in the data. The lower figure is the live load strain.

Figure 10.

Stress spectra histogram. The different intervals are implemented based on the BS5400 recommendation.

Figure 11.

Whole year cumulative daily cycle. The fluctuation means the external impacts have randomness.

BDLM establishing

The BDLM is constructed from the daily stress cycling data for 2017, as Figure 11 shows. As mentioned before, the BDLM contains various block components, each reflecting one characteristic of the data. Establishing the prediction model consists of trying different combinations of blocks to find a model that matches the data in Figure 11 with maximum likelihood and acceptable error. Four representative BDLMs are assessed and listed in Table 4: (a) the AR model ($ϕ$ and $σ_{A}$ in Equation (12)) with the local level block (normal distribution with $μ_{L}$ and $σ_{L}$), (b) the periodic model (p and $σ_{S}$ in Equation (11)) with the local level block, (c) the AR model with the periodic and local level blocks, and (d) the AR model with the periodic and trend blocks ($μ_{T}$ and $σ_{T}$ in Equation (10)). For brevity, these models are abbreviated as LAR, LP, LPAR, and TPAR, respectively. The model parameters are calculated using a batch gradient descent algorithm with the objective function (20) and the termination criterion (22). Because the autocorrelation function (ACF) decays exponentially to zero as the lag increases and the partial ACF cuts off after lag 1, as Figure 12 shows, the three BDLMs with an AR block use a first-order AR block, according to the joint analysis of the ACF and partial ACF. The parameters of the four models are summarized in Table 4. The last two columns of Table 4 are used as quantifiable indicators of the goodness of fit of the BDLMs. The log-likelihood value indicates the probability that the monitoring data fall into the area defined by the BDLM. The residual error $V ~ [0, σ_{v}]$ quantifies the features of the data not reflected in the BDLM. Accordingly, the periodic block and the AR block are essential parts of the BDLMs: the residual error decreases from 9.1 in LAR and from 19.9 in LP to 3.2 in LPAR. Although the TPAR model extracts a linearly decreasing function, the discrepancy induced by this function between LPAR and TPAR is negligible. To examine whether this decrease conforms to the real situation, the statistics of actual traffic volume are analyzed. According to the Transport Department of Hong Kong SAR Government report,⁵⁹ the railway volume is stationary because daily operation follows a schedule. In addition, the number of heavy trucks (>24 tonnes) fluctuates, with a minute increase from $9.4 e 5$ in 2017 to $9.5 e 5$ in 2018, followed by a decrease to $7.6 e 5$ in 2019. In general, the growth of heavy traffic volume stabilizes after a bridge has been in operation for several years. The small decrease in the TPAR model may represent a localized trend within the long-term fluctuation. The daily fatigue impacts are therefore assumed to obey the LPAR model. Using the LPAR parameters, Figure 13 is drawn with the Kalman smoother (to fill the missing data) and the KF (for prediction). The black dots in Figure 13 are the predictions obtained by executing a Monte Carlo simulation. If enough Monte Carlo simulations are carried out, the distribution of the cumulative number of cycles N is obtained. This distribution is finally parameterized as a Gaussian mixture distribution (GMD), defined in Equation (31). In this case, the accumulated number of cycles in a year can be represented by the Gaussian distribution $N ~ [75,823, 1149^{2}]$.

Figure 12.

ACF and PACF in the AR blocks. The upper plot demonstrates lag reduced to zero. The lower plot shows a lag decrease to zero after AR(1).

Table 4.

BDLMs model parameters.

Models	$μ_{L}$	$σ_{L}$	$ϕ$	$σ_{A}$	p	$σ_{S}$	$μ_{T}$	$σ_{T}$	Log-likelihood	Residual error $σ_{v}$
LAR	209.7	0	0.61	19.1	—	—	—	—	−1.204e+03	9.1
LP	207.7	0	—	—	365.2	0	—	—	−1.185e+03	19.9
LPAR	207.6	0	0.27	19.2	365.2	0	—	—	−1.178e+03	3.2
TPAR	215.9	—	0.26	19.0	365.2	0	−0.045	7.81e-6	−1.178e+03	3.8

BDLM: Bayesian dynamic linear model.

Figure 13.

BDLMs for cycling forecasting. The light blue area is the two standard deviations intervals. The black dash line indicates the expected value development.

Management model establishing

This section describes how the continuous POMDP is established for the management of bridge welded details. The aforementioned methods, namely the Markov chain model for fatigue, the BDLM for prediction, and the GMM for observation, are integrated into a continuous observation POMDP to develop a comprehensive management system based on monitoring data. The POMDP includes the 7-tuple arguments $\langle \mathbb{S}, \mathbb{A}, \mathbb{O}, \mathbf{T}, \mathbf{O}, \mathbf{R}, γ \rangle$. To conform to the fatigue failure distributions of F2 class steel, five states are defined based on the results in Table 2, corresponding to the reliability levels $S = [+ \infty, 4, 3, 2, 1]$. Four routine maintenance treatments recommended by the manuals⁵⁶,⁶⁰ are employed: “Do Nothing,” “Descaling and Painting,” “Rehabilitation,” and “Replacing” $(| A | = 4)$. The “Rehabilitation” action refers to post-weld maintenance techniques, such as burr grinding or hammer peening. This action aims to maintain the component condition and avoid further deterioration, but it does not fundamentally enhance the performance of the component. The set of actions $A$ determines different prior state transition matrices T. For the “Do Nothing” action, if the coating is still effective against corrosion, the prior state transition matrix per year $T_{1}^{0}$ is given according to the statistics of welded detail degradation.⁴⁹,⁶¹ The superscript of $T_{1}^{0}$ denotes the corrosion phase.

T_{1}^{0} = (\begin{matrix} 0.964 & 0.036 & 0 & 0 & 0 \\ 0 & 0.949 & 0.051 & 0 & 0 \\ 0 & 0 & 0.974 & 0.026 & 0 \\ 0 & 0 & 0 & 0.978 & 0.022 \\ 0 & 0 & 0 & 0 & 1 \end{matrix})

Corrosion accelerates the deterioration of welded joints at three different rates. In phase 1, the cumulative exposure time is assumed to be less than 15 years, and corrosion accelerates the degradation by a factor of 1.8. In the POMDP, this acceleration is quantified by dividing the STR $r_{i}$ in the transition matrix $T_{1}^{0}$ by 1.8. The prior state transition matrix changes to:

T_{1}^{1} = (\begin{matrix} 0.937 & 0.063 & 0 & 0 & 0 \\ 0 & 0.912 & 0.088 & 0 & 0 \\ 0 & 0 & 0.954 & 0.046 & 0 \\ 0 & 0 & 0 & 0.961 & 0.039 \\ 0 & 0 & 0 & 0 & 1 \end{matrix})
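The step from $T_{1}^{0}$ to $T_{1}^{1}$ can be reproduced numerically. The sketch below assumes the unit-jump form in which a state with STR $r_{i}$ is retained with probability $r_{i} / (1 + r_{i})$ and left with probability $1 / (1 + r_{i})$; this form is an assumption inferred from the tabulated matrices (it is consistent with $r_{i} = 99$ giving 0.99 in $T_{3}$), since Equation (5) is not restated here. The function name `accelerate` is illustrative.

```python
def accelerate(T, factor):
    """Scale the sojourn of every non-absorbing state by 1/factor.

    Assumes a unit-jump matrix with T[i][i] = r / (1 + r) and
    T[i][i + 1] = 1 / (1 + r); the last state is absorbing.
    """
    n = len(T)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        stay = T[i][i]
        r = stay / (1.0 - stay)       # STR implied by the stay probability
        r_acc = r / factor            # accelerated degradation
        out[i][i] = r_acc / (1.0 + r_acc)
        out[i][i + 1] = 1.0 / (1.0 + r_acc)
    out[n - 1][n - 1] = 1.0           # failure state stays absorbing
    return out


T_1_0 = [
    [0.964, 0.036, 0.0, 0.0, 0.0],
    [0.0, 0.949, 0.051, 0.0, 0.0],
    [0.0, 0.0, 0.974, 0.026, 0.0],
    [0.0, 0.0, 0.0, 0.978, 0.022],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
T_1_1 = accelerate(T_1_0, 1.8)
```

Under this assumption, the entries of `T_1_1` agree with the matrix above to three decimal places; the same call with factors 2.6 and 3.4 would give the corresponding matrices for the later corrosion phases.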

For phase 2, the joint has been cumulatively exposed to a corrosive environment for 15–30 years, and the acceleration coefficient is assumed to be 2.6. For phase 3 (>30 years), this value increases to 3.4. The “Descaling and Painting” action shares the same state transition matrix as the “Do Nothing” action $(T_{2} = T_{1})$ because the coating does not affect the cross-section. The utility of this action is to prevent corrosion of the welded joint for the next 4 years. However, the cumulative exposure time is not reset to 0 since the corrosion process is irreversible. The “Rehabilitation” action can effectively prevent further degradation. This property is quantified by the large STR value $(r_{i} = 99)$ in the state transition matrix $T_{3}$, which indicates that the component can maintain its current state for a long time. This action also restores the coating to its initial condition.

T_{3} = (\begin{matrix} 0.99 & 0.01 & 0 & 0 & 0 \\ 0 & 0.99 & 0.01 & 0 & 0 \\ 0 & 0 & 0.99 & 0.01 & 0 \\ 0 & 0 & 0 & 0.99 & 0.01 \\ 0 & 0 & 0 & 0 & 1 \end{matrix})

The state transition matrix $T_{4}$ for the “Replacing” action assumes that the component has a 90% probability of being restored to the initial state and a 10% probability of moving to state 2. Additionally, the cumulative exposure time in a corrosive environment is reset to 0.

T_{4} = (\begin{matrix} 0.9 & 0.1 & 0 & 0 & 0 \\ 0.9 & 0.1 & 0 & 0 & 0 \\ 0.9 & 0.1 & 0 & 0 & 0 \\ 0.9 & 0.1 & 0 & 0 & 0 \\ 0.9 & 0.1 & 0 & 0 & 0 \end{matrix})

The observation parameters in continuous POMDPs change from a probability matrix to probability density functions. After the BDLMs are established, a stochastic observation value $o_{t + 1}$ is obtained based on the GMM in Equation (31). The observed belief state is updated by Equation (32), where the parameter $P_{DC}$ is given via Equation (5) and the STRs in Table 2. For instance, if an observed value $o = 74,000$ $(N = 740)$ is predicted from the SHM system and the previous belief state is $b_{t} = [0.7, 0.3, 0, 0, 0]$, then the observed belief vector is given by Equation (32):

b_{t + 1}^{'} = b_{t} (P_{DC})^{N} = [0.6855, 0.3052, 0.0093, 0.0001, 0]
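This worked example can be checked with a few lines of Python. The sketch assumes that $P_{DC}$ has the unit-jump form with stay probability $r_{i} / (1 + r_{i})$, built from the rounded STR values in Table 2 (3.5e4, 2.4e4, 4.1e4, 4.8e4 for states 1 to 4); that form is an assumption about Equation (5), and the helper names are illustrative.

```python
def unit_jump_matrix(strs):
    """Build a unit-jump matrix from the STRs of the non-absorbing states.

    Assumed form: P[i][i] = r / (1 + r), P[i][i + 1] = 1 / (1 + r),
    with the final (failure) state absorbing.
    """
    n = len(strs) + 1
    P = [[0.0] * n for _ in range(n)]
    for i, r in enumerate(strs):
        P[i][i] = r / (1.0 + r)
        P[i][i + 1] = 1.0 / (1.0 + r)
    P[n - 1][n - 1] = 1.0
    return P


def propagate(b, P, n_steps):
    """Equation (32): apply b <- b P a total of n_steps times."""
    n = len(b)
    for _ in range(n_steps):
        b = [sum(b[i] * P[i][j] for i in range(n)) for j in range(n)]
    return b


P_DC = unit_jump_matrix([3.5e4, 2.4e4, 4.1e4, 4.8e4])
b_observed = propagate([0.7, 0.3, 0.0, 0.0, 0.0], P_DC, 740)
```

Because Table 2 lists the STRs to two significant figures, `b_observed` can only be expected to match the vector above approximately, with differences in about the fourth decimal place.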

It should be mentioned that the different maintenance actions also affect the range of observation results. When the welded joint has been cumulatively exposed to the corrosive atmosphere for t years, the distribution of cycling impacts during a time step is amplified to $[Λ^{m} μ, Λ^{m} σ]$ based on Equation (4). For the “Rehabilitation” action, $(P_{DC})^{N}$ is assumed identical to $T_{3}$, which maintains the current state of the component. When the manager executes the “Replacing” action, $(P_{DC})^{N}$ equals $T_{4}$ and the cumulative exposure time resets to zero. Finally, the posterior belief state $b_{t + 1}$ is updated by Equations (34) and (24). For the value update $V (b)$, the probability density $p (o_{t + 1} | s_{i}, a_{t})$ in Equation (29) is calculated through Bayes’ theorem (Equation (33)). The last two arguments $\langle R, γ \rangle$ are associated with the economic indicators. Herein, two categories of rewards are assumed, based on the reasonable cost of maintenance and the consequence of failure. The details of the reward R are listed in Table 5. In this paper, the reward discounting factor γ equals 0.95, which represents the decreasing value of future rewards.

Table 5.

Reward matrix table ($).

Conditional level	$s_{1}$	$s_{2}$	$s_{3}$	$s_{4}$	$s_{5}$
Do Nothing	0	−2k	−4k	−8k	−400k
Descaling and Painting	−2k	−4k	−6k	−10k	−400k
Rehabilitation	−10k	−12k	−14k	−18k	−400k
Replacing	−200k	−200k	−200k	−200k	−400k
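To make the reward table concrete, the sketch below encodes Table 5 in Python and evaluates the expected immediate reward of each action for the example belief $b = [0.7, 0.3, 0, 0, 0]$ used earlier. It assumes that “k” denotes thousands of dollars, and the names `REWARDS` and `expected_reward` are introduced here for illustration. This is a one-step calculation only and not the POMDP value function, which also accounts for discounted future rewards.

```python
# Table 5, assuming "k" means thousand dollars; columns are states s1..s5.
REWARDS = {
    "Do Nothing": [0, -2_000, -4_000, -8_000, -400_000],
    "Descaling and Painting": [-2_000, -4_000, -6_000, -10_000, -400_000],
    "Rehabilitation": [-10_000, -12_000, -14_000, -18_000, -400_000],
    "Replacing": [-200_000, -200_000, -200_000, -200_000, -400_000],
}


def expected_reward(belief, action):
    """Expected one-step reward: sum over states of b(s) * r(a, s)."""
    return sum(b * r for b, r in zip(belief, REWARDS[action]))


belief = [0.7, 0.3, 0.0, 0.0, 0.0]
for action in REWARDS:
    print(f"{action}: {expected_reward(belief, action):.0f}")
```

For this belief the one-step values are −600, −2,600, −10,600, and −200,000 dollars, respectively, so any preference for an intervention over “Do Nothing” has to come from its effect on future deterioration rather than from the immediate reward.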

Results and discussions

Before examining the maintenance behavior under the optimal policy, the characteristics of the modified POMDP are emphasized again because these concepts are significant yet commonly overlooked. The conventional POMDP assumes that the hidden state transition depends only on the prior state transition matrix T. The observation is used to reduce the uncertainty of the belief state, which avoids unnecessary maintenance actions. The value of information has been proposed as a metric to quantify the saving obtained when the manager makes decisions based on observations. Different POMDPs can therefore quantify the benefit of SHM systems with different accuracy levels. Nevertheless, the accuracy of a real-world SHM system is difficult to evaluate because it is affected by several factors, such as sensor accuracy, sampling frequency, and damage evaluation algorithms. Additionally, the prior state transition matrix representing the deterioration mechanism of a specific welded detail is difficult to obtain because most inspection databases do not classify components according to external factors. Consequently, the observation system in the continuous POMDP serves as an intermediary that accounts for external stochastic influences. The prior state transition matrix T summarizes the general deterioration pattern from historical statistics of similar components. The observation matrix $(P_{DC})^{N}$ is used to update the prior state transition matrix T for the specific environment. The hidden state transition then follows the posterior state transition matrix more closely. The confidence factor $η$ in Equation (34) quantifies how much information from the observation is used in the update. In this paper, this factor is set to $η = 0.5$ to account for three sources of error: fatigue is the prominent but not the only failure mode in metal welded details, the BDLM forecast error accumulates over time, and the empirical formula is approximate.
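The role of the confidence factor can be illustrated with a short Python sketch. It assumes that the update weights the prior matrix by $1 - η$ and the observation matrix by $η$; this form is an assumption made here, chosen because it reduces to the simple average of the two matrices when $η = 0.5$. The two 3-state matrices and the function name `posterior_transition` are hypothetical and serve only to show the arithmetic.

```python
def posterior_transition(prior, observed, eta):
    """Blend two transition matrices as (1 - eta) * prior + eta * observed.

    The convex-combination form is an assumption for illustration.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    return [
        [(1.0 - eta) * p + eta * o for p, o in zip(prior_row, obs_row)]
        for prior_row, obs_row in zip(prior, observed)
    ]


# Hypothetical 3-state matrices (not taken from the paper).
T_prior = [[0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]]
P_obs = [[0.7, 0.3, 0.0], [0.0, 0.6, 0.4], [0.0, 0.0, 1.0]]

T_post = posterior_transition(T_prior, P_obs, eta=0.5)
for row in T_post:
    print([round(x, 3) for x in row])
```

With $η = 0$ the observation is ignored and the prior matrix is returned unchanged, while $η = 1$ relies on the observation alone. Any value in between keeps each row a valid probability distribution.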

To illustrate how the welded joint degrades, the expected belief state deterioration curves based on the prior state transition matrix and on the stress cycles from observation are plotted separately in Figure 9. Afterwards, the belief state evolution is obtained based on the posterior transition matrix $(T + {P_{DC}}^{N}) / 2$, as depicted by the black lines in Figure 14. The blue area between the corrosion and protection curves is the reachable belief state domain. In bridge management, the manager has a risk threshold, which is quantified as the likelihood times the consequence. The optimal scheme pursues an appropriate trade-off between the failure risk and the cost of replacing the component. Therefore, the optimal reachable belief state domain is restricted to the gray area to avoid excessive risk. The size of this area is controlled by the penalty value $r (a, s_{5})$ (failure consequence).

Figure 14.

A schematic of the expected component deterioration curve. The dash-dot line refers to steel with perfect painting protection. The solid line is the deterioration under natural corrosion. The blue area B is the reachable belief space. The dark gray area $B^{*}$ shows the reachable belief points under the optimal policy.

According to the reward parameters in Table 5, the comparison between the SARSOP algorithm and the proposed discretization strategy in Appendix A is listed in Table 6. Since maintenance fees dramatically affect the optimal management behavior, the sensitivity to price is studied by altering the reward parameter R in the POMDPs. The optimal welded detail management is displayed in Figures 15 and 16. Three maintenance actions are marked with different symbols: “Descaling and painting” with a triangle, “Rehabilitation” with a square, and “Replacing” with a diamond. The background of Figures 15 and 16 is the cumulative histogram of the belief state, which reflects the agent’s confidence about the current component condition. The dashed line denotes the hidden state transition, which is not accessible to the agent in practical engineering. The hidden state is plotted as a reference to check whether the adopted maintenance actions are reasonable. For instance, the “Replacing” action should be implemented when a component has been working in state 3 for a long period or deteriorates to state 4. Figure 15 shows the optimal interventions for the welded joint based on the rewards in Table 5. Three phenomena can be observed: (1) The “Descaling and painting” action is executed periodically every 4 years to avoid the cross-section reduction of welded joints by corrosion. Although the influence of cross-section reduction is slight in the early stage, the additional cumulative cycling will significantly affect the useful lifespan, according to Equation (4). (2) Preventive maintenance is not adopted when the component approaches its end of life, because the cost of the action is larger than the expected return. (3) “Rehabilitation” is not adopted since its low cost performance is not competitive with the other maintenance actions.

Table 6.

Solver comparison table.

Problem size: $|S| = 5^{4}$, $|A| = 4$, $|O| = 7840$
Solvers	Reward	Time (h)
SARSOP	$-1.84 \times 10^{3} \pm 0.06 \times 10^{3}$	5.7
Adaptive discretization	$-1.87 \times 10^{3} \pm 0.08 \times 10^{3}$	1

SARSOP: successive approximations of the reachable space under optimal policies.

Figure 15.

Policy realization of welded detail management. “Rehabilitation” is not adopted as an uneconomical action (10 k$).

Figure 16.

Policy realization of welded detail management with an economical “Rehabilitation” action. As the “Rehabilitation” cost decreases, this action is adopted significantly more often: (a) “Rehabilitation” cost is −5, (b) “Rehabilitation” cost is −4, (c) “Rehabilitation” cost is −3, and (d) “Rehabilitation” cost is −2.

The effects of cost performance in infrastructure management are investigated by observing how the optimal maintenance strategy changes as the cost of “Rehabilitation” is decreased. When the cost of “Rehabilitation” decreases to −5, the action is first adopted in the mid-life of the component, as shown in Figure 16(a). This behavior can be explained by the prior state transition matrices. When a welded detail degrades naturally (“Do nothing”), the capability of the component to remain in state 2 is the weakest ($r_{2} = 18.61$ in $T_{1}^{0}$) compared with the other states. Meanwhile, the “Rehabilitation” action has an identical capability to maintain every state ($r_{i} = 99$ in $T_{3}$), which indicates that this action achieves its maximum utility in state 2 when it substitutes for the “Do nothing” action. Therefore, once “Rehabilitation” has a higher cost performance than the other maintenance actions, it will be adopted. For case II, the cost of “Rehabilitation” decreases to −4, and it gradually substitutes for the “Descaling and painting” action, as shown in Figure 16(b). For the last case, the cost of the “Rehabilitation” action decreases to −2, and it becomes the most cost-effective maintenance action. As indicated in Figure 16(c) and (d), this action is then used frequently. It is also worth noting that the deterioration rate fluctuates due to the uncertainty in the prior state transition matrix and in the BDLM prediction. This randomness induces slight differences in the action sequences of the same POMDP, as shown in Figures 15 and 16. In summary, the reward matrix is a key parameter that can significantly affect the optimal policy $π^{*}$ once the prior state transition matrices, the maintenance utility, and the observation prediction are established.

The different maintenance actions present a competitive and complementary relationship based on their cost performance. To illustrate this effect on the optimal maintenance policy, the concept of the “barrel effect” is used (Figure 17). When the maintenance utility ($T, O, A$) is fixed in all POMDPs, the cost performance of an action is inversely proportional to the reward parameter (R). The cost of an action determines the length of the barrel board, and the capacity of the barrel is the average maintenance fee per year. The four barrels correspond to the study cases in Figures 15 and 16(a) to (c), and the numbers on each board are the action rewards from state 1 (bottom) to state 5 (top). An uneconomical action, as a long board, does not affect the capacity of the barrel. As the cost of the “Rehabilitation” action decreases, it changes from a long board to a short one, and the expected maintenance fee per year decreases from 9.2k to 6.8k.

Figure 17.

Barrel effect in decision-making. The value in the board denotes the cost of action in the different states according to Table 5.

However, the continuous POMDP has some limitations in application. The computational requirements increase considerably when solving problems with large discretized observation spaces. For a single welded joint, a computer equipped with an AMD 3900X and 64 GB of memory still needs 1 hour to converge to the optimal solution. For large multi-component systems, an individual component failure may not significantly affect the reliability of the entire structure, so the model must consider the combined effect of component degradation on structural safety. The offline solver cannot be applied to this scenario because the state and action spaces increase dramatically for system-level management. An online solver, such as Actor-Critic, can alleviate the computation required for optimal policy searching, but it cannot tackle the time-varying problem. Future work in sequential decision-making for complex engineering systems includes using dedicated deep reinforcement learning methods to solve the non-homogeneous deterioration problem and explorative training algorithms to ensure that the model converges.

Conclusions

A comprehensive decision-making framework for intelligent life-cycle management based on a real-world SHM system is presented in this paper. To realize this framework, the continuous observation POMDP is chosen as the theoretical foundation because it can formulate optimal sequential decision-making under stochastic environments with uncertain action outcomes and noisy observations. Through the Markov chain, the fatigue process is quantified as a discrete state transition in the POMDP, which provides an alternative way to assess the condition of welded details based on long-term monitoring data. Further, to forecast the cumulative stress cycles of welded details over the life-cycle, the BDLM is used because its mathematical properties allow a flexible combination of different blocks to characterize the data features and the uncertainty in dynamic responses. Finally, based on the characteristics of infrastructure management, this study demonstrates that the value function is concave and monotonically decreasing as the expected belief state increases. Using this property, an adaptive discretization strategy is designed to efficiently compute the optimal or near-optimal policy for continuous observation POMDPs.

A constructive concept of state transition in POMDPs is proposed in this paper. The hidden state degradation does not rely merely on the prior state transition matrix, because stationary matrices cannot reflect the dynamic environmental evolution and the uncertainty in external impacts. The state transition is determined by the posterior matrix, which simultaneously considers the prior state transition matrix (general law) and the observations (actual situation). According to the statistical results of maintenance behavior under a minimum budget constraint, the following specific conclusions can be drawn: (i) In infrastructure management, three factors significantly affect the maintenance policy over the life-cycle: maintenance utility, deterioration rate, and cost of action. The optimal solution is the trade-off between long- and short-term utility and cost. (ii) The available maintenance actions present a competitive and complementary relationship based on their cost and utility. The decision-making follows the “barrel effect,” which means that a maintenance action with low cost performance is the long board and does not affect the barrel capacity (cost). (iii) When a welded detail approaches the end of its lifespan, interventions to extend the lifetime are uneconomical. The “Do nothing” action is executed until the detail is replaced in order to save as much cost as possible.

Footnotes

Appendix A

In this part, we introduce how to use the properties of infrastructure management to aggregate many observation domains, which effectively decreases the complexity of the observation space. Since the deterioration of the component is a slow process, an upper triangular matrix with unit-jump, such as Equation (5), is recommended to represent this process.⁶² Each STR ($r_{i}$) in Equation (5) is greater than 1 to ensure a slow deterioration rate. It is also noted that, because the self-transition probability of the final state equals 100%, the probability of the final state increases monotonically in service. In fact, the final state probability is consistent with the S-shape of the growth curve. Another feature of the maintenance problem is that a large cost value is set in the final state because of the unacceptable consequence of failure. Compared with the failure cost, the maintenance fee is much smaller, as Table 5 shows. Based on these features of infrastructure management, the mathematical properties of the value coefficient ($α$) in POMDPs can be derived as:

(35)

α = (α_{1}, α_{2}, \dots, α_{d}), \quad 0 > α_{1} > α_{2} > \dots \gg α_{d}

The proof of this property proceeds by induction. If $α_{e}$ represents the terminating value coefficient in the final belief state $b_{e}$ , the expected terminal value is given as:

(36)

V_{e} (b_{e}) = \sum_{s_{j} = 1}^{S} r (s_{j}, a) b_{e} (s_{j}) = α_{e} b_{e}

Since the failure state cost is much larger in magnitude than the others, $r (s_{d}, a) \ll r (s_{j}, a) \ (d \neq j)$, Equation (36) proves that $V_{e} (b_{e})$ conforms to the form of Equation (35) at the final time step $n = e$. Herein, assuming that the cluster $α_{n + 1} \in Γ_{n + 1}$ follows the form of Equation (35), we need to prove that $α_{n}$ has the same form. Based on Equation (29) and the unit-jump state transition matrix (5), $α_{n} (s_{d})$ and $α_{n} (s_{d - 1})$ are given by:

(37)

\begin{matrix} α_{n} (s_{d}) = r (s_{d}, a) \\ + γ \sum_{o_{n + 1} \in O} p (o_{n + 1} | s_{d}, a) p (s_{d} | s_{d}, a) α_{n + 1} (s_{d}) \end{matrix}

(38)

\begin{matrix} α_{n} (s_{d - 1}) = r (s_{d - 1}, a) \\ + γ \sum_{o_{n + 1} \in O} p (o_{n + 1} | s_{d}, a) p (s_{d} | s_{d - 1}, a) α_{n + 1} (s_{d}) \\ + γ \sum_{o_{n + 1} \in O} p (o_{n + 1} | s_{d - 1}, a) p (s_{d - 1} | s_{d - 1}, a) α_{n + 1} (s_{d - 1}) \end{matrix}

With the definition of the state transition matrix, we primarily focus on the quantity in Equation (37). Since $p (s_{d} | s_{d}, a) = 1$, $γ = 0.95$, and $\sum p (o_{n + 1} | s_{d}, a) = 1$, the order of magnitude of $α_{n} (s_{d})$ is almost the same as that of $α_{n + 1} (s_{d})$. For Equation (38), the value of $p (s_{d} | s_{d - 1}, a)$ is always set to a small value, such as <0.1.⁶²⁻⁶⁴ Therefore, the orders of magnitude of $r (s_{d - 1}, a)$, $p (s_{d} | s_{d - 1}, a) α_{n + 1} (s_{d})$, and $p (s_{d - 1} | s_{d - 1}, a) α_{n + 1} (s_{d - 1})$ are much smaller than that of $α_{n + 1} (s_{d})$. The other value coefficients $α_{n} (s_{d - i})$ can be treated with a similar procedure. After the property of the value coefficient is proved, Equation (28) can be divided into two parts:

(39)

V (b) = \sum_{s_{j} = 1}^{S - 1} α (s_{j}) b (s_{j}) + α (s_{d}) b (s_{d})

Two mathematical characteristics can be summarized from Equation (39). First, in real-world infrastructure management, the condition of the component gradually deteriorates before replacement. Therefore, the value $V (b)$ monotonically decreases with the component service time, as Figure 18 shows. Second, since the second term $α (s_{d}) b (s_{d})$ is dominant in the value function, Equation (39) inherits the behavior of $b (s_{d})$, which is convex in the initial phase. Because $α (s_{d})$ is negative, the dominant term, and hence $V (b)$, is concave there. We can conclude that the value expression $V (b)$ is both monotonic and concave. Based on these two properties, the mathematical principle that two monotonically decreasing concave functions have at most two intersections in the plane is used to simplify the computation.
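The dominance of the failure-state coefficient can be checked numerically. With $p (s_{d} | s_{d}, a) = 1$ and $\sum p (o_{n + 1} | s_{d}, a) = 1$, Equation (37) reduces to $α_{n} (s_{d}) = r (s_{d}, a) + γ α_{n + 1} (s_{d})$. The Python sketch below iterates this recursion backward from the terminal coefficient, using $γ = 0.95$ and the failure reward of −400k from Table 5 (read as −400,000 dollars). The function name `failure_coefficient` is introduced here for illustration, and the sketch follows the appendix's assumption that the failure state is absorbing.

```python
GAMMA = 0.95
FAILURE_REWARD = -400_000.0  # r(s_5, a) from Table 5, assuming k = thousand


def failure_coefficient(n_backups, gamma=GAMMA, reward=FAILURE_REWARD):
    """Iterate alpha_n(s_d) = r(s_d, a) + gamma * alpha_{n+1}(s_d).

    The recursion starts from the terminal coefficient alpha_e(s_d) = r(s_d, a)
    and is applied n_backups times going backward in time.
    """
    alpha = reward
    for _ in range(n_backups):
        alpha = reward + gamma * alpha
    return alpha


for n in (0, 1, 10, 100, 500):
    print(n, round(failure_coefficient(n)))
```

The sequence stays of the order of $10^{5}$ to $10^{6}$ in magnitude and approaches $r (s_{d}, a) / (1 - γ)$, about −8,000,000, which is far larger in magnitude than any maintenance fee in Table 5.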

In continuous observation POMDPs, the rich observation spaces impose a heavy computational burden on a standard solver, which requires explicit enumeration of the observations and therefore a lossless discretization of the observation space.³⁵ However, explicit discretization is difficult in SHM since the distributions of sensor readings change dynamically through interaction with external factors. In the POMDP, the solver finds the upper boundary of the value function, $V_{max} (b)$. This is equivalent to finding all intersection points of the functions $V (b) = α_{i} \cdot b$ in the belief domain. Because of concavity and monotonicity, there are few intersection points among the value functions, which indicates that a large proportion of the belief state values can be represented by a single value coefficient $α_{a}^{o}$, so it is unnecessary to discretize the entire observation space. In other words, the intersection points define the belief domain in which the continuous observation space needs to be finely discretized. The procedures are listed as follows:

In infrastructure management, the replacing action is equivalent to resetting the POMDP, and the boundary of this action is defined in steps 4–5. The approximate discretization of the continuous observation can reduce the computational time, but it may cause another issue, as Figure 18 shows. The value coefficient $α_{4}$ is missed because the points in the 1.56–1.68 belief domain are not sampled. This problem is tackled by step 6.
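The idea of locating the intersection points of the value functions can be sketched in Python as follows. This is not the procedure of this appendix, only a minimal illustration of the upper envelope $V_{max} (b) = \max_{i} α_{i} \cdot b$: the two alpha-vectors, the two-state belief grid, and the function names are hypothetical choices made for the example. The sketch scans a coarse belief grid and reports where the maximizing alpha-vector changes, which marks the belief region where a finer discretization would be needed.

```python
def dot(alpha, belief):
    """Inner product alpha . b."""
    return sum(a * b for a, b in zip(alpha, belief))


def best_alpha(belief, alphas):
    """Index of the alpha-vector that attains the upper envelope at this belief."""
    return max(range(len(alphas)), key=lambda i: dot(alphas[i], belief))


def switch_points(beliefs, alphas):
    """Grid indices k where the maximizing alpha-vector differs from that at k - 1."""
    winners = [best_alpha(b, alphas) for b in beliefs]
    return [k for k in range(1, len(winners)) if winners[k] != winners[k - 1]]


# Hypothetical alpha-vectors over two states (illustrative values only).
ALPHAS = [[-1.0, -10.0], [-3.0, -6.0]]

# Coarse belief grid: b = [1 - p, p] with p = 0.0, 0.1, ..., 1.0.
GRID = [[1.0 - k / 10, k / 10] for k in range(11)]

switches = switch_points(GRID, ALPHAS)
print(switches)
print([round(max(dot(a, b) for a in ALPHAS), 2) for b in GRID])
```

Here the two linear functions intersect at $p = 1/3$, so the maximizing vector changes between the grid points $p = 0.3$ and $p = 0.4$, and only that interval would need to be refined. A coarse grid can also step over a vector that is optimal only on a narrow interval, which is the kind of omission described for $α_{4}$ above.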

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study has been supported by the Research Grant Council of Hong Kong (project no. PolyU 15219819 and PolyU 15221521). The support is gratefully acknowledged. The opinions and conclusions presented in this paper are those of the authors and do not necessarily reflect the views of the sponsoring organizations.

ORCID iD

You Dong

References

1. Wardhana, Hadipriono FC. Analysis of recent bridge failures in the United States. J Perform Constr Facil 2003; 17(3): 144–150.

2. Monitoring-based fatigue reliability assessment of steel bridges: analytical model and application. J Struct Eng 2010; 136(12): 1563–1573.

3. Wong, et al. Statistical analysis of stress spectra for fatigue life assessment of steel bridges with structural health monitoring data. Eng Struct 2012; 45: 166–176.

4. Farreras-Alcover, Chryssanthopoulos, Andersen JE. Data-based models for fatigue reliability of orthotropic steel bridge decks based on temperature, traffic and strain monitoring. Int J Fatigue 2017; 95: 104–119.

5. Sohn, Czarnecki, Farrar CR. Structural health monitoring using statistical process control. J Struct Eng 2000; 126(11): 1356–1363.

6. Prakash, Sadhu, Narasimhan, et al. Initial service life data towards structural health monitoring of a concrete arch dam. Struct Control Health Monit 2018; 25(1): e2036.

7. Carden, Brownjohn JM. ARMA modelled time-series classification for structural health monitoring of civil infrastructure. Mech Syst Signal Process 2008; 22(2): 295–314.

8. Dong, Shi, et al. An improved fractal prediction model for forecasting mine slope deformation using GM (1, 1). Struct Health Monit 2015; 14(5): 502–512.

9. Yang, Bai. Forecasting structural strains from long-term monitoring data of a traditional Tibetan building. Struct Control Health Monit 2019; 26(1): e2300.

10. Goulet JA. Bayesian dynamic linear models for structural health monitoring. Struct Control Health Monit 2017; 24(12): e2035.

11. Solhjell IK. Bayesian forecasting and dynamic models applied to strain data from the Göta river bridge. MS Thesis, University of Oslo, Norway, 2009.

12. Wang, Zhang. A Bayesian approach for condition assessment and damage alarm of bridge expansion joints using long-term structural health monitoring data. Eng Struct 2020; 212: 110520.

13. Wang. Bayesian dynamic forecasting of structural strain response using structural health monitoring data. Struct Control Health Monit 2020; 27(8): e2575.

14. Frangopol, Kallen, Noortwijk JMV. Probabilistic models for life-cycle performance of deteriorating structures: review and future directions. Progress Struct Eng Mater 2004; 6(4): 197–212.

15. Straub, Faber MH. Risk based inspection planning for structural systems. Struct Safety 2005; 27(4): 335–355.

16. van Noortwijk JM. A survey of the application of gamma processes in maintenance. Reliability Eng System Safety 2009; 94(1): 2–21.

17. Biondini, Frangopol, et al. Life-cycle performance of deteriorating structural systems under uncertainty. J Struct Eng 2016; 142(9): F4016001.

18. Srinivasan, Parlikad AK. Value of condition monitoring in infrastructure maintenance. Comput Ind Eng 2013; 66(2): 233–241.

19. Memarzadeh, Pozzi. Value of information in sequential decision making: Component inspection, permanent monitoring and system-level scheduling. Reliab Eng Syst Saf 2016; 154: 137–151.

20. Song, Zhang, Shafieezadeh, et al. Value of information analysis in non-stationary stochastic decision environments: a reliability-assisted POMDP approach. Reliab Eng Syst Saf 2022; 217: 108034.

21. Furuta, Kameda, Nakahara, et al. Optimal bridge maintenance planning using improved multi-objective genetic algorithm. Struct Infrastruct Eng 2006; 2(1): 33–41.

22. Liu, Frangopol DM. Multiobjective maintenance planning optimization for deteriorating bridges considering condition, safety, and life-cycle cost. J Struct Eng 2005; 131(5): 833–842.

23. Dong, Frangopol DM. Risk-informed life-cycle optimum inspection and maintenance of ship structures considering corrosion and fatigue. Ocean Eng 2015; 101: 161–171.

24. Yehia, Abudayyeh, Fazal, et al. A decision support system for concrete bridge deck maintenance. Adv Eng Software 2008; 39(3): 202–210.

25. Shani, Pineau, Kaplow. A survey of point-based POMDP solvers. Auton Agents Multi-Agent Syst 2013; 27(1): 1–51.

26. Andriotis, Papakonstantinou, Chatzi. Value of structural health monitoring quantification in partially observable stochastic environments. arXiv preprint arXiv:1912.12534, 2019.

27. Papakonstantinou, Andriotis, Gao, et al. Quantifying the value of structural monitoring for decision making. In: 13th international conference on applications of statistics and probability in civil engineering (ICASP13), Seoul, South Korea, 26–30 May 2019. Seoul: Seoul National University.

28. Papakonstantinou, Andriotis, Shinozuka. POMDP and MOMDP solutions for structural life-cycle cost minimization under partial and mixed observability. Struct Infrastruct Eng 2018; 14(7): 869–882.

29. Papakonstantinou, Andriotis, Shinozuka. Point-based POMDP solvers for life-cycle cost minimization of deteriorating structures. In: Life-cycle of engineering systems. CRC Press, 2016. pp. 427–434.

30. Andriotis, Papakonstantinou, Chatzi EN. Value of structural health information in partially observable stochastic environments. Struct Saf 2021; 93: 102072.

31. Yaylali, Ivy JS. Partially observable MDPs (POMDPs): introduction and examples. In: Wiley encyclopedia of operations research and management science. New York: Wiley, 2010.

32. Jaulmes, Pineau, Precup. Learning in non-stationary partially observable Markov decision processes. In: ECML workshop on reinforcement learning in non-stationary environments, 2005, Vol. 25, pp. 26–32.

33. Chatzis, Kosmopoulos. A non-stationary infinite partially-observable Markov decision process. In: International conference on artificial neural networks, Hamburg, Germany, 15–19 September 2014, pp. 355–362. Springer International Publishing.

34. Garg, Hsu, Lee WS. Despot-alpha: online POMDP planning with large state and observation spaces. Robotics: science and systems. Freiburg im Breisgau, June 22–26, 2019.

35. Hoey, Poupart. Solving POMDPs with continuous or large discrete observation spaces. In: IJCAI, 2005, pp. 1332–1338.

36. Lim, Tomlin, Sunberg ZN. Sparse tree search optimality guarantees in POMDPs with continuous observation spaces. arXiv preprint arXiv:1910.04332, 2019.

37. Hoerger, Kurniawati. An on-line POMDP solver for continuous observation spaces. In: 2021 IEEE international conference on robotics and automation (ICRA), Xi'an, China, pp. 7643–7649. IEEE.

38. Lassen. The effect of the welding process on the fatigue crack growth. Welding J 1990; 69: 75S–81S.

39. Newman Jr, Raju. An empirical stress-intensity factor equation for the surface crack. Eng Fract Mech 1981; 15(1–2): 185–192.

40. Bowness, Lee. Prediction of weld toe magnification factors for semi-elliptical cracks in T-butt joints. Int J Fatigue 2000; 22(5): 369–387.

41. Lassen, Recho. Fatigue life analyses of welded structures. London: ISTE Ltd, 2006.

42. Yang, Liu, et al. Approach for fatigue damage assessment of welded structure considering coupling effect between stress and corrosion. Int J Fatigue 2016; 88: 88–95.

43. De la Fuente, Díaz, Simancas, et al. Long-term atmospheric corrosion of mild steel. Corrosion Sci 2011; 53(2): 604–617.

44. Melchers RE. Probabilistic models for corrosion in structural reliability assessment - part 1: empirical models. J Offshore Mech Arct Eng 2003; 125(4): 264–271.

45. Garbatov, Soares, Parunov. Fatigue strength experiments of corroded small scale steel specimens. Int J Fatigue 2014; 59: 137–144.

46. Elsner, Cavalcanti, Ferraz, et al. Evaluation of the surface treatment effect on the anticorrosive performance of paint systems on steel. Progr Org Coat 2003; 48(1): 50–62.

47. Chandler, Bayliss. Corrosion Protection of Steel Structures. London: Elsevier Applied Science, 1985.

48. Agrawal, Kawaguchi, Chen. Deterioration rates of typical bridge elements in New York. J Bridge Eng 2010; 15(4): 419–429.

49. Agrawal, Kawaguchi, Chen, et al. Bridge element deterioration rates. Technical Report, Department of Transportation, New York, 2008.

50. Bogdanoff, Kozin. Probabilistic models of cumulative damage. New York: Wiley-Interscience, 1985, p. 350.

51. Petris, Petrone, Campagnoli. Dynamic linear models. In: Dynamic Linear Models with R. Springer, 2009, pp. 31–84.

52. Nguyen, Gaudot, Goulet JA. Uncertainty quantification for model parameters and hidden state variables in Bayesian dynamic linear models. Struct Control Health Monit 2019; 26(3): e2309.

53. Hauskrecht. Value-function approximations for partially observable Markov decision processes. J Artif Intell Res 2000; 13: 33–94.

54. Pineau, Gordon, Thrun, et al. Point-based value iteration: an anytime algorithm for POMDPs. In: IJCAI, vol. 3, 2003, pp. 1025–1032. Citeseer.

55. Smallwood, Sondik EJ. The optimal control of partially observable Markov processes over a finite horizon. Oper Res 1973; 21(5): 1071–1088.

56. Reynolds DA. Gaussian mixture models. Encycl Biom 2009; 741: 659–663.

57. Kurniawati, Hsu, Lee WS. SARSOP: efficient point-based POMDP planning by approximating optimally reachable belief spaces. In: Robotics: Science and systems. CiteSeer. vol. 2008, 2008.

58. Tom, Naman. Fatigue life analyses of welded structure. London: ISTE Ltd., 2006.

59. SAR HK. Monthly transport information. https://www.td.gov.hk/tc/transport_in_hong_kong/transport_figures/monthly_traffic_and_transport_digest/2020/202011/index.html, 2019.

60. Haagensen, Maddox. Specifications for weld toe improvement by burr grinding, TIG dressing and hammer peening for transverse welds. IIW Commission XIII-Working Group 2. Genoa, Italy: International Institute of Welding, 2001.

61. Cesare, Santamarina, Turkstra, et al. Modeling bridge deterioration with Markov chains. J Transp Eng 1992; 118(6): 820–833.

62. Cavalline, Whelan, Tempest, et al. Determination of bridge deterioration models and bridge user costs for the NCDOT bridge management system. Technical Report No. FHWA/NC/2014-07, 2015.

63. Wei, Bao. Optimal policy for structure maintenance: a deep reinforcement learning framework. Struct Saf 2020; 83: 101906.

64. Lei, Xia, Deng, et al. A deep reinforcement learning framework for life-cycle maintenance planning of regional deteriorating bridges using inspection data. Struct Multidiscip Optim 2022; 65(5): 1–18.

SHM-informed life-cycle intelligent maintenance of fatigue-sensitive detail using Bayesian forecasting and Markov decision process
