Deep recurrent-convolutional neural network learning and physics Kalman filtering comparison in dynamic load identification

Abstract

The dynamic structural load identification capabilities of the gated recurrent unit, long short-term memory, and convolutional neural networks are examined herein. The examination is performed under realistic small-dataset training conditions and in comparison with the physics-based residual Kalman filter (RKF). Dynamic load identification is prone to poor predictions in civil engineering applications when only a small number of tests are performed or available, or when the structural model is unidentifiable. To assess the methods, first, a simulated structure is investigated under a shaker excitation at the top floor. Second, a building in California is investigated under seismic base excitation, which results in loading for all degrees of freedom. Finally, the International Association for Structural Control-American Society of Civil Engineers (IASC-ASCE) structural health monitoring benchmark problem is examined for impact and instant loading conditions. Importantly, the methods are shown to outperform each other on different loading scenarios, while the RKF is shown to outperform the networks in physically parametrized identifiable cases.

Keywords

Gated recurrent unit; long short-term memory; artificial neural networks; deep one-dimensional convolutional networks; machine learning-intelligence; Kalman filter-based structural force identification-estimation; unknown input-load; structural health monitoring

Introduction

The four fundamental structural identification problems in civil structures¹ are (i) computing the dynamic responses with known structure parameters and loads,^2–12 (ii) solving or recovering the structural parameters based on known responses and excitations,^13–16 (iii) identifying or estimating the structure input loads using some structure parameters and responses,^17–28 and (iv) identifying or estimating the structure input loads using only the structure responses.^29–41 The first one is a forward problem, while the rest are inverse problems. While the forward problem has been researched extensively for a long time, the inverse ones, and in particular the load identification or estimation, attracted attention only recently.

The structural load, specifically, is an integral part of the system identification and monitoring process, as the analytical or numerical structural models are inevitably much better calibrated by input–output identification processes.^42–44 Load identification is particularly useful for civil engineering structures, since the loading is difficult to estimate or measure due to the stochastic environment. In the input–output identification scenario, though, the required input cannot always be measured, or the measurement of the input may be less reliable than what is demanded. For instance, there is no reliable means of accurately measuring the traffic and wind load on large structural systems.^45–47

To this end, several methodologies have been created to provide the structural load identification, but often they either refer to linear systems or are examined only on systems with no input at some degrees of freedom (DOFs). This is not the case for all civil structures; for instance, the ones which are excited at the base. Furthermore, many works examine inputs which have zero mean value, that is, white noise or some seismic excitations, excluding cases such as a hammer dynamic test scenario. As a result, these methods do not always succeed on realistic complex civil structures, and a need for further investigation arises.

The importance of the dynamic structural load identification, specifically, is highlighted by the fact that a more detailed model needs to fit the parameters with even greater accuracy. This requires proper parameter sensitivity in order for the parameters to be estimated correctly. Furthermore, the load identification is also beneficial for the optimal sensor placement.^48–55 The philosophy behind those approaches is to minimize the information entropy after quantifying the uncertainty in the system parameters. This is used as a sensor configuration performance measure. The knowledge of the structural loading significantly improves the uncertainty quantification in the structural identification and, as a result, leads to a better estimation of the information gained during the model updating process.

Output-only system identification techniques, on the other hand, also have a long history of assessing the structural condition when performed during normal operation with ambient vibration data. In this direction, the stochastic modal identification techniques are introduced from output-only data, combining high computational robustness and efficiency with high estimation accuracy. Extensive research has been performed to address the nonautomated identification issue in output-only procedures, and it is still ongoing. Rainieri and Fabbrocino⁵⁶ presented a literature review for the most common automated output-only dynamic identification techniques. However, those methodologies are very sensitive to the noise level, which often results in inaccurate estimates. This highlights the importance of identifying the loading.

To address the challenge of load identification, the deep learning architecture libraries are employed in this work. In the last few years, machine and deep learning resulted in a substantial impact on a variety of civil engineering problems,^57–59 or other problems such as visual recognition, speech recognition, and natural language processing. Among different types of deep neural networks, convolutional neural networks are studied the most.^60–74 The convolutional neural network (CNN) is a deep learning architecture inspired by the natural visual perception mechanism of the living creatures. Hubel and Wiesel⁷⁵ noticed that cells in animal visual cortex are responsible for detecting light in receptive fields. Based on this, Kunihiko Fukushima proposed the neocognitron,⁷⁶ which could be regarded as the predecessor of CNN. LeCun et al.,⁷⁷ later, developed a multilayer artificial neural network called LeNet-5 which could classify handwritten digits.⁷⁷

To overcome the shortcoming of deep neural networks being difficult to train^78,79 due to the exploding-vanishing gradient issue when learning long-term dependencies, the long short-term memory (LSTM) architecture⁸⁰ is introduced. Importantly, the LSTM network is designed to capture long-range data dependencies when modeling sequential data such as the dynamic load, and shows great potential in modeling structural loading time series,⁸¹ or in other applications.^6,82,83

In the same direction, the gated recurrent unit (GRU) neural networks⁸⁴ have shown success in several applications involving sequential or temporal data.⁸⁵ GRU success is attributed to the gating network signaling. This controls how the present input and previous memory are used to update the current activation and produce the current state. These gates have their own sets of weights which are adaptively updated in the learning phase.

Intelligent methods for dynamic load identification⁸⁶ currently focus on vehicle loads,^87–89 component and mechanical structures loading,^81,90–103 bridge cables loading,¹⁰⁴ and power loads.¹⁰⁵ They have been recently investigated in structural dynamics, but with pseudo-experimental data at the Pirelli Tower in Milan, Italy.¹⁰⁶

In this work, which focuses on full-scale building structures with real experimental data, the structural response and loading are employed to train the neural networks and finally predict unseen loading data. In doing so, the work contributes to the dynamic load identification research by assessing the networks against the risk of poor predictions in civil engineering applications when only a small number of tests are performed or available, or when the structural system is unidentifiable with physical parameter-based modeling. The networks are compared in overcoming those issues, while Kalman filter physics-based alternatives, without assuming more information such as known system parameters of the model,⁴¹ are also employed. A realistic small dataset scenario prone to outliers is investigated for civil engineering applications, in contrast to the hundreds or even thousands of available data which are assumed for other applications. This problem is crucial since it potentially leads to overfitting the model when it is adjusted excessively to the training data, seeing patterns that do not exist, and consequently performing poorly in predicting new data.

The work is organized as follows: the LSTM network is overviewed in section “Dynamic load identification using the LSTM neural networks,” while in section “Dynamic load identification using the GRU neural networks,” the simpler and faster GRU is presented. The standard CNN architecture is provided in section “Dynamic load identification using 1D CNNs,” as well as a discussion on the one-dimensional and the multidimensional CNN versions. The physics-based RKF is then presented in section “Dynamic load identification using physics-based residual Kalman filtering.” Importantly, sections “Structural loading identification in a 6-story building,” “Structural loading identification for a hotel in San Bernardino,” and “Structural loading identification in the IASC-ASCE structural health monitoring benchmark problem” investigate applications on both simulated and real structures, as well as on both continuous and impact loading cases. Subsequently, section “Discussion” presents a discussion and section “Future research” offers future research suggestions. Finally, the conclusions are provided in section “Conclusion.”

Dynamic load identification using the LSTM neural networks

The LSTM neural networks are a type of recurrent neural networks (RNNs) which are designed to handle the vanishing gradient problem in the traditional RNNs. The vanishing gradient problem occurs when the gradients of the error function with respect to the weights in the RNN become very small. This makes it difficult for the network to learn long-term dependencies.

The standard LSTM architecture consists of several memory cells that can store information for long periods of time, as well as several gates which regulate the flow of information into and out of the cells, seen in Figure 1. The gates are controlled by sigmoid activation functions and can either allow or prevent information from passing through. The LSTM cell has three gates: the forget gate, the input gate, and the output gate. The forget gate determines which information to discard from the previous cell state, the input gate determines which new information to add to the current cell state, and the output gate determines which information to output from the current cell state. The equations governing the LSTM cell are written as

\begin{matrix} f_{t} = σ_{g} (W_{f} x_{t} + U_{f} h_{t - 1} + b_{f}) \\ i_{t} = σ_{g} (W_{i} x_{t} + U_{i} h_{t - 1} + b_{i}) \\ o_{t} = σ_{g} (W_{o} x_{t} + U_{o} h_{t - 1} + b_{o}) \\ {\tilde{c}}_{t} = σ_{c} (W_{c} x_{t} + U_{c} h_{t - 1} + b_{c}) \\ c_{t} = f_{t} ⊙ c_{t - 1} + i_{t} ⊙ {\tilde{c}}_{t} \\ h_{t} = o_{t} ⊙ σ_{h} (c_{t}) \end{matrix}

(1)

where $W_{f}, W_{i}, W_{o}, W_{c}, U_{f}, U_{i}, U_{o}, U_{c}$ are the weight matrices and $b_{f}, b_{i}, b_{o}, b_{c}$ are the bias vectors. The initial values are $c_{0} = 0$ and $h_{0} = 0$, the operator ⊙ denotes the Hadamard product, and the subscript $t$ indexes the time step. Additionally, $x_{t} \in R^{d}$ is the input vector to the LSTM unit, $f_{t} \in (0, 1)^{h}$ is the forget gate’s activation vector, $i_{t} \in (0, 1)^{h}$ is the input/update gate’s activation vector, $o_{t} \in (0, 1)^{h}$ is the output gate’s activation vector, $h_{t} \in (- 1, 1)^{h}$ is the hidden state vector, also known as the output vector of the LSTM unit, ${\tilde{c}}_{t} \in (- 1, 1)^{h}$ is the cell input activation vector, $c_{t} \in R^{h}$ is the cell state vector, and the superscript $h$ refers to the number of hidden units. Finally, $σ_{g}$ is the sigmoid function, while $σ_{c}$ and $σ_{h}$ are hyperbolic tangent functions.
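As an illustration of Equation (1), a single LSTM time step can be sketched in NumPy as follows. This is only a minimal sketch: the dictionary layout of the weights, the sizes `d` and `n_hidden`, and the random initialization are assumptions made for the example, not the configuration used in this work.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid, used for the three gates (sigma_g)."""
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One time step of Equation (1); W, U, b are dicts keyed by 'f', 'i', 'o', 'c'."""
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])      # input gate
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])      # output gate
    c_tilde = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # cell input
    c_t = f_t * c_prev + i_t * c_tilde                          # Hadamard products
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
d, n_hidden = 3, 4
W = {g: 0.1 * rng.standard_normal((n_hidden, d)) for g in "fioc"}
U = {g: 0.1 * rng.standard_normal((n_hidden, n_hidden)) for g in "fioc"}
b = {g: np.zeros(n_hidden) for g in "fioc"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)  # c_0 = 0 and h_0 = 0
for x_t in rng.standard_normal((5, d)):        # a short input sequence
    h, c = lstm_step(x_t, h, c, W, U, b)
```

Because the output gate lies in (0, 1) and the hyperbolic tangent in (-1, 1), every entry of `h` stays inside (-1, 1), in agreement with the ranges stated above.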

Figure 1.

Examined network architecture for all applications as described in section “Structural loading identification in a 6-story building.” The LSTM–GRU–Conv layer is replaced each time by each one of the three considered layers. The dropout layer is removed in the 1D-CNN case.

To train an LSTM neural network, the structural load and response set of input–output pairs, also known as training data, is provided. During training, the network adjusts its weights and biases to minimize a loss function, which measures the difference between the predicted and actual output values. The loss function used to train the LSTM depends on the specific task. In the context of predicting structural loading based on the structure response, a common choice is the mean squared error between the predicted and actual loading values, mathematically written as

L = \frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - y_{i, true})}^{2}

(2)

where $N$ is the number of samples, $y_{i}$ is the predicted loading value at position $i$ , and $y_{i, true}$ is the true loading value at position $i$ . The LSTM network is finally trained using backpropagation through time, which involves computing the gradient of the loss function with respect to the weights and biases at each time step. These are adjusted using an optimization algorithm such as the stochastic gradient descent.
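For concreteness, Equation (2) can be evaluated on a short load record as in the sketch below; the numerical values are made up purely for illustration.

```python
import numpy as np

def mse_loss(y_pred, y_true):
    """Mean squared error of Equation (2)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))

y_true = [0.0, 1.0, 2.0, 1.0]   # hypothetical true load samples
y_pred = [0.0, 1.5, 2.0, 0.0]   # hypothetical predicted load samples
loss = mse_loss(y_pred, y_true)  # (0 + 0.25 + 0 + 1) / 4 = 0.3125
```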

Dynamic load identification using the GRU neural networks

GRU neural networks, on the other hand, are another type of RNN which, similar to the LSTM case, are designed to handle the vanishing gradient problem in traditional RNNs. GRUs are similar to LSTMs in that they also use gating mechanisms to regulate the flow of information. They are simpler and more computationally efficient, though.

The GRU architecture also consists of memory cells that can store information for long periods of time, as well as several gates that regulate the flow of information into and out of the cells (Figure 1). The gates are controlled by sigmoid activation functions and can either allow or prevent information from passing through. The GRU cell has two gates: the reset gate and the update gate. The reset gate determines how much of the previous state to forget, while the update gate determines how much of the new state to add to the current state. The equations governing the GRU cell are written as

\begin{matrix} z_{t} = σ (W_{z} [h_{t - 1}, x_{t}] + b_{z}) \\ r_{t} = σ (W_{r} [h_{t - 1}, x_{t}] + b_{r}) \\ {h'}_{t} = \tanh (W_{h} [r_{t} ⊙ h_{t - 1}, x_{t}] + b_{h}) \\ h_{t} = (1 - z_{t}) ⊙ h_{t - 1} + z_{t} ⊙ {h'}_{t} \end{matrix}

(3)

where $x_{t}$ is the input vector, $h_{t}$ is the output vector, $h'_{t}$ is the candidate activation vector, $z_{t}$ is the update gate vector, $r_{t}$ is the reset gate vector, and $W$ and $b$ are the parameter matrices and vectors, respectively. The notation $[h_{t - 1}, x_{t}]$ denotes the concatenation of the two vectors. Finally, $σ$ is the logistic sigmoid function.
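A single GRU time step following Equation (3) can be sketched in NumPy as below. The weight layout (one matrix per gate acting on the concatenated vector), the sizes, and the random initialization are assumptions for illustration only.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid used by the update and reset gates."""
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W, b):
    """One time step of Equation (3); W and b are dicts keyed by 'z', 'r', 'h'."""
    hx = np.concatenate([h_prev, x_t])               # [h_{t-1}, x_t]
    z_t = sigmoid(W["z"] @ hx + b["z"])              # update gate
    r_t = sigmoid(W["r"] @ hx + b["r"])              # reset gate
    rhx = np.concatenate([r_t * h_prev, x_t])        # [r_t (Hadamard) h_{t-1}, x_t]
    h_cand = np.tanh(W["h"] @ rhx + b["h"])          # candidate activation
    return (1.0 - z_t) * h_prev + z_t * h_cand       # new hidden state

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(1)
d, n_hidden = 3, 4
W = {g: 0.1 * rng.standard_normal((n_hidden, n_hidden + d)) for g in "zrh"}
b = {g: np.zeros(n_hidden) for g in "zrh"}

h = np.zeros(n_hidden)
for x_t in rng.standard_normal((5, d)):
    h = gru_step(x_t, h, W, b)
```

Compared with the LSTM step, there is no separate cell state and one gate fewer, which is the source of the lower parameter count mentioned above.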

To train a GRU neural network, the structural load and response set of input–output pairs is also provided. During training, the network adjusts its weights and biases to minimize a loss function which measures the difference between the predicted and actual output values. The loss function used to train the GRU also depends on the specific task. In the context of predicting structural loading based on structural response, a common choice is the mean squared error between the predicted and actual loading signals. The GRU network is trained using backpropagation through time, similar to LSTMs, which involves computing the gradient of the loss function with respect to the weights and biases at each time step.

Dynamic load identification using 1D CNNs

Finally, the one-dimensional convolutional neural networks (1D CNNs) have been proven to be highly effective in a variety of signal processing tasks. The fundamental building block of a 1D CNN is the convolutional layer (Figure 1). A convolutional layer applies a set of filters to the input signal, producing a set of feature maps. The filters have a fixed size and slide over the input signal, computing a dot product at each location. The resulting feature maps capture different aspects of the input signal, such as local trends and patterns.

The applied 1D CNN compares to the multidimensional counterparts as follows: A one-dimensional configuration fuses the feature extraction and the learning phases of the dynamic states. One-dimensional arrays are used instead of two-dimensional matrices for both the kernels and the feature maps. Additionally, the hidden neurons of the convolution layers perform both the convolution and the subsampling operations. Accordingly, the convolution and the lateral rotation are replaced by their one-dimensional counterparts, namely the convolution and the reverse operations. Finally, the parameters for the kernel size and the subsampling are scalars. Importantly, this simplified structure of the convolutional neural network requires only one-dimensional convolutions and therefore allows a mobile and low-cost hardware implementation for near real-time applications. The convolution operation is represented mathematically as

h_{i} = f (\sum_{j = 0}^{m - 1} w_{j} x_{i + j} + b)

(4)

where $h_{i}$ is the output at position $i$, $w_{j}$ is the jth weight of a filter of size $m$, $x_{i + j}$ is the input signal at position $i + j$, $b$ is the bias term, and $f$ is the activation function.
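Equation (4) can be written out directly as a sliding dot product. The sketch below assumes no padding, a stride of one, and a ReLU activation; these choices, as well as the toy signal and filter, are illustrative assumptions.

```python
import numpy as np

def conv1d_valid(x, w, b=0.0, f=lambda a: np.maximum(a, 0.0)):
    """Equation (4): h_i = f(sum_j w_j x_{i+j} + b), without padding."""
    m = len(w)
    n_out = len(x) - m + 1
    return np.array([f(np.dot(w, x[i:i + m]) + b) for i in range(n_out)])

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # toy input signal (a linear ramp)
w = np.array([-1.0, 1.0])                # first-difference filter, m = 2
h = conv1d_valid(x, w)                   # constant slope of the ramp: [1, 1, 1, 1]
```

With the identity activation and zero bias, the same result is returned by `np.correlate(x, w, mode="valid")`, since Equation (4) is a cross-correlation, that is, the filter is not flipped.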

In practice, a 1D CNN may have multiple convolutional layers with different filter sizes and numbers of filters. Each layer can apply a different set of filters to the input signal, allowing the network to capture different aspects of the signal at different scales. No additional layers are assumed in this work for the 1D CNN (such as pooling layers) to compare fairly all networks.

To train the 1D CNN, the structural load and response set of input–output pairs are also provided. During training, the network adjusts its weights and biases to minimize the loss function, which measures the difference between the predicted and actual loading values. This is done using an optimization algorithm such as the stochastic gradient descent. This updates the weights and biases based on the gradient of the loss function.

For all three networks discussed in this work, in addition to the training data, it is important to have a separate set of validation data to monitor the training performance of the network to avoid overfitting. The validation data are used to evaluate the network’s performance on unseen data, and the training process can be stopped early if the performance on the validation data starts to deteriorate.
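The early stopping idea can be sketched with a simple patience rule. The rule, the `patience` value, and the validation losses below are hypothetical; the exact stopping criterion used in this work is not specified here.

```python
def early_stopping(val_losses, patience=3):
    """Return (best_epoch, stop_epoch) for a non-empty list of validation losses.

    Training stops once the validation loss has not improved for
    `patience` consecutive epochs (a hypothetical rule for illustration).
    """
    best, best_epoch, wait = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0   # improvement: reset counter
        else:
            wait += 1
            if wait >= patience:
                break                                  # validation loss keeps deteriorating
    return best_epoch, epoch

val_losses = [1.0, 0.8, 0.6, 0.65, 0.7, 0.72, 0.5]
best_epoch, stop_epoch = early_stopping(val_losses, patience=3)  # (2, 5)
```

In this toy record, the weights of epoch 2 would be kept, and the later improvement at epoch 6 is never reached; a larger `patience` trades longer training for a lower risk of stopping too early.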

Dynamic load identification using physics-based residual Kalman filtering

For the mathematical implementation of the unknown input residual-based Kalman filter⁴¹ consider the process equation in the continuous-time and the state-space format:

\overset{\cdot}{z} = A z + B u

(5)

where $A (θ)$ is the system matrix dependent on the unknown parameter vector $θ$, $B$ is the distribution matrix of the input $u$, and $z$ is the dynamic state vector.

The discrete-time transformation of the system and the input matrices is provided by the zero-order hold assumption for the input in between the time instants $k Δ t$ , as:

A_{d} = e^{A Δ t} \approx I_{2 n \times 2 n} + Δ t A + \frac{Δ t^{2}}{2} A^{2}

(6)

and

B_{d} = \int_{0}^{Δ t} e^{A τ} B d τ = A^{- 1} [A_{d} - I_{2 n \times 2 n}] B \approx Δ t B

(7)
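The accuracy of the truncated expressions in Equations (6) and (7) can be checked numerically. The sketch below uses a hypothetical single-DOF oscillator and a long Taylor series as the reference matrix exponential; the values of `m`, `c`, `k` are assumptions chosen only for the example.

```python
import numpy as np

def expm_series(A, dt, terms=20):
    """Reference matrix exponential from a long Taylor series (valid for small ||A dt||)."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for j in range(1, terms):
        term = term @ (A * dt) / j   # (A dt)^j / j!
        out = out + term
    return out

# Hypothetical single-DOF system: m x'' + c x' + k x = u
m, c, k = 100.0, 25.0, 900.0
A = np.array([[0.0, 1.0], [-k / m, -c / m]])
B = np.array([[0.0], [1.0 / m]])
dt = 0.01
I = np.eye(2)

A_d_ref = expm_series(A, dt)
A_d_2nd = I + dt * A + 0.5 * dt**2 * (A @ A)          # truncation in Equation (6)
B_d_exact = np.linalg.solve(A, (A_d_ref - I) @ B)     # A^{-1} (A_d - I) B
B_d_1st = dt * B                                      # approximation in Equation (7)

err_A = np.max(np.abs(A_d_2nd - A_d_ref))
err_B = np.max(np.abs(B_d_1st - B_d_exact))
```

For this small time step both errors are several orders of magnitude below the entries of the matrices themselves; the closed form for $B_{d}$ additionally requires $A$ to be invertible.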

The state-space model of Equation (5) in the discrete-time, including the noise term $w_{k}$ , is written as

z_{k + 1} = A_{d} z_{k} + B_{d} u_{k}^{e} + w_{k}

(8)

where $u_{k}^{e}$ and $A_{d} (θ_{k})$ are the estimated input and system matrix of the prior step, which are considered as known quantities at the $k + 1$ step.

The equation which relates the measurements $y$ to the estimated dynamic states is written as

y_{k + 1} = H z_{k + 1} + w_{k + 1}^{y}

(9)

where $H$ is the observation matrix relating the measurements to the dynamic states. Here, it is chosen not to depend on the unknown parameters and input. To this end, and for limited information applications, $y$ consists of displacement and velocity pseudo-measurements, namely the integrals of the actual acceleration measurements. Additionally, the accelerations which are not measured are assumed to be equal to the estimated accelerations of the previous step. The matrix $H$ therefore accommodates only displacements and velocities observed from all DOFs with real or pseudo-measurements.

It may seem here that the acceleration responses are not covered by the observation matrix. However, this is chosen intentionally since it addresses two problems. First, the unknown input and parameters have not yet been estimated for the step $k + 1$. Second, the prior step parameters and input may negatively affect the observation equation when they are inaccurate.

More importantly, the presented observation model reflects the model for the pseudo-measurements rather than the actual measured quantities. In that case, the actual observation model, which relates the observed quantities to the state vector, is not defined. To clarify how different measurement scenarios are accommodated within this approach and at which step they weigh in, the reader is referred to Impraimakis and Smyth.⁴¹

The predicted covariance matrix $P_{k + 1}$ of the dynamic states is then written as

P_{k + 1} = A_{d} P_{k} A_{d}^{T} + Q_{d (k)}

(10)

where the discretized process and observation covariance matrices are

Q_{k - 1} \approx \frac{Q ((k - 1) Δ t)}{Δ t}, R_{k} = \frac{R (k Δ t)}{Δ t}

(11)

It is assumed, though, that the matrices are constant during the whole process; keeping them constant does not harm the estimation success. An investigation of their exact values, which strongly affect the success of the estimation,¹⁰⁷ is shown in Refs. 41 and 108.

Having provided the a priori prediction model for the dynamic states and their covariances, the update process starts according to the Kalman filter. The updated dynamic state estimate is specifically derived by a correction of the predicted dynamic states using the measurement pre-fit residual. This is multiplied and controlled by the optimal Kalman gain $J$, given as

J_{k + 1} = P_{k + 1} H^{T} N_{k + 1}^{- 1}

(12)

where the pre-fit residual covariance $N$ is

N_{k + 1} = H P_{k + 1} H^{T} + R_{d}

(13)

The final estimation of the posterior dynamic states is then given by

z_{k + 1} = z_{k + 1} + J_{k + 1} (y_{k + 1} - H z_{k + 1})

(14)

while the final estimation of the covariance of the dynamic states is given by

P_{k + 1} = (I_{2 n \times 2 n} - J_{k + 1} H) P_{k + 1}

(15)

For Equations (14) and (15), the same quantity on the right and left hand side implies that they are re-calculated at the same time step. The a priori estimate of the right hand side is used for the calculation of the a posteriori estimate on the left hand side.
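The prediction and update of Equations (8), (10), and (12) to (15) can be collected in one function, with the a priori and a posteriori quantities given separate names to avoid the overloaded notation. This is a minimal sketch of a standard Kalman step only; the function name, the two-state example, and all numerical values are assumptions, and the input and parameter estimation of the residual-based filter are not included.

```python
import numpy as np

def kalman_step(z, P, u_e, y_next, A_d, B_d, H, Q_d, R_d):
    """One prediction/update cycle for the dynamic states."""
    # A priori prediction, Equations (8) and (10)
    z_pred = A_d @ z + B_d @ u_e
    P_pred = A_d @ P @ A_d.T + Q_d
    # Pre-fit residual covariance and gain, Equations (13) and (12)
    N = H @ P_pred @ H.T + R_d
    J = np.linalg.solve(N.T, (P_pred @ H.T).T).T      # J = P H^T N^{-1}
    # A posteriori estimates, Equations (14) and (15)
    z_post = z_pred + J @ (y_next - H @ z_pred)
    P_post = (np.eye(len(z)) - J @ H) @ P_pred
    return z_post, P_post

# Hypothetical single-DOF example with displacement and velocity pseudo-measurements.
A_d = np.array([[1.0, 0.01], [-0.09, 0.9975]])
B_d = np.array([[0.0], [1e-4]])
H = np.eye(2)
Q_d = 1e0 * np.eye(2)
R_d = 1e-10 * np.eye(2)

z0, P0 = np.zeros(2), np.eye(2)
y1 = np.array([0.1, -0.2])
z1, P1 = kalman_step(z0, P0, np.array([0.0]), y1, A_d, B_d, H, Q_d, R_d)
```

With a process covariance much larger than the measurement covariance, as in this example, the filter trusts the pseudo-measurements almost entirely, so `z1` is practically equal to `y1`.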

Once the dynamic states are filtered using the pseudo-measurements and with the use of the parameters of the prior step, the input at the current step is approximated by the system model at the time instant $(k + 1) Δ t$ as

u_{k + 1}^{e} \approx G ({\overset{\cdot\cdot}{x}}_{k + 1}^{m}, z_{k}, θ_{k})

(16)

where $G (•)$ is the linear or nonlinear system model, which contains the prior step estimated parameters. Importantly, the predicted states are estimated using Equation (14), with the prior input and parameters only. The known input rows of $u_{k + 1}^{e}$ are replaced by the potential known zero or nonzero valued inputs.

For instance, the full expression of $G (•)$ function for a linear structural system model is written as

G ({\overset{\cdot\cdot}{x}}_{k + 1}^{m}, z_{k}, θ_{k}) = M \, {{}^{m}a}_{k + 1} + [K_{k} C_{k}] z_{k + 1}

(17)

where ${{}^{m}a}_{k + 1}$ are the acceleration measurements, and $M$ , $K_{k}$ , $C_{k}$ are the mass, stiffness, and damping matrices, respectively.
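For the linear case of Equation (17), the input estimate is a direct evaluation of the equation of motion. The sketch below assumes the state ordering $z = [x; \overset{\cdot}{x}]$ and uses a hypothetical 2-DOF system; the function name and the `known` argument are introduced only for this example.

```python
import numpy as np

def estimate_input(a_meas, z, M, C, K, known=None):
    """Equation (17): u = M a + [K C] z, with z = [x; x_dot] (assumed ordering)."""
    u = M @ a_meas + np.hstack([K, C]) @ z
    # Rows with known inputs are overwritten by their known values.
    for dof, value in (known or {}).items():
        u[dof] = value
    return u

# Hypothetical 2-DOF system and a consistent set of responses.
M = np.diag([100.0, 100.0])
C = np.array([[50.0, -25.0], [-25.0, 25.0]])
K = np.array([[1800.0, -900.0], [-900.0, 900.0]])

u_true = np.array([0.0, 12.0])                       # load only at the top DOF
x = np.array([0.002, 0.005])
x_dot = np.array([0.01, -0.03])
a = np.linalg.solve(M, u_true - C @ x_dot - K @ x)   # acceleration from the model

u_est = estimate_input(a, np.concatenate([x, x_dot]), M, C, K, known={0: 0.0})
```

With exact states, accelerations, and parameters, the load is recovered exactly; in the filter, the quality of the estimate depends on the filtered states and on the parameters of the prior step.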

For the parameter estimation, a sensitivity analysis approach is implemented by the Taylor series expansion truncated after the linear term. To provide a real-time estimation specifically, the measured outputs are chosen to be accelerations instead of the modal parameters, written as

ϵ_{k + 1} = {{}^{m}a}_{k + 1} - a_{k + 1} \approx r_{k + 1} + U_{k + 1} (θ - θ_{k + 1})

(18)

where $ϵ_{k + 1}$ , ${{}^{m}a}_{k + 1}$ , and $a_{k + 1}$ denote the error, the acceleration measurements, and the predicted output, respectively, at the step $k + 1$ . The sensitivity matrix $U_{k + 1}$ , which does not need an initial value or prior information, is written as

U_{k + 1} = - {[\frac{\partial {{}^{m}a}_{k + 1}}{\partial θ}]}_{θ = θ_{k + 1}}

(19)

where the error $ϵ_{k + 1}$ is assumed to be small for the parameter vector $θ$ in the vicinity of $θ_{k + 1}$ .

At each step, Equation (18) is solved by a Gauss–Newton gradient approach. The prior parameter estimates are corrected as

θ_{k + 1} = θ_{k} + Δ θ_{k + 1} \cdot e^{- μ {‖ ρ_{k + 1} ‖}_{2}}

(20)

where $μ$ is a scaling parameter and $‖ ρ_{k + 1} ‖_{2}$ is the Euclidean norm of the residual of the system model estimation. In practice, $e^{- μ {‖ ρ_{k + 1} ‖}_{2}}$ acts as a control factor for the convergence speed and fluctuation range. An investigation of this scaling parameter is shown in Refs. 41 and 108. A similar investigation can be done to define it for different types of model parameters within various dynamic systems.

For Equation (20), the residual of the system model estimation is

ρ_{k + 1} = u_{k + 1}^{e} - G ({\overset{\cdot\cdot}{x}}_{k + 1}^{m}, {\overset{\cdot}{x}}_{k + 1}, x_{k + 1}, θ_{k})

(21)

where $u_{k + 1}^{e}$ is the estimated input for the step $k + 1$ , and the dynamic states are provided by the Kalman filter.

For the objective function, the least-squares approach is formulated based on an additional scaling parameter $λ^{2}$. This balances the contribution of the parameter estimates. The final optimal $Δ θ_{k + 1}$ correction is provided by

Δ θ_{k + 1} = {[U_{k + 1}^{T} U_{k + 1} + λ^{2} I]}^{- 1} U_{k + 1}^{T} ρ_{k + 1}

(22)

where $λ^{2}$ remains constant during the real-time procedure. An investigation of this scaling parameter is shown in Refs. 41 and 108. A similar investigation can be done to set both scaling parameters for different types of model parameters within various dynamic systems. Importantly, it is seen here that the transitional model assumed for the system parameters is involved in the full input-parameter-state estimation. Taking partial derivatives is then required. Also, the scaling factor is tied to the difference between the estimated and the predicted input forces. The nature, the order of magnitude, and the governing equations for the input and model parameters are different, but this approach proves beneficial in yielding stable estimates for the model parameters.

Regarding the derivation of Equation (22), it is obtained as the optimal solution of an objective function minimization. The objective function, in which the scaling parameter $λ^{2}$ balances the contribution of the parameter estimation against the importance of the error $ϵ$, is written as

F (θ_{k + 1}) = ϵ_{k + 1}^{T} W_{ϵ} ϵ_{k + 1} + λ^{2} Δ θ_{k + 1}^{T} W_{θ} Δ θ_{k + 1}

(23)

where a penalization exists for the differences between the estimated parameters and the output error. Further derivation details are provided in Impraimakis and Smyth.⁴¹
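The regularized correction of Equation (22) and the damped update of Equation (20) can be sketched as below, assuming the sensitivity matrix $U_{k + 1}$ and the residual $ρ_{k + 1}$ are already available. The function name and the toy values of `U`, `theta`, and `delta` are hypothetical; the values of `lam2` and `mu` in the second call are those reported for the 6-story application.

```python
import numpy as np

def parameter_correction(theta, U, rho, lam2, mu):
    """Equations (22) and (20): regularized least-squares step with exponential damping."""
    n_par = U.shape[1]
    d_theta = np.linalg.solve(U.T @ U + lam2 * np.eye(n_par), U.T @ rho)  # Equation (22)
    return theta + d_theta * np.exp(-mu * np.linalg.norm(rho))            # Equation (20)

# Hypothetical sensitivity matrix, prior parameters, and a consistent residual.
U = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [0.5, -1.0]])
theta = np.array([900.0, 25.0])
delta = np.array([0.3, -0.2])
rho = U @ delta

theta_gn = parameter_correction(theta, U, rho, lam2=0.0, mu=0.0)     # plain least-squares step
theta_reg = parameter_correction(theta, U, rho, lam2=5e-2, mu=5e-3)  # regularized and damped step
```

Without regularization and damping, the full step `delta` is recovered. With $λ^{2} > 0$ and $μ > 0$, the step is always shorter, which is the mechanism that limits fluctuations of the parameter estimates.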

Structural loading identification in a 6-story building

For the numerical application of the GRU network, LSTM network, convolutional network, and residual-based Kalman filter for structural load identification with small datasets, consider the 6-story shear-type structure of Figure 2. The structure is described by the following equation:

M {\begin{matrix} {\overset{\cdot\cdot}{y}}_{1} (t) \\ {\overset{\cdot\cdot}{y}}_{2} (t) \\ {\overset{\cdot\cdot}{y}}_{3} (t) \\ {\overset{\cdot\cdot}{y}}_{4} (t) \\ {\overset{\cdot\cdot}{y}}_{5} (t) \\ {\overset{\cdot\cdot}{y}}_{6} (t) \end{matrix}} + C {\begin{matrix} {\overset{\cdot}{y}}_{1} (t) \\ {\overset{\cdot}{y}}_{2} (t) \\ {\overset{\cdot}{y}}_{3} (t) \\ {\overset{\cdot}{y}}_{4} (t) \\ {\overset{\cdot}{y}}_{5} (t) \\ {\overset{\cdot}{y}}_{6} (t) \end{matrix}} + K {\begin{matrix} y_{1} (t) \\ y_{2} (t) \\ y_{3} (t) \\ y_{4} (t) \\ y_{5} (t) \\ y_{6} (t) \end{matrix}} = {\begin{matrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ F_{6} (t) \end{matrix}}

(24)

for a shaker-type load input $F_{6} (t)$ at the top floor, namely at DOF 6, where the structure matrices to generate the simulated response measurements are

M = [\begin{matrix} m_{1} & 0 & 0 & 0 & 0 & 0 \\ 0 & m_{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & m_{3} & 0 & 0 & 0 \\ 0 & 0 & 0 & m_{4} & 0 & 0 \\ 0 & 0 & 0 & 0 & m_{5} & 0 \\ 0 & 0 & 0 & 0 & 0 & m_{6} \end{matrix}] = [\begin{matrix} 100 & 0 & 0 & 0 & 0 & 0 \\ 0 & 100 & 0 & 0 & 0 & 0 \\ 0 & 0 & 100 & 0 & 0 & 0 \\ 0 & 0 & 0 & 100 & 0 & 0 \\ 0 & 0 & 0 & 0 & 100 & 0 \\ 0 & 0 & 0 & 0 & 0 & 100 \end{matrix}],

\begin{matrix} C = [\begin{matrix} c_{1} + c_{2} & - c_{2} & 0 & 0 & 0 & 0 \\ - c_{2} & c_{2} + c_{3} & - c_{3} & 0 & 0 & 0 \\ 0 & - c_{3} & c_{3} + c_{4} & - c_{4} & 0 & 0 \\ 0 & 0 & - c_{4} & c_{4} + c_{5} & - c_{5} & 0 \\ 0 & 0 & 0 & - c_{5} & c_{5} + c_{6} & - c_{6} \\ 0 & 0 & 0 & 0 & - c_{6} & c_{6} \end{matrix}] \\ = [\begin{matrix} 25 + 25 & - 25 & 0 & 0 & 0 & 0 \\ - 25 & 25 + 50 & - 50 & 0 & 0 & 0 \\ 0 & - 50 & 50 + 50 & - 50 & 0 & 0 \\ 0 & 0 & - 50 & 50 + 75 & - 75 & 0 \\ 0 & 0 & 0 & - 75 & 75 + 75 & - 75 \\ 0 & 0 & 0 & 0 & - 75 & 75 \end{matrix}], \end{matrix}

\begin{matrix} K = & [\begin{matrix} k_{1} + k_{2} & - k_{2} & 0 & 0 & 0 & 0 \\ - k_{2} & k_{2} + k_{3} & - k_{3} & 0 & 0 & 0 \\ 0 & - k_{3} & k_{3} + k_{4} & - k_{4} & 0 & 0 \\ 0 & 0 & - k_{4} & k_{4} + k_{5} & - k_{5} & 0 \\ 0 & 0 & 0 & - k_{5} & k_{5} + k_{6} & - k_{6} \\ 0 & 0 & 0 & 0 & - k_{6} & k_{6} \end{matrix}] \\ = & [\begin{matrix} 900 + 900 & - 900 & 0 & 0 & 0 & 0 \\ - 900 & 900 + 1100 & - 1100 & 0 & 0 & 0 \\ 0 & - 1100 & 1100 + 1100 & - 1100 & 0 & 0 \\ 0 & 0 & - 1100 & 1100 + 1300 & - 1300 & 0 \\ 0 & 0 & 0 & - 1300 & 1300 + 1300 & - 1300 \\ 0 & 0 & 0 & 0 & - 1300 & 1300 \end{matrix}] \end{matrix}

with initial conditions $y (0) = [0 0 0 0 0 0]^{T}$ and $\overset{\cdot}{y} (0) = [0 0 0 0 0 0]^{T}$ . The input load at floor six is a harmonic loading decaying exponentially with various amplitude levels, angular frequencies, and unknown time instant of application.

Figure 2.

6-story shear-type building structure of section “Structural loading identification in a 6-story building.”

In order to create synthetic measurements, the fourth-order Runge–Kutta method of integration is utilized to compute the system response for 200 s. The sampling frequency for the dynamic state measurements is considered to be 100 Hz. Therefore, the time discretization $Δ t$ used in the Runge–Kutta numerical solution is 0.01 s. Finally, to consider measurement noise, each response signal is contaminated by a Gaussian white noise sequence with a $5\%$ root-mean-square noise-to-signal ratio. Different levels of noise are investigated in section “Discussion.”
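The generation of one synthetic dataset can be sketched as follows, using the structure matrices given above, the fourth-order Runge–Kutta scheme, and additive Gaussian noise at a 5% root-mean-square level. The load parameters in `load` (amplitude, decay rate, angular frequency, and onset time) are hypothetical, since they vary between datasets, and the helper names are introduced only for this example.

```python
import numpy as np

def shear_matrix(p):
    """Tridiagonal shear-building matrix from the story values p_1..p_n."""
    p = np.asarray(p, dtype=float)
    A = np.diag(p + np.append(p[1:], 0.0))
    A -= np.diag(p[1:], 1) + np.diag(p[1:], -1)
    return A

n = 6
M = 100.0 * np.eye(n)
C = shear_matrix([25, 25, 50, 50, 75, 75])
K = shear_matrix([900, 900, 1100, 1100, 1300, 1300])
Minv = np.linalg.inv(M)
A = np.block([[np.zeros((n, n)), np.eye(n)], [-Minv @ K, -Minv @ C]])
b6 = np.zeros(2 * n)
b6[-1] = Minv[-1, -1]                      # the load enters at DOF 6 only

def load(t, amp=100.0, decay=0.05, omega=2.0 * np.pi, t0=5.0):
    """Hypothetical exponentially decaying harmonic load applied from t0 onward."""
    return amp * np.exp(-decay * (t - t0)) * np.sin(omega * (t - t0)) if t >= t0 else 0.0

def f(t, z):
    """State-space right-hand side, z = [y; y_dot]."""
    return A @ z + b6 * load(t)

dt, T = 0.01, 200.0
steps = int(round(T / dt))
z = np.zeros((steps + 1, 2 * n))           # zero initial conditions
for i in range(steps):                     # classical fourth-order Runge-Kutta
    t = i * dt
    k1 = f(t, z[i])
    k2 = f(t + dt / 2, z[i] + dt / 2 * k1)
    k3 = f(t + dt / 2, z[i] + dt / 2 * k2)
    k4 = f(t + dt, z[i] + dt * k3)
    z[i + 1] = z[i] + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

F = np.array([load(i * dt) for i in range(steps + 1)])
acc = z @ A[n:].T + np.outer(F, b6[n:])    # accelerations from the equation of motion

rng = np.random.default_rng(0)
rms = np.sqrt(np.mean(acc**2, axis=0))     # per-channel signal RMS
acc_noisy = acc + 0.05 * rms * rng.standard_normal(acc.shape)
```

Repeating the simulation with different load parameters and noise realizations would produce the separate training, validation, and prediction datasets.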

A total of 21 available datasets from the simulations are formatted and divided into three subsets: 11 datasets for training, four datasets for validation, and six datasets for prediction of the structural loading. For the Kalman filter, the identification is performed in real time, without any training. Importantly, although the shear-type building study is numerical, the training, validation, and test sets are deliberately kept small (for instance, only 11 datasets for training) to match and directly compare to the case of section “Structural loading identification for a hotel in San Bernardino,” where 11 datasets are also used for training.

The neural network architectures, shown in Figure 1, are defined as follows. Each of the three network types has an input layer with the 11 signals, followed by a GRU, an LSTM, or a convolutional layer with 30 units; the dimension of the output vector is therefore 30, while the batch size equals 2. A rectifier layer, also termed ReLU, follows, together with a dropout layer of 0.3 for the first two networks. An additional GRU, LSTM, or convolutional layer with 30 units is then set, with an additional activation layer and, for the first two cases, a dropout layer. Finally, a dense layer of 100 neurons is defined for all cases, along with activation and dense layers. The learning rate is 0.0001. The Adaptive Moment Estimation (Adam) algorithm is used for the network optimization¹⁰⁹ and the number of epochs is 10,000. It is generally known that the performance of deep neural networks depends strongly on the setting of hyperparameters. The author set the network parameters according to Kingma and Ba¹⁰⁹ without any special adjustments that would potentially favor the dynamic load identification problem. Importantly, this architecture and the number of hidden units were selected because they have proven efficient in a number of structural engineering applications.^6,110,111 Finally, the influence of the dropout hyperparameter and of the number of layers is investigated in section “Discussion.”
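The layer ordering and hyperparameters described above can be summarized in a small, framework-independent sketch. It only lists the layers; it does not build or train a network. The names `TrainingConfig` and `layer_stack`, the single output channel, and the use of ReLU for the later activation layers are assumptions of this sketch, since the text does not state them explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters quoted in the text."""
    n_inputs: int = 11
    units: int = 30
    dense_units: int = 100
    dropout: float = 0.3
    batch_size: int = 2
    learning_rate: float = 1e-4
    epochs: int = 10_000
    optimizer: str = "adam"


def layer_stack(kind, cfg=TrainingConfig(), n_outputs=1):
    """List the layers in order for kind in {"gru", "lstm", "conv1d"}.

    n_outputs=1 (a single load channel) is an assumption of this sketch.
    """
    if kind not in ("gru", "lstm", "conv1d"):
        raise ValueError(f"unknown network type: {kind}")
    layers = [("input", cfg.n_inputs)]
    for _ in range(2):                       # two feature-extraction blocks
        layers.append((kind, cfg.units))
        layers.append(("activation", "relu"))
        if kind != "conv1d":                 # dropout only for GRU and LSTM
            layers.append(("dropout", cfg.dropout))
    layers.append(("dense", cfg.dense_units))
    layers.append(("activation", "relu"))
    layers.append(("dense", n_outputs))
    return layers


gru_layers = layer_stack("gru")
cnn_layers = layer_stack("conv1d")
```

Writing the stack as data makes the difference between the recurrent and convolutional variants explicit: the convolutional stack is the same sequence with the two dropout layers omitted.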

For the RKF, the process covariance $Q_{d}$ and the measurement covariance $R_{d}$ matrices are chosen to be constant during the identification process and equal to $10^{0} \cdot I_{6 \times 6}$ and $10^{- 10} \cdot I_{6 \times 6}$ , respectively. The parameter $\lambda^{2}$ is chosen to be $5 \times 10^{- 2}$ , while the parameter $\mu$ is chosen to be $5 \times 10^{- 3}$ .

All three cases are examined in Figures 3 to 5. They show the true and identified structural load (first column) for the six prediction datasets, on which the networks were never trained or validated. The load identification error at floor six is also shown (second column), as well as the comparison with the Kalman filter (third column). In all cases, acceleration measurements are selected only from stories 3, 5, and 6. For a different combination or number of measurements, different convergence timing is observed.

Figure 3.

Structure of section “Structural loading identification in a 6-story building”: results for the 6-story shear-type building when the LSTM neural network is used. First column: true and estimated loading at floor 6. Second column: error at loading identification. Third column: Residual-based Kalman filter performance.

Figure 4.

Structure of section “Structural loading identification in a 6-story building”: results for the 6-story shear-type building when the gated recurrent unit neural network is used. First column: true and estimated loading at floor 6. Second column: error at loading identification. Third column: Residual-based Kalman filter performance.

Figure 5.

Structure of section “Structural loading identification in a 6-story building”: results for the 6-story shear-type building when the convolutional neural network is used. First column: true and estimated loading at floor 6. Second column: error at loading identification. Third column: Residual-based Kalman filter performance.

Figure 3 refers to the case where the LSTM neural network is used. The results are satisfactory for all six cases. The exception is the first loading instants, where the load phase or amplitude is estimated slightly wrongly.

Figure 4 refers to the case where the GRU neural network is used. The results are also satisfactory for all six cases. However, a convergence time improvement is seen compared to the LSTM neural network case, as discussed in Table 2 of section “Discussion.” Once more, but to a lesser extent, the first loading instants are not satisfactory, for the same reasons as in Figure 3.

Figure 5 refers to the case where the CNN is used. The results for all six cases are not as satisfactory as with the previous networks. However, a clear and significant convergence time reduction is observed. Importantly, for all cases, the Kalman filter approach provided a better accuracy.

So far, only the error time history has been provided to indicate the performance of the presented approaches. As a more comprehensive evaluation metric, the accumulated error at each time instant is employed as

E (t) = \sum_{t_{k} = 0}^{t} \left| \frac{u_{pred} (t_{k}) - u_{true} (t_{k})}{u_{true} (t_{k})} \right|

(25)

where $t_{k}$ is the time instant at step $k$, and $u_{pred} (t_{k})$ and $u_{true} (t_{k})$ are the predicted and true load at $t_{k}$ .
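A minimal sketch of the accumulated error of Equation (25) is given below. The function name `accumulated_error` is an assumption, and so is the handling of samples where the true load is exactly zero: they are skipped here to avoid division by zero, which the text does not address.

```python
def accumulated_error(u_pred, u_true):
    """Running sum of |(u_pred - u_true) / u_true| over the samples.

    Returns a list whose k-th entry is E(t_k). Samples with u_true == 0 are
    skipped (an assumption of this sketch) so the relative error stays finite.
    """
    if len(u_pred) != len(u_true):
        raise ValueError("predicted and true loads must have the same length")
    total, history = 0.0, []
    for predicted, true in zip(u_pred, u_true):
        if true != 0.0:
            total += abs((predicted - true) / true)
        history.append(total)
    return history


# Small illustrative example with made-up load values.
u_true = [2.0, 4.0, 0.0, -5.0]
u_pred = [1.0, 5.0, 0.3, -4.0]
E = accumulated_error(u_pred, u_true)
```

Because E(t) is a running sum of non-negative terms, it never decreases; its final value is the single number reported per loading case in a table such as Table 1.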

Figure 6 shows the error metric E(t) of Equation (25). The results show that, among the networks, the LSTM network generally performs best but at a higher computational cost, as shown in Table 2 of section “Discussion.” The Kalman filter has the lowest error, as also summarized in Table 1.

Figure 6.

Structure of section “Structural loading identification in a 6-story building”: results for the 6-story shear-type building when the error metric E(t) of Equation (25) is used. First column: LSTM neural network. Second column: GRU neural network. Third column: CNN. Fourth column: Residual-based Kalman filtering.

Table 1.

Final value of error metric E(t) of Equation (25) for the 6-story building of section “Structural loading identification in a 6-story building.”

Case	LSTM network	GRU network	Conv network	Residual KF
• DOF 6 Loading 1	$358.8$	$546.3$	$933.7$	$15.9$
• DOF 6 Loading 2	$296.4$	$550.3$	$856.8$	$49.8$
• DOF 6 Loading 3	$307.6$	$171.8$	$849.8$	$42.5$
• DOF 6 Loading 4	$158.6$	$178.9$	$811.8$	$32.1$
• DOF 6 Loading 5	$428.3$	$686.9$	$923.0$	$51.5$
• DOF 6 Loading 6	$76.6$	$93.4$	$749.9$	$24.1$

LSTM: long short-term memory; DOF: degree of freedom; GRU: gated recurrent unit; KF: Kalman filter.

Structural loading identification for a hotel in San Bernardino

The methodologies are also examined on field sensing data. An examination is conducted on a 6-story hotel building in San Bernardino, California, with data sourced from the Center for Engineering Strong Motion Data.¹¹² The structure, a mid-rise concrete building designed in 1970, is equipped with nine accelerometers on the 1st floor, the 3rd floor, and the roof level, as depicted in Figure 7. These sensors captured seismic events from 1987 to 2018. The historical data serve as training inputs for the proposed deep learning models, which predict the structural loading induced by the ground motions. In this scenario, methodologies such as the Kalman filter are vulnerable to identifiability issues, and they cannot efficiently recover the input without assuming some model parameters to be known.¹¹³ Assuming known parameters would make the comparison with the networks unfair, as more information would be provided to the filter. In that case, the load estimation is unsurprisingly better, as already demonstrated in Eftekhar Azam et al.¹¹⁴

Figure 7.

Sensor layout of the 6-story hotel in San Bernardino of section “Structural loading identification for a hotel in San Bernardino” (Station No. 23287).

In this examination, the field data, characterized by varying sampling rates and high-frequency noise, undergo initial preprocessing involving resampling at 100 Hz and filtering. A total of 21 datasets are organized into three subsets: 11 for training, 4 for validation, and 6 for prediction. The seismic loading at the building base is considered over a duration of 70 s. Importantly, the neural network architectures are structured in a manner consistent with the approach detailed in section “Structural loading identification in a 6-story building.”

All three cases are examined in Figures 8 to 10. They show the true and identified structural load (first column) for the six prediction datasets, on which the networks were never trained or validated. The load identification error at the building base is shown in the second column. In all cases, acceleration measurements are selected only from stories 3 and 6. For a different combination or number of measurements, different convergence timing is observed.

Figure 8.

Structure of section “Structural loading identification for a hotel in San Bernardino”: results for the 6-story hotel in San Bernardino when the long short-term memory neural network is used. First column: true and estimated loading at the base. Second column: error at loading identification.

Figure 9.

Structure of section “Structural loading identification for a hotel in San Bernardino”: results for the 6-story hotel in San Bernardino when the gated recurrent unit neural network is used. First column: true and estimated loading at the base. Second column: error at loading identification.

Figure 10.

Structure of section “Structural loading identification for a hotel in San Bernardino”: results for the 6-story hotel in San Bernardino when the convolutional neural network is used. First column: true and estimated loading at the base. Second column: error at loading identification.

Figure 8 refers to the case where the LSTM neural network is used. The results are satisfactory for all six cases. On the other hand, the LSTM neural network has the highest computational cost and is therefore seen as the least favorable (Table 2).

Table 2.

Computational time for training the presented networks in minutes per 1000 epochs.

Case	LSTM network	GRU network	Conv network
• Shaker loading of section “Structural loading identification in a 6-story building”	337.13 min/1000e	310.14 min/1000e	7.44 min/1000e
• Seismic excitation loading of section “Structural loading identification for a hotel in San Bernardino”	461.31 min/1000e	452.95 min/1000e	15.57 min/1000e
• Hammer loading of section “Structural loading identification in the IASC-ASCE structural health monitoring benchmark problem”	1058.69 min/1000e	1044.82 min/1000e	7.21 min/1000e

LSTM: long short-term memory; GRU: gated recurrent unit.

Figure 9 refers to the case where the GRU neural network is used. The results are generally more satisfactory across the six cases, with a reduced convergence time compared to the LSTM network (Table 2).

Figure 10 refers to the case where the 1D CNN is used. The results are the most satisfactory for all six cases compared to the previous networks. In addition, a clear and significant convergence time reduction is observed (Table 2). It can then be concluded that, for the base excitation, the 1D CNN identifies and predicts the loading with better accuracy and with less computation time. This result does not hold for the top floor excitation of the previous investigation of section “Structural loading identification in a 6-story building.”

Finally, Figure 11 shows the error metric E(t) of Equation (25). The results show that the LSTM network has (relatively) poor performance with a higher computational cost, as shown in section “Discussion.” In general, the performance of the three neural networks seems better than for the building of section “Structural loading identification in a 6-story building.” This implies that base excitation, where all responses are related to the input, results in better learning for the networks than exciting only a single DOF as in section “Structural loading identification in a 6-story building.”

Figure 11.

Structure of section “Structural loading identification for a hotel in San Bernardino”: results for the 6-story hotel in San Bernardino when the error metric E(t) of Equation (25) is used. First column: LSTM neural network. Second column: GRU neural network. Third column: CNN.

Structural loading identification in the IASC-ASCE structural health monitoring benchmark problem

The proposed methodologies are also examined in a hammer-type loading scenario. This examination corresponds to the second phase of experiments conducted by the IASC-ASCE Structural Health Monitoring Task Group, tested at the University of British Columbia.^115–118 The study focuses on applying structural health monitoring techniques to data collected from a 4-story, 2-bay by 2-bay steel-frame structure, as shown in Figure 12. The structure, measuring 2.5 × 2.5 m in plan and 3.6 m tall, is mounted on a concrete slab outside the testing laboratory. To make the mass distribution more realistic, floor slabs were placed in each bay of every floor, with off-center masses on each floor.¹¹⁵ The experimental setup included three types of excitation: electrodynamic shaker, impact hammer, and ambient vibration. Accelerometers strategically placed across the structure facilitated the measurement of structural responses.

Figure 12.

IASC-ASCE structural health monitoring benchmark building of section “Structural loading identification in the IASC-ASCE structural health monitoring benchmark problem”, the Dynatron 5803A 12 lbf Impulse Hammer, and the monitor and console equipment.¹¹⁵

Fifteen accelerometers were positioned throughout the frame and the base to capture the responses of the test structure. The placement included sensors for measuring north–south and east–west motion.

The excitation and impact hammer tests employed a Dynatron 5803A 12 lbf Impulse Hammer. This hammer, equipped with a force transducer, recorded measurements during tests involving 3–5 hits. Impact locations were chosen on the south and east faces of the first floor in the southeast corner. A force transducer on the hammer tip measured the force input during impact tests. A 16-channel DasyLab acquisition system recorded structural responses, with sampling rates of 250 Hz for shaker and ambient tests, and 1000 Hz for hammer tests. Anti-aliasing filters were applied selectively, and the data acquisition system commenced before the first impact, recording a series of hits within each test.

Since only a very limited amount of data was available for the same damage scenario, namely the same structure, the multi-impact hammer signals are split into four shorter signals, each containing a single hammer impact. The networks are then trained with two signals, validated with one signal, and tested on the remaining signal. This approach examines the capability of the networks on extremely limited datasets. Importantly, the neural network architectures are defined similarly to section “Structural loading identification in a 6-story building.”

All three network cases are examined in Figure 13. The plots show the true and the identified structural load (first column) for the unknown load, on which the networks were never trained or validated, and the load identification error (second column). In all cases, acceleration measurements are selected from all stories.
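One possible way to split a multi-impact hammer record into single-impact segments is sketched below. The text does not state how the splitting was done, so the threshold-based detection, the midpoint cuts, and the names `split_impacts`, `threshold`, and `min_gap` are all assumptions made for illustration, and the force record is synthetic.

```python
def split_impacts(force, threshold, min_gap):
    """Split a multi-impact record into single-impact index ranges.

    An impact starts at the first sample whose |force| exceeds `threshold`
    and lies at least `min_gap` samples after the previous impact start.
    Cuts are placed midway between successive impact starts.
    """
    starts = []
    for i, value in enumerate(force):
        if abs(value) > threshold and (not starts or i - starts[-1] >= min_gap):
            starts.append(i)
    cuts = [0] + [(a + b) // 2 for a, b in zip(starts, starts[1:])] + [len(force)]
    return [(cuts[j], cuts[j + 1]) for j in range(len(starts))]


# Synthetic record: four short pulses in 4 s of data sampled at 1000 Hz.
force = [0.0] * 4000
for start in (500, 1500, 2500, 3500):
    for offset in range(3):
        force[start + offset] = 10.0

segments = split_impacts(force, threshold=1.0, min_gap=200)

# Two segments for training, one for validation, one for testing.
train, validation, test = segments[:2], segments[2], segments[3]
```

The same index ranges would then be applied to the acceleration channels so that each segment pairs one impact with its structural response.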

Figure 13.

Structure of section “Structural loading identification in the IASC-ASCE structural health monitoring benchmark problem”: results for the IASC-ASCE structural health monitoring benchmark problem when the LSTM, GRU, and CNN are used. First column: true and estimated loading on the impact hammer scenario. Second column: error at loading identification.

Figure 13 top plots refer to the case where the CNN is used. The results are the most satisfactory compared to the other two models. At the same time, the CNN has the lowest computation cost, and it is seen as the most favorable. The error seen is attributed to the delay on the impact load time, and not to the wrong amplitude.

Figure 13 middle plots refer to the case where the LSTM neural network is used. The results are not satisfactory, and the convergence time is longer than in the previous case.

Figure 13 bottom plots refer to the case where the GRU neural network is used. The results are also not satisfactory, but with a shorter convergence time compared to the previous case.

Finally, Figure 14 refers to the case where the error metric E(t) of Equation (25) is used. The results show that the convolutional network performs better and with lower computation cost, shown in section “Discussion.”

Figure 14.

Structure of section “Structural loading identification in the IASC-ASCE structural health monitoring benchmark problem”: Results for the IASC-ASCE structural health monitoring benchmark problem when the error metric E(t) of Equation (25) is used.

The poor performance of the LSTM and GRU neural networks in this investigation is expected. The intuition behind them is to create an additional module in a neural network that learns when to remember and when to forget some characteristic of the provided signal. In other words, the network effectively learns which patterns in the signal might be needed and when that information is no longer needed. This poses a disadvantage for structural load identification in a hammer impact case, as the impact is seen as an unexpected excitation of the structure. The networks wrongly assume that it is not related to the structural response and does not play an important role in the final prediction, and they therefore neglect it. In the hammer test scenario, this “unexpected” excitation is the true loading, and the networks wrongly forget it.

Discussion

The presented work provided a simple, yet effective, way to identify the load in structural dynamics. It did not aim to present an advancement in machine learning algorithms, but rather to apply the vast capabilities of such tools to the structural load identification problem. To this end, the efficiency and robustness of the methods were tested on both simulated and real data, and for different loading types.

This work provided an assessment of the GRU networks, LSTM networks, and CNN in the framework of limited datasets. For the structural health monitoring of civil structures, this is realistic. All the presented tools can perform much better in a big data availability scenario, but this is often impractical. Despite the small dataset investigation, all the tools showed great capability.

Regarding the network algorithm parameters, the examinations so far suggest using high values for the filter size and the number of neurons in the layers. The first defines the kernel by which the data are multiplied, while the second determines the number of feature maps. However, for the dropout layer parameter, a large value may lead to poorer performance. This is illustrated in Figure 15, where the system of section “Structural loading identification for a hotel in San Bernardino” under seismic loading was modeled using a dropout value of 0.75.

Figure 15.

Structure of section “Structural loading identification for a hotel in San Bernardino” in section “Discussion”: results for the 6-story hotel in San Bernardino when the gated recurrent unit neural network is used with dropout layer value of 0.75. First column: true and estimated loading at the base. Second column: error at loading identification.

The recommendation of high values for the filter size and the number of neurons in the layers may sound restrictive or suboptimal, since it leads to a larger number of weights for back-propagation and ultimately to a higher computational cost. Despite this, the computational cost of this approach is bearable. This is attributed to two main reasons: the one-dimensional nature of the data, and the small-dataset training approach that was implemented. Future research is recommended on their optimal values or on improved network architectures. The author also investigated further reducing the computational cost by removing layers from the architecture of section “Structural loading identification in a 6-story building.” Specifically, one set of GRU, ReLU, and dropout layers was removed; this resulted in faster convergence but poorer performance (Figure 16).

Figure 16.

Structure of section “Structural loading identification for a hotel in San Bernardino” in section “Discussion”: Results for the 6-story hotel in San Bernardino when the gated recurrent unit neural network is used with fewer network layers. First column: true and estimated loading at the base. Second column: error at loading identification.

Regarding the robustness of the proposed approaches against noise, additional simulations are provided. Here, the data are contaminated by a Gaussian white noise sequence with a $10\%$, $15\%$, and $20\%$ root-mean-square noise-to-signal ratio for the Kalman filter, and a $15\%$ root-mean-square noise-to-signal ratio for the neural networks. The accumulated dynamic load identification error E(t) of Equation (25) is shown for all noise levels of the Kalman filter in Figure 17. A higher noise level results in a higher error.

Figure 17.

Structure of section “Structural loading identification in a 6-story building” in section “Discussion”: results for the 6-story shear-type building when the data are contaminated by a Gaussian white noise sequence with a $10\%$, $15\%$, and $20\%$ root-mean-square noise-to-signal ratio for the residual-based Kalman filter. Accumulated error of Equation (25) for $10\%$ noise (left plot), $15\%$ noise (middle plot), and $20\%$ noise (right plot).

The presented Kalman filter approach performs joint input-state-parameter estimation. The parameter estimation results for the building study of section “Structural loading identification in a 6-story building” are also shown in Figure 18 for the fourth floor parameters and for all noise levels. More parameter and noise results are shown in Impraimakis and Smyth.⁴¹ The parameter estimates slowly converge to the true values for $10\%$ noise, while for the higher noise cases they take even more time to converge. At high noise levels, the convergence may not occur within the identification duration of 200 s.

Figure 18.

Structure of section “Structural loading identification in a 6-story building” in section “Discussion”: Results for the 6-story shear-type building when the data are contaminated by a Gaussian white noise sequence with a $10\%$, $15\%$, and $20\%$ root-mean-square noise-to-signal ratio for the residual-based Kalman filter. True and estimated stiffness and damping parameters of DOF four for $10\%$ noise (first column), $15\%$ noise (second column), and $20\%$ noise (third column).

An investigation is also made for the networks when the data are contaminated by a Gaussian white noise sequence with a $15\%$ root-mean-square noise-to-signal ratio. All networks underperformed compared with the results of section “Structural loading identification in a 6-story building,” as the higher noise carries over into the prediction. Figure 19 shows the dynamic load identification for the first case of each network, the error compared to the true loading, and the accumulated error of Equation (25).

Figure 19.

Structure of section “Structural loading identification for a hotel in San Bernardino” in section “Discussion”: Results for the 6-story hotel in San Bernardino when the data are contaminated by a Gaussian white noise sequence with a $15\%$ root-mean-square noise-to-signal ratio for all networks. True and estimated loading, error compared to true loading, and accumulated error of Equation (25) for the long short-term memory network (first row), the gated recurrent network (second row), and the convolutional network (third row).

Finally, Figure 20 shows the Kalman filter approach for the impact load identification case for the section “Structural loading identification in a 6-story building” building case. It is not applied directly to the IASC-ASCE structural health monitoring benchmark problem to avoid first creating a nonphysically parametrized reduced order model; a task suggested for future research.

Figure 20.

Structure of section “Structural loading identification in a 6-story building” in section “Discussion”: results for the 6-story shear-type building when the impact load of section “Structural loading identification in the IASC-ASCE structural health monitoring benchmark problem” is examined with the residual-based Kalman filter. True and estimated loading (first plot), error compared to true loading (middle plot), and accumulated error of Equation (25) (last plot).

Another concern is related to the purely data-driven training of the presented tools. It has been observed^{110,119–126} that including a physics-aware constraint or a mathematical model improves the training. However, this is not always practical for large-scale structures, as it requires full system identification, which in turn requires even more data collection. The tools presented here do not require any parameter estimation to perform the structural load identification. It is important to mention that when a physics-based model is available, the computational cost of the training is reduced for all neural networks, and the accuracy is improved. Regarding the training time specifically, this study showed that the CNN has the lowest one, while between the remaining two, the GRU has the shorter convergence time. Table 2 shows the computational time for training the presented networks in minutes per 1000 epochs for all networks and applications. The computer used has a 12th Gen Intel(R) Core(TM) i7-12850HX processor (24 CPUs), two NVIDIA RTX A2000 8 GB GPUs, and 32.0 GB of RAM.
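To put the rates of Table 2 into perspective, the short sketch below converts minutes per 1000 epochs into total training hours and computes the ratio between two networks. It assumes the 10,000 epochs quoted for the 6-story building study apply to every case, which the text states only indirectly; the dictionary keys and the helper names `total_hours` and `speedup` are illustrative.

```python
# Rates copied from Table 2, in minutes per 1000 epochs.
minutes_per_1000_epochs = {
    "shaker": {"LSTM": 337.13, "GRU": 310.14, "Conv": 7.44},
    "seismic": {"LSTM": 461.31, "GRU": 452.95, "Conv": 15.57},
    "hammer": {"LSTM": 1058.69, "GRU": 1044.82, "Conv": 7.21},
}

EPOCHS = 10_000  # assumption: same epoch count for all three case studies


def total_hours(rate, epochs=EPOCHS):
    """Total training time in hours for a rate given in min per 1000 epochs."""
    return rate * epochs / 1000 / 60


def speedup(case, slow, fast):
    """How many times faster `fast` trains than `slow` for one case study."""
    rates = minutes_per_1000_epochs[case]
    return rates[slow] / rates[fast]


hours = {case: {net: total_hours(rate) for net, rate in rates.items()}
         for case, rates in minutes_per_1000_epochs.items()}
conv_vs_lstm_hammer = speedup("hammer", "LSTM", "Conv")
gru_vs_lstm_shaker = speedup("shaker", "LSTM", "GRU")
```

Under this assumption the recurrent networks need on the order of days of training for the shaker case, whereas the convolutional network needs roughly an hour, and the GRU gain over the LSTM is below ten percent.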

Future research

Regarding joint input-state estimation, the methodologies require knowledge of the structural model and parameters. This may be infeasible, or it may require the collection of additional data to perform full system identification. A way to address this issue is the use of joint input-parameter-state estimation methodologies such as the RKF of section “Dynamic load identification using physics-based residual Kalman filtering.” However, these methodologies also have deficiencies. For instance, the loading has to have a zero mean value to be filtered, or the location of the loading, or zero-valued inputs at known locations, must be known for identifiability reasons.^113,127,128 In contrast, for the networks, a different combination or number of measurements changes the convergence timing, but unidentifiability issues do not occur.

By following a purely data-driven approach, the user also does not have to consider different model classes and then select the optimal one. Model class selection approaches calculate the evidence of each candidate model given the available measured data, and they ultimately prefer the simpler models over the unnecessarily complicated ones. The importance of those methods is highlighted by the fact that a more complicated model fits the data better than one which has fewer adjustable uncertain parameters. This is attributed to parameter fitting that depends too much on the details of the data and the measurement noise. On the other hand, the presented networks solve the structural load identification problem without the need to select the structural model class.

Another concern is related to the investigation of structures other than buildings. In another case, such as a bridge investigation, not all responses may be directly sensitive to the loading. As a result, the networks could perform poorly. Additional research is therefore suggested for civil structures other than buildings.

Another concern is related to the extrapolation capabilities of the approach, since only the inputs and outputs are used for the training and the load identification. The examinations so far showed the potential of the method when the structural model remains the same. However, this assumption may not hold if the structure changes, for instance through damage or any other modification. The author slightly changed the simulated structural model of section “Structural loading identification in a 6-story building,” keeping the same trained neural network models, and they all underperformed. This does not occur in the physics-based Kalman filter approach. As a result, the deep learning approaches cannot be relied upon to extrapolate, that is, to predict the structural load with good performance for structures whose properties lie outside the training dataset. When they are employed on a real engineering system where the structure may change, one must have some prior belief about the expected model patterns in order to generate comprehensive training datasets. The networks then need to be retrained to ensure good future predictions. This is a pertinent test for structural load approaches in engineering applications, as there could be high-cost or safety-critical ramifications if the loading is confidently predicted incorrectly.

Regarding the application of the Kalman filter approach for input estimation in the base-excited building case study of section “Structural loading identification for a hotel in San Bernardino,” the current work did not assume the extra information of known model parameters in the Kalman filtering, for a fair comparison with the networks on the same dataset. Assuming them obviously results in an even better performance of the Kalman filtering, as already presented by Eftekhar Azam et al.¹¹⁴ A limitation, and a suggestion for future work, is then how to implement the residual-based Kalman filtering for the base excitation scenario (which excites all DOFs), where input-parameter-state estimation fails the identifiability tests.¹¹³ Similarly, the case of section “Structural loading identification in the IASC-ASCE structural health monitoring benchmark problem” requires reduced order modeling, which results in a nonphysical parameter estimation not examined here; it is suggested for future research. A further suggestion is to combine Kalman filtering and neural networks, as already done for dynamic state estimation.³

A final concern is related to the uncertainty quantification that the structural load identification methodology should provide.^129–131 Accurately representing the uncertainty around predictions is a desirable property for structural load prediction approaches. In the framework of GRU, LSTM, and CNN, this may be crudely achieved by retraining the model multiple times and taking the average and the remaining statistical properties of the network predictions, or by using a variational inference approach, while for the Kalman filter it may be achieved by incorporating the unknown input in the state vector.²³
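The crude retraining-based option mentioned above can be sketched as follows: the predictions of several independently retrained networks are collected, and the mean and sample standard deviation are taken at each time step. The function name `ensemble_statistics` and the numerical values are made up for illustration; no actual network is trained here.

```python
from statistics import mean, stdev


def ensemble_statistics(predictions):
    """Per-time-step mean and sample standard deviation over retrained models.

    predictions[r][k] is the load predicted by retrained model r at step k.
    """
    if len(predictions) < 2:
        raise ValueError("at least two retrained models are needed")
    per_step = list(zip(*predictions))      # regroup the values by time step
    return [mean(s) for s in per_step], [stdev(s) for s in per_step]


# Hypothetical load predictions from three retrained networks.
runs = [
    [1.0, 2.0, 3.0],
    [1.2, 1.8, 3.3],
    [0.8, 2.2, 2.7],
]
mean_load, std_load = ensemble_statistics(runs)
```

The spread across retrained models reflects only the variability of training (initialization, data ordering), so it should be read as a rough indicator rather than a calibrated uncertainty.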

Conclusion

The dynamic structural load identification capabilities of the GRU, LSTM, and CNNs were examined herein. The examination was on realistic small dataset training conditions, and on a comparative view to the physics-based RKF. The dynamic load identification suffers from the uncertainty related to obtaining poor predictions when in civil engineering applications only a low number of tests are performed or are available, or when the structural model is unidentifiable. In considering the methods, first, a simulated structure was investigated under a shaker excitation at the top floor. Second, a building in California was investigated under seismic base excitation, which results in loading for all DOFs. Finally, the IASC-ASCE structural health monitoring benchmark problem was examined for impact and instant loading conditions.

Overall, these network methods allowed for structural load identification with

No need for data filtering for reasonable noise levels.

No need for system identification, known structural parameters, or a structural model.

Real-time prediction when the networks are trained.

Capability of providing the structural load identification for all loading types, with respect to the use of the appropriate network each time.

Reasonable computational cost for small datasets scenarios.

Importantly, the methods were shown to outperform each other on different loading scenarios, while the RKF was shown to outperform the networks in physically parametrized identifiable cases.

Footnotes

Acknowledgements

The author would like to gratefully acknowledge the reviewers for their constructive comments, Editor T.C. for the friendly communication, Andrew W. Smyth for the previous insightful discussions on residual-based Kalman filtering, and the Center for Engineering Strong Motion Data and the Structural Health Monitoring Task Group for providing the data.

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Marios Impraimakis

References

1. Liu, Dobriban, Hou, et al. Dynamic load identification for mechanical systems: a review. Arch Comput Methods Eng 2022; 29(2): 831–863.

2. Ching, Beck, Porter, et al. Bayesian state estimation method for nonlinear systems and its application to recorded seismic response. J Eng Mech 2006; 132(4): 396–410.

3. Liu, Lai, Bacsa, et al. Neural extended Kalman filters for learning and predicting dynamics of structural systems. Struct Health Monit 2024; 23(2): 1037–1052.

4. Mansouri, Avci, Nounou, et al. A comparative assessment of nonlinear state estimation methods for structural health monitoring. In: Model validation and uncertainty quantification, Volume 3: Proceedings of the 33rd IMAC, a conference and exposition on structural dynamics, Cham, 2015, pp. 45–54. Springer.

5. R-T, Jahanshahi. Deep convolutional neural network for structural dynamic response estimation and system identification. J Eng Mech 2019; 145(1): 04018125.

6. Zhang, Chen, et al. Deep long short-term memory networks for nonlinear structural seismic response prediction. Comput Struct 2019; 220: 55–68.

7. Glisic, Kim, et al. Convolutional neural network-based wind-induced response estimation model for tall buildings. Comput Aided Civ Infrastruct Eng 2019; 34(10): 843–858.

8. Guarize, Matos NAF, Sagrilo LVS, et al. Neural networks in the dynamic response analysis of slender marine structures. Appl Ocean Res 2007; 29(4): 191–198.

9. Ying, Chong, Hui, et al. Artificial neural network prediction for seismic response of bridge structure. In: 2009 International conference on artificial intelligence and computational intelligence, Shanghai, China, 2009, vol. 2, pp. 503–506. IEEE.

10. Papadrakakis, Papadopoulos, Lagaros. Structural reliability analyis of elastic-plastic structures using neural networks and Monte Carlo simulation. Comput Methods Appl Mech Eng 1996; 136(1–2): 145–163.

11. Christiansen, Høgsberg, Winther. Artificial neural networks for nonlinear dynamic response simulation in mechanical systems. Nordic Seminar Comput Mech 2011; 24: 77–80.

12. Masri, Smyth, Chassiakos, et al. Application of neural networks for detection of changes in nonlinear systems. J Eng Mech 2000; 126(7): 666–676.

13. Olivier, Smyth. A marginalized unscented Kalman filter for efficient parameter estimation with applications to finite element models. Comput Methods Appl Mech Eng 2018; 339: 615–643.

14. Impraimakis, Smyth. Integration, identification, and assessment of generalized damped systems using an online algorithm. J Sound Vibr 2022; 523: 116696.

15. Ebrahimian, Astroza, Conte. Extended Kalman filter for material parameter estimation in nonlinear structural finite element models using direct differentiation method. Earthquake Eng Struct Dynamics 2015; 44(10): 1495–1522.

16. Chatzi, Hiriyur, Waisman, et al. Experimental application and enhancement of the XFEM–GA algorithm for the detection of flaws in structures. Comput Struct 2011; 89(7–8): 556–570.

17. Azam, Chatzi, Papadimitriou. A dual Kalman filter approach for state estimation via output-only acceleration measurements. Mech Syst Signal Process 2015; 60: 866–886.

18. Nayek, Chakraborty, Narasimhan. A Gaussian process latent force model for joint input-state estimation in linear structural systems. Mech Syst Signal Process 2019; 128: 497–530.

19. Anagnostou, Pal. Derivative-free Kalman filtering based approaches to dynamic state estimation for power systems with unknown inputs. IEEE Trans Power Syst 2017; 33(1): 116–130.

20. Liu, Wang. Kalman filter–random forest-based method of dynamic load identification for structures with interval uncertainties. Struct Control Health Monit 2022; 29(5): e2935.

21. Law, Yung, Yuan. An interpretive method for moving force identification. J Sound Vibr 1999; 219(3): 503–524.

22. Chan THT. Recent research on identification of moving loads on bridges. J Sound Vibr 2007; 305(1–2): 3–21.

23. Lourens, Reynders, De Roeck, et al. An augmented Kalman filter for force identification in structural dynamics. Mech Syst Signal Process 2012; 27: 446–460.

24. Ghahremani, Kamwa. Dynamic state estimation in power system by applying the extended Kalman filter with unknown inputs to phasor measurements. IEEE Trans Power Syst 2011; 26(4): 2556–2566.

25. Vettori, Lorenzo, Peeters, et al. An adaptive-noise augmented Kalman filter approach for input-state estimation in structural dynamics. Mech Syst Signal Process 2023; 184: 109654.

26. Hassanabadi, Liu, Azam, et al. A linear Bayesian filter for input and state estimation of structural systems. Comput Aided Civ Infrastruct Eng 2023; 38: 1749–1766.

27. C-C, Liang. A study on an estimation method for applied force on the rod. Comput Methods Appl Mech Eng 2000; 190(8–10): 1209–1220.

28. Valikhani, Younesian. Bayesian framework for simultaneous input/state estimation in structural and mechanical systems. Struct Control Health Monit 2019; 26(9): e2379.

29. Naets, Croes, Desmet. An online coupled state/input/parameter estimation approach for structural dynamics. Comput Methods Appl Mech Eng 2015; 283: 1167–1188.

30. Dertimanis, Chatzi, Eftekhar Azam, et al. Input-state-parameter estimation of structural systems from limited output information. Mech Syst Signal Process 2019; 126: 711–746.

31. Ghorbani, Buyukozturk, Cha Y-J. Hybrid output-only structural system identification using random decrement and Kalman filter. Mech Syst Signal Process 2020; 144: 106977.

32. Castiglione, Astroza, Azam, et al. Auto-regressive model based input and parameter estimation for nonlinear finite element models. Mech Syst Signal Process 2020; 143: 106779.

33. Maes, Karlsson, Lombaert. Tracking of inputs, states and parameters of linear structural dynamic systems. Mech Syst Signal Process 2019; 130: 755–775.

34. Lei, Xia, Erazo, et al. A novel unscented Kalman filter for recursive state-input-system identification of nonlinear systems. Mech Syst Signal Process 2019; 127: 120–135.

35. Song. Generalized minimum variance unbiased joint input-state estimation and its unscented scheme for dynamic systems with direct feedthrough. Mech Syst Signal Process 2018; 99: 886–920.

36. Rogers, Worden, Cross. On the application of Gaussian process latent force models for joint input-state-parameter estimation: with a view to Bayesian operational identification. Mech Syst Signal Process 2020; 140: 106580.

37. Huang, Yuen K-V, Wang. Real-time simultaneous input-state-parameter estimation with modulated colored noise excitation. Mech Syst Signal Process 2022; 165: 108378.

38. Teymouri, Sedehi, Katafygiotis, et al. Input-state-parameter-noise identification and virtual sensing in dynamical systems: a Bayesian expectation-maximization (BEM) perspective. Mech Syst Signal Process 2023; 185: 109758.

39. Capalbo, Gregoriis, Tamarozzi, et al. Parameter, input and state estimation for linear structural dynamics using parametric model order reduction and augmented Kalman filtering. Mech Syst Signal Process 2023; 185: 109799.

40. Impraimakis, Smyth. An unscented Kalman filter method for real time input-parameter-state estimation. Mech Syst Signal Process 2022; 162: 108026.

41. Impraimakis, Smyth. A new residual-based Kalman filter for real time input–parameter–state estimation using limited output information. Mech Syst Signal Process 2022; 178: 109284.

42. Lin J-W, Betti, Smyth, et al. On-line identification of non-linear hysteretic structural systems using a variable trace approach. Earthquake Eng Struct Dynamics 2001; 30(9): 1279–1303.

43. De Angelis, Luş, Betti, et al. Extracting physical parameters of mechanical models from identified state-space representations. J Appl Mech 2002; 69(5): 617–625.

44. Sun, Luş, Betti. Identification of structural models using a modified artificial bee colony algorithm. Comput Struct 2013; 116: 59–74.

45. Zhi, Fang. Identification of wind loads and estimation of structural responses of super-tall buildings by an inverse method. Comput Aided Civ Infrastruct Eng 2016; 31(12): 966–982.

46. Impraimakis, Smyth. Input-parameter-state estimation of limited information wind-excited systems using a sequential Kalman filter. Struct Control Health Monit 2022; 29: 1–16.

47. Brewick, Smyth. An investigation of the effects of traffic induced local dynamics on global damping estimates using operational modal analysis. Mech Syst Signal Process 2013; 41(1–2): 433–453.

48. Papadimitriou, Argyris. Bayesian optimal experimental design for parameter estimation and response predictions in complex dynamical systems. Procedia Eng 2017; 199: 972–977.

49. Capellari, Chatzi, Mariani. Structural health monitoring sensor network optimization through Bayesian experimental design. ASCE-ASME J Risk Uncertainty Eng Syst Part A: Civ Eng 2018; 4(2): 04018016.

50. Ercan, Sedehi, Katafygiotis, et al. Information theoretic-based optimal sensor placement for virtual sensing using augmented Kalman filtering. Mech Syst Signal Process 2023; 188: 110031.

51. Papadimitriou, Beck, S-K. Entropy-based optimal sensor location for structural model updating. J Vibr Control 2000; 6(5): 781–800.

52. Cumbo, Mazzanti, Tamarozzi, et al. Advanced optimal sensor placement for Kalman-based multiple-input estimation. Mech Syst Signal Process 2021; 160: 107830.

53. Flynn, Todd. A Bayesian approach to optimal sensor placement for structural health monitoring with application to active sensing. Mech Syst Signal Process 2010; 24(4): 891–903.

54. Bagirgan, Mehrjoo, Moaveni, et al. Iterative optimal sensor placement for adaptive structural identification using mobile sensors: numerical application to a footbridge. Mech Syst Signal Process 2023; 200: 110556.

55. Mehrjoo, Song, Moaveni, et al. Optimal sensor placement for parameter estimation and virtual sensing of strains on an offshore wind turbine considering sensor installation cost. Mech Syst Signal Process 2022; 169: 108787.

56. Rainieri, Fabbrocino. Automated output-only dynamic identification of civil engineering structures. Mech Syst Signal Process 2010; 24(3): 678–695.

57. Bao, Tang, et al. Computer vision and deep learning–based data anomaly detection method for structural health monitoring. Struct Health Monit 2019; 18(2): 401–421.

58. Atha, Jahanshahi. Evaluation of deep learning approaches based on convolutional neural networks for corrosion detection. Struct Health Monit 2018; 17(5): 1110–1128.

59. Wang, et al. A novel deep learning-based method for damage identification of smart building structures. Struct Health Monit 2019; 18(1): 143–163.

60. Abdeljaber, Avci, Kiranyaz, et al. Real-time vibration-based structural damage detection using one-dimensional convolutional neural networks. J Sound Vibr 2017; 388: 154–170.

61. Abdeljaber, Avci, Kiranyaz, et al. 1-D CNNs for structural damage detection: verification on a structural health monitoring benchmark data. Neurocomputing 2018; 275: 1308–1317.

62. Kolappan Geetha, Yang H-J, Sim S-H. Fast detection of missing thin propagating cracks during deep-learning-based concrete crack/non-crack classification. Sensors 2023; 23(3): 1419.

63. Impraimakis. A convolutional neural network deep learning method for model class selection. Earthquake Eng Struct Dynamics 2024; 53(2): 784–814.

64. Cha Y-J, Choi, Büyüköztürk. Deep learning-based crack damage detection using convolutional neural networks. Comput Aided Civ Infrastruct Eng 2017; 32(5): 361–378.

65. Zhai, Narazaki, Wang, et al. Synthetic data augmentation for pixel-wise steel fatigue crack identification using fully convolutional networks. 2022; 29: 237–250.

66. Eren, Ince, Kiranyaz. A generic intelligent bearing fault diagnosis system using compact adaptive 1D CNN classifier. J Signal Process Syst 2019; 91: 179–189.

67. Zhang, Peng, et al. A deep convolutional neural network with new training methods for bearing fault diagnosis under noisy environment and different working load. Mech Syst Signal Process 2018; 100: 439–453.

68. Avci, Abdeljaber, Kiranyaz, et al. A review of vibration-based damage detection in civil structures: from traditional methods to machine learning and deep learning applications. Mech Syst Signal Process 2021; 147: 107077.

69. Guo, Chen, Shen. Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis. Measurement 2016; 93: 490–502.

70. Janssens, Slavkovikj, Vervisch, et al. Convolutional neural network based fault detection for rotating machinery. J Sound Vibr 2016; 377: 331–345.

71. Liu, Tang, et al. An ensemble deep convolutional neural network model with improved DS evidence fusion for bearing fault diagnosis. Sensors 2017; 17(8): 1729.

72. Quqa, Martakis, Movsessian, et al. Two-step approach for fatigue crack detection in steel bridges using convolutional neural networks. J Civ Struct Health Monit 2022; 12(1): 127–140.

73. Diao, Wang, et al. Structural damage identification based on variational mode decomposition–Hilbert transform and CNN. J Civ Struct Health Monit 2023; 13: 1–15.

74. Sun S-B, Y-Y, Zhou S-D, et al. A data-driven response virtual sensor technique with partial vibration measurements using convolutional neural network. Sensors 2017; 17(12): 2888.

75. Hubel, Wiesel. Receptive fields and functional architecture of monkey striate cortex. J Physiol 1968; 195(1): 215–243.

76. Fukushima. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybernet 1980; 36(4): 193–202.

77. LeCun, Boser, Denker, et al. Handwritten digit recognition with a back-propagation network. Adv Neural Inform Process Syst 1989; 2: 396–404.

78. Kolen, Kremer. A field guide to dynamical recurrent networks. New York, NY: John Wiley & Sons, 2001.

79. Houdt, Mosquera, Nápoles. A review on the long short-term memory model. Artif Intell Rev 2020; 53: 5929–5955.

80. Hochreiter, Schmidhuber. Long short-term memory. Neural Comput 1997; 9(8): 1735–1780.

81. Zhou, Dong, Guan, et al. Impact load identification of nonlinear structures using deep recurrent neural network. Mech Syst Signal Process 2019; 133: 106292.

82. Kamariotis, Tatsis, Chatzi, et al. A metric for assessing and optimizing data-driven prognostic algorithms for predictive maintenance. Reliab Eng Syst Saf 2024; 242: 109723.

83. Cetiner, et al. Real-time regional seismic damage assessment framework based on long short-term memory neural network. Comput Aided Civ Infrastruct Eng 2021; 36(4): 504–521.

84. Cho, Merriënboer, Gulcehre, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.

85. Dey, Salem. Gate-variants of gated recurrent unit (GRU) neural networks. In: 2017 IEEE 60th international midwest symposium on circuits and systems (MWSCAS), 2017, pp. 1597–1600. IEEE.

86. Wang, et al. Advances in dynamic load identification based on data-driven techniques. Eng Appl Artif Intell 2023; 126: 106871.

87. Zhang, Ying, et al. A deep learning method for heavy vehicle load identification using structural dynamic response. Comput Struct 2024; 297: 107341.

88. Zheng, Lin, et al. A novel deep learning architecture and its application in dynamic load monitoring of the vehicle system. Measurement 2024; 229: 114336.

89. Moradi, Duran, Azam, et al. Novel physics-informed artificial neural network architectures for system and input identification of structural dynamics PDEs. Buildings 2023; 13(3): 650.

90. Guo, Jiang, Yang, et al. An intelligent impact load identification and localization method based on autonomic feature extraction and anomaly detection. Eng Struct 2023; 291: 116378.

91. Liu, Wang. A support vector regression (SVR)-based method for dynamic load identification using heterogeneous responses under interval uncertainties. Appl Soft Comput 2021; 110: 107599.

92. Zhang, Cui, et al. Wavloadnet: dynamic load identification for aeronautical structures based on convolution neural network and wavelet transform. Appl Sci 2024; 14(5): 1928.

93. Liu, Wang, et al. Artificial neural network (ANN)-Bayesian probability framework (BPF) based method of dynamic force reconstruction under multi-source uncertainties. Knowl Based Syst 2022; 237: 107796.

94. Zhang, Feng, et al. Random dynamic load identification with noise for aircraft via attention based 1D-CNN. Aerospace 2022; 10(1): 16.

95. Khosrowpour, Hematiyan. Distributed load identification for hyperelastic plates using gradient-based and machine learning methods. Acta Mech 2021; 235: 3271–3291.

96. Yang, Jiang, Chen, et al. Dynamic load identification based on deep convolution neural network. Mech Syst Signal Process 2023; 185: 109757.

97. Yang, Jiang, Chen, et al. A recurrent neural network-based method for dynamic load identification of beam structures. Materials 2021; 14(24): 7846.

98. Baek, Park, Jung. Impact load identification method based on artificial neural network for submerged floating tunnel under collision. Ocean Eng 2023; 286: 115641.

99. Wang, Huang, Xie, et al. A new regularization method for dynamic load identification. Sci Progress 2020; 103(3): 0036850420931283.

100. Holmes, Sartor, Reed, et al. Prediction of landing gear loads using machine learning techniques. Struct Health Monit 2016; 15(5): 568–582.

101. Zhao, Noori, Altabey, et al. Deep learning-based damage, load and support identification for a composite pipeline by extracting modal macro strains from dynamic excitations. Appl Sci 2018; 8(12): 2564.

102. Feng, Duan, Guo, et al. Deep learning based load and position identification of complex structure. In: 2021 IEEE 16th conference on industrial electronics and applications (ICIEA), 2021, pp. 1358–1363. IEEE.

103. Guo, Jiang, Yang, et al. Impact load identification and localization method on thin-walled cylinders using machine learning. Smart Mater Struct 2023; 32(6): 065018.

104. Gai, Zeng, et al. An optimization neural network model for bridge cable force identification. Eng Struct 2023; 286: 116056.

105. Ren-Mu, Germond. Comparison of dynamic load modeling using neural network and traditional method. In: [1993] Proceedings of the second international forum on applications of neural networks to power systems, Yokohama, Japan, 1992, pp. 253–258. IEEE.

106. Rosafalco, Manzoni, Mariani, et al. An autoencoder-based deep learning approach for load identification in structural dynamics. Sensors 2021; 21(12): 4207.

107. Yuen K-V, Liang P-F, Kuok S-C. Online estimation of noise parameters for Kalman filter. Struct Eng Mech 2013; 47(3): 361–381.

108. Impraimakis. A Kullback–Leibler divergence method for input–system–state identification. J Sound Vibr 2024; 569: 117965.

109. Kingma. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

110. Rao, Sun, Yang Liu. Physics-informed deep learning for computational elastodynamics without labeled data. J Eng Mech 2021; 147(8): 04021043.

111. Wang Z-W, X-F, Zhang W-M, et al. Deep learning-based reconstruction of missing long-term girder-end displacement data for suspension bridge health monitoring. Comput Struct 2023; 284: 107070.

112. Haddadi, Shakal, Stephens, et al. Center for engineering strong-motion data (CESMD). In: Proceedings of the 14th World conference on earthquake engineering, Beijing, October, pp. 12–17, 2008.

113. Maes, Chatzis, Lombaert. Observability of nonlinear systems with unmeasured inputs. Mech Syst Signal Process 2019; 130: 378–394.

114. Eftekhar Azam, Chatzi, Papadimitriou, et al. Experimental validation of the Kalman-type filters for online and real-time state and input estimation. J Vibr Control 2017; 23(15): 2494–2519.

115. Dyke, Bernal, Beck, et al. Experimental phase II of the structural health monitoring benchmark problem. In: Proceedings of the 16th ASCE engineering mechanics conference, 2003.

116. Huang, Cheng, YongZhi Lei. Structural damage identification based on substructure method and improved whale optimization algorithm. J Civ Struct Health Monit 2021; 11: 351–380.

117. Giraldo, Song, Dyke, et al. Modal identification through ambient vibration: comparative study. J Eng Mech 2009; 135(8): 759–770.

118. Jian-ye Ching, Beck. Bayesian analysis of the phase II IASC–ASCE structural health monitoring experimental benchmark data. J Eng Mech 2004; 130(10): 1233–1244.

119. Raissi, Perdikaris, Karniadakis. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 2019; 378: 686–707.

120. Zhu, Zabaras, Koutsourelakis P-S, et al. Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J Comput Phys 2019; 394: 56–81.

121. Olivier, Mohammadi, Smyth, et al. Bayesian neural networks with physics-aware regularization for probabilistic travel time modeling. Comput Aided Civ Infrastruct Eng 2023; 38: 2614–2631.

122. Lai, Liu, Jian, et al. Neural modal ordinary differential equations: integrating physics-based modeling with neural ordinary differential equations for modeling high-dimensional monitored structures. Data-Centric Eng 2022; 3: e34.

123. Lai, Mylonas, Nagarajaiah, et al. Structural identification with physics-informed neural ordinary differential equations. J Sound Vibr 2021; 508: 116196.

124. Zhang, Liu, Sun. Physics-guided convolutional neural network (PhyCNN) for data-driven seismic response modeling. Eng Struct 2020; 215: 110704.

125. Chen, Liu, Sun. Physics-informed learning of governing equations from scarce data. Nat Commun 2021; 12(1): 6136.

126. Eshkevari, Takáč, Pakzad, et al. Dynnet: physics-based neural architecture design for nonlinear structural response modeling and prediction. Eng Struct 2021; 229: 111582.

127. Chatzis, Chatzi, Smyth. On the observability and identifiability of nonlinear structural and mechanical systems. Struct Control Health Monit 2015; 22(3): 574–593.

128. Shi, Williams, Chatzis. A robust algorithm to test the observability of large linear systems with unknown parameters. Mech Syst Signal Process 2021; 157: 107633.

129. Eshkevari, Cronin, Eshkevari, et al. Input estimation of nonlinear systems using probabilistic neural network. Mech Syst Signal Process 2022; 166: 108368.

130. Sarego, Zaccariotto, Galvanetto. Artificial neural networks for impact force reconstruction on composite plates and relevant uncertainty propagation. IEEE Aerospace Electron Syst Mag 2018; 33(8): 38–47.

131. Wang, Liu, et al. A radial basis function artificial neural network (RBF ANN) based method for uncertain distributed force reconstruction considering signal noises and material dispersion. Comput Methods Appl Mech Eng 2020; 364: 112954.