
Abstract

The load spectrum is the basis for reliability and fatigue life analysis of tracked vehicle structures. To obtain the load spectrum, load cycles are first extracted from the measured or simulated load time history using the rainflow counting method, and the distribution of the load cycles is then modeled by a continuous distribution function. To find a general modeling method and an effective parameter estimation method for the load spectrum, we used a mixture of multivariate Gaussian functions to model the probability density function of a general load time history on the basis of the extracted load cycles. Additionally, we proposed a parameter estimation approach based on variational Bayesian inference, which can automatically infer the number of components from the observed data set. Numerical examples illustrate the effectiveness of the proposed modeling and parameter estimation methods. We compared the distributions of the load cycles reconstructed from the load spectrum models with those of the original load cycles, and we obtained quantitative estimates of the parameters for the load cases. The results showed that the Gaussian mixture can model complex distributions of rainflow load cycles for tracked vehicles given a suitable number of components and suitable component parameters, and that variational Bayesian inference is an effective parameter estimation method for mixture models with latent variables.

Keywords

Load spectrum, mixture Gaussians, expectation–maximization algorithm, variational Bayesian inference

Introduction

Tracked vehicles are widely used in construction and military machinery. In the research and development of tracked vehicles, fatigue life analysis of the key structures is necessary, and bench testing is an important way to check and improve vehicle quality and performance. Both fatigue life analysis and bench testing require accurate prediction of the loads applied to the vehicle structures. However, the load time history of an operating tracked vehicle is a stochastic process, and a fragment of measured load time history cannot reflect all the characteristics of the loads. The load spectrum, by contrast, reflects the distribution of the loads and represents the dynamic loads applied to the structures.¹

To determine the load spectrum of tracked vehicles, the load cycles must first be extracted from the load time history using a suitable counting method. A continuous distribution function is then used to model the distribution of the load cycles according to their characteristics; this continuous distribution function is what we call the load spectrum. The load spectrum cannot be determined directly from the distribution of load cycles extracted from a measured load time history. Because of measurement costs, the length of the measured load time history is usually limited, so the sample data set cannot reflect all the characteristics of the real loads on tracked vehicles. If the load spectrum were determined directly from the extracted load cycles, it would not be possible to evaluate the influence of load cycles whose amplitudes and means are higher than those realized in the original measured load time series. This problem is particularly acute when the measured load time series are relatively short. In addition, the distributions of load cycles differ between measured load time histories recorded under actual operating conditions, which makes the determination of the load spectrum much more complex.² Consequently, the distribution of the load cycles should be modeled by a continuous distribution function.

Because the rainflow counting method takes into account both the amplitude and the number of closed hysteresis loops, it is often used to extract load cycles from load time histories.² The load cycles extracted by this method are described in two-dimensional (2D) space by two parameters, the amplitude and the mean, denoted by Y = (Y_a, Y_m). To model the load spectrum, a suitable continuous probability density function (PDF) is usually fitted to the extracted load cycles.
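As an illustration of how load cycles Y = (Y_a, Y_m) can be extracted from a load time history, the following Python sketch performs a three-point rainflow count. It assumes the variant described in ASTM E1049, which the text does not specify, and it takes the amplitude as half of the cycle range. The function names `turning_points` and `rainflow_cycles` are hypothetical. A count of 0.5 marks a half cycle and a count of 1.0 a closed cycle.

```python
def turning_points(series):
    """Reduce a load time history to its sequence of peaks and valleys."""
    pts = []
    for x in series:
        if pts and x == pts[-1]:
            continue  # skip repeated values (plateaus)
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (x - pts[-1]) > 0:
            pts[-1] = x  # still moving in the same direction: extend the ramp
        else:
            pts.append(x)
    return pts


def rainflow_cycles(series):
    """Return a list of (amplitude, mean, count) tuples, count being 1.0 or 0.5."""
    cycles = []
    stack = []
    for point in turning_points(series):
        stack.append(point)
        while len(stack) >= 3:
            x_range = abs(stack[-1] - stack[-2])  # most recent range
            y_range = abs(stack[-2] - stack[-3])  # previous range
            if x_range < y_range:
                break
            mean = 0.5 * (stack[-2] + stack[-3])
            if len(stack) == 3:
                # The previous range contains the starting point: half cycle.
                cycles.append((0.5 * y_range, mean, 0.5))
                stack.pop(0)
            else:
                # Closed hysteresis loop: full cycle, remove its two points.
                cycles.append((0.5 * y_range, mean, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # Whatever remains on the stack is counted as half cycles.
    for a, b in zip(stack, stack[1:]):
        cycles.append((0.5 * abs(b - a), 0.5 * (a + b), 0.5))
    return cycles
```

Each returned tuple provides one sample y = (y_a, y_m) for the modeling step, with the count available as a weight if half cycles are to be treated differently from closed cycles.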

Nagode and Fajdiga² presented a method that uses a multi-modal Weibull distribution to model the load spectrum, and they developed a parameter estimation method based on histograms of loading frequencies to estimate the parameters of their model. Load spectra composed of more than one basic distribution can be modeled with their method. However, because of its limited solution accuracy, the method cannot model the PDFs of tracked vehicles’ loads very well. Nagode et al.³ developed a model that simulates random load histories with a multilayer perceptron. Its drawback is that it needs a much longer load time history, and it has not been widely used in practice. Furthermore, the multi-modal Weibull distribution^4,5 and the multilayer perceptron⁶ are not appropriate for modeling most load time histories.

The Gaussian distribution has many important analytical properties and is widely used to model the distribution of continuous variables. A single Gaussian, however, has significant limitations and cannot model complex real data sets. By taking a linear combination of several basic Gaussians, we obtain the Gaussian mixture model. With a sufficient number of Gaussians, and by adjusting their means and covariances as well as the mixing coefficients, almost any PDF can be approximated to arbitrary accuracy.⁷ Klemenc and Fajdiga⁸ applied mixtures of Gaussian functions to model differently shaped distributions of load cycles, which makes it possible to describe very different and non-symmetrical PDFs of random variables. However, parameter estimation is a major difficulty in this kind of mixture model because of the latent variables. One way to estimate the parameters is the maximum likelihood (ML) method.^9,10 Because of the summation inside the logarithm, maximizing the log-likelihood does not yield a closed-form analytical solution. A general approach to finding ML estimates for models with latent variables is the expectation–maximization (EM) algorithm.^11,12 But the EM algorithm cannot detect the proper number of components in the mixture model, so the number of components must be selected in advance. Additionally, the quality of the solution depends on the initial conditions and on the selected number of components, and over-fitting occurs if a large number of components is chosen.¹³ Markov chain Monte Carlo (MCMC)¹⁴ is another approach to estimating the unknown parameters of mixture Gaussian models, and it can use prior information about the parameters when evaluating the posterior distributions. But the convergence rate of MCMC is relatively slow and the computation is heavy.¹⁵ It is therefore necessary to develop a general load spectrum model and an efficient parameter estimation method for mixture Gaussian models.

We present a mixture of multivariate Gaussian functions to model the PDF of tracked vehicles’ load time history on the basis of the extracted load cycles or rainflow matrix, and we propose a parameter estimation approach based on variational Bayesian inference that can automatically infer the number of components in the mixture Gaussian model from the observed data set.

Model structure and parameters’ estimation

Load spectrum model structure

The mixture Gaussian distribution is a linear combination of basic Gaussians, which is used to describe the load cycles extracted from load time history of tracked vehicles with rainflow counting method. The PDF can be written as

\begin{matrix} p (Y) = \sum_{k = 1}^{K} π_{k} N (Y | μ_{k}, Σ_{k}) \\ = \sum_{k = 1}^{K} π_{k} \frac{1}{{(2 π)}^{D / 2}} \frac{1}{{| Σ_{k} |}^{1 / 2}} \exp \left\{ - \frac{1}{2} {(Y - μ_{k})}^{T} Σ_{k}^{- 1} (Y - μ_{k}) \right\} \end{matrix}

(1)

where Y = (Y_a, Y_m) is the 2D random variable that represents the load cycles; D is the dimensionality of Y (here D = 2); K is the number of components; π_k is the mixing coefficient of the kth Gaussian component, which satisfies 0 ≤ π_k ≤ 1 and $\sum_{k = 1}^{K} π_{k} = 1$; and N(Y|µ_k, Σ_k) is the density of the kth Gaussian component of the mixture model, with its own mean µ_k and covariance Σ_k.
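To make equation (1) concrete, the following Python sketch evaluates a bivariate Gaussian mixture density at a single load cycle. It is a minimal illustration for D = 2 only, using the explicit inverse of a 2 × 2 covariance matrix. The function names `gaussian_pdf_2d` and `mixture_pdf` are hypothetical.

```python
import math


def gaussian_pdf_2d(y, mu, cov):
    """Density N(y | mu, cov) of a bivariate Gaussian; cov is a 2x2 nested list."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dy0, dy1 = y[0] - mu[0], y[1] - mu[1]
    # Quadratic form (y - mu)^T cov^{-1} (y - mu) via the explicit 2x2 inverse.
    quad = (d * dy0 * dy0 - (b + c) * dy0 * dy1 + a * dy1 * dy1) / det
    # For D = 2 the normalizer (2 pi)^{D/2} |cov|^{1/2} is 2 pi sqrt(det).
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mixture_pdf(y, weights, means, covs):
    """Equation (1): weighted sum of the component densities."""
    return sum(w * gaussian_pdf_2d(y, m, s)
               for w, m, s in zip(weights, means, covs))
```

For example, `mixture_pdf([1.0, 2.0], [0.5, 0.5], [[1, 2], [-3, -5]], [[[2, 0], [0, 3]], [[2, 0], [0, 2]]])` evaluates a two-component mixture at the mean of its first component.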

Suppose we have a set of observed data $Y = \{y_{1}, \dots, y_{N}\}$, the measured or simulated load cycles, with y_n = (y_a, y_m). For each load cycle y_n, we introduce a corresponding latent vector z_n, a 1-of-K binary vector with elements z_nk for $k = 1, \dots, K$: exactly one element is equal to 1 and all the others are equal to 0. The values of z_nk therefore satisfy z_nk ∈ {0, 1} and $\sum_{k = 1}^{K} z_{nk} = 1$. The latent variables are collected in $Z = \{z_{1}, \dots, z_{N}\}$.

The load spectrum represents the statistical distribution of the rainflow load cycles over their amplitudes and means. It is obtained from the cumulative distribution function (CDF) of the load cycles, where each load cycle is described by a 2D variable y

H (Y) = N_{y} (1 - F (Y))

(2)

F (Y) = \int_{y_{\min}}^{Y} p (y) dy

(3)

where H(Y) is the load spectrum. N_y stands for the total number of load cycles. p(Y) is the PDF of the load cycles which has the form of equation (1). F(Y) is the cumulative distribution function of the load cycles.
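As a worked illustration of equations (2) and (3), the Python sketch below evaluates the load spectrum for the amplitude marginal only. It relies on two assumptions that are not stated in the text: the 1D marginal of a Gaussian mixture along one coordinate is used in place of the 2D integral, and the lower integration limit is taken as −∞ rather than y_min. The function names are hypothetical.

```python
import math


def normal_cdf(x, mean, var):
    """CDF of a 1D Gaussian with the given mean and variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def amplitude_spectrum(a, n_cycles, weights, means, covs):
    """H(a) = N_y * (1 - F(a)) for the amplitude marginal (index 0 of Y).

    The marginal of component k along the amplitude axis is a 1D Gaussian
    with mean means[k][0] and variance covs[k][0][0].
    """
    cdf = sum(w * normal_cdf(a, m[0], s[0][0])
              for w, m, s in zip(weights, means, covs))
    return n_cycles * (1.0 - cdf)
```

Under these assumptions, H(a) is the expected number of load cycles, out of N_y, whose amplitude exceeds a.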

In this mixture model of the load spectrum, π_k, µ_k, and Σ_k are the unknown parameters and $Z = \{z_{1}, \dots, z_{N}\}$ are the latent variables. Our goal is to estimate the unknown parameters and the latent variables from the observed data set. Because of the latent variables, and because the data are explained by multiple variables, it is infeasible to estimate the parameters directly, and we have to rely on approximate methods. Given the limitations of the existing methods, we propose a parameter estimation method based on variational Bayesian inference.

Variational Bayesian inference

The main task in parameter estimation for the mixture model of the tracked vehicles’ load spectrum is to evaluate the posterior distribution of the latent variables given the observed data set (measured or simulated load cycles). Because the posterior distribution has a highly complex form, evaluating it exactly is usually infeasible, and we must rely on approximation schemes. Bayes’ theorem lets us apply prior knowledge when calculating the posterior distribution from newly observed data, while variational inference provides a way to find approximate solutions during parameter estimation. In a Bayesian model, all the latent variables and parameters are treated as random variables and given prior distributions, and the corresponding posterior distribution is then calculated using Bayes’ theorem.

Suppose we have a probabilistic model with latent variables and parameters, so that the model has two kinds of unknowns. For convenience of representation, all the latent variables and parameters are absorbed into a single set $Z = \{z_{1}, \dots, z_{N}\}$. Assume that we have a set of N independent, identically distributed (i.i.d.) observations $Y = \{y_{1}, \dots, y_{N}\}$. The joint distribution of the probabilistic model is p(Y, Z). Our goal is to find an approximation for the posterior distribution p(Z|Y) as well as for the model evidence p(Y). We define q(Z) as a distribution over the latent variables and parameters. For any choice of q(Z), the log marginal probability can be decomposed as

\ln p (Y) = L (q) + KL (q \| p)

(4)

where

L (q) = \int q (Z) \ln \left\{ \frac{p (Y, Z)}{q (Z)} \right\} dZ

(5)

KL (q \| p) = - \int q (Z) \ln \left\{ \frac{p (Z | Y)}{q (Z)} \right\} dZ

(6)

Here, we use integrals in formulating the decomposition, assuming that Z is continuous. KL(q‖p) is the Kullback–Leibler (KL) divergence between q(Z) and the posterior distribution p(Z|Y). It satisfies KL(q‖p) ≥ 0, with equality if, and only if, q(Z) = p(Z|Y). It follows that L(q) ≤ ln p(Y); that is, L(q) is a lower bound on ln p(Y).
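The decomposition in equations (4)–(6) can be checked numerically on a toy problem. The Python sketch below assumes a discrete latent variable with three states, so the integrals become sums, and the joint probabilities are made-up numbers chosen only for illustration.

```python
import math

# Hypothetical joint p(Y = y_obs, Z = z) for a latent Z with three states.
joint = [0.10, 0.25, 0.05]
evidence = sum(joint)                      # p(Y)
posterior = [j / evidence for j in joint]  # p(Z | Y)


def lower_bound(q):
    """L(q) of equation (5), with the integral replaced by a sum."""
    return sum(qz * math.log(j / qz) for qz, j in zip(q, joint) if qz > 0)


def kl_divergence(q):
    """KL(q || p) of equation (6), with the integral replaced by a sum."""
    return -sum(qz * math.log(p / qz) for qz, p in zip(q, posterior) if qz > 0)


q = [0.5, 0.3, 0.2]  # an arbitrary trial distribution over Z
gap = math.log(evidence) - lower_bound(q)  # equals kl_divergence(q)
```

For any valid q, `lower_bound(q) + kl_divergence(q)` reproduces `math.log(evidence)`, and the gap closes only when q equals the posterior.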

Working with the true posterior distribution directly is intractable in this kind of model. Instead, we restrict q(Z) to a family of distributions that is tractable yet can provide a good approximation to the true posterior distribution, and we seek the member of this family that minimizes the KL divergence. To this end, Z is partitioned into disjoint groups denoted by $Z_{i} (i = 1, \dots, M)$. Then, following the mean field approximation framework, the factorized form of q(Z) can be written as

q (Z) = Π_{i = 1}^{M} q_{i} (Z_{i})

(7)

To make the algorithm tractable, the crucial assumption is the factorization between latent variables and parameters. No further factorization needs to be assumed, since the latent variables are i.i.d. given the parameters. Substituting equation (7) into equation (5), we obtain

\begin{matrix} L (q) = \int \underset{i}{Π} q_{i} (Z_{i}) \left\{ \ln p (Y, Z) - \sum_{i} \ln q_{i} (Z_{i}) \right\} dZ \\ = \int q_{j} (Z_{j}) \left\{ \int \ln p (Y, Z) \underset{i \neq j}{Π} q_{i} (Z_{i}) d Z_{i} \right\} d Z_{j} - \int q_{j} (Z_{j}) \ln q_{j} (Z_{j}) d Z_{j} + const \\ = \int q_{j} (Z_{j}) \ln \tilde{p} (Y, Z_{j}) d Z_{j} - \int q_{j} (Z_{j}) \ln q_{j} (Z_{j}) d Z_{j} + const \end{matrix}

(8)

where

\ln \tilde{p} (Y, Z_{j}) = E_{i \neq j} [\ln p (Y, Z)] + const

(9)

$E_{i \neq j} [\dots]$ denotes an expectation with respect to the distributions over all variables Z_i for i ≠ j

E_{i \neq j} [\ln p (Y, Z)] = \int \ln p (Y, Z) \underset{i \neq j}{Π} q_{i} (Z_{i}) d Z_{i}

(10)

The last line of equation (8) is, up to an additive constant, the negative KL divergence between q_j(Z_j) and $\tilde{p} (Y, Z_{j})$. Maximizing L(q) while keeping the factors {q_i}, i ≠ j, fixed is therefore equivalent to minimizing this KL divergence, and the minimum occurs when $q_{j} (Z_{j}) = \tilde{p} (Y, Z_{j})$. We thus obtain a general form for the optimal solution $q_{j}^{*} (Z_{j})$

\ln q_{j}^{*} (Z_{j}) = E_{i \neq j} [\ln p (Y, Z)] + const

(11)

This equation is the basis for applying variational methods. However, it does not give the optimum of the lower bound L(q) directly, because the optimal $q_{j}^{*} (Z_{j})$ depends on expectations computed with respect to the other factors {q_i (Z_i )} for i ≠ j. We therefore first initialize all the factors {q_i (Z_i )} appropriately, and then cycle through them, replacing each in turn with a revised estimate evaluated from equation (11) using the current estimates of all the other factors, until convergence. The general variational Bayesian inference method can be summarized as follows:

1. Establish an appropriate mixture probability model over the observed variables.

2. Choose an initial setting for all the parameters and latent variables by setting a prior probability over them.

3. Get a set of observed data.

4. Cycle over the factors, revising each of them given the current estimates of the others, until convergence.

Parameters’ estimation for the load spectrum model

In this section, we describe how to use variational Bayesian inference to estimate the unknown parameters of the mixture Gaussian model built in section “Load spectrum model structure” for the load spectrum of tracked vehicles. To formulate a variational treatment of the mixture model, we write the joint distribution of all the random variables, including latent variables and parameters, in the form

p (Y, Z, π, μ, Λ) = p (Y | Z, μ, Λ) p (Z | π) p (π) p (μ | Λ) p (Λ)

(12)

where $Y = \{y_{1}, \dots, y_{N}\}$ are the only observed variables, and p(Y|Z, µ, Λ) is the conditional distribution of the observed data set given the latent variables and the parameters. It can be written as

p (Y | Z, μ, Λ) = Π_{n = 1}^{N} Π_{k = 1}^{K} N {(y_{n} | μ_{k}, Λ_{k}^{- 1})}^{z_{nk}}

(13)

where µ_k and Λ_k are the mean vector and the precision matrix (the inverse of the covariance matrix, Λ_k = Σ_k^{−1}) of the kth Gaussian component, respectively. z_nk = 1 if the data point n belongs to the component k, and z_nk = 0 otherwise.

In equation (12), p(Z|π) is the conditional distribution of the latent variable Z given the mixing coefficient π, and it can be written as

p (Z | π) = Π_{n = 1}^{N} Π_{k = 1}^{K} π_{k}^{z_{nk}}

(14)

To use prior knowledge in the Bayesian treatment, we assign a prior distribution to each of the parameters in our mixture model. Because conjugate priors have the same functional form as the resulting posteriors and greatly simplify the Bayesian analysis, we choose conjugate priors over the parameters µ, Λ, and π.

Therefore, we choose a Dirichlet distribution over the mixing coefficient π

p (π) = Dir (π | α_{0}) = C (α_{0}) Π_{k = 1}^{K} π_{k}^{α_{0} - 1}

(15)

where C(α₀) is the normalization constant for the Dirichlet distribution. For convenience, we assume that all the coefficients have the same prior, that is, $α_{1} = \dots = α_{K} = α_{0}$.

Similarly, we choose an independent Gaussian–Wishart distribution as the conjugate prior distribution over mean and precision of each Gaussian component, which is given by

p (μ, Λ) = p (μ | Λ) p (Λ) = Π_{k = 1}^{K} N (μ_{k} | m_{0}, {(β_{0} Λ_{k})}^{- 1}) W (Λ_{k} | W_{0}, v_{0})

(16)

In order to obtain a tractable practical solution to the Bayesian mixture model, we write down a variational distribution which factorizes between the latent variables and parameters

q (Z, π, μ, Λ) = q (Z) q (π, μ, Λ)

(17)

Then, making use of the general optimal result (equation (11)), the log of the optimized factor q(Z) can be given as

\ln q^{*} (Z) = E_{π, μ, Λ} [\ln p (Y, Z, π, μ, Λ)] + const

(18)

Using decomposition (12) and absorbing the terms that are not dependent on the latent variable Z into the additive normalization constant, the log of the optimal result to latent variables can be written as

\ln q^{*} (Z) = E_{π} [\ln p (Z | π)] + E_{μ, Λ} [\ln p (Y | Z, μ, Λ)] + const

(19)

Substituting equations (13) and (14) into the right side of equation (19), and then absorbing the terms that do not depend on Z into the additive constant again, we can obtain

\ln q^{*} (Z) = \sum_{n = 1}^{N} \sum_{k = 1}^{K} z_{nk} \ln ρ_{nk} + const

(20)

where

\ln ρ_{nk} = E [\ln π_{k}] + \frac{1}{2} E [\ln | Λ_{k} |] - \frac{D}{2} \ln (2 π) - \frac{1}{2} E_{μ_{k}, Λ_{k}} [{(y_{n} - μ_{k})}^{T} Λ_{k} (y_{n} - μ_{k})]

(21)

D is the dimensionality of the data variable Y (in our case, D = 2). Taking the exponential of both sides of equation (20), we have

q^{*} (Z) \propto Π_{n = 1}^{N} Π_{k = 1}^{K} ρ_{nk}^{z_{nk}}

(22)

Because, for each value of n, the quantities z_nk are binary and sum to 1 over all values of k, normalizing this distribution gives

q^{*} (Z) = Π_{n = 1}^{N} Π_{k = 1}^{K} γ_{nk}^{z_{nk}}

(23)

where we defined

γ_{nk} = \frac{ρ_{nk}}{\sum_{j = 1}^{K} ρ_{nj}}

(24)

Note that because ρ_nk is given by the exponential of a real quantity, the quantities γ_nk are non-negative and sum to 1 over k, as required. For the discrete distribution q*(Z), we have the standard result

E [Z_{nk}] = γ_{nk}

(25)
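In practice, equation (21) yields ln ρ_nk rather than ρ_nk, and exponentiating large negative values directly can underflow. The Python sketch below shows one common way to carry out the normalization of equation (24) in log space. The max-shift trick and the function name `responsibilities` are assumptions of this illustration and are not described in the text.

```python
import math


def responsibilities(log_rho_row):
    """Normalize one row of ln(rho_nk), k = 1..K, into gamma_nk (equation (24))."""
    # Subtracting the row maximum leaves the ratios unchanged but keeps exp() in range.
    shift = max(log_rho_row)
    unnorm = [math.exp(v - shift) for v in log_rho_row]
    total = sum(unnorm)
    return [u / total for u in unnorm]
```

Applying the function to each data point n gives the N × K matrix of γ_nk used in the statistics that follow.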

The quantities γ_nk play the role of responsibilities. For convenience, we define three statistics of the observed data set evaluated with respect to the responsibilities, given by

N_{k} = \sum_{n = 1}^{N} γ_{nk}

(26)

{\bar{y}}_{k} = \frac{1}{N_{k}} \sum_{n = 1}^{N} γ_{nk} y_{n}

(27)

S_{k} = \frac{1}{N_{k}} \sum_{n = 1}^{N} γ_{nk} (y_{n} - {\bar{y}}_{k}) {(y_{n} - {\bar{y}}_{k})}^{T}

(28)
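The three statistics in equations (26)–(28) can be computed as in the following Python sketch. It assumes that every component has N_k > 0, since the divisions are otherwise undefined, and the function name `weighted_statistics` is hypothetical.

```python
def weighted_statistics(data, gamma):
    """Compute N_k, ybar_k and S_k of equations (26)-(28).

    data  : list of N points, each a list [y_a, y_m]
    gamma : N x K nested list of responsibilities gamma_nk
    Assumes N_k > 0 for every component k.
    """
    n_comp = len(gamma[0])
    dim = len(data[0])
    counts, means, scatters = [], [], []
    for k in range(n_comp):
        n_k = sum(g[k] for g in gamma)                              # equation (26)
        mean = [sum(g[k] * y[d] for g, y in zip(gamma, data)) / n_k
                for d in range(dim)]                                # equation (27)
        scatter = [[sum(g[k] * (y[i] - mean[i]) * (y[j] - mean[j])
                        for g, y in zip(gamma, data)) / n_k
                    for j in range(dim)]
                   for i in range(dim)]                             # equation (28)
        counts.append(n_k)
        means.append(mean)
        scatters.append(scatter)
    return counts, means, scatters
```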

Similarly for the factor q(π, µ, Λ), using the general result (equation (11)), we have

\ln q^{*} (π, μ, Λ) = \ln p (π) + \sum_{k = 1}^{K} \ln p (μ_{k}, Λ_{k}) + E_{z} [\ln p (Z | π)] + \sum_{k = 1}^{K} \sum_{n = 1}^{N} E [Z_{nk}] \ln N (y_{n} | μ_{k}, Λ_{k}^{- 1}) + const

(29)

The factor q(π, µ, Λ) has further factorization of

q (π, μ, Λ) = q (π) Π_{k = 1}^{K} q (μ_{k}, Λ_{k})

(30)

Identifying the terms on the right side of equation (29) that depend on π and using equation (25), we have

\begin{matrix} \ln q^{*} (π) = \ln p (π) + E_{z} [\ln p (Z | π)] + const \\ = \ln C (α_{0}) + (α_{0} - 1) \sum_{k = 1}^{K} \ln π_{k} \\ + \sum_{n = 1}^{N} \sum_{k = 1}^{K} E [Z_{nk}] \ln π_{k} + const \\ = (α_{0} - 1) \sum_{k = 1}^{K} \ln π_{k} + \sum_{k = 1}^{K} \sum_{n = 1}^{N} γ_{nk} \ln π_{k} + const \end{matrix}

(31)

Taking the exponential of both sides and recognizing q*(π) as a Dirichlet distribution, we can write

q^{*} (π) = Dir (π | α)

(32)

where α has component α_k given by

α_{k} = α_{0} + N_{k}

(33)

Finally, the variational posterior distribution q*(µ_k, Λ_k) can be written in the form q*(µ_k, Λ_k) = q*(µ_k|Λ_k) q*(Λ_k). The two factors can be found by inspecting equation (29) and reading off the terms that involve µ_k and Λ_k. The result is a Gaussian–Wishart distribution given by

q^{*} (μ_{k}, Λ_{k}) = N (μ_{k} | m_{k}, {(β_{k} Λ_{k})}^{- 1}) W (Λ_{k} | W_{k}, v_{k})

(34)

where we defined

β_{k} = β_{0} + N_{k}

(35)

m_{k} = \frac{1}{β_{k}} (β_{0} m_{0} + N_{k} {\bar{y}}_{k})

(36)

W_{k}^{- 1} = W_{0}^{- 1} + N_{k} S_{k} + \frac{β_{0} N_{k}}{β_{0} + N_{k}} ({\bar{y}}_{k} - m_{0}) {({\bar{y}}_{k} - m_{0})}^{T}

(37)

v_{k} = v_{0} + N_{k}

(38)
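The update equations (35)–(38) for a single component can be sketched in Python as follows. The sketch is restricted to D = 2 so that the matrix inverses can be written explicitly, and the helper names `inv2` and `update_gaussian_wishart` are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as a nested list."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def update_gaussian_wishart(n_k, ybar, s_k, beta0, m0, w0, v0):
    """Return (beta_k, m_k, W_k, v_k) for one component, D = 2."""
    beta_k = beta0 + n_k                                        # equation (35)
    m_k = [(beta0 * m0[i] + n_k * ybar[i]) / beta_k
           for i in range(2)]                                   # equation (36)
    diff = [ybar[i] - m0[i] for i in range(2)]
    coef = beta0 * n_k / (beta0 + n_k)
    w0_inv = inv2(w0)
    wk_inv = [[w0_inv[i][j] + n_k * s_k[i][j] + coef * diff[i] * diff[j]
               for j in range(2)]
              for i in range(2)]                                # equation (37)
    w_k = inv2(wk_inv)
    v_k = v0 + n_k                                              # equation (38)
    return beta_k, m_k, w_k, v_k
```

When N_k = 0, the function returns the prior hyperparameters unchanged, which is the behavior expected of a component that explains no data.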

The update equations (35)–(38) are similar to the M-step equations of the EM algorithm, and computing them requires the expectations E[Z_nk] = γ_nk, whose evaluation is similar to the E-step of the EM algorithm. γ_nk represents the responsibilities and is obtained by normalizing ρ_nk, which is given by equation (21). The computation of ρ_nk, in turn, involves expectations with respect to the variational distributions of the parameters (π_k, µ_k, and Λ_k). From the properties of the Gaussian, Wishart, and Dirichlet distributions, we obtain the following results

E_{μ_{k}, Λ_{k}} [{(y_{n} - μ_{k})}^{T} Λ_{k} (y_{n} - μ_{k})] = D β_{k}^{- 1} + v_{k} {(y_{n} - m_{k})}^{T} W_{k} (y_{n} - m_{k})

(39)

E [\ln | Λ_{k} |] = \sum_{i = 1}^{D} ψ (\frac{v_{k} + 1 - i}{2}) + D \ln 2 + \ln | W_{k} |

(40)

E [\ln π_{k}] = ψ (α_{k}) - ψ (\hat{α})

(41)

where ψ(·) is the digamma function and $\hat{α} = \sum_{k} α_{k}$. Substituting equations (39)–(41) into equation (21) gives ρ_nk directly.
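The expectations in equations (39)–(41) can be evaluated as in the Python sketch below. Because the Python standard library has no digamma function, the sketch includes a simple approximation based on the recurrence ψ(x) = ψ(x + 1) − 1/x and an asymptotic series; this approximation, the restriction to D = 2, and all function names are assumptions of the illustration.

```python
import math


def digamma(x):
    """Approximate digamma for x > 0 via recurrence and an asymptotic series."""
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def expected_log_pi(alpha):
    """Equation (41): E[ln pi_k] for every component."""
    total = digamma(sum(alpha))
    return [digamma(a) - total for a in alpha]


def expected_log_det_precision(w_k, v_k):
    """Equation (40) for D = 2; w_k is a 2x2 nested list."""
    dim = 2
    det = w_k[0][0] * w_k[1][1] - w_k[0][1] * w_k[1][0]
    return (sum(digamma(0.5 * (v_k + 1 - i)) for i in range(1, dim + 1))
            + dim * math.log(2.0) + math.log(det))


def expected_quadratic(y, m_k, w_k, v_k, beta_k):
    """Equation (39) for D = 2."""
    d0, d1 = y[0] - m_k[0], y[1] - m_k[1]
    quad = (w_k[0][0] * d0 * d0 + (w_k[0][1] + w_k[1][0]) * d0 * d1
            + w_k[1][1] * d1 * d1)
    return 2.0 / beta_k + v_k * quad
```

Combining the three results according to equation (21) gives ln ρ_nk for each data point and component.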

From this evaluation process, we can see that the optimal results for the responsibilities and for the parameters depend on each other, so the optimization of the variational posterior distributions has no closed-form solution. We can, however, obtain approximate solutions by cycling between two stages. In the first stage, we use the current distributions over the model parameters to evaluate the expectations in equations (39)–(41), and then evaluate the responsibilities with equations (21) and (24). In the second stage, we keep these responsibilities fixed and use them to re-compute the variational distributions over the parameters using equations (32) and (34). These two stages are analogous to the E-step and M-step of the EM algorithm. The iteration stops at convergence or when the maximum number of iterations is reached. Convergence is declared when the lower bound no longer changes significantly.

For the variational Gaussian mixture model, the lower bound is given by

\begin{matrix} L = \sum_{Z} \int \int \int q (Z, π, μ, Λ) \ln \left\{ \frac{p (Y, Z, π, μ, Λ)}{q (Z, π, μ, Λ)} \right\} d π d μ d Λ \\ = E [\ln p (Y, Z, π, μ, Λ)] - E [\ln q (Z, π, μ, Λ)] \\ = E [\ln p (Y | Z, μ, Λ)] + E [\ln p (Z | π)] + E [\ln p (π)] + E [\ln p (μ, Λ)] \\ - E [\ln q (Z)] - E [\ln q (π)] - E [\ln q (μ, Λ)] \end{matrix}

(42)

For the expressions of the individual terms in the lower bound, refer to Bishop.⁷ The value of the lower bound should never decrease from one iteration step to the next, so monitoring it helps both to test for convergence and to check the correctness of the mathematical derivation of the update equations.

In order to determine the suitable value for K, we treat the mixing coefficients π as parameters and make point estimations of their values by maximizing the lower bound with respect to π instead of maintaining a probability distribution over them as in the fully Bayesian approach. The re-estimation equation of π_k is given by

π_{k} = \frac{1}{N} \sum_{n = 1}^{N} γ_{nk}

(43)

This maximization is interleaved with the variational updates for the q distribution over the remaining parameters. Components that take essentially no responsibility for explaining the data points have their mixing coefficients driven to 0 during the optimization. We therefore start with a relatively large initial value of K, and the superfluous components are “killed off” from the mixture model by applying a suitable threshold to π_k.
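The re-estimation of equation (43) followed by the pruning step can be sketched in Python as below. The threshold is passed in as a parameter, and renormalizing the surviving weights so that they sum to 1 is an assumption of this sketch rather than a step stated in the text. The function name `reestimate_and_prune` is hypothetical.

```python
def reestimate_and_prune(gamma, threshold):
    """Re-estimate pi_k from responsibilities and drop negligible components.

    gamma     : N x K nested list of responsibilities gamma_nk
    threshold : components with pi_k below this value are removed
    Returns (indices of kept components, their renormalized weights).
    """
    n = len(gamma)
    k = len(gamma[0])
    pi = [sum(row[j] for row in gamma) / n for j in range(k)]  # equation (43)
    kept = [j for j in range(k) if pi[j] >= threshold]
    total = sum(pi[j] for j in kept)
    # Renormalization after pruning is an assumption of this sketch.
    return kept, [pi[j] / total for j in kept]
```

The number of kept components then serves as the estimate of K.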

Experiments and results

To verify the applicability and effectiveness of the proposed method for modeling the load spectrum of tracked vehicles, simulated and measured rainflow load cycles were used in the following four load cases. The load cases that used simulated rainflow load cycles are called EXAM_1S, EXAM_2S, and EXAM_3S, and the load case that used measured rainflow load cycles is called EXAM_4M. The load cycles of tracked vehicles in the simulated examples were generated from 2D Gaussian mixture distributions with different parameters.

To verify the generalization ability of the load spectrum model and the corresponding parameter estimation method, we set the number of components of the mixture model to 2, 3, and 5 in the three simulated load cycle cases, respectively. For convenience, the Gaussian components were given equal or nearly equal weight coefficients within each simulated case. The specific values of the means and covariance matrices are not essential to the test and were set arbitrarily.

In the load case EXAM_1S, the load cycles of the tracked vehicles were generated from a 2D Gaussian mixture distribution with two components. The true parameters are as follows: the number of components K = 2; the weight coefficients π = [0.5 0.5]; the means of the components µ₁ = [1 2], µ₂ = [−3 −5]; the covariance matrices Σ₁ = [2 0; 0 3], Σ₂ = [2 0; 0 2]; and the sample size N = 500.
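A data set of the EXAM_1S kind can be generated as in the following Python sketch. It relies on the covariance matrices of the simulated cases being diagonal, so the two coordinates can be drawn independently. The random seed and the function name `sample_mixture` are hypothetical, and the samples will not reproduce the data set used in the article.

```python
import random


def sample_mixture(n, weights, means, variances, seed=0):
    """Draw n points from a 2D Gaussian mixture with diagonal covariances.

    variances[k] holds the two diagonal entries of Sigma_k.
    Returns (samples, labels), where labels[i] is the generating component.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n):
        # Pick a component according to the mixing coefficients.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Diagonal covariance: draw each coordinate independently.
        point = [rng.gauss(means[k][d], variances[k][d] ** 0.5) for d in range(2)]
        samples.append(point)
        labels.append(k)
    return samples, labels


# EXAM_1S parameters as listed above: K = 2, N = 500.
samples, labels = sample_mixture(
    500, [0.5, 0.5], [[1, 2], [-3, -5]], [[2, 3], [2, 2]])
```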

Similarly, the load cycles of the tracked vehicles used in the load case EXAM_2S were generated from a 2D Gaussian mixture distribution, in this case with three components. The true parameters are as follows: the number of components K = 3; the weight coefficients π = [0.3 0.3 0.4]; the means of the components µ₁ = [−2 −3], µ₂ = [1 4], µ₃ = [8 10]; the covariance matrices Σ₁ = [2 0; 0 3], Σ₂ = [2 0; 0 2], Σ₃ = [2 0; 0 2]; and the sample size N = 500.

In the load case EXAM_3S, we used a 2D Gaussian mixture distribution with five components to generate the load cycles. The true parameters are as follows: the number of components K = 5; the weight coefficients π = [0.2 0.2 0.2 0.2 0.2]; the means of the components µ₁ = [0.5 0.5], µ₂ = [3 2.5], µ₃ = [4 5.5], µ₄ = [5.5 2], µ₅ = [7 5]; the covariance matrices Σ₁ = [0.5 0; 0 0.5], Σ₂ = [0.4 0; 0 0.3], Σ₃ = [0.3 0; 0 0.3], Σ₄ = [0.3 0; 0 0.3], Σ₅ = [0.5 0; 0 0.5]; and the sample size is the same as in the other examples, N = 500.

In the measured load case EXAM_4M, the load time series on the loading axis of a certain type of tank was measured under real operating conditions. The tested tank traveled 100 km on a gravel road in fourth gear. The load cycles were extracted from a fragment of the whole sample with the rainflow counting method.¹³ The load cycle amplitudes and means are given in volts. The number of measured rainflow load cycles is 2740.

The continuous distributions of rainflow load cycles in these load cases were modeled using a mixture of multivariate Gaussian functions. The rainflow load cycles were represented by a 2D variable Y = (Y_a, Y_m), where Y_a and Y_m are the amplitudes and the means of the load cycles, respectively. The latent variables and parameters were estimated with the variational Bayesian inference method developed in the previous sections. Before the iterations, the prior hyperparameters must be initialized. In this article, we chose the same prior values for all the load cases: α₀ = 0.001, β₀ = 1, m₀ = 0, ν₀ = 20, and W₀ = 200I. We also set a relatively large initial number of components for the mixture models, K = 10. The iteration was stopped when it reached 2000 steps or when the difference in the lower bound between adjacent iteration steps was less than 1 × 10⁻⁵. After convergence, components whose weight coefficient was less than 1% × 1/K were discarded, since they take essentially no responsibility for the data set. This gives an estimate of the number of components K. We then reset the initial value of K to this estimate and repeated the optimization until the estimated value of K equaled its initial value. In this way, we obtained the final estimated parameters of the mixture Gaussian models. Table 1 shows the optimal numbers of components of the mixture models for the load cases. The estimated parameters of the mixture Gaussian models for the load cases are shown in Table 2.

Table 1. Optimal results of component numbers.

K	EXAM_1S	EXAM_2S	EXAM_3S	EXAM_4M
Real value	2	3	5	Unknown
Initial value	10	10	10	10
Optimal value	2	3	5	5

Table 2. Estimated parameters of the mixture Gaussian models.

Load cases	Mixing coefficient (π_k )	Mean (µ_k )	Covariance (Σ_k)
EXAM_1S	π ₁ = 0.5496	$μ_{1} = [\begin{matrix} 1.0813 & 1.7037 \end{matrix}]$	$Σ_{1} = [\begin{matrix} 1.9813 & 0.0236 \\ 0.0236 & 3.3096 \end{matrix}]$
	π ₂ = 0.4504	$μ_{2} = [\begin{matrix} - 2.9435 & - 4.8944 \end{matrix}]$	$Σ_{2} = [\begin{matrix} 1.6470 & - 0.0847 \\ - 0.0847 & 1.8833 \end{matrix}]$
EXAM_2S	π ₁ = 0.3302	$μ_{1} = [\begin{matrix} - 1.9197 & - 2.8777 \end{matrix}]$	$Σ_{1} = [\begin{matrix} 2.1037 & 0.4351 \\ 0.4351 & 3.3999 \end{matrix}]$
	π ₂ = 0.2661	$μ_{2} = [\begin{matrix} 1.1019 & 4.1459 \end{matrix}]$	$Σ_{2} = [\begin{matrix} 1.9093 & 0.1279 \\ 0.1279 & 1.5171 \end{matrix}]$
	π ₃ = 0.4038	$μ_{3} = [\begin{matrix} 8.0464 & 9.9096 \end{matrix}]$	$Σ_{3} = [\begin{matrix} 1.7568 & - 0.0727 \\ - 0.0727 & 1.9463 \end{matrix}]$
EXAM_3S	π ₁ = 0.1832	$μ_{1} = [\begin{matrix} 0.4560 & 0.3376 \end{matrix}]$	$Σ_{1} = [\begin{matrix} 0.4206 & - 0.0852 \\ - 0.0852 & 0.4162 \end{matrix}]$
	π ₂ = 0.2002	$μ_{2} = [\begin{matrix} 2.9611 & 2.4486 \end{matrix}]$	$Σ_{2} = [\begin{matrix} 0.5113 & 0.1248 \\ 0.1248 & 0.4286 \end{matrix}]$
	π ₃ = 0.2064	$μ_{3} = [\begin{matrix} 3.9578 & 5.4808 \end{matrix}]$	$Σ_{3} = [\begin{matrix} 0.2929 & 0.0120 \\ 0.0120 & 0.2940 \end{matrix}]$
	π ₄ = 0.2156	$μ_{4} = [\begin{matrix} 5.5039 & 1.9420 \end{matrix}]$	$Σ_{4} = [\begin{matrix} 0.2810 & 0.0257 \\ 0.0257 & 0.2923 \end{matrix}]$
	π ₅ = 0.1947	$μ_{5} = [\begin{matrix} 6.9564 & 4.9589 \end{matrix}]$	$Σ_{5} = [\begin{matrix} 0.4779 & 0.0730 \\ 0.0730 & 0.6149 \end{matrix}]$
EXAM_4M	π ₁ = 0.0810	$μ_{1} = [\begin{matrix} 1.6370 & 0.4741 \end{matrix}]$	$Σ_{1} = [\begin{matrix} 0.3658 & 0.0880 \\ 0.0880 & 4.8315 \end{matrix}]$
	π ₂ = 0.3381	$μ_{2} = [\begin{matrix} 1.1266 & 2.4821 \end{matrix}]$	$Σ_{2} = [\begin{matrix} 0.1859 & - 0.0755 \\ - 0.0755 & 0.4023 \end{matrix}]$
	π ₃ = 0.1334	$μ_{3} = [\begin{matrix} 2.4629 & 0.9853 \end{matrix}]$	$Σ_{3} = [\begin{matrix} 1.1766 & - 0.0821 \\ - 0.0821 & 0.8648 \end{matrix}]$
	π ₄ = 0.2675	$μ_{4} = [\begin{matrix} 1.0470 & 0.9470 \end{matrix}]$	$Σ_{4} = [\begin{matrix} 0.2109 & - 0.1892 \\ - 0.1892 & 0.2004 \end{matrix}]$
	π ₅ = 0.1800	$μ_{5} = [\begin{matrix} 0.9494 & - 0.1116 \end{matrix}]$	$Σ_{5} = [\begin{matrix} 0.1675 & - 0.1323 \\ - 0.1323 & 0.6451 \end{matrix}]$

To assess the agreement between the original distribution and the modeled load spectrum for each load case, we compared the original load cycle distribution with the reconstructed load cycle distribution. It is difficult to assess the goodness of agreement between the modeled PDFs and the original distribution of the rainflow load cycles in 2D space.⁶ We therefore compared the modeled one-dimensional (1D) marginal PDFs with the corresponding original 1D marginal PDFs of the load cycle amplitudes and means, respectively.
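For a Gaussian mixture, the 1D marginal PDF along one coordinate is itself a 1D Gaussian mixture with the same mixing coefficients, with component means given by the matching entry of µ_k and variances given by the matching diagonal entry of Σ_k. The following Python sketch evaluates such a marginal for the EXAM_1S parameters copied from Table 2. It is only an illustration: the function names are introduced here, and it is an assumption that index 0 corresponds to Y_a and index 1 to Y_m.

```python
import math

# EXAM_1S parameters copied from Table 2.
# Assumption: index 0 is the first coordinate of Y (Y_a), index 1 the second (Y_m).
WEIGHTS = [0.5496, 0.4504]
MEANS = [(1.0813, 1.7037), (-2.9435, -4.8944)]
COVS = [((1.9813, 0.0236), (0.0236, 3.3096)),
        ((1.6470, -0.0847), (-0.0847, 1.8833))]


def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def marginal_pdf(x, dim, weights=WEIGHTS, means=MEANS, covs=COVS):
    """Marginal PDF of the 2D mixture along coordinate `dim` (0 or 1)."""
    return sum(w * normal_pdf(x, m[dim], c[dim][dim])
               for w, m, c in zip(weights, means, covs))


def integrate(f, lo, hi, n=4000):
    """Trapezoidal rule, used here only as a sanity check of the density."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n))
    return total * h


if __name__ == "__main__":
    for dim in (0, 1):
        area = integrate(lambda x: marginal_pdf(x, dim), -25.0, 25.0)
        print(f"coordinate {dim}: area under marginal PDF = {area:.6f}")
```

A curve computed this way can be plotted against a normalized histogram of the original load cycles along the same coordinate to produce a comparison of the kind described above.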

Table 1 shows that the estimated component numbers agree with the real component numbers of the mixture models for all three simulated load cases. For the measured load case, the real component number is unknown, and the optimal number obtained is 5. The distributions of the original load cycles and the reconstructed load cycles for the cases EXAM_1S, EXAM_2S, EXAM_3S, and EXAM_4M are presented in Figures 1–8. Figures 9–16 compare the marginal PDFs of the original load cycles with those of the reconstructed load cycles, on amplitudes and means, for the simulated and measured load cases.
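One plausible way to obtain reconstructed load cycles from a fitted model is to draw random samples from the mixture: pick a component according to the mixing coefficients, then draw from that component's 2D Gaussian using a Cholesky factor of its covariance. The text here does not state how the reconstruction was carried out, so the Python sketch below is an assumption for illustration only; the function names are hypothetical, and the parameters are the EXAM_1S values copied from Table 2.

```python
import math
import random

# EXAM_1S parameters copied from Table 2.
WEIGHTS = [0.5496, 0.4504]
MEANS = [(1.0813, 1.7037), (-2.9435, -4.8944)]
COVS = [((1.9813, 0.0236), (0.0236, 3.3096)),
        ((1.6470, -0.0847), (-0.0847, 1.8833))]


def cholesky_2x2(cov):
    """Lower-triangular factor L of a 2x2 covariance, with L L^T = cov."""
    (a, b), (_, d) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return l11, l21, l22


def sample_mixture(n, weights, means, covs, seed=0):
    """Draw n points from a 2D Gaussian mixture."""
    rng = random.Random(seed)
    factors = [cholesky_2x2(c) for c in covs]
    samples = []
    for _ in range(n):
        # Choose a component with probability equal to its mixing coefficient.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        l11, l21, l22 = factors[k]
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Transform standard normal draws to the component's mean and covariance.
        samples.append((means[k][0] + l11 * z1,
                        means[k][1] + l21 * z1 + l22 * z2))
    return samples


if __name__ == "__main__":
    cycles = sample_mixture(5000, WEIGHTS, MEANS, COVS, seed=1)
    mean_first = sum(c[0] for c in cycles) / len(cycles)
    print(f"sample mean of first coordinate: {mean_first:.3f}")
```

A scatter plot of such samples next to the original load cycles gives a visual comparison of the two distributions.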

Figure 1. Distribution of the original load cycles for the load case EXAM_1S.

Figure 2. Distribution of the reconstructed load cycles for the load case EXAM_1S.

Figure 3. Distribution of the original load cycles for the load case EXAM_2S.

Figure 4. Distribution of the reconstructed load cycles for the load case EXAM_2S.

Figure 5. Distribution of the original load cycles for the load case EXAM_3S.

Figure 6. Distribution of the reconstructed load cycles for the load case EXAM_3S.

Figure 7. Distribution of the original load cycles for the load case EXAM_4M.

Figure 8. Distribution of the reconstructed load cycles for the load case EXAM_4M.

Figure 9. Comparison of the modeled and original load cycles’ marginal PDF on amplitudes for case EXAM_1S.

Figure 10. Comparison of the modeled and original load cycles’ marginal PDF on means for case EXAM_1S.

Figure 11. Comparison of the modeled and original load cycles’ marginal PDF on amplitudes for case EXAM_2S.

Figure 12. Comparison of the modeled and original load cycles’ marginal PDF on means for case EXAM_2S.

Figure 13. Comparison of the modeled and original load cycles’ marginal PDF on amplitudes for case EXAM_3S.

Figure 14. Comparison of the modeled and original load cycles’ marginal PDF on means for case EXAM_3S.

Figure 15. Comparison of the modeled and original load cycles’ marginal PDF on amplitudes for case EXAM_4M.

Figure 16. Comparison of the modeled and original load cycles’ marginal PDF on means for case EXAM_4M.

The distribution comparison figures show that the original and the reconstructed load cycle distributions agree very well in shape and distribution density, and the boundaries of the scatters are clear. The marginal PDF figures show that the PDF curves of the reconstructed load cycles coincide closely with those of the original load cycles for all simulated and measured load cases, especially in the regions of low probability. However, in the marginal PDFs of the reconstructed load cycles, the convex regions tend to be more convex and the concave regions more concave than in those of the original load cycles. This is more evident in the load cases EXAM_2S and EXAM_3S. It may be caused by estimation errors in the means and amplitudes of the Gaussian components of the mixture model. Nevertheless, Table 2 shows that the parameter estimation errors of the mixture models are within acceptable ranges. Thus, the modeled load spectra agree very well with the distributions of the original load cycles.

Conclusion

In this article, we used a mixture of multivariate Gaussian functions to model the load spectrum of tracked vehicles based on rainflow load cycles and presented a parameter estimation method for the mixture distribution model based on variational Bayesian inference. The detailed mathematical procedures were derived in developing the parameter estimation method. Three simulated load cycle cases and one measured load cycle case were used to verify the effectiveness of the load spectrum model and the parameter estimation method. The component numbers were correctly inferred in all three simulated load cycle cases given different initial values. The marginal PDFs of the original load cycles and the reconstructed load cycles agree very well in all cases. The results show that the proposed parameter estimation method can automatically determine the component number of the mixture Gaussian model from the observed data set and that it does not depend on the initial conditions. It therefore has good practical value for engineering applications.

Footnotes

Declaration of conflicting interests

The authors declare that there is no conflict of interest.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

References

1. Luts, Ormerod. Mean field variational Bayesian inference for support vector machine classification. Comput Stat Data An 2014; 73: 163–176.

2. Nagode, Fajdiga. A general multi-modal probability density function suitable for the rainflow ranges of stationary random processes. Int J Fatigue 1998; 20(3): 211–223.

3. Nagode, Klemenc, Fajdiga. Parametric modelling and scatter prediction of rainflow matrices. Int J Fatigue 2001; 23(6): 525–532.

4. El-Adl. Predicting future lifetime based on random number of three parameters Weibull distribution. Math Comput Simulat 2011; 81(9): 1842–1854.

5. Mahmoudi, Sepahdar. Exponentiated Weibull–Poisson distribution: model, properties and applications. Math Comput Simulat 2013; 92: 76–97.

6. Klemenc, Fajdiga. A neural network approach to the simulation of load histories by considering the influence of a sequence of rainflow load cycles. Int J Fatigue 2002; 24(11): 1109–1125.

7. Bishop. Pattern recognition and machine learning. Berlin, Heidelberg: Springer, 2007.

8. Klemenc, Fajdiga. Improved modeling of the loading spectra using a mixture model approach. Int J Fatigue 2008; 30(7): 1298–1313.

9. Hongzhu, Chris. Estimation of the three parameter Weibull probability distribution. Math Comput Simulat 1995; 39: 173–185.

10. Ling, Pan. A maximum likelihood method for estimating P-S-N curves. Int J Fatigue 1997; 19(5): 415–419.

11. Chen, S-C, Lindsay. Improving mixture tree construction using better EM algorithms. Comput Stat Data An 2014; 74: 17–25.

12. HKT, Chan, Balakrishnan. Estimation of parameters from progressively censored data using EM algorithm. Comput Stat Data An 2002; 39(4): 371–386.

13. Wenting. Pattern recognition and machine learning. Beijing, China: National Defense Industry Press, 2008.

14. Yuan, Wei. An efficient Monte Carlo EM algorithm for Bayesian lasso. J Stat Comput Sim 2014; 84(10): 2166–2186.

15. Yang, J-l, H-w. An improved multi-target tracking algorithm based on CBMeMBer filter and variational Bayesian approximation. Signal Process 2013; 93(9): 2510–2515.

Load spectrum modeling for tracked vehicles based on variational Bayesian inference
