Abstract

This article proposes using the Kullback–Leibler divergence to characterise the uncertainty of the tracking error for general stochastic systems, without restricting the systems to a particular class of distributions. The general solution to the fully probabilistic design of the tracking error control problem is first stated. Further development then focuses on the derivation of a randomised controller for a class of linear stochastic Gaussian systems that are affected by multiplicative noise. The derived control solution explicitly accounts for the multiplicative noise of the controlled system. Because it directly characterises the main objective of system control, the proposed fully probabilistic design of the tracking error is a more natural approach than the conventional fully probabilistic design method. The efficiency of the proposed method is then demonstrated on a flexible beam example, where the vibration of the beam is shown to be effectively suppressed.

1. Introduction

In control systems, the tracking error between the system output and a predefined desired output is the most commonly used optimisation signal for tuning the parameters of the system controller (Gaudio et al., 2019; Gerasimov et al., 2019; Humaidi and Hameed, 2019; Wu and Du, 2019; Zhou et al., 2020; Zhou et al., 2017). When combined with adaptive control (Chen and Jiao, 2010; Narendra and Annaswamy, 2005; Tao, 2003), this approach has proven particularly useful for controlling systems that are affected by model uncertainty and random noise, that operate in changing environments, and that have unforeseen variations in their overall structure. Although these methods are adaptive and are therefore expected to deal with the underlying system uncertainty, many of them optimise the controller parameters by minimising the mean square tracking error. The minimisation of the mean square tracking error, also known as the tracking error variance, is however based on the assumption of certainty equivalence and therefore does not generally yield good performance. Thus, for more general stochastic systems and for systems with functional and model uncertainty, the variance of the tracking error cannot be used alone to represent the performance of the closed-loop system (Herzallah, 2007; Herzallah and Lowe, 2003; Herzallah and Lowe, 2004; Herzallah and Lowe, 2006; Yue and Wang, 2003; Zhang et al., 2016). As a result, the Kullback–Leibler divergence (Cliff et al., 2018; Kullback, 1959; Yu and Mehta, 2009) has recently been proposed in several control studies to characterise the uncertainty of stochastic system dynamics. This is because the Kullback–Leibler divergence measures the discrepancy between the distributions of the stochastic system and their desired distributions, rather than characterising them only by their means or variances.

An efficient control approach, known as fully probabilistic design (FPD), that uses the Kullback–Leibler divergence as a performance measure for designing randomised controllers has been proposed in Karny (1996) and Herzallah and Karny (2011). In this approach, the Kullback–Leibler divergence is used to measure the discrepancy between the joint pdf of the closed-loop description of the system dynamics and an ideal joint pdf. The main advantage of the FPD control approach is that it provides a closed-form solution for a general description of stochastic systems, without restricting them to particular distributions. However, although a closed-form solution can be obtained, the solution cannot in general be evaluated analytically because of the multivariate integration involved in the optimisation process. Besides, in its original form the FPD control method considers the design of a randomised controller that shapes the pdf of the system dynamics. The characterisation of the pdf of the system dynamics can nonetheless be difficult for many real-world systems that work under high levels of uncertainty and stochasticity. Furthermore, in many real engineering systems the controller objective is to make the output of the system dynamics follow a predefined desired output value, which emphasises the importance of the tracking error rather than the actual system output.

As such, this study follows an alternative approach in which the Kullback–Leibler divergence is defined as the distance between the joint pdf of the tracking error and the randomised controller of the controlled system and an ideal joint pdf. The randomised controller is therefore designed here to reshape the pdf of the tracking error of the controlled system rather than the pdf of its dynamics. Compared with the existing results on the topic and the conventional approach of FPD, this alternative approach has several advantages that have not been reported in the literature. First, the characterisation of the pdf of the tracking error of the controlled system is normally easier than that of the pdf of its dynamics. This is because when the stochastic dynamics of the controlled system are estimated accurately, the resulting tracking error of the system will be small and can most likely be characterised by a Gaussian pdf. This in turn simplifies the optimisation of the sought randomised controller. Second, the ideal distribution of the tracking error can be naturally specified by a zero mean distribution. In particular, a Gaussian distribution with zero mean and a prespecified covariance matrix that determines the allowed fluctuations of the tracking error around its zero mean value would be ideal. Furthermore, the FPD method in its original form considers only additive noise in the system dynamics. Our alternative solution considers stochastic systems with multiplicative noise, which represents conditions under which many real-world systems operate. Therefore, an additional contribution of the study is the consideration of the multiplicative noise of the stochastic system in the derivation of the randomised optimal control law. Moreover, the proposed probabilistic minimisation of the tracking error will be shown to be particularly useful for solving the vibration control problem associated with mechanical systems. The vibration control problem is particularly challenging and is relevant to many real-world control problems, including robotic manipulators, aerospace structures, and biomechanical systems (Flores and Barbieri, 2006; Pappalardo et al., 2016; Simone et al., 2018; Sohn et al., 2009; Song and Gu, 2007).

To reemphasise, this tracking error formulation and the extension of the FPD to stochastic systems with multiplicative noise have not been discussed previously in the literature. Their theoretical development and numerical demonstration are presented for the first time in this article.

2. Problem statement

In the original formulation of the FPD, the aim is to derive a randomised controller that shapes the joint probability density function of the stochastic system dynamics and the controller. This joint probability density function of the controller and the dynamics of the stochastic system represents the complete description of the closed-loop behaviour of the controlled system. However, in some control applications, the system is required to track a predefined desired trajectory. Thus, for these control applications, it would be more convenient to design the controller such that it reshapes the pdf of the tracking error as opposed to the original formulation of reshaping the pdf of the system dynamics. For the system to be able to track the desired signal, the controller should be designed such that the pdf of the tracking error is centred around zero with small variations. This objective of achieving a narrow distribution of the tracking error centred around zero error state implies that the system has tracked the desired trajectory and at the same time indicates that the uncertainty in the tracked trajectory is small. To be more specific, assume that the stochastic system can be described at each time instant k by the following conditional pdf

s (x_{k} | x_{k - 1}, u_{k - 1})

(1)

where

x_{k} \in ℜ^{n}

is the system state and

u_{k} \in ℜ^{m}

is the system input. Defining the reference state that the system will be required to track as

x_{r} \in ℜ^{n}

, then the system tracking error is given by

e_{k} = x_{k} - x_{r}

(2)

Because the considered system in this study is stochastic and subject to random forces and functional uncertainties, only the probability density function of the state values defined in equation (1) can be specified. The objective of this study, however, is to design a randomised controller that shapes the pdf of the tracking error so that the system state tracks a desired set point. This would require the pdf of the tracking error to be known, which may be an unrealistic assumption for many real-world control problems. Nevertheless, the density function of the tracking error can be obtained from the density function of the system dynamics using probability theory as follows

s_{e} (x_{k}, x_{r}) = s (e_{k} + x_{r} | e_{k - 1} + x_{r}, u_{k - 1})

(3)

In general, s(x_k|.) is not known in practice and thus needs to be estimated online using the observed data of the controlled system. The estimation process of this pdf is explained in Section 3.2.

Once the pdf of the tracking error is estimated, the randomised controller can be derived by redefining the Kullback–Leibler divergence such that the discrepancy between the joint pdf of the tracking error and the controller and a predefined ideal joint pdf is minimised

D (f ‖ f^{I}) = \int f (D) ln (\frac{f (D)}{f^{I} (D)}) d (D)

(4)

where

f (D) = \prod_{k = 1}^{H} s (e_{k} | e_{k - 1}, u_{k - 1}) c (u_{k - 1} | e_{k - 1})

f^{I} (D) = \prod_{k = 1}^{H} s^{I} (e_{k} | e_{k - 1}, u_{k - 1}) c^{I} (u_{k - 1} | e_{k - 1})

D = (e_{0}, \dots, e_{H}, u_{0}, \dots, u_{H - 1})

, and H is the control horizon. Following the same approach of the original FPD, the minimisation of the Kullback–Leibler divergence defined in equation (4) can be achieved by recursively solving the backward recurrence equation that is given in the following proposition.

Proposition 1

The optimal randomised controller c(u_k−1|e_k−1) can be obtained by recursively solving the following recurrence equation (Herzallah and Karny, 2011)

\begin{array}{l} - ln (γ (e_{k - 1})) = \min_{c (u_{k - 1} | e_{k - 1})} \int s (e_{k} | u_{k - 1}, e_{k - 1}) c (u_{k - 1} | e_{k - 1}) \\ \times [\underbrace{ln (\frac{s (e_{k} | u_{k - 1}, e_{k - 1}) c (u_{k - 1} | e_{k - 1})}{s^{I} (e_{k} | u_{k - 1}, e_{k - 1}) c^{I} (u_{k - 1} | e_{k - 1})})}_{\equiv \text{partial cost} \Rightarrow U (e_{k}, u_{k - 1})} \\ - \underbrace{ln (γ (e_{k}))}_{\text{optimal cost-to-go}}] d (e_{k}, u_{k - 1}) \end{array}

(5)

Proof

The derivation of the above result can be found in Herzallah and Karny (2011).

The optimal randomised controller that minimises the recurrence equation specified in equation (5) can then be shown to be given as specified in the following proposition.

Proposition 2

The pdf of the optimal randomised controller that minimises cost-to-go function (5) is given by

c (u_{k - 1} | e_{k - 1}) = \frac{c^{I} (u_{k - 1} | e_{k - 1}) \exp [- β_{1} (u_{k - 1}, e_{k - 1}) - β_{2} (u_{k - 1}, e_{k - 1})]}{γ (e_{k - 1})}

(6)

where

\begin{array}{l} γ (e_{k - 1}) = \int c^{I} (u_{k - 1} | e_{k - 1}) \exp [- β_{1} (u_{k - 1}, e_{k - 1}) \\ - β_{2} (u_{k - 1}, e_{k - 1})] d u_{k - 1} \\ β_{1} (u_{k - 1}, e_{k - 1}) = \int s (e_{k} | u_{k - 1}, e_{k - 1}) [ln \frac{s (e_{k} | u_{k - 1}, e_{k - 1})}{s^{I} (e_{k} | u_{k - 1}, e_{k - 1})}] d e_{k} \\ β_{2} (u_{k - 1}, e_{k - 1}) = - \int s (e_{k} | u_{k - 1}, e_{k - 1}) ln (γ (e_{k})) d e_{k} \end{array}

(7)

Proof

This proposition can be proven by adapting the proof of Proposition 2 in Karny and Guy (2006).

Note that the solution of the optimal randomised controller as specified in this proposition is not restricted by the pdf that characterises the error or the controller. It provides the general solution for the randomised controller without constraints on the required pdfs. However, the evaluation of the analytic solution for this randomised controller is not possible except for the special case of linear and Gaussian pdfs. Therefore, to facilitate the understanding and the analytical solution of the proposed tracking error–based FPD, the next section will demonstrate the solution to the probabilistic tracking control for a class of linear stochastic systems with multiplicative noise.
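As an illustration of Proposition 2, the following minimal sketch evaluates equations (6) and (7) numerically on a grid for a scalar system. It is not part of the original derivation. It assumes a final-step problem in which γ(e_k) = 1, so that β₂ = 0, and it uses hypothetical scalar values (`a`, `b`, `var_s`, `var_ideal`, `var_cI`) chosen only for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Scalar Gaussian density N(mean, var) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def kl_gauss_1d(mean_p, var_p, mean_q, var_q):
    """KL( N(mean_p, var_p) || N(mean_q, var_q) ) for scalar Gaussians."""
    return 0.5 * (var_p / var_q + (mean_p - mean_q) ** 2 / var_q
                  - 1.0 + np.log(var_q / var_p))

def one_step_controller(e_prev, u_grid, a=0.9, b=0.5,
                        var_s=0.04, var_ideal=0.05, var_cI=1.0):
    """Grid evaluation of equation (6) with gamma(e_k) = 1 (assumption)."""
    # beta_1(u, e_prev): KL between s(e_k | u, e_prev) and its ideal, eq. (7)
    beta1 = kl_gauss_1d(a * e_prev + b * u_grid, var_s, 0.0, var_ideal)
    # beta_2 = 0 here because ln(gamma(e_k)) = 0 at the final step
    unnormalised = gaussian_pdf(u_grid, 0.0, var_cI) * np.exp(-beta1)
    du = u_grid[1] - u_grid[0]
    gamma = np.sum(unnormalised) * du          # Riemann sum for gamma(e_prev)
    return unnormalised / gamma, gamma

u_grid = np.linspace(-10.0, 10.0, 4001)
c_opt, gamma = one_step_controller(e_prev=1.0, u_grid=u_grid)
mean_u = np.sum(u_grid * c_opt) * (u_grid[1] - u_grid[0])
```

For these illustrative values the grid-based mean of the controller agrees with the linear feedback form that the Gaussian case yields analytically, while the grid approach itself does not rely on Gaussianity of the ideal controller.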

3. Solution of the probabilistic tracking control for linear stochastic systems

The theory developed in the previous section will be applied here to derive the analytic solution of the probabilistic tracking control for linear stochastic systems with multiplicative noise. Stochastic systems with multiplicative noise arise naturally in networked control systems, where the multiplicative noise has been used to model packet loss (Wei et al., 2013) and time delay (Zhang et al., 2015) that occur during packet transmission in communication networks. This differs from parameter uncertainty (Lee et al., 2001; Liu et al., 2010; Xie et al., 1992), where the uncertainty of the parameters is usually grouped with the parameters of the state and can be considered stochastic or deterministic. The development of a robust control solution for these systems has been a long-standing and still unsolved problem.

3.1. Model description

Consider a stochastic linear discrete-time system with multiplicative Gaussian noise described by

x_{k} = \tilde{A} x_{k - 1} + \tilde{B} u_{k - 1} + \tilde{D} x_{k - 1} v_{k - 1}

(8)

where

x_{k} \in ℜ^{n}

is the system state, and

u_{k} \in ℜ^{m}

is the system input as defined before, $\tilde{A}$, $\tilde{B}$, and $\tilde{D}$ are system matrices with appropriate dimensions, and

v_{k} \in ℜ

is an independent Gaussian noise with zero mean and covariance Q.

It should be noted that in real-world situations the parameters of stochastic model (8) are not known in general and thus need to be estimated. Moreover, because the current value of the system state is affected by noise, its value cannot be completely specified by the previous control and previous state values. Therefore, the probabilistic description of stochastic model (8) needs to be estimated online using observed data from the stochastic system dynamics to describe the probabilistic evolution of the system state. The online estimation process of the stochastic system parameters and consequently the system state distribution will be discussed next.
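To make the role of the multiplicative term in model (8) concrete, the sketch below simulates a few steps of such a system. The function name `multiplicative_noise_step` and the two-state matrices are hypothetical and chosen only for illustration; they are not the system studied later in the article.

```python
import numpy as np

def multiplicative_noise_step(x_prev, u_prev, A, B, D, Q, rng):
    """One step of x_k = A x_{k-1} + B u_{k-1} + D x_{k-1} v_{k-1}, v ~ N(0, Q)."""
    v = rng.normal(loc=0.0, scale=np.sqrt(Q))   # scalar noise sample v_{k-1}
    return A @ x_prev + B @ u_prev + (D @ x_prev) * v

# Illustrative (hypothetical) two-state, one-input system
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
D = 0.05 * np.eye(2)
Q = 0.001
rng = np.random.default_rng(seed=0)

x = np.array([1.0, 0.0])
u = np.array([0.5])
trajectory = [x]
for _ in range(50):
    x = multiplicative_noise_step(x, u, A, B, D, Q, rng)
    trajectory.append(x)
trajectory = np.array(trajectory)               # shape (51, 2)
```

Because the noise enters through the product D x_{k-1} v_{k-1}, its effect scales with the state and vanishes when the state is zero, unlike additive noise.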

3.2. Estimation of the probabilistic description of the system tracking error

As discussed in the previous section, because of the stochastic nature of the system dynamics, only the probabilistic description of the system state can be specified. This can be obtained by estimating the system parameters of the stochastic equation of the system state given in equation (8). Therefore, given our prior knowledge of the linear dynamics of the system and the fact that it is driven by multiplicative noise, the required model of system (8) can be assumed to have the following form

x_{k} = A x_{k - 1} + B u_{k - 1} + D x_{k - 1} v_{k - 1}

(9)

where A, B, and D are the estimates of the matrices

\tilde{A}, \tilde{B}

, and

\tilde{D}

, respectively. Then these parameters can be estimated online by updating their values at each time instant, k, when a new measurement of the state value becomes available. In particular, rewrite equation (9) as follows

\begin{array}{l} x_{k} = [\begin{matrix} A & B & D \end{matrix}] [\begin{matrix} x_{k - 1} \\ u_{k - 1} \\ x_{k - 1} v_{k - 1} \end{matrix}] \\ = ϑ χ_{k - 1} \end{array}

(10)

where

ϑ = [\begin{matrix} A & B & D \end{matrix}]

and

χ_{k - 1} = {[\begin{matrix} x_{k - 1} & u_{k - 1} & x_{k - 1} v_{k - 1} \end{matrix}]}^{T}

. Here, χ_k−1 has dimension (2n + m) × 1 and ϑ has dimension n × (2n + m), where n and m are the dimensionality of the state vector and control input, respectively, as stated earlier. Then given a new observation of the system state x_k, the parameter vector ϑ can be estimated. Because the matrix χ_k−1 is not a square matrix, the estimation of the parameter vector can be achieved by first right-multiplying both sides of equation (10) by

χ_{k - 1}^{T}

and then solving for the parameter vector ϑ

x_{k} χ_{k - 1}^{†} = ϑ

(11)

where

χ_{k - 1}^{†}

is a 1 × (2n + m) matrix known as the pseudo inverse of χ_k−1 and is given by

χ_{k - 1}^{†} = χ_{k - 1}^{T} {(χ_{k - 1} χ_{k - 1}^{T})}^{- 1}

(12)

Remark 1

As can be seen from equation (12), the pseudo-inverse matrix has the property that $χ_{k - 1} χ_{k - 1}^{†} = I$ , where I is the identity matrix. However, note that $χ_{k - 1}^{†} χ_{k - 1} \neq I$ in general. If the matrix $χ_{k - 1} χ_{k - 1}^{T}$ is singular, then equation (11) does not have a unique solution. In this case, if the pseudo inverse is defined as

χ_{k - 1}^{†} = lim_{ι \to 0} χ_{k - 1}^{T} {(χ_{k - 1} χ_{k - 1}^{T} + ι I)}^{- 1}

(13)

then the limit can be shown to always exist, and the limiting value gives the optimal solution of equation (11).
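The regularised pseudo inverse of equation (13) can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the limit is approximated by a small fixed value `iota`, the regressor values in `chi` and the observation `x_k` are hypothetical (n = 2, m = 1), and `np.linalg.pinv` is used only as an independent reference.

```python
import numpy as np

def regularised_pinv(chi, iota=1e-6):
    """Approximate eq. (13): chi^T (chi chi^T + iota I)^{-1} for small iota."""
    chi = np.asarray(chi, dtype=float).reshape(-1, 1)     # (2n + m) x 1
    gram = chi @ chi.T + iota * np.eye(chi.shape[0])      # symmetric, invertible
    # gram is symmetric, so chi^T gram^{-1} = (gram^{-1} chi)^T
    return np.linalg.solve(gram, chi).T                   # 1 x (2n + m)

# Hypothetical regressor [x_{k-1}; u_{k-1}; x_{k-1} v_{k-1}] with n = 2, m = 1
chi = np.array([1.0, -2.0, 0.5, 0.03, -0.06])
x_k = np.array([0.7, -1.1])                               # new state observation

chi_dag = regularised_pinv(chi)
theta_hat = x_k.reshape(-1, 1) @ chi_dag                  # eq. (11), n x (2n + m)
reconstructed = theta_hat @ chi                           # should reproduce x_k
```

Solving the linear system instead of forming the explicit inverse is a numerical design choice: the regularised matrix is badly conditioned for small `iota`, and a direct solve is less sensitive to that than an explicit inversion.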

Following the estimation of these parameters, the conditional distribution of the system state can be shown to be Gaussian and is described by

s (x_{k} | x_{k - 1}, u_{k - 1}) \sim N (A x_{k - 1} + B u_{k - 1}, D x_{k - 1} Q x_{k - 1}^{T} D^{T})

(14)

where Ax_k−1 + Bu_k−1 is the mean of the state calculated using the estimated parameters A and B, and

D x_{k - 1} Q x_{k - 1}^{T} D^{T}

is the covariance of the state calculated using the estimated parameter D.

For the objective of deriving a randomised controller that will achieve a narrow tracking error distribution centred around zero, thus guaranteeing an accurate tracking of the system state to the desired value, the tracking error distribution needs to be specified. This can be obtained from the definition of the tracking error given in equation (2).

The dynamical description of the tracking error can then be obtained by substituting equation (9) into (2), which yields

\begin{array}{l} e_{k} = A x_{k - 1} + B u_{k - 1} + D x_{k - 1} v_{k - 1} - x_{r} \\ = A e_{k - 1} + B u_{k - 1} + D x_{k - 1} v_{k - 1} + F x_{r} \end{array}

(15)

where we have introduced the definition F = A − I. From equations (3), (14), and (15), the distribution of the tracking error is Gaussian with mean μ_k and covariance Σ_k specified as follows

s (e_{k} | u_{k - 1}, e_{k - 1}) \sim N (μ_{k}, Σ_{k})

(16)

where

μ_{k} = A e_{k - 1} + B u_{k - 1} + F x_{r}

(17)

\begin{array}{l} Σ_{k} = cov (e_{k} | u_{k - 1}, e_{k - 1}) \\ = E {(e_{k} - μ_{k}) {(e_{k} - μ_{k})}^{T}} \\ = D x_{k - 1} Q x_{k - 1}^{T} D^{T} \end{array}

(18)
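Equations (17) and (18) can be checked with a short numerical sketch. The function name `error_moments` and the matrices below are hypothetical; the sketch only verifies that the error-form mean of equation (17) agrees with the state-form mean of equation (14) shifted by the reference.

```python
import numpy as np

def error_moments(e_prev, u_prev, x_r, A, B, D, Q):
    """Mean (17) and covariance (18) of e_k given e_{k-1} and u_{k-1}."""
    F = A - np.eye(A.shape[0])                  # F = A - I
    mu = A @ e_prev + B @ u_prev + F @ x_r      # eq. (17)
    x_prev = e_prev + x_r                       # recover the state from the error
    Dx = D @ x_prev
    Sigma = Q * np.outer(Dx, Dx)                # eq. (18) with scalar variance Q
    return mu, Sigma

# Illustrative (hypothetical) values
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
D = 0.05 * np.eye(2)
Q = 0.001
x_r = np.array([1.0, 0.0])
e_prev = np.array([0.4, -0.2])
u_prev = np.array([0.3])

mu, Sigma = error_moments(e_prev, u_prev, x_r, A, B, D, Q)
state_form_mean = A @ (e_prev + x_r) + B @ u_prev - x_r
```

Note that with a scalar noise v_{k-1} the covariance in equation (18) is an outer product and therefore has rank at most one.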

3.3. Randomised control solution

In this section, the generalised fully probabilistic control solution of the tracking problem for the stochastic linear system with multiplicative noise defined in equation (8) is derived. As discussed in earlier sections, the pdf of the system tracking error is assumed to be unknown and is thus estimated online as explained in Section 3.2. The purpose of the designed controller here is to make the pdf of the tracking error $s (e_{k} | u_{k - 1}, e_{k - 1})$ follow a predefined ideal pdf $s^{I} (e_{k} | u_{k - 1}, e_{k - 1})$ and bring the tracking error to zero. Thus, the ideal distribution of the system tracking error described by equation (16) is specified as

s^{I} (e_{k} | u_{k - 1}, e_{k - 1}) \sim N (0, Σ_{2})

(19)

where Σ₂ specifies the allowed fluctuations of the tracking error around its zero mean value. In addition, the ideal distribution of the sought randomised controller,

c (u_{k - 1} | e_{k - 1})

, is taken to be Gaussian with the following form

c^{I} (u_{k - 1} | e_{k - 1}) \sim N (μ_{u}, Γ)

(20)

where Γ is the covariance matrix of the ideal distribution of the control input and μ_u is the mean of the ideal distribution of the control input. To achieve the objective that the optimised randomised controller brings the tracking error between the system state and its desired value to zero, the mean value of the ideal distribution of the controller, μ_u, is calculated from equation (15) to be

\begin{array}{l} lim_{k \to \infty} [E {e_{k}}] = lim_{k \to \infty} [E {A e_{k - 1}} + E {B u_{k - 1}} + E {D x_{k - 1} v_{k - 1}} \\ + F x_{r}] \\ 0 = 0 + lim_{k \to \infty} [E {B u_{k - 1}} + F x_{r}] \\ lim_{k \to \infty} [E {u_{k - 1}}] = μ_{u} = - {(B^{T} B)}^{- 1} B^{T} F x_{r} \end{array}

(21)
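The last line of equation (21) is a least-squares expression and can be evaluated as in the sketch below. The function name `ideal_control_mean` and the numerical values are hypothetical; they are chosen so that F x_r lies in the range of B, in which case the steady-state condition B μ_u + F x_r = 0 is met exactly.

```python
import numpy as np

def ideal_control_mean(A, B, x_r):
    """mu_u = -(B^T B)^{-1} B^T F x_r with F = A - I, as in eq. (21)."""
    F = A - np.eye(A.shape[0])
    return -np.linalg.solve(B.T @ B, B.T @ (F @ x_r))

# Illustrative (hypothetical) values
A = 0.5 * np.eye(2)
B = np.array([[1.0], [2.0]])
x_r = np.array([1.0, 2.0])

mu_u = ideal_control_mean(A, B, x_r)
F = A - np.eye(2)
residual = B @ mu_u + F @ x_r       # zero when F x_r is in the range of B
```

When F x_r is not in the range of B, the same expression returns the value of μ_u that minimises the norm of this residual rather than making it vanish.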

Given the pdf of the tracking error defined in equation (16) and the ideal pdfs of the tracking error and controller defined in equations (19) and (20), respectively, the performance index for the class of linear stochastic systems defined in equation (9) can then be shown to be given by the following theorem.

Theorem 1

Substituting the pdf description of the tracking error dynamics specified by equation (16), the ideal distribution of the tracking error dynamics given by equation (19), and the ideal distribution of the controller given by equation (20) into equations (6) and (7) gives the following performance index

- ln (γ (e_{k})) = 0.5 (e_{k}^{T} S_{k} e_{k} + P_{k} e_{k} + w_{k})

(22)

where

S_{k - 1} = - A^{T} M_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k}^{T} A + M_{2} + A^{T} M_{k} A

(23)

\begin{array}{l} P_{k - 1} = 2 x_{r}^{T} (M_{2} + F^{T} M_{k} A) + P_{k} A \\ + 2 μ_{u}^{T} Γ^{- 1} {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k}^{T} A \\ - 2 x_{r}^{T} F^{T} M_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k}^{T} A \\ - P_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k}^{T} A \end{array}

(24)

\begin{array}{l} w_{k - 1} = x_{r}^{T} (M_{2} + F^{T} M_{k} F - F^{T} M_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k}^{T} F) \\ \times x_{r} + w_{k} + P_{k} F x_{r} + μ_{u}^{T} (Γ^{- 1} - Γ^{- 1} {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} Γ^{- T}) \\ \times μ_{u} - 0.25 P_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} P_{k}^{T} \\ + 2 μ_{u}^{T} Γ^{- 1} {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k}^{T} F x_{r} \\ - P_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k}^{T} F x_{r} \\ + μ_{u}^{T} Γ^{- 1} {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} P_{k}^{T} - ln (π) \\ + ln ({(Γ^{- 1} + B^{T} M_{k} B)}^{- 1}) \end{array}

(25)

and where

\begin{array}{l} M_{k} = Σ_{2}^{- 1} + S_{k} \\ M_{2} = D^{T} S_{k} Q D \end{array}

(26)
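The backward recursion for the quadratic term, equations (23) and (26), can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the recursion is started from S_H = 0 (which corresponds to γ(e_H) = 1), Q is treated as a scalar variance so that M₂ = Q D^T S_k D, and all matrices are hypothetical.

```python
import numpy as np

def backward_S(A, B, D, Q, Sigma2, Gamma, horizon):
    """Iterate eq. (23) backwards from S_H = 0 (assumed terminal condition)."""
    n = A.shape[0]
    S = np.zeros((n, n))
    Sigma2_inv = np.linalg.inv(Sigma2)
    Gamma_inv = np.linalg.inv(Gamma)
    history = [S]
    for _ in range(horizon):
        M = Sigma2_inv + S                              # M_k, eq. (26)
        M2 = Q * (D.T @ S @ D)                          # M_2, eq. (26), scalar Q
        G = np.linalg.inv(Gamma_inv + B.T @ M @ B)
        S = A.T @ M @ A - A.T @ M @ B @ G @ B.T @ M.T @ A + M2   # eq. (23)
        S = 0.5 * (S + S.T)                             # remove round-off asymmetry
        history.append(S)
    return history

# Illustrative (hypothetical) two-state, one-input values
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
D = 0.01 * np.eye(2)
Q = 0.001
Sigma2 = 0.01 * np.eye(2)
Gamma = np.eye(1)

S_history = backward_S(A, B, D, Q, Sigma2, Gamma, horizon=20)
S_first = S_history[-1]                                 # S at the earliest step
```

The recursion has the structure of a discrete-time Riccati equation with the additional term M₂ contributed by the multiplicative noise.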

Proof

The claimed quadratic form of the optimal performance function specified in equation (22) can be verified subsequently by backward induction. The proof starts by evaluating γ in equation (7), repeated here

\begin{array}{l} γ (e_{k - 1}) = \int c^{I} (u_{k - 1} | e_{k - 1}) \exp [- β_{1} (u_{k - 1}, e_{k - 1}) \\ - β_{2} (u_{k - 1}, e_{k - 1})] d u_{k - 1} \end{array}

(27)

This evaluation requires the evaluation of β₁ and β₂. Starting with β₁

\begin{array}{l} β_{1} (u_{k - 1}, e_{k - 1}) = \int s (e_{k} | u_{k - 1}, e_{k - 1}) ln \frac{s (e_{k} | u_{k - 1}, e_{k - 1})}{s^{I} (e_{k} | u_{k - 1}, e_{k - 1})} d e_{k} \\ = \int N (μ_{k}, Σ_{k}) (- 0.5 ln (| Σ_{k} | {| Σ_{2} |}^{- 1}) - 0.5 {(e_{k} - μ_{k})}^{T} {(Σ_{k})}^{- 1} \\ \times (e_{k} - μ_{k}) + 0.5 e_{k}^{T} {(Σ_{2})}^{- 1} e_{k}) d e_{k} \end{array}

(28)

To solve (28), the following rule from Golub and Meurant (2009) is required

ln (det (A_{1})) = tr (ln (A_{1}))

(29)

where A₁ is a positive definite matrix. Because

(| Σ_{k} | {| Σ_{2} |}^{- 1})

is positive definite, the

ln (| Σ_{k} | {| Σ_{2} |}^{- 1})

term in equation (28) can be rewritten as

ln (| Σ_{k} | {| Σ_{2} |}^{- 1}) = ln (| Σ_{k} Σ_{2}^{- 1} |) = tr (ln (Σ_{k} Σ_{2}^{- 1}))

(30)

Assumption 1

Because the objective of the sought randomised optimal controller is to make the distribution of the tracking error of the system dynamics as close as possible to the specified ideal distribution, it is expected that at steady state the covariance of the tracking error dynamics will become close to the covariance of the specified ideal distribution. This means that

‖ Σ_{k} Σ_{2}^{- 1} - I ‖ < 1

(31)

Remark 2

Note that the covariance of the noise, Q, affecting the system is typically not large in real-world systems. This in turn means that $Σ_{k} = D x_{k - 1} Q x_{k - 1}^{T} D^{T}$ will not be too large either. Therefore, the above assumption is a reasonable one. This will be verified numerically in the numerical results section, Section 4.

Based on Assumption 1 and Lemma 2.6 from Hall (2015), equation (30) can be approximated as follows

tr (ln (Σ_{k} Σ_{2}^{- 1})) \approx tr (Σ_{k} Σ_{2}^{- 1} - I) = tr (Σ_{k} Σ_{2}^{- 1}) - n

(32)

where n is the dimension of e_k.

Using equation (32) in (28) and expanding the terms of equation (28), we get

\begin{array}{l} β_{1} (u_{k - 1}, e_{k - 1}) = \int N (μ_{k}, Σ_{k}) (- 0.5 tr (Σ_{k} Σ_{2}^{- 1}) + 0.5 n \\ + 0.5 e_{k}^{T} (Σ_{2}^{- 1} - Σ_{k}^{- 1}) e_{k} - 0.5 μ_{k}^{T} Σ_{k}^{- 1} μ_{k} + e_{k}^{T} Σ_{k}^{- 1} μ_{k}) d e_{k}, \\ = 0.5 μ_{k}^{T} Σ_{k}^{- 1} μ_{k} - 0.5 tr (Σ_{k} Σ_{2}^{- 1}) + 0.5 n + 0.5 \int N (μ_{k}, Σ_{k}) e_{k}^{T} \\ \times (Σ_{2}^{- 1} - Σ_{k}^{- 1}) e_{k} d e_{k} \end{array}

(33)

The last part in equation (33), $0.5 \int N (μ_{k}, Σ_{k}) e_{k}^{T} (Σ_{2}^{- 1} - Σ_{k}^{- 1}) e_{k} d e_{k}$ , can be evaluated as follows

\begin{array}{l} 0.5 \int N (μ_{k}, Σ_{k}) e_{k}^{T} (Σ_{2}^{- 1} - Σ_{k}^{- 1}) e_{k} d e_{k} \\ = 0.5 (tr [Σ_{2}^{- 1} Σ_{k}] - n) + 0.5 μ_{k}^{T} (Σ_{2}^{- 1} - Σ_{k}^{- 1}) μ_{k} \end{array}

(34)

Substituting equation (34) back into (33), we obtain

β_{1} (u_{k - 1}, e_{k - 1}) = 0.5 μ_{k}^{T} Σ_{2}^{- 1} μ_{k}

(35)
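The effect of the approximation in equation (32) on β₁ can be illustrated numerically by comparing the exact Kullback–Leibler divergence between two Gaussians with the simplified form of equation (35). The sketch assumes full-rank covariance matrices with Σ_k close to Σ₂, as in Assumption 1; the function names and values are hypothetical.

```python
import numpy as np

def kl_gaussians(mu, Sigma_k, Sigma_2):
    """Exact KL( N(mu, Sigma_k) || N(0, Sigma_2) ), i.e. beta_1 without eq. (32)."""
    n = mu.size
    Sigma_2_inv = np.linalg.inv(Sigma_2)
    _, logdet_k = np.linalg.slogdet(Sigma_k)
    _, logdet_2 = np.linalg.slogdet(Sigma_2)
    return 0.5 * (np.trace(Sigma_2_inv @ Sigma_k) - n
                  + mu @ Sigma_2_inv @ mu + logdet_2 - logdet_k)

def beta1_approx(mu, Sigma_2):
    """Simplified beta_1 of eq. (35): 0.5 mu^T Sigma_2^{-1} mu."""
    return 0.5 * mu @ np.linalg.solve(Sigma_2, mu)

# Illustrative values with Sigma_k Sigma_2^{-1} = diag(1.05, 0.97), close to I
Sigma_2 = 0.05 * np.eye(2)
Sigma_k = np.diag([0.0525, 0.0485])
mu = np.array([0.3, -0.1])

exact = kl_gaussians(mu, Sigma_k, Sigma_2)
approx = beta1_approx(mu, Sigma_2)
gap = abs(exact - approx)     # small when Assumption 1 holds
```

The gap between the two values is of second order in Σ_k Σ₂⁻¹ − I, so it grows as the error covariance moves away from the ideal covariance.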

Similarly, $β_{2} (u_{k - 1}, e_{k - 1})$ can be evaluated as follows

\begin{array}{l} β_{2} (u_{k - 1}, e_{k - 1}) = - \int s (e_{k} | u_{k - 1}, e_{k - 1}) ln (γ (e_{k})) d e_{k} \\ = \int N (μ_{k}, Σ_{k}) [0.5 (e_{k}^{T} S_{k} e_{k} + P_{k} e_{k} + w_{k})] d e_{k} \\ = 0.5 μ_{k}^{T} S_{k} μ_{k} + 0.5 w_{k} + 0.5 x_{r}^{T} M_{2} x_{r} \\ + 0.5 e_{k - 1}^{T} M_{2} e_{k - 1} + x_{r}^{T} M_{2} e_{k - 1} + 0.5 P_{k} μ_{k} \end{array}

(36)

where we have used

\begin{array}{l} tr (S_{k} Σ_{k}) = x_{k - 1}^{T} M_{2} x_{k - 1} \\ = e_{k - 1}^{T} M_{2} e_{k - 1} + x_{r}^{T} M_{2} x_{r} + 2 x_{r}^{T} M_{2} e_{k - 1} \end{array}

(37)

with M₂ = D^TS_kQD. Thereupon, substituting equations (35) and (36) in (27) and collecting the terms that multiply the control input, u_k−1, together yields

\begin{array}{l} γ (e_{k - 1}) = \int c^{I} (u_{k - 1} | e_{k - 1}) \exp [- β_{1} (u_{k - 1}, e_{k - 1}) \\ - β_{2} (u_{k - 1}, e_{k - 1})] d u_{k - 1} \\ = {(2 π | Γ |)}^{- (1 / 2)} \exp [- 0.5 {e_{k - 1}^{T} [A^{T} M_{k} A + M_{2}] e_{k - 1} \\ + w_{k} + 2 x_{r}^{T} (F^{T} M_{k} A + M_{2}) e_{k - 1} + μ_{u}^{T} Γ^{- 1} μ_{u} \\ + P_{k} A e_{k - 1} + P_{k} F x_{r} + x_{r}^{T} (F^{T} M_{k} F + M_{2}) x_{r}}] \\ \times \int \exp [- 0.5 {u_{k - 1}^{T} (B^{T} M_{k} B + Γ^{- 1}) u_{k - 1} + 2 u_{k - 1}^{T} (- Γ^{- 1} μ_{u} \\ + B^{T} M_{k} A e_{k - 1} + B^{T} M_{k} F x_{r} + 0.5 B^{T} P_{k}^{T})}] d u_{k - 1} \end{array}

(38)

The integral in equation (38) can be calculated by completing the square with respect to u_k−1. Consequently, $γ (e_{k - 1})$ can be shown to be given by

\begin{array}{l} γ (e_{k - 1}) = \exp [- 0.5 {- ln (2 π) - 0.5 ln | {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} | \\ + e_{k - 1}^{T} (- A^{T} M_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k} A \\ + M_{2} + A^{T} M_{k} A) e_{k - 1} + {2 x_{r}^{T} (M_{2} + F^{T} M_{k} A) \\ + P_{k} A + 2 μ_{u}^{T} Γ^{- 1} {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k} A \\ - 2 x_{r}^{T} F^{T} M_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k} A \\ - P_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k} A} e_{k - 1} + w_{k} \\ + P_{k} F x_{r} + x_{r}^{T} (M_{2} + F^{T} M_{k} F \\ - F^{T} M_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k} F) x_{r} \\ + μ_{u}^{T} (Γ^{- 1} - Γ^{- 1} {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} Γ^{- 1}) μ_{u} \\ - 0.25 P_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} P_{k}^{T} \\ + 2 μ_{u}^{T} Γ^{- 1} {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k} F x_{r} \\ - P_{k} B {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} M_{k} F x_{r} \\ + μ_{u}^{T} Γ^{- 1} {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} B^{T} P_{k}^{T}}] \end{array}

(39)

Note that according to Theorem 1, $- ln (γ (e_{k - 1})) = 0.5 (e_{k - 1}^{T} S_{k - 1} e_{k - 1} + P_{k - 1} e_{k - 1} + w_{k - 1})$ . Thus, equating quadratic terms, linear terms, and constant terms in equation (39) with S_k−1, P_k−1, and w_k−1, respectively, yields the definitions stated in equations (23)–(25). This completes the proof. □

Following the above verification of the quadratic performance index, the next step is to evaluate the parameters of the optimal controller distribution that will make the pdf of the tracking error follow the given ideal pdf. Based on equations (6) and (39), the randomised optimal controller that minimises the Kullback–Leibler divergence objective function is given by the following theorem.

Theorem 2

The optimal randomised controller that minimises the Kullback–Leibler divergence objective function subject to the probability density function of the tracking error defined in equation (16) and the ideal pdfs of the tracking error and controller defined in equations (19) and (20), respectively, is given by

c (u_{k - 1} | e_{k - 1}) \sim N (ν_{k - 1}, Γ_{k - 1})

(40)

where

ν_{k - 1} = - K_{k - 1} e_{k - 1} - L_{k - 1}

(41)

K_{k - 1} = Γ_{k - 1} B^{T} M_{k} A

(42)

L_{k - 1} = Γ_{k - 1} (B^{T} M_{k} F x_{r} + 0.5 B^{T} P_{k}^{T} - Γ^{- 1} μ_{u})

(43)

Γ_{k - 1} = {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1}

(44)

where ν_k−1 and Γ_k−1 are the mean and covariance of the optimal randomised controller, respectively, K_k−1 is the controller feedback gain, and L_k−1 is a linear shift which manifests from the considered tracking control problem.
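The controller parameters in equations (41)–(44) can be computed as in the following sketch, which also draws one control sample from the resulting Gaussian. The function name `controller_params` is hypothetical, P_k is stored as a one-dimensional array (so P_k^T needs no explicit transpose), and the scalar values are illustrative only.

```python
import numpy as np

def controller_params(e_prev, x_r, A, B, M, P, Gamma, mu_u):
    """Mean and covariance of the randomised controller, eqs. (41)-(44)."""
    F = A - np.eye(A.shape[0])
    Gamma_inv = np.linalg.inv(Gamma)
    Gamma_k = np.linalg.inv(Gamma_inv + B.T @ M @ B)                 # eq. (44)
    K = Gamma_k @ B.T @ M @ A                                        # eq. (42)
    L = Gamma_k @ (B.T @ M @ F @ x_r + 0.5 * B.T @ P
                   - Gamma_inv @ mu_u)                               # eq. (43)
    nu = -K @ e_prev - L                                             # eq. (41)
    return nu, Gamma_k, K, L

# Illustrative (hypothetical) scalar problem
A = np.array([[0.9]])
B = np.array([[0.5]])
M = np.array([[20.0]])          # M_k = Sigma_2^{-1} + S_k
P = np.array([0.0])
Gamma = np.array([[1.0]])
mu_u = np.array([0.0])
x_r = np.array([0.0])
e_prev = np.array([1.0])

nu, Gamma_k, K, L = controller_params(e_prev, x_r, A, B, M, P, Gamma, mu_u)
rng = np.random.default_rng(seed=0)
u_sample = rng.multivariate_normal(nu, Gamma_k)   # one draw from N(nu, Gamma_k)
```

Since the controller is randomised, the applied control is a sample from this distribution; using the mean ν_{k−1} directly would give the corresponding deterministic feedback law.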

Proof

Substituting equations (20), (35), (36), and (39) in (6) yields

\begin{array}{l} c^{*} (u_{k - 1} | e_{k - 1}) = {(2 π Γ)}^{- 0.5} \exp {- 0.5 μ_{k}^{T} M_{k} μ_{k} \\ - 0.5 e_{k - 1}^{T} M_{2} e_{k - 1} - 0.5 x_{r}^{T} M_{2} x_{r} - x_{r}^{T} M_{2} e_{k - 1} - 0.5 w_{k} \\ - 0.5 P_{k} μ_{k} - 0.5 {(u_{k - 1} - μ_{u})}^{T} Γ^{- 1} (u_{k - 1} - μ_{u}) \\ + 0.5 e_{k - 1}^{T} S_{k - 1} e_{k - 1} + 0.5 P_{k - 1} e_{k - 1} + 0.5 w_{k - 1}} \end{array}

(45)

Evaluating the terms independent of u_k−1 in equation (45), and completing the square with respect to the control input, u_k−1, equation (45) can be further expressed as

\begin{array}{l} c^{*} (u_{k - 1} | e_{k - 1}) = {(2 π Γ)}^{- 0.5} \exp [- 0.5 {u_{k - 1}^{T} (B^{T} M_{k} B \\ + Γ^{- 1}) u_{k - 1} + 2 u_{k - 1}^{T} [B^{T} M_{k} A e_{k - 1} + B^{T} M_{k} F x_{r} + 0.5 B^{T} P_{k}^{T} \\ - Γ^{- 1} μ_{u}] + {(- Γ^{- 1} μ_{u} + B^{T} M_{k} F x_{r} + B^{T} M_{k} A e_{k - 1} + 0.5 B^{T} P_{k}^{T})}^{T} \\ \times {(B^{T} M_{k} B + Γ^{- 1})}^{- 1} (- Γ^{- 1} μ_{u} + B^{T} M_{k} F x_{r} + B^{T} M_{k} A e_{k - 1} \\ + 0.5 B^{T} P_{k}^{T})}] \\ = {(2 π Γ)}^{- 0.5} \exp {- 0.5 [u_{k - 1} - {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} (- B^{T} M_{k} A e_{k - 1} - B^{T} M_{k} F x_{r} \\ - 0.5 B^{T} P_{k}^{T} + Γ^{- 1} μ_{u})]^{T} (Γ^{- 1} + B^{T} M_{k} B) \\ \times [u_{k - 1} - {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} (- B^{T} M_{k} A e_{k - 1} - B^{T} M_{k} F x_{r} - 0.5 B^{T} P_{k}^{T} + Γ^{- 1} μ_{u})]} \\ \times {(Γ^{- 1} + B^{T} M_{k} B)}^{- 1} \end{array}

(46)

It can be seen that the mean and covariance of the distribution given in equation (46) are the mean and covariance of the optimised randomised controller as stated in equations (41) and (44), respectively. This completes the proof. □

4. Numerical results

This section will demonstrate the effectiveness of the proposed probabilistic minimisation of the tracking error specified by Theorems 1 and 2 in driving the output of the system dynamics to a predefined desired output value. In particular, the theory developed in Section 3 is applied here to a flexible beam system (Flores and Barbieri, 2006) described by the following equation

\dot{x} = A x + B u

(47)

where

A = [\begin{matrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 38.1425 & 0 & 239.0350 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & - 47.0569 & 0 & - 271.9385 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & - 6.9241 & 0 & - 187.2933 & 0 \end{matrix}]

B = {[\begin{matrix} 0 & 9.4393 & 0 & - 10.7386 & 0 & - 1.7135 \end{matrix}]}^{T}

Also, $x = {[\begin{matrix} θ & \dot{θ} & q_{1} & {\dot{q}}_{1} & q_{2} & {\dot{q}}_{2} \end{matrix}]}^{T}$ is the beam system state vector, θ is the angle between the hub frame and a global (stationary) reference frame, and q_i, i = 1, 2, represents the ith flexible mode.

Because the proposed framework is developed for discrete-time systems, equation (47) is discretised using the forward difference method with a sampling time of h = 0.06. In addition, to demonstrate all aspects of the proposed method, a multiplicative noise term is added to the originally deterministic system equation after it has been discretised. This yields the following discrete-time description

x_{k} = (A h + I_{n \times n}) x_{k - 1} + B h u_{k - 1} + D x_{k - 1} v_{k - 1}

(48)

where v_k is Gaussian noise with zero mean and variance 0.001, $v_{k} \sim N (0, 0.001)$, I is the identity matrix, and where

D = 10^{- 3} [\begin{matrix} 9.0 & 6.3 & - 0.2 & - 5.4 & - 21.2 & - 5.7 \\ - 4.3 & - 0.2 & 8.7 & 3.5 & - 1.5 & 13.1 \\ 9.0 & 19.1 & - 13.7 & - 4.4 & 5.2 & 2.5 \\ - 22.6 & 21.7 & - 10.7 & - 6.4 & 3.9 & - 8.1 \\ - 5.9 & - 0.7 & 11.3 & - 2.0 & 1.9 & - 3.4 \\ 1.7 & 8.4 & 10.3 & - 6.4 & - 3.8 & 7.0 \end{matrix}]

is randomly generated.
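
The discretisation and noise model of equations (47) and (48) can be sketched numerically as follows. This is a minimal NumPy illustration under stated assumptions, not the simulation code used for the reported results: the function names `discretise_forward_difference` and `simulate_step` are hypothetical, and a single scalar noise sample is drawn per time step.

```python
import numpy as np

# Continuous-time flexible beam matrices from equation (47).
A = np.array([
    [0, 1, 0, 0, 0, 0],
    [0, 0, 38.1425, 0, 239.0350, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, -47.0569, 0, -271.9385, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, -6.9241, 0, -187.2933, 0],
], dtype=float)
B = np.array([0, 9.4393, 0, -10.7386, 0, -1.7135], dtype=float).reshape(6, 1)

# Multiplicative-noise matrix D as listed after equation (48).
D = 1e-3 * np.array([
    [9.0, 6.3, -0.2, -5.4, -21.2, -5.7],
    [-4.3, -0.2, 8.7, 3.5, -1.5, 13.1],
    [9.0, 19.1, -13.7, -4.4, 5.2, 2.5],
    [-22.6, 21.7, -10.7, -6.4, 3.9, -8.1],
    [-5.9, -0.7, 11.3, -2.0, 1.9, -3.4],
    [1.7, 8.4, 10.3, -6.4, -3.8, 7.0],
])

h = 0.06  # sampling time


def discretise_forward_difference(A, B, h):
    """Forward difference discretisation: A_d = I + h A, B_d = h B."""
    n = A.shape[0]
    return np.eye(n) + h * A, h * B


def simulate_step(x_prev, u_prev, A_d, B_d, D, rng, noise_var=0.001):
    """One step of equation (48) with a scalar Gaussian noise sample.

    x_prev has shape (6,) and u_prev has shape (1,).
    """
    # rng.normal takes a standard deviation, hence the square root.
    v = rng.normal(0.0, np.sqrt(noise_var))
    return A_d @ x_prev + B_d @ u_prev + (D @ x_prev) * v


A_d, B_d = discretise_forward_difference(A, B, h)
```

Calling `simulate_step` repeatedly with a control sequence produces a state trajectory; the control law itself (equation (41)) is not reproduced in this sketch.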

The objective of the sought randomised controller is then to quench the vibration in the beam and to stabilise the angle between the hub frame and a global (stationary) reference frame, θ, at the value of 1. Therefore, the reference value that the system state is required to track is taken to be x_r = [1,0,0,0,0,0]^T. In addition, the system state is assumed to start from the initial state x₀ = [22,0.3,1,0.4,0.5,2]^T.

As discussed in Section 3, the parameters of the flexible beam system in equation (48) are assumed to be unknown and are therefore estimated online at each time step. The mean and covariance of the conditional distribution of the beam system dynamics are then specified using the estimated parameters. These estimates are used in equations (23) and (24) to evaluate the Riccati equation for S_k, as well as P_k, both of which are then used in equation (41) to calculate the mean of the control input forwarded to the beam. In the simulation experiment, the covariance, Σ₂, of the ideal distribution of the tracking error is taken to be 0.01 × I_n×n, and the covariance, Γ, of the ideal distribution of the control inputs is taken to be 1.

The simulation results are shown in Figures 1 and 2. Figure 1 shows the states of the flexible beam together with their corresponding reference signals. All the beam states settle at their reference values, as confirmed by the magnified insets in Figure 1, which show the steady-state values of the beam states. The tracking errors are presented in Figure 2(a), from which it can also be seen that all the state tracking errors go to zero. In the transient period, however, both figures show large deviations of the beam states from their reference values and correspondingly large tracking errors. This is expected, because the parameters of the beam equation, which are estimated online, have not yet converged to their true values in this period. Once the parameters converge, the beam states track their reference values well. The control input calculated from equation (41) is shown in Figure 2(b); it is stable and yields the required results. Finally, the feedback gain calculated from equation (42) is shown in Figure 3, which shows that all the feedback gains have converged to steady-state values. Overall, the numerical results demonstrate the efficacy of the proposed probabilistic tracking control method and show that the mean of the tracking error can be driven to zero.
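
The convergence statements above can be checked on any simulated trajectory with a few lines of NumPy. The sketch below is an illustration only: it assumes the tracking error is defined as e_k = x_k - x_r, the helper names `tracking_errors` and `steady_state_error` are hypothetical, and the trajectory is synthetic (a geometric decay chosen for the example), not the data behind Figures 1 and 2.

```python
import numpy as np


def tracking_errors(states, x_r):
    """Return e_k = x_k - x_r for every row of `states` (shape: steps x n)."""
    return np.asarray(states, dtype=float) - np.asarray(x_r, dtype=float)


def steady_state_error(states, x_r, tail=50):
    """Mean absolute tracking error per state over the last `tail` samples."""
    errors = tracking_errors(states, x_r)
    return np.mean(np.abs(errors[-tail:]), axis=0)


# Reference and initial state as given in the text.
x_r = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
x_0 = np.array([22.0, 0.3, 1.0, 0.4, 0.5, 2.0])

# Synthetic stand-in for a closed-loop trajectory: the initial error decays
# geometrically. This is NOT the paper's simulation output.
k = np.arange(200).reshape(-1, 1)
states = x_r + (x_0 - x_r) * 0.9 ** k

print(steady_state_error(states, x_r))
```

Restricting the average to the tail of the run separates the steady-state behaviour from the transient period, in which large errors are expected while the parameter estimates are still converging.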

Figure 1.

State of the simulated flexible beam: (a) state x₁ (dotted line) and reference state $x_{r_{1}}$ (solid line), (b) state x₂ (dotted line) and reference state $x_{r_{2}}$ (solid line), (c) state x₃ (dotted line) and reference state $x_{r_{3}}$ (solid line), (d) state x₄ (dotted line) and reference state $x_{r_{4}}$ (solid line), (e) state x₅ (dotted line) and reference state $x_{r_{5}}$ (solid line), and (f) state x₆ (dotted line) and reference state $x_{r_{6}}$ (solid line). Small magnified figures show the steady state values of the beam states.

Figure 2.

Tracking error of the states and system input of the simulated flexible beam: (a) tracking error of the flexible beam states and (b) the control input to the flexible beam as calculated from equation (41).

Figure 3.

Feedback gain of the controller as calculated from equation (42).

5. Conclusion

This article presented a new framework for the design of randomised controllers for complex stochastic and uncertain systems, based on the minimisation of the Kullback–Leibler divergence of the tracking error of the controlled system. The proposed framework takes the multiplicative noise that affects the dynamics of the controlled stochastic system into consideration in the optimisation process. The theoretical development of this framework was demonstrated on linear Gaussian stochastic systems affected by multiplicative noise, and the theoretical findings were validated on vibration quenching in a flexible beam.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work was funded by a Leverhulme Trust Research Project Grant number RPG-2017-337.

ORCID iD

Randa Herzallah

References

Chen and Jiao (2010) Adaptive tracking for periodically time-varying and nonlinearly parameterized systems using multilayer neural networks. IEEE Transactions on Neural Networks and Learning Systems 21: 348–351.

Cliff, Prokopenko and Fitch (2018) Minimising the Kullback–Leibler divergence for model selection in distributed nonlinear systems. Entropy 20(51): 2–28.

Flores and Barbieri (2006) Real-time infinite horizon linear-quadratic tracking controller for vibration quenching in flexible beams. In: IEEE conference on systems, man, and cybernetics, Taipei, Taiwan, 8–11 October 2006. IEEE.

Gaudio, Gibson, Annaswamy, et al. (2019) Connections between adaptive control and optimization in machine learning. In: 58th IEEE conference on decision and control (CDC), Nice, France, 11–13 December 2019. IEEE.

Gerasimov, Miliushin and Nikiforov (2019) Algorithms of adaptive tracking of unknown multisinusoidal signals in linear systems with arbitrary input delay. International Journal of Adaptive Control and Signal Processing 33: 900–912.

Golub and Meurant (2009) Matrices, Moments and Quadrature with Applications. Princeton, NJ: Princeton University Press, Vol. 30.

Hall (2015) Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Switzerland: Springer, Vol. 222.

Herzallah (2007) Adaptive critic methods for stochastic systems with input-dependent noise. Automatica 43(8): 1355–1362.

Herzallah and Kárný (2011) Fully probabilistic control design in an adaptive critic framework. Neural Networks 24(10): 1128–1135.

Herzallah and Lowe (2003) Multi-valued control problems and mixture density network. In: IFAC international conference on intelligent control systems and signal processing, Faro, Portugal, 1 April 2003, pp. 387–392. Oxford: Australian Academic Press.

Herzallah and Lowe (2004) A Bayesian approach to modeling the conditional density of the inverse controller. In: Proceedings of the 2004 IEEE international conference on control applications, Taipei, Taiwan, 2–4 September 2004, pp. 788–793. IEEE.

Herzallah and Lowe (2006) Bayesian adaptive control of nonlinear systems with functional uncertainty. In: Controlo, Lisbon, Portugal, 11–13 September 2006.

Humaidi and Hameed (2019) Design and comparative study of advanced adaptive control schemes for position control of electronic throttle valve. Information 10: 1–14.

Kárný (1996) Towards fully probabilistic control design. Automatica 32(12): 1719–1722.

Kárný and Guy (2006) Fully probabilistic control design. Systems & Control Letters 55(4): 259–265.

Kulback (1959) Information Theory and Statistics. New York: John Wiley and Sons.

Lee, Park and Chen (2001) Robust fuzzy control of nonlinear systems with parametric uncertainties. IEEE Transactions on Fuzzy Systems 9(2): 369–379.

Liu and Wang (2010) Robust reliable control for discrete-time-delay systems with stochastic nonlinearities and multiplicative noises. Optimal Control Applications and Methods 32: 285–297.

Narendra and Annaswamy (2005) Stable Adaptive Systems. Mineola, NY: Dover.

Pappalardo, Zhang, et al. (2016) Rational ANCF thin plate finite element. Journal of Computational and Nonlinear Dynamics 11: 15.

Simone MCD, Rivera and Guida (2018) Finite element analysis on squeal-noise in railway applications. FME Transaction 46: 93–100.

Sohn, Han, Choi, et al. (2009) Vibration and position tracking control of a flexible beam using SMA wire actuators. Journal of Vibration and Control 15(2): 263–281.

Song (2007) Active vibration suppression of a smart flexible beam using a sliding mode based controller. Journal of Vibration and Control 13(8): 1095–1107.

Tao (2003) Adaptive Control Design and Analysis. Hoboken, NJ: John Wiley and Sons.

Wei and Zhang (2013) Quantized stabilization for stochastic discrete-time systems with multiplicative noises. International Journal of Robust and Nonlinear Control 23(6): 591–601.

Wu and Du (2019) Adaptive robust course-tracking control of time-varying uncertain ships with disturbances. International Journal of Control, Automation and Systems 17: 1847–1855.

Xie and de Souza (1992) H∞ control and quadratic stabilization of systems with parameter uncertainty via output feedback. IEEE Transactions on Automatic Control 37(8): 1253–1256.

Yu and Mehta (2009) The Kullback–Leibler rate metric for comparing dynamical systems. In: 48th IEEE conference on decision and control (CDC), Shanghai, China, 15–18 December 2009. IEEE.

Yue and Wang (2003) Minimum entropy control of closed-loop tracking errors for dynamic stochastic systems. IEEE Transactions on Automatic Control 48: 118–122.

Zhang, et al. (2015) Linear quadratic regulation and stabilization of discrete-time systems with delay and multiplicative noise. IEEE Transactions on Automatic Control 60(10): 2599–2613.

Zhang, Zhou, Wang, et al. (2016) Output feedback stabilization for a class of multi-variable bilinear stochastic systems with stochastic coupling attenuation. IEEE Transactions on Automatic Control 62(6): 2936–2942.

Zhou, Wang, Zhou, et al. (2020) Dynamic performance enhancement for nonlinear stochastic systems using RBF driven nonlinear compensation with extended Kalman filter. Automatica 112: 108693.

Zhou, Zhang, Wang, et al. (2017) EKF-based enhanced performance controller design for nonlinear stochastic systems. IEEE Transactions on Automatic Control 63(4): 1155–1162.

A tracking error–based fully probabilistic control for stochastic discrete-time systems with multiplicative noise
