Abstract
This article proposes using the Kullback–Leibler divergence to characterise the uncertainty of the tracking error for general stochastic systems, without restricting them to particular distributions. The general solution to the fully probabilistic design of the tracking error control problem is stated first. Further development then focuses on the derivation of a randomised controller for a class of linear stochastic Gaussian systems affected by multiplicative noise, and the derived control solution takes this multiplicative noise explicitly into account. Because it directly characterises the main objective of system control, the proposed fully probabilistic design of the tracking error is better matched to tracking problems than the conventional fully probabilistic design method. The efficiency of the proposed method is then demonstrated on a flexible beam example, where vibration in the beam is shown to be effectively suppressed.
1. Introduction
In control systems, the tracking error between the system output and a predefined desired output is the most commonly used optimisation signal for tuning the parameters of the system controller (Gaudio et al., 2019; Gerasimov et al., 2019; Humaidi and Hameed, 2019; Wu and Du, 2019; Zhou et al., 2020; Zhou et al., 2017). When combined with adaptive control (Chen and Jiao, 2010; Narendra and Annaswamy, 2005; Tao, 2003), the approach has proven particularly useful for controlling systems that are affected by model uncertainty and random noise, that operate under changing environments, and that have unforeseen variations in their overall structure. Although these methods are adaptive and are therefore expected to deal with the underlying system uncertainty, many of them optimise the controller parameters by minimising the mean square tracking error. The minimisation of the mean square tracking error, also known as the tracking error variance, is however based on the assumption of certainty equivalence; therefore, it does not generally yield good performance. Thus, for more general stochastic systems and for systems with functional and model uncertainty, the variance of the tracking error cannot be used alone to represent the performance of the closed-loop system (Herzallah, 2007; Herzallah and Lowe, 2003; Herzallah and Lowe, 2004; Herzallah and Lowe, 2006; Yue and Wang, 2003; Zhang et al., 2016). As a result, the Kullback–Leibler divergence (Cliff et al., 2018; Kullback, 1959; Yu and Mehta, 2009) has recently been proposed in the control literature to characterise the uncertainty of stochastic system dynamics. This is because the Kullback–Leibler divergence measures the discrepancy between the distributions of the stochastic system and their desired distributions, rather than characterising them only by their means or variances.
An efficient control approach, known as fully probabilistic design (FPD), which uses the Kullback–Leibler divergence as a performance measure for designing randomised controllers, was proposed by Karny (1996) and Herzallah and Karny (2011). In this approach, the Kullback–Leibler divergence is used to measure the discrepancy between the joint pdf of the closed-loop description of the system dynamics and an ideal joint pdf. The main advantage of the FPD control approach is that it provides a closed-form solution for a general description of stochastic systems without restricting them to a particular distribution. However, although a closed-form solution can be obtained, it cannot in general be evaluated analytically because of the multivariate integration involved in the optimisation process. Moreover, in its original form the FPD control method considers the design of a randomised controller that shapes the pdf of the system dynamics, yet characterising this pdf can be difficult for many real-world systems that work under high levels of uncertainty and stochasticity. Furthermore, in many real engineering systems the control objective is to make the output of the system dynamics follow a predefined desired output value, which emphasises the importance of the tracking error rather than the actual system output.
As such, this study follows an alternative approach in which the Kullback–Leibler divergence is defined as the distance between the joint pdf of the tracking error and the randomised controller of the controlled system and an ideal joint pdf. The randomised controller is therefore designed here to reshape the pdf of the tracking error of the controlled system rather than the pdf of its dynamics. Compared with existing results on the topic and with the conventional FPD approach, this alternative approach has several advantages that have not been reported in the literature. First, characterising the pdf of the tracking error of the controlled system is normally easier than characterising the pdf of its dynamics. This is because, when the stochastic dynamics of the controlled system are estimated accurately, the resulting tracking error will be small and can most likely be characterised by a Gaussian pdf. This in turn simplifies the optimisation of the sought randomised controller. Second, the ideal distribution of the tracking error can be naturally specified by a zero-mean distribution. In particular, a Gaussian distribution with zero mean and a prespecified covariance matrix that determines the allowed fluctuations of the tracking error around zero would be ideal. Third, the FPD method in its original form considers only additive noise in the system dynamics, whereas the solution proposed here considers stochastic systems with multiplicative noise, which represents conditions under which most real-world systems operate. An additional contribution of the study is therefore the consideration of the multiplicative noise of the stochastic system in the derivation of the randomised optimal control law. Moreover, the proposed probabilistic minimisation of the tracking error will be shown to be particularly useful for solving the vibration control problem associated with mechanical systems.
The vibration control problem is particularly challenging and is relevant to many real-world control problems, including robotic manipulators, aerospace structures, and biomechanical systems (Flores and Barbieri, 2006; Pappalardo et al., 2016; Simone et al., 2018; Sohn et al., 2009; Song and Gu, 2007).
To reemphasise, this alternative solution based on the tracking error and the extension of the FPD to stochastic systems with multiplicative noise have not been discussed previously in the literature. Their theoretical development and numerical demonstration are presented for the first time in this article.
2. Problem statement
In the original formulation of the FPD, the aim is to derive a randomised controller that shapes the joint probability density function of the stochastic system dynamics and the controller. This joint probability density function of the controller and the dynamics of the stochastic system represents the complete description of the closed-loop behaviour of the controlled system. However, in some control applications, the system is required to track a predefined desired trajectory. Thus, for these control applications, it would be more convenient to design the controller such that it reshapes the pdf of the tracking error as opposed to the original formulation of reshaping the pdf of the system dynamics. For the system to be able to track the desired signal, the controller should be designed such that the pdf of the tracking error is centred around zero with small variations. This objective of achieving a narrow distribution of the tracking error centred around zero error state implies that the system has tracked the desired trajectory and at the same time indicates that the uncertainty in the tracked trajectory is small. To be more specific, assume that the stochastic system can be described at each time instant k by the following conditional pdf
Because the system considered in this study is stochastic and subject to random forces and functional uncertainties, only the probability density function of the state values defined in equation (1) can be specified. The objective of this study, however, is to design a randomised controller that shapes the pdf of the tracking error, since the system state is required to track a desired set point. This would require the pdf of the tracking error to be known, which may be an unrealistic assumption for many real-world control problems. However, the density function of the tracking error can be obtained from the density function of the system dynamics using probability theory as follows
In general, s(xk|·) is not known in practice and thus needs to be estimated online using the observed data of the controlled system. The estimation process of this pdf is explained in Section 3.2.
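To make the change of variables concrete, the following is a hedged sketch rather than the article's own equation. It assumes that the tracking error of equation (2) is the shift e_k = x_k − x_r by a fixed reference x_r (the reference symbol used in Section 4); the subscripts s_e and s_x and the Gaussian parameters μ_k, Σ_k are illustrative labels introduced here.

```latex
% Assumption: e_k = x_k - x_r with a fixed reference x_r, so the Jacobian of the map is 1.
% s_x denotes the state pdf of equation (1); s_e denotes the induced tracking-error pdf.
\begin{align*}
  s_e(e_k \mid u_{k-1}, e_{k-1})
    &= s_x\bigl(x_k = e_k + x_r \,\bigm|\, u_{k-1},\; x_{k-1} = e_{k-1} + x_r\bigr), \\
  x_k \mid \cdot \sim \mathcal{N}(\mu_k, \Sigma_k)
    &\;\Longrightarrow\;
  e_k \mid \cdot \sim \mathcal{N}(\mu_k - x_r,\; \Sigma_k).
\end{align*}
```

Under this assumption the error pdf inherits the shape of the state pdf, and only its location changes.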
Once the pdf of the tracking error is estimated, the randomised controller can be derived by redefining the Kullback–Leibler divergence such that the discrepancy between the joint pdf of the tracking error and the controller and a predefined ideal joint pdf is minimised
The optimal randomised controller c(uk−1|ek−1) can be obtained by recursively solving the following recurrence equation (Herzallah and Karny, 2011)
The derivation of the above result can be found in Herzallah and Karny (2011). The optimal randomised controller that minimises the recurrence equation specified in equation (5) can then be shown to be given as specified in the following proposition.
The pdf of the optimal randomised controller that minimises cost-to-go function (5) is given by
This proposition can be proven by adapting the proof of Proposition 2 in Karny and Guy (2006). Note that the solution of the optimal randomised controller as specified in this proposition is not restricted by the pdf that characterises the error or the controller. It provides the general solution for the randomised controller without constraints on the required pdfs. However, the evaluation of the analytic solution for this randomised controller is not possible except for the special case of linear and Gaussian pdfs. Therefore, to facilitate the understanding and the analytical solution of the proposed tracking error–based FPD, the next section will demonstrate the solution to the probabilistic tracking control for a class of linear stochastic systems with multiplicative noise.
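As a small illustration of the performance measure used in this section, the sketch below evaluates the Kullback–Leibler divergence between two Gaussian pdfs in closed form. It is not the article's cost-to-go function: the function name `gaussian_kl` and the example numbers are assumptions made here, and only the ideal covariance 0.01 × I is borrowed from Section 4.

```python
import numpy as np


def gaussian_kl(mean_p, cov_p, mean_q, cov_q):
    """Closed-form KL divergence D(p || q) between two multivariate Gaussians."""
    mean_p = np.atleast_1d(np.asarray(mean_p, dtype=float))
    mean_q = np.atleast_1d(np.asarray(mean_q, dtype=float))
    cov_p = np.atleast_2d(np.asarray(cov_p, dtype=float))
    cov_q = np.atleast_2d(np.asarray(cov_q, dtype=float))
    dim = mean_p.size
    diff = mean_q - mean_p
    cov_q_inv = np.linalg.inv(cov_q)
    trace_term = np.trace(cov_q_inv @ cov_p)
    mahalanobis = diff @ cov_q_inv @ diff
    # slogdet returns (sign, log|det|); covariances are positive definite.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (trace_term + mahalanobis - dim + logdet_q - logdet_p)


# Ideal tracking-error pdf: zero mean, covariance 0.01 * I (value used in Section 4).
ideal_mean = np.zeros(2)
ideal_cov = 0.01 * np.eye(2)

# Hypothetical actual error pdfs: one biased, one too wide.
kl_biased = gaussian_kl([0.1, 0.0], ideal_cov, ideal_mean, ideal_cov)
kl_wide = gaussian_kl(ideal_mean, 0.04 * np.eye(2), ideal_mean, ideal_cov)
```

Both a non-zero mean and an excess spread of the tracking error increase the divergence, which is why a single scalar of this kind can penalise bias and uncertainty together.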
3. Solution of the probabilistic tracking control for linear stochastic systems
The theory developed in the previous section is applied here to derive the analytic solution of the probabilistic tracking control for linear stochastic systems with multiplicative noise. Such systems arise naturally in networked control systems, where multiplicative noise has been used to model packet loss (Wei et al., 2013) and time delay (Zhang et al., 2015) occurring during packet transmission in communication networks. This differs from parameter uncertainty (Lee et al., 2001; Liu et al., 2010; Xie et al., 1992), where the uncertainty is usually grouped with the parameters of the state and can be considered stochastic or deterministic. The development of a robust control solution for these systems is a long-standing and still unsolved problem.
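To illustrate what distinguishes multiplicative from additive noise, the following sketch computes the conditional mean and covariance of the next state for one commonly used form, x_k = (A + w_k D) x_{k−1} + B u_{k−1} with scalar Gaussian w_k. This form, the matrix D, and the function name `conditional_moments` are assumptions made for illustration and are not taken from equation (8).

```python
import numpy as np


def conditional_moments(A, B, D, noise_var, x_prev, u_prev):
    """Mean and covariance of x_k given (x_prev, u_prev) for the assumed model
    x_k = (A + w_k D) x_prev + B u_prev, with scalar w_k ~ N(0, noise_var)."""
    x_prev = np.asarray(x_prev, dtype=float)
    u_prev = np.asarray(u_prev, dtype=float)
    mean = A @ x_prev + B @ u_prev
    direction = D @ x_prev  # the noise enters scaled by the current state
    cov = noise_var * np.outer(direction, direction)
    return mean, cov


# Hypothetical two-state example.
A = 0.9 * np.eye(2)
B = np.array([[0.0], [1.0]])
D = np.eye(2)
mean_far, cov_far = conditional_moments(A, B, D, 0.01, [2.0, 0.0], [0.5])
mean_origin, cov_origin = conditional_moments(A, B, D, 0.01, [0.0, 0.0], [0.5])
```

In this assumed form the covariance of the next state grows with the size of the current state and vanishes at the origin, whereas additive noise would give a constant covariance.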
3.1. Model description
Consider a stochastic linear discrete-time system with multiplicative Gaussian noise described by
It should be noted that in real-world situations the parameters of stochastic model (8) are not known in general and thus need to be estimated. However, because the current value of the system state is affected by noise, its value cannot be completely specified by the previous control and previous state values. Therefore, the probabilistic description of stochastic model (8) needs to be estimated online using observed data from the stochastic system dynamics to describe the probabilistic evolution of the system state. The online estimation process of the stochastic system parameters and consequently the system state distribution will be discussed next.
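A minimal sketch of such an estimation step is given below, under simplifying assumptions that are not taken from the article: it is a batch (not recursive) least-squares fit of x_k ≈ A x_{k−1} + B u_{k−1} using the Moore–Penrose pseudo-inverse, and any multiplicative noise is simply lumped into the residual covariance. The function name `estimate_linear_model` and the test system are hypothetical.

```python
import numpy as np


def estimate_linear_model(states, inputs):
    """Least-squares estimate of A and B in x_k = A x_{k-1} + B u_{k-1} + residual.

    states: array of shape (T + 1, n); inputs: array of shape (T, m).
    Returns (A_hat, B_hat, residual_cov).
    """
    states = np.asarray(states, dtype=float)
    inputs = np.asarray(inputs, dtype=float)
    regressors = np.hstack([states[:-1], inputs])  # rows are [x_{k-1}^T, u_{k-1}^T]
    targets = states[1:]                           # rows are x_k^T
    theta = np.linalg.pinv(regressors) @ targets   # stacked [A^T; B^T]
    n = states.shape[1]
    A_hat = theta[:n].T
    B_hat = theta[n:].T
    residuals = targets - regressors @ theta
    residual_cov = residuals.T @ residuals / max(len(targets) - 1, 1)
    return A_hat, B_hat, residual_cov


# Hypothetical noise-free data from a known system, used only to check the fit.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
u_data = rng.normal(size=(50, 1))
x_data = np.zeros((51, 2))
x_data[0] = [1.0, -1.0]
for k in range(50):
    x_data[k + 1] = A_true @ x_data[k] + B_true @ u_data[k]

A_hat, B_hat, residual_cov = estimate_linear_model(x_data, u_data)
```

The estimated parameters give the conditional mean of the state, and the residual covariance gives a rough estimate of its conditional covariance; an online scheme would update these quantities at every time step instead of refitting on the full record.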
3.2. Estimation of the probabilistic description of the system tracking error
As discussed in the previous section, because of the stochastic nature of the system dynamics, only the probabilistic description of the system state can be specified. This can be obtained by estimating the system parameters of the stochastic equation of the system state given in equation (8). Therefore, given our prior knowledge of the linear dynamics of the system and the fact that it is driven by multiplicative noise, the required model of system (8) can be assumed to have the following form
As can be seen from equation (12), the pseudo-inverse matrix does have the property that
Following the estimation of these parameters, the conditional distribution of the system state is shown to be Gaussian described by
For the objective of deriving a randomised controller that will achieve a narrow tracking error distribution centred around zero, thus guaranteeing an accurate tracking of the system state to the desired value, the tracking error distribution needs to be specified. This can be obtained from the definition of the tracking error given in equation (2). The dynamical description of the tracking error can then be obtained by substituting equation (9) into (2), which yields
3.3. Randomised control solution
In this section, the generalised fully probabilistic control solution of the tracking problem for the stochastic linear system with multiplicative noise defined in equation (8) is derived. As discussed in earlier sections, the pdf of the system tracking error is assumed to be unknown, thus estimated online as explained in Section 3.2. The purpose of the designed controller here is to make the pdf of the tracking error
Given the pdf of the tracking error defined in equation (16) and the ideal pdfs of the tracking error and controller defined in equations (19) and (20), respectively, the performance index for the class of linear stochastic systems defined in equation (9) can then be shown to be given by the following theorem.
Using the pdf description of the tracking error dynamics specified by equation (16), the ideal distribution of the tracking error dynamics given by equation (19), and the ideal distribution of the controller given by equation (20) in equations (6) and (7) gives the following performance index
The claimed quadratic form of the optimal performance function specified in equation (22) can be verified subsequently by backward induction. The proof starts by evaluating γ in equation (7), repeated here
This evaluation requires the evaluation of β1 and β2. Starting with β1
To solve (28), the following rule from Golub and Meurant (2009) is required
Because the objective of the sought randomised optimal controller is to make the distribution of the tracking error of the system dynamics as close as possible to the specified ideal distribution, it is expected that at steady state the covariance of the tracking error dynamics will become close to the covariance of the specified ideal distribution. This means that
Please note that the covariance of the noise, Q, affecting the system will not be too large in real-world systems. This in turn means that
Based on Assumption 1 and Lemma 2.6 from Hall (2015), equation (30) can be approximated as follows
Using equation (32) in (28) and expanding the terms of equation (28), we get
The last part in equation (33),
Substituting equation (34) back into (33), we obtain
Similarly,
The integral in equation (38) can be calculated by completing the square with respect to uk−1. Consequently,
Note that according to Theorem 1,
Following the above verification of the quadratic performance index, the next step is to evaluate the parameters of the optimal controller distribution that will make the pdf of the tracking error follow the given ideal pdf. Based on equations (6) and (39), the randomised optimal controller that minimises the Kullback–Leibler divergence objective function is given by the following theorem.
The optimal randomised controller that minimises the Kullback–Leibler divergence objective function subject to the probability density function of the tracking error defined in equation (16) and the ideal pdfs of the tracking error and controller defined in equations (19) and (20), respectively, is given by
Substituting equations (20), (35), (36), and (39) in (6) yields
Evaluating the terms independent of uk−1 in equation (45), and completing the square with respect to the control input, uk−1, equation (45) can be further expressed as
It can be seen that the mean and covariance of the distribution given in equation (46) are the mean and covariance of the optimised randomised controller as stated in equations (41) and (44), respectively. This completes the proof. □
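The completing-the-square step used in this proof relies on a standard identity, sketched here in generic form. The symbols M (a symmetric positive definite matrix) and b are placeholders introduced for illustration and do not denote the specific matrices of equations (41) to (46).

```latex
% Generic identity; M (symmetric positive definite) and b are placeholder symbols.
\begin{align*}
  u^{\top} M u - 2 b^{\top} u
    &= (u - M^{-1} b)^{\top} M \,(u - M^{-1} b) - b^{\top} M^{-1} b, \\
  \exp\!\Bigl\{-\tfrac{1}{2}\bigl(u^{\top} M u - 2 b^{\top} u\bigr)\Bigr\}
    &\propto \mathcal{N}\bigl(u;\; M^{-1} b,\; M^{-1}\bigr).
\end{align*}
```

Any exponent that is quadratic in the control input can therefore be read as a Gaussian in that input, with mean M^{-1} b and covariance M^{-1}; the term that does not depend on u is absorbed into the normalising constant.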
4. Numerical results
This section will demonstrate the effectiveness of the proposed probabilistic minimisation of the tracking error specified by Theorems 1 and 2 in driving the output of the system dynamics to a predefined desired output value. In particular, the theory developed in Section 3 is applied here to a flexible beam system (Flores and Barbieri, 2006) described by the following equation
Because the proposed framework is developed for discrete-time systems, equation (47) is discretised using the forward difference method with a sampling time of h = 0.06. In addition, to demonstrate all aspects of the proposed method, multiplicative noise is added to the original deterministic system equation after discretisation. This yields the following equivalent discrete-time description
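The forward difference step can be sketched as follows for a generic continuous-time linear model x' = A_c x + B_c u, which becomes x_{k+1} = (I + h A_c) x_k + h B_c u_k. The matrices below describe a hypothetical lightly damped oscillator chosen for illustration; they are not the beam model of equation (47), and only the sampling time h = 0.06 is taken from the text.

```python
import numpy as np


def forward_difference_discretise(A_c, B_c, h):
    """Discretise x' = A_c x + B_c u as x_{k+1} = (I + h A_c) x_k + h B_c u_k."""
    A_c = np.asarray(A_c, dtype=float)
    B_c = np.asarray(B_c, dtype=float)
    A_d = np.eye(A_c.shape[0]) + h * A_c
    B_d = h * B_c
    return A_d, B_d


# Hypothetical lightly damped oscillator, NOT the flexible beam of equation (47).
A_c = np.array([[0.0, 1.0], [-4.0, -0.5]])
B_c = np.array([[0.0], [1.0]])
A_d, B_d = forward_difference_discretise(A_c, B_c, h=0.06)

# Forward differences can destabilise a stable model if h is too large,
# so it is worth checking that the discrete-time eigenvalues stay inside the unit circle.
spectral_radius = np.max(np.abs(np.linalg.eigvals(A_d)))
```

For this example the spectral radius stays below one, so the discretised model remains stable at the chosen sampling time.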
The objective of the sought randomised controller is then to suppress the vibration in the beam and to stabilise the angle θ between the hub's frame and a global (stationary) reference frame at the value of 1. Therefore, the reference value that the system state is required to track is taken to be xr = [1, 0, 0, 0, 0, 0]^T. In addition, the system state is assumed to start from the initial state x0 = [22, 0.3, 1, 0.4, 0.5, 2]^T.
As discussed in Section 3, the parameters of the flexible beam system specified in equation (48) are assumed to be unknown and are therefore estimated online at each time step. The mean and covariance of the conditional distribution of the beam system dynamics are then specified using the estimated parameters, as discussed in Section 3. These estimates are then used in equations (23) and (24) to evaluate the solution of the Riccati equation, Sk, as well as Pk, which are both used in equation (41) to calculate the mean of the control input to be forwarded to the beam. In the simulation experiment, the covariance, Σ2, of the ideal distribution of the tracking error is taken to be 0.01 × In×n, and the covariance, Γ, of the ideal distribution of the control input is taken to be 1. The simulation results are shown in Figures 1 to 3. Figure 1 shows the states of the flexible beam with their corresponding reference signals. As can be seen from this figure, all the flexible beam system states accurately track their corresponding reference states. This is confirmed by the magnified views in Figure 1, which show the steady-state values of the beam states. The tracking errors are presented in Figure 2(a), from which it can also be seen that all the state tracking errors go to zero. These figures also show, however, large deviations of the beam state values from their corresponding reference values and large tracking errors in the transient period. This is expected, as the parameters of the beam equation, which are estimated online, will not have converged to their true values in this transient period. Once the parameters converge to their true values, the beam states track their corresponding reference values well. The control input calculated from equation (41) is shown in Figure 2(b). As can be seen from this figure, the control input is stable, thus yielding the required results. Finally, the feedback gain calculated from equation (42) is shown in Figure 3. This figure shows that all the feedback gains have converged and reached steady-state values. To reemphasise, the numerical results demonstrate the efficacy of the proposed probabilistic tracking control method and show that the mean of the tracking error can be driven to zero.
Figure 1. State of the simulated flexible beam: (a) state x1 (dotted line) and reference state
Figure 2. Tracking error of the states and system input of the simulated flexible beam: (a) tracking error of the flexible beam states and (b) the control input to the flexible beam as calculated from equation (41).
Figure 3. Feedback gain of the controller as calculated from equation (42).


5. Conclusion
This article presented a new framework for the design of randomised controllers for complex stochastic and uncertain systems, based on the minimisation of the Kullback–Leibler divergence between the joint pdf of the tracking error and the controller and an ideal joint pdf. The proposed framework takes the multiplicative noise that affects the dynamics of the controlled stochastic system into consideration in the optimisation process. Its theoretical development was demonstrated on linear Gaussian stochastic systems affected by multiplicative noise, and the theoretical findings were then validated on the suppression of vibration in a flexible beam.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work was funded by a Leverhulme Trust Research Project Grant number RPG-2017-337.
