Efficient neural learning control of nonlinear dynamics with applications

Abstract

The control of nonlinear dynamics is attracting increasing attention because many practical systems exhibit such characteristics. To deal with system uncertainty, this paper proposes an efficient neural network learning control for nonlinear strict-feedback systems. The scheme follows the back-stepping design, and a novel learning law is proposed for the neural network weight update. A robust term is added to deal with the approximation error. The stability of the closed-loop dynamics is analysed, and the effectiveness of the design is verified through flight simulation.

Keywords

Neural network, nonlinear dynamics, learning control, strict-feedback system

Introduction

Nonlinear dynamics exist in many practical systems such as robots,¹ manipulators,² flight vehicles,³ quadrotors,⁴ and MEMS (microelectromechanical system) gyroscopes.⁵ Control of nonlinear dynamics^6–8 is challenging because the design must follow the system structure while the nonlinearity is difficult to handle. Nonlinear designs are usually based on Lyapunov theory, so that an energy-like function decreases along the closed-loop trajectories. One important concern is how to deal with the unknown nonlinear function. Some designs use knowledge of an upper bound, while others are based on a linearized parametric model. The other concern is the form of the dynamics. For example, the controllability canonical form differs from the strict-feedback and pure-feedback forms. For dynamics in controllability canonical form, the design can be built on an error surface, after which a robust design can be used. The control of strict-feedback systems is well studied using back-stepping and dynamic surface design. Although such designs can be efficient, dealing with the unknown dynamics is not easy because there may not be enough information to construct the linearized parametric model.

Intelligent control^9–11 provides a learning-based structure that makes the design more convenient. Related works include approximate control, reinforcement learning control, adaptive dynamic programming control,⁹ fault tolerant control,¹² and so on. Although the motivations differ, the common idea is that the neural network (NN) serves as a bridge between the known and the unknown. Some works further incorporate a disturbance observer,^13,14 sliding mode design, or $H_{\infty}$ design.¹⁵

For systems with unknown nonlinear functions, the NN can be used for approximation. Typically, two kinds of designs are widely used. One approximates the ideal control input, while the other approximates the nonlinear function. Such designs can be found in the literature,^16–18 with applications to robot, ship, flight, and mechanical systems. The more precisely the nonlinear function is approximated, the better the tracking performance can be.¹⁹ However, most neural control schemes focus on closed-loop stability and use only the tracking error to tune the NN weights. In this way, the system can be stable, yet on closer inspection the approximation may be far from the true value of the nonlinear function. Recently, some works have improved the learning using composite learning.^20,21 In these designs, the theoretical analysis is rigorous, and the prediction error is constructed from the dynamics and the approximation. In practice, more practical designs can be expected if more system information, such as the derivative of the system state, is available.

As discussed above, many works on intelligent control exist. Most designs, however, focus on function approximation and then construct the controller to obtain system stability, using the tracking error to tune the weights. The adaptation of the intelligent system itself is not sufficiently considered. Thus, in this paper, the approximation performance is considered and a new prediction error is constructed using the derivative of the system state. Furthermore, an efficient learning update law is designed and the closed-loop system stability is analysed.

The structure of the paper is given as follows. Section ‘Model dynamics and problem formulation’ presents the nonlinear strict-feedback dynamics. Sections ‘Efficient learning control’ and ‘Stability analysis’ present the learning control and the closed-loop stability analysis, respectively. The verification is presented in section ‘Simulation’. Section ‘Conclusions and future works’ gives the conclusions and the future discussions.

Model dynamics and problem formulation

In this paper, the following dynamics with strict-feedback form is considered

\begin{matrix} {\overset{\cdot}{ξ}}_{1} = f_{1} (ξ_{1}) + g_{1} ξ_{2} \\ {\overset{\cdot}{ξ}}_{i} = f_{i} ({\bar{ξ}}_{i}) + g_{i} ξ_{i + 1} , \quad i = 2, \dots, n - 1 \\ {\overset{\cdot}{ξ}}_{n} = f_{n} ({\bar{ξ}}_{n}) + g_{n} u \\ y = ξ_{1} \end{matrix}

(1)

where $f_{i}$ are the nonlinear system functions, $g_{i}$ are the control gain functions, ${\bar{ξ}}_{i} = [ξ_{1}, \dots, ξ_{i}]^{T}$ are the system states, $y$ is the output, and $u$ is the control input.
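To make the structure of (1) concrete, the following Python sketch evaluates the right-hand side for given states and input. It is illustrative only: the function name `strict_feedback_rhs`, the list-based interface, the constant gains, and the two-state example functions are assumptions and are not taken from the paper.

```python
def strict_feedback_rhs(xi, u, f, g):
    """Return [xi_1_dot, ..., xi_n_dot] for the strict-feedback form (1).

    xi : list of states [xi_1, ..., xi_n]
    u  : control input
    f  : list of callables; f[i] receives the partial state xi[:i + 1]
    g  : list of known gains (assumed constant here for simplicity)
    """
    n = len(xi)
    # Each state is driven by the next state; the last one is driven by u.
    driver = list(xi[1:]) + [u]
    return [f[i](xi[: i + 1]) + g[i] * driver[i] for i in range(n)]


# Hypothetical two-state example (n = 2); these f_1 and f_2 are made up.
f = [lambda s: -s[0], lambda s: s[0] * s[1]]
g = [1.0, 2.0]
xi_dot = strict_feedback_rhs([1.0, 3.0], 0.5, f, g)
```

The lower-triangular dependence is visible in the slice `xi[: i + 1]`: the ith function only sees the first i states, which is what allows the recursive back-stepping design.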

Assumption 1

The system states $ξ_{i}$ and their derivatives ${\overset{\cdot}{ξ}}_{i}$ can be obtained.

Assumption 2

The functions $f_{i}$ are unknown, while $g_{i}$ are known.

The control goal is to design an efficient learning algorithm for the back-stepping control so that the output $y$ tracks the reference signal $y_{r}$.

Efficient learning control

For the strict-feedback design, the back-stepping scheme is of great interest because it breaks the complex dynamics into several simpler subsystems. The main difficulty is the so-called ‘explosion of complexity’ caused by repeated differentiation of the virtual control signals. Several designs can be introduced for simplification, such as dynamic surface control and command filtered back-stepping. In this paper, the derivative of the virtual control signal is obtained using $\frac{ξ_{i}^{d} (k + 1) - ξ_{i}^{d} (k)}{t_{s}}$, where $ξ_{i}^{d} (k)$ is the virtual signal at sample $k$ and $t_{s}$ is the sample period.
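The difference quotient used for the derivative of the virtual signal can be sketched in Python as follows. The helper name `forward_difference` and the quadratic test signal are assumptions made for illustration; the paper only specifies the formula itself.

```python
def forward_difference(samples, t_s):
    """Approximate the derivative of a sampled signal by (x(k+1) - x(k)) / t_s."""
    if t_s <= 0:
        raise ValueError("t_s must be positive")
    return [(samples[k + 1] - samples[k]) / t_s for k in range(len(samples) - 1)]


# Hypothetical virtual signal 0.5 * t^2 sampled with period t_s.
t_s = 0.01
xi_d = [0.5 * (k * t_s) ** 2 for k in range(5)]
xi_d_dot = forward_difference(xi_d, t_s)
```

For this test signal the exact derivative at sample k is k·t_s, while the quotient returns k·t_s + t_s/2, so the approximation error is proportional to the sample period.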

Step 1. From the first equation of dynamics (1) and using an NN to approximate $f_{1} (ξ_{1})$, it is known that

{\overset{\cdot}{ξ}}_{1} = f_{1} (ξ_{1}) + g_{1} ξ_{2} = ω_{1}^{* T} θ_{1} (ξ_{1}) + ε_{1} + g_{1} ξ_{2}

(2)

where $ω_{1}^{*}$ is the unknown optimal NN weight vector, $θ_{1} (ξ_{1})$ is the NN basis function vector, and $ε_{1}$ is the approximation error satisfying $| ε_{1} | \leq ε_{1}^{m}$, with $ε_{1}^{m}$ the upper bound of $ε_{1}$.
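As an illustration of the linear-in-weights approximation in (2), the sketch below builds a basis vector and evaluates the NN output. The paper does not state which basis functions are used; Gaussian radial basis functions, the names `gaussian_basis` and `nn_output`, and the centres and width are assumptions chosen for this example.

```python
import math


def gaussian_basis(xi, centres, width):
    """Basis vector theta(xi): one Gaussian bump per centre.

    xi and every centre are lists of equal length; width is a shared spread.
    """
    return [
        math.exp(-sum((x - c) ** 2 for x, c in zip(xi, centre)) / (2.0 * width ** 2))
        for centre in centres
    ]


def nn_output(weights, theta):
    """Approximation f_hat = w_hat^T theta."""
    return sum(w * t for w, t in zip(weights, theta))


# Hypothetical one-dimensional example with two centres.
theta_1 = gaussian_basis([0.0], [[0.0], [1.0]], 1.0)
f1_hat = nn_output([2.0, 0.0], theta_1)
```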

Define the tracking error

e_{1} = ξ_{1} - ξ_{1}^{d}

(3)

where $ξ_{1}^{d} = y_{r}$ is the reference signal.

Design virtual control $ξ_{2}^{d}$ as

ξ_{2}^{d} = \frac{- {\hat{ω}}_{1}^{T} θ_{1} (ξ_{1}) - k_{1} e_{1} + {\overset{\cdot}{ξ}}_{1}^{d} - ε_{1}^{m} sign (e_{1})}{g_{1}}

(4)

where ${\hat{ω}}_{1}$ is the NN weight vector and $k_{1} > 0$ is the design constant.
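The virtual control (4) is a direct algebraic expression and can be evaluated as in the sketch below. The function names and the numerical values are hypothetical; `f_hat` stands for the NN output ${\hat{ω}}_{1}^{T} θ_{1} (ξ_{1})$ computed elsewhere.

```python
def sign(x):
    """Sign function returning -1, 0, or 1."""
    return (x > 0) - (x < 0)


def virtual_control(f_hat, e, k, xi_d_dot, eps_m, g):
    """Evaluate (4): cancel the estimated nonlinearity, add feedback and a robust term."""
    return (-f_hat - k * e + xi_d_dot - eps_m * sign(e)) / g


# Hypothetical values for one evaluation.
xi2_d = virtual_control(f_hat=0.2, e=0.1, k=2.0, xi_d_dot=0.0, eps_m=0.05, g=1.0)
```

The division by `g` is well defined only for a nonzero gain, which is consistent with the gains being known in Assumption 2.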

Define $e_{2} = ξ_{2} - ξ_{2}^{d}$ . Then, the derivative of $e_{1}$ is obtained as

\begin{matrix} {\overset{\cdot}{e}}_{1} = {\overset{\cdot}{ξ}}_{1} - {\overset{\cdot}{ξ}}_{1}^{d} = ω_{1}^{* T} θ_{1} (ξ_{1}) + ε_{1} + g_{1} ξ_{2} - {\overset{\cdot}{ξ}}_{1}^{d} \\ = {\tilde{ω}}_{1}^{T} θ_{1} (ξ_{1}) + ε_{1} - ε_{1}^{m} sign (e_{1}) - k_{1} e_{1} + g_{1} e_{2} \end{matrix}

(5)

where ${\tilde{ω}}_{1} = ω_{1}^{*} - {\hat{ω}}_{1}$
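The step from the first to the second line of (5) can be made explicit as follows. This is a worked restatement that uses only (4) and the definitions of $e_{2}$ and ${\tilde{ω}}_{1}$; no new assumption is introduced.

```latex
% Substitute \xi_2 = e_2 + \xi_2^d and the virtual control (4):
g_1 \xi_2 = g_1 e_2 + g_1 \xi_2^d
          = g_1 e_2 - \hat{\omega}_1^T \theta_1(\xi_1) - k_1 e_1
            + \dot{\xi}_1^d - \varepsilon_1^m \,\mathrm{sign}(e_1)
% Insert into \dot{e}_1 = \omega_1^{*T} \theta_1(\xi_1) + \varepsilon_1 + g_1 \xi_2 - \dot{\xi}_1^d:
\dot{e}_1 = (\omega_1^* - \hat{\omega}_1)^T \theta_1(\xi_1) + \varepsilon_1
            - \varepsilon_1^m \,\mathrm{sign}(e_1) - k_1 e_1 + g_1 e_2
```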

Define the prediction error as

z_{1} = {\overset{\cdot}{ξ}}_{1} - {\overset{\cdot}{\hat{ξ}}}_{1} = {\tilde{ω}}_{1}^{T} θ_{1} (ξ_{1}) + ε_{1}

(6)

Since $f_{1}$ is unknown, ${\overset{\cdot}{ξ}}_{1}$ cannot be computed from the model. Here, since the dynamics do not contain noise, it is calculated using ${\overset{\cdot}{ξ}}_{1} \approx \frac{ξ_{1} (k + 1) - ξ_{1} (k)}{t_{s}}$. The signal ${\overset{\cdot}{\hat{ξ}}}_{1}$ is calculated as

{\overset{\cdot}{\hat{ξ}}}_{1} = {\hat{ω}}_{1}^{T} θ_{1} (ξ_{1}) + g_{1} ξ_{2}

(7)

Then, the following equality can be obtained

z_{1} = {\tilde{ω}}_{1}^{T} θ_{1} ({\bar{ξ}}_{1}) + ε_{1}

(8)
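The prediction error combines a measured state derivative with the predictor (7). A minimal Python sketch is given below; the function name, the argument names, and the numerical example (a constant true $f_{1} = 0.3$ and $g_{1} = 2$) are assumptions for illustration only.

```python
def prediction_error(xi_now, xi_next, t_s, f_hat, g, xi_up):
    """Prediction error z = xi_dot - xi_hat_dot.

    xi_dot is approximated by a forward difference of two state samples;
    xi_hat_dot = f_hat + g * xi_up is the predictor, where xi_up is the
    next state in the chain (or the input u in the last step).
    """
    xi_dot = (xi_next - xi_now) / t_s
    xi_hat_dot = f_hat + g * xi_up
    return xi_dot - xi_hat_dot


# Hypothetical sample: the state moves with true f_1 = 0.3 and g_1 = 2.0.
t_s = 0.01
xi1_now, xi2_now = 1.0, 0.5
xi1_next = xi1_now + t_s * (0.3 + 2.0 * xi2_now)
z1 = prediction_error(xi1_now, xi1_next, t_s, f_hat=0.1, g=2.0, xi_up=xi2_now)
```

In this example the result is close to 0.3 - 0.1 = 0.2, that is, the current approximation error of the NN, which is the information the tracking error alone does not provide.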

The NN weight update is given as

{\overset{\cdot}{\hat{ω}}}_{1} = γ_{1} (e_{1} + γ_{z 1} z_{1}) θ_{1} (ξ_{1})

(9)

where $γ_{1}$ and $γ_{z 1}$ are positive design constants.
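In a sampled implementation, the update law (9) has to be integrated numerically. The sketch below uses one explicit Euler step, which is an assumption since the paper does not specify the integration scheme; the function name and the numbers are also hypothetical.

```python
def update_weights(w_hat, theta, e, z, gamma, gamma_z, t_s):
    """One Euler step of w_hat_dot = gamma * (e + gamma_z * z) * theta."""
    scale = gamma * (e + gamma_z * z) * t_s
    return [w + scale * th for w, th in zip(w_hat, theta)]


# Hypothetical step with two weights.
w_new = update_weights([0.0, 1.0], [1.0, 0.5], e=0.1, z=0.2,
                       gamma=1.5, gamma_z=2.0, t_s=0.01)

# Setting gamma_z = 0 leaves an update driven by the tracking error only.
w_tracking_only = update_weights([0.0, 1.0], [1.0, 0.5], e=0.1, z=0.2,
                                 gamma=1.5, gamma_z=0.0, t_s=0.01)
```

The second call illustrates the design choice: with $γ_{z 1} = 0$ the prediction error has no influence, so the weights adapt only as far as the tracking error demands.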

Step i. From the ith equation of dynamics (1) and using an NN to approximate $f_{i} ({\bar{ξ}}_{i})$, it is known that

{\overset{\cdot}{ξ}}_{i} = f_{i} ({\bar{ξ}}_{i}) + g_{i} ξ_{i + 1} = ω_{i}^{* T} θ_{i} ({\bar{ξ}}_{i}) + ε_{i} + g_{i} ξ_{i + 1}

(10)

where $ω_{i}^{*}$ is unknown and $ε_{i}$ is the approximation error with $| ε_{i} | \leq ε_{i}^{m}$ and $ε_{i}^{m}$ is the upper bound of $ε_{i}$ .

Design virtual control $ξ_{i + 1}^{d}$ as

ξ_{i + 1}^{d} = \frac{- {\hat{ω}}_{i}^{T} θ_{i} ({\bar{ξ}}_{i}) - k_{i} e_{i} - g_{i - 1} e_{i - 1} + {\overset{\cdot}{ξ}}_{i}^{d} - ε_{i}^{m} sign (e_{i})}{g_{i}}

(11)

where ${\hat{ω}}_{i}$ is the NN weight vector, $k_{i} > 0$ is the design constant, and ${\overset{\cdot}{ξ}}_{i}^{d} \approx \frac{ξ_{i}^{d} (k + 1) - ξ_{i}^{d} (k)}{t_{s}}$

The derivative of $e_{i}$ is obtained as

{\overset{\cdot}{e}}_{i} = {\overset{\cdot}{ξ}}_{i} - {\overset{\cdot}{ξ}}_{i}^{d} = {\tilde{ω}}_{i}^{T} θ_{i} ({\bar{ξ}}_{i}) + ε_{i} - ε_{i}^{m} sign (e_{i}) - k_{i} e_{i} - g_{i - 1} e_{i - 1} + g_{i} e_{i + 1}

(12)

where ${\tilde{ω}}_{i} = ω_{i}^{*} - {\hat{ω}}_{i}$ and $e_{i + 1} = ξ_{i + 1} - ξ_{i + 1}^{d}$.

The prediction error is constructed as

z_{i} = {\overset{\cdot}{ξ}}_{i} - {\overset{\cdot}{\hat{ξ}}}_{i}

(13)

Since $f_{i}$ is unknown, ${\overset{\cdot}{ξ}}_{i}$ cannot be computed from the model. Here, since the dynamics do not contain noise, it is calculated using ${\overset{\cdot}{ξ}}_{i} \approx \frac{ξ_{i} (k + 1) - ξ_{i} (k)}{t_{s}}$. The signal ${\overset{\cdot}{\hat{ξ}}}_{i}$ is calculated as

{\overset{\cdot}{\hat{ξ}}}_{i} = {\hat{ω}}_{i}^{T} θ_{i} ({\bar{ξ}}_{i}) + g_{i} ξ_{i + 1}

(14)

Then, the following equality can be obtained

z_{i} = {\tilde{ω}}_{i}^{T} θ_{i} ({\bar{ξ}}_{i}) + ε_{i}

(15)

The NN weight update is given as

{\overset{\cdot}{\hat{ω}}}_{i} = γ_{i} (e_{i} + γ_{zi} z_{i}) θ_{i} ({\bar{ξ}}_{i})

(16)

where $γ_{i}$ and $γ_{zi}$ are the positive design constants.

Step n. From the nth equation of dynamics (1) and using an NN to approximate $f_{n} ({\bar{ξ}}_{n})$, it is known that

{\overset{\cdot}{ξ}}_{n} = f_{n} ({\bar{ξ}}_{n}) + g_{n} u = ω_{n}^{* T} θ_{n} ({\bar{ξ}}_{n}) + ε_{n} + g_{n} u

(17)

where $ω_{n}^{*}$ is unknown and $ε_{n}$ is the approximation error with $| ε_{n} | \leq ε_{n}^{m}$ and $ε_{n}^{m}$ is the upper bound of $ε_{n}$ .

The final control signal $u$ is designed as

u = \frac{- {\hat{ω}}_{n}^{T} θ_{n} ({\bar{ξ}}_{n}) - k_{n} e_{n} - g_{n - 1} e_{n - 1} + {\overset{\cdot}{ξ}}_{n}^{d} - ε_{n}^{m} sign (e_{n})}{g_{n}}

(18)

where ${\hat{ω}}_{n}$ is the NN weight vector, $k_{n} > 0$ is the design constant, and ${\overset{\cdot}{ξ}}_{n}^{d} \approx \frac{ξ_{n}^{d} (k + 1) - ξ_{n}^{d} (k)}{t_{s}}$ .

The derivative of $e_{n}$ is obtained as

{\overset{\cdot}{e}}_{n} = {\overset{\cdot}{ξ}}_{n} - {\overset{\cdot}{ξ}}_{n}^{d} = {\tilde{ω}}_{n}^{T} θ_{n} ({\bar{ξ}}_{n}) + ε_{n} - ε_{n}^{m} sign (e_{n}) - k_{n} e_{n} - g_{n - 1} e_{n - 1}

(19)

where ${\tilde{ω}}_{n} = ω_{n}^{*} - {\hat{ω}}_{n}$

The prediction error is constructed as

z_{n} = {\overset{\cdot}{ξ}}_{n} - {\overset{\cdot}{\hat{ξ}}}_{n}

(20)

Since $f_{n}$ is unknown, ${\overset{\cdot}{ξ}}_{n}$ cannot be computed from the model. Here, since the dynamics do not contain noise, it is calculated using ${\overset{\cdot}{ξ}}_{n} \approx \frac{ξ_{n} (k + 1) - ξ_{n} (k)}{t_{s}}$. The signal ${\overset{\cdot}{\hat{ξ}}}_{n}$ is calculated as

{\overset{\cdot}{\hat{ξ}}}_{n} = {\hat{ω}}_{n}^{T} θ_{n} ({\bar{ξ}}_{n}) + g_{n} u

(21)

Then, the following equality can be obtained

z_{n} = {\tilde{ω}}_{n}^{T} θ_{n} ({\bar{ξ}}_{n}) + ε_{n}

(22)

The NN weight update is given as

{\overset{\cdot}{\hat{ω}}}_{n} = γ_{n} (e_{n} + γ_{zn} z_{n}) θ_{n} ({\bar{ξ}}_{n})

(23)

where $γ_{n}$ and $γ_{zn}$ are the positive design constants.

Stability analysis

Theorem 1

Consider the dynamics (1) with the virtual control signals (4) and (11), the control law (18), and the neural weight updates (9), (16), and (23). Then, the tracking errors are bounded.

Proof

The Lyapunov function is selected as

L_{v} = \frac{1}{2} \sum_{i = 1}^{n} \left( e_{i}^{2} + {\tilde{ω}}_{i}^{T} γ_{i}^{- 1} {\tilde{ω}}_{i} \right)

(24)

The derivative of $L_{v}$ is derived as

\begin{matrix} {\overset{\cdot}{L}}_{v} = e_{1} ({\tilde{ω}}_{1}^{T} θ_{1} (ξ_{1}) + ε_{1} - ε_{1}^{m} sign (e_{1}) - k_{1} e_{1} + g_{1} e_{2}) \\ - {\tilde{ω}}_{1}^{T} (e_{1} + γ_{z 1} z_{1}) θ_{1} (ξ_{1}) \\ + \sum_{i = 2}^{n - 1} \left[ e_{i} ({\tilde{ω}}_{i}^{T} θ_{i} ({\bar{ξ}}_{i}) + ε_{i} - ε_{i}^{m} sign (e_{i}) - k_{i} e_{i} - g_{i - 1} e_{i - 1} + g_{i} e_{i + 1}) - {\tilde{ω}}_{i}^{T} (e_{i} + γ_{zi} z_{i}) θ_{i} ({\bar{ξ}}_{i}) \right] \\ + e_{n} ({\tilde{ω}}_{n}^{T} θ_{n} ({\bar{ξ}}_{n}) + ε_{n} - ε_{n}^{m} sign (e_{n}) - k_{n} e_{n} - g_{n - 1} e_{n - 1}) \\ - {\tilde{ω}}_{n}^{T} (e_{n} + γ_{zn} z_{n}) θ_{n} ({\bar{ξ}}_{n}) \end{matrix}

(25)

Then it is calculated as

{\overset{\cdot}{L}}_{v} = \sum_{i = 1}^{n} \left[ - k_{i} e_{i}^{2} + e_{i} (ε_{i} - ε_{i}^{m} sign (e_{i})) - (z_{i} - ε_{i}) γ_{zi} z_{i} \right]

(26)
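Two elementary bounds help in passing from (26) to an inequality. The sketch below uses only $| ε_{i} | \leq ε_{i}^{m}$ and the identity ${\tilde{ω}}_{i}^{T} θ_{i} = z_{i} - ε_{i}$; the remaining cross term in $z_{i}$ would then be handled by a Young-type inequality with a scalar $μ$, which is presumably how the constants in the final bound arise.

```latex
% Robust term: the sign term dominates the approximation error.
e_i \left( \varepsilon_i - \varepsilon_i^m \,\mathrm{sign}(e_i) \right)
  = e_i \varepsilon_i - \varepsilon_i^m |e_i|
  \le |e_i| \left( |\varepsilon_i| - \varepsilon_i^m \right) \le 0
% Prediction-error term: a negative quadratic plus a bounded cross term.
- (z_i - \varepsilon_i)\, \gamma_{zi} z_i
  = - \gamma_{zi} z_i^2 + \gamma_{zi} \varepsilon_i z_i
  \le - \gamma_{zi} z_i^2 + \gamma_{zi} \varepsilon_i^m |z_i|
```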

The equation is further obtained as

{\overset{\cdot}{L}}_{v} \leq \sum_{i = 1}^{n} - k_{i} e_{i}^{2} - k_{γ} z_{i}^{2} + P

(27)

where $k_{γ} = γ_{zi} (1 - \frac{ε_{i}^{m}}{μ})$, $P = μ γ_{zi}^{2} (ε_{i}^{m})^{2}$, and $μ$ is a positive scalar. So all the signals in equation (24) are bounded, which implies that the tracking errors are bounded as stated in Theorem 1. This completes the proof.

Simulation

The flight dynamics²² are described by the angle of attack $α$, the flight path angle (FPA) $γ$, and the pitch rate $q$. The control input is the elevator deflection $δ_{e}$. The attitude dynamics are listed as follows

\overset{\cdot}{γ} = \frac{L + T \sin α}{mV} - \frac{g \cos γ}{V}

(28)

\overset{\cdot}{α} = q - \overset{\cdot}{γ}

(29)

\overset{\cdot}{q} = \frac{M_{yy}}{I_{yy}}

(30)

Define $X = [ξ_{1}, ξ_{2}, ξ_{3}]^{T}$ , $ξ_{1} = γ$ , $ξ_{2} = θ_{p}$ , and $ξ_{3} = q$ , where $θ_{p} = α + γ$ . The general form of the dynamics can be obtained as

\begin{matrix} {\overset{\cdot}{ξ}}_{1} = f_{1} (ξ_{1}) + g_{1} ξ_{2} \\ {\overset{\cdot}{ξ}}_{2} = ξ_{3} \\ {\overset{\cdot}{ξ}}_{3} = f_{3} (X) + g_{3} u \\ u = δ_{e} \end{matrix}
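The second line of the general form follows directly from (29) and the definition $θ_{p} = α + γ$, as the short derivation below shows. It introduces no quantity beyond those already defined.

```latex
\dot{\xi}_2 = \dot{\theta}_p = \dot{\alpha} + \dot{\gamma}
            = (q - \dot{\gamma}) + \dot{\gamma} = q = \xi_3
```

Because this subsystem contains no unknown function, only $f_{1}$ and $f_{3}$ need to be approximated.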

The approach that uses only the tracking error to update the neural weights is denoted ‘Method 1’, while the design in this paper, that is, the predictor-based update, is named ‘Proposed Method’. To evaluate the performance, the index is selected as $J_{i} = \int | e_{i} | dt$, where $i = 1, 2, 3$. The parameters are selected as in Zhang et al.,²³ while $γ_{i}$, $i = 1, 3$, are selected as 1.5. Define ${\hat{f}}_{i} = {\hat{ω}}_{i}^{T} θ_{i}$, $i = 1, 3$.
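For sampled simulation data, the index $J_{i}$ has to be evaluated numerically. The following sketch uses the trapezoidal rule, which is an assumption because the paper does not state the quadrature used; the function name and the sample values are hypothetical.

```python
def performance_index(errors, t_s):
    """Approximate J = integral of |e| dt from uniformly sampled errors (trapezoidal rule)."""
    if len(errors) < 2:
        return 0.0
    abs_e = [abs(e) for e in errors]
    # Interior samples count fully, the two end samples count half.
    return t_s * (sum(abs_e) - 0.5 * (abs_e[0] + abs_e[-1]))


# Hypothetical error samples over one second with |e| = 1 throughout.
J = performance_index([1.0, -1.0, 1.0], 0.5)
```

A smaller value of the index indicates a smaller accumulated tracking error over the simulation horizon.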

In the simulation, the altitude climbs from 86,000 to 87,000 ft in the first 50 s and then decreases from 87,000 to 85,000 ft in the next 50 s. Given the altitude reference signal, the flight path angle reference is generated in a similar way to Zhang et al.²³ In Example 1, there is no noise, while in Example 2 noise exists in $α$ and $q$.

Example 1

The simulation results are presented in Figures 1–6. Figures 1 and 2 show that the proposed method obtains better tracking performance, with higher precision in system state tracking. The control input responds more smoothly, as shown in Figure 3. The performance index is shown in Figure 4. The neural approximation is depicted in Figure 5, while the trajectory of the NN weights is shown in Figure 6. Overall, the proposed method achieves better convergence and higher tracking accuracy.

Figure 1. Altitude tracking.

Figure 2. System states.

Figure 3. Elevator deflection.

Figure 4. Performance index.

Figure 5. NN approximation.

Figure 6. NN weights.

Example 2

Random noises with amplitudes 0.0001 and 0.001 are added to $α$ and $q$, respectively. The system response is shown in Figure 7, the elevator deflection in Figure 8, and the NN response in Figure 9. The proposed method achieves much better performance in the presence of measurement noise. The responses of the elevator deflection and the NN weights also show more chattering when noise is present.
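The noise injection can be sketched as below. The paper gives only the amplitudes, so the uniform distribution on [-amplitude, amplitude], the function name, the seed, and the nominal values of the two signals are all assumptions made for illustration.

```python
import random


def add_measurement_noise(value, amplitude, rng):
    """Return value plus noise drawn uniformly from [-amplitude, amplitude] (assumed distribution)."""
    return value + rng.uniform(-amplitude, amplitude)


# Hypothetical nominal values for the angle of attack and the pitch rate.
rng = random.Random(0)  # fixed seed so that a run can be repeated
alpha_meas = add_measurement_noise(0.02, 1e-4, rng)
q_meas = add_measurement_noise(0.0, 1e-3, rng)
```

Because the state derivative is obtained from a difference quotient, noise of amplitude A on a state can appear amplified by up to 2A divided by the sample period in the prediction error, which is one possible explanation for the additional chattering.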

Figure 7. System states with random noises.

Figure 8. Elevator deflection.

Figure 9. NN weights with random noises.

Conclusions and future works

An efficient learning-based control is designed for strict-feedback systems. The design constructs a predictor signal to obtain the prediction error for the neural weight update. The system stability is analysed and the control performance is verified through nonlinear dynamic simulation.

For future work, the output-feedback design can be studied. In reality, time-varying disturbances exist in the dynamics, and a new estimation design can be analysed. For practical applications, the method can be applied to manipulators, underwater vehicles, quadrotors, and automobile dynamics for experimental purposes.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Ruixin Liu

References

1. Dong. Adaptive fuzzy neural network control for a constrained robot using impedance learning. IEEE T Neural Netw Learn Syst 2018; 29(4): 1174–1186.

2. Stonier. Continuous finite-time control for robotic manipulators with terminal sliding modes. In: Sixth international conference of information fusion, Cairns, QLD, Australia, 8–11 July 2003, pp. 1433–1440. New York: IEEE.

3. Yang, Sun, et al. Nonlinear-disturbance-observer-based robust flight control for airbreathing hypersonic vehicles. IEEE T Aerosp Electr Syst 2013; 49(2): 1263–1275.

4. Composite learning finite-time control with application to quadrotors. IEEE T Syst Man Cyb 2018; 48(10): 1806–1815.

5. Fei, Yan WF. Adaptive control of MEMS gyroscope using global fast terminal sliding mode control and fuzzy-neural-network. Nonlinear Dynam 2014; 78(1): 103–116.

6. Kokotovic. The joy of feedback: nonlinear and adaptive: 1991 Bode prize lecture. IEEE Control Syst Mag 1991; 12: 7–17.

7. Polycarpou. Stable adaptive neural control scheme for nonlinear systems. IEEE T Automat Control 1996; 41(3): 447–451.

8. Wang, Huang. Adaptive neural network control for a class of uncertain nonlinear systems in pure-feedback form. Automatica 2002; 38(8): 1365–1372.

9. Wang, Zhong, et al. Event-driven nonlinear discounted optimal regulation involving a power system application. IEEE T Ind Electron 2017; 64(10): 8177–8186.

10. Chen, SS. Adaptive neural output feedback control of uncertain nonlinear systems with unknown hysteresis using disturbance observer. IEEE T Ind Electron 2015; 62(12): 7706–7716.

11. Sun, Yan, et al. Disturbance observer-based neural network control of cooperative multiple manipulators with input saturation. IEEE T Neural Netw Learn Syst. Epub ahead of print 13 August 2019. DOI: 10.1109/TNNLS.2019.2923241.

12. Yin, Jiang, Tian, et al. A data-driven fuzzy information granulation approach for freight volume forecasting. IEEE T Ind Electron 2017; 64(2): 1447–1456.

13. Chen, Yang, Guo, et al. Disturbance-observer-based control and related methods – an overview. IEEE T Ind Electron 2016; 63(2): 1083–1095.

14. Shou, Luo, et al. Neural learning control of strict-feedback systems using disturbance observer. IEEE T Neural Netw Learn Syst 2019; 30(5): 1296–1307.

15. Qian, Xing, Wang, et al. New optimal analysis method to stability and H_∞ performance of varying delayed systems. ISA Trans 2019; 93: 137–144.

16. Wang. Design and analysis of fuzzy identifiers of nonlinear dynamic systems. IEEE T Automat Control 1995; 40(1): 11–23.

17. Composite learning control of flexible-link manipulator using NN and DOB. IEEE T Syst Man Cyb 2018; 48(11): 1979–1985.

18. Peng, Wang. Distributed containment maneuvering of multiple marine vessels via neurodynamics-based output feedback. IEEE T Ind Electron 2017; 64(5): 3831–3839.

19. Hojati, Gazor. Hybrid adaptive fuzzy identification and control of nonlinear systems. IEEE T Fuzzy Syst 2002; 10(2): 198–210.

20. Pan, Zhou, Sun, et al. Composite adaptive fuzzy H_∞ tracking control of uncertain nonlinear systems. Neurocomputing 2013; 99: 15–24.

21. Bellomo, Naso, Turchiano, et al. Composite adaptive fuzzy control. In: Proceedings of the 16th IFAC world congress, Prague, 4–8 July 2005.

22. Parker, Serrani, Yurkovich, et al. Control-oriented modeling of an air-breathing hypersonic vehicle. J Guid Control Dynam 2007; 30(3): 856–869.

23. Zhang, Zhu. Composite dynamic surface control of hypersonic flight dynamics using neural networks. Sci China Inform Sci 2015; 58(7): 1–9.