Real-time neural identification and inverse optimal control for a tracked robot

Abstract

This work presents the real-time implementation of a neural identifier, based on a recurrent high-order neural network trained with an extended Kalman filter–based algorithm, together with an inverse optimal control applied to a tracked robot. The recurrent high-order neural network identifier is developed without knowledge of the plant model or its parameters, and the inverse optimal control is designed for tracking velocity references. This article includes simulation and real-time results, both obtained using MATLAB^®; the experimental tests use a modified HD2^® Treaded ATR Tank Robot Platform with wireless communication.

Keywords

Differential mobile robot, extended Kalman filter, inverse optimal control, neural networks, real-time, recurrent high-order neural network, neural identification, time-delay system

Introduction

A tracked robot is a mobile robot that runs on continuous tracks instead of wheels. The main advantage of a tracked robot is that it can be used to navigate in rough terrains.^1,2

The thrust developed by a wheeled vehicle will generally be lower than that developed by a comparable tracked vehicle;¹ this is why these kinds of vehicles are used in a variety of applications where terrain conditions are difficult or unpredictable: urban reconnaissance, forestry, mining, agriculture, rescue mission scenarios, autonomous planetary explorations, to name but a few.^2,3 Besides, tracked robots offer some other advantages, such as follows:

Tracked robots are versatile vehicles in different terrains and weather conditions.

Tracked robots generate low ground pressure which conserves the environment.

The design of tracked robots prevents them from sinking or becoming stuck in soft ground.

Tracked mobile robots can be considered as the most important type of mobile robots, and an extensive class of controllers has been proposed for trajectory tracking for this kind of robot;^4–7 these works share the characteristic that they require extensive knowledge of the system to be controlled, and most of them are implemented in continuous time.

It is well known that the current trend towards digital rather than analogue control of dynamic systems is mainly due to the advantages found in working with digital rather than continuous-time signals.⁸ This work is based on the previous work⁹ where the recurrent high-order neural network (RHONN) identifier and an inverse optimal control scheme for all-terrain tracked robots are presented with just one simulation result. The RHONN is used to identify the plant model and it is trained with an extended Kalman filter (EKF)–based algorithm, under the assumption that the whole state is available for measurement. On the other hand, an inverse optimal control is designed for velocity trajectory tracking. In this work, we extend the previous work⁹ by including more simulation results, as well as real-time results using MATLAB^® (MathWorks, Inc.) and a modified HD2^® (SuperDroid Robots) Treaded ATR Tank Robot Platform, to prove the effectiveness of our approach.

The neural identifier is based on the RHONN series–parallel model and it is used to get a mathematical model of a plant even with the presence of changing parameters, due to its on-line EKF-based training that adjusts the neural network weights during all the execution of the application.¹⁰ This RHONN identifier has proven to be an effective identifier for real-time applications on systems where the knowledge about the plant is unknown or insufficient or where their parameters change during operation;^11–14 besides, it presents good results even with time delays.¹⁵

On the other hand, the objective of optimal control is to determine the control signals which will force a process to satisfy physical constraints and at the same time minimize a performance criterion.¹⁶ However, this requires solving the associated Hamilton–Jacobi–Bellman (HJB) equation, which is not an easy task. This work uses the inverse approach, in which solving the HJB equation is avoided; in this approach, a stabilizing feedback control is developed first and then it is established that this control optimizes a cost functional.¹⁷

Inverse optimal control is a Lyapunov function–based method, as are backstepping^18–20 and H_∞-based methods.^21,22 These methods yield adaptive and robust controllers that work well with uncertainties; however, their design process needs a model of the system to be controlled. In the proposed RHONN identifier – inverse optimal control scheme, the identified model is used in the control design. In this way, the main advantage of such a controller is that it does not require previous knowledge of the model of the system to be controlled.

Other advanced control methods for mechatronic systems that address trajectory tracking and synchronization problems are vibration isolation for active suspensions with performance constraints and actuator saturation²³ and integrated adaptive robust control for multilateral teleoperation systems under arbitrary time delays.²⁴ However, there are no reported applications of these methods to tracked robots; furthermore, they need a model of the system to be controlled.

In this article, we present the following:

The implementation of the identifier-control scheme for a simulated tracked robot;

A comparison of the proposed scheme with another method²⁵ based on discrete-time super twisting;

A real-time implementation of the scheme for a modified HD2 Treaded ATR Tank Robot Platform with wireless communication, which can induce time delays.

In this way, the main contributions of this work are as follows:

A control scheme which allows the presence of disturbances, noise and delays and that does not require the knowledge of the plant to be controlled;

A comparison of the proposed method with another method;

A real-time implementation of the proposed scheme with a system with time delays.

This work is organized in the following order: first, we present the model of all-terrain tracked robots; second, the basic concepts of neural networks are introduced together with the RHONN series–parallel model and the EKF-based training algorithm used in this work; third, neural identification process using RHONN and inverse optimal control is described; after that, applicability to a tracked all-terrain robot is described followed by the results of simulation and real-time test and finally, important conclusions are included.

Tracked all-terrain robot

The kinematics of an electrically driven tracked robot is described by the state-space model (1)^6,26

\begin{array}{l} {\dot{x}}_{1} = J (x_{1}) x_{2} \\ {\dot{x}}_{2} = M^{- 1} (- C ({\dot{x}}_{1}) x_{2} - D x_{2} - τ_{d} + N K_{T} x_{3}) \\ {\dot{x}}_{3} = L_{a}^{- 1} (u - R_{a} x_{3} - N K_{E} x_{2}) \end{array}

(1)

Each subsystem of model (1) is defined in (2).

\begin{matrix} x_{1} = [x_{11}, x_{12}, x_{13}]^{T} = [x, y, θ]^{T} \\ x_{2} = [x_{21}, x_{22}]^{T} = [v_{1}, v_{2}]^{T} \\ x_{3} = [x_{31}, x_{32}]^{T} = [i_{a_{1}}, i_{a_{2}}]^{T} \\ u = [u_{1}, u_{2}]^{T} \end{matrix}

(2)

where x and y are the coordinates of $P_{0}$ , $θ$ is the heading angle of the robot (Figure 1), v₁ and v₂ are the angular velocities of the robot, $i_{a_{1}}$ and $i_{a_{2}}$ are the motor currents, u₁ and u₂ are the input voltages and $x_{3}$ describes the actuator dynamics.

where

J (x_{1}) = 0.5 r [\begin{matrix} \cos (x_{13}) & \cos (x_{13}) \\ \sin (x_{13}) & \sin (x_{13}) \\ R^{- 1} & - R^{- 1} \end{matrix}]

(3)

M = [\begin{matrix} m_{11} & m_{12} \\ m_{12} & m_{11} \end{matrix}]

(4)

N = [\begin{matrix} n_{1} & 0 \\ 0 & n_{2} \end{matrix}]

(5)

K_{T} = [\begin{matrix} K_{t_{1}} & 0 \\ 0 & K_{t_{2}} \end{matrix}]

(6)

L_{a} = [\begin{matrix} l_{a_{1}} & 0 \\ 0 & l_{a_{2}} \end{matrix}]

(7)

R_{a} = [\begin{matrix} r_{a_{1}} & 0 \\ 0 & r_{a_{2}} \end{matrix}]

(8)

K_{E} = [\begin{matrix} K_{e_{1}} & 0 \\ 0 & K_{e_{2}} \end{matrix}]

(9)

where R is half of the width of the tracked robot and r is the radius of the wheels which drive the tracks. M is the symmetric, positive-definite inertia matrix determined by the physical parameters of the tracked robot, $K_{T}$ is the motor torque constant, $L_{a}$ is the inductance, $K_{E}$ is the back electromotive force coefficient and $R_{a}$ is the resistance of the actuator.
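The kinematic part of model (1), together with the matrix in equation (3), can be illustrated with a short forward-Euler simulation. This is only a sketch: the function names and the values chosen for r, R and the sample time are hypothetical and are not taken from the article.

```python
import math

def jacobian(theta, r, R):
    """J(x1) from equation (3): maps the two velocities in x2 to pose rates."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [0.5 * r * c, 0.5 * r * c],
        [0.5 * r * s, 0.5 * r * s],
        [0.5 * r / R, -0.5 * r / R],
    ]

def euler_pose_step(pose, v, r, R, T):
    """One forward-Euler step of x1_dot = J(x1) x2 with sample time T."""
    J = jacobian(pose[2], r, R)
    return [p + T * (row[0] * v[0] + row[1] * v[1]) for p, row in zip(pose, J)]

# Hypothetical values (assumptions for illustration): r = 0.1 m, R = 0.25 m.
pose = [0.0, 0.0, 0.0]                      # [x, y, theta]
for _ in range(1000):                       # 1000 steps of T = 0.001 s
    pose = euler_pose_step(pose, [2.0, 2.0], r=0.1, R=0.25, T=0.001)
print(pose)
```

With equal velocities in x2 the heading rate in the third row cancels, so the sketch drives the robot along a straight line; unequal velocities would produce a turn.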

Figure 1.

Schematic model of a tracked mobile robot, where x, y are the coordinates of $P_{0}$ and $θ$ is the heading angle of the mobile robot.

Neural networks

Artificial neural networks also known as neural networks are massively parallel distributed processors built of a massive interconnection of simple computing units called neurons; they are designed to model the way in which the brain performs a task or a function;²⁷ in other words, artificial neural networks are simplified models of the biological neural networks which we can implement in software or hardware.^27–29 Neural network computing power comes from its massive interconnection and its ability to learn and generalize.²⁷

According to their architecture, neural networks can be classified as follows:^10,27,28

Static neural networks. These kinds of neural networks are capable of approximating any function using a static mapping.

Dynamic neural networks. These types of neural networks have feedback connections which give them higher capability than static neural networks. Due to their feedback connection, dynamic neural networks are capable of capturing the dynamic response of a system.

Neural networks are usually implemented using electronic components or simulated in software.²⁷

RHONN

RHONNs are dynamic neural networks capable of capturing the dynamic response of complex non-linear systems^10,30 and of systems with time delay.¹⁵ This capability stems from a flexible model that allows a priori information about the system to be incorporated, from their approximation capabilities, robustness against noise and on-line training, and from the dynamical behaviour produced by their recurrent connections, which improves their learning capabilities and performance.^10,30 RHONNs result from adding high-order interactions, represented by triplets ( $y_{i} y_{j} y_{k}$ ), quadruplets ( $y_{i} y_{j} y_{k} y_{l}$ ) and so on, to the first-order Hopfield model.^10,30

The RHONN model used in this work is the series–parallel model,¹⁰ which is defined as

{\hat{χ}}_{i} (k + 1) = ω_{i}^{T} z_{i} (x (k), u (k)), i = 1, \dots, n

(10)

with

z_{i} (x (k), u (k)) = [\begin{matrix} z_{i_{1}} \\ z_{i_{2}} \\ ⋮ \\ z_{i_{L_{i}}} \end{matrix}] = [\begin{matrix} Π_{j \in I_{1}} ξ_{i_{j}}^{d_{ij} (1)} \\ Π_{j \in I_{2}} ξ_{i_{j}}^{d_{ij} (2)} \\ ⋮ \\ Π_{j \in I_{L_{i}}} ξ_{i_{j}}^{d_{ij} (L_{i})} \end{matrix}]

(11)

ξ_{i} = [\begin{matrix} ξ_{i_{1}} \\ ⋮ \\ ξ_{i_{n}} \\ ξ_{i_{n + 1}} \\ ⋮ \\ ξ_{i_{n + m}} \end{matrix}] = [\begin{matrix} S (x_{1}) \\ ⋮ \\ S (x_{n}) \\ u_{1} \\ ⋮ \\ u_{m} \end{matrix}]

(12)

S (ς) = \frac{1}{1 + \exp (- β ς)}, β > 0

(13)

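where n is the state dimension, m is the number of inputs, $L_{i}$ is the number of high-order connections of the ith neuron, $\{I_{1}, I_{2}, \dots, I_{L_{i}}\}$ is a collection of non-ordered subsets of $\{1, 2, \dots, n + m\}$ , $d_{ij} (\cdot)$ are non-negative integers, $\hat{χ}$ is the state vector of the neural network, ω is the weight vector, x is the plant state vector and $u = [u_{1}, u_{2}, \dots, u_{m}]^{T}$ is the input vector to the neural network.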
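Equations (10)–(13) can be read as a small computation: build the vector ξ from sigmoids of the states and the raw inputs, form each high-order term as a product of powers of its entries, and take the inner product with the weights. The following minimal sketch assumes a hypothetical neuron with two states and one input; the dictionary encoding of the index sets and exponents is an illustrative choice, not the article's implementation.

```python
import math

def sigmoid(s, beta=1.0):
    """Equation (13): S(s) = 1 / (1 + exp(-beta * s)), with beta > 0."""
    return 1.0 / (1.0 + math.exp(-beta * s))

def regressor(x, u, terms, beta=1.0):
    """Equation (11): each entry of z_i is a product of powers of xi entries.

    `terms` is a list of dicts {index_in_xi: exponent}; this encoding of the
    index sets I_j and exponents d_ij is an assumption made for illustration.
    """
    xi = [sigmoid(v, beta) for v in x] + list(u)    # equation (12)
    z = []
    for term in terms:
        prod = 1.0
        for j, d in term.items():
            prod *= xi[j] ** d
        z.append(prod)
    return z

def rhonn_predict(w, z):
    """Equation (10): one-step-ahead prediction w^T z."""
    return sum(wj * zj for wj, zj in zip(w, z))

# Hypothetical neuron with n = 2 states and m = 1 input.
# Terms: S(x1), S(x2), S(x1)*S(x2), u1
terms = [{0: 1}, {1: 1}, {0: 1, 1: 1}, {2: 1}]
z = regressor(x=[0.0, 0.0], u=[0.5], terms=terms)
chi_next = rhonn_predict([1.0, 1.0, 1.0, 1.0], z)
print(z, chi_next)
```

Because the prediction is linear in the weights, the regressor `z` is also the gradient of the prediction with respect to the weights, which is what the EKF-based training uses.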

Neural network training

Neural network training is a process in which the neural network learns a task; this training can be on-line or off-line.^27,28 The most common training algorithms for static neural networks and dynamic neural networks are backpropagation and backpropagation through time learning, respectively.^27,28,31

Kalman filter

The Kalman filter estimates the state of a linear system with additive state and output white noise using a recursive solution in which each update of the state is estimated from the previous estimated state and the new input data.^10,31

Kalman filter training

Kalman filter training for neural networks offers advantages such as a reduction in the number of epochs and of required neurons, on-line and off-line training implementation and improved learning convergence; it is also more computationally efficient than the most commonly used backpropagation methods.^10,31 Moreover, it has proven to be reliable and practical.^10,27

The training goal is to find the optimal weight vector which minimizes the prediction error. Due to the fact that the neural network mapping is non-linear, EKF is required.

EKF-based training algorithm

The EKF-based training algorithm estimates the neural network weights, which become the state to be estimated; the error between the measured output of the plant and the output of the neural network is considered as additive white noise.^10,31

The EKF-based training algorithm¹⁰ for the RHONN series–parallel model (10) is given by equation (14)

\begin{matrix} ω_{i} (k + 1) & = ω_{i} (k) + η_{i} K_{i} (k) e_{i} (k) \\ K_{i} (k) & = P_{i} (k) H_{i} (k) M_{i} (k) \\ P_{i} (k + 1) & = P_{i} (k) - K_{i} (k) H_{i}^{T} (k) P_{i} (k) + Q_{i} (k) \\ M_{i} (k) & = {[R_{i} (k) + H_{i}^{T} (k) P_{i} (k) H_{i} (k)]}^{- 1} \end{matrix}

(14)

with

e_{i} (k) = x_{i} (k) - {\hat{χ}}_{i} (k)

(15)

H_{ij} = {[\frac{\partial {\hat{χ}}_{i} (k)}{\partial ω_{ij} (k)}]}^{T}, i = 1, \dots, n

(16)

where $ω_{i} \in ℜ^{L_{i}}$ is the on-line adapted weight vector, $η_{i}$ is a design parameter that scales the weight update (learning rate), $K_{i} \in ℜ^{L_{i}}$ is the Kalman gain vector, $e_{i} \in ℜ$ is the identification error, $P_{i} \in ℜ^{L_{i} \times L_{i}}$ is the weight estimation error covariance matrix, $χ_{i}$ is the ith state variable of the neural network, $Q_{i} \in ℜ^{L_{i} \times L_{i}}$ is the estimation noise covariance matrix, $R_{i} \in ℜ$ is the error noise covariance matrix and $H_{i} \in ℜ^{L_{i}}$ is a vector in which each entry $H_{ij}$ is the derivative of the neural network state $({\hat{χ}}_{i})$ with respect to one neural network weight $(ω_{ij})$ and it is given by equation (16). $P_{i}$ and $Q_{i}$ are initialized as diagonal matrices with entries $P_{i} (0)$ and $Q_{i} (0)$ , respectively. It is important to remark that $H_{i} (k)$ , $K_{i} (k)$ and $P_{i} (k)$ for the EKF are bounded.³²
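One iteration of the EKF-based training in equation (14) can be sketched for a single neuron with scalar output. Because model (10) is linear in the weights, the vector H in equation (16) reduces to the regressor z. The target weights, the regressor sequence and the values of the learning rate and covariances below are hypothetical choices for a toy problem, not the settings used in the article.

```python
import math

def ekf_step(w, P, H, error, eta, q, r):
    """One update of equation (14) for one neuron with a scalar output.

    w: weights, P: covariance matrix (list of lists), H: gradient vector,
    error: identification error e(k), q: diagonal entry of Q, r: scalar R.
    """
    L = len(w)
    PH = [sum(P[i][j] * H[j] for j in range(L)) for i in range(L)]
    M = 1.0 / (r + sum(H[i] * PH[i] for i in range(L)))     # scalar inverse
    K = [ph * M for ph in PH]                                # Kalman gain
    w_new = [w[i] + eta * K[i] * error for i in range(L)]
    HtP = [sum(H[i] * P[i][j] for i in range(L)) for j in range(L)]
    P_new = [[P[i][j] - K[i] * HtP[j] + (q if i == j else 0.0)
              for j in range(L)] for i in range(L)]
    return w_new, P_new

# Toy identification problem (all values are assumptions for illustration).
true_w = [0.8, 0.3]
w = [0.0, 0.0]
P = [[10.0, 0.0], [0.0, 10.0]]
for k in range(300):
    z = [math.sin(0.1 * k), math.cos(0.37 * k)]
    target = sum(t * zj for t, zj in zip(true_w, z))
    estimate = sum(wj * zj for wj, zj in zip(w, z))
    w, P = ekf_step(w, P, z, target - estimate, eta=1.0, q=0.01, r=1.0)
print(w)
```

Keeping the output scalar per neuron, as in equation (14), means the matrix inverse collapses to a division, which is one reason this per-neuron formulation is convenient for on-line use.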

Neural identification

Neural identification consists in selecting an appropriate neural network model and adjusting its weights according to an adaptation law, so that the neural network approximates the real system response for the same input.³³ Now let us consider the following non-linear discrete time-delay multiple input, multiple output (MIMO) system described by

\begin{matrix} x (k + 1) & = F (x (k - l), u (k)) \\ y (k) & = h (x (k)) \end{matrix}

(17)

where $x \in ℜ^{n}$ , $u \in ℜ^{m}$ , $F : ℜ^{n} \times ℜ^{m} \to ℜ^{n}$ is a non-linear function and $l = 1, 2, \dots$ is the unknown delay.

We use a model with time delay because the wireless communication with the tracked robot could induce some delays. The series–parallel model (10) can be modified to accept the state vector of a plant with time delays;¹⁵ therefore, we get the following model

{\hat{χ}}_{i} (k + 1) = ω_{i}^{T} (k) z_{i} (x (k - l), u (k)), i = 1, 2, \dots, n

(18)

This model is semi-globally uniformly ultimately bounded and the proof can be found in Alanis et al.¹⁵ The RHONN series–parallel model (18) is selected to identify model (17), and the EKF-based algorithm (14) is used as the adaptation law.
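For simulation studies, the delayed state x(k − l) that enters models (17) and (18) can be emulated with a fixed-length buffer. The sketch below assumes a known, constant delay purely for illustration; in the article the delay is unknown and comes from the wireless link. The class name `DelayedState` is hypothetical.

```python
from collections import deque

class DelayedState:
    """Keeps the last l + 1 samples so that x(k - l) can be read at step k."""

    def __init__(self, delay, x0):
        # Before the first sample arrives, the buffer is filled with x0.
        self.buf = deque([list(x0)] * (delay + 1), maxlen=delay + 1)

    def push(self, x):
        """Store the newest sample x(k); the oldest one is dropped."""
        self.buf.append(list(x))

    def delayed(self):
        """Return x(k - l), the oldest sample still in the buffer."""
        return self.buf[0]

# Emulate a delay of l = 2 samples on a scalar state (assumed values).
line = DelayedState(delay=2, x0=[0.0])
seen = []
for k in range(1, 6):
    line.push([float(k)])
    seen.append(line.delayed()[0])
print(seen)
```

The identifier would then build its regressor from `line.delayed()` instead of the current state, mirroring the change from model (10) to model (18).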

Inverse optimal control

Consider the following affine discrete non-linear system (19)

x (k + 1) = f (x (k)) + g (x (k)) u (k)

(19)

where $x \in ℜ^{n}$ is the state of the system, $u \in ℜ^{m}$ is the control input and $f : ℜ^{n} \to ℜ^{n}$ and $g : ℜ^{n} \to ℜ^{n \times m}$ are smooth maps. System (19) is supposed to have an equilibrium point at $x = 0$ . Moreover, the full state $x (k)$ is assumed to be available.

For the inverse optimal control, a control Lyapunov function is designed to satisfy the passivity condition, which states that a passive system can be stabilized by negative feedback from the output, $u (k) = - α y (k)$ with $α > 0$ . Equation (20) is proposed as a control Lyapunov function^17,34 to ensure stability of system (19)

V (x (k)) = \frac{1}{2} x (k)^{T} Px (k), P = P^{T} > 0

(20)

Instead of solving the associated HJB equation, the inverse optimal control synthesis is based on the knowledge of $V (x (k))$ . The inverse optimal control law for system (19) with equation (20) is equation (21)

\begin{array}{l} u (k) = - \frac{1}{2} R^{- 1} (x (k)) g^{T} (x (k)) \frac{\partial V (x (k + 1))}{\partial x (k + 1)} \\ = - \frac{1}{2} {(R (x (k)) + \frac{1}{2} g^{T} (x (k)) P g (x (k)))}^{- 1} \\ \times g^{T} (x (k)) P f (x (k)) \end{array}

(21)

where $R (x (k)) = R (x (k))^{T} > 0$ is a matrix whose elements can be functions of the system state or can be fixed. P is a matrix such that inequality (22) holds

\begin{matrix} V_{f} (x (k)) - \frac{1}{4} P_{1}^{T} (x (k)) {(RP (x (k)))}^{- 1} P_{1} (x (k)) \\ \leq - x^{T} (k) Qx (k) \end{matrix}

(22)

with

RP (x (k)) = R (x (k)) + P_{2} (x (k))

(23)

V_{f} (x (k)) = \frac{1}{2} f^{T} (x (k)) Pf (x (k)) - V (x (k))

(24)

P_{1} (x (k)) = g^{T} (x (k)) Pf (x (k))

(25)

P_{2} (x (k)) = \frac{1}{2} g^{T} (x (k)) Pg (x (k))

(26)

Q = Q^{T} > 0

(27)

In Sanchez and Ornelas-Tellez,¹⁷ it is demonstrated that control law (21) renders the equilibrium point of system (19) globally asymptotically stable. Moreover, equation (21) is inverse optimal in the sense that it minimizes a cost functional.¹⁷
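The step between the implicit and the explicit form of control law (21) can be made explicit. The following short derivation uses only system (19) and the quadratic function (20) evaluated at x(k + 1); the arguments of f, g and R are dropped for brevity.

```latex
\begin{aligned}
\frac{\partial V(x(k+1))}{\partial x(k+1)} &= P\,x(k+1) = P\,[\,f + g\,u(k)\,] \\
u(k) &= -\tfrac{1}{2} R^{-1} g^{T} P\,[\,f + g\,u(k)\,] \\
\left[\,I + \tfrac{1}{2} R^{-1} g^{T} P g\,\right] u(k) &= -\tfrac{1}{2} R^{-1} g^{T} P f \\
u(k) &= -\tfrac{1}{2} \left[\,R + \tfrac{1}{2} g^{T} P g\,\right]^{-1} g^{T} P f
\end{aligned}
```

The last line follows by multiplying the third line on the left by R, which is invertible because it is positive definite.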

Application to a tracked mobile robot

Figure 2 shows the closed loop of the neural identifier – inverse optimal control scheme.

Figure 2.

Control scheme.

Neural identification of a tracked robot

The discrete-time RHONN identifier (28) is proposed to get a valid mathematical model for tracked mobile robots

\begin{matrix} {\hat{χ}}_{1} (k + 1) = ω_{11} (k) S (x_{11} (k)) + ω_{12} (k) S (x_{12} (k)) \\ + ω_{13} (k) S (x_{13} (k)) + ω_{14} (k) x_{21} (k) \\ + ω_{15} (k) x_{22} (k) \\ {\hat{χ}}_{2} (k + 1) = ω_{21} (k) S (x_{11} (k)) + ω_{22} (k) S (x_{12} (k)) \\ + ω_{23} (k) S (x_{13} (k)) + ω_{24} (k) x_{21} (k) \\ + ω_{25} (k) x_{22} (k) \\ {\hat{χ}}_{3} (k + 1) = ω_{31} (k) S (x_{11} (k)) + ω_{32} (k) S (x_{12} (k)) \\ + ω_{33} (k) S (x_{13} (k)) + ω_{34} (k) x_{21} (k) \\ + ω_{35} (k) x_{22} (k) \\ {\hat{χ}}_{4} (k + 1) = ω_{41} (k) S (x_{11} (k)) + ω_{42} (k) S (x_{12} (k)) \\ + ω_{43} (k) S (x_{21} (k)) + ω_{44} (k) S (x_{31} (k)) \\ + ω_{45} (k) x_{31} (k) \\ {\hat{χ}}_{5} (k + 1) = ω_{51} (k) S (x_{11} (k)) + ω_{52} (k) S (x_{12} (k)) \\ + ω_{53} (k) S (x_{22} (k)) + ω_{54} (k) S (x_{32} (k)) \\ + ω_{55} (k) x_{32} (k) \\ {\hat{χ}}_{6} (k + 1) = ω_{61} (k) S (x_{11} (k)) + ω_{62} (k) S (x_{12} (k)) \\ + ω_{63} (k) S (x_{21} (k)) + ω_{64} (k) S (x_{31} (k)) \\ + ω_{65} (k) u_{1} (k) \\ {\hat{χ}}_{7} (k + 1) = ω_{71} (k) S (x_{11} (k)) + ω_{72} (k) S (x_{12} (k)) \\ + ω_{73} (k) S (x_{22} (k)) + ω_{74} (k) S (x_{32} (k)) \\ + ω_{75} (k) u_{2} (k) \end{matrix}

(28)

where ${\hat{χ}}_{1}$ , ${\hat{χ}}_{2}$ , ${\hat{χ}}_{3}$ , ${\hat{χ}}_{4}$ , ${\hat{χ}}_{5}$ , ${\hat{χ}}_{6}$ and ${\hat{χ}}_{7}$ identify x, y, $θ$ , v₁, v₂, $i_{a_{1}}$ and $i_{a_{2}}$ , respectively. Moreover, it is worth mentioning that this model includes the actuator dynamics.

RHONN identifier (28) is adapted on-line using the EKF-based training algorithm (14). All the neural network states and weights are initialized in a random way.

Inverse optimal control of a tracked robot

The control objective is the design of a control law u to track the desired trajectory generated by the following reference robot

\begin{array}{l} {\dot{x}}_{r} = v_{r} \cos (θ_{r}) \\ {\dot{y}}_{r} = v_{r} \sin (θ_{r}) \\ {\dot{θ}}_{r} = ω_{r} \end{array}

(29)

where x_r, y_r and $θ_{r}$ are the position and orientation of the reference robot. v_r and $ω_{r}$ are the linear and angular velocities of the reference robot, respectively.
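The reference robot (29) can be turned into a discrete-time trajectory generator with a forward-Euler step, which is the discretization the article applies to this system. In the sketch below the function name and the constant velocity references are hypothetical; the zero initial posture matches the one stated for the tests.

```python
import math

def reference_step(state, v_r, w_r, T):
    """Forward-Euler step of the reference robot (29) with sample time T."""
    x, y, th = state
    return [x + T * v_r * math.cos(th),
            y + T * v_r * math.sin(th),
            th + T * w_r]

# Hypothetical constant references: v_r = 0.1, w_r = 0.5, T = 0.001 s.
T = 0.001
ref = [0.0, 0.0, 0.0]            # initial posture x_r = y_r = theta_r = 0
trajectory = [ref]
for _ in range(1000):
    ref = reference_step(ref, v_r=0.1, w_r=0.5, T=T)
    trajectory.append(ref)
print(trajectory[-1])
```

With a constant non-zero angular velocity the generated path is an arc; time-varying velocity profiles such as those plotted in the results would simply be passed in step by step.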

The design of a controller based on model (1) requires the exact knowledge of the plant parameters and disturbances which can vary with time; since in practice getting this model is not a trivial task, the identified model (28) is used as a valid model for the tracked robots, and therefore, the identified model (28) is used to design the controller for solving the trajectory tracking problem.

Systems (1) and (29) are discretized based on the Euler methodology. The identified model (28) is rewritten in the block structure form (30) to simplify the controller synthesis

\begin{matrix} {\hat{χ}}_{a} (k + 1) = ω_{1} (k) z_{a} (k) + ω_{a} (k) x_{b} (k) \\ {\hat{χ}}_{b} (k + 1) = ω_{2} (k) z_{b} (k) + ω_{b} (k) x_{c} (k) \\ {\hat{χ}}_{c} (k + 1) = ω_{3} (k) z_{c} (k) + ω_{c} (k) u_{2} (k) \end{matrix}

(30)

where

\begin{matrix} {\hat{χ}}_{a} & = [{\hat{χ}}_{1}, {\hat{χ}}_{2}, {\hat{χ}}_{3}]^{T} \\ {\hat{χ}}_{b} & = [{\hat{χ}}_{4}, {\hat{χ}}_{5}]^{T} \\ {\hat{χ}}_{c} & = [{\hat{χ}}_{6}, {\hat{χ}}_{7}]^{T} \end{matrix}

(31)

\begin{matrix} x_{b} = x_{2} \\ x_{c} = x_{3} \end{matrix}

(32)

with

ω_{1} (k) = [\begin{matrix} ω_{11} & ω_{12} & ω_{13} \\ ω_{21} & ω_{22} & ω_{23} \\ ω_{31} & ω_{32} & ω_{33} \end{matrix}]

(33)

ω_{a} (k) = [\begin{matrix} ω_{14} & ω_{15} \\ ω_{24} & ω_{25} \\ ω_{34} & ω_{35} \end{matrix}]

(34)

ω_{2} (k) = [\begin{matrix} ω_{41} & ω_{42} & ω_{43} & ω_{44} \\ ω_{51} & ω_{52} & ω_{53} & ω_{54} \end{matrix}]

(35)

ω_{b} (k) = [\begin{matrix} ω_{45} & 0 \\ 0 & ω_{55} \end{matrix}]

(36)

ω_{3} (k) = [\begin{matrix} ω_{61} & ω_{62} & ω_{63} & ω_{64} \\ ω_{71} & ω_{72} & ω_{73} & ω_{74} \end{matrix}]

(37)

ω_{c} (k) = [\begin{matrix} ω_{65} & 0 \\ 0 & ω_{75} \end{matrix}]

(38)

In this way, the control objective is to force x_a to track the reference signal $x_{a δ} (k + 1) = [x_{r}, y_{r}, θ_{r}]$ , which is achieved by designing a control law $x_{b} (k) = u_{1} (k)$ based on an inverse optimal approach for discrete-time non-linear affine systems.¹⁷ Also, $x_{b}$ is forced to track this control law, which is achieved by designing a control law as in equation (39)

x_{c} = ω_{b}^{- 1} (- ω_{2} (k) z_{b} (k) + u_{1} (k))

(39)

Therefore, the reference signal for the control law $u_{2}$ is $x_{c δ} (k + 1) = x_{c}$ . Thus, the inverse optimal control laws are defined as

u_{i} (k) = - {[I_{m} + J_{i} (x (k))]}^{- 1} h_{i} (x (k), x_{δ} (k + 1))

(40)

with

h_{i} (x_{i} (k), x_{i δ} (k + 1)) = g_{i}^{T} (x (k)) P_{i} (f_{i} (x_{i} (k)) - x_{i δ} (k + 1))

(41)

J_{i} (x (k)) = \frac{1}{2} g_{i}^{T} (x (k)) P_{i} g_{i} (x (k))

(42)

where $i = 1, 2$ .
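Equations (40)–(42) can be evaluated directly for a block with two inputs. The sketch below implements them as printed, using small hand-written matrix helpers so that it needs only the standard library. The helper names and the numerical values of f and the reference are hypothetical; the matrices g and P in the example are the identity-based ones the article uses for the second block in its simulation test.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def solve2(A, b):
    """Solve a 2x2 linear system A y = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def inverse_optimal_u(f, g, P, x_ref):
    """u = -(I + J)^-1 h, with h from (41) and J from (42); two inputs."""
    gTP = matmul(transpose(g), P)
    h = matvec(gTP, [fi - ri for fi, ri in zip(f, x_ref)])      # (41)
    J = [[0.5 * v for v in row] for row in matmul(gTP, g)]      # (42)
    A = [[1.0 + J[0][0], J[0][1]], [J[1][0], 1.0 + J[1][1]]]    # I_m + J
    y = solve2(A, h)
    return [-y[0], -y[1]]                                       # (40)

g2 = [[1.0, 0.0], [0.0, 1.0]]
P2 = [[20.0, 0.0], [0.0, 20.0]]
# Hypothetical drift term f and reference for the block (assumed values).
u = inverse_optimal_u(f=[1.0, 0.0], g=g2, P=P2, x_ref=[0.0, 0.0])
print(u)
```

For the first block, g would be the 3 × 2 matrix built from the kinematics and P a 3 × 3 matrix; the same function applies because I + J is still 2 × 2.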

Results

This section presents the simulation and real-time results. The simulations of model (1), RHONN identifier (28) and control (40) were implemented in MATLAB and Simulink^® (MathWorks, Inc.) software. On the other hand, in the real-time tests, the block that represents the model of the robot was removed and replaced with a block where the communication with HD2 was implemented.

In this way, it was demonstrated that the same RHONN identifier is capable of identifying both the models, the one used for simulation and the one for HD2. Also, due to the fact that the control was designed using the model of RHONN identifier, it is valid for both simulation and real-time tests. The parameters for RHONN identifier for all tests are shown in Table 1.

Table 1.

EKF-based training algorithm parameters.

i	$P_{i} (0)$	$Q_{i} (0)$	$R_{i}$
1	$diag (3) \times 10^{6}$	$diag (3) \times 10^{6}$	$10^{4}$
2	$diag (3) \times 10^{6}$	$diag (3) \times 10^{6}$	$10^{4}$
3	$diag (3) \times 10^{6}$	$diag (3) \times 10^{6}$	$10^{4}$
4	$diag (4) \times 10^{4}$	$diag (4) \times 10^{6}$	$10^{3}$
5	$diag (4) \times 10^{4}$	$diag (4) \times 10^{6}$	$10^{3}$
6	$diag (4) \times 10^{4}$	$diag (4) \times 10^{6}$	$10^{3}$
7	$diag (4) \times 10^{4}$	$diag (4) \times 10^{6}$	$10^{3}$

Moreover, the following weights are fixed: $ω_{45} = 1$ , $ω_{55} = 1$ , $ω_{65} = 0.00001$ and $ω_{75} = 0.00001$ . The initial posture of the reference robot is $x_{r} = 0$ , $y_{r} = 0$ and $θ_{r} = 0$ .

Simulation

Simulation test

The parameters for the test are set as follows

P_{1} (k) = 14400 [\begin{matrix} 162 & 1 & 2 \\ 1 & 162 & 3 \\ 2 & 3 & 162 \end{matrix}]

(43)

P_{2} (k) = 20 [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}]

(44)

g_{1} = 0.5 rT [\begin{matrix} \cos (x_{13}) & \cos (x_{13}) \\ \sin (x_{13}) & \sin (x_{13}) \\ R^{- 1} & - R^{- 1} \end{matrix}]

(45)

g_{2} = [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}]

(46)

T = 0.001 s

(47)

where T is the sample time.

Figures 3 and 4 show the references for linear and angular velocities, respectively.

Figure 3.

Linear velocity reference for simulation test.

Figure 4.

Angular velocity reference for simulation test.

Figures 5–7 show the reference, real and identified signals of x, y and $θ$ , respectively.

Figure 5.

Real, identified and reference signals of x of simulation test.

Figure 6.

Real, identified and reference signals of y of simulation test.

Figure 7.

Real, identified and reference signals of $θ$ of simulation test.

The errors of the simulation test are shown in Figure 8 and Table 2. Figure 8 shows the errors of reference versus simulated real signals; a comparison between reference signals and identified signals is omitted because the error between the real and identified signal is so small that a second figure showing the reference versus identified signal would look just like Figure 8.

Figure 8.

Tracking errors of simulation test.

Table 2.

RMSE of simulation test tracking errors of x, y and $θ$ .

	x	y	$θ$
Real signals
RMSE	0.0239	0.0189	0.0128
Identified signals
RMSE	0.0241	0.0211	0.0134

RMSE: root mean square error.

Figures 9 and 10 show the velocities v₁ and v₂, respectively.

Figure 9.

Real and identified signals of v₁ of simulation test.

Figure 10.

Real and identified signals of v₂ of simulation test.

Figures 11 and 12 show currents i₁ and i₂, respectively.

Figure 11.

Real and identified signals of i₁ of simulation test.

Figure 12.

Real and identified signals of i₂ of simulation test.

Figures 13 and 14 show control signals u₁ and u₂, respectively.

Figure 13.

Control signal u₁ of simulation test.

Figure 14.

Control signal u₂ of simulation test.

Table 3 shows the root mean square error (RMSE) of the real versus identified signals.

Table 3.

RMSE of simulated real versus identified state variables of simulation test.

i	RMSE
1	$0.0031$
2	$0.0049$
3	$0.0029$
4	$0.0088$
5	$0.0068$
6	$0.0767$
7	$0.0128$

RMSE: root mean square error.
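The RMSE values reported in the tables follow the usual definition: the square root of the mean of the squared differences between two sampled signals. A minimal sketch, with a hypothetical function name and made-up sample values:

```python
import math

def rmse(signal_a, signal_b):
    """Root mean square error between two equally long sampled signals."""
    if len(signal_a) != len(signal_b) or not signal_a:
        raise ValueError("signals must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(signal_a, signal_b)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up samples standing in for a real and an identified signal.
real = [0.0, 0.1, 0.2, 0.3]
identified = [0.0, 0.1, 0.2, 0.3]
print(rmse(real, identified))
```

The same function covers both uses in the article: reference versus real (tracking error) and real versus identified (identification error).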

Performance comparison

The following simulation results compare the proposed neural identifier – inverse optimal control scheme with the super twisting control proposed in Lopez-Franco et al.,²⁵ which is a discrete-time control algorithm for nonholonomic wheeled mobile robots that does not require previous knowledge of the plant model or its parameters. The super twisting control was adapted from the three-state model presented in Lopez-Franco et al.²⁵ to our seven-state model. For comparison, both controllers use the same references, shown in Figures 15 and 16.

Figure 15.

Linear velocity reference for the performance comparison test.

Figure 16.

Angular velocity reference for the performance comparison test.

The tracking performance for the neural identifier – inverse optimal control scheme is shown in Figures 17–19, for x, y and $θ$ , respectively. The tracking performance for super twisting is shown in Figures 20–22, for x, y and $θ$ , respectively.

Figure 17.

Real, identified and reference signals of x with neural identifier: inverse optimal control.

Figure 18.

Real, identified and reference signals of y with neural identifier: inverse optimal control.

Figure 19.

Real, identified and reference signals of $θ$ with neural identifier: inverse optimal control.

Figure 20.

Real and reference signals of x with super twisting.

Figure 21.

Real and reference signals of y with super twisting.

Figure 22.

Real and reference signals of $θ$ with super twisting.

For more details, Figures 23 and 24 show the errors and Table 4 shows the RMSEs of the tests.

Figure 23.

Tracking errors with neural identifier: inverse optimal control.

Figure 24.

Tracking errors with super twisting.

Table 4.

RMSE of simulation comparison test tracking errors of x, y and $θ$ .

	x	y	$θ$
Real signals – inverse optimal control
RMSE	0.0225	0.0348	0.0126
Identified signals – inverse optimal control
RMSE	0.0260	0.0362	0.0158
Real signals – super twisting
RMSE	0.0317	0.0036	0.0652

RMSE: root mean square error.

Figures 15 to 24 and Table 4 show that the neural identifier – inverse optimal control scheme performs better than super twisting in states x and, especially, $θ$ , where the super twisting controller exhibits undesirable behaviour; in y, super twisting attains the lower RMSE (0.0036 versus 0.0348).

Real-time

Figure 25 shows the HD2 Treaded ATR Tank Robot Platform with an added wireless router.

Figure 25.

HD2® Treaded ATR Tank Robot Platform.

Figure 26 shows the inside of the HD2 Treaded ATR Tank Robot Platform, where the modification of the platform can be seen. Basically, the modification consists of replacing the original board with a system based on Arduino^® (Arduino LLC) and adding current sensors; Figure 26 also shows the original batteries and motors.

Figure 26.

HD2^® Treaded ATR Tank Robot inner components.

Real-time test 1

The parameters for the real-time test are set as follows

P_{1} (k) = 72000 [\begin{matrix} 162 & 1 & 2 \\ 1 & 162 & 3 \\ 2 & 3 & 162 \end{matrix}]

(48)

P_{2} (k) = 10000 [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}]

(49)

g_{1} = 0.5 rT [\begin{matrix} \cos (x_{13}) & \cos (x_{13}) \\ \sin (x_{13}) & \sin (x_{13}) \\ R^{- 1} & - R^{- 1} \end{matrix}]

(50)

g_{2} = [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}]

(51)

T = 0.003 s

(52)

Figures 27 and 28 show the references for linear and angular velocities, respectively.

Figure 27.

Linear velocity reference for real-time test 1.

Figure 28.

Angular velocity reference for real-time test 1.

Figures 29–31 show the reference, real and identified signals of x, y and $θ$ .

Figure 29.

Real, identified and reference signals of x of real-time test 1.

Figure 30.

Real, identified and reference signals of y of real-time test 1.

Figure 31.

Real, identified and reference signals of $θ$ of real-time test 1.

In contrast to the simulation results, the real-time results of Figures 29–31 do not achieve perfect tracking, mainly due to the following reasons:

WiFi communication and packet loss;

Actuators’ saturation;

Noise and precision of the sensors;

Unmodelled dynamics.

However, the real signals follow the same dynamics as the references, with the small errors shown in Figure 32 and Table 5; also, it is important to mention that this is accomplished without knowledge of the model or parameters of our modified tracked robot.

Figure 32.

Tracking errors of real-time test 1.

Table 5.

RMSE of real-time test 1 tracking errors of x, y and $θ$ .

	x	y	$θ$
Real signals
RMSE	0.0216	0.0098	0.0182
Identified signals
RMSE	0.0238	0.0101	0.0193

RMSE: root mean square error.

Figures 33 and 34 show velocities v₁ and v₂, respectively.

Figure 33.

Real and identified signals of $v_{1}$ of real-time test 1.

Figure 34.

Real and identified signals of $v_{2}$ of real-time test 1.

Figures 35 and 36 show currents i₁ and i₂, respectively.

Figure 35.

Real and identified signals of $i_{1}$ of real-time test 1.

Figure 36.

Real and identified signals of $i_{2}$ of real-time test 1.

Figures 37 and 38 show control signals u₁ and u₂, respectively.

Figure 37.

Control signal $u_{1}$ of real-time test 1.

Figure 38.

Control signal $u_{2}$ of real-time test 1.

Table 6 shows the RMSE of the real versus identified signals.

Table 6.

RMSE of real versus identified states of real-time test 1.

i	RMSE
1	$0.0071$
2	$0.0013$
3	$0.0035$
4	$0.0280$
5	$0.0204$
6	$0.1389$
7	$0.1309$

RMSE: root mean square error.

Real-time test 2

The parameters for this test are the same as the previous test. Figures 39 and 40 show the references for linear and angular velocities, respectively.

Figure 39.

Linear velocity reference for real-time test 2.

Figure 40.

Angular velocity reference for real-time test 2.

Figures 41–43 show the reference, real and identified signals of x, y and $θ$.

Figure 41.

Real, identified and reference signals of x of real-time test 2.

Figure 42.

Real, identified and reference signals of y of real-time test 2.

Figure 43.

Real, identified and reference signals of $θ$ of real-time test 2.

The errors of real-time test 2 are shown in Figure 44 and Table 7.

Figure 44.

Tracking errors of real-time test 2.

Table 7.

RMSE of real-time test 2 tracking errors of x, y and $θ$ .

Signals	x	y	$θ$
Real (RMSE)	0.0378	0.0101	0.0309
Identified (RMSE)	0.0388	0.0179	0.0342

RMSE: root mean square error.

Figures 45 and 46 show velocities $v_{1}$ and $v_{2}$, respectively.

Figure 45.

Real and identified signals of $v_{1}$ of real-time test 2.

Figure 46.

Real and identified signals of $v_{2}$ of real-time test 2.

Figures 47 and 48 show currents $i_{1}$ and $i_{2}$, respectively.

Figure 47.

Real and identified signals of $i_{1}$ of real-time test 2.

Figure 48.

Real and identified signals of $i_{2}$ of real-time test 2.

Figures 49 and 50 show control signals $u_{1}$ and $u_{2}$, respectively.

Figure 49.

Control signal $u_{1}$ of real-time test 2.

Figure 50.

Control signal $u_{2}$ of real-time test 2.

Table 8 shows the RMSE of the real versus identified signals.

Table 8.

RMSE of real versus identified states of real-time test 2.

State i	RMSE
1	0.0026
2	0.0146
3	0.0098
4	0.0343
5	0.0210
6	0.1510
7	0.1404

RMSE: root mean square error.
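The identification RMSE values of the two real-time tests can be compared state by state. The short sketch below does this arithmetic with the values copied from Tables 6 and 8; the helper name `compare` and the choice of difference and ratio as comparison measures are illustrative assumptions, not part of the original analysis.

```python
# RMSE of real versus identified states, copied from Tables 6 and 8.
rmse_test1 = [0.0071, 0.0013, 0.0035, 0.0280, 0.0204, 0.1389, 0.1309]
rmse_test2 = [0.0026, 0.0146, 0.0098, 0.0343, 0.0210, 0.1510, 0.1404]

def compare(test_a, test_b):
    """Return (state index, difference b - a, ratio b / a) for each state."""
    return [
        (i, b - a, b / a)
        for i, (a, b) in enumerate(zip(test_a, test_b), start=1)
    ]

comparison = compare(rmse_test1, rmse_test2)
for i, diff, ratio in comparison:
    print(f"state {i}: difference = {diff:+.4f}, ratio = {ratio:.2f}")

# Index (1-based) of the state with the largest RMSE in each test.
worst_test1 = max(range(1, 8), key=lambda i: rmse_test1[i - 1])
worst_test2 = max(range(1, 8), key=lambda i: rmse_test2[i - 1])
print(worst_test1, worst_test2)
```

With the tabulated values, the sixth state has the largest identification RMSE in both tests.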

Conclusion

This work uses an RHONN identifier trained with an EKF-based algorithm to identify the model of a simulated plant and the model of an actual tracked robot. The results show that the same neural identifier was capable of identifying both the simulated and the real models, with sample times of 0.001 and 0.003 s, respectively.

It is important to note that the model parameters of our modified HD2 were not available. Moreover, in simulation, although the parameters were available, they were not used in the design of the identifier.

On the other hand, the inverse optimal control, which was designed using the identified model, shows good results in both simulation and real-time tests; in the real-time tests, the tracking RMSE of x, y and $θ$ stays below 0.04 (Tables 5 and 7). This can also be seen in the result graphs, which compare the references against the real signals of the robot.

Footnotes

Academic Editor: Jianyong Yao

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors thank the support of CONACYT Mexico, through projects CB256769 and CB258068 (‘Project supported by Fondo Sectorial de Investigación para la Educación’).
