Abstract
In this article, a distributed model-free consensus control scheme is proposed for a network of nonlinear agents with unknown nonlinear dynamics, unknown process disturbances, and white-noise measurement disturbances. The purpose of the control protocol is first to synchronize the states of all follower agents in the network to a leader and then to track a reference trajectory in the state space. The leader has at least one information connection with one of the follower agents in the network. The design procedure includes adaptive laws for estimating the unknown linear and nonlinear terms of each agent’s dynamics. The salient feature of the proposed control scheme is that each agent’s estimation is a model-free adaptive law; that is, the need for a regressor or linear-in-parameters basis is eliminated. In addition, without requiring a direct connection to the leader, the leader’s control input can still be reconstructed by virtue of a robust observer defined in a distributed manner over the network. The stability of the entire design procedure is analyzed using the Lyapunov stability theorem. In addition, it is shown that the proposed distributed controller includes an optimal term. Besides, a modified Kalman filter is added to eliminate the measurement noise. Finally, simulation results on three networks of unknown nonlinear systems are presented. Moreover, a comparative study is presented to evaluate the proposed algorithm against a model-based cooperative control algorithm.
Introduction
Great attention has been paid to the problem of controlling multiagent systems, ranging from consensus to formation control.1–4 These solutions have been applied to oscillator synchronization, mobile robot and aircraft formation, mobile sensor area coverage, vehicle routing in traffic, containment control of moving bodies, and so on.5 Generally, all of these problems can be considered as a consensus problem, in which all agents’ states (or outputs) should be synchronized inside a network.6 In practice, each agent’s dynamics usually has an unknown nonlinear structure due to unpredictable environmental disturbances, unmodeled dynamics, and other uncertainties. Hence, designing distributed cooperative control without any a priori model of the agents’ dynamics is essential. These types of control policies are called model-free controllers (MFCs)7 or data-driven controllers8 in the literature. Although several MFCs have been proposed for single-agent systems,7,9–14 the use of MFCs for multiagent systems is quite new.15,16
The consensus problem for a class of nonlinear first-order multiagent systems with external disturbances is discussed by Das and Lewis,17 while the problems for second-order and higher-order nonlinear multiagent systems are reported by Zhang and Lewis1 and Peng et al.,2 respectively. Distributed adaptive leader-following control for unknown dynamic systems with guaranteed finite-time convergence is proposed by Mahyuddin et al.18,19 These algorithms are model-based cooperative controllers which require sufficiently rich input signals to guarantee the persistent excitation condition for the regressors. The procedure to design a distributed state–output feedback cooperative control is presented by Wang et al.20 for uncertain multiagent systems on undirected communication graphs. This procedure is extended to a directed communication graph with a spanning-tree characteristic.21 To remedy the problem of non-affine systems of a general class, several works, such as that by Meng et al.,22 employ a direct adaptive approach using an artificial neural network (ANN).
Most of the MFCs for nonlinear systems in the literature are proposed in the context of reinforcement learning (RL). These controllers are actually optimal adaptive controllers, which calculate the optimal control policy in an online manner using adaptive laws.23 These algorithms include an online estimation process for the cost function to evaluate the controller performance (critic network) and another online process to estimate the optimal control signals (actor network). These two online estimations are performed using two distinct ANNs.24 While better performance can be achieved by increasing the number of nodes in each of these ANNs, doing so increases the computational complexity.25 This caveat may limit future prospective applications, especially those involving scarce energy resources. If an MFC does not include ANNs for online estimation, the number of adaptive laws is reduced and the problem of computational complexity is eliminated. Such an attractive feature, that is, being computationally light, opens up the possibility of deploying the control scheme on any distributed system.
A robust adaptive cooperative control is proposed by Mahyuddin and Safaei26 for the formation-tracking problem, in which the system matrix of the dynamic system is assumed to be known. In this article, however, the system matrix is completely unknown and is estimated with a model-free adaptive law. In contrast to the previous work by Mahyuddin and Safaei,26 the main controller gains are instead determined online. Another salient feature of this article is that the proposed cooperative controller incorporates an optimal policy by virtue of RL to find a solution to the algebraic Riccati equation.
Safaei and Mahyuddin27 proposed the idea of a model-free cooperative controller for the first time. By contrast, here an optimality analysis is also provided to show that an optimal term is incorporated into the proposed controller. Moreover, a detailed explanation of the dynamical structure for the unknown dynamics of each agent in the network is presented in the current work. The adopted approach does not require the maximum absolute values of the unknown dynamics of each agent.
In this article, a distributed consensus control problem is solved for a network of agents with general unknown nonlinear multi-input and multi-output (MIMO) dynamics using a model-free control algorithm. The main contribution of this article is the design and development of an MFC algorithm for the consensus problem in a network of nonlinear multiagent systems without requiring ANNs to estimate the unknown system dynamics and disturbances. Here, the proposed distributed MFC is based on a new structure for the unknown dynamics of each agent, which can be segmented into two parts: a linear-in-states term and a nonlinear term. Two separate adaptive laws are proposed for estimating the linear and nonlinear terms at each agent. The estimation of the unknown nonlinear terms is performed in such a way that the dependence on any model regressor or nonlinear basis functions is removed; that is, the estimation is regressor-free. By estimating the linear terms, a technique is proposed for online determination of the controller gains locally at each agent through the solution of a continuous-time algebraic Riccati equation (CARE). Moreover, a robust observer is designed for all follower agents that are not connected to the leader; the observer estimates the leader’s control input(s). The stability analysis of the whole algorithm is provided using the Lyapunov stability theorem. In addition, an optimality analysis based on the solution of a Hamilton–Jacobi–Bellman (HJB) equation is presented to illustrate the efficacy of the optimal term in the proposed cooperative MFC controller. Furthermore, since the proposed structure for the unknown dynamics of each nonlinear agent is linear-in-states, a modified Kalman filter is implemented to remove the measurement white noise.
Finally, a simulation study is provided to evaluate the performance of the proposed distributed controller on a chaotic plant and a non-affine nonlinear system. The contributions of this article are as follows. First, a distributed MFC protocol is proposed for a generic unknown nonlinear system without the use of any ANN. Second, the network-based adaptive law for estimating the unknown nonlinear terms at each agent is regressor-free. Third, the controller gain (the P matrix) is updated online without requiring knowledge of the communication topology of the network.
In the following, first, a general formulation for a network of unknown nonlinear agents is proposed in General formulation for a network of unknown nonlinear MIMO agents. The design procedure for the MFC cooperative control is presented in Design procedure for model-free cooperative control with tracking objective. That section includes three subsections dedicated to the distributed estimation of the unknown system matrix, the adaptive MFC cooperative protocol, and the cooperative robust observer for the leader’s control inputs. In Observer design for compensation of measurement noise, the modified Kalman filter observer is proposed for compensating the measurement noise. Finally, a simulation study including three different cases, a comparative study against a model-based cooperative control algorithm, and an analysis for different types of measurement noise are provided in Simulation study, Comparison with model-based cooperative control algorithms, and Analysis for different types of measurement noise, respectively.
General formulation for a network of unknown nonlinear MIMO agents
Definition 1
Consider a network of N homogenous nonlinear dynamic systems. Let
represent the interagent communication links (inclusive of the leader pinning) exploited in the analysis of this article.
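The graph-theoretic matrices used throughout the analysis (adjacency, degree, Laplacian, and pinning-gain matrices) can be sketched for a small example. The four-agent ring topology and the pinned agent below are illustrative choices, not the ones used in the paper's simulations:

```python
import numpy as np

# Illustrative graph matrices for a small network of N = 4 follower agents.
# A_adj is the adjacency matrix, D the degree matrix, L = D - A_adj the
# graph Laplacian, and B_pin the pinning-gain matrix marking which
# followers receive the leader's information (here only agent 1).
A_adj = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
D = np.diag(A_adj.sum(axis=1))
L = D - A_adj
B_pin = np.diag([1.0, 0.0, 0.0, 0.0])  # at least one nonzero entry (assumption 1)
```

For an undirected graph, L is symmetric and every row of L sums to zero by construction; assumption 1 requires that B_pin has at least one nonzero diagonal element.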
Assumption 1
Here, the necessary condition on the network is that at least one of the agents should have a communication link with the leader. In other words, at least one of the diagonal elements in matrix
Definition 2
Consider the network defined in definition 1. A general unknown nonlinear dynamic system for the ith agent can be defined as
where
where
The only minimal information required about the dynamic system of each agent is whether each system state depends on each of the control inputs. The matrix B can be constructed from this information according to equation (4). Finally, by considering
the unknown nonlinear system in equation (2) can be presented as
where
Definition 3
For a network of N agents with dynamics defined in equation (6), the dynamics of the whole network can be expressed as
where
Definition 4
Let us define dynamics of a virtual leader for the network introduced in definition 2 as
Definition 5
For a network defined in definitions 1 to 4, we can define a consensus error
The consensus errors of all agents in the network can be expressed as
where
Design procedure for model-free cooperative control with tracking objective
Distributed robust adaptive parameter estimation for unknown system matrix
Lemma 1
The combination of a stable estimator and a stable controller within a dynamic system leads to a stable system. This is known in the literature as the separation principle for both linear and nonlinear dynamic systems.29,30
Theorem 1
Consider the dynamics of agent i in the network as in equation (6). If one can define
as the rate for estimation of A at the ith agent, where λ is a positive scalar and
and
where s is the Laplace operator and
the filtered format of H0 defined as
converges to zero asymptotically.
Proof
Let us define the estimation error for Ai as
Then, by filtering both sides with
which is in the form of
consequently leading to
Now, consider the following Lyapunov function
which has derivative as
To have
It should be noted that according to definition 2, the elements of A are constant real values; hence
Proposition 1
The adaptive law estimating the linear term A is equipped with a leakage term to make the estimation robust against bounded perturbations,31 as follows
where ρ1 is a positive scalar.
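The equation for this adaptive law is not reproduced in this text; as a hedged illustration, a leakage (sigma-modification) update of the general form used here can be sketched as follows, where `drive` is a placeholder for the filtered estimation-error signal that drives the update in the paper:

```python
import numpy as np

# Hedged sketch of a leakage (sigma-modification) adaptive law.
# lam is the adaptation gain; the rho1 * A_hat leakage term keeps the
# estimate bounded under bounded perturbations.
def leakage_update(A_hat, drive, lam, rho1, dt):
    A_hat_dot = lam * drive - rho1 * A_hat  # adaptive law with leakage
    return A_hat + dt * A_hat_dot           # one Euler integration step
```

With zero drive, the leakage term alone pulls the estimate exponentially toward the origin, which is what guarantees boundedness of the estimated parameters.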
Remark 1
Referring to theorem 1, the values of
Remark 2
Referring to lemma 1 and theorem 1, one can replace
Distributed adaptive MFC protocol
In this section, two analyses are presented for designing the distributed adaptive MFC protocols: a stability analysis and an optimality analysis.
Stability analysis
Proposition 2
If the consensus errors of all agents as in equation (9) converge to zero, then all agents in the network will synchronize to each other and to the reference trajectory denoted by the leader agent successfully, that is,
Theorem 2
For a network with dynamics defined in equation (7) and recalling the consensus error in equation (9), provided that the diagonal elements in
where
where γ is a positive constant scalar defining the adaptation rate and ρ2 is another positive scalar acting as the leakage gain, then proposition 2 will be achieved.
Proof
Consider the following Lyapunov function
where
leading to
By replacing
Besides, by multiplying both sides of equation (9) with
In addition, by recalling remark 1 and the undirected property for the communication graph of the network (which in turn means that matrix
Thus, by incorporating equations (30) and (31) into equation (29) and recalling the mixed-product property for Kronecker product, substituting
Since
Then, by adding and subtracting
we lead to
Utilizing the following adaptive law
the third term in equation (35) is zero. Hence, we have
Since g is bounded and Lipschitz, we have
where
Note that Λ1 and δ1 are two positive constant scalars. Besides, if we set
where
Further, by setting
for Q > 0 in
where
According to the LaSalle–Yoshizawa theorem, V1 is uniformly ultimately bounded. Since V1 includes the tracking error and the estimation error, we can deduce that e and its time integral ζ and also
Then, by rearranging this equation and recalling that B is full-rank and
Remark 3
According to equation (42), the controller gains Pi can be determined online using the solution of following CARE
Notably, the solution of this equation does not depend on the communication graph (i.e.
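Each agent can solve its local CARE numerically with a standard solver. The sketch below uses `scipy.linalg.solve_continuous_are`; the particular estimate A_hat and the identity weighting matrices Q and R are illustrative choices, not values from the paper:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Sketch: agent i solves a CARE locally using its current estimate
# A_hat of the system matrix; B, Q, R are illustrative choices.
A_hat = np.array([[0.0, 1.0], [-2.0, -1.0]])  # current local estimate of A
B = np.eye(2)
Q = np.eye(2)   # state-weighting matrix (Q > 0)
R = np.eye(2)   # input-weighting matrix
P = solve_continuous_are(A_hat, B, Q, R)

# P is the symmetric positive-definite solution of
#   A^T P + P A - P B R^{-1} B^T P + Q = 0
residual = A_hat.T @ P + P @ A_hat - P @ B @ np.linalg.inv(R) @ B.T @ P + Q
```

Because only the local estimate A_hat enters the equation, no knowledge of the communication topology is needed, which matches the remark above.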
Optimality analysis
Definition 6
Referring to the proposed formulation in definition 4 for a network consisting of N continuous-time nonlinear agents, one can define the following cost-to-go function32
for measuring the performance of a designed set of distributed control inputs u with respect to the consensus-tracking objective. Here, V4(.) is a scalar cost according to the performance of the system in future operations and J(.) is a scalar value named the system’s utility.
Proposition 3
Based on the HJB equation, the optimal control for the dynamic system proposed in equation (23) should satisfy 32
Theorem 3
For the consensus problem defined in proposition 2, if one can construct the following cost function
and the following utility function
where
then it can be shown that the distributed controllers proposed in equation (24) include the optimal policies.
Proof
Since the dynamics of the agents are proposed in a linear format as in equation (23), one can conclude that the cost-to-go function J for this system can be represented as a quadratic function of the system states.32 Hence, equation (49) can be defined. Besides, by recalling lemma 1, remark 2, and theorem 2, one can represent equation (23) as
For the defined V4 and J in equations (49) and (50) and referring to equation (48), the following Hamiltonian is proposed
By replacing
we have
Moreover, by replacing
Then by recalling the mixed-product property for Kronecker product, we have
At this point, we redefine u in equation (40) as
and
By recalling
Then by recalling equations (30) and (31), we have
or
which is equal to zero by determining the values of P from the CAREs in equation (46). In addition, by differentiating H3 with respect to u1, we have
This equation is equal to zero by substituting u1 from equation (58). This means that part of u is designed in such a way that the partial derivative of H3 is zero. Then, by referring to proposition 3, the optimality condition is satisfied and the proof is completed.
Cooperative robust observer for leader’s control inputs
Remark 4
Looking at equation (24), the control input at the neighboring agent (i.e. uj) is required for computing ui. However, this value is not available, since it is being computed at the same time. Thus, an estimation algorithm for uj is needed.
Theorem 4
For a network defined in definitions 1 and 2, if proposition 2 is satisfied by theorem 2, then one can have the following approximation for the relation between the control inputs at agent j and the leader’s control inputs
Proof
Recalling lemma 1 and theorem 1 and also by subtracting both sides of
Then, upon reaching consensus on the synchronization and tracking problem (according to theorem 2), one can state that
Using the time derivative of both sides of this equation, we have
Finally, the approximated value for controller inputs of agent j can be expressed as follows
Then the proof is completed.
Remark 5
Utilizing theorem 4, one can use equation (64) to compute the designed control inputs in equation (24). However, considering the pinning gain matrix
Proposition 4
Referring to the objective of reaching consensus on the observation of u0 at all agents in the network, one can define the following consensus error
where
where
Theorem 5
For a network defined in definitions 1 and 2 with at least one communication connection between the leader and the agents in the network, if one uses the following equation as the rate for observing the leader’s control input
where μ is a positive scalar,
Proof
Considering the following Lyapunov function
we have
Since the summation of all elements in each row of the Laplacian matrix is zero, 5 we have
Hence, equation (73) can be written as
Considering
In the case that the communication graph is connected and undirected and
where
To achieve
Then, since
At this point, we should only show that
Thus,
Finally, since
we have
and then the rate for observed parameter is
Using
Remark 6
Recalling proposition 1 and theorems 2, 4, and 5, the distributed controller at agent i in the network is proposed as
where
and Pi is the solution of following CARE
Remark 7
It should be noted that the number of estimations (excluding the number of observers for the leader’s control inputs) at each agent in the network is equal to the number of adaptive laws for
Observer design for compensation of measurement noise
Looking at the dynamical structure considered at each agent, as in equation (6), one can implement a linear Kalman filter at each agent as an observer on the proposed distributed model-free control protocol to remove any bounded measurement noise. This forms one of the primary motivations for including a linear-in-states term in the dynamical structure expressed in equation (6).
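The paper's modified filter (equation (90)) adds a tuning gain η; as a baseline, the predict/update cycle of a standard discretized linear Kalman filter on the linear-in-states part of equation (6) can be sketched as follows. The measurement matrix C and the noise covariances Qn and Rn below are illustrative assumptions:

```python
import numpy as np

def kalman_step(x_hat, P, y, A, C, Qn, Rn, dt):
    """One predict/update cycle of a discretized linear Kalman filter.
    A is the (estimated) linear-in-states system matrix, C the measurement
    matrix; Qn and Rn are assumed process- and measurement-noise
    covariances (illustrative values, not taken from the paper)."""
    n = len(x_hat)
    # Predict with a first-order (Euler) discretization of xdot = A x.
    F = np.eye(n) + dt * A
    x_pred = F @ x_hat
    P_pred = F @ P @ F.T + Qn
    # Update with the noisy measurement y.
    S = C @ P_pred @ C.T + Rn
    K = P_pred @ C.T @ np.linalg.inv(S)   # Kalman gain
    x_hat = x_pred + K @ (y - C @ x_pred)
    P = (np.eye(n) - K @ C) @ P_pred
    return x_hat, P
```

Run on a stable linear system with white measurement noise, the filtered estimate tracks the true state with an error well below the raw noise level, which is the role the observer plays in the proposed protocol.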
Proposition 5
Suppose that there is a source of white noise on each measured state of agent i. Hence, using the adapted parameters in equation (6), the dynamics of agent i in the network can be represented as
where
Here, η is a positive tuning gain defined to provide fast and accurate performance for the modified observer. It is shown in the study by Lewis et al.33 that, using equation (90), the observer error, that is,
Simulation study
In this section, three applications of the proposed consensus control protocol are presented. First, the performance of the controller is studied on a network of four chaotic plants. Then, the controller is evaluated on a network of non-affine nonlinear plants. The third simulation case is dedicated to a network of four limit-cycle resonators. In all three applications, the properties of the communication graph and the constant parameters of the controller are the same. The communication network in each case consists of four agents with different initial values for the system states. The adjacency and pinning gain matrices for the communication graphs are
In addition, the controller parameters at agent i in the network are tuned as presented in Table 1. These values are used for all of the following simulation cases unless mentioned otherwise. In Table 1, I2 is an identity matrix of dimension two. Moreover, in the following simulation cases, matrix B is assumed to be I2. Here, normally distributed random noise with zero mean and variance equal to 0.5 and 0.05 is added as measurement noise to the first and second states, respectively, at each agent for all of the simulation cases.
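The measurement model just described can be sketched directly; the random seed below is arbitrary:

```python
import numpy as np

# Measurement model used in the simulation cases: zero-mean Gaussian noise
# with variance 0.5 on the first state and 0.05 on the second state of
# every agent.
rng = np.random.default_rng(seed=1)
NOISE_STD = np.sqrt(np.array([0.5, 0.05]))  # std devs from the stated variances

def noisy_measurement(x):
    """x: true two-dimensional state of one agent; returns the noisy reading."""
    return x + NOISE_STD * rng.standard_normal(2)
```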
Controller parameters at agent i.
Case 1: Chaotic plant
Here, a network consisting of four Duffing–Holmes chaotic systems is considered for evaluating the performance of the proposed controller. The dynamics of a single-agent Duffing–Holmes chaotic system is34
where
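The exact equation and parameter values of the paper's Duffing–Holmes model are not reproduced in this text; as a hedged illustration, a standard single-agent Duffing–Holmes form with common illustrative parameters can be simulated as follows:

```python
import numpy as np

# Standard Duffing-Holmes form (illustrative parameters, not necessarily
# those of the paper's equation):
#   x1_dot = x2
#   x2_dot = -delta*x2 + x1 - x1**3 + gamma_f*cos(omega*t) + u
def duffing_holmes(t, x, u=0.0, delta=0.25, gamma_f=0.3, omega=1.0):
    x1, x2 = x
    return np.array([x2,
                     -delta * x2 + x1 - x1**3 + gamma_f * np.cos(omega * t) + u])

# Simple RK4 roll-out of the uncontrolled system over 20 s.
def rk4_step(f, t, x, dt):
    k1 = f(t, x)
    k2 = f(t + dt / 2, x + dt / 2 * k1)
    k3 = f(t + dt / 2, x + dt / 2 * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

x = np.array([0.1, 0.0])
for k in range(2000):
    x = rk4_step(duffing_holmes, k * 0.01, x, 0.01)
```

Despite the chaotic behavior, the trajectory remains bounded on the system's attractor, which is why the plant is a suitable test case for a tracking controller.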

Case 1: Values for first state (x1) for all agents. The initial values for states of all agents are not identical. The consensus of states for all agents to the desired trajectory can be confirmed after about 2 s.

Case 1: Consensus errors of first state for all agents. It can be seen that the consensus errors for the agents (ei) are bounded around zero with an upper bound of about 0.35.

Case 1: Control inputs for all agents. The control variables for all agents are also bounded.

Case 1: Estimated values for linear terms (

Case 1: Estimated values for nonlinear terms (

Case 1: Observed values for the leader control input (
Case 2: Non-affine nonlinear system
The dynamic system in this simulation case is 35
The system defined in equation (93) is non-affine. The desired value for the first state is zero for all agents, which start the simulation at different nonzero initial values. The dynamic system can be formulated as in equation (6) with two states. In addition, the controller parameters are set as suggested in Table 1, except the value of Ki, which is equal to

Case 2: Values for first state (x1) for all agents. The initial values for states of all agents are not identical. The consensus of states for all agents to the desired trajectory can be confirmed after about 3 s.

Case 2: Consensus errors of first state for all agents. It can be seen that the consensus errors for the agents (ei) are bounded around zero with an upper bound of about 0.3.

Case 2: Control inputs for all agents. The control variables for all agents are also bounded.

Case 2: Estimated values for linear terms (

Case 2: Estimated values for nonlinear terms (
Case 3: Limit cycle resonator
For the third case, a limit cycle dynamic system is considered for the dynamics of each agent in the network. The dynamic system for a Van der Pol resonator as a limit cycle dynamic system is proposed as follows 36
where p1 and p2 are two positive constant values. This simple dynamic system has a stable equilibrium point at (0, 0). In addition, the system has an unstable limit cycle surrounding the origin.37 The unstable limit cycle represents the boundary between the transients which converge to the origin and those which diverge.37 In this simulation, we consider p1 = 0.1 and p2 = 0.2. The simulation results for this case are presented in Figures 12 to 17. It can be seen that the states of all agents converge to the desired value of 3, although the initial values for the states of the agents are not the same. The convergence trends of the observed values of the leader’s control input in different agents are almost the same (see Figure 17). Moreover, the tracking errors and the control inputs are bounded.
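The exact form of the paper's Van der Pol-type resonator is not reproduced in this text; a hedged sketch with the qualitative behavior described above (a stable origin surrounded by an unstable limit cycle) is the reversed-damping Van der Pol form below, using the stated p1 = 0.1 and p2 = 0.2:

```python
import numpy as np

# Reversed-damping Van der Pol form (an assumption consistent with the
# text's description, not necessarily the paper's exact equation):
#   x1_dot = x2
#   x2_dot = -p1*x1 + p2*(x1**2 - 1)*x2
# Near the origin the damping term is -p2*x2 (stable); outside the unit
# strip it becomes anti-damping, producing an unstable limit cycle.
def vdp_reverse(x, p1=0.1, p2=0.2):
    x1, x2 = x
    return np.array([x2, -p1 * x1 + p2 * (x1**2 - 1.0) * x2])

# Transients starting inside the unstable limit cycle decay to the origin.
x = np.array([0.1, 0.0])
for _ in range(5000):
    x = x + 0.01 * vdp_reverse(x)   # Euler integration, dt = 0.01
```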

Case 3: Values for first state (x1) for all agents. The initial values for states of all agents are not identical. The consensus of states for all agents to the desired trajectory can be confirmed after about 3 s.

Case 3: Consensus errors of first state for all agents. It can be seen that the consensus errors for the agents (ei) are bounded around zero with an upper bound of about 0.4.

Case 3: Control inputs for all agents. The control variables for all agents are also bounded.

Case 3: Estimated values for linear terms (

Case 3: Estimated values for nonlinear terms (

Case 3: Observed values for the leader control input (
Comparison with model-based cooperative control algorithms
In this section, the performance of the proposed model-free cooperative control algorithm (remark 6) is compared with a well-known model-based cooperative control algorithm designed and presented by Lewis et al.5 The control signal at agent i in this algorithm is defined as follows5
where di and βi are defined in definition 1 and
In the above two equations, Wi is the vector of gains for the employed neural nodes at agent i, while ϕi is the basis of activation functions for the neural nodes.5
Moreover, c, λ, F, and k are constant parameters for tuning the control algorithm. The values for pi are defined based on the solution of a Lyapunov equation.5
The algorithm is designed specifically for the unknown second-order dynamic systems, where
The dynamic system for the comparison study in this section is an inverted pendulum presented with the following model5
where Jp, Mp, and Lp are the moment of inertia, mass, and length of the pendulum, respectively. In addition, g is the gravitational acceleration and Bp is a damping constant. Here, we have a network of five agents, each with the dynamic system presented in equation (97). The communication graph of this network is presented in Figure 18, and only the third agent is pinned to the leader.5
The desired value for x1 of all agents in the network is

The communication graph of the network considered for the comparison study. It has five agents (nodes) with one leader pinned to the third agent. 5
By contrast, the constant values of the proposed model-free cooperative control algorithm are chosen to be the same as the values in the simulation case studies (Table 1), except
is used to compute the total absolute effort

Comparison study: The consensus errors for all agents using the model-based (top) and model-free (bottom) cooperative control algorithms. Convergence is achieved in both cases, while the model-free algorithm is slightly faster but exhibits extra overshoot.

Comparison study: The values for first states of all agents using model-based (top) and model-free (bottom) cooperative control algorithms.
Total absolute control effort at agent i.
Analysis for different types of measurement noise
As mentioned in proposition 5, the proposed observer is designed under the assumption that the measurement noise is a normally distributed random signal. In this section, the performance of the proposed joint controller and observer system is evaluated under uniformly distributed noise (instead of normally distributed noise) on each state of all agents in the network. In this regard, the simulation results for case 1 are recomputed under this assumption. Here, a uniformly distributed measurement noise with a maximum value of 0.5 and a minimum value of −0.5 is applied to the first state. The noise applied to the second state is smaller by a factor of 0.1. As shown in Figures 21 and 22, convergence is achieved appropriately and the errors are all bounded. Based on these results, one can say that the proposed distributed adaptive model-free cooperative control algorithm and the observer in proposition 5 have acceptable performance in the case of uniformly distributed measurement noise.
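The uniform noise model used in this robustness test can be sketched as follows; note that the variance of U(−0.5, 0.5) is 1/12 ≈ 0.083, considerably smaller than the variance 0.5 of the Gaussian noise in the main simulation cases:

```python
import numpy as np

# Uniform measurement noise for the robustness test: U(-0.5, 0.5) on the
# first state and one tenth of that range on the second (seed arbitrary).
rng = np.random.default_rng(seed=2)

def uniform_noise():
    return np.array([rng.uniform(-0.5, 0.5),
                     0.1 * rng.uniform(-0.5, 0.5)])
```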

Values for first state (x1) for all agents in case 1, under the assumption of uniformly distributed measurement noise. The convergence to the desired trajectory is provided for all agents.

Consensus errors of first state for all agents in case 1, under the assumption of uniformly distributed measurement noise. The errors are bounded around zero with an upper bound of about 0.15.
Conclusions
This article presents a model-free distributed control algorithm for the consensus problem in a network of nonlinear agents with completely unknown dynamics and external disturbances. The main purpose is to achieve the tracking objective for the whole network while all agents are synchronized with a virtual leader. The algorithm includes two distributed adaptive laws for estimating both the linear and nonlinear terms in the agents’ dynamic systems. In addition, a cooperative observer is designed based on a consensus-type error for estimating the leader’s control inputs at each agent. Since there are only partial information links between the leader and the agents, the control inputs of the leader must be estimated at each agent in the distributed control protocols. While the stability of the entire design is analyzed with the Lyapunov stability theorem, an optimality analysis is presented to show that the proposed distributed controller has an optimal term. Utilizing a modified Kalman filter state observer, the measurement noise can be eliminated from the data available from onboard sensors. It is shown that the observer works for measurement noise with both normal and uniform distributions. The presented simulation results for three cases indicate the appropriate performance of the proposed distributed control algorithm. According to the comparative study, the convergence provided by the model-free cooperative control algorithm is faster than that of a model-based distributed control algorithm. In addition, less control effort is required by the proposed model-free algorithm. Moreover, minimal controller synthesis and tuning are needed in our proposed distributed MFC algorithm. In addition, since the adaptive laws are regressor-free, there is no requirement to define regressor (activation) functions for the implementation of the distributed controller.
Such salient properties provide practical convenience when implementing the proposed algorithm on a real hardware platform. Future investigations can address solutions for decreasing the number of estimations at each agent in the network.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a Research University (RUi) grant (1001/PELECT/8014029) and a Bridging Fund grant (304/PELECT/6316106) from Universiti Sains Malaysia. Besides, the PhD studies of the first author are under a TWAS-USM Postgraduate Fellowship.
