A distributed estimation method over network based on compressed sensing

Abstract

This article presents a distributed estimation method, called compressed-combine-reconstruct-adaptive (CCRA), that uses compressed sensing to estimate an unknown sparse parameter of interest from noisy measurements over a network. It is useful in distributed networks where robustness and low consumption are desired features. Because the unknown parameter of interest is sparse in many situations, compressed sensing theory is introduced into distributed estimation to further reduce the communication load. With the proposed method, each node compresses its estimation into a lower-dimensional form, and the nodes exchange only these compressed estimations. Next, each node combines the compressed estimations of its neighbors with its own compressed estimation, using combination coefficients that depend on the topology of the network. Then, the compressed estimations are reconstructed in full dimension with a reconstruction algorithm. Finally, the nodes update their estimations with the normalized least mean square algorithm. The stability of the proposed CCRA method is analyzed in this article. In simulations, our method is compared with standard diffusion methods and with communication-reduced methods. The results show that the CCRA method achieves nearly the same performance as the standard diffusion methods while reducing the communication load significantly, and that it performs better (in network mean-square error, network mean-square deviation, steady-state mean-square error and steady-state mean-square deviation) than other communication-reduced methods.

Keywords

Distributed estimation, CCRA, LMS, compressed sensing

Introduction

In a network such as a wireless sensor network, sensor nodes collect data in a distributed way in applications such as target localization and tracking, environment monitoring and spectrum sensing. The nodes seek a common unknown parameter of interest, but the collected regression data are often distorted by noise, which occurs when the local copy of the underlying system input signal at each node is corrupted by various sources of impairment such as measurement or quantization noise.¹ A critical issue is how to estimate the unknown parameter from the data obtained at each node over the network.

There are two main strategies for solving the problem of parameter estimation in a network: the centralized strategy and the distributed strategy. In a centralized strategy, all the nodes send their information to a central node, which processes it and estimates the unknown parameter. Since distributed networks are usually limited in energy and the communications between nodes are multi-hop, distributed strategies have attracted more and more attention. In a distributed strategy, each node estimates the parameter based on its own local computation and the estimation information received from its neighbors, without the need for a central node.² One class of existing distributed strategies comprises incremental,^3,4 diffusion^1,5–8 and hierarchical methods.^9,10 The diffusion least mean square (LMS) method is the most popular strategy, and we focus on it in this article. In a diffusion method, each node performs an LMS update after exchanging the estimation with its neighbors.⁶ Compared with the centralized strategy, it offers scalability, robustness and a low communication load. Many distributed diffusion methods have been proposed. Lopes and Sayed⁵ presented a simple adaptive diffusion LMS method. Cattivelli and Sayed⁶ analyzed the performance of combine-then-adapt (CTA) and adapt-then-combine (ATC) diffusion methods in a distributed network. Fernandez-Bes et al.⁸ used a normalized step size in the adaptive stage to adapt to the input signal.

As most networks contain a large number of nodes and the dimension of the unknown parameter may be very high, the communication load of estimation is still considerable in a distributed diffusion method. To further reduce the communication load, the reduced-communication diffusion LMS (RC-DLMS) method allows each node to receive the intermediate estimations of only a subset of its neighbors.¹¹ The set-membership normalized LMS (SM-NLMS)¹² reduces the communication load with selective cooperation. Wang et al.¹³ used an intermittent diffusion method called adapt-then-intermittently-combine (ATIC). However, these methods reduce the communication load at the cost of the performance of parameter estimation. The other class of distributed strategies is the distributed consensus method. Distributed consensus strategies provide an elegant procedure to enforce agreement among cooperating nodes. The work by Sayed and Tu¹⁴ shows that diffusion networks converge faster and reach a lower mean-square deviation than consensus networks.

Since the unknown parameter of interest is sparse in various applications, containing only a few non-zero elements among many negligible ones,^15–17 we introduce the compressed sensing (CS) theory in this article. The CS theory has received much attention in the research community. CS has been shown to bring significant performance gains in applications ranging from astronomy, biology and communications to image and video processing, medicine and radar.¹⁸ With CS, it is possible to reconstruct a signal of interest from a number of samples that is far smaller than the desired resolution of the signal.¹⁹ In the literature,^15,20 researchers apply CS to parameter estimation in the full-dimension case by adding another function that accounts for the sparsity of the unknown parameter. Nevertheless, the communication load between nodes is not considered. Xu et al.^21,22 implement CS in parameter estimation, but the input signal in their method needs to be compressed by the same sensing matrix as the estimation, which is difficult to operate.

Considering the communication load in a distributed network, we propose a new distributed estimation method called compressed-combine-reconstruct-adaptive (CCRA) in this article. First, with the CCRA method, each node translates its full-dimension estimation into a compressed form so that the nodes only need to transmit the compressed estimation. Then, each node combines the compressed estimations from its neighbors with its local compressed estimation. Next, the nodes reconstruct the compressed estimations in full dimension with a reconstruction algorithm called Stagewise Orthogonal Matching Pursuit (StOMP). Finally, each node performs an LMS update of its estimation with a normalized step size. With the proposed CCRA method, the communication load in the whole network is significantly reduced without affecting the estimation performance.

This article is organized as follows. In section “Problem statement,” we state the distributed estimation problem and define the cost functions. Then, the derivation of the diffusion solution method is presented. In section “The proposed CCRA method,” we introduce the CS theory and describe our CCRA method. In section “Stability analysis,” the stability analysis of the CCRA method is presented. In section “Simulation results,” we provide detailed simulation results for a distributed network with 20 nodes to illustrate the performance of our method compared with the existing standard and communication-reduced diffusion methods. Section “Conclusion” concludes this article.

Problem statement

Network model

In this article, we consider a network of N nodes connected in a topology such as the one illustrated in Figure 1. Two nodes are called neighbors if they can exchange information directly without relaying. The usual linear regression model is

d_{i} (t) = w_{o}^{T} u_{i} (t) + v_{i} (t)

(1)

Figure 1.

Model of a simple distributed network.

The output of node i at time instant t, a scalar measurement $d_{i} (t)$ , is related to the input regression vector $u_{i} (t)$ and the true parameter $w_{o}$ . Both $u_{i} (t)$ and $w_{o}$ are M × 1 vectors. We assume that $w_{o}$ is a sparse vector with S ≪ M non-zero coefficients. $v_{i} (t)$ denotes the observation noise or disturbance at node i, and we assume that the noise processes $v_{i} (t)$ of different nodes are independent of each other.

Cost function

To obtain an estimation vector $w$ for $w_{o}$ , the global cost function of the whole network, given by equation (2), should be minimized

J_{global} (w) = \sum_{i = 1}^{N} E | d_{i} (t) - u_{i}^{T} (t) w |^{2}

(2)

where E denotes the expectation operator. We assume that the processes $u_{i} (t)$ and $d_{i} (t)$ are jointly wide-sense stationary. A centralized least mean square (LMS) update is then given by

w (t + 1) = w (t) + μ \sum_{i = 1}^{N} u_{i} (t) (d_{i} (t) - u_{i}^{T} (t) w (t))

(3)

where $μ > 0$ is the step size and $w (t)$ is the estimation of $w_{o}$ at time t.

Distributed diffusion strategy

With the centralized LMS algorithm, the information of the whole network must be collected and processed in a central node. Transmitting all this information to the central node greatly increases the communication burden, which is impractical in a distributed network due to the limited resources of the nodes. Moreover, if some links fail or change in the network, the centralized algorithm will not perform well.

To overcome these drawbacks, we use the distributed diffusion method. In a distributed estimation method, each node only needs to exchange information with its neighbors to achieve the estimation. Two nodes are assumed to be connected if they can communicate with each other directly. The neighborhood of node i, denoted by $Ω_{i}$ , is the set of nodes (including node i itself) that are connected to node i. Each node can process its local estimation and receive the diffused estimations from its neighbors. Figure 1 shows an example of a network consisting of 10 nodes. The arrows indicate the connections between nodes, and the nodes at the two ends of an arrow can exchange information with each other. The neighborhood of node 7, denoted by $Ω_{7}$ , includes nodes 3, 5, 6, 7 and 8. With this strategy, the distributed estimation does not collapse even if some nodes fail.

The distributed strategy is commonly performed in two stages: adaptation and combination. Based on the topology of the network, the estimations are combined with combination coefficients that satisfy

γ_{ii} + \sum_{j \in Ω_{i}, j \neq i} γ_{ij} = 1

(4)

where $γ_{ii}$ is the combination coefficient that node i assigns to itself and $γ_{ij}$ is the combination coefficient that node i assigns to its neighbor j. In this article, we use the Metropolis rule in equation (5) to obtain combination coefficients that satisfy equation (4)

γ_{ij} = \begin{cases} \frac{1}{\max (| Ω_{i} |, | Ω_{j} |)} & if j \in Ω_{i}, j \neq i \\ 0 & if j \notin Ω_{i} \\ 1 - \sum_{k \in Ω_{i}, k \neq i} γ_{ik} & if j = i \end{cases}

(5)

where $| Ω_{i} |$ denotes the cardinality of the set $Ω_{i}$ .
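The Metropolis rule in equation (5) can be sketched in a few lines of Python. This is only an illustration: the function name `metropolis_weights`, the dictionary representation of the neighborhoods and the 4-node line network are assumptions made for this example and do not come from the article.

```python
def metropolis_weights(neighbors):
    """Return the combination coefficients gamma[i][j] as a dict of dicts.

    neighbors[i] is the set Omega_i of nodes connected to node i,
    including node i itself, as in the text.
    """
    gamma = {i: {} for i in neighbors}
    for i, omega_i in neighbors.items():
        for j in omega_i:
            if j != i:
                # Off-diagonal weight: 1 / max(|Omega_i|, |Omega_j|)
                gamma[i][j] = 1.0 / max(len(omega_i), len(neighbors[j]))
        # Self weight makes the coefficients of node i sum to one
        gamma[i][i] = 1.0 - sum(gamma[i].values())
    return gamma


# Hypothetical 4-node line network 1 - 2 - 3 - 4 (not the topology of Figure 1)
neighbors = {1: {1, 2}, 2: {1, 2, 3}, 3: {2, 3, 4}, 4: {3, 4}}
gamma = metropolis_weights(neighbors)
for i in sorted(gamma):
    print(i, {j: round(g, 3) for j, g in sorted(gamma[i].items())})
```

Because the off-diagonal weight depends only on the larger of the two neighborhood sizes, the resulting coefficients are symmetric, a property that the stability analysis relies on.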

In this article, we seek to estimate the parameter $w_{o}$ using only the information of the neighbors in a distributed diffusion method. Node i has an a priori estimate $w_{i} (t)$ of the parameter $w_{o}$ at time instant t. The update function is given in equation (6)

w_{i} (t + 1) = \underset{w_{i}}{\arg\min} \{ γ_{ii} ‖ w_{i} - w_{i} (t) ‖^{2} + \sum_{j \in Ω_{i}, j \neq i} γ_{ij} ‖ w_{i} - w_{j} (t) ‖^{2} + μ_{i} (d_{i} (t) - u_{i}^{T} (t) w_{i})^{2} \}

(6)

where $μ_{i}$ is the step size of node i.

To simplify the update function, we expand the last term $(d_{i} (t) - u_{i}^{T} (t) w_{i})^{2}$ , as a function of the unknown $w_{i}$ , around $w_{j} (t)$ using the Taylor formula

(d_{i} (t) - u_{i}^{T} (t) w_{i})^{2} = e_{ij}^{2} (t) - 2 e_{ij} (t) u_{i}^{T} (t) (w_{i} - w_{j} (t)) + O (‖ w_{i} - w_{j} (t) ‖^{2})

(7)

where $e_{ij} (t) = d_{i} (t) - u_{i}^{T} (t) w_{j} (t)$ .

In the same way, the expansion of the last term around $w_{i} (t)$ is given by equation (8)

(d_{i} (t) - u_{i}^{T} (t) w_{i})^{2} = e_{i}^{2} (t) - 2 e_{i} (t) u_{i}^{T} (t) (w_{i} - w_{i} (t)) + O (‖ w_{i} - w_{i} (t) ‖^{2})

(8)

where $e_{i} (t) = d_{i} (t) - u_{i}^{T} (t) w_{i} (t)$ .

Then, we substitute equations (7) and (8) into equation (6) and drop the second-order remainder terms. Since the combination coefficients sum to one by equation (4), the last term of equation (6) can be split into a weighted sum over the neighborhood, with each part expanded around the corresponding estimate. This gives equation (9)

w_{i} (t + 1) = \underset{w_{i}}{\arg\min} \{ γ_{ii} ‖ w_{i} - w_{i} (t) ‖^{2} + \sum_{j \in Ω_{i}, j \neq i} γ_{ij} ‖ w_{i} - w_{j} (t) ‖^{2} + μ_{i} γ_{ii} [e_{i}^{2} (t) - 2 e_{i} (t) u_{i}^{T} (t) (w_{i} - w_{i} (t))] + μ_{i} \sum_{j \in Ω_{i}, j \neq i} γ_{ij} [e_{ij}^{2} (t) - 2 e_{ij} (t) u_{i}^{T} (t) (w_{i} - w_{j} (t))] \}

(9)

The expression to be minimized in equation (9) is a function of $w_{i}$ . To obtain $w_{i} (t + 1)$ , we differentiate it with respect to $w_{i}$ and set the derivative equal to 0. The distributed update of the estimation $w_{i} (t + 1)$ is then

w_{i} (t + 1) = φ_{i} (t + 1) + μ_{i} u_{i} (t) (d_{i} (t) - u_{i}^{T} (t) φ_{i} (t + 1))

(10)

where

φ_{i} (t + 1) = γ_{ii} w_{i} (t) + \sum_{j \in Ω_{i}, i \neq j} γ_{ij} w_{j} (t)

(11)
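For readers who want the intermediate step, the passage from equation (9) to equations (10) and (11) can be sketched as follows. This is a worked outline under the assumptions already stated in the text (second-order remainders dropped, coefficients summing to one by equation (4)), not an additional result.

```latex
% Gradient of the objective in (9) with respect to w_i, set to zero:
2 γ_{ii} (w_i - w_i(t)) + 2 \sum_{j \in Ω_i, j \neq i} γ_{ij} (w_i - w_j(t))
  - 2 μ_i u_i(t) \Big[ γ_{ii} e_i(t) + \sum_{j \in Ω_i, j \neq i} γ_{ij} e_{ij}(t) \Big] = 0

% By (4) and (11), the first two terms collapse to 2 (w_i - φ_i(t+1)).
% By the definitions of e_i(t) and e_{ij}(t), the weighted error is
γ_{ii} e_i(t) + \sum_{j \in Ω_i, j \neq i} γ_{ij} e_{ij}(t) = d_i(t) - u_i^T(t) φ_i(t+1)

% Solving for w_i gives the update (10):
w_i(t+1) = φ_i(t+1) + μ_i u_i(t) \big( d_i(t) - u_i^T(t) φ_i(t+1) \big)
```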

Equation (11) is regarded as the combination stage and equation (10) as the adaptive stage. The distributed diffusion LMS is shown in Figure 2.

Figure 2.

Distributed diffusion strategy.
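The two stages in equations (10) and (11) can be sketched with NumPy as below. The function name `cta_diffusion_lms_step`, the array layout, the common step size and the toy 3-node setup are assumptions chosen for illustration; they are not taken from the article.

```python
import numpy as np


def cta_diffusion_lms_step(W, C, U, d, mu):
    """One combine-then-adapt step for all nodes.

    W  : (N, M) array, row i is w_i(t)
    C  : (N, N) combination matrix, C[i, j] = gamma_ij (rows sum to one)
    U  : (N, M) array, row i is the regressor u_i(t)
    d  : (N,) array of scalar measurements d_i(t)
    mu : scalar step size (the same for every node in this sketch)
    """
    phi = C @ W                            # combination, equation (11)
    err = d - np.sum(U * phi, axis=1)      # d_i(t) - u_i^T(t) phi_i(t+1)
    return phi + mu * err[:, None] * U     # adaptation, equation (10)


# Hypothetical toy setup: 3 fully connected nodes, M = 4, uniform weights
rng = np.random.default_rng(0)
N, M = 3, 4
w_o = np.array([0.0, 1.0, 0.0, -0.5])
C = np.full((N, N), 1.0 / N)
W = np.zeros((N, M))
for t in range(500):
    U = rng.standard_normal((N, M))
    d = U @ w_o + 0.01 * rng.standard_normal(N)
    W = cta_diffusion_lms_step(W, C, U, d, mu=0.05)
print(np.linalg.norm(W - w_o, axis=1))     # per-node estimation error
```

Storing one estimate per row lets the combination stage be written as a single matrix product with the coefficient matrix.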

The proposed CCRA method

CS

With the diffusion strategy in section “Problem statement,” the amount of information exchanged is reduced. However, the communication burden is still considerable in an energy-limited network since $w_{i} (t)$ is a full-dimension M × 1 vector. To further reduce the communication burden and improve the network performance, the CS theory is introduced in this method. CS is an emerging field in signal processing. According to CS theory, sparse signals can be reconstructed from what was previously believed to be incomplete measurements.

We consider $w$ to be an M × 1 vector with sparsity S and $Γ$ to be a D × M sensing matrix, where D ≪ M. The basic compression model is defined in equation (12)

\bar{w} = Γ w

(12)

Since both $Γ$ and $w$ are exactly known in the compression phase, it is easy to obtain the D × 1 vector $\bar{w}$ , which is called the compressed vector. The model means that each element of $\bar{w}$ is the inner product between the vector $w$ and the corresponding row vector of $Γ$ .²²

As equation (12) is underdetermined, it is hard in general to recover the original vector $w$ from the known compressed vector and sensing matrix in the reconstruction phase. The CS theory suggests that, under some conditions, we can fully reconstruct the original vector. The vector $w$ is S-sparse, which means that $w$ has at most S non-zero elements in an orthonormal basis. A D × M i.i.d. Gaussian matrix $Γ$ can be shown to have the restricted isometry property (RIP) with high probability if $D \geq τ S \log (M / S)$ , where $τ$ is a small constant.²³ A natural and straightforward approach to obtaining a sparse solution of the underdetermined equation (12) is to formulate the optimization problem

\underset{w}{\arg\min} ‖ w ‖_{ℓ_{0}} \quad subject to \quad \bar{w} = Γ w

(13)

This problem is still difficult to solve because it is a non-deterministic polynomial (NP)-hard problem. We can replace the $ℓ_{0}$ norm by the $ℓ_{1}$ norm

\underset{w}{\arg\min} ‖ w ‖_{ℓ_{1}} \quad subject to \quad \bar{w} = Γ w

(14)

It has been shown that the solutions of equations (13) and (14) are the same if the vector is sufficiently sparse.²⁴ Researchers have proposed many reconstruction algorithms to pursue sparse solutions, such as basis pursuit (BP),²⁴ orthogonal matching pursuit (OMP),²⁵ subspace pursuit (SP),²⁶ precognition matching pursuit (PMP)²⁷ and so on. In this article, we use Stagewise Orthogonal Matching Pursuit (StOMP) to reconstruct the signal, since it can reconstruct the compressed vector without its sparsity being set in advance.²⁸ The reconstruction phase is denoted as $ℏ (•)$

w = ℏ (\bar{w})

(15)
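The compression in equation (12) and a reconstruction operator in the spirit of equation (15) can be illustrated as follows. Note that this sketch uses plain orthogonal matching pursuit (OMP) as a simple stand-in; it is not the StOMP algorithm used in the article and, unlike StOMP, it needs the sparsity level as an input. The function name `omp`, the scaling of the sensing matrix and the test vector are assumptions for this example.

```python
import numpy as np


def omp(Gamma, w_bar, sparsity):
    """Plain OMP: estimate a sparse w such that Gamma @ w is close to w_bar."""
    M = Gamma.shape[1]
    residual = w_bar.astype(float)
    support = []
    w_hat = np.zeros(M)
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual
        k = int(np.argmax(np.abs(Gamma.T @ residual)))
        if k not in support:
            support.append(k)
        # Least-squares fit on the selected columns, then update the residual
        coef, *_ = np.linalg.lstsq(Gamma[:, support], w_bar, rcond=None)
        w_hat = np.zeros(M)
        w_hat[support] = coef
        residual = w_bar - Gamma @ w_hat
    return w_hat


rng = np.random.default_rng(1)
M, D, S = 32, 12, 2                        # sizes used in the simulations of the article
Gamma = rng.standard_normal((D, M)) / np.sqrt(D)
w = np.zeros(M)
w[[3, 17]] = [1.0, -0.7]                   # hypothetical 2-sparse vector
w_bar = Gamma @ w                          # compression, equation (12)
w_hat = omp(Gamma, w_bar, S)               # reconstruction, stand-in for equation (15)
print("reconstruction error:", np.linalg.norm(w - w_hat))
```

Whether the reconstruction is exact depends on the particular sensing matrix and on D, which is why the text asks for D to be chosen large enough relative to the sparsity.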

CCRA method

In the proposed method, we add a compression stage and a reconstruction stage to the standard diffusion method. We denote our method as CCRA. In the first stage at time instant t, each node computes its local compressed estimation. The compression stage is given in equation (16)

{\bar{w}}_{i} (t) = Γ_{i} w_{i} (t), \quad i = 1, \dots, N

(16)

Then, the nodes exchange only their compressed estimations, which each node combines as described in equation (17)

{\bar{φ}}_{i} (t + 1) = γ_{ii} {\bar{w}}_{i} (t) + \sum_{j \in Ω_{i}, i \neq j} γ_{ij} {\bar{w}}_{j} (t)

(17)

Equation (17) is similar to equation (11), with the compressed ${\bar{φ}}_{i} (t + 1)$ used instead of $φ_{i} (t + 1)$ and ${\bar{w}}_{i} (t)$ instead of $w_{i} (t)$ . Node i at time instant t then reconstructs $φ_{i} (t + 1)$ by running the StOMP algorithm just once, since equation (17) is a linear function. This is denoted by equation (18)

φ_{i} (t + 1) = ℏ ({\bar{φ}}_{i} (t + 1))

(18)

Since the performance of the LMS algorithm depends strongly on the step size parameter $μ$ , the normalized algorithm is used. We use ${μ_{i}}^{'} = μ_{i} / (u_{i}^{T} (t) u_{i} (t))$ instead of $μ_{i}$ . Once $φ_{i} (t + 1)$ has been acquired, $w_{i} (t + 1)$ is obtained with equation (19)

w_{i} (t + 1) = φ_{i} (t + 1) + {μ_{i}}^{'} u_{i} (t) (d_{i} (t) - u_{i}^{T} (t) φ_{i} (t + 1))

(19)

The proposed distributed CCRA method is shown in Figure 3.

Figure 3.

CCRA method.

Compared with the standard CTA diffusion algorithm, there are two more stages. With the help of the compression stage, the dimension of the exchanged information is reduced from M × 1 to D × 1. The reconstruction stage ensures that the adaptive stage can operate on full-dimension vectors. Through the proposed CCRA method, the communication burden is significantly reduced while the estimation is still obtained. The procedure of CCRA is summarized in Table 1.

Table 1.

Process running of CCRA.

Initialize:

For each node i = 1, 2, …, N

$w_{i} (0) = 0$ , where $w_{i}$ is the M × 1 estimation vector

end

Running:

For each time instant t = 1, 2, …, T

For each node i = 1, 2, …, N

Compression: ${\bar{w}}_{i} (t) = Γ_{i} w_{i} (t)$ , where $Γ_{i}$ is a D × M sensing matrix

end

For each node i = 1, 2, …, N

Combination: ${\bar{φ}}_{i} (t + 1) = γ_{ii} {\bar{w}}_{i} (t) + \sum_{j \in Ω_{i}, i \neq j} γ_{ij} {\bar{w}}_{j} (t)$

Reconstruction: $φ_{i} (t + 1) = ℏ ({\bar{φ}}_{i} (t + 1))$

Adaptation: $w_{i} (t + 1) = φ_{i} (t + 1) + {μ^{'}}_{i} u_{i} (t) (d_{i} (t) - u_{i}^{T} (t) φ_{i} (t + 1))$

end

end
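One iteration of the procedure in Table 1 can be sketched in NumPy as follows. Several things here are assumptions made for illustration and are not stated in the article: all nodes share one sensing matrix (so that combining compressed vectors equals compressing the combined vector), plain OMP with a known sparsity level stands in for the StOMP operator, and the names `omp` and `ccra_step` are hypothetical.

```python
import numpy as np


def omp(Gamma, y, sparsity):
    """Plain OMP, used here only as a stand-in for the StOMP operator."""
    support, w_hat = [], np.zeros(Gamma.shape[1])
    residual = y.astype(float)
    for _ in range(sparsity):
        k = int(np.argmax(np.abs(Gamma.T @ residual)))
        if k not in support:
            support.append(k)
        coef, *_ = np.linalg.lstsq(Gamma[:, support], y, rcond=None)
        w_hat = np.zeros(Gamma.shape[1])
        w_hat[support] = coef
        residual = y - Gamma @ w_hat
    return w_hat


def ccra_step(W, Gamma, C, U, d, mu, sparsity):
    """One CCRA iteration for all N nodes (rows of W are w_i(t)).

    Assumes a common D x M sensing matrix Gamma and non-zero regressors.
    """
    W_bar = W @ Gamma.T                        # compression, equation (16)
    phi_bar = C @ W_bar                        # combination of D x 1 vectors, equation (17)
    phi = np.array([omp(Gamma, p, sparsity) for p in phi_bar])  # reconstruction, (18)
    err = d - np.sum(U * phi, axis=1)          # d_i(t) - u_i^T(t) phi_i(t+1)
    mu_n = mu / np.sum(U * U, axis=1)          # normalized step mu_i / (u_i^T u_i)
    return phi + (mu_n * err)[:, None] * U     # NLMS adaptation, equation (19)


# Single call on hypothetical data, only to show the expected shapes
rng = np.random.default_rng(2)
N, M, D, S = 4, 32, 12, 2
Gamma = rng.standard_normal((D, M)) / np.sqrt(D)
C = np.full((N, N), 1.0 / N)                   # fully connected, uniform weights
w_o = np.zeros(M)
w_o[[5, 20]] = [1.0, 0.5]
U = rng.standard_normal((N, M))
d = U @ w_o + 0.01 * rng.standard_normal(N)
W = ccra_step(np.zeros((N, M)), Gamma, C, U, d, mu=0.4, sparsity=S)
print(W.shape)
```

Only the rows of `W_bar`, each of length D, would need to be transmitted between nodes; the reconstruction and adaptation are purely local.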

Stability analysis

To analyze the stability of the whole network, we describe it with a global state–space model. The quantities of the whole network are defined as follows. The a priori estimation vector $φ (t) \overset{Δ}{=} col \{ φ_{1} (t), \dots, φ_{N} (t) \}$ and the estimation vector $W (t) \overset{Δ}{=} col \{ w_{1} (t), \dots, w_{N} (t) \}$ have dimensions MN × 1. The compressed a priori estimation vector $\bar{φ} (t) \overset{Δ}{=} col \{ {\bar{φ}}_{1} (t), \dots, {\bar{φ}}_{N} (t) \}$ and the compressed estimation vector $\bar{W} (t) \overset{Δ}{=} col \{ {\bar{w}}_{1} (t), \dots, {\bar{w}}_{N} (t) \}$ have dimensions DN × 1. The global regression matrix is $U (t) \overset{Δ}{=} \begin{bmatrix} u_{1} (t) & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & u_{N} (t) \end{bmatrix}$ with dimensions MN × N. The global measurement vector $d (t) \overset{Δ}{=} col \{ d_{1} (t), \dots, d_{N} (t) \}$ and the global noise vector $V (t) \overset{Δ}{=} col \{ v_{1} (t), \dots, v_{N} (t) \}$ have dimensions N × 1. The step size matrix is $μ' \overset{Δ}{=} diag \{ {μ^{'}}_{1} I_{M}, \dots, {μ^{'}}_{N} I_{M} \}$ with dimensions NM × NM, where $I_{M}$ is the M × M identity matrix. The global combination matrix is $Λ \overset{Δ}{=} γ \otimes I_{M}$ with dimensions NM × NM, where $γ = \begin{bmatrix} γ_{11} & \dots & γ_{1 N} \\ ⋮ & ⋱ & ⋮ \\ γ_{N 1} & \dots & γ_{NN} \end{bmatrix}$ and ⊗ denotes the Kronecker product. The global sensing matrix is $Γ \overset{Δ}{=} \begin{bmatrix} Γ_{1} & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & Γ_{N} \end{bmatrix}$ with dimensions DN × NM.

The global compression stage is $\bar{W} (t) = Γ W (t)$ and $\bar{φ} (t) = Γ φ (t)$ . The combination stage in state–space form is $φ (t + 1) = Λ W (t)$ in full dimension and $\bar{φ} (t + 1) = (γ \otimes I_{D}) \bar{W} (t)$ in compressed dimension, where $I_{D}$ replaces $I_{M}$ because each compressed vector has D entries. The StOMP reconstruction of the global model is denoted by $H (•)$ , so the global reconstruction stage is $φ (t + 1) = H (\bar{φ} (t + 1))$ .

The simple global update function of the linear regression network model without cooperation can be represented as in equation (20)

W (t + 1) = φ (t + 1) + μ' U (t) (d (t) - U^{T} (t) φ (t + 1))

(20)

The global update function of the network with the proposed distributed CCRA can be written as in equation (21), where $\bar{Λ} \overset{Δ}{=} γ \otimes I_{D}$ is the combination matrix acting on the compressed vectors

W (t + 1) = H (\bar{Λ} Γ W (t)) + μ' U (t) (d (t) - U^{T} (t) H (\bar{Λ} Γ W (t)))

(21)

If the StOMP algorithm reconstructs the compressed estimation fully, which requires an appropriate sensing matrix and parameter D, the stability in the mean of equation (21) is similar to that of equation (22)

W (t + 1) = Λ W (t) + μ' U (t) (d (t) - U^{T} (t) Λ W (t))

(22)

The true parameter vector $W_{o}$ is defined as $W_{o} \overset{Δ}{=} col \{ w_{o}, w_{o}, \dots, w_{o} \}$ with dimensions MN × 1. So we get the global regression model in state–space representation

d (t) = U^{T} (t) W_{o} + V (t)

(23)

Let us denote the estimation error vector by $Δ (t) \overset{Δ}{=} W_{o} - W (t)$ . Substituting equation (23) into equation (22), the error recursion can be written as in equation (24)

Δ (t + 1) = W_{o} - Λ W (t) - μ' U (t) (U^{T} (t) W_{o} + V (t) - U^{T} (t) Λ W (t))

(24)

Since the rows of $γ$ sum to one, $Λ W_{o} = W_{o}$ , and equation (24) is equal to equation (25)

\begin{matrix} Δ (t + 1) = Λ W_{o} - Λ W (t) - μ' U (t) (U^{T} (t) Λ W_{o} + V (t) - U^{T} (t) Λ W (t)) \\ = Λ Δ (t) - μ' U (t) (U^{T} (t) Λ Δ (t) + V (t)) \\ = (I_{MN} - μ' U (t) U^{T} (t)) Λ Δ (t) - μ' U (t) V (t) \end{matrix}

(25)

Assuming that the regressors are temporally and spatially independent and that the zero-mean noise is independent of the regressors, we take the expectation of both sides of the equation

E (Δ (t + 1)) = (I_{MN} - μ' R) Λ E (Δ (t))

(26)

where $R = diag \{ R_{1}, \dots, R_{i}, \dots, R_{N} \}$ and $R_{i} = E (u_{i} (t) u_{i}^{T} (t))$ .

To achieve stability in the mean of the global network model, the condition in equation (27) should hold, where $λ_{max}$ denotes the eigenvalue of largest magnitude

| λ_{max} ((I_{MN} - μ' R) Λ) | < 1

(27)

Denote the data matrix by $X = I_{MN} - μ' R$ . The convergence of the estimation for the whole network depends on the data matrix $X$ and the combination matrix $Λ$ . If the whole network operates without cooperation, the recursion and its stability condition are as in equation (28)

E (Δ (t + 1)) = (I_{MN} - μ' R) E (Δ (t)), \quad | λ_{max} (X) | < 1

(28)

When the network operates with cooperation, we should consider $λ (X Λ)$ . By the submultiplicative property of the spectral norm, equation (29) holds

‖ X Λ ‖_{2} \leq ‖ X ‖_{2} • ‖ Λ ‖_{2}

(29)

so that $σ_{max} (X Λ) \leq σ_{max} (X) • σ_{max} (Λ)$ , where $σ_{max}$ is the largest singular value of the matrix. Since $Λ$ is symmetric with $σ_{max} (Λ) = 1$ , $X$ is symmetric so that $σ_{max} (X) = | λ_{max} (X) |$ , and $| λ_{max} (X Λ) | \leq σ_{max} (X Λ)$ , we have equation (30)

| λ_{max} (X Λ) | \leq | λ_{max} (X) |

(30)

In other words, the spectral radius of $X Λ$ is no larger than that of $X$ . Therefore, the cooperation in the proposed CCRA method has a stabilizing effect, in the mean, on the LMS algorithm.
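The inequality in equation (30) can be checked numerically on a small example. The 3-node line network, the covariance blocks and the step size below are hypothetical values chosen only for illustration; a scalar step size is used in place of the normalized one.

```python
import numpy as np

# Hypothetical 3-node line network with Metropolis weights (symmetric, rows sum to one)
gamma = np.array([[2/3, 1/3, 0.0],
                  [1/3, 1/3, 1/3],
                  [0.0, 1/3, 2/3]])
M = 2
Lam = np.kron(gamma, np.eye(M))                # global combination matrix

# Hypothetical regressor covariances R_i on the block diagonal of R
R = np.zeros((3 * M, 3 * M))
for i, scale in enumerate([1.0, 2.0, 0.5]):
    R[i*M:(i+1)*M, i*M:(i+1)*M] = scale * np.array([[1.0, 0.3], [0.3, 1.0]])
mu = 0.1 * np.eye(3 * M)                       # step size matrix (illustrative value)
X = np.eye(3 * M) - mu @ R                     # data matrix X = I - mu' R


def spectral_radius(A):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))


rho_X = spectral_radius(X)
rho_XL = spectral_radius(X @ Lam)
print(rho_X, rho_XL, np.linalg.norm(Lam, 2))   # radius without / with cooperation, sigma_max of Lam
```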

Simulation results

To illustrate the performance of the proposed method, we compare it with other LMS methods: DLMS, NDLMS (the normalized version), ATIC¹³ and RC-DLMS.¹¹ The network topology considered is shown in Figure 4 and has N = 20 nodes.

Figure 4.

Network topology in the simulations.

The input regressors are vectors of length M = 32, generated as sample vectors $u_{i, t} = [u_{i} (t), u_{i} (t - 1), \dots, u_{i} (t - M + 1)]^{T}$ of an AR-1 process of the form $u_{i} (t) = x_{i} (t) + ρ_{i} u_{i} (t - 1)$ , where $ρ_{i}$ is a correlation coefficient and $x_{i} (t)$ is a white process with $σ_{x, i} = 1$ . The trace of each node’s regression matrix $R_{i}$ is shown in Figure 5. The noise input $v_{i} (t)$ at each node is zero-mean Gaussian, and the noise variance of each node is shown in Figure 6. The input regressors and noise are temporally and spatially independent of each other. The parameter vector to be estimated has sparsity S = 2. The sensing matrix $Γ_{i}$ is an i.i.d. Gaussian matrix with dimensions D × M, where D = 12 is kept constant. The step size of DLMS, DLMS without cooperation, ATIC and RC-DLMS is set to $μ = 0.01$ . The step size of the normalized methods, NDLMS and the proposed CCRA, is set to ${μ_{i}}^{'} = 0.4 / (u_{i}^{T} (t) u_{i} (t))$ . The period length of the ATIC method is set to 2, and the selection probability parameter of RC-DLMS is 0.5. All the curves shown in the figures are averages over 50 independent runs.
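The regressor model described above can be sketched as follows. The function name `ar1_regressors`, the value of the correlation coefficient and the zero initial condition are assumptions for this example; the article does not list the values of $ρ_{i}$ .

```python
import numpy as np


def ar1_regressors(T, M, rho, rng):
    """Return a (T, M) array whose row t is [u(t), u(t-1), ..., u(t-M+1)]."""
    x = rng.standard_normal(T + M - 1)         # white driving process, unit variance
    u = np.zeros(T + M - 1)
    u[0] = x[0]                                # assumed initial condition u(-1) = 0
    for n in range(1, T + M - 1):
        u[n] = x[n] + rho * u[n - 1]           # u(t) = x(t) + rho * u(t-1)
    # Row t collects the M most recent samples, newest first
    return np.array([u[t + M - 1::-1][:M] for t in range(T)])


# Hypothetical correlation coefficient for one node
U_i = ar1_regressors(T=1000, M=32, rho=0.5, rng=np.random.default_rng(0))
print(U_i.shape)
```

Each row is a shifted copy of the previous one with a single new sample in front, which is the tapped-delay-line structure implied by the definition of $u_{i, t}$ .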

Figure 5.

Trace of regression matrices.

Figure 6.

Noise variance.

As a performance metric, we use the mean-square error (MSE) of the whole network, defined as $MSE (dB) = 20 \log (\frac{1}{N} E ‖ d (t) - U^{T} (t) W (t) ‖_{2})$ . Figure 7 shows that the proposed CCRA method converges faster than the other methods. We also use the mean-square deviation (MSD) of the whole network, defined as $MSD (dB) = 20 \log (\frac{1}{N} E ‖ W (t) - W_{o} ‖_{2})$ , which is shown in Figure 8. At time instant 520, the network MSD of the proposed CCRA is already below −50 dB, while the other methods with cooperation reach nearly −50 dB only at about time instant 1200. As Figures 7 and 8 show, the proposed CCRA achieves a better convergence rate and the same or better performance in network MSD and network MSE than the other methods.
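The network MSD defined above can be evaluated for a single realization as in the sketch below. The function name `network_msd_db` and the array layout are assumptions; the expectation in the definition is left to the caller, for example by averaging the norm over independent runs before taking the logarithm.

```python
import numpy as np


def network_msd_db(W, w_o):
    """Network MSD in dB for one realization, following the definition in the text.

    W is (N, M) with row i equal to w_i(t); w_o is the (M,) true parameter.
    """
    N = W.shape[0]
    err_norm = np.linalg.norm(W - w_o)         # Euclidean norm of the stacked error W(t) - W_o
    return 20.0 * np.log10(err_norm / N)


# Hypothetical example: two nodes, one of them off by 0.2 in a single coordinate
W = np.array([[0.2, 0.0], [0.0, 0.0]])
print(network_msd_db(W, np.zeros(2)))
```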

Figure 7.

Network MSE performance.

Figure 8.

Network MSD performance.

To evaluate the steady-state performance, we average the data of the last 500 time instants. We define the steady-state MSE of node i as $MSE_{i} (dB) = 20 \log (E ‖ d_{i} (t) - u_{i}^{T} (t) w_{i} (t) ‖_{2})$ and the steady-state MSD as $MSD_{i} (dB) = 20 \log (E ‖ w_{o} - w_{i} (t) ‖_{2})$ . In Figure 9, the steady-state MSE of the CCRA method is about −60 dB, which is slightly smaller than the MSE of the other methods with cooperation. Figure 10 shows that the steady-state MSD of the CCRA is about −55 dB, while the MSD of the other methods with cooperation is between −49 dB and −47 dB.

Figure 9.

Steady state MSE performance per node.

Figure 10.

Steady state MSD performance per node.

The DLMS and NDLMS methods use standard diffusion communication, whereas ATIC, RC-DLMS and the proposed CCRA are designed to reduce the communication load in different ways. Figure 11 shows the communication load of each method in our simulation. The standard diffusion method has 12,800 dimensions to send and receive in the whole network. ATIC and RC-DLMS reduce the communication load to 6400 on average while almost achieving the performance of the standard method. Furthermore, the proposed CCRA method reduces the communication load to 4800 without affecting the performance of the network.

Figure 11.

Communication load of the network.
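The reported loads are consistent with simple scaling of the standard diffusion load, as the short check below illustrates. It assumes that ATIC with period length 2 exchanges estimations in half of the iterations and that RC-DLMS with selection probability 0.5 uses half of the links on average; the baseline of 12,800 is taken from the text, not derived here.

```python
M, D = 32, 12                      # full and compressed dimensions from the simulation setup
full_load = 12800                  # load of standard diffusion reported in the text

ccra_load = full_load * D // M     # each exchanged vector shrinks from M to D entries
atic_load = full_load // 2         # assumed: combination in every second iteration
rc_load = int(full_load * 0.5)     # assumed: half of the neighbors selected on average
print(ccra_load, atic_load, rc_load)   # 4800 6400 6400
```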

Conclusion

We present a distributed estimation method over networks, called CCRA, that reduces the communication load based on CS. The derivation of the diffusion algorithm is presented. In this method, only a compressed estimation is exchanged between nodes in the network, which is possible because the unknown parameter is sparse. We describe the whole network in a global state–space model and analyze the global stability of the proposed CCRA method. Compared with other existing methods in the simulations, the proposed CCRA achieves the same or better performance (including network MSD, network MSE, steady-state MSE and steady-state MSD) and faster convergence while reducing the communication load significantly.

Footnotes

Handling Editor: Antonio Lazaro

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Lin Li

References

1. Abdolee, Champagne. Diffusion LMS strategies in sensor networks with noisy input data. IEEE ACM T Network 2015; 24(1): 3–14.

2. Arablouei, Huang, Werner, et al. Reduced-communication diffusion LMS strategy for adaptive distributed estimation. Signal Process 2014, https://www.researchgate.net/publication/265089578_Reduced-Communication_Diffusion_LMS_Strategy_for_Adaptive_Distributed_Estimation

3. Sahoo, Panda, Mulgrew, et al. Robust incremental adaptive strategies for distributed networks to handle outliers in both input and desired data. Signal Process 2014; 96(5): 300–309.

4. Cattivelli, Sayed. Analysis of spatial and incremental LMS processing for distributed estimation. IEEE T Signal Process 2011; 59(4): 1465–1480.

5. Lopes, Sayed. Diffusion least-mean squares over adaptive networks: formulation and performance analysis. IEEE T Signal Process 2008; 56(7): 3122–3136.

6. Cattivelli, Sayed. Diffusion LMS strategies for distributed estimation. IEEE T Signal Process 2010; 58(3): 1035–1048.

7. Chen, Sayed. Diffusion adaptation strategies for distributed optimization and learning over networks. New York: IEEE Press, 2012.

8. Fernandez-Bes, Azpicueta-Ruiz, Silva MTM, et al. A novel scheme for diffusion networks with least-squares adaptive combiners. In: 2012 IEEE international workshop on machine learning for signal processing, Santander, Vol. 248, 23–26 September 2012, pp.1–6. New York: IEEE.

9. Tewari, Vaisla. Performance study of SEP and DEC hierarchical clustering algorithm for heterogeneous WSN. In: 2014 6th international conference on computational intelligence and communication networks, Bhopal, India, 14–16 November 2014, pp.385–389. New York: IEEE.

10. Senouci, Mellouk, Senouci, et al. Performance evaluation of network lifetime spatial-temporal distribution for WSN routing protocols. J Netw Comput Appl 2012; 35(4): 1317–1328.

11. Arablouei, Werner, Doğançay, et al. Analysis of a reduced-communication diffusion LMS algorithm. Signal Process 2015; 117: 355–361.

12. Werner, Huang, Campos MLRD, et al. Distributed parameter estimation with selective cooperation. In: IEEE international conference on acoustics, speech and signal processing, Taipei, Taiwan, 19–24 April 2009, pp.2849–2852. New York: IEEE.

13. Wang, Tay. An energy-efficient diffusion strategy over adaptive networks. In: International conference on information, communications and signal processing, Singapore, 2–4 December 2015, pp.1–5. New York: IEEE.

14. Tu, Sayed. Diffusion strategies outperform consensus strategies for distributed estimation over adaptive networks. IEEE T Signal Process 2012; 60(12): 6217–6234.

15. Chen, Hero. Sparse LMS for system identification. In: IEEE international conference on acoustics, speech and signal processing, Taipei, Taiwan, 19–24 April 2009, pp.3125–3128. New York: IEEE.

16. Di Lorenzo, Barbarossa, Sayed. Distributed spectrum estimation for small cell networks based on sparse diffusion adaptation. IEEE Signal Proc Let 2013; 20(12): 1261–1265.

17. Yim, Lee, Song. A proportionate diffusion LMS algorithm for sparse distributed estimation. IEEE T Circuits II 2015; 62(10): 992–996.

18. Huang, Misra, Tang, et al. Applications of compressed sensing in communications networks. arXiv:1305.3002, 2014.

19. Candes, Wakin. An introduction to compressive sampling: a sensing/sampling paradigm that goes against the common knowledge in data acquisition. IEEE Signal Proc Mag 2008; 25(2): 21–30.

20. Di Lorenzo, Sayed. Sparse distributed learning based on diffusion adaptation. IEEE T Signal Process 2013; 61(6): 1419–1433.

21. Xu, de Lamare, Poor. Distributed compressive estimation based on compressive sensing. IEEE Signal Proc Let 2015; 22(9): 1311–1315.

22. Xu, de Lamare, Poor. Distributed low-rank adaptive estimation algorithms based on alternating optimization. Signal Process 2018; 144: 41–51.

23. Baraniuk. Compressive sensing [lecture notes]. IEEE Signal Proc Mag 2007; 24(4): 118–121.

24. Candès, Recht. Exact matrix completion via convex optimization. Found Comput Math 2009; 9(6): 717.

25. Pati, Rezaiifar, Krishnaprasad. Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: Proceedings of the 27th Asilomar conference on signals, systems & computers, Pacific Grove, CA, 1–3 November 1993, pp.40–44. New York: IEEE.

26. Zachariah, Chatterjee, Jansson. Dynamic subspace pursuit. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP), Kyoto, Japan, 25–30 March 2012, pp.3605–3608. New York: IEEE.

27. Wei, Rodrigues, Wassell. Distributed compressive sensing reconstruction via common support discovery. In: IEEE international conference on communications, Kyoto, Japan, 5–9 June 2011. New York: IEEE.

28. Donoho, Tsaig, Drori, et al. Sparse solution of underdetermined systems of linear equations by stagewise orthogonal matching pursuit. IEEE T Inform Theory 2012; 58(2): 1094–1121.