On Linear Coding Schemes for Stabilizing LTI Control with Multiple Sensors

Abstract

We examine the problem of designing the encoding and control policies of a linear stochastic control system, where the communication channel between the plant state observer (sensor) and the controller is a lossy wireless channel that is constrained in terms of transmit power and bandwidth. For a first-order ARMA modeled plant with Gaussian statistics, when there are two sensors observing the plant, nonlinear encoding is shown to result in smaller cost at time instant $T = 1$ compared to the linear schemes, if transmissions are carried out over parallel Gaussian independent channels. In this paper, optimal linear coding schemes for the case of multiple sensors are examined. They are shown to minimize the control cost at the infinite time horizon, when the wireless channel is accessed using time division multiplexing. Our analysis is carried out for when separation between the state estimation and control is possible, and the optimal steady state control law is certainty equivalent. The distortion lower bound for estimating the plant state is derived, along with the necessary conditions on the transmit power that minimize the steady state control cost. We also propose a linear scheme that reaches the distortion bound asymptotically under relaxed conditions.

1. Introduction

Wireless sensors and communication have become an integral part of closed-loop control systems in recent years. Consider, for example, the following simple control application. We wish to regulate the temperature of a room. Multiple sensor nodes are placed at different locations in the room for taking temperature measurements. The measurements are then sent to the controller, which utilizes the sensor observations when issuing new control commands to the heating or cooling elements. Naturally, benefits such as flexible placement, reduced installation costs, and easier maintenance can be expected if sensor nodes communicate with the controller via wireless channels. This, however, also means that the sensor nodes are now sharing the communication medium, which is lossy and in general also restricted in terms of bandwidth and power. Such information constraints make networked control systems differ from the classic wired setting. As pointed out in [1], designing the optimal communication and control policies in this case is no simple task, since the changes in one directly influence the outcome of the other. The controller may have to tolerate large errors and/or delays in sensor observations, while the sensors may have to encode and transmit their measurements subject to strict controller requirements, in addition to those imposed by the wireless channel. The situation may be further complicated by the network topology if we take into account possible multihop and relaying between sensor nodes and between the sensors and the controller. Although many challenges in optimization of communication and control in these scenarios remain open [1], if we restrict our attention to linear and stochastic plants, and if the control objective is to minimize the linear quadratic Gaussian (LQG) control cost, there are a few well-established results.

In particular, suppose that a memoryless additive Gaussian noise channel lies between the plant state observer (a single sensor) and the controller, and that the plant is described by a first-order ARMA (autoregressive moving average) model. Suppose further that the encoder at the sensor has access to the current plant state and all past channel outputs, while the controller only has access to all past and current channel outputs. Simultaneous optimization of the encoding and control strategies to minimize the LQG cost over a finite or infinite time horizon then leads to linear schemes for both the encoder at the sensor node and the controller [2].

In practice, the sensor observation may experience noise. In the case that the observation noise is additive memoryless Gaussian, optimality of linear encoding and control still holds when the noisy observations are transmitted over parallel Gaussian channels [3]. If we extend this setting to one that includes two sensor nodes, we arrive at a rather different result.

When both sensors are used to measure the plant state with certain observation noise, their measurements are correlated. If these measurements (intercorrelated Gaussian) are transmitted over two parallel Gaussian channels, it was shown in [4] that for a time horizon of $T = 1$, a nonlinear encoding scheme can result in a lower LQG cost compared to the optimal linear strategy.

The findings of [4] are predominantly of a communication nature. The control applied to the plant is limited to time horizon $T = 0$ and is also linear. In this paper, we further extend the network topology to more than two sensors. The angle of our approach to the multisensor control problem is also rooted in communications. Namely, we ask ourselves the question “exactly how well can the linear coding schemes perform in such a multisensor control system?”. Naturally, the answer is trivial if we restrict the control horizon to $T = 0$, or the problem can be unmistakably complex if our aim is to simultaneously optimize communication and control for an infinite time horizon. We can, however, simplify our problem at hand by using the divide and conquer strategy. That is, we look at the case when communication (i.e., estimation of state) and control can be separated and then focus solely on the communication aspect of the problem. Note also that the term “communication” in this paper is confined to the application and physical layers of the communication protocol stack. Unquestionably, other layers can/will be involved when a more complex network is considered.

Our contribution in this paper is the following. We first identify the condition under which communication and control can be separated, when our goal is to minimize the control cost at the infinite time horizon. We then derive the distortion lower bound for multiple sensors observing the plant state and transmitting over the AWGN channel. We show when this distortion lower bound is achievable using linear coding schemes. Using these results, we present the necessary and sufficient conditions for stabilizing the plant. We then propose another linear coding scheme which achieves the minimum distortion asymptotically under relaxed conditions.

The basis of many of our findings is not new within the communications domain. In particular, linear coding and transmission with and without feedback have existed since the 1960s. However, further insights can be gained by taking these approaches and the new variations within the context of control. For instance, with a defined communication cost required to stabilize the control system, tradeoffs can be determined when designing implementable schemes. Note also that in this paper we restrict our attention to a simple first-order plant model. The main motivation is that certain issues regarding multisensor communication in this setting have remained unresolved until now. Our contribution is one step towards studying communication and control that involves a more complex plant. Keep in mind also that the more general linear systems constitute a significant portion of control systems, where the plant dynamics are more easily tractable. In addition, as we will show in the paper, linear control policies play a significant role in communication and control separation.

The rest of the paper is organized as follows. In Section 2, we present the proper formulation of our problem and show the conditions under which separation of communication and control can take place. The main results of the paper are summarized in Section 3. Numeric examples are given in Section 4 to provide further insights on how the different linear schemes perform under different sensor network conditions. We conclude the paper in Section 5 along with possible future work.

2. Problem Formulation and the Preliminaries

Figure 1 depicts the general structure of the closed-loop control system. The plant state $s (t)$ at time instant t is observed by M sensors, which are placed physically apart over a defined area. Each sensor experiences observation noise $w_{m} (t)$ . The sensor measurements are encoded with the help of additional side information $\vec{a} (t - 1)$ to produce channel input symbols $x_{m} (t)$ . The side information $\vec{a} (t - 1)$ constitutes what is available at the controller prior to receiving the current channel output $y_{m} (t)$ . It is made available at the encoders via a noiseless feedback channel. The exact content of $\vec{a} (t)$ will be discussed later. On the receiving side of the lossy wireless communication channel, we consider a joint decoder that processes all channel outputs. The decoder is set to be colocated with the controller, so it has access to the controller information. This is indicated by the dotted arrow to the decoder. The decoder output $z (t)$ is used by the controller to produce the control signal $r (t)$ , which is subsequently fed back to the plant.

Figure 1

Closed-loop control system with M sensors.

We consider a discrete time, LTI (linear-time-invariant) scalar plant that is defined by the following state equation:

\begin{matrix} s (t + 1) = γ s (t) + r (t) + d (t), \forall t \geq 0 \end{matrix},

(1)

where γ is a known constant, $d (t)$ is the plant disturbance, and $s (t), d (t), r (t) \in ℝ$. Both $s (0)$ and $d (t)$ are independently and identically distributed (i.i.d.) zero mean memoryless Gaussian random variables (r.v.'s) of variance $σ_{s}^{2} (0)$ and $σ_{d}^{2}$, respectively. Note that we do not restrict γ, so the plant may be unstable without control.

Figure 2 illustrates the communication part of the closed-loop control system in detail. Here we have

\begin{matrix} u_{m} (t) = s (t) + w_{m} (t), \quad m = 1,2, \dots, M \end{matrix},

(2)

where $u_{m} (t)$ is the observation of $s (t)$ at sensor m. The observation noise $w_{m} (t)$ is zero mean, memoryless Gaussian with variance $σ_{w_{m}}^{2} (t)$ and is independent of $s (t)$. In addition, the observations $u_{m} (t), m = 1,2, \dots, M$ are conditionally independent of each other, given $s (t)$. Each sensor node is equipped with its own encoder, denoted by the function $ℱ_{m} (\cdot)$, so that $x_{m} (t) = ℱ_{m} (u_{m} (t), \vec{a} (t - 1))$. We consider an information pattern where $\vec{a} (t - 1)$ consists of past controls $r_{0}^{t - 1}$ and decoder outputs $z_{0}^{t - 1}$, where the notation $r_{0}^{t - 1}$ means $[r (0), \dots, r (t - 1)]$. In other words, the encoder has full knowledge. Naturally, depending on the application, the amount of feedback information available at the encoder may vary, or the feedback quality itself may be limited. It is, however, also important to examine the system under a rather idealized scenario, so that we can determine the respective performance bounds. A brief discussion of other information patterns and feedback limitations is presented at the end of the paper.

Figure 2

A sensor network with M sensors and orthogonal transmission.

The channel input $x_{m} (t)$ from node m is subjected to an average transmit power constraint $P_{m}$ , which is defined as

\begin{matrix} \frac{1}{T} \sum_{t = 0}^{T - 1} E [x_{m} {(t)}^{2}] \leq P_{m} . \end{matrix}

(3)

The sensor nodes access the wireless channel using orthogonal transmission, for example, TDMA or FDMA (time or frequency division multiple access). This is illustrated in the figure as M parallel independent subchannels. Furthermore, we assume that if the plant produces the state signal at a rate of $2 B$ symbols per second, that is, the state signal occupies a bandwidth of B symbols per second, then the wireless channel is said to have a total bandwidth of $M \cdot B$, where each subchannel has bandwidth B. The subchannels are corrupted by additive white Gaussian noise $n_{m} (t)$ of zero mean and variance $σ_{n}^{2}$. The channel noises are also mutually independent. At the receiver side, the decoder output $z (t)$ is produced using the decoding function 𝒢 and past control signals, so that $z (t) = 𝒢 (y_{m = 1 ~ M} (t), r_{0}^{t - 1})$.

We define the steady state Linear Quadratic Gaussian (LQG) cost of the above control system as

\begin{matrix} J = \limsup_{T \to \infty} \frac{1}{T} E [\sum_{t = 0}^{T - 1} (s^{2} (t) + κ r^{2} (t))] \end{matrix}

(4)

for some $κ > 0$. Our goal is to design the encoder-controller pair such that the cost function is minimized for the plant defined by (1), under the communication constraints specified above.

Before proceeding further, we will first simplify our problem by identifying the optimal control policy.

Proposition 1.

The optimal control for the system is the certainty equivalence control law, given that the encoding functions $ℱ_{m}$ are linear.

Proof.

The proof follows the same argument as shown in [5]. We present here a brief summary of the main points.

For the plant in (1), under full state observation, that is, when the controller has direct access to $s (t)$ , the optimal steady state control law is the certainty equivalent (CE) control law [6]. That is, the control signal is the state signal with a linear gain: $r (t) = l \cdot s (t)$ , where

\begin{matrix} l = \frac{- h γ}{(h + κ)}, \end{matrix}

(5)

and h satisfies the scalar Riccati equation:

\begin{matrix} h = 1 + \frac{γ^{2} h κ}{h + κ} . \end{matrix}

(6)
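As an illustration, the gain in (5) and the Riccati equation (6) can be evaluated numerically. The following is a minimal sketch, not part of the original derivation: the function names `riccati_h` and `ce_gain` and the values chosen for γ and κ are assumptions made for the example. It uses the fact that multiplying (6) by $(h + κ)$ gives a quadratic in h with a single positive root.

```python
import math


def riccati_h(gamma: float, kappa: float) -> float:
    """Positive solution h of the scalar Riccati equation (6)."""
    # Multiplying (6) by (h + kappa) gives h^2 + b*h - kappa = 0
    # with b = kappa - 1 - gamma^2 * kappa.
    b = kappa - 1.0 - gamma**2 * kappa
    return 0.5 * (-b + math.sqrt(b * b + 4.0 * kappa))


def ce_gain(gamma: float, kappa: float) -> float:
    """Certainty equivalent gain l of (5)."""
    h = riccati_h(gamma, kappa)
    return -h * gamma / (h + kappa)


# Illustrative (assumed) parameters: an open-loop unstable plant.
gamma, kappa = 2.0, 1.0
h = riccati_h(gamma, kappa)
l = ce_gain(gamma, kappa)
print("h =", h, " l =", l, " closed-loop pole =", gamma + l)
```

With full state observation the closed-loop pole is $γ + l = γ κ / (h + κ)$, whose magnitude should be below one for the chosen parameters.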

The communication channel from the sensors to the controller turns the fully observed plant into a partially observed one. In order for the CE control law also to be optimal for system (1), that is, $r (t) = l \cdot z (t)$, where $z (t)$ is the decoder's estimate of the plant state, the control must have no dual effect (as stated in [5, Proposition 3.1]). This means that the estimation error variance of the state signal is independent of the control applied for all t. Memoryless Gaussian channels with any linear encoders in fact satisfy the necessary conditions for ensuring no dual effect of the control signal, as shown in [5, Lemma 3.1]. Consequently, for system (1), the optimal control remains certainty equivalent, with the gain given by (5).

The partially observed control problem is then reduced to a fully observed one, by modeling the estimated state signal as the new state:

\begin{matrix} z (t + 1) = γ z (t) + r (t) + \tilde{d} (t), \end{matrix}

(7)

where $\tilde{d} (t)$ is the new information being transmitted over the channel. The optimal control is

\begin{matrix} r (t) = - \frac{h γ}{h + κ} z (t) = - \frac{h γ}{h + κ} \hat{s} (t), \\ \text{where } \hat{s} (t) = E [s (t) ∣ y_{m = 1 ~ M} (t), r_{0}^{t - 1}] . \end{matrix}

(8)

Define $Δ (t, M) = E [{| s (t) - \hat{s} (t) |}^{2}]$. The cost function can be rewritten as [5]

\begin{matrix} \tilde{J} (F_{m = 1 ~ M}, G) = \frac{h^{2} γ^{2}}{h + κ} Δ_{t \to \infty} (t, M) + h σ_{d}^{2} . \end{matrix}

(9)

Since the second term is a constant that depends only on the control system parameters $γ, κ$ and the plant disturbance variance $σ_{d}^{2}$, our objective of minimizing the cost function given in (9) is met if there exist linear encoding schemes that minimize $Δ (t, M)$ for all t under the given communication constraints. We will show next that this is indeed possible.

3. Summary of Results

In this section, we present a summary of our findings in the form of lemmas and theorems, along with the necessary proofs.

We begin by analyzing the source of observation in the sensor network, $s (t)$. Given that the side information $\vec{a} (t - 1) = [z_{0}^{t - 1}, r_{0}^{t - 1}]$ is available at the sensor nodes, each encoder can compute the following:

\begin{align} s (t) + w_{m} (t) - E [s (t) ∣ z_{0}^{t - 1}] & = γ (s (t - 1) - \hat{s} (t - 1)) + d (t - 1) + w_{m} (t) \\ & = γ λ (t - 1) + d (t - 1) + w_{m} (t), \end{align}

(10)

where $λ (t - 1) = s (t - 1) - \hat{s} (t - 1)$ is the estimation error of the state signal at instant $t - 1$.

Define the innovation process $i (t)$ as

\begin{matrix} i (t) = γ λ (t - 1) + d (t - 1) . \end{matrix}

(11)

The communication scenario we consider is thus equivalent to having the sensors observe the innovation $i (t)$ without any side information. As the coding operations we consider here are linear, we can therefore assume, without loss of generality, that the innovation is memoryless and Gaussian distributed with zero mean and variance $σ_{i}^{2} (t)$. It is then sufficient to evaluate $Δ (t, M)$ in terms of the innovation's variance.

Lemma 1.

Given a sensor network as depicted in Figure 2, at time t, the distortion $Δ (t, M)$ is lower bounded by $Δ_{l} (t, M)$ :

\begin{matrix} Δ_{l} (t, M) = \frac{σ_{i}^{2} (t)}{1 + σ_{i}^{2} (t) \sum_{1}^{M} (1 / σ_{w_{m}}^{2} (t))} (1 + \frac{σ_{i}^{2} (t) \sum_{1}^{M} (1 / σ_{w_{m}}^{2} (t))}{\prod_{1}^{M} (1 + P_{m} / σ_{n}^{2})}), \end{matrix}

(12)

where $σ_{i}^{2} (t)$ is the variance of the innovation.

This statement remains true even when arbitrary collaboration between the sensor nodes is allowed or when feedback is present at some or all sensor nodes.

Proof.

We prove the above distortion lower bound by considering an idealized communication scenario. This approach has been applied in other contexts (see [7, 8]).

Let the M sensor nodes be connected by wires or ideal links (ideal links in the communication sense refer to lossless channels with infinite capacity), so that full collaboration is possible. This is equivalent to replacing the sensors with a single “fusion node”. We then have a point-to-point (p2p) channel between the fusion node and the decoder. The resulting structure constitutes the so-called “remote sensing” problem [9]. The optimal performance of a p2p communication system is characterized by equality between the rate-distortion function of the source and the channel capacity, $R (D_{r}) = C$. The M parallel memoryless Gaussian subchannels, now fused into one, have a total capacity of

\begin{matrix} C = \sum_{1}^{M} \frac{1}{2} \log_{2} (1 + \frac{P_{m}}{σ_{n}^{2}}) . \end{matrix}

(13)

The distortion-rate function of the remote sensing problem is [9]

\begin{matrix} D_{r} (t, R) = \frac{σ_{i}^{2} (t)}{1 + Ψ} (1 + 2^{- 2 R} Ψ), \\ \text{where } Ψ = σ_{i}^{2} (t) \sum_{1}^{M} \frac{1}{σ_{w_{m}}^{2} (t)} . \end{matrix}

(14)

Replacing R with C in the above equation, we arrive at the claimed formula.
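The substitution can be checked numerically. The sketch below evaluates the bound (12) directly and again through the distortion-rate function (14), under the assumption that each real Gaussian subchannel contributes $\frac{1}{2} \log_{2} (1 + P_{m} / σ_{n}^{2})$ bits per channel use, which is what makes $2^{- 2 C}$ equal to the inverse of the product in (12). The function names and the numerical values are illustrative only.

```python
import math


def distortion_lower_bound(var_i, var_w, powers, var_n):
    """Delta_l(t, M) of (12) for innovation variance var_i."""
    psi = var_i * sum(1.0 / v for v in var_w)
    phi = math.prod(1.0 + p / var_n for p in powers)
    return var_i / (1.0 + psi) * (1.0 + psi / phi)


def bound_via_rate(var_i, var_w, powers, var_n):
    """Same bound obtained by inserting the capacity into (14)."""
    # Assumption: 1/2 * log2(1 + SNR) bits per use of a real Gaussian channel.
    capacity = sum(0.5 * math.log2(1.0 + p / var_n) for p in powers)
    psi = var_i * sum(1.0 / v for v in var_w)
    return var_i / (1.0 + psi) * (1.0 + 2.0 ** (-2.0 * capacity) * psi)


# Illustrative (assumed) values: two sensors with unequal observation noise.
var_i, var_w, powers, var_n = 2.0, [0.5, 1.0], [10.0, 10.0], 1.0
direct = distortion_lower_bound(var_i, var_w, powers, var_n)
via_rate = bound_via_rate(var_i, var_w, powers, var_n)
print(direct, via_rate)
```

Both evaluations should agree up to rounding, and neither can fall below the estimation-only term $σ_{i}^{2} (t) / (1 + Ψ)$.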

Remark 1.

If the sensor observations are perfect, that is, $w_{m} (t) = 0$ , and if all the sensors have the same transmit power constraint, that is, $P_{m} = P$ , then the distortion lower bound becomes

\begin{matrix} Δ_{l}^{*} (t, M) = \frac{σ_{i}^{2} (t)}{{(1 + P / σ_{n}^{2})}^{M}} . \end{matrix}

(15)

This is a known result which describes the optimal performance theoretically attainable (OPTA) for transmitting a memoryless Gaussian source over a memoryless Gaussian channel that has M times higher bandwidth [10]. When $M = 1$, linear encoding, that is, simple scaling of the source symbols to meet the transmit power constraint, is optimal. For $M > 1$, nonlinear coding schemes can easily outperform their linear counterparts. The optimal linear scheme, which is also referred to as BPAM (block pulse amplitude modulation), offers only a 3 dB gain in source signal-to-distortion ratio (SDR) with each doubling of M [11]. A nonlinear scheme such as that proposed in [12], which uses direct analogue mappings, offers a significant gain in SDR compared to BPAM, especially when the channel signal-to-noise ratio (CSNR) is high. However, it is still 6 dB away from OPTA. In addition, it was proven in [13] that nonlinear coding schemes will have the quadratic distortion bounded away from $Δ_{l}^{*} (t, M)$ as M increases, when the source dimensionality is large but finite.

On the other hand, if the wireless channel is accessed via TDMA, causal feedback from the decoder can be utilized together with linear schemes. We will show next that this approach can be extended to sensors with noisy observations.

Lemma 2.

In the case of TDMA, the distortion lower bound of (12) can be achieved using a linear scheme with noiseless causal feedback and full sensor collaboration.

Proof.

Consider again that the sensor nodes are connected by wires to allow full collaboration. An MMSE (minimum-mean-squared-error) estimate of $i (t)$ can be made using all available noisy observations. This estimate, denoted by $\tilde{i} (t)$, is

\begin{matrix} \tilde{i} (t) = \frac{E [i (t) \sum_{1}^{M} u_{m} (t)]}{E [{(\sum_{1}^{M} u_{m} (t))}^{2}]} \sum_{1}^{M} u_{m} (t) \end{matrix}

(16)

which has variance:

\begin{matrix} σ_{\tilde{i}}^{2} (t) = \frac{σ_{i}^{4} (t) \sum_{1}^{M} (1 / σ_{w_{m}}^{2} (t))}{1 + σ_{i}^{2} (t) \sum_{1}^{M} (1 / σ_{w_{m}}^{2} (t))} . \end{matrix}

(17)

The variance of the estimation error is

\begin{matrix} D_{e} (t, M) = \frac{σ_{i}^{2} (t)}{1 + σ_{i}^{2} (t) \sum_{1}^{M} (1 / σ_{w_{m}}^{2} (t))} . \end{matrix}

(18)

Note that this is also the term outside the parenthesis of (12).

Let one of the M sensors, call it sensor 1, transmit the estimation $\tilde{i} (t)$ with power constraint $P_{1}$ , using a linear scaling factor $F_{1} (t)$ . The decoder $G_{1} (t)$ makes MMSE estimation of $\tilde{i} (t)$ and feeds back the first reconstruction ${\hat{i}}_{1} (t)$ to the fusion node via a noiseless feedback channel. At the next instant, sensor 2 transmits only the innovation $\tilde{i} (t) - {\hat{i}}_{1} (t)$ . At the decoder, the newly received and decoded innovation is added to the previous reconstruction to produce the signal for feedback, so

\begin{matrix} {\hat{i}}_{2} (t) = {\hat{i}}_{1} (t) + G_{2} (t) (F_{2} (t) (\tilde{i} (t) - {\hat{i}}_{1} (t)) + n_{2} (t)) . \end{matrix}

(19)

Repeat this process with the corresponding $F_{m} (t)$ and $G_{m} (t)$:

\begin{gathered} F_{m} (t) = \sqrt{\frac{P_{m}}{D_{m - 1} (t)}}, \quad G_{m} (t) = \frac{F_{m} (t) D_{m - 1} (t)}{P_{m} + σ_{n}^{2}}, \\ D_{m} (t) = \frac{D_{m - 1} (t) σ_{n}^{2}}{P_{m} + σ_{n}^{2}}, \quad D_{0} (t) = σ_{\tilde{i}}^{2} (t) . \end{gathered}

(20)

After M channel uses, we have

\begin{matrix} D_{M} (t, M) = \frac{σ_{\tilde{i}}^{2} (t)}{\prod_{1}^{M} (1 + P_{m} / σ_{n}^{2})} . \end{matrix}

(21)

Since the distortion due to estimation and the distortion due to transmission are uncorrelated, we can simply sum $D_{e} (t, M)$ and $D_{M} (t, M)$ to get the end-to-end distortion, which is the same as $Δ_{l} (t, M)$, as desired.
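The recursion in (20) and the final sum can be traced numerically. In the sketch below, the error variance after each channel use is computed from the encoder and decoder gains $F_{m}$ and $G_{m}$ rather than from the closed form for $D_{m}$, so that agreement with (12) is an actual check. The function names and parameter values are assumptions made for the illustration.

```python
import math


def distortion_lower_bound(var_i, var_w, powers, var_n):
    """Delta_l(t, M) of (12)."""
    psi = var_i * sum(1.0 / v for v in var_w)
    phi = math.prod(1.0 + p / var_n for p in powers)
    return var_i / (1.0 + psi) * (1.0 + psi / phi)


def feedback_scheme_distortion(var_i, var_w, powers, var_n):
    """End-to-end distortion of the fusion-node scheme with causal feedback."""
    psi = var_i * sum(1.0 / v for v in var_w)
    d_est = var_i / (1.0 + psi)        # estimation error variance, (18)
    d = var_i * psi / (1.0 + psi)      # D_0: variance of the MMSE estimate, (17)
    for p in powers:
        f = math.sqrt(p / d)           # encoder gain F_m of (20)
        g = f * d / (p + var_n)        # decoder gain G_m of (20)
        # Residual error is (1 - g*f) * x - g * n for a transmitted x of variance d.
        d = (1.0 - g * f) ** 2 * d + g * g * var_n
    return d_est + d


# Illustrative (assumed) values: three sensors, unequal noise and power.
var_i, var_w, powers, var_n = 2.0, [0.5, 1.0, 0.25], [4.0, 8.0, 2.0], 1.0
achieved = feedback_scheme_distortion(var_i, var_w, powers, var_n)
bound = distortion_lower_bound(var_i, var_w, powers, var_n)
print(achieved, bound)
```

The two printed values should coincide up to rounding, since each channel use scales the transmission error by $σ_{n}^{2} / (P_{m} + σ_{n}^{2})$.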

Remark 2.

The basic principle behind our proof can be traced back to [14–16], and it is also known as the Schalkwijk-Kailath scheme. By transmitting only the innovation at each channel use, optimal distortion can be reached for memoryless Gaussian sources and channels with an integer channel-to-source bandwidth ratio. In other words, if $w_{m} (t) = 0$, the linear coding scheme described above will lead to $Δ_{l}^{*} (t, M)$ of (15). With each feedback, the contribution of the channel noise is reduced. It has, however, no effect on the observation noise. In fact, $D_{e} (t, M)$ is the absolute distortion lower bound of the M sensor network, since this is the same as having the fusion node wired directly to the decoder.

We are now ready to compute the steady state cost function J and determine the necessary conditions under which the cost is minimized. The result is the following theorem.

Theorem 1.

The necessary condition to stabilize the scalar LTI control system of (1) is that the transmit power $P_{m}$ of each sensor m satisfies

\begin{matrix} \prod_{1}^{M} (1 + \frac{P_{m}}{σ_{n}^{2}}) > γ^{2} . \end{matrix}

(22)

This condition is sufficient under TDMA with a linear sensing strategy, with full sensor collaboration and using noiseless causal feedback.

Proof.

We need to determine the steady state distortion $Δ_{t \to \infty} (t, M)$ and identify the conditions when it is nonnegative and bounded.

From (11), we see that the variance of innovation at time t is

\begin{matrix} σ_{i}^{2} (t) = γ^{2} Δ (t - 1, M) + σ_{d}^{2} . \end{matrix}

(23)

Substitute $σ_{i}^{2} (t)$ into (12) of Lemma 1. At the steady state, $Δ (t, M)$ and $Δ (t - 1, M)$ become $Δ_{t \to \infty} (t, M)$. Equation (12) can then be rewritten as the following second-order polynomial:

\begin{matrix} C_{1} Δ_{t \to \infty} {(t, M)}^{2} + C_{2} Δ_{t \to \infty} (t, M) + C_{3} = 0, \end{matrix}

(24)

where

\begin{gathered} C_{1} = γ^{2} \sum_{1}^{M} \frac{1}{σ_{w_{m}}^{2} (t)} (1 - \frac{γ^{2}}{Φ}), \\ C_{2} = 1 - γ^{2} + σ_{d}^{2} (1 - \frac{2 γ^{2}}{Φ}) \sum_{1}^{M} \frac{1}{σ_{w_{m}}^{2}}, \\ C_{3} = - σ_{d}^{2} - \frac{σ_{d}^{4} \sum_{1}^{M} (1 / σ_{w_{m}}^{2})}{Φ}, Φ = \prod_{1}^{M} (1 + \frac{P_{m}}{σ_{n}^{2}}) . \end{gathered}

(25)

To ensure that there exists a nonnegative root, we need $C_{1} > 0$ and thus $Φ > γ^{2}$, which is (22) as expected.

The sufficient condition follows directly from Lemma 2.
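To make the steady state concrete, the following sketch solves the quadratic (24) with the coefficients of (25) and compares the positive root against a direct fixed-point iteration of (23) and (12). It is only an illustration: the function names, the starting point of the iteration, and the parameter values are assumptions, and γ is taken to be nonzero so that $C_{1} \neq 0$.

```python
import math


def steady_state_distortion(gamma, var_d, var_w, powers, var_n):
    """Positive root of (24); math.inf when condition (22) is violated."""
    s = sum(1.0 / v for v in var_w)
    phi = math.prod(1.0 + p / var_n for p in powers)
    if phi <= gamma**2:
        return math.inf  # no nonnegative bounded steady state
    c1 = gamma**2 * s * (1.0 - gamma**2 / phi)
    c2 = 1.0 - gamma**2 + var_d * s * (1.0 - 2.0 * gamma**2 / phi)
    c3 = -var_d - var_d**2 * s / phi
    # c1 > 0 and c3 < 0, so exactly one root is positive (gamma != 0 assumed).
    return (-c2 + math.sqrt(c2 * c2 - 4.0 * c1 * c3)) / (2.0 * c1)


def iterate_bound(gamma, var_d, var_w, powers, var_n, steps=500):
    """Iterate (23) and (12) from zero initial distortion."""
    s = sum(1.0 / v for v in var_w)
    phi = math.prod(1.0 + p / var_n for p in powers)
    delta = 0.0
    for _ in range(steps):
        var_i = gamma**2 * delta + var_d                  # (23)
        psi = var_i * s
        delta = var_i / (1.0 + psi) * (1.0 + psi / phi)   # (12)
    return delta


# Illustrative (assumed) values for an unstable plant with two sensors.
gamma, var_d, var_w, var_n = 2.0, 1.0, [0.5, 1.0], 1.0
root = steady_state_distortion(gamma, var_d, var_w, [10.0, 10.0], var_n)
limit = iterate_bound(gamma, var_d, var_w, [10.0, 10.0], var_n)
weak = steady_state_distortion(gamma, var_d, var_w, [0.5, 0.5], var_n)
print(root, limit, weak)
```

In the last call the product $\prod (1 + P_{m} / σ_{n}^{2}) = 2.25$ is smaller than $γ^{2} = 4$, so the sketch reports an unbounded distortion, in line with (22).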

Remark 3.

This result in fact coincides with the necessary condition for stabilizing system (1), when encoders observe $s (t)$ directly without noise and transmit over a memoryless Gaussian channel. In other words, the observation noise at the sensor nodes does not contribute to the overall information exchanged within the control system. Therefore, ideally, no communication resources should be allocated to them.

Remark 4.

To achieve $Δ_{l} (t, M)$ using the linear scheme described in Lemma 2, it is required that (1) the signalling rate of the wireless channel is M times higher than the sampling rate of $s (t)$, and (2) full cooperation is possible between the sensor nodes. The first requirement is inherent to time division multiple access and is in general not a concern. The second requirement can be realized either by connecting the sensor nodes by wire or by allocating additional transmit power and bandwidth for inter-sensor communications. However, if wiring is infeasible or a long sensor network lifetime is desired, other means of coding and transmission have to be considered. Next, we present a simple alternative which is also linear.

Theorem 2.

Under TDMA, there exists a linear coding scheme which does not require collaboration between the sensor nodes and is able to reach $Δ_{l} (t, M)$ asymptotically, when either $M \to \infty$ or when $P_{m} / σ_{n}^{2} ≫ σ_{i}^{2} (t) / σ_{w_{m}}^{2} (t)$ .

Proof.

The basic scheme is similar to that given in the proof for Lemma 2, in the sense that noiseless causal feedback is required. The main difference is that here the feedback is made available to each sensor prior to its channel use, instead of the ideal fusion node. This means that we can effectively utilize the same feedback channel that made side information $\vec{a} (t)$ available at the sensor nodes.

The M sensors take turns transmitting the difference between their own observation and the previous reconstruction from the feedback channel, each subject to its transmit power constraint. At the decoder, the newly received and decoded innovation is summed together with the previous reconstructions to produce the current estimate of the source. The set of linear encoders/decoders is

\begin{align} {\tilde{F}}_{m} (t) & = \sqrt{\frac{P_{m}}{σ_{i}^{2} (t) + σ_{w_{m}}^{2} (t)}}, {\tilde{G}}_{m} (t) = \frac{{\tilde{F}}_{m} (t) {\tilde{D}}_{m - 1}}{P_{m} + σ_{n}^{2}}, \end{align}

(26)

\begin{align} {\tilde{D}}_{m} (t) & = \frac{{\tilde{D}}_{m - 1} (t) σ_{w_{m}}^{2} (t)}{{\tilde{D}}_{m - 1} (t) + σ_{w_{m}}^{2} (t)} (1 + \frac{{\tilde{D}}_{m - 1} (t)}{σ_{w_{m}}^{2} (t) (1 + P_{m} / σ_{n}^{2})}) \end{align}

(27)

and ${\tilde{D}}_{0} (t) = σ_{i}^{2} (t)$.

When $P_{m} / σ_{n}^{2} ≫ σ_{i}^{2} (t) / σ_{w_{m}}^{2} (t)$ , we can approximate ${\tilde{D}}_{m} (t)$ as

\begin{matrix} {\tilde{D}}_{m} (t) \approx \frac{{\tilde{D}}_{m - 1} (t) σ_{w_{m}}^{2} (t)}{{\tilde{D}}_{m - 1} + σ_{w_{m}}^{2} (t)} . \end{matrix}

(28)

After M transmissions, we have

\begin{matrix} {\tilde{D}}_{M} (t) = \frac{σ_{i}^{2} (t)}{1 + σ_{i}^{2} (t) \sum_{1}^{M} (1 / σ_{w_{m}}^{2} (t))}, \end{matrix}

(29)

which equals $Δ_{l} (t, M)$ under the same conditions.

When $M \to \infty$ , from (12) it is clear that the distortion lower bound approaches 0. Rewrite (27) as ${\tilde{D}}_{m} (t) = f ({\tilde{D}}_{m - 1} (t))$ . The fixed point p of the iterative function $f (\cdot)$ is when $p = f (p)$ . A direct solution of p shows that ${\tilde{D}}_{m} (t)$ goes to 0 as $M \to \infty$ . If we apply the convergence test by evaluating $| f^{'} (p) |$ , where $f^{'} (\cdot)$ is the first-order derivative of the function, we can see that ${\tilde{D}}_{m} (t)$ converges linearly to 0 as $M \to \infty$ , instead of the exponential convergence in $Δ_{l} (t, M)$ .
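The behaviour of the recursion (27) relative to the bound (12) can be examined with a short numerical sketch. The code below simply iterates (27) starting from ${\tilde{D}}_{0} (t) = σ_{i}^{2} (t)$ and reports the ratio to $Δ_{l} (t, M)$ at a low and a high CSNR. The function names and all numerical values are assumptions chosen for illustration.

```python
import math


def distortion_lower_bound(var_i, var_w, powers, var_n):
    """Delta_l(t, M) of (12)."""
    psi = var_i * sum(1.0 / v for v in var_w)
    phi = math.prod(1.0 + p / var_n for p in powers)
    return var_i / (1.0 + psi) * (1.0 + psi / phi)


def distributed_feedback_distortion(var_i, var_w, powers, var_n):
    """Distortion after M channel uses, obtained by iterating (27)."""
    d = var_i  # D_0 = variance of the innovation
    for vw, p in zip(var_w, powers):
        c = 1.0 + p / var_n
        d = d * vw / (d + vw) * (1.0 + d / (vw * c))
    return d


# Illustrative (assumed) network: five sensors with unequal observation noise.
var_i, var_w, var_n = 2.0, [0.5, 1.0, 0.25, 0.8, 0.6], 1.0
ratios = {}
for csnr_db in (0.0, 40.0):
    p = var_n * 10.0 ** (csnr_db / 10.0)
    powers = [p] * len(var_w)
    achieved = distributed_feedback_distortion(var_i, var_w, powers, var_n)
    bound = distortion_lower_bound(var_i, var_w, powers, var_n)
    ratios[csnr_db] = achieved / bound
    print(csnr_db, "dB:", achieved, bound, ratios[csnr_db])
```

The ratio should be noticeably above one at 0 dB and close to one at 40 dB, which is the high-CSNR regime in which (28) and (29) apply.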

4. Numerical Examples

We evaluate the different linear coding schemes from the previous section using numerical examples and compare them with the distortion bound $Δ_{l} (t, M)$. In addition, we will examine how observation noise and the size of the sensor network M affect the overall performance. For the sake of simplicity and easier graphical interpretation, we set $P_{m} = P$. Performance is assessed in terms of SDR (i.e., the inverse of the distortion normalized by the variance of $i (t)$) over a range of CSNR.

We first look at the simple case of two sensor nodes. To see the effect of observation noise $w_{1} (t), w_{2} (t)$ , we define the correlation coefficient between the sensor observations as

\begin{matrix} ρ (t) = \frac{E [i^{2} (t)]}{\sqrt{E [{(i (t) + w_{1} (t))}^{2}] E [{(i (t) + w_{2} (t))}^{2}]}}, \end{matrix}

(30)

with $0 < ρ \leq 1$. The best linear scheme without causal feedback has an encoder that scales the source to satisfy the transmit power constraint, while at the decoder an MMSE estimate of $i (t)$ is made using both received channel symbols. We have then the following end-to-end distortion:

\begin{matrix} δ (t, 2) = σ_{i}^{2} (t) - \frac{E {[i (t) y_{1} (t)]}^{2}}{E [y_{1}^{2} (t)]} - \frac{E {[i (t) y_{2} (t)]}^{2}}{E [y_{2}^{2} (t)]} . \end{matrix}

(31)

If the observation noises have the same variance $σ_{w}^{2} (t)$, then we can express $δ (t, 2)$ as

\begin{matrix} δ (t, 2) = σ_{i}^{2} (t) \frac{(1 - ρ) P + σ_{n}^{2}}{(1 + ρ) P + σ_{n}^{2}} . \end{matrix}

(32)

Note that when $ρ = 1$, that is, perfect observation, we have

\begin{matrix} δ^{*} (t, 2) = \frac{σ_{i}^{2} (t)}{2 P / σ_{n}^{2} + 1} . \end{matrix}

(33)

So the best linear scheme is the same as transmitting the source over one channel with double the power.
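The comparison between (32) and the bound (12) for two sensors can be sketched as follows. The code assumes equal observation noise variances, for which the correlation coefficient of the two observations is $ρ = σ_{i}^{2} / (σ_{i}^{2} + σ_{w}^{2})$, so that $σ_{w}^{2} = σ_{i}^{2} (1 - ρ) / ρ$. The function names and the chosen operating point are illustrative assumptions, not the settings used for the figures.

```python
import math


def no_feedback_distortion(var_i, rho, p, var_n):
    """delta(t, 2) of (32): best linear scheme without feedback."""
    return var_i * ((1.0 - rho) * p + var_n) / ((1.0 + rho) * p + var_n)


def lower_bound_two_sensors(var_i, rho, p, var_n):
    """Delta_l(t, 2) of (12) with equal noise variance and equal power."""
    phi = (1.0 + p / var_n) ** 2
    if rho == 1.0:
        return var_i / phi  # perfect observation, (15) with M = 2
    # Assumption: rho = var_i / (var_i + var_w) for equal noise variances.
    var_w = var_i * (1.0 - rho) / rho
    psi = 2.0 * var_i / var_w
    return var_i / (1.0 + psi) * (1.0 + psi / phi)


def sdr_db(var_i, distortion):
    """Signal-to-distortion ratio in dB."""
    return 10.0 * math.log10(var_i / distortion)


# Illustrative (assumed) operating point: CSNR of 10 dB.
var_i, p, var_n = 1.0, 10.0, 1.0
gaps = {}
for rho in (0.5, 0.7, 0.9, 0.99):
    linear = no_feedback_distortion(var_i, rho, p, var_n)
    bound = lower_bound_two_sensors(var_i, rho, p, var_n)
    gaps[rho] = sdr_db(var_i, bound) - sdr_db(var_i, linear)
    print(rho, sdr_db(var_i, linear), sdr_db(var_i, bound), gaps[rho])
```

The printed gap is the loss in SDR of the scheme without feedback relative to the bound at each value of ρ.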

We will see how “poorly” the best linear scheme without feedback performs relative to $Δ_{l} (t, M)$ for various ρ. The result is shown in Figure 3. Interestingly enough, the largest gap to OPTA (given by $Δ_{l} (t, 2)$ in (12)) occurs when the correlation is very high, or in other words, when the observation noise variance is very small relative to the variance of the innovation. For smaller ρ (below 0.7), that is, for noisier observations, one may well choose the simple linear scheme without much sacrifice in performance.

Figure 3

Performance with two sensors for various observation noise.

For a network with more than 2 sensors, we compare the performance of the linear schemes proposed in Theorem 2 using causal feedback, along with the best linear scheme without feedback, relative to OPTA for different M. Let $σ_{i}^{2} (t) = 2$ and the observation noise variance $σ_{w_{m}}^{2} (t)$ be uniformly distributed between 0 and 1. We present the SDR versus CSNR results in Figure 4. As expected, the linear scheme with feedback closes the gap to OPTA significantly compared to the one without feedback in very low CSNR. When CSNR is at 15 dB, the gap to OPTA is already negligible for $M = 5$ . This overlapping point moves further in the direction of lower CSNR as M increases.

Figure 4

Performance with M sensors and unequal observation noise.

5. Conclusions and Future Work

In this paper, we surveyed a few established coding and transmission techniques in communication and re-examined their functions within the framework of closed-loop control. We studied linear coding schemes for LQG control with M sensors for plant observation and an AWGN communication channel of constrained bandwidth and transmit power between the sensors and the controller. We showed that, in contrast to the counterexample given in [4], when the orthogonal channel access scheme is TDMA, causal feedback can be utilized so that optimal and asymptotically optimal linear coding schemes exist for the encoders at the sensor nodes. Since plant estimation and control can be separated, the associated control policy in this case is the certainty equivalent law. We also showed that when the encoders have full information, that is, both the past channel outputs and control signals, the necessary powers that minimize the steady state LQG cost correspond to those required when there is no observation noise at the sensor nodes.

Naturally, if the channel access scheme is, for example, FDMA, causal feedback will not be feasible, and the above-mentioned linear schemes reduce to that in (31). However, as we pointed out in the numerical example, the gains of nonlinear encoding schemes are only meaningful when observations are highly correlated. Another issue with the use of causal feedback is the quality of the feedback channel. If the feedback channel is bandwidth limited and power constrained, losses are expected, as noted in [17] for the case of a memoryless Gaussian source with no observation noise. In the remote sensing scenario considered here, the effect of noise from the feedback channel should be analyzed while taking the observation noise and the number of sensors into account. As can be seen from (12), it is the term outside the parentheses that dominates the end-to-end distortion.

We would also like to note that, if the wireless channel is a Gaussian multiple access channel (MAC), then a linear coding scheme is again optimal when the sensor network is symmetric, that is, when all sensor nodes have equal transmit power and equal observation noise variance [18]. The sensor nodes only need to scale their observations to satisfy the power constraint, while at the decoder side, an MMSE estimate of the source is formed from the sum of the received channel symbols.
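The uncoded scheme described above can be sketched numerically as follows. This is a minimal Monte Carlo illustration, assuming a scalar zero-mean Gaussian source, a symmetric network, and a real-valued Gaussian MAC with unit channel gains; the function and parameter names are hypothetical and the closed-form expression is the ordinary linear MMSE error, stated here only as a check.

```python
import numpy as np

def uncoded_mac_distortion(var_s, var_w, power, var_z, n_sensors,
                           n_trials=200_000, seed=0):
    """Uncoded (scaled) transmission over a symmetric Gaussian MAC.

    Returns (empirical MSE, closed-form linear MMSE error)."""
    rng = np.random.default_rng(seed)
    s = rng.normal(0.0, np.sqrt(var_s), size=n_trials)
    w = rng.normal(0.0, np.sqrt(var_w), size=(n_sensors, n_trials))
    u = s + w  # each row is one sensor's noisy observation
    # Scale so that every sensor meets its transmit power constraint.
    a = np.sqrt(power / (var_s + var_w))
    # The MAC adds the transmitted signals and Gaussian channel noise.
    y = a * u.sum(axis=0) + rng.normal(0.0, np.sqrt(var_z), size=n_trials)
    cov_sy = a * n_sensors * var_s
    var_y = a**2 * (n_sensors**2 * var_s + n_sensors * var_w) + var_z
    s_hat = cov_sy / var_y * y  # MMSE estimate (linear, since all terms are Gaussian)
    closed_form = var_s - cov_sy**2 / var_y
    return np.mean((s - s_hat) ** 2), closed_form

for M in (2, 4, 8):
    print(M, uncoded_mac_distortion(2.0, 0.5, 1.0, 1.0, M))
```

Because the source components add coherently on the channel while the observation noises do not, the distortion in this sketch decreases as the number of sensors grows.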

The information pattern we consider in this work is one that results in full information at the encoders. If, for example, the prior channel outputs are not available at the encoders, then, as shown in [5], a linear scheme that calculates the innovation without prior channel outputs cannot stabilize an unstable plant, owing to error propagation between the unsynchronized encoder and decoder. For an already stable plant, minimizing the LQG cost in this case requires coding schemes that operate across the correlated sensor observations and along the temporal correlation of the plant state. Keep in mind that, as with FDMA, causal feedback is not applicable here. The best possible linear coding scheme is that discussed in [19]. How to incorporate it into the control setting is part of our ongoing work.

Acknowledgment

The author is grateful to Professor Tor Arne Johansen of the Department of Engineering Cybernetics, NTNU, for valuable discussions.

References

1. Nair G. N., Fagnani, Zampieri, Evans R. J. Feedback control under data rate constraints: an overview. Proceedings of the IEEE, 2007, 95(1), 108-137. doi:10.1109/JPROC.2006.887294

2. Bansal, Başar. Simultaneous design of measurement and control strategies for stochastic systems with feedback. Automatica, 1989, 25(5), 679-694.

3. Bansal, Başar. Solutions to a class of linear-quadratic-Gaussian (LQG) stochastic team problems with nonclassical information. Systems and Control Letters, 1987, 9(2), 125-130.

4. Yüksel, Tatikonda. A counter example in distributed optimal sensing and control. IEEE Transactions on Automatic Control, 2009, 54(4), 841-844.

5. Tatikonda, Sahai, Mitter. Stochastic linear control over a communication channel. IEEE Transactions on Automatic Control, 2004, 49(9), 1549-1561. doi:10.1109/TAC.2004.834430

6. Bertsekas D. P. Dynamic Programming and Optimal Control. 3rd ed. Belmont, Mass, USA: Athena Scientific, 2007.

7. Gastpar, Vetterli. Source-Channel Communication in Sensor Networks. Lecture Notes in Computer Science. Berlin, Germany: Springer, 2003.

8. Behroozi, Alajaji, Linder. On the optimal power-distortion region for asymmetric Gaussian sensor networks with fading. Proceedings of IEEE International Symposium on Information Theory (ISIT '08), July 2008, 1538-1542. doi:10.1109/ISIT.2008.4595245

9. Dobrushin, Tsybakov B. S. Information transmission with additional noise. IRE Transactions on Information Theory, 1962, 8(5), S293-S304.

10. Berger. Rate Distortion Theory: Mathematical Basis for Data Compression. Prentice Hall Series in Information System Sciences. Upper Saddle River, NJ, USA: Prentice-Hall, 1971.

11. Lee K. H., Petersen D. P. Optimal linear coding for vector channels. IEEE Transactions on Communications, 1976, 24(12), 1283-1290.

12. Hekland, Floor P. A., Ramstad T. A. Shannon-Kotel'nikov mappings in joint source-channel coding. IEEE Transactions on Communications, 2009, 57(1), 94-105. doi:10.1109/TCOMM.2009.0901.070075

13. Ingber, Leibowitz, Zamir, Feder. Distortion lower bounds for finite dimensional joint source-channel coding. Proceedings of the IEEE International Symposium on Information Theory (ISIT '08), July 2008, 1183-1187. doi:10.1109/ISIT.2008.4595174

14. Cruise T. J. Achievement of rate-distortion bound over an additive white noise channel using a noiseless feedback link. Proceedings of the IEEE, 1967, 55, 583-584.

15. Schalkwijk J. P. M., Bluestein L. I. Transmission of analog waveforms through channels with feedback. IEEE Transactions on Information Theory, 1967, 13(4), 617-619.

16. Kailath. An application of Shannon's rate-distortion theory to analog communication over feedback channels. Proceedings of the IEEE, 1967, 55(6), 1102-1103.

17. Kim A. N., Ramstad T. A. On bandwidth expansion with noisy feedback. IEEE Communications Letters. In press.

18. Gastpar. Uncoded transmission is exactly optimal for a simple Gaussian “sensor” network. IEEE Transactions on Information Theory, 2008, 54(11), 5247-5251. doi:10.1109/TIT.2008.929967

19. Hjorungnes, Ramstad T. A. Linear solution of the combined source-channel coding problem using joint optimal analysis and synthesis filter banks. Proceedings of the 31st Asilomar Conference on Signals, Systems & Computers, November 1998, Pacific Grove, Calif, USA, 990-994.