Abstract
The paper addresses the problem of improving the accuracy of the measurements collected by a sensor network, where simplicity and cost-effectiveness are of utmost importance. To this aim, an adaptive Bayesian approach is proposed that improves the accuracy of the delivered estimates with no significant increase in computational complexity. Remarkably, the resulting cooperative algorithm does not require prior knowledge of the (hyper)parameters and is able to provide a “denoised” version of the monitored field without losing accuracy in detecting extreme (less frequent) values, which can be very important for a number of applications. A novel performance metric is also introduced to quantify the capability to both reduce the measurement error and retain highly informative characteristics at the same time. The performance assessment shows that the proposed approach is superior to a low-complexity competitor implementing a conventional filtering approach.
1. Introduction and Motivations
In recent years, sensor networks have been deployed for an increasing number of applications [1]. The availability of low-cost commercial off-the-shelf nodes, fostered by significant advances in wireless communication technologies and the size scaling of integrated circuits, has enabled the deployment of small, low-cost sensor nodes with increased lifetime [2]. Typical applications are the sensing/estimation of parameters [3, 4] such as temperature, pollution level [5], and electromagnetic exposure [6, 7], or field reconstruction [8, 9]. Such problems are particularly important in environmental monitoring [10], ecology [11], meteorology, agriculture, and related fields, as reported in a number of case studies [12, 13]; see also [14] and references therein. More generally, sensing capabilities are currently regarded as a key enabler for smart applications in contexts as diverse as transportation systems [15–17], cyber-physical systems [18–20], and ad hoc networks [21], and in (opportunistic) applications like position estimation for location awareness [22–25]. Finally, the interconnection of standalone systems can lead to advanced sensing capabilities, for example, in radar applications [26].
Both centralized and distributed approaches can be adopted to process the information collected through a sensor network [27]. In the centralized approach, data are sent to a fusion center (FC) performing the whole computation [28], while in the distributed one neighboring nodes cooperate in a peer-to-peer fashion until convergence [29, 30]. Regardless of implementation aspects, the goal can be formalized as an inference problem, based on sensor observations, about an underlying physical phenomenon. Clearly, observations are affected by errors introduced in the sensing/measurement process, for instance, due to thermal noise, atmospheric effects, and residual sensor calibration errors. This is especially true when data are collected through general-purpose devices, even smartphones [31], instead of dedicated (expensive) sophisticated sensors. Therefore, techniques aimed at improving the accuracy of sensor measurements are highly desirable [32, 33].
Unfortunately, a peculiarity of sensor networks is that each sensor has quite limited power supply and computation capabilities; since advanced processing techniques often cannot be implemented with reduced effort, novel low-complexity approaches are needed to make sensor network applications feasible in most real-world contexts. As a matter of fact, systems designed to be effectively deployable have to face a number of practical difficulties; thus, things are kept as simple and cheap as possible, which, however, may negatively affect the final accuracy [12, 34]. In particular, a coarse granularity may be reported on reconstructed sensing maps, with resulting sharp edges between contiguous levels (as, e.g., in [16]). In other cases, values are averaged [5] or, if their spatial distribution is of concern, smoothed via suitable low-pass filters [35]. A simple moving average is often used in practice, which can be easily implemented as a correlation with a weight mask through a sliding window approach, as sketched below.
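As an illustration, the following minimal Python sketch implements such a sliding-window moving average on a field stored as a 2-D array; the 3×3 uniform mask and the mirrored border handling are our own illustrative choices, not prescribed by the text.

```python
# Minimal sketch of the conventional moving-average smoother: the noisy
# field is correlated with a uniform weight mask via a sliding window.
import numpy as np
from scipy.signal import convolve2d

def moving_average(noisy_field, window=3):
    """Smooth a 2-D field with a uniform (window x window) mask."""
    mask = np.ones((window, window)) / window**2
    # 'symm' mirrors the field at the borders so that edge points are
    # still averaged over a full window.
    return convolve2d(noisy_field, mask, mode="same", boundary="symm")
```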
In this work, we propose a low-complexity Bayesian approach for improving the accuracy of the measurements, so that a more reliable value of the monitored field is obtained without the need for sophisticated processing. We will show that, all other things being equal, a few lines of code can be effective in improving the final accuracy if the information coming from other sensors is exploited cooperatively. The advantage of this approach is that the “filtering” procedure takes into account the statistical relationships in the data at hand. We consider a quite general observation model, where the measurement error is modeled through a Gaussian law. The latter is a versatile model for measurement errors and other random effects [36], supported by the central limit theorem (CLT). A Bayesian approach is used to estimate the value of the field by minimizing the mean squared error at each monitoring point. Unlike conventional Bayesian techniques, which require full prior knowledge of the data distribution, we follow an Empirical Bayes approach [37], where the parameters of the prior (hyperparameters) are unknown. We derive their Maximum Likelihood (ML) estimator and show that it has a simple closed form amenable to practical implementation in low-cost devices. An application to spatial field monitoring and reconstruction is reported to highlight the performance improvement compared to a conventional “denoising” technique based on low-pass two-dimensional filtering (moving average).
The rest of the paper is organized as follows. In Section 2, we formulate the problem within the reference scenario. Then, in Section 3, we introduce the proposed filtering approach, including the derivation of the Minimum Mean Square Error (MMSE) estimator of the field and the Maximum Likelihood (ML) estimator of the hyperparameters. Besides the mathematical derivation, we also provide a scheme (and the corresponding algorithm's pseudocode) for practical application. In Section 4, we evaluate the proposed approach, showing that it can improve the estimation accuracy without significantly increasing the complexity. To better highlight the ability to correctly represent specific characteristics of the monitored field, a novel metric is first introduced. Finally, Section 5 contains the conclusions of the work.
2. Problem Formulation
A general scenario is considered, where N sensor nodes observe a given phenomenon. We denote by

Figure 1: Reference scenario of a monitoring sensor network.
One can exploit the fact that sensors are deployed over a “continuous” field for monitoring purposes; hence, they measure a variable which represents a “sampling” of the underlying process as a whole. As a consequence, some correlation can be expected according to the spatial proximity (distance) between sensors. However, correlations are difficult to model exactly, since doing so requires deep knowledge of the process at hand. Moreover, they are very site-specific and change with time. Complicated models, if available, in turn require computationally intensive techniques; conversely, as mentioned, simple approaches are needed in low-cost sensor networks.
To this aim, we propose modeling the relationships between the sensed points of the field by means of a prior distribution on the measurements with unknown hyperparameters. In particular, given the Gaussian model for
3. Empirical Bayes-Based Measurement Filtering
3.1. Minimum Mean Square Error Field Estimation
The conditional distribution of
Given the result above, the MMSE is obtained in a simple closed form as
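The closed-form expression itself is not reproduced in this version of the text. As a hedged illustration, assume the standard Gaussian-Gaussian model y_i = x_i + n_i with measurement noise n_i ~ N(0, σ²) and prior x_i ~ N(μ, τ²); the posterior mean then shrinks each measurement toward the prior mean:

```python
import numpy as np

def mmse_estimate(y, mu, tau2, sigma2):
    """Posterior-mean (MMSE) estimate under the illustrative model
    y_i = x_i + n_i, n_i ~ N(0, sigma2), x_i ~ N(mu, tau2).
    The estimate is a convex combination of the measurement and the
    prior mean, weighted by the relative confidence in each."""
    w = tau2 / (tau2 + sigma2)          # weight given to the measurement
    return w * np.asarray(y) + (1.0 - w) * mu
```

The larger the noise variance relative to the prior variance, the more the estimate is pulled toward the mean, which is exactly the “denoising” behavior described in the text.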
3.2. Maximum Likelihood Hyperparameter Estimation
The joint probability distribution of the sensor measurements is obtained from (10) as
It is easy to verify that the expression between square brackets above simplifies to
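Again, the exact ML expressions are not reproduced here. Under the same illustrative model, the marginal distribution of each measurement is y_i ~ N(μ, τ² + σ²), and the ML hyperparameter estimates take the simple closed form sketched below (with the variance estimate clipped at zero):

```python
import numpy as np

def ml_hyperparameters(y, sigma2):
    """ML estimates of the prior hyperparameters (mu, tau2), assuming
    the marginal model y_i ~ N(mu, tau2 + sigma2) with known noise
    variance sigma2; tau2 cannot be negative, hence the clipping."""
    y = np.asarray(y, dtype=float)
    mu_hat = y.mean()
    tau2_hat = max(0.0, y.var() - sigma2)   # ML (biased) sample variance
    return mu_hat, tau2_hat
```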
3.3. Practical Application of the Algorithm
The approach developed above can be applied in a straightforward way to a monitoring network deployed in a given area. A schematic representation is depicted in Figure 2, which details the scheme of Figure 1. Some of the nodes are shown in exploded view to indicate the local processing, which can be summarized as a function f of the local measurement and of the “state”

Figure 2: Schematic representation of the proposed estimation approach.
It is reasonable to expect that, except for small-area networks, nodes may not be able to communicate in a fully meshed way; rather, they have limited connectivity dictated by their communication range. For each node i, we can define the set of neighbors
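A minimal sketch of how such neighbor sets can be computed, assuming nodes with known positions and a common communication range (both illustrative assumptions not specified in the text):

```python
import numpy as np

def neighbor_sets(positions, comm_range):
    """Return, for each node i, the indices of the other nodes within
    its communication range (simple Euclidean-distance rule)."""
    pos = np.asarray(positions, dtype=float)
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    return [np.flatnonzero((dist[i] <= comm_range) & (dist[i] > 0))
            for i in range(len(pos))]
```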
Summing up, despite the lengthy calculations in Sections 3.1 and 3.2 (necessary for a rigorous derivation), the result is very handy and can be easily implemented in a few lines of code for a generic cluster, as follows.
Pseudocode of the Proposed Algorithm
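The original pseudocode is not reproduced in this version of the text. The sketch below assembles the illustrative pieces above into the cluster-local procedure the section describes: each node estimates the hyperparameters from its own cluster (itself plus its neighbors) and then filters its measurement with the MMSE rule. It is a sketch under the stated Gaussian assumptions, not the paper's verbatim algorithm.

```python
import numpy as np

def empirical_bayes_filter(y, neighbors, sigma2):
    """Cluster-local Empirical Bayes filtering (sketch), reusing the
    ml_hyperparameters and mmse_estimate functions sketched above."""
    y = np.asarray(y, dtype=float)
    x_hat = np.empty_like(y)
    for i, nbrs in enumerate(neighbors):
        cluster = np.append(nbrs, i)        # node i plus its neighbors
        mu_hat, tau2_hat = ml_hyperparameters(y[cluster], sigma2)
        x_hat[i] = mmse_estimate(y[i], mu_hat, tau2_hat, sigma2)
    return x_hat
```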
4. Performance Assessment
In this section, we show how the proposed approach can be used to improve the accuracy of a sensor network without significantly increasing the computational cost. We resort to simulations in order to control the ground truth; that is, we simulate the true value of the field (of some physical quantity, namely, temperature) plus additive noise that models measurement errors. Performance is assessed as a function of the noise power, that is, the variance
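A hypothetical setup in the same spirit (a smooth temperature field with a single far-from-average “hot spot”, corrupted by additive Gaussian noise) could look as follows; the field shape, grid size, and noise level are our own illustrative choices, not the paper's actual simulation parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative ground truth: a smooth temperature field plus one
# localized "hot spot" (a far-from-average event).
n = 32
xx, yy = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
field = (20.0 + 2.0 * np.sin(2 * np.pi * xx)
         + 8.0 * np.exp(-((xx - 0.7)**2 + (yy - 0.3)**2) / 0.005))

sigma = 0.5                                  # noise std (swept in the paper)
measurements = field + rng.normal(0.0, sigma, field.shape)
```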
4.1. A Novel Metric for Accuracy Evaluation
In order to reduce the error introduced in the measurement process, a smoothing filter is typically used. However, the simple approaches actually used in real networks rely on techniques that ignore the statistical properties of the data; in particular, data are often processed through a sliding window where measurements are low-pass filtered to reduce the disturbance, similarly to a basic image-denoising algorithm. Although this approach provides reasonable results on average, it may negatively affect the values that deviate from the mean. The latter are, conversely, the most interesting data in a number of monitoring applications, for instance, for detecting extreme events. As a result, it is important to measure the ability of a smoothing approach to retain the low-probability characteristics of the monitored field.
On the other hand, an algorithm that focuses too much on extreme events tends to lose its ability to “denoise” areas where the field is almost stationary, that is, plateaux with very similar values that fluctuate only because of the intrinsic uncertainty introduced by the measurement process. Thus, to correctly evaluate the overall performance, the metric must take both of these conflicting objectives into account: the ability to retain low-probability values and the ability to simultaneously ensure satisfactory smoothing.
To this aim, in the following, we propose a novel compound metric based on a weighted version of the Frobenius norm. More precisely, denoting by
Based on the two metrics above, we propose the following compound metric:
The definition in (25) can be rewritten as the product of the distance between the real field and the reconstructed one in both the original and the transformed (weighted) spaces:
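The exact weighting in (25) is not reproduced here. A minimal numerical sketch consistent with the description above, assuming elementwise weights W that emphasize far-from-average values:

```python
import numpy as np

def compound_metric(true_field, est_field, weights):
    """Compound metric sketch: the product of the Frobenius-norm error
    in the original space and in the weighted space, where the weights
    emphasize low-probability (far-from-average) values."""
    err = np.asarray(true_field) - np.asarray(est_field)
    d_plain = np.linalg.norm(err)                # ordinary Frobenius norm
    d_weighted = np.linalg.norm(weights * err)   # weighted Frobenius norm
    return d_plain * d_weighted
```

A low value of the compound metric thus requires the estimate to be accurate both on average and at the highly informative points.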
In Figure 3(a), we show a clarifying example: the true values of the field

Figure 3: Example of monitoring application scenario: (a) ground truth (matrix
We have applied the proposed Empirical Bayes approach to a sensor network deployed on the field shown in Figure 3(a). Following the cluster-based approach described above, we have run the algorithm on the noisy version resulting from the measurement process as a function of the noise level
As mentioned, as a competitor we consider the conventional moving-average filter, a typical smoothing (“denoising”) technique used to improve the quality of sensed images [39]. Such an algorithm uses a spatial mask centered at each location
The result of the filtering for the case of Figure 3(c), which is related to

Figure 4: Result comparison for the proposed algorithm versus the moving average,

Figure 5: Values of γ for the different points of the field,
4.2. Numerical Results
It is worth noting that the proposed Empirical Bayes approach leads to a better value of D, but it is also superior on
As the noise level increases, the proposed algorithm still outperforms the competitor and, furthermore, remains able to retain the highly informative (low-probability) values while ensuring satisfactory smoothing of the noise. Results for

Figure 6: Result comparison for the proposed algorithm versus the moving average,

Figure 7: Values of γ for the different points of the field,
To investigate this point more thoroughly, we have evaluated how D varies with

Figure 8: Comparison between the raw measurements, the proposed Empirical Bayes algorithm, and the competitor algorithm (moving average): from (a) to (c), the metrics D,
Finally, the case of multiple “hot spots,” that is, multiple sources of far-from-average spikes, is analyzed in Figure 9. In addition to the already observed properties of the proposed approach, which remain valid, an additional feature can be observed here in terms of resilience to masking effects. Indeed, when multiple sources of far-from-average events are present, conventional filtering techniques like the moving average do not have enough resolving power even at moderate noise levels. This is due to the averaging of points in spatial proximity, so that sources that are not sufficiently separated become indistinguishable and collapse into a single blurred blob (as in the top right corner of the field in Figure 9). Conversely, the proposed approach is able to detect all sources and also to preserve the proportionality of their intensities while ensuring denoising.

Figure 9: Result comparison for the proposed algorithm versus the moving average in the case of multiple far-from-average events,
5. Conclusions
The problem of improving the accuracy of the measurements collected by a sensor network has been addressed. Aiming at the simplicity and cost-effectiveness that are of utmost importance in the application contexts where sensor networks are typically deployed, a low-complexity automatic approach has been proposed that requires neither manual parameter setting nor recalibration. By following an adaptive (Empirical Bayes) rationale, the algorithm is able to improve the estimation accuracy by leveraging cooperation between nodes. Remarkably, it can provide a “denoised” version of the monitored field without losing accuracy in detecting less probable values. Using a novel performance metric, the capability to both reduce the measurement error and retain highly informative characteristics has been quantified, revealing that the proposed approach can outperform conventional low-pass filtering.
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
