Image-Based Object Identification for Efficient Event-Driven Sensing in Wireless Multimedia Sensor Networks

Abstract

Applications based on video and images in wireless sensor networks are highly attractive because of the wealth of information they provide. In this context, object recognition and tracking using image and video information is one of the attractive approaches; it can be applied to event detection and localization, security, the tracking of rare animal species, road traffic control, and so forth. However, implementing such an approach in a wireless multimedia sensor network (WMSN) requires a specific image processing scheme and an efficient transmission protocol. Because of the limited energy of the batteries embedded in motes, power consumption is the major constraint on network lifetime and reliability in WMSNs. The efficiency and validity of these multimedia applications over wireless sensor networks therefore depend on the designer's ability to provide low-power schemes for data processing and energy-aware transmission protocols. This paper contributes to the design of a low-complexity scheme based on object identification for efficient sensing of multimedia information in wireless multimedia sensor networks. It proposes a new solution and explores the performance of this scheme. The results presented in this paper attest to its efficiency in achieving low-power object identification when implemented in wireless motes.

1. Introduction

Recently, research on wireless sensor networks (WSNs) has focused increasingly on enhancing the capabilities of the network so that it provides the end user with useful information gathered in a smart way, instead of simply sending every measurement to report a single event or item of interest. This idea is of particular interest for multimedia communication over wireless sensor networks. A wireless multimedia sensor network (WMSN) is built from wireless sensor nodes that integrate multimedia devices, such as image and audio sensors, able to retrieve video or audio data streams. These sensors give the application rich visual verification, in-depth awareness of the real scene, recognition, and many other useful capabilities.

Because multimedia applications produce relatively large data streams, power consumption is the major challenge when deploying a WSN to transmit multimedia data. Since the power consumed is proportional to the number of bits transmitted to represent the multimedia information, a first idea is to reduce this amount of data. However, many works that addressed image compression in WMSNs have observed that classical video and image compression methods are not suited to the weak hardware capabilities of wireless sensor nodes [1, 2].

A second alternative, which minimizes power consumption and reduces the volume of image data transmitted through the WMSN, is to first check whether the captured image contains a phenomenon of interest to the end user and then to send only the minimum number of bytes representing the detected object or physical phenomenon. This scheme not only reduces power consumption at the source mote but also significantly unloads the network of information that is useless to the end user. Consequently, it helps to extend the network lifetime. Its efficiency depends on the scheme used to extract the useful information from the video stream at the source mote and on its ability to detect the occurrence of a specific event. So, while this idea is an attractive way to balance the local processing resources of the mote against power saving, we think that more work is still required to design an efficient scheme that can be implemented in the multimedia sensor mote.

The main contribution of this paper is a new scheme for multimedia sensing based on event detection and useful information extraction, in order to achieve low-power notification of the end user. It focuses on the specification of a low-complexity scheme intended to detect and identify a specific target or object based on image sensing and processing. The paper also presents a performance analysis of this scheme and shows the energy gain compared to other methods for multimedia sensing, such as methods based on image compression and transmission.

In the remainder of the paper we first present the specified scheme for object detection and identification. We then present its performance in identifying the target and its invariance to different parameters. Finally, we address the performance of this scheme when processed on MICA2 motes.

2. Related Works and Motivation

WMSNs have been deployed for remote object detection. Several contributions detect a new object in the background of the video scene and then remotely notify the end user. The image of the detected object may then be transmitted through the network if the application requires it. Tu et al. in [3] presented a new scheme to detect change in the background of a scene using a WMSN. Their approach is based on the detection of a significant difference in the background. The evaluation of this scheme showed low implementation complexity and high accuracy in the detection of new objects. Nevertheless, this solution was not designed to identify a specific object, a capability that would enable applications in many areas.

Stojkoska et al. in [4] developed a solution for motion detection over WMSNs that can be used for surveillance. They specified a new scheme in which the captured images are divided into a set of small blocks. This subdivision makes it possible to detect, by comparison, a change with respect to the reference subblock in the background of the scene. This approach saves energy and is very attractive for low-bandwidth communication. It can be applied to object detection; however, it was not developed to identify a specific target object.

Li et al. in [5] studied a new scheme to extract the information that represents an object using a clustering approach. The idea was to define the contour of the object with a snake algorithm for a high level of accuracy. This scheme proved highly efficient with a low processing load, but it was not intended for embedded systems. Further work has to be carried out to check its adequacy for WMSNs.

Matching based on shape features is one of the most interesting schemes for identifying a target object. The shape context was presented by Belongie et al. in [6]. They developed a new matching solution based on the distance between shapes. The proposed scheme demonstrated good results in terms of simplicity and accuracy. An interesting survey of image feature extraction is presented by Yang et al. in [7]. The authors divided the approaches into six main classes (shape signature, polygonal approximation, space interrelation feature, moments, scale-space methods, and shape transform domains).

Ko and Berry in [8] were interested in low-power wireless image sensors. They developed an algorithm that minimizes power consumption by reducing the data acquired for the images. This algorithm is based on the idea of minimizing the communication between nodes during the processing of the captured image and sharing the resulting data.

Another related research work has been presented in [9]. It concerns object motion detection in WMSN. The presented idea is based on background extraction. The authors divided the image into $m * m$ blocks of pixels using the expectation and variance that will help in the extraction of the background. Two steps are then used to detect the movement of the object:

(i) background extraction and conversion of the color pixels into grayscale to make it easier to apply the extraction algorithm,

(ii) threshold segmentation, with the application of a new formula for defining the threshold, to detect the object.

This method seems to be very useful in the context of WMSN and might fit well with the requirement of our application for object identification.

In [10] Vasuhi et al. presented a new method for the detection and tracking of multiple objects in WMSNs. Their object detection uses the following steps:

(i) background techniques,

(ii) the Haar wavelet as the feature extraction method,

(iii) the joint boosting algorithm for classification, modified to require less computation.

While this method looks very interesting, the authors did not address the complexity and the power consumption of the scheme in the context of WMSNs to prove that it meets the constraints of these systems.

The authors in [11] presented a two-hop clustered image transmission scheme that aims to minimize power consumption in WMSNs. The approach is based on two types of sensor nodes: normal sensor nodes and camera-equipped nodes. The camera-equipped node acts as a cluster head and distributes the compression tasks among the nodes of the cluster, which unloads the camera node of these tasks. The camera node is then responsible for aggregating the coded data and transmitting it over multiple hops to the end user. The other nodes participate in the distributed compression scheme and in the transmission process. The idea of a distributed compression scheme looks very attractive: it balances the power consumption over the nodes and therefore extends the lifetime of the whole network. However, transmitting a stream of images still exhausts the network's energy and increases contention and congestion in the network. We believe that such a compression scheme could be applied to transmit just the image of the specific target required by the end user.

In [12] Pham and Aziz proposed a new architecture for object extraction and a reliable communication protocol over wireless sensor networks. The approach is based on background subtraction for new object detection. The object extraction scheme uses a threshold to determine the set of pixels that represent the object in the captured image. The whole scheme was implemented on a hardware platform using an FPGA circuit. The results show low power consumption and a short processing time, which looks very attractive for WMSN applications. However, we note that the implementation cost is higher than that of a software implementation; moreover, the flexibility and the scalability of the solution are often reduced.

In [13–16], the authors studied the problems of new object detection and tracking by the use of WMSN. Different methods were specified and designed to ensure this objective. But still the problem of specific target object identification was not studied. It requires more effort to define the efficient way to perform this application through wireless multimedia sensor networks.

3. General Approach of the Proposed Multimedia Communication in WSN

In our proposed solution, the network is composed of a set of multimedia sensors. Each multimedia sensor is designed with smart capabilities so that the end user can remotely configure it to identify a specific target. When it detects a new object, the sensor has to decide whether it is the tracked object, extract the useful information that represents this object, and notify the end user. The main challenge in the design of this approach is the specification of a low-complexity scheme that identifies targets at low power and low memory cost. Wireless multimedia sensors have limited memory space, so for optimal multimedia processing in the sensor the designer should keep memory occupancy as low as possible. In our proposed scheme, the end user generates the descriptor of the object to be identified in the surveillance area. The data representing this object are transmitted through the network to the dedicated wireless multimedia sensor using a specific protocol for dynamic configuration. The object descriptor used for identification has to be represented with very few bytes, so that the protocol ensuring this remote configuration adds little communication overhead to the network. This approach favors scalability and allows the target object to be changed dynamically at run time (Figure 1).

Figure 1

General scenario for object identification.

In this scheme, the wireless multimedia sensor receives through the network, from the remote end user, the descriptor of the object to be tracked. This descriptor is loaded into the sensor memory as a reference for target recognition. The algorithm running in the wireless multimedia sensor should be able to detect a new object that appears in its surveillance area and decide whether it is the target object. Then, according to the requirements of the application running at the user side, the sensor may notify the detection with a single byte, send the descriptor, or extract the useful information representing the scene and send it to the end user.

4. Specification of the Scheme for Object Detection and Identification

The well-known schemes developed for object identification based on image processing cannot be directly applied to WMSNs; they need specific tuning to be used in this context. Our contribution, at that level, is to specify a new scheme, inspired by algorithms originally developed for computer vision applications, that meets the constraints of WMSNs. The structure of the proposed scheme for object detection and identification is a sequence of steps (Figure 2). The main consideration in the specification of this scheme is to reduce the following:

(i) the memory usage, and mainly the size of the feature vector that describes the object,

(ii) the number of arithmetic operations, in order to achieve low processing complexity.

Figure 2

General structure of the object identification scheme.

4.1. New Object Detection

The detection of a new object is based on background subtraction. The approach divides the image into a set of blocks of $8 * 8$ pixels and then computes the pixel-level difference between corresponding blocks of the new image and of the background image. If the total difference is greater than a certain threshold ( $T_{ther}$ ), a new object is assumed to be detected.

Let $β_{j}$ denote the background value of block $j$ of the image. For the newly captured image $(n)$ this value is $β_{n} (j)$ , and for the previous image it is $β_{n - 1} (j)$ . With $k$ the number of blocks, the condition for a new object to appear in the new image is expressed by

\begin{matrix} \sum_{j = 1}^{k} |β_{n} (j) - β_{n - 1} (j)| > T_{ther} . \end{matrix}

(1)
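As an illustration of condition (1), the following minimal Python sketch computes a per-block value for two frames and compares the summed absolute difference to the threshold. It assumes, for illustration only, that $β (j)$ is the mean gray level of block $j$; the function names, the test frames, and the threshold value are hypothetical and are not taken from the implementation evaluated in this paper.

```python
def block_means(image, block=8):
    """Return the mean gray level of each block x block tile, in row-major order."""
    rows, cols = len(image), len(image[0])
    means = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            tile = [image[r][c]
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))]
            means.append(sum(tile) / len(tile))
    return means


def new_object_detected(current, previous, threshold, block=8):
    """Condition (1): sum over blocks of |beta_n(j) - beta_{n-1}(j)| > T_ther."""
    beta_n = block_means(current, block)
    beta_prev = block_means(previous, block)
    total = sum(abs(a - b) for a, b in zip(beta_n, beta_prev))
    return total > threshold


# Hypothetical 16 x 16 frames: a flat background and a frame in which
# a bright object fills the top-left 8 x 8 block.
background = [[10] * 16 for _ in range(16)]
frame = [row[:] for row in background]
for r in range(8):
    for c in range(8):
        frame[r][c] = 200

print(new_object_detected(frame, background, threshold=100))
print(new_object_detected(background, background, threshold=100))
```

Working on block values rather than on individual pixels keeps only one number per block in memory for the reference background, which is in line with the memory constraints discussed above.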

4.2. Object Extraction

When a new object is detected, the set of image blocks containing the object is isolated in order to reduce the amount of useful information to be processed. This step reduces the memory occupancy and the energy consumption.

Let us suppose that the whole size of the image is $S_{im} = (m * n)$ pixels and the size of one block that contains pixels of the detected object is $S_{B} = (m_{B} * n_{B})$ pixels. Then if the object is represented through N blocks, the gain per-pixel in terms of memory storage, energy, and processing time will be proportional to $(1 - α)$ where α is given by

\begin{matrix} α = \frac{N * S_{B}}{S_{im}} < 1 . \end{matrix}

(2)

So the larger the blocks, the more time and energy this step consumes. However, very small blocks make the detection of background change, as expressed in (1), less reliable.
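The ratio in (2) can be checked numerically. The short Python sketch below uses hypothetical values (a 64 × 64 image, 8 × 8 blocks, and an object covering 12 blocks); these numbers are chosen only to illustrate the arithmetic and are not measurements from this paper.

```python
def extraction_gain(image_shape, block_shape, n_blocks):
    """Return (alpha, 1 - alpha) as defined around equation (2)."""
    m, n = image_shape
    m_b, n_b = block_shape
    alpha = (n_blocks * m_b * n_b) / (m * n)
    return alpha, 1 - alpha


# Hypothetical case: the object occupies 12 of the 64 blocks of the image.
alpha, gain = extraction_gain((64, 64), (8, 8), 12)
print(alpha, gain)
```

With these assumed values only 768 of the 4096 pixels are kept for the following steps.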

When the object is isolated and the set of useful blocks of pixels is extracted, the new image is then transformed to the binary level that allows applying a region growth segmentation method to determine the shape.
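The binarization and region growth step can be sketched as follows. This is a minimal Python illustration under assumptions that the text does not fix: a single global threshold, 4-connectivity, and a seed pixel known to lie inside the object. All names and the sample image are hypothetical.

```python
from collections import deque


def binarize(image, threshold):
    """Map each gray level to 1 (object candidate) or 0 (background)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


def grow_region(binary, seed):
    """Collect the 4-connected set of 1-pixels reachable from the seed."""
    rows, cols = len(binary), len(binary[0])
    if binary[seed[0]][seed[1]] != 1:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and binary[nr][nc] == 1 and (nr, nc) not in region:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Hypothetical 5 x 5 block: a 2 x 2 bright object and one isolated bright pixel.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 9],
    [0, 0, 0, 0, 0],
]
region = grow_region(binarize(image, threshold=5), seed=(1, 1))
print(sorted(region))
```

The isolated pixel at (3, 4) is not connected to the seed, so it is left out of the region; the shape boundary can then be read from the pixels of the region that touch the background.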

4.3. Features Extraction

The feature extraction process is the main task on which the performance of the whole identification scheme depends. The main considerations in the design and specification of this task are as follows.

(i) Low-complexity method: the main reason for high power consumption is a high number of arithmetic operations. The more we reduce the number of arithmetic operations, the fewer clock cycles we expect on the mote and, consequently, the lower the power consumption.

(ii) Keeping the feature vector that represents the object descriptor as short as possible: because of the memory constraints of the motes and the low bandwidth available in the WMSN, it is preferable to keep the shortest feature vector that still allows the target to be identified efficiently.

In the literature review [6, 7] it was proved that the shape-based recognition methods are very attractive to achieve low complexity and high efficiency that match very well the previous indicated considerations. In our approach, we have focused on two shape-based feature extraction methods.

4.3.1. Centroid Distance Feature Extraction Method

The feature extraction is based on the centroid function. This method is one of the shape signature methods that extract the main features of a shape. If the shape is represented by the $N$ points $(x_{i}, y_{i})$ of its region, its centroid $(g_{x}, g_{y})$ is defined by [7]

\begin{matrix} g_{x} = \frac{1}{N} \sum_{i = 1}^{N} x_{i}, \\ g_{y} = \frac{1}{N} \sum_{i = 1}^{N} y_{i} . \end{matrix}

(3)

The centroid distance function is expressed by the distance of the boundary points from the centroid $G (g_{x}, g_{y})$ of the shape as follows:

\begin{matrix} r (t) = {({[x (t) - g_{x}]}^{2} + {[y (t) - g_{y}]}^{2})}^{1 / 2} . \end{matrix}

(4)

$r (t)$ is defined for the $N$ boundary points, $t \in [1, N]$ .

This method is invariant to translation and when normalized it is also invariant to the scale.
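Equations (3) and (4) can be illustrated with a short Python sketch. It makes two simplifying assumptions that are not part of the scheme as specified: the centroid is computed from the boundary points only, and the signature is normalized by its maximum value. The function names and the test shapes are hypothetical.

```python
import math


def centroid(points):
    """Equation (3): mean of the x and y coordinates of the points."""
    n = len(points)
    g_x = sum(x for x, _ in points) / n
    g_y = sum(y for _, y in points) / n
    return g_x, g_y


def centroid_distance_signature(boundary, normalize=True):
    """Equation (4): distance of each boundary point from the centroid."""
    g_x, g_y = centroid(boundary)
    r = [math.hypot(x - g_x, y - g_y) for x, y in boundary]
    if normalize:
        r_max = max(r)
        if r_max > 0:
            r = [value / r_max for value in r]
    return r


# Hypothetical boundary of a 4 x 2 rectangle, and the same shape
# scaled by 2 and translated by (10, -3).
shape = [(0, 0), (2, 0), (4, 0), (4, 2), (2, 2), (0, 2)]
moved = [(2 * x + 10, 2 * y - 3) for x, y in shape]

print(centroid_distance_signature(shape))
print(centroid_distance_signature(moved))
```

The two printed signatures agree up to floating-point rounding, which illustrates the translation invariance and, after normalization, the scale invariance stated above.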

The feature vector represents the distance to the centroid in relation to the angle θ calculated as follows:

\begin{matrix} θ (t) = \arctan \frac{y (t) - y (t - w)}{x (t) - x (t - w)}, \end{matrix}

(5)

where w is a small window used to compute the angle θ, so that the shape features are captured more accurately.

This method is highly adequate in terms of complexity and memory requirement to the wireless sensor constraints.

4.3.2. The Area Function Extraction Method

The area function is a shape signature method that is easy to process, with low complexity and low memory requirements. It calculates the area of the small triangle defined by three points: the centroid $G (g_{x}, g_{y})$ of the shape and two points on the shape boundary, $p_{1} (A_{x}, A_{y})$ and $p_{2} (B_{x}, B_{y})$ , as follows:

\begin{matrix} A (t) = |\frac{g_{x} (A_{y} - B_{y}) + A_{x} (B_{y} - g_{y}) + B_{x} (g_{y} - A_{y})}{2}| . \end{matrix}

(6)

The signature $A (t)$ ( $t \in [1, N]$ ) given by this method is invariant to translation. When the values of $A (t)$ are normalized, it also becomes invariant to scale. For rotation invariance, the starting point $p_{0} (x_{0}, y_{0})$ can be defined as the boundary point closest to the centroid $G (g_{x}, g_{y})$ ; the boundary is then traversed clockwise (or counterclockwise), taking the next point of the shape each time and calculating the area of the triangle so formed.

While this method looks simple and well suited to low processing resources, it is weak at identifying objects with complex shapes.
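The area signature can be sketched in Python using the standard determinant expression for the area of the triangle formed by the centroid and two consecutive boundary points. As a simplification for illustration, the centroid is taken over the boundary points and the boundary is assumed to be already ordered; the function names and the test shape are hypothetical.

```python
def triangle_area(g, a, b):
    """Area of the triangle with vertices g (centroid), a and b."""
    g_x, g_y = g
    a_x, a_y = a
    b_x, b_y = b
    return abs(g_x * (a_y - b_y) + a_x * (b_y - g_y) + b_x * (g_y - a_y)) / 2


def area_signature(boundary, normalize=True):
    """A(t) for consecutive boundary points, wrapping around at the end."""
    n = len(boundary)
    g = (sum(x for x, _ in boundary) / n, sum(y for _, y in boundary) / n)
    areas = [triangle_area(g, boundary[t], boundary[(t + 1) % n])
             for t in range(n)]
    if normalize:
        a_max = max(areas)
        if a_max > 0:
            areas = [value / a_max for value in areas]
    return areas


# Hypothetical shape: the four corners of a 2 x 2 square.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(area_signature(square, normalize=False))
```

For this square each triangle covers one quarter of the shape, so the four values are equal.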

We also studied the adequacy of a third method that is not based on shape features but on grayscale histogram instead.

4.3.3. Histogram Feature Extraction Method

The histogram of an image represents the distribution of the pixels of the image over the gray-level scale. This feature can be easily extracted and can be applied to object identification. The histogram method has been widely applied in image processing for region-based classification and recognition [17]. In this paper we apply the grayscale histogram signature to object identification as an attractive low-complexity scheme. Despite this attractive characteristic, the histogram signature is very sensitive to illumination and to noise [18]. It is, however, invariant to rotation and translation.

The idea is to isolate the newly detected object from the background and to determine the distribution of its pixel values. This feature can then be matched against the reference signature to decide whether the target object has been detected.
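A minimal Python sketch of the grayscale histogram signature is given below. It assumes 8 bpp pixels (256 gray levels) and that the pixels of the extracted object are already available as a flat list; the function name and the sample values are hypothetical.

```python
def gray_histogram(pixels, levels=256):
    """Count how many object pixels fall on each gray level."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


# Hypothetical object pixels after extraction from the background.
object_pixels = [0, 0, 3, 255, 3, 3]
signature = gray_histogram(object_pixels)
print(signature[0], signature[3], signature[255])
```

The loop needs one increment per pixel and no multiplication, which is consistent with the low complexity attributed to this method.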

4.4. Matching

We used the Spearman rank correlation, or Spearman's Rho [19, 20], to measure the similarity between the signatures of two objects. The signature of the newly detected object is compared to the reference signature, which is assumed to be stored in the memory of the mote. Spearman's Rho $(ρ)$ is defined for two distance vectors, $D_{1} = \{d_{1}, d_{2}, d_{3}, \dots, d_{n}\}$ and $D_{2} = \{d_{1}, d_{2}, d_{3}, \dots, d_{n}\}$ , and their two ranking vectors, $R_{1}$ and $R_{2}$ , by

\begin{matrix} ρ = 1 - \frac{6 \sum_{i = 1}^{n} {(R_{1} (i) - R_{2} (i))}^{2}}{n (n^{2} - 1)} . \end{matrix}

(7)

$R_{1} (i)$ and $R_{2} (i)$ are the ranks of the $i$th elements of $D_{1}$ and $D_{2}$ , and n is the number of distances in each vector. Spearman's ρ lies between −1 and 1: $ρ = 1$ for maximum similarity between the two rankings and $ρ = - 1$ for maximum dissimilarity. The full interpretation of the correlation coefficient values is given in [19, 20].
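Formula (7) can be sketched in Python as follows. The sketch assumes that the values within each vector are distinct, since the simple form of (7) does not correct for ties; here ties are simply broken by position. The function names and the sample vectors are hypothetical.

```python
def ranks(values):
    """Rank of each element (1 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(d1, d2):
    """Equation (7) applied to two signatures of the same length."""
    if len(d1) != len(d2):
        raise ValueError("signatures must have the same length")
    n = len(d1)
    if n < 2:
        raise ValueError("at least two values are required")
    r1, r2 = ranks(d1), ranks(d2)
    squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * squared / (n * (n * n - 1))


reference = [1.0, 2.0, 3.0, 4.0, 5.0]
same_order = [10.0, 20.0, 30.0, 40.0, 50.0]
reverse_order = [50.0, 40.0, 30.0, 20.0, 10.0]
print(spearman_rho(reference, same_order))
print(spearman_rho(reference, reverse_order))
```

Because only the ranks are compared, a signature that is rescaled without changing the ordering of its values gives the same result as the original, as the first comparison shows.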

Ayinde and Yang in [20] compared Spearman's correlation coefficient and Pearson's correlation coefficient under considerable intensity differences between images, occlusion, and other random differences between images. They demonstrated that Spearman's ρ reliably produced a higher discrimination power than Pearson's correlation coefficient. Along the same lines, Muselet and Trémeau in [19] demonstrated that the rank correlation of color components captured under different scene illuminations remains fairly unchanged and represents a robust metric for object recognition. We therefore consider this method accurate and well suited to our application.

4.5. End User Notification

When the object is detected, the event is notified to the end user. At that level the mote has to process the notification according to the end user's requirements. A large gain in time and power consumption can be achieved at this step: the notification can consist of just a few bits, of the object feature vector, or of an extract of the image of the object.

5. Performance Analysis of the Proposed Scheme

We studied the capability of the proposed scheme to identify a specific target under different conditions. The idea is to check the invariance of the proposed method, mainly to translation, rotation, and scale.

The proposed scheme, with the different feature extraction methods, was programmed in Matlab in order to check its performance. Figure 3 shows the target object and its shape signatures based on the centroid distance and on the area function. It also shows images of the bird (the target object) taken in different positions, with the associated shape signatures compared to the reference one.

Figure 3

Shape signatures of centroid distance and area function methods for object identification (size vector features = 380).

Grayscale images (64 $*$ 64 pixels and 128 $*$ 128 pixels 8 bpp) were used (as well as other sizes) to study the performances of the scheme successfully detecting and identifying the target.

In Figure 3, the obtained signatures are more correlated when applying the centroid distance features. The area signature presents a big difference even when the detected object is almost within the same position as the original one (case of signature with image test 1). The low performances of area signature to achieve efficient identification are well reflected in the matching ratio expressed in Table 1 (often less than 50%).

Table 1

Object identification using shape features with centroid distance and area schemes.

	Matching rate for object identification using the Spearman correlation approach
	Centroid distance-based scheme			Area-based scheme
Feature vector size	360	180	128	360	180	128
Test image 1	92%	92%	94%	55%	53%	56%
Test image 2	94%	94%	96%	20%	24%	23%
Test image 3	87%	92%	92%	35%	39%	35%
Test image 4	77%	79%	81%	24%	23%	26%
Test image 5	70%	64%	62%	30%	24%	31%

Table 1 summarizes the matching ratios obtained with the two shape feature extraction methods when Spearman's Rho is applied for matching. The results show that, in general, the scheme based on the centroid distance signature identified the target much better than the one based on the area signature. The matching percentage for these different positions is often greater than 60%, which corresponds to a high correlation with the reference. This reflects the capability of our scheme to detect and identify the target object efficiently.

We also note from Table 1 that changing the size of the feature vector does not dramatically change the capability of the algorithm to identify the target. The maximum variation in the identification percentage with the vector size is about 8%. In all cases, decreasing the vector size from 360 to 128 did not change the final identification decision. The proposed method based on centroid distance features gives a high recognition ratio with a compact representation of the feature. This characteristic is very attractive, since the smaller the vector, the weaker the memory constraints. In addition, for dynamic target change, the small size of the vector makes it possible to meet time and power constraints while updating the remote WMS node with the object to be detected.

We also checked out the capability of the proposed scheme to identify the target using grayscale histogram signature of the detected object. Figure 4 sums up the capability of the scheme to identify the target compared to the reference one. While the test images show the bird in different positions, the identification results were always relatively high (greater than 65%).

Figure 4

Histogram signature-based methods for object identification (vector features 256).

As discussed above, the method based on the histogram signature is very sensitive to illumination intensity and to noise. Furthermore, a change in the size of the feature vector might give a wrong result. These drawbacks limit the application of this method to specific environments where the illumination is not expected to change.

6. Implementation for TinyOS Based Platforms

We estimated the performance of the proposed scheme on wireless sensors (MICA2 motes). We evaluated the centroid distance-based scheme and the grayscale histogram-based scheme using grayscale images (128 $*$ 128 pixels, 8 bpp, and 64 $*$ 64 pixels, 8 bpp). The feature vector was given a size of 128 values. The area-based scheme was not studied because of its low identification performance.

The scheme for object detection and identification was implemented in a microcontroller emulator. We used WinAVR for the ATmega128L microcontroller (MICA2). This tool gives the number of clock cycles for the Atmel series. The power consumption and the processing time were then estimated from the characteristics of the MICA2 microcontroller.

Table 2 summarizes the power consumption and the processing time of the scheme for object detection and identification when implemented on MICA2 wireless motes. The results detail the number of clock cycles required by the different blocks of the scheme (Figure 2). We note that the main power- and time-consuming parts of this scheme are the new object detection and the segmentation for the extraction of the object from the background. The detection of a new object involves processing all the pixels of the image, while the segmentation for the object extraction is applied only to the set of blocks containing the new object. In our approach we applied a segmentation method by region growth, which gives good results, but we think that this method can be enhanced to further reduce time and power consumption.

Table 2

Evaluation of the proposed scheme for object identification based on centroid distance method implemented on MICA2.

Processing step (image 64 $*$ 64 pixels, 8 bpp)	Clock cycles	Time (s)	Energy (mJ)
New object detection	2505220	0.34	7.50
Extraction and segmentation	7926851	1.08	23.76
Features extraction	3540661	0.48	10.56
Whole scheme (without notification)	13972732	1.9	41.8
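The conversion from clock cycles to time and energy can be sketched as follows. The text does not state the clock frequency or the power figure that were used, so this Python sketch assumes a 7.3728 MHz clock for the ATmega128L on MICA2 and an active power of about 22 mW, the latter inferred from the energy-to-time ratios in Table 2 (for example 23.76 mJ / 1.08 s). Both constants and the function name are assumptions made for illustration.

```python
CLOCK_HZ = 7_372_800      # assumed MICA2 (ATmega128L) clock frequency
ACTIVE_POWER_MW = 22.0    # assumed active power, inferred from Table 2


def processing_cost(cycles, clock_hz=CLOCK_HZ, power_mw=ACTIVE_POWER_MW):
    """Return (time in s, energy in mJ) for a given number of clock cycles."""
    time_s = cycles / clock_hz
    energy_mj = time_s * power_mw  # mW x s = mJ
    return time_s, energy_mj


# Clock cycle counts taken from Table 2.
steps = {
    "New object detection": 2505220,
    "Extraction and segmentation": 7926851,
    "Features extraction": 3540661,
    "Whole scheme": 13972732,
}
for name, cycles in steps.items():
    time_s, energy_mj = processing_cost(cycles)
    print(f"{name}: {time_s:.2f} s, {energy_mj:.2f} mJ")
```

Under these assumptions the computed times match the Time column of Table 2 after rounding, and the energies agree with the Energy column to within rounding.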

Table 3 gives the same parameters for the scheme based on the grayscale histogram signature. As expected, the measured values increase with the size of the processed image. However, the consumed energy and the required processing time remain very low. For 64 $*$ 64 pixels, the identification scheme based on the grayscale histogram requires less time and less energy than the one based on centroid distance features. This is due to the additional processing needed to compute the centroid distance feature vector. However, we recall that the centroid distance feature vector is less sensitive to illumination and varies less with the environmental conditions, which makes it a stable and scalable method. In addition, the size of the feature vector used for identification can be varied in the centroid distance-based scheme. This is not the case for the grayscale histogram feature vector, since a change in the size of the signature vector strongly changes the identification result.

Table 3

Evaluation of the proposed scheme based on grayscale histogram for MICA2 wireless motes.

Whole scheme (detection + object extraction and segmentation + features extraction, without notification)
Image size (pixels, 8 bpp)	Clock cycles	Time (s)	Energy (mJ)
64 * 64	3672354	0.498	10.956
128 * 128	13436021	1.830	40.260

The main aim of our approach is to extend the network lifetime by reducing the multimedia data stream that has to be transmitted in the WMSN. The proposed scheme is processed locally in the multimedia sensor, but the end user still has to be notified through the WMSN. At the notification stage, the requirements of the end user application should be considered. The notification can be made as follows.

(i) Transmission of a few bits (typically 8 bits): this suits applications that only count the occurrences of the event.

(ii) Transmission of the feature vector: this can be used for tracking, recognition, or further classification at the end user.

(iii) Transmission of a limited data stream that represents the useful information: this suits image database storage applications or any further processing at the end user level.

Table 4 presents the results of these different approaches for notification.

Table 4

Notification to the end user.

Notification type	Data stream for notification (bits)	Processing time on MICA2 (s)	Energy on MICA2 (mJ)
Transmission of 1 byte	8	0.0002	0.014
Transmission of the feature vector
Centroid distance (128 bytes)	1024	0.027	1.870
Histogram (256 bytes)	2048	0.051	3.584
Extraction of useful information and transmission	2084	0.054	3.740
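The rows of Table 4 can be compared with a minimal sketch that derives, for each notification type, the effective data rate and average power implied by the reported figures. The dictionary keys and function names are illustrative assumptions; only the numbers come from Table 4.

```python
# Table 4 rows: notification type -> (payload in bits, reported time in s, reported energy in mJ)
NOTIFICATIONS = {
    "event_count": (8, 0.0002, 0.014),
    "centroid_distance_vector": (1024, 0.027, 1.870),
    "histogram_vector": (2048, 0.051, 3.584),
    "useful_information": (2084, 0.054, 3.740),
}


def implied_rate_kbps(bits, time_s):
    """Effective data rate implied by a (payload, time) pair, in kbit/s."""
    return bits / time_s / 1000.0


def implied_power_mw(energy_mj, time_s):
    """Average power implied by an (energy, time) pair; mJ / s = mW."""
    return energy_mj / time_s


def cheapest(options):
    """Name of the notification type with the lowest reported energy."""
    return min(options, key=lambda name: options[name][2])


if __name__ == "__main__":
    for name, (bits, time_s, energy_mj) in NOTIFICATIONS.items():
        print(name,
              f"{implied_rate_kbps(bits, time_s):.1f} kbit/s,",
              f"{implied_power_mw(energy_mj, time_s):.1f} mW")
    print("cheapest:", cheapest(NOTIFICATIONS))
```

The four rows imply an effective rate of roughly 38 to 40 kbit/s and an average power of roughly 70 mW, so the notification cost grows almost linearly with the payload size.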

It is clear from the results in Table 4 that, depending on the type of notification, both time and energy are saved compared with the transmission of the whole image, even when a compression scheme is applied [1, 2].

The presented approach for image-based event detection in the context of multimedia sensing compares favorably with similar methods (Table 5). Its processing time and power consumption are better suited to WMSN. In fact, it was shown in [1] that the energy cost of transmitting an image (128 × 128, 8 bpp) is about 860 mJ and that the processing time is around 13.5 s. As shown in the previous results, our scheme clearly outperforms this basic approach of simple image transmission. In addition, transmitting the full image significantly loads the network and consequently decreases its performance as well as its lifetime.
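The comparison with full image transmission can be made concrete with a small worked example. It assumes that the total cost of the scheme is the local processing cost of Table 3 (128 × 128 image, grayscale histogram scheme) plus one notification cost from Table 4. This additive model and the names below are assumptions made for illustration.

```python
# Reference figures quoted from [1] for sending a whole 128 x 128, 8 bpp image.
FULL_IMAGE_ENERGY_MJ = 860.0
FULL_IMAGE_TIME_S = 13.5

# Local processing cost for a 128 x 128 image (Table 3).
PROCESSING_ENERGY_MJ = 40.260
PROCESSING_TIME_S = 1.830

# Notification costs (Table 4): type -> (energy in mJ, time in s)
NOTIFICATION_COST = {
    "event_count": (0.014, 0.0002),
    "centroid_distance_vector": (1.870, 0.027),
    "histogram_vector": (3.584, 0.051),
    "useful_information": (3.740, 0.054),
}


def total_cost(notification):
    """ASSUMPTION: processing and notification costs simply add up."""
    energy_mj, time_s = NOTIFICATION_COST[notification]
    return PROCESSING_ENERGY_MJ + energy_mj, PROCESSING_TIME_S + time_s


def energy_saving_factor(notification):
    """How many times less energy than sending the whole image."""
    energy_mj, _ = total_cost(notification)
    return FULL_IMAGE_ENERGY_MJ / energy_mj


if __name__ == "__main__":
    for name in NOTIFICATION_COST:
        energy_mj, time_s = total_cost(name)
        print(f"{name}: {energy_mj:.3f} mJ, {time_s:.3f} s, "
              f"saving factor {energy_saving_factor(name):.1f}")
```

With these numbers, the most expensive case (processing plus the 2084-bit notification) totals about 44.0 mJ and 1.884 s, roughly 19.5 times less energy than the 860 mJ quoted for transmitting the whole image.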

Table 5

Comparison with other solutions.

Network type	Wireless multimedia sensor networks
Network type	Distributed compression scheme [11]	Quad tree decomposition for compression [22]	AIS-based solution [23]	Prioritization scheme of the block of the image [21]	Object extraction scheme [12]	Our solution
Processing approach
Local (L)/distributed (D)	D	L	D	L	L	L
Scheme-based approach
Image compression		*
Useful data extraction					*	*
Quad tree decomposition		*
Image block prioritization				*
AI method			*
Implementation approach
Software	*	*	*	*		*
Hardware					*
Performances
Scalability and dynamic target change	*	—	*	—	—	*
Comments about complexity and adequacy for low-power processing	High complexity, high energy	High energy	High algorithm complexity, high energy	High energy consumption and processing time	High implementation cost	Low power consumption

In [11] a distributed compression scheme was applied in a WMSN. The energy consumption for the processing and transmission of a 512 × 512 pixel image was very high (1.4 J). We think that extracting the useful information remains more attractive than transmitting the image, even with a compression scheme. Nevertheless, we believe that our solution fits well with the concept of distributed processing: applied to our scheme, it should balance the power consumption and extend the lifetime of the whole network.

When compared to the hardware solution for object detection and useful information extraction [12], we acknowledge that the power consumption achieved with a hardware platform is much better. However, the hardware implementation is less flexible and its cost is very high. We still think that the CMOS implementation of multimedia schemes for communication in WMSN remains an attractive trend.

The solution proposed in [21], based on a prioritization scheme for the blocks of the image, achieves an interesting low bit rate, even better than that of classical compression algorithms. Its estimated energy is higher than 60 mJ (from 60 to 130 mJ). By comparison, our solution has a lower energy cost (less than 50 mJ). In addition, it scales to more applications and provides different types of notification to the end user. These characteristics make it more suitable for WMSN deployment.

The authors in [22] presented a new scheme based on quad tree decomposition for image compression, together with experimental results in a WMSN. The proposed algorithm is characterized by low complexity and provides a high compression ratio. The results reported in that paper show that the energy consumed for transmission over one hop, with a compression factor $R = 0.1$, is about 120 mJ for a 128 × 128, 8 bpp image. It is around 45 mJ for a 64 × 64 pixel, 8 bpp image. The PSNR associated with this factor is good. Although the scheme looks very interesting for the transmission of the whole image, it is not designed for event-based notification, since it includes no specific event detection. The energy consumption of this method remains high compared to our scheme. Moreover, it loads the network, which increases the contention and congestion probabilities.

In [23] the authors proposed an image pattern recognition scheme for WMSNs based on artificial immune systems. The scheme relies on offline training of the image recognition, conducted at the base station, and on the extraction of the principal components of the image. It is mainly applied for classification. The results show that an interesting gain is obtained with this scheme. However, we think that the energy evaluation needs to be studied in more depth. The authors did not evaluate the local processing of the image at the source motes and considered only the transmission and reception parts of the communication. According to the work presented in this paper, this local processing part is not negligible and should definitely be considered.

Duran-Faundez et al. [24] presented a lightweight image compression algorithm based on tiny block-size image coding (TiBS). The algorithm operates on blocks of 2 × 2 pixels and applies a compression technique based on pixel removal (also referred to as pixel subsampling). It eliminates one pixel per block, which provides a compression ratio of 4:3. The algorithm is combined with a pixel interleaving mechanism to improve the robustness of image communication under packet loss conditions. It was tested on a WMSN and shown to be an interesting alternative to the JPEG compression technique. The performance analysis was carried out for 128 × 128, 8 bpp grayscale images. The results show that the energy consumption is greater than 750 mJ and the processing time is higher than 13 s. The PSNR of the received image was high, reflecting the good capability of the algorithm to ensure reliable transmission of the image. The scheme in [24] processes the data at the source mote, but it is not oriented toward specific event detection. Its energy and time costs are much higher than those obtained by our scheme. We therefore think that the method of [24] might be used in parallel with our proposed scheme to provide notification with the whole image when required.
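The pixel-removal step described for [24] can be illustrated with a minimal sketch. It is not the TiBS codec: the function name, the choice of dropping the bottom-right pixel of each block, and the output ordering are assumptions, and the interleaving mechanism is left out. The sketch only shows why removing one pixel per 2 × 2 block yields a 4:3 ratio.

```python
def remove_one_pixel_per_block(image):
    """Keep 3 of the 4 pixels of every 2 x 2 block.

    ASSUMPTION: the bottom-right pixel of each block is the one dropped;
    the source text does not say which pixel is removed.
    `image` is a list of rows, each row a list of 8-bit gray levels.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("image dimensions must be even")
    kept = []
    for r in range(0, height, 2):
        for c in range(0, width, 2):
            # top-left, top-right, bottom-left are kept
            kept.extend([image[r][c], image[r][c + 1], image[r + 1][c]])
    return kept


if __name__ == "__main__":
    # Synthetic 128 x 128 test pattern, not real image data.
    image = [[(r + c) % 256 for c in range(128)] for r in range(128)]
    kept = remove_one_pixel_per_block(image)
    print("pixels before:", 128 * 128, "pixels after:", len(kept))
```

For a 128 × 128 image this keeps 12288 of the 16384 pixels, which is the 4:3 ratio mentioned above.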

In conclusion, the proposed scheme performs better than the solutions reported in the literature for multimedia sensing in WSN. The detailed evaluation of the power consumption and processing time of this scheme has shown very attractive results. We think that using the centroid distance method to extract the shape features of the detected object offers an interesting tradeoff between complexity and performance for efficiently recognizing the target. In addition, this method is suitable in terms of efficiency and scalability for object identification.
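The difference between the two signatures compared in this section can be sketched as follows. This is an illustration under stated assumptions, not the implementation evaluated on the motes: the contour is assumed to be an ordered list of (x, y) boundary points, the normalization by the maximum distance is an assumption, and the function names are hypothetical. It shows that the centroid distance signature can be computed for any chosen length, whereas the histogram length is tied to the number of gray levels.

```python
import math


def centroid_distance_signature(contour, n_samples):
    """Distances from the contour centroid to n_samples boundary points.

    ASSUMPTIONS: `contour` is an ordered list of (x, y) points, and the
    distances are divided by their maximum to reduce sensitivity to scale.
    """
    if not contour or n_samples < 1:
        raise ValueError("need a non-empty contour and n_samples >= 1")
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    # Pick n_samples points spread evenly along the contour (integer indexing).
    sampled = [contour[(i * len(contour)) // n_samples] for i in range(n_samples)]
    dists = [math.hypot(x - cx, y - cy) for x, y in sampled]
    peak = max(dists) or 1.0
    return [d / peak for d in dists]


def grayscale_histogram(pixels, levels=256):
    """Count of pixels per gray level; the length is fixed by `levels` (256 for 8 bpp)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


if __name__ == "__main__":
    diamond = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # toy contour
    print(centroid_distance_signature(diamond, 4))
    print(len(centroid_distance_signature(diamond, 8)))
    print(len(grayscale_histogram([0, 0, 255])))
```

The `n_samples` parameter is where the feature vector length can be traded against processing and notification cost.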

7. Conclusion

This paper discussed a new approach for image-based object identification in wireless multimedia sensor networks (WMSN). The main idea was to relieve the source mote and the network of heavy data processing and transmission by detecting the event of interest before sending the information. The proposed scheme is flexible and scalable with respect to the object features. We studied two methods for target feature extraction: the first uses shape-based features and the second uses grayscale histogram features. We concluded that the centroid distance signature provides a good tradeoff between scalability and complexity for object identification.

The performance of the proposed scheme was analyzed for object identification. It showed high performance and good scalability while keeping the processing power low. The TinyOS implementation on the MICA2 platform performed well when compared to image transmission with a compression scheme.

As future work, we think that applying other shape-based schemes for object identification, such as the curvature function, could yield higher performance. We also think that a CMOS hardware implementation of the described schemes might achieve very low power consumption.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work is supported by the Research Center (RC1303102) of the College of Computer and Information Sciences, King Saud University.

References

1. Kaddachi M. L., Soudani, Lecuire, Torki, Makkaoui, Moureaux J.-M., "Low power hardware-based image compression solution for wireless camera sensor networks," Computer Standards and Interfaces, vol. 34, no. 1, pp. 14-23, 2012. doi:10.1016/j.csi.2011.04.001. Scopus: 2-s2.0-81855193958.

2. Chefi, Soudani, Sicard, "Hardware compression scheme based on low complexity arithmetic encoding for low power image transmission over WSNs," AEU - International Journal of Electronics and Communications, vol. 68, no. 3, pp. 193-200, 2014. doi:10.1016/j.aeue.2013.08.006. Scopus: 2-s2.0-84893643588.

3. S.-C., Chang G.-Y., Sheu J.-P., Hsieh K.-Y., "Scalable continuous object detection and tracking in sensor networks," Journal of Parallel and Distributed Computing, vol. 70, no. 3, pp. 212-224, 2010. doi:10.1016/j.jpdc.2009.12.001. Scopus: 2-s2.0-74449087248.

4. Stojkoska, Davcev, Trajkovik, "N-queens-based algorithm for moving object detection in distributed wireless sensor networks," CIT Journal of Computing and Information Technology, vol. 16, no. 4, pp. 325-332, 2008.

5. Yan, Xin, Yingying, "Segmentation of fiber image based on GVF snake model with clustering method," in Proceedings of the 4th International Conference on Intelligent Computation Technology and Automation (ICICTA '11), pp. 1182-1186, Guangdong, China, March 2011. doi:10.1109/icicta.2011.605. Scopus: 2-s2.0-79956007471.

6. Belongie, Malik, Puzicha, "Shape matching and object recognition using shape contexts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 4, pp. 509-522, 2002. doi:10.1109/34.993558. Scopus: 2-s2.0-0036538619.

7. Yang, Kpalma, Ronsin, "A survey of shape feature extraction techniques," in Pattern Recognition Techniques, Technology and Applications, chapter 3, pp. 43-90, InTech, 2008. doi:10.5772/6237.

8. T. H., Berry N. M., "On scaling distributed low-power wireless image sensors," in Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS'06), p. 235, January 2006. doi:10.1109/hicss.2006.364. Scopus: 2-s2.0-33749584672.

9. Peijiang, "Moving object detection based on background extraction," in Proceedings of the IEEE International Symposium on Computer Network and Multimedia Technology (CNMT '09), pp. 1-4, 2009.

10. Vasuhi, Fathima A. A., Shanmugam S. A., Vaidehi, "Object detection and tracking in secured area with wireless and multimedia sensor network," in Networked Digital Technologies, vol. 294 of Communications in Computer and Information Science, pp. 356-367, Springer, Berlin, Germany, 2012. doi:10.1007/978-3-642-30567-2_30.

11. Zuo, Luo, "A two-hop clustered image transmission scheme for maximizing network lifetime in wireless multimedia sensor networks," Computer Communications, vol. 35, no. 1, pp. 100-108, 2012. doi:10.1016/j.comcom.2011.07.009. Scopus: 2-s2.0-81255127287.

12. Pham D. M., Aziz S. M., "Object extraction scheme and protocol for energy efficient image communication over wireless sensor networks," Computer Networks, vol. 57, no. 15, pp. 2949-2960, 2013. doi:10.1016/j.comnet.2013.07.001. Scopus: 2-s2.0-84884210649.

13. Essaddi, Hamdi, Boudriga, "An image-based tracking algorithm for hybrid wireless sensor networks using epipolar geometry," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '09), pp. 346-349, July 2009. doi:10.1109/icme.2009.5202505. Scopus: 2-s2.0-70449581339.

14. Casares, Vuran M. C., Velipasalar, "Design of a wireless vision sensor for object tracking in wireless vision sensor networks," in Proceedings of the 2nd ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC '08), pp. 1-9, September 2008. doi:10.1109/icdsc.2008.4635736. Scopus: 2-s2.0-57349092450.

15. Gao, Zhang, "Object tracking and QoS control for wireless sensor networks," Chinese Journal of Electronics, vol. 18, no. 4, pp. 724-728, 2009. Scopus: 2-s2.0-70449336724.

16. Ikeura, Niijima, Takano, "Fast object tracking by lifting wavelet filters," in Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology (ISSPIT '03), pp. 207-210, 2003. doi:10.1109/isspit.2003.1341096.

17. Zhu, Wang, "Local multiple patterns based multiresolution gray-scale and rotation invariant texture classification," Information Sciences, vol. 187, no. 1, pp. 93-108, 2012. doi:10.1016/j.ins.2011.10.014. Scopus: 2-s2.0-84155164449.

18. Finlayson, Hordley, Schaefer, Tian G. Y., "Illuminant and device invariant colour using histogram equalisation," Pattern Recognition, vol. 38, no. 2, pp. 179-190, 2005. doi:10.1016/j.patcog.2004.04.010. Scopus: 2-s2.0-6444228145.

19. Muselet, Trémeau, "Rank correlation as illumination invariant descriptor for color object recognition," in Proceedings of the IEEE International Conference on Image Processing (ICIP '08), pp. 157-160, IEEE, San Diego, Calif, USA, October 2008. doi:10.1109/icip.2008.4711715. Scopus: 2-s2.0-69949127407.

20. Ayinde, Yang Y.-H., "Face recognition approach based on rank correlation of Gabor-filtered images," Pattern Recognition, vol. 35, no. 6, pp. 1275-1289, 2002. doi:10.1016/s0031-3203(01)00120-0. Scopus: 2-s2.0-0036604865.

21. Irgan, Ünsalan, Baydere, "Low-cost prioritization of image blocks in wireless sensor networks for border surveillance," Journal of Network and Computer Applications, vol. 38, no. 1, pp. 54-64, 2014. doi:10.1016/j.jnca.2013.06.005. Scopus: 2-s2.0-84897583932.

22. Nikolakopoulos, Stavrou, Tsitsipis, Kandris, Tzes, Theocharis, "A dual scheme for compression and restoration of sequentially transmitted images over Wireless Sensor Networks," Ad Hoc Networks, vol. 11, no. 1, pp. 410-426, 2013. doi:10.1016/j.adhoc.2012.07.003. Scopus: 2-s2.0-84870052828.

23. Wang, Peng, Wang, Sharif, Wegiel, Nguyen, Bowne, Backhaus, "Artificial immune system based image pattern recognition in energy efficient wireless multimedia sensor networks," in Proceedings of the IEEE Military Communications Conference (MILCOM '08), pp. 1-7, November 2008. doi:10.1109/milcom.2008.4753651. Scopus: 2-s2.0-62349110674.

24. Duran-Faundez, Lecuire, Lepage, "Tiny block-size coding for energy-efficient image compression and communication in wireless camera sensor networks," Signal Processing: Image Communication, vol. 26, no. 8-9, pp. 466-481, 2011. doi:10.1016/j.image.2011.07.005. Scopus: 2-s2.0-80052723771.