Abstract
Video- and image-based applications in wireless sensor networks are highly attractive because of the wealth of information they provide. In this context, object recognition and tracking using image and video information is one of the attractive approaches; it can be applied to event detection and localization, security processes, tracking of rare animal species, control of road traffic, and so forth. However, implementing such an approach in a wireless multimedia sensor network (WMSN) requires a specific image processing scheme and an efficient transmission protocol. In fact, because of the limited energy of the batteries embedded in motes, power consumption is the major constraint on network lifetime and reliability in WMSNs. The efficiency and validity of these multimedia applications over wireless sensor networks therefore depend on the designer's ability to provide low-power data processing schemes and energy-aware transmission protocols. This paper presents a contribution to the design of a low-complexity scheme based on object identification for efficient sensing of multimedia information in wireless multimedia sensor networks. It proposes a new solution and explores the performance of this scheme. The results presented in this paper attest to its high efficiency in achieving low-power object identification when implemented in wireless motes.
1. Introduction
Recently, research on wireless sensor networks (WSNs) has increasingly focused on enhancing the capabilities of the network so that it provides the end user with useful information gathered in a smart way, instead of simply sending all the measurements to report a single event occurrence or piece of information of interest. This idea is of great interest for multimedia communication over wireless sensor networks. In fact, a wireless multimedia sensor network (WMSN) is built from wireless sensor nodes that integrate multimedia devices, such as image and audio sensors, able to retrieve video or audio data streams. The use of these sensors provides the application with rich visual verification, in-depth awareness of the real scene, recognition, and many other interesting capabilities.
Because multimedia applications produce relatively large data streams, power consumption is the major challenge when deploying a WSN for the transmission of multimedia data. Since the power consumption is proportional to the number of bits to be transmitted to represent the multimedia information, a first approach is to reduce this amount of data. However, many research works that addressed image compression in WMSNs have noted that classical video and image compression methods are not suitable for the weak hardware capabilities that characterize wireless sensor nodes [1, 2].
As a second alternative, to minimize the power consumption and to reduce the amount of image data transmitted through the WMSN, it is advisable to first check whether the captured image contains a phenomenon of interest to the end user and then to send only the minimum number of bytes that represents the detected object or physical phenomenon. This scheme not only reduces the power consumption at the source mote but also significantly unloads the network of information that is useless to the end user. Consequently, this method helps to increase the network lifetime. Its efficiency depends on the scheme used to extract the useful information from the video stream at the source mote and on its ability to detect the occurrence of a specific event. So, while this idea is attractive for balancing the local processing resources of the mote against power saving, we think that more work is still required to design an efficient scheme that can be implemented in the multimedia sensor mote.
The main contribution of this paper is a new scheme for multimedia sensing based on event detection and useful information extraction, in order to achieve low-power notification to the end user. It focuses on the specification of a low-complexity scheme intended to detect and identify a specific target or object based on image sensing and processing. The paper also presents the performance analysis of this scheme and shows the energy gain compared to other methods for multimedia sensing, such as methods based on image compression and transmission.
In the remainder of the paper, we first present the specified scheme for object detection and identification. We then present its performance in identifying the target and its invariance to different parameters. We also address the performance of this scheme when run on MICA2.
2. Related Works and Motivation
WMSNs have been deployed for remote object detection. Several research contributions detect a new object against the background of the video scene and then remotely notify the end user. The image of the detected object may then be transmitted through the network as required by the application. Tu et al. in [3] presented a new scheme to detect change in the background of a scene using a WMSN. They based their approach on the detection of a significant difference in the background. The evaluation of this scheme showed low implementation complexity and high accuracy in the detection of new objects. Nevertheless, this solution was not designed to identify a specific object, which would enable potential applications in different areas.
Stojkoska et al. in [4] developed a solution for motion detection over WMSNs that can be used in the area of surveillance. They specified a new scheme in which the captured images are divided into a set of small blocks. This subdivision makes it possible to detect a change by comparing each block with the reference subblock in the background of the scene. This approach helps to save energy and is very attractive for low-bandwidth communication. It can be applied well in the context of object detection; however, it was not developed to identify a specific target object.
Li et al. in [5] studied a new scheme to extract the information that represents an object using a clustering approach. The idea was to define the contour of this object using a snake algorithm for a high level of accuracy. This scheme proved highly efficient with a low processing load, but it was not intended for embedded systems. Further work has to be carried out to check its adequacy for WMSNs.
Shape feature matching is one of the most interesting schemes for identifying a target object. The shape context was presented by Belongie et al. in [6]. They developed a new matching solution based on the distance between shapes. This scheme demonstrated good results regarding simplicity and accuracy. An interesting survey of image feature extraction is presented by Yang et al. in [7]. The authors divided the approaches into six main classes (shape signature, polygonal approximation, space interrelation feature, moments, scale-space methods, and shape transform domains).
Ko and Berry in [8] were interested in the issue of low-power wireless image sensors. They developed an algorithm that minimizes power consumption by reducing the data acquired for the images. This algorithm is based on the idea of minimizing the communication between nodes in the processing of the captured image and sharing the data result.
Another related research work was presented in [9]. It concerns object motion detection in WMSNs. The presented idea is based on background extraction. The authors divided the processing into two steps:
background extraction, with conversion of the color image into grayscale pixels to make it easier to apply the extraction algorithm,
threshold segmentation, with a new formula for defining the threshold used to detect the object.
This method seems to be very useful in the context of WMSN and might fit well with the requirement of our application for object identification.
In [10] Vasuhi et al. presented a new method for detecting and tracking multiple objects in WMSNs. For object detection they used the following steps:
background techniques,
Haar wavelets as the feature extraction method,
a joint boosting algorithm for classification, modified to require less computation.
While this method looks very interesting, the authors did not address the complexity and the power consumption of this scheme in the context of WMSNs to prove that it meets the constraints of these systems.
The authors in [11] presented a two-hop clustered image transmission scheme aiming to minimize the power consumption in WMSNs. The approach is based on two types of sensor nodes: normal sensor nodes and camera-equipped nodes. The camera-equipped node acts as a cluster head and distributes the compression tasks among the nodes of the cluster, which unloads the camera node of these tasks. The camera node is then responsible for aggregating the coded data and transmitting it over multiple hops to the end user. The other nodes participate in the distributed compression scheme and in the transmission process. The idea of a distributed compression scheme looks very attractive: it balances the power consumption over the nodes and therefore extends the lifetime of the whole network. However, transmitting a stream of images still exhausts the network's energy and increases contention and congestion in the network. We believe that such a compression scheme can be applied to transmit just the image of the specific target required by the end user.
In [12] Pham and Aziz proposed a new architecture for object extraction and reliable communication protocol over wireless sensor network. The approach is based on the concept of background subtraction for new object detection. The object extraction scheme is based on threshold to determine the set of pixels that represent the object from the captured image. The whole scheme has been implemented in a hardware platform using FPGA circuit. The results show low power consumption and short time processing. It looks very attractive for the WMSN application. However, we have to note that the implementation cost is higher than software implementation; moreover, the flexibility and the scalability of the solution are often reduced.
In [13–16], the authors studied the problems of new object detection and tracking by the use of WMSN. Different methods were specified and designed to ensure this objective. But still the problem of specific target object identification was not studied. It requires more effort to define the efficient way to perform this application through wireless multimedia sensor networks.
3. General Approach of the Proposed Multimedia Communication in WSN
In our proposed solution, the network is composed of a set of multimedia sensors. Each multimedia sensor is designed with smart capabilities so that it can be remotely configured by the end user to identify a specific target. When detecting a new object, the sensor has to decide whether it is the tracked object, extract the useful information that represents this object, and notify the end user. The main challenge in the design of this approach is the specification of a low-complexity scheme that identifies targets at low power and low memory cost. Wireless multimedia sensors have limited memory space, so for optimal multimedia processing in the sensor the designer should keep memory occupancy as low as possible. In our proposed scheme, the end user generates the descriptor of the object to be identified in the area of surveillance. The data representing this object is transmitted to the dedicated wireless multimedia sensor through the network using a specific protocol for dynamic configuration. The object descriptor used for identification has to be represented with very few bytes, so that the protocol ensuring this remote configuration adds little overhead to the network. This approach favors scalability when the target object is changed dynamically at run time (Figure 1).

Figure 1: General scenario for object identification.
In this scheme, the wireless multimedia sensor receives through the network, from the remote end user, the descriptor of the object to be tracked. This descriptor is loaded locally into the sensor memory as a reference for target recognition. The algorithm running in the wireless multimedia sensor should be able to detect a new object that appears in its surveillance area and decide whether it is the target object. Then, according to the requirements of the application running on the user side, the sensor may notify the object detection with a single byte, send the descriptor, or extract the useful information representing the scene and send it to the end user.
4. Specification of the Scheme for Object Detection and Identification
The well-known schemes developed for object identification based on image processing cannot be directly applied in the area of WMSNs. Specific tuning must be applied to these algorithms before they can be used in this context. Our contribution, at that level, is to specify a new scheme, inspired by algorithms originally developed for computer vision applications, that meets the constraints of WMSNs. The structure of the proposed scheme for object detection and identification is described by the sequential steps shown in Figure 2. The main consideration in the specification of this scheme is to reduce the following:
the memory usage and mainly the feature vector that describes the object,
the number of arithmetic operations in order to achieve low processing complexity.

Figure 2: General structure of the object identification scheme.
4.1. New Object Detection
The detection of a new object is based on the approach of background subtraction. The used approach divides the image on a set of blocks of
If we suppose that the background for the block j of the image is
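The block-wise comparison against a stored background can be illustrated with a minimal Python sketch. It is not the paper's detection rule: the per-block test (mean absolute difference above a fixed threshold), the 8-pixel block size, the threshold value, and the function name `changed_blocks` are all assumptions chosen for illustration.

```python
def changed_blocks(frame, background, block=8, threshold=12.0):
    """Return (block_row, block_col) indices whose mean absolute difference
    from the stored background exceeds `threshold` (assumed decision rule)."""
    height, width = len(frame), len(frame[0])
    changed = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            total = 0
            count = 0
            for y in range(top, min(top + block, height)):
                for x in range(left, min(left + block, width)):
                    total += abs(frame[y][x] - background[y][x])
                    count += 1
            if total / count > threshold:
                changed.append((top // block, left // block))
    return changed


# 16x16 flat background; a bright 4x4 patch appears in the lower-right block.
background = [[10] * 16 for _ in range(16)]
frame = [row[:] for row in background]
for y in range(8, 12):
    for x in range(8, 12):
        frame[y][x] = 200

print(changed_blocks(frame, background))
```

Only the blocks returned by such a test would need to be passed on to the extraction step, which is what keeps the later processing small.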
4.2. Object Extraction
When a new object is detected, the set of image blocks containing the object is isolated to reduce the amount of useful information to be processed. This step reduces both the memory occupancy and the energy consumption.
Let us suppose that the whole size of the image is
So the larger the block size, the greater the time and energy consumption related to this step. However, very small blocks make background change detection difficult, as expressed in (1).
When the object is isolated and the set of useful blocks of pixels is extracted, the new image is binarized so that a region-growing segmentation method can be applied to determine the shape.
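As an illustration of the binarization and region-growing step, here is a minimal Python sketch. The fixed threshold of 128, the assumption that the object is brighter than the background, the 4-connectivity rule, and the function names are hypothetical choices; the paper does not specify them.

```python
from collections import deque


def binarize(image, threshold=128):
    """Map grayscale pixels to 1 (object) or 0 (background); threshold assumed."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


def grow_region(binary, seed):
    """Collect all object pixels 4-connected to `seed` (breadth-first growth)."""
    height, width = len(binary), len(binary[0])
    if binary[seed[0]][seed[1]] == 0:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            inside = 0 <= ny < height and 0 <= nx < width
            if inside and binary[ny][nx] == 1 and (ny, nx) not in region:
                region.add((ny, nx))
                queue.append((ny, nx))
    return region


# 6x6 test image: a bright 3x3 object plus one isolated bright pixel.
image = [[0] * 6 for _ in range(6)]
for y in range(2, 5):
    for x in range(2, 5):
        image[y][x] = 220
image[0][0] = 220  # not connected to the object, so it is not absorbed

region = grow_region(binarize(image), seed=(3, 3))
print(len(region))
```

The resulting set of pixel coordinates is the shape from which a signature can then be computed.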
4.3. Features Extraction
The feature extraction process applied to the detected object is the main task on which the performance of the whole identification scheme depends. The main considerations in the design and specification of this task are as follows.
Low-complexity method: the main reason for high power consumption is a high number of arithmetic operations. The more we reduce the number of arithmetic operations, the fewer clock cycles run on the mote and, consequently, the lower the power consumption.
Keeping the feature vector that represents the object descriptor as short as possible: in fact due to the memory constraints of the motes and the low bandwidth available in the WMSN, it is preferable to keep the shortest vector for the feature that allows identifying the target efficiently.
In the literature review [6, 7] it was proved that the shape-based recognition methods are very attractive to achieve low complexity and high efficiency that match very well the previous indicated considerations. In our approach, we have focused on two shape-based feature extraction methods.
4.3.1. Centroid Distance Feature Extraction Method
The features extraction is based on the use of the centroid function. This method is one of the shape signature methods that allow extracting the main features from the shape. If a shape is represented by its region function in (3), define the centroid
The centroid distance function is expressed by the distance of the boundary points from the centroid
This method is invariant to translation and when normalized it is also invariant to the scale.
The feature vector represents the distance to the centroid in relation to the angle θ calculated as follows:
where w is a small window used to compute the angle θ more accurately and thus to improve the accuracy of the shape features.
This method is highly adequate in terms of complexity and memory requirement to the wireless sensor constraints.
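A centroid distance signature of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the boundary is taken as the region pixels with a 4-neighbour outside the region, distances are grouped into a fixed number of angular bins (keeping the largest distance per bin), and scale normalization divides by the largest distance. The window w used in the paper for the angle computation is not modeled, and the names `boundary`, `centroid_distance_signature`, and `square` are hypothetical.

```python
import math


def boundary(region):
    """Pixels of `region` with at least one 4-neighbour outside the region."""
    return [
        (y, x) for (y, x) in region
        if any(nb not in region
               for nb in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)))
    ]


def centroid_distance_signature(region, bins=36):
    """Normalized boundary-to-centroid distance, indexed by angular bin."""
    cy = sum(y for y, _ in region) / len(region)
    cx = sum(x for _, x in region) / len(region)
    signature = [0.0] * bins
    for y, x in boundary(region):
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        k = min(int(angle / (2 * math.pi) * bins), bins - 1)
        signature[k] = max(signature[k], math.hypot(y - cy, x - cx))
    peak = max(signature)
    # Dividing by the largest distance is one possible scale normalization.
    return [d / peak for d in signature] if peak > 0 else signature


def square(top, left, side):
    """Helper: set of (y, x) pixels of a filled square region."""
    return {(y, x) for y in range(top, top + side) for x in range(left, left + side)}


reference = centroid_distance_signature(square(0, 0, 7))
shifted = centroid_distance_signature(square(10, 20, 7))
print(reference == shifted)
```

Because every distance is measured relative to the centroid, moving the same shape elsewhere in the image leaves the signature unchanged, which is the translation invariance mentioned above. Angular bins that receive no boundary pixel stay at zero in this sketch.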
4.3.2. The Area Function Extraction Method
The area function is one of the shape signature methods that are easy to process, with low complexity and low memory requirements. It calculates the area of the infinitesimal triangle defined by three points: two on the shape boundary and the centroid
The given signature of this method
While this method looks simple and well suited to limited processing resources, it is weak at identifying objects with complex shapes.
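The area signature can be sketched as follows in Python. The sketch assumes an ordered, closed contour given as (x, y) points and, for simplicity, uses the mean of the contour points as the centroid, whereas the paper defines the centroid from the region; the function name `area_signature` is hypothetical.

```python
def area_signature(contour):
    """Areas of the triangles formed by consecutive contour points and the centroid.

    `contour` is an ordered, closed list of (x, y) boundary points.
    The centroid is approximated by the mean of the contour points (assumption).
    """
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    areas = []
    for i in range(n):
        x1, y1 = contour[i][0] - cx, contour[i][1] - cy
        x2, y2 = contour[(i + 1) % n][0] - cx, contour[(i + 1) % n][1] - cy
        # Half the absolute cross product is the triangle area.
        areas.append(0.5 * abs(x1 * y2 - x2 * y1))
    return areas


corners = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(area_signature(corners))
```

For this 2 x 2 square the four triangles have equal area and together cover the whole square, which gives a quick sanity check of the computation.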
We also studied the adequacy of a third method that is not based on shape features but on grayscale histogram instead.
4.3.3. Histogram Feature Extraction Method
The histogram of an image represents the distribution of the pixels in the image over the gray-level scale. This feature can be easily extracted and might be applied in the context of object identification. The histogram method has been widely applied in image processing for region-based classification and recognition [17]. In this paper we apply the grayscale histogram signature for object identification as an attractive low-complexity scheme. Despite this attractive characteristic, the histogram signature is very sensitive to illumination and to noise [18], although it is invariant to rotation and translation.
The idea is to isolate the new detected object from the background and to determine the distribution of pixels value. This feature can be applied to match with the reference signature to decide the detection of the target object.
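A grayscale histogram signature restricted to the object pixels can be sketched as below. The 256 gray levels match an 8-bit image; the helper names `object_pixels` and `gray_histogram` and the tiny test image are illustrative assumptions.

```python
def object_pixels(image, region):
    """Gray values of the pixels that belong to the isolated object."""
    return [image[y][x] for (y, x) in region]


def gray_histogram(pixels, levels=256):
    """Count how many pixels fall on each gray level."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    return hist


image = [[10, 10, 200],
         [10, 200, 200]]
region = {(y, x) for y in range(2) for x in range(3)}
rotated = [row[::-1] for row in image[::-1]]  # same object rotated by 180 degrees

same = (gray_histogram(object_pixels(image, region))
        == gray_histogram(object_pixels(rotated, region)))
print(same)
```

The histogram only counts gray values and ignores where they are located, which is why the rotated object produces the same signature, and also why a global change in illumination shifts the whole signature.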
4.4. Matching
We have used Spearman rank correlation or Spearman's Rho [19, 20] to match the similarity between the signatures of the two objects. We compared the signature of the new detected object to the reference one that is supposed to be saved on the memory of the mote. Spearman's Rho
Ayinde and Yang in [20] compared Spearman's correlation coefficient and Pearson's correlation coefficient under considerable intensity differences between images, occlusion, and other random differences between images. They demonstrated that Spearman's ρ reliably produced a higher discrimination power than Pearson's correlation coefficient. Similarly, Muselet and Trémeau in [19] demonstrated that the rank correlation of color components captured under different scene illuminations remains fairly unchanged and represents a robust metric for object recognition. We therefore consider this method accurate and well suited to our application.
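As an illustration of rank-based matching, the following Python sketch computes Spearman's rho as the Pearson correlation of the ranks, with average ranks for ties. The decision rule `is_match` and its 0.6 threshold are assumptions made for this example and are not taken from the paper.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(a, b):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("signatures must have the same length (at least 2)")
    ra, rb = average_ranks(a), average_ranks(b)
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant signature carries no rank information
    return cov / math.sqrt(var_a * var_b)


def is_match(signature, reference, threshold=0.6):
    """Hypothetical decision rule: accept when rho reaches the threshold."""
    return spearman_rho(signature, reference) >= threshold


print(spearman_rho([1, 2, 3, 4], [1, 8, 27, 64]))
```

Because only the ordering of the values matters, a monotonic distortion of the signature (for example a uniform gain) leaves rho unchanged.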
4.5. End User Notification
When the object is detected, the event is notified to the end user. At that level, the mote has to process the notification according to the end user requirements. A large gain in time and power consumption can be achieved at this step: the notification can consist of just a few bits, the object feature vector, or an extracted image of the object.
5. Performance Analysis of the Proposed Scheme
We studied the capability of the proposed scheme to ensure object identification for a specific target under different conditions. The idea is to check out the invariance of the proposed method mainly for translation, for rotation, and for the scale.
The proposed scheme, with the different feature extraction methods, was programmed in Matlab in order to check its performance. Figure 3 shows the target object and its shape features based on the centroid distance and area signatures. It also shows images of the bird (the target object) taken in different positions and the associated shape signatures compared to the reference one.

Figure 3: Shape signatures of the centroid distance and area function methods for object identification (feature vector size = 380).
Grayscale images (64
In Figure 3, the obtained signatures are more correlated when applying the centroid distance features. The area signature presents a big difference even when the detected object is almost within the same position as the original one (case of signature with image test 1). The low performances of area signature to achieve efficient identification are well reflected in the matching ratio expressed in Table 1 (often less than 50%).
Table 1: Object identification using shape features with centroid distance and area schemes.
Table 1 summarizes the matching ratio obtained with the two shape feature extraction methods when Spearman's Rho is applied for matching. The results show that, in general, the scheme based on the centroid distance signature identified the target much better than the one based on the area signature. The matching percentage for these different positions is often greater than 60%, which corresponds to a high correlation with the reference. This reflects the capability of our scheme to detect and identify the target object efficiently.
We also note from Table 1 that changing the size of the feature vector does not dramatically change the capability of the algorithm to identify the target. In fact, we noted a maximum variation of about 8% in the identification percentage when the vector size changes. In all cases, when the vector size is decreased from 360 to 128, we did not notice a change in the final identification decision. The proposed method based on centroid distance features gives a high recognition ratio with a compact data representation of the feature. This characteristic is very attractive, since the smaller the vector, the weaker the memory constraints. In addition, for dynamic target change, the small size of the vector helps to meet time and power constraints while updating the remote WMS node with the object to be detected.
We also checked out the capability of the proposed scheme to identify the target using grayscale histogram signature of the detected object. Figure 4 sums up the capability of the scheme to identify the target compared to the reference one. While the test images show the bird in different positions, the identification results were always relatively high (greater than 65%).

Figure 4: Histogram signature-based method for object identification (feature vector size 256).
As discussed above, the method based on the histogram signature is very sensitive to illumination intensity and to noise. Furthermore, a change in the size of the feature vector might give a wrong result. These drawbacks limit the application of this method to specific environments where the illumination is not expected to change.
6. Implementation for TinyOS Based Platforms
We estimated the performance of the proposed scheme for wireless sensors (MICA2 sensors). We were interested to evaluate the performances of centroid distance-based scheme and the grayscale histogram-based scheme using images at the grayscale level (128
The scheme for object detection and identification was evaluated using a microcontroller emulator. We used WinAVR targeting the ATmega128L microcontroller of the MICA2 mote. This tool reports the number of clock cycles for Atmel devices. Then, using the characteristics of the MICA2 microcontroller, the power consumption and the processing time were estimated.
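The conversion from a cycle count to processing time and energy follows t = cycles / f and E = V × I × t. The Python sketch below shows this arithmetic only; the default values (7.3728 MHz clock, 3 V supply, 8 mA active current) are assumed typical figures for a MICA2-class mote, not values reported in this paper, and `processing_cost` is a hypothetical helper.

```python
def processing_cost(cycles, clock_hz=7_372_800, supply_v=3.0, active_ma=8.0):
    """Estimate processing time (s) and energy (mJ) from a clock-cycle count.

    All default electrical parameters are assumptions for illustration.
    """
    seconds = cycles / clock_hz
    millijoules = supply_v * active_ma * seconds  # V * mA * s = mJ
    return seconds, millijoules


# One second of computation at the assumed clock rate.
seconds, millijoules = processing_cost(7_372_800)
print(seconds, millijoules)
```

With these assumed values the microcontroller draws 24 mW while active, so any reduction in clock cycles translates linearly into saved time and energy.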
Table 2 summarizes the power consumption and processing time of the scheme for object detection and identification when implemented on MICA2 wireless motes. The results detail the required number of clock cycles for the different blocks of the scheme (Figure 2). We note that the most power- and time-consuming parts of this scheme are the new object detection and the segmentation used to extract the object from the background. Specifically, the detection of a new object involves processing all the pixels of the image, while the segmentation for object extraction is applied only to the set of blocks containing the new object. In our approach we applied a region-growing segmentation method that gives good results, but we think that this method can be enhanced to further reduce time and power consumption.
Table 2: Evaluation of the proposed scheme for object identification based on centroid distance method implemented on MICA2.
Table 3 illustrates the results of the same parameters for the scheme based on the use of the grayscale histogram signature for the object identification. As expected, we can note that the measured values get increased with the size of the processed image. However, the consumed energy and the required processing time still remain very low. For 64
Table 3: Evaluation of the proposed scheme based on grayscale histogram for MICA2 wireless motes.
The main goal of our approach is to extend network lifetime by reducing the multimedia data stream that has to be transmitted in the WMSN. The proposed scheme is processed locally in the multimedia sensor, but the end user still has to be notified through the WMSN. At that level, the requirements of the end user application should be considered. The notification can be provided as follows.
Transmission of a few bits (typically 8 bits): this suits applications that only count the number of occurrences of the event.
Vector of features: this might be used for tracking, recognition, or further classification at the end user.
Transmission of limited data stream that represents the useful information: this is for database image storing application or any further processing at the end user level.
Table 4 presents the results of these different approaches for notification.
Table 4: Notification to the end user.
It is clear from the results shown in Table 4 that, depending on the type of notification, we can save time and energy compared to transmitting the whole image, even when a compression scheme is applied [1, 2].
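To give a feel for the difference in payload between the notification types, here is a small Python sketch. The sizes are illustrative assumptions only: one byte per feature, a 128-entry feature vector, 8 × 8 pixel blocks at 8 bits per pixel, and an object covering 12 blocks; the function `payload_bytes` is hypothetical and the numbers are not results from the paper.

```python
def payload_bytes(kind, vector_len=128, bytes_per_feature=1,
                  object_blocks=0, block=8, bits_per_pixel=8):
    """Payload size in bytes for each notification type (assumed encodings)."""
    if kind == "flag":       # a few bits, rounded up to one byte
        return 1
    if kind == "features":   # the object feature vector
        return vector_len * bytes_per_feature
    if kind == "object":     # only the image blocks that contain the object
        return object_blocks * block * block * bits_per_pixel // 8
    raise ValueError(f"unknown notification type: {kind}")


full_image = 128 * 128 * 8 // 8  # uncompressed 128 x 128 image at 8 bpp
for kind, blocks in (("flag", 0), ("features", 0), ("object", 12)):
    size = payload_bytes(kind, object_blocks=blocks)
    print(kind, size, f"{100 * size / full_image:.2f}% of full image")
```

Under these assumptions, even the richest notification (the extracted object blocks) is a small fraction of the uncompressed frame.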
The presented approach for image-based event detection used in the context of multimedia sensing has very attractive features when compared to similar methods (Table 5). The time and power consumption of the studied scheme make it more suitable for use in WMSNs. In fact, it was shown in [1] that the energy cost of transmitting an image (128 × 128, 8 bpp) is about 860 mJ and the processing time is around 13.5 s. As shown in the previous results, our scheme clearly outperforms the basic approach of simple image transmission. In addition, transmitting the full image significantly loads the network and consequently decreases its performance as well as its lifetime.
Table 5: Comparison with other solutions.
In [11] a distributed compression scheme was applied in WMSN. The energy consumption for the processing and the transmission of an image 512
When compared to the hardware solution for object detection and useful information extraction [12], we think that the power consumption achieved with hardware platform is much better. However, the hardware implementation is less flexible and the cost is very high. We think that the CMOS implementation of multimedia schemes for communication in WMSN remains an attractive trend.
The solution proposed in [21], based on a prioritization scheme for the blocks of the image, achieves an interesting low bit rate, even better than classical compression algorithms. Its estimated energy is higher than 60 mJ (from 60 to 130 mJ). By comparison, our proposed scheme provides a lower energy cost (less than 50 mJ). In addition, our proposed solution is scalable to more applications and provides different types of notifications to the end user. These characteristics make it more suitable for WMSN deployment.
The authors in [22] presented a new scheme based on quad tree decomposition for image compression, with experimental results in a WMSN. The proposed algorithm is characterized by low complexity and provides a high compression ratio. The results reported in this paper showed that the consumed energy, for the transmission over one hop, with a compression factor
In [23] the authors proposed an artificial immune systems-based image pattern recognition scheme for WMSNs. The designed scheme is used for offline training of image recognition conducted at the base station. It uses the extraction of the principal components of the image and is mainly applied for classification. The results show that an interesting gain is obtained with this scheme. However, we think that the energy evaluation needs to be studied in more depth. In fact, the authors did not evaluate the local image processing at the source motes; they considered only the transmission and reception parts of the communication. According to the work presented in this paper, this local processing part is not negligible and should definitely be considered.
Duran-Faundez et al. in [24] presented a lightweight image compression algorithm based on tiny block-size image coding (TiBS). The proposed algorithm operates on blocks of 2
In conclusion, the proposed scheme performs better than the solutions reported in the literature for multimedia sensing in WSNs. The evaluation of the power consumption and processing time of this scheme has shown very attractive results. We think that using the centroid distance method to extract the shape features of the detected object offers an interesting tradeoff between complexity and performance for efficient target recognition. In addition, this method is suitable in terms of efficiency and scalability for object identification.
7. Conclusion
This paper discussed a new approach for image-based object identification in wireless multimedia sensor network (WMSN). The main idea was to unload the source mote and the network from heavy data processing and transmission by the detection of event of interest before sending the information. The proposed scheme is flexible and scalable to the object features. We studied in this scheme the use of two methods for the target features extraction. The first is based on shape-based features and the second uses grayscale histogram features. We concluded that the use of centroid distance signature for object identification provides a tradeoff between scalability and complexity.
The performance of the proposed scheme was analyzed for object identification. It showed high performance and good scalability while keeping processing power low. The TinyOS implementation on the MICA2 platform showed good performance when compared to image transmission with a compression scheme.
As future work, we think that the application of other shape-based schemes for object identification such as curvature function will ensure higher performances. We think also that a CMOS hardware implementation of the described schemes might achieve very low power consumption.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This work is supported by the Research Center (RC1303102) of the College of Computer and Information Sciences, King Saud University.
