Abstract
Wireless sensor networks (WSNs) are widely used in detecting, locating, and tracking moving objects. The cheap, low-powered, and energy-limited sensors that are set up in large areas may consume large portion of energy and disable the whole network. In this paper, a new energy-efficient method based on Distributed Incremental Gene Expression Programming (DI-GEP) is proposed to collaboratively mine moving patterns of moving targets in order to turn on/off some sensor nodes at certain time to save energy further. Meanwhile, an adjustable sliding window is designed to quickly train the latest collected location data in order to improve the efficiency of DI-GEP. The simulation results show that the proposed method effectively prolongs the network lifetime by around 25% compared with the EKF and ECPA.
1. Introduction
Recent advances in low-power micro-electro-mechanical system (MEMS) technology, wireless communications, and digital electronics have made it possible to design and develop highly integrated, yet low-cost, low-power, multifunctional microsensor nodes, with the capabilities of sensing, processing, and wireless communications. Once deployed in a certain region, the wireless sensor networks (WSN), composed of thousands of sensor nodes, can work for several years. Through cooperative processing of these sensor nodes, WSNs work in many areas, for example, civil, military, health, and so on. For example, WSNs can be deployed in a hospital to track and monitor patients to remotely collect the physiological data of a patient continuously. Unlike traditional networks, WSNs are self-organized, application specific, and data centric [1].
A wireless sensor node is typically battery operated. Thus, the most important constraint in WSNs is the low energy consumption requirement among their sensor nodes. Sensor nodes carry limited, generally irreplaceable, battery-power sources. So, WSNs must focus primarily on power conservation and provide inbuilt trade-off mechanisms that give end users the chance of prolonging network lifetime at the cost of high quality of service (QoS). Sensor nodes may fail due to energy depletion and lead to network failure. So it is very important for WSN to operate energy efficiently. Raising research interest promotes us to develop energy-efficient protocols or algorithms for WSNs.
Target tracking is an important application in terms of WSNs [2]. Bayesian Network and Kalman filtering are two classical methods for achieving this task. One possible solution is as follows. The system state includes the position, direction, and velocity of the target. At each step, sensors near to the target form clusters and select a leader to perform the Kalman filtering, and the updated state is forwarded to cluster leaders chosen from the next step. The Kalman filtering implementation is straightforward in a centralized environment. But it is difficult in the extremely distributed environments such as WSNs due to the energy constraints and lower computation capability of sensor nodes.
The target tracking applications in WSNs are always limited by the inherent energy constraints of sensor nodes, aiming to improve the energy efficiency in target tracking applications in WSNs; the paper proposes a new scheme based on Gene Expression Programming (GEP) [3]. GEP is also adapted to fit for distributed environment. GEP works well in modeling the moving patterns of targets without aprior knowledge. Based on the historical location information of the target, GEP automatically evolves a trajectory of a moving object. To handle the problem, this paper makes the following contributions.
A new algorithm named Distributed Incremental Gene Expression Programming (DI-GEP) is proposed to mine moving patterns of targets. The basic idea is that DI-GEP runs at multiple collaboratively working sensor nodes to mine the trajectory of a target. An adjustable sliding window is adopted to ensure that distributed GEP can quickly train the latest collected location data. When new location data are received, old location data are discarded when prediction error exceeds a certain threshold, which is defined and can be calculated by (7). The policy ensures that succeeding evolutions can energy efficiently find latest moving patterns. Extensive simulations are conducted on OMNet++, a discrete event simulator, to show that new algorithms effectively prolong the network lifetime by about 25% in average when compared to other algorithms, that is, EKF and ECPA.
The rest of the paper is organized as follows. Section 2 presents the related work on energy-saving algorithms. Section 3 gives the preliminaries including GEP-related and target tracking. Section 4 introduces the target diction model. Section 5 formally defines the problems. Section 6 proposes the main algorithms in our scheme. Section 7 gives the experimental analysis. Section 8 concludes this paper and gives the future research directions.
2. Related Work
There are many research efforts on target detection and tracking in terms of WSNs, which describes several aspects of collaborative signal processing [2, 4, 5] and real-time application for biologists to find the presence of individuals [6]. A set of approaches presented in [7–9] were proposed recently to solve the target localization and tracking problem with proximity binary sensors, which transmit only one bit information to indicate whether a target is present. The information transmitted among sensor nodes was greatly reduced, while the localization error was increased. Shrivastava et al. [8] proved that the accuracy in tracking a target is of the order of
The lifetime of a WSN depends greatly on power consumption from each sensor node. Energy-efficient algorithms, protocols, and node hardware and software designing technologies can help prolong the lifetime of the network. Several approaches have been proposed at hardware and software levels to design energy efficient CPU, OS, algorithms, and communication protocols [1]. Dynamic power management (DPM) schemes have been proposed in [13–15] to reduce the power consumption by selectively turning off idle components, such as radio frequency (RF) transmitter, RF receiver, sensing device, A/D converter, and the sensor node.
Target tracking applications are special and have their own characteristics. It is unnecessary to turn on all sensor nodes because an object only appears at certain time and place. It is feasible to turn off some idle nodes if we can predict the time and place where the object will appear. Classical target tracking algorithms such as Bayesian Network and Kalman filtering cannot be directly used to predict moving patterns of targets in WSNs due to resource limitations.
To track moving targets energy efficiently, Allegretti et al. [16] proposed a solution based on CA (Cellular Automata) to reduce long distance communications among nodes because of its locally data exchanging scheme, but a higher power consumption is introduced because it cannot turn off those nodes that are far away from the moving object. Qing et al. [17] proposed ECPA (Enhanced Closest Point Approach) to predict the location of targets during the phase of moving, but the velocity and direction calculation algorithms with regard to the targets are computation intensive for sensor nodes, which often have low power and computation capability.
3. Preliminary
3.1. Introduction of Gene Expression Programming
Gene Expression Programming (GEP) was proposed by Ferreira in 2006 [3]. As a new member of Evolutionary Algorithm (EA) family, GEP is widely applied in data mining areas, that is, function finding, classification, association rule mining, time series prediction, parameter optimization, and digital circuit design, and so forth. In GEP, Genotype (Chromosome) and Phenotype (Expression Tree (ET)) are separated. Without prior knowledge, GEP automatically evolves over training data and discovers knowledge as mathematical formula depicting movement patterns of moving objects in WSNs.
In GEP, an individual, that is, a solution corresponding to a problem, is represented as linear fixed-length string named chromosome. It contains one or more genes. Each gene is decoded into a nonlinear expression tree (ET). Decoded ETs are linked together by prespecified linking function symbols such as plus (+) and minus (−). One chromosome represents one formula that is, the solution to a specific problem. Genetic operations are applied on chromosomes, that is, genotype and selection operations are performed on ETs.
A gene in GEP consists of head and tail. The head contains symbols from either function symbol set (F) or terminal symbol set (T) and the tail only contains symbols from terminal symbol set. The tail length satisfies (1), which guarantees that a gene can be decoded into a valid ET
Let Example 1.

Individual representation in GEP: 2-gene chromosome, 2 expression trees, and corresponding one mathematical expression.
3.2. Target Tracking in WSNs
There are several target tracking methods that adopt distinct models. Kalman filtering method requires that the velocity, direction, and acceleration of a moving object are given to predict the next location of a moving object. But it is impossible for WSNs to equip each node with these devices due to its lower cost and lower power supply. It is critically important to design energy-efficient target tracking algorithms for WSNs.
The tracking model is described in Figure 2. The monitored region is covered with sensor nodes that are distributed manually or randomly. A target moves along a trajectory that is unknown before and is detected by some sensor nodes that are depicted as black solid circle nodes. Suppose that the location data the moving object passed are known at time

Target tracking model.
Once the trajectory of a moving object is found and expressed as a mathematical function, then the following tasks are straightforward.
Predicting the locations that the moving object will appear at time Activating the necessary sensor nodes and turning off unrelated sensors to save energy.
Target-tracking applications need careful consideration of trade-off between tracking error and energy consumption. The tracking error is defined by the average target location estimation error of sensor nodes. The better trajectory leads to a better target location estimation.
To achieve the above goals, we propose three policies.
DI-GEP, that is, a trajectory discovery algorithm based on distributed incremental gene expression programming. It mines the trajectory in order to reduce tracking errors. Node scheduling algorithm. It activates the necessary nodes and turn off unrelated sensors at certain time in future to save energy. Sliding window strategy is used to improve performance of DI-GEP.
The experiments in Section 7 will demonstrate the effectiveness and efficiency of proposed methods.
4. Target Detection Model
Sensor nodes receive the physical signal and convert it to electrical signal. Based on the variation of electrical signal, sensors can detect the existence of target.
To describe the model formally, we make the following three assumptions.
There are N sensors There is a single target anytime. Note that, by Assumption ( (borrowed from [19]): Le
This study adopts a practical target detection model shown in Figure 3, which satisfies

Target detection model.
5. Problem Formulation
A trajectory of a moving object is treated as a sequence of time-stamped locations that are collected by sensor nodes around the target. It is described as follows.
Definition 2 (Trajectory).
A trajectory of a moving object is a time sequence with time interval
Given the historical location data Find a formula During evolving process, the size of sliding window h determines how many historical location data are used. The smaller h leads to less energy consumption and faster convergence speed. h is adjusted based on the location prediction error ε, geometric distance between the prediction value and the real measurement. ε should be as small as possible and is calculated by
The historical locations are listed in Table 1.Example 3.
Location data obtained.
Thus, f may be

An example of a tracking trajectory.
6. DI-GEP Scheme
6.1. Fitness Evaluation of Individual
In evolutionary computations, fitness functions and selection environments are the two very important faces of fitness and are, therefore, intricately connected. When we speak of the fitness of an individual, on the one hand, it is always relative to a particular environment and, on the other, it is also relative to the measure (the fitness function) we are using to evaluate them. Consequently, the success of a problem not only depends on the way the fitness function is designed but also on the quality of the selection environment [3].
Combining the fitness evaluation and prediction error, DI-GEP calculates the fitness of each individual in distinct populations by
6.2. Trajectory Mining Algorithm
The main steps of trajectory search algorithms are given below.
Sensor nodes are activated based on the node scheduling algorithm. Communications occur among sensor nodes when one node succeeds in obtaining a trajectory and notifies other nodes. Other nodes stop running their algorithms and obtain the trajectory to predict future location of the moving object.
Figure 5 details the flowchart of DI-GEP.

The workflow of distributed GEP.
DI-GEP stops if one of these stopping criteria is satisfied.
The maximum number of generations is reached. DI-GEP exceeds the specified runtime. One or more other nodes send stopping signal to the node. The node succeeds in obtaining a trajectory.
The implementation of DI-GEP is described in Algorithm 1. It mines a trajectory represented as individual in DI-GEP.
or null if no trajectory is found (1) load historic location data of size h and initial configuration (2) randomly create an initial population (3) decode each chromosome into one ET (4) calculate each chromosome's fitness by (8). (5) while (stopping criteria are not satisfied){ (6) select individuals, generate next generation (7) apply genetic operations sequentially on the new generation (8) decode each chromosome into ET. (9) calculate each chromosome's fitness (10)} (11) return an individual with best fitness.
6.3. Location Prediction and Node Scheduling
Once a trajectory is found, the model uses it to predict the location where the target will appear as at time
To reduce computational cost, we do not use a circle but a square to select nodes around
prediction length L. (1) for (2) for each (sensor node (3) if (4) and (5) (6) else (7) (8)
6.4. Incremental Evolution Strategy
In real-world practice, a trajectory of a target is very complex and variable. Thus, to improve the performance of DI-GEP, the key steps of our policy are as follows.
Trajectory is described as a curve. Any complex curve can be spitted into multiple simpler curves that are described as line segments, quadric curves, or cubic curves. This method can not only ensure the flexibility of modeling the trajectory but also guarantee less computation cost. To capture the variation of a trajectory and accelerate the evolving process, sliding window policy is proposed to keep a certain number of latest historical data during individual evolving. When the obtained function cannot represent the current moving behaviors, that is, the prediction error is greater than a certain threshold; distributed GEP and the node scheduling algorithm should run again with location data in sliding window.
We assume that a trajectory function

Sliding window.
The sliding window moves to DI-GEP and the node scheduling algorithm run again.
The performance of these algorithms is evaluated on OMNet++ and Castalia.
In order to compare the performance with other target tracking algorithms, we evaluate DI-GEP, ECPA, and extended Kalman filtering (EKF). The target cannot be found until the network fails, and the tracking task stops simultaneously.
7. Experiments
7.1. Sensor Networks Setting
Suppose that hundreds of sensor nodes are uniformly distributed in a square of 100 × 100 meters. A target can present at a random place in the WSNs and sends signal with the strength of
The sensing range
Parameters setting in DI-GEP.
7.2. Network Lifetime
Suppose that different number of sensor nodes are uniformly distributed in the grid network, this experiment analyses the impact of the number of sensor nodes on the network lifetime. The results are given in Figure 7. It shows that the three algorithms often obtain longer network lifetime when the number of sensor nodes gets bigger. The network lifetime of DI-GEP is averagely 35% longer than EKF and ECPA.

Network lifetime and number of sensor nodes.
7.3. Energy Consumption
In this experiment, four hundred sensor nodes are used to monitor the grid network. The sensor nodes are manually distributed at cross points in the grid network.
This experiment analyzes the energy consumption of three algorithms. The results are given in Figure 8. The results show that the total left energy decreases when the time passes. The total left energy cannot be zero because all these algorithms are invalid when the network fails. Note that, some sensor nodes consume their energy and cannot communicate with other nodes any more. Meanwhile, DI-GEP performs better than EKF and ECPA. This is because it consumes less energy than EKF and ECPA after running the same time.

Energy consumption and time elapsed.
7.4. Active Nodes
This experiment uses the similar setting as given in the previous section and testifies the influence of the number of the active nodes as shown in Figure 9. DI-GEP outperforms EKF and ECPA in node scheduling because of its better trajectory prediction, so the number of active nodes in DI-GEP is about 25, 30% less than that in EKF and ECPA, separately.

Active nodes and time elapsed.
7.5. Prediction Accuracy
This experiment uses the same setting as given in previous section and will testify that the prediction accuracy can heavily affect the network lifetime. The results are shown in Figure 10. Distributed GEP outperforms EKF and ECPA because of its better trajectory prediction, so at the same prediction accuracy, the network lifetime is averagely 28% longer than that in EKF and ECPA.

Network lifetime and prediction accuracy.
8. Conclusions
In order to track targets energy-efficiently in WSNs, we presented a distributed incremental algorithm based on GEP for target tracking applications in WSNs, proposed sliding window policy for distributed GEP to improve evolution process, proposed a new target tracking model, and give extensive experimental results to show the good performance of our method.
The future work includes (a) optimize DI-GEP to capture abrupt moving behaviors, (b) optimize DI-GEP to suit randomly distributed wireless sensor networks, and (c) consider border intrusion detection to save more energy in the initial state.
Footnotes
Acknowledgments
The work in this paper was partially supported by the National Science Foundation of China under Grant no. 61103043, and Youth Fund of Sichuan University (project No. 2030404134011 and 2010SCU11049). The authors would like to express their gratitude to these two entities for their financial and technical support.
