Abstract
In designing wireless sensor networks, it is important to reduce energy dissipation and prolong network lifetime. This paper presents research on the existing clustering algorithm applied in heterogeneous sensor networks and then puts forward an energy-efficient prediction clustering algorithm, which is adaptive to sensor networks with energy and objects heterogeneous. This algorithm enables the nodes to select the cluster head according to factors such as energy and communication cost, thus the nodes with higher residual energy have higher probability to become a cluster head than those with lower residual energy, so that the network energy can be dissipated uniformly. In order to reduce energy consumption when broadcasting in clustering phase and prolong network lifetime, an energy consumption prediction model is established for regular data acquisition nodes. Simulation results show that compared with current clustering algorithms, this algorithm can achieve longer sensor network lifetime, higher energy efficiency, and superior network monitoring quality.
1. Introduction
Over recent years, wireless sensor networks (WSNs) [1] have become a hot research and have a wide range of potential applications including environmental monitoring, military detection, industrial control, and home networks [2–5]. But in practical applications, in order to meet the demands of various applications for the technologies of sensor networks, increasing attentions have been attracted to the researches on heterogeneous wireless sensor networks HWSN [6].
HWSN is composed of different types of sensor nodes, which are in a wide range of applications [7–9]. For HWSN, it should be given priority to reduce energy dissipation in network operation, improve network load and stability, and prolong the network lifetime.
Energy consumption in networks can be effectively reduced by organizing clustering sensor nodes. Due to the dynamic and complex nature of energy configuration and network evolution, it is very difficult to design a clustering protocol which can save energy and provide reliable data transmission in heterogeneous networks.
In this paper, a new heterogeneous sensor networks model with heterogeneity of monitored objects and energy heterogeneity of all nodes is proposed. For the heterogeneous networks with such properties, in order to make more rational use of network energy and prolong the lifetime of the networks, this paper presents an energy-efficient prediction clustering algorithm (EEPCA).
EEPCA determines node energy factor by comparing the energy of a node with the average energy of other nodes within the communication range and determines communication cost factor according to the ratio of the average energy consumed in one communication within all nodes and the ideal average energy consumption after the node becomes the cluster head. The probability for nodes to become cluster heads is directly related to energy factor and communication cost factor. In order to save energy consumed by broadcasting energy information in each round of nodes clustering, an energy predication model is established for nodes whose data collection (such as temperature, humidity) is of regularity in time interval and message length. Considering the changes in networks environment and errors between calculated and actual node energy consumption, set the nodes that do not need to broadcast their energy information if the difference between the node residual energy in the initial stage at the current round and the predicted value at the last round is within a certain range. Simulation results show that EEPCA can achieve longer lifetime, higher energy efficiency, and superior network monitoring quality compared with other clustering protocols such as LEACH, SEP, and EDFCM.
2. Related Work
LEACH [10] is one of the most popular distributed cluster-based routing protocols in wireless sensor networks. In the initialization phase, LEACH carries out cluster head selection. In order to balance the load of all network nodes, LEACH elects cluster head nodes in each round, where is the proportion of optimal cluster heads. If the probability of a node i is less than the following probability threshold, it becomes the cluster head as follows:
However, LEACH has some constraints, including: (1) it does not take into account the optimization of the number of cluster heads; (2) LEACH therefore must base itself on two assumptions so as to achieve uniform energy consumption at per node: (1) the initial energy of each node is equal; (2) the energy consumed at each node when acting as the cluster head is equal.
Many researchers have done profound work probing into HWSN.
In [10], authors improved LEACH algorithm and put forward LEACH-C algorithm of electing cluster heads according to the residual energy. However, each node needs to know the total energy of the current network to determine whether it can become the cluster head, which requires support of routing protocols, and therefore distributed implementation is difficult to achieve. SEP [11] is designed for two-level heterogeneous networks. But in the multilevel heterogeneous networks, SEP does not suitable.
For further researches, a heterogeneous network model in term of different initial energies is discussed in [6, 12–14]. In [15], the authors introduced a cluster head election method using fuzzy logic to overcome the defects of LEACH. They investigated that the network lifetime can be prolonged by using fuzzy variables.
In [14], authors proposed EEHC protocol. This protocol selects cluster heads based on the weighted probability of each node related to the initial energy, the more initial energy, and the higher probability; the node will be selected as a cluster head. However, this protocol cannot predict energy consumption, so its performance is limited in heterogeneous networks in which parts of nodes are regular data acquisition nodes.
In [16], authors proposed EDFCM protocol, which applies to networks with three different kinds of heterogeneous nodes. Nodes in the networks model of this protocol fall into two ordinary types: one performing the function of managing information and the other collecting different data (type_0, type_1). type_1 has more complex hardware and software architectures, so it has more initial energy and greater data transfer capability. But the application of this protocol is limited to the networks with only two types of ordinary nodes.
In [17], authors proposed ERP clustering routing protocol. In this paper, an evolutionary algorithm with an appropriate fitness function is proposed with the intrinsic properties of clustering in mind.
3. System Model and Problem Description
3.1. Heterogeneous Model for Wireless Sensor Networks
To meet the demands of efficient monitoring, we describe our HWSN model with both different initial energies and monitored objects. The basic assumptions of networks model are as follows: the networks is located in a

(a) 100-node random heterogeneous network. (b) Dynamic clustering structure by EEPCA.
Therefore, nodes are heterogeneous in two ways: (1) heterogeneous data-acquisition regularity: some nodes are regular in acquiring data, and some are not. All regular nodes send
Nodes do not have any location information, but they can calculate the distance between nodes according to signal strength received. Nodes in the networks are organized in the form of clusters. Cluster heads perform the function of data fusion and are responsible for the resultant data transmission to the BS. There is only one BS in the networks.
Node initial energy is randomly distributed in
3.2. Energy Consumption Model
This paper applies a simple energy consumption model [10] to calculate energy consumption in communication, ignoring energy consumption of nodes in the process of computing, storage, and so forth. In the process of transmitting l bits message through distance d, the energy consumption of the transmitter is
Receiver's energy consumption is
3.3. Problem Description
EEPCA must take full account of the following:
algorithm should be fully distributive and self-organized. Each node must decide whether to become a cluster head or a member belonging to a cluster in the clustering phase [10]; nodes with more residual energy must have higher probability to become cluster head, and it must be ensured that the cluster has a smaller communication cost, but energy is not the only factor for cluster head selection; cluster load balancing must be ensured; in order to save energy consumption when nodes broadcast in initial clustering phase of each round, an energy prediction model of RDA nodes is established.
4. EEPCA Clustering Algorithm
4.1. Calculation of Distance between Nodes
Nodes can perceive their mutual distance according to attenuation of signal strength in the process of transmission. In clustering phase, all nodes use certain transmission energy for broadcast. For instance, with energy
The nodes establish a routing table of neighboring nodes based on received data and save all relevant information of all nodes within its communication range. All nodes in the networks are marked by the only integer value, which is each node's ID. The information stored in the routing table includes the distance between the node and its neighboring nodes, cluster head node's ID, the distance to the cluster head, the current energy, and predicted energy consumption.
4.2. Cluster Head Selection
The cluster head node has to perform extrafunctions such as data fusion and relaying messages. In order to prevent some nodes from dying too soon due to excessive energy cost, the nodes with more residual energy should be given greater opportunity to become cluster heads, and all nodes take their turns to be cluster head nodes.
Set
Ideally, nodes are distributed uniformly and send back data at identical frequency and length. Set
The number of optimal cluster heads is [13]
Therefore, the proportion of optimal cluster heads is
In the initial stage of clustering, for any node i, there are a total of n nodes within its communication range, of which the distance between
Consider the nodes distribution in the networks. If after nodes have been clustered, the average distance between nodes within the cluster and cluster head is far, a high communication cost is inevitable for one communication within the cluster. Set
On an ideal occasion that nodes in the networks are uniformly distributed and every data transmission sends data identical in length l, the number of nodes in each cluster is
Therefore, the number of these two types of nodes is
The random distribution of nodes can be viewed as a Poisson point process [19]. Ideally, there are n points in circle A, and their locations which are uniformly distributed in A are mutually independent random variables.
A circle can be obtain after any radius revolves around the center, so consider the distribution of points on a random radius. Points are distributed uniformly in the circle, and accordingly, the density of points is proportional to radius squared. Therefore, the probability density of points on a random radius is
Therefore, the calculation of
By formula (15) and (16), the average distance expectation of nodes whose distance to the cluster head is less than
The average distance expectation of nodes whose distance to the cluster head is more than
Therefore, ideally the average energy consumption within one data transmission in the cluster is
By formulae (10) and (19), communication cost factor
Integrating node energy factor and communication cost factor, the probability for node i to become the cluster head node is
The constraints of LEACH threshold formula
4.3. Energy Consumption Prediction Mechanism
Obviously, after the networks complete a round, a new node need to be selected as the cluster head. Because it is necessary to re-evaluate the energy factor and the communication cost factor so as to determine the probability for the node to become the cluster head, the current node residual energy must be obtained. The easiest way is that all nodes in the networks carry out a broadcast through the method utilized in the first round of clustering. However, considerable energy will be consumed when broadcasting in each round of clustering, so this paper establishes an energy consumption prediction mechanism for RDA nodes.
In
The residual energy of node j can be predicted at the beginning of r round when
Due to reasons such as networks environment changes, when r round starts, all nodes need to be reclustered, and new cluster head nodes need to be elected. Node j determines whether its current residual energy is close to the residual energy predicted in the last round or not as follows:
If γ is less than constant ε, the energy predication error can be tolerated. In the initial phase of r round, node j does not broadcast its energy information and the remaining nodes update node
5. Simulation Experiment
5.1. Establishment of Simulation Environment
To evaluate performance of the algorithm proposed in this paper, we have made a simulation experiment on it with the help of MATLAB simulation software. The experiment simulates a sensor network randomly formed within a
Parameters used in simulations.
5.2. Experiment Results and Analysis
In EEPCA, α and β are calculation factors regulating the proportion of energy factor and communication cost factor in calculation
Figure 2 shows the death time of the first node, 10% nodes and 50% nodes when the values of α and β vary in the above circumstances. It can be seen when the values of α are in the vicinity of 0.74, death time of the first node and 10% nodes appears the latest; while when the values of α are within the range of 0.66–0.68, death time of 50% nodes appears the latest. In subsequent experiments, the values of α and β are unified as 0.7 and 0.3.

Influence of α and
In the previous experimental environment, EEPCA and LEACH, SEP, and EDFCM will be compared and tested to analyze EEPCA cluster head selection mechanism's impact on the algorithm performance when all nodes are heterogeneous.
The simulation results in Figure 3 show the variation of the number of dead nodes over time in the previous experimental environment in different algorithms. It can be seen in Figure 3, LEACH cannot make good use of the additional energy of heterogeneous nodes, the stable period is very short, and nodes die at a fixed speed rate. Compared with LEACH, SEP has longer stable periods. EEPCA and EDFCM curves are lines with smaller slope versus X-axis. Because EEPCA distributes energy consumption uniformly on each node in the heterogeneous network, the death time of the first and the last node is relatively closer.

Death time of nodes.
In the previous experimental environment, change the proportion of heterogeneous nodes in the total number of nodes, and observe the performance of each algorithm. Figure 4 presents the number of rounds from the beginning to the death of the first node when the proportion of heterogeneous nodes varies from 0 to 100%. In this experiment, the initial energy of all nonenergy-heterogeneous nodes is

Death time of the first node when the number of heterogeneous nodes changes.
Before the death of 10% nodes, the network can send back to the BS data of high quality and reliability [13]. So Figure 5 presents the number of rounds from the beginning to the death of 10% nodes, namely, the stable period.

Network stable period when the number of energy-heterogeneous nodes changes.
It can be seen that as LEACH is not a clustering algorithm for heterogeneous networks, with the increase of the proportion of heterogeneous nodes, attainable network stable period quickly reduces. SEP can obtain 25% more stable period than LEACH, which is basically consistent with the experimental results presented by [11]. As EDFCM takes into account heterogeneous energy of different nodes, it gets longer stable period than SEP. EEPCA takes into account the energy consumption of nodes in the communication process in addition to residual energy, so the decline rate of stable period is significantly less than other algorithms in the process of increasing proportion of heterogeneous nodes. Therefore, with greater proportion of heterogeneous nodes, a more stable period is obtained.
To go further, RDA nodes are introduced into the experiment. Set all nodes energy in the networks to be heterogeneous, 50% nodes to be RDA nodes, and 10% nodes to be malfunctioning. All RDA nodes send messages 3–7 times in a round, and the sizes of messages are valued randomly between 2000 and 6000 bits. Examine the impact of the constant ε on networks stable period.
Figure 6 shows when the value of ε is near 0.92–0.93, the network achieves maximum stable period.

Impact of constant ε on network stable period.
RDA nodes are introduced, and the stable periods of all the algorithms are examined. This experiment sets all nodes that are energy heterogeneous, 50% of which are RDA nodes, constant

Network stable period.
Obviously, due to the introduction of energy consumption prediction mechanism, broadcast frequency in the clustering phase in each round is effectively reduced. Therefore, in a network heterogeneous in two ways, initial energies and monitored objects, EEPCA makes significant improvement in network stable period compared with the other three algorithms.
Figure 8 shows that all the nodes are energy heterogeneous, 50% are RDA nodes, and 10% of the nodes are malfunctioning. In EEPCA, the number of messages received by BS is on linear rise for a long period of time, while in other algorithms, the growth rate of the number of messages received by BS begins to decline earlier. To sum up the total number of messages sent back to BS by all nodes in these four algorithms when the network fails, the amount of data collected by EEPCA is much larger than that by the other three algorithms. Therefore, EEPCA has better network monitoring quality.

The number of messages received by BS.
6. Conclusions
In this paper, we describe the HWSN model with both different initial energies and monitored objects. We present an effective energy prediction clustering algorithm EEPCA for multilevel heterogeneous sensor networks. In EEPCA, each node independently selects itself as the cluster head node based on energy factor and communication cost factor, which leads to the probability of cluster head election related to nodes' current residual energy and average communication cost after being selected. At the same time, with the consideration that the WSNs are frequently used to monitor objects such as temperature and humidity which need to report data regularly, and the length of reported data is usually fixed, an energy consumption prediction mechanism is established for RDA nodes. Simulation results show that compared with LEACH, SEP, and EDFCM, and EEPCA can achieve longer lifetime, higher energy efficiency, and better network monitoring quality. Its performance is superior to other protocols.
Footnotes
Acknowledgments
This work was supported by the opening Project of State key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (SKLNST-2010-1-03), the Technology Fund of support in Sichuan Province (2009GZ0153, 2011GZ0188).
