Abstract
In the PEDAP algorithm, a minimum spanning tree considering the energy consumption is established based on the Kruskal algorithm, and updated every 100 rounds. There exists a defect that the energy of some nodes rapidly expires because the degrees of nodes differ significantly, and the delay time is not considered. Based on the above analysis, a new algorithm called DADAT (a Degree-based Adaptive algorithm for Data Aggregation Tree) is proposed. The energy consumption and the delay time are both considered, and a weight model to construct a minimum spanning tree is established. Furthermore, the node degree on the tree is readjusted according to the average degree of the network, and nodes are labeled by red, yellow, and green colors according to their remaining energy; the child nodes of the red nodes are adaptively transferred to their neighbor nodes which are labeled as green. Finally, we discuss the weight and the update rounds' impact on the network lifetime. Experimental results show that the algorithm can effectively balance the energy consumption and prolong the lifetime of the network, as well as achieving a lower latency.
1. Introduction
With the development of wireless communication technology, more and more high-performance, low-cost products are widely used and this significantly promotes the progress of the wireless sensor networks (WSNs). Wireless sensor networks are composed by a large number of stationary or mobile sensors. Its widely used in national defense, environmental monitoring, traffic management, biomedicine, disaster monitoring, manufacturing, gas turbine engine testing, and other fields [1, 2]. Wireless sensor nodes have the characteristics of large number as well as small size, limited communications, and computing ability; the power of nodes' battery is limited and cannot be charged. In addition, its initial energy is heterogeneous. Therefore, how to utilize energy effectively, to ensure effective data transmission and prolong the network lifetime, will be the key issues in the process of design the network algorithms or protocols.
The sensor nodes in the monitoring region always send its information to sink node by the way of ad hoc network, and the sink node is located at the centre or on the border of the network, so the other nodes must transmit data directly or indirectly to the sink node through single hop or multiple hops. Densely deployed sensor network for data collection and transmission contains large number of redundant data, but the node energy consumption in data process is far less than that in the data transmission. Accordingly, the data aggregation techniques to minimize the energy consumption of data transmission are necessary and helpful to reduce the amount of the transmission data and achieve efficient utilization of energy. The mediate nodes aggregate the received data based on aggregate functions, segment the aggregated data into fixed-size packets, and then transmit to the next hop node [3], and thus the redundant information and unnecessary energy consumption is reduced in communication process according to the data correlation. To solve the problem, the most common method is to build an aggregation tree rooted at the sink [4, 5], in which each mediate node aggregates the data from their children and the data their own sensing information into a fixed-size packet and then sends to its parent node. As the tree is the minimal connectivity subgraph structure in a network, the tree-based data collection protocol would ensure network connectivity and reliability with guaranteed QoS effectively and easily implement efficient energy saving and so forth, and the tree-based schemes become a hot one currently. Typical tree-based data collection protocols include EDAET (an energy-aware distributed heuristic) [6], PEDAP (power efficient data gathering and aggregation protocol) [7], MNL (maximum network lifetime) [8], IAA (iterative approximation algorithm) [9], CNS (center at nearest source), SPT (shortest path tree), and GIT (greedy incremental tree) [10]. In practical applications, the amount of sensor nodes is great and the nodes are widely distributed, so the topology control and the data transmission are hard, and some distributed algorithms are difficult to be carried out considering the data aggregation. Sensor nodes independently select the data return path, the path formed by the intersection is random and cannot implement the effective aggregation of sensing data, so it can be further optimized.
In addition, the source nodes have a large number of data packets sent to the sink in case of the network link failure; the time waiting data packet and the different data aggregation computing time lead to an increase in the network delay. In many real-time applications, especially in target tracking, medical care, fire monitoring, and structural health diagnosis, it requires the strict time limits for data packet transmission. Also, the outdated information may lead to a negative impact on detection system, and the data in complex delay-constrained network is required to be transmitted with priority to eliminate the queuing delay on the mediate nodes. Therefore, during the network algorithm design, we have to consider how to establish efficient routing strategies for data aggregation and reduce the energy consumption and network data transmission delay to ensure the real-time application requirements and prolong the network lifetime.
Based on the above analysis, several key factors are considered in our scheme, such as the number of hops, residual energy, aggregation factor, and so on, and then a distributed algorithm based on the degree of adaptive dynamic aggregation tree (DADAT) is proposed. In DADAT, the weight of the node model is established, and then an aggregation tree rooted at the sink is built up based on the Kruskal's greedy algorithm. In addition, the nonleaf node updates and maintains the tree according to the average degree and its own energy information and ensures a balanced use of energy in each round. The data packets are aggregated in nonleaf node based on aggregation strategy, and this can reduce the amount of data transmitted, so that the node energy is consumed in a more efficient way, and the network lifetime is prolonged and a lower latency is achieved.
2. Network and Communication Model
2.1. Network Model
Assume that N sensor nodes are randomly deployed in the area of size all nodes no longer move after deploying; all nodes in the same structure and with the same initial energy and energy cannot be added; each node has the same communication radius r; all nodes know the network location information of all neighbors.
2.2. Communication Model
2.2.1. Energy Consumption of Communication
Our energy model is the same as what is described in the literature [11], the energy consumption of sending data and receiving data according to the wireless communication model is formulated, respectively, as follows, and
2.2.2. Communication Delay
To save energy, MAC layer communication protocol should work between the sleeping and listening modes. First, all nodes in the network are set in sleeping state, and some will switch to the listening mode when they are awakened and begin aggregating data while listening mode is completed. When the transmitting process is completed, these nodes will switch to sleeping mode again and continue to wait the beginning of the next listening period. Therefore, the delay for the node is calculated as
2.2.3. Aggregation Strategy
In the PEDAP, the
Therefore, there exists a relationship between the original data packet and the aggregated data packet for the node i as listed in (6), where
3. Data Aggregation Based Spanning Tree Algorithm
In GIT algorithm, the distance between each pair nodes is taken into account to build a tree which starts from the sink node, and its performance is much better. However, it needs the global information and is difficult to be achieved in the distributed pattern. In PEDAP algorithm, it is a near optimal minimum spanning tree-based routing scheme, and the tree structure is updated for each 100 rounds. Because it does not take into account the residual energy of nodes, the minimal energy nodes are easily to be dead. PEDAP-PA uses the information about the residual energy of each node and considers the balance of energy consumption.
3.1. Weight Model
To solve the problem, we select the numbers of hops, the residual energy, and aggregation coefficient to redesign the data collection scheme to prolong the network's survival time while ensuring data accuracy. The weight model is formulated as follows:
Based on (7) and (5), we have
3.2. The Establishment of the Tree and Adaptive Maintenance
In Section 3.1, we create a weight model, establish a tree starting from the sink node, and select the maximum weight node to be added to the current tree until all nodes in the network are added to the tree, which is reconstructed after a certain number of rounds.
In the process of establishing tree, we take into account the hops, residual energy, and amount of data, but the load of the node cannot be balanced. Some nodes have more children while others have few; this may lead to sharp energy consumption on some nodes. To solve this subproblem, we have to reorganize some nodes in the network. First, we compute the average degree of nonleaf nodes and let the degree of other nodes approach the average degree of nonleaf nodes. For example, in Figure 1, the average degree of nonleaf nodes is 2.3 and the degree of node 1 is 4, which may lead to rapid energy consumption on node 1, and we introduce the node transfer operation to reduce the energy consumption. For node 3, its current parent is node 1, and its neighbor's parent is node 2 whose degree is 2 and less than the average degree of nonleaf nodes, so node 2 may be a potential parent for node 3. If we let node 3 be a child of node 2, the energy consumption of the network should be well balanced, and thus the network's lifetime is prolonged.

The node transfer operation.
According to the above method, some nodes consume more energy than others do in the network because the nodes in different layers consume energy differently. When the residual energy of these nodes is not enough to meet the energy requirements of the transmission of data, this will result in the loss of the data packets. To solve this problem, we can transfer the children of these nodes to the node with more residual energy in the same layer to prolong the lifetime of the network. The node is labeled in red color if the node's residual energy is less than a given value
In Figure 2(a), node 3 is a red node, node 4 is a yellow node, others are green nodes, and node 3 will die soon if the network topology control does not change in time. For node 3, node 2 and node 4 become the potential parents for the children of node 3. It is noted that node 4 is yellow already but node 2 is green, so node 2 becomes the parent of node 6. Therefore, this can well balance the energy consumption for node 3 and its neighbor nodes and prolong the network's lifetime.

The transfer diagram among red nodes, yellow nodes, and green nodes.
Such a tree is adaptively revised based on average degree to make the tree maintain a better structure. It is helpful to prolong the network lifetime while ensuring the data latency and data volume requirements. Algorithm for the specific process is listed as follows.
Step 1.
Set sink as the root, then apply the Kruskal greedy algorithm, select the node which has the largest value based on formula 11, and join the tree firstly; turn to Step 2 .
Step 2.
Calculate the average degree of nonleaf node, transfer the node based on it average degree of the node, and then calculate the degree of each node, turn to Step 3 .
Step 3.
Calculate the residual energy of each node; if the number of dead nodes reached a certain number, the algorithm stops; otherwise divide all nodes according to the residual energy into red, yellow, and green nodes. And then transfer the red node's children to the green nodes in the same layer, and calculate the degree of each node; turn to Step 4 .
Step 4.
When the number of rounds to meet certain conditions, turn to Step 1 , rebuild the new aggregation tree, or turn to Step 2 .
And the algorithm flowchart is shown as in Figure 3.

Algorithm flowchart.
4. Experimental Analysis
4.1. Experimental Environment
Some algorithm parameter values are listed in Table 1. To determine the weight factor a, we have to analyze the number of rounds when the first node dies.
Experimental parameters.
As shown in Figure 4, DADAT has a longest network lifetime when

The relationship between weight and lifetime.

The relationship between update numbers and lifetime.
4.2. The Survival Node Numbers of the Network
In this section, we compare the algorithm performance among GIT, PEDAP, PEDAP-PA, and DADAT. Figure 6 shows the comparison of four algorithms with the survival node numbers in the same round. We find that DADAT can prolong the lifetime of the network, because DADAT in the establishment of a tree adopts the adaptive revision and the updating round number is optimized. Therefore, nodes in the same layer almost consume the same energy in each round and die almost in the same time, and no node is to be the bottleneck node to influence the transmission performance of the network. However, PEDAP and PEDAP-PA algorithms establish the tree under the condition of energy efficience because some nodes in the tree die due to its high degree. After GIT builds the tree, it can be further optimized because there do not exist maintenance operations.

The relationship between the number of rounds and survival node numbers.
4.3. Network Delay
Figure 7 shows the network delay comparison among GIT, PEDAP, PEDAP-PA, and DADAT algorithms under the different networks scale. As shown in Figure 7, DADAT achieves a minimal delay because DADAT algorithm takes into account the number of hops to be the delay factor while the other three algorithms do not consider the distance factor and the node residual energy. However, the other three algorithms did not consider this factor of the number of hops, so it leads to the network average delay being too large.

The delay under different node numbers.
4.4. The Energy Consumption
We compare the energy consumption of the four algorithms: GIT, PEDAP, PEDAP-PA, and DADAT. As shown in Figure 8, we can see that the trend of energy consumption is similar because they are all based on the minimum spanning tree topology. However, DADAT has minimum energy consumption than others; this because it is based on energy-aware simultaneously maintenance tree operations. The GIT only considers the distance factor, but the energy consumption and the updating tree operation are absent, so the tree has maximum energy consumption. The PEDAP-AP algorithm considers the node residual energy; therefore PEDAP-PA algorithm's energy consumption is less than GIT and PEDAP, but PEDAP-PA does not frequently adopt the adaptive tree maintenance, so it is not as effective as DADAT.

The residual energy of different algorithm.
In addition, we can conclude that our algorithm has good scalability from Figure 6 to Figure 8; also it displays some good properties in different scenarios (sink node is located both in the center and on the edge).
5. Conclusion
In this paper, data collection problem is studied. We consider the number of hops, the residual energy, and aggregation factor, and then a distributed dynamic aggregation tree algorithm based on constrained-degree is established. The degree of nodes in the tree can be adjusted according to the average degree of the tree, and then all nodes are labeled by red, yellow, and green colors according to their remaining energy; red nodes can transfer their child nodes adaptively. We finally discuss the weight and the updating rounds' impact on the network lifetime. The simulation results show that our proposed approaches can effectively balance the energy consumption, prolong the network lifetime, and result in a lower latency.
In the next work we will consider heterogeneous data aggregation in wireless sensor networks, while meeting the QoS requirements of a boundary [13].
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This project is supported by the National Natural Science Foundation of China (Grants nos. 60703118, 71271165, and 61373174), the Open Research Funds of CEMEE (Grants nos. CEMEE20120207B and CEMEE20140302A), and Zhejiang Province Research Funds (2010R50041 and 2011C14024).
