Abstract
Data communication incurs the highest energy cost in wireless sensor networks and restricts their application. Data compression is a promising technique that reduces the amount of data exchanged between nodes and thereby saves energy. However, there is a lack of effective methods to evaluate the efficiency of data compression algorithms and to increase nodes’ energy efficiency. Because the energy saving of nodes is related to both hardware and software, this article proposes a new scheme for evaluating the energy efficiency of data compression in wireless sensor networks according to the node’s hardware and software. The relationship between the energy efficiency and the hardware and software factors is expressed by a formula. In this formula, energy efficiency can be improved by increasing the compression ratio and decreasing the ratio of
Introduction
Data compression is important for improving the energy efficiency of data storage and wireless communication. With the advancement of big data technologies, 1 the effect of data compression on energy saving will become more prominent. Energy saving is a core issue of wireless sensor networks (WSNs), since the nodes are powered by batteries that are extremely difficult to replace or recharge on a large scale. Because data transmission consumes the majority of a node’s energy, and the energy required to transmit a single bit is approximately equal to the energy for executing 4000 instructions, 2 data compression is a natural means of saving energy in WSN nodes. In addition to research on various compression algorithms applicable to WSNs, 3 some work proposes hardware accelerators that speed up data compression to improve algorithm performance. 4
Compression ratio is the main metric to evaluate the performance of data compression algorithms. 5 Increasing the compression ratio of algorithm means reducing the amount of data transmission, thereby reducing the communications energy consumption, but also tends to increase the computational energy consumed by compressing data, especially for wireless sensor nodes with limited resources. An obvious tradeoff exists between the computational energy used for compression versus the energy saving associated with the transmission of compressed data instead of raw data. In order to ensure the actual energy-efficiency, we cannot simply choose a high-compression ratio algorithm, but should focus on the overall energy efficiency of the data compression. In other words, we must ensure that the energy consumed to communicate the amount of data reduced by compression is greater than the energy consumed to compress the data to reduce this amount of data.
Energy efficiency of data compression algorithms in WSNs reflects energy-saving obtained by data compression.6,7 The higher the energy efficiency of data compression, the more energy saved by compressing data. Each data compression algorithm has its own advantages and disadvantages, and the actual energy efficiency of the compression algorithm is closely related to the hardware conditions of the nodes executing the algorithm. Although WSNs are data-centric, considering the human-centric applications, 8 the energy efficiency changes as the hardware or its parameters of nodes change. So the energy efficiency evaluation scheme of the data compression algorithm must be based on the hardware and software considerations in WSNs.
Analysis of the three kinds of energy consumption in WSNs (transmitting raw data, transmitting compressed data, and compressing data) shows that, in addition to the compression ratio, the complexity and operating environment of the algorithms and the wireless communication environment are non-negligible factors. Like the compression ratio, the operating environment and execution efficiency of the compression algorithm are main factors affecting the overall energy efficiency of data compression. A node’s wireless communication environment (the transmission power of radio frequency (RF) devices, the receiver sensitivity, and the scene environment, which determines the relationship between radio signal strength and communication distance) is not directly related to data compression. It does, however, directly affect the energy consumed by transmitting data, and therefore indirectly affects both the energy saved by compressing data in nodes and the energy efficiency of data compression algorithms.
Two parts of energy are involved in the energy efficiency evaluation of data compression. The first is the energy saved by reducing the amount of data transmitted, which relates to the compression ratio of the algorithm. The second is the energy consumed by running the compression algorithm. In current research on the performance evaluation of compression algorithms, most studies take only the compression ratio into account9,10,11 or combine the compression ratio with the complexity of algorithms 12 as well as compression error,13–15 and so on. Only a few studies have proposed evaluation metrics involving energy consumption such as
At present, there is still a lack of a comprehensive, effective, and quantitative method that can be used both to evaluate the efficiency of data compression algorithms and to help increase the energy efficiency of nodes in WSNs; the method should also be simple and feasible. Therefore, it is necessary to study a scheme to evaluate data compression algorithms objectively and effectively, so as to provide a quantitative basis for the selection of energy-efficient algorithms and further enhancement of energy efficiency. The scheme should take the direct and indirect factors above into account under the given operation environment of algorithms and nodes’ wireless communication environment.
According to the consideration above, this article proposes an evaluation scheme that is based on current energy efficiency metrics. The energy efficiency evaluation of data compression is dependent on the hardware and software implementations of compression algorithms. In the scheme, it is revealed by a formula that in addition to compression ratio, energy efficiency of data compression algorithms is also dependent on hardware factors, such as relative energy consumption of processor and RF devices, as well as software factors including relative energy consumption of algorithms. Two typical lossless data compression algorithms are evaluated by the scheme. For the nodes whose wireless communication power is controllable, an adaptive mechanism based on the scheme is proposed to save more energy through selecting the more effective algorithm adaptively while other conditions remain unchanged.
The article is divided into six sections: section “Current energy efficiency evaluation metrics” introduces the main evaluation metrics involving energy consumption currently in WSNs. In section “Evaluation scheme,” an evaluation scheme is proposed by analyzing the shortcomings of current schemes. In section “Evaluation experiment and analysis,” the scheme is used to evaluate typical lossless data compression algorithms in WSNs. An adaptive mechanism for the selection of compression algorithms is proposed for nodes whose wireless communication power is controllable in section “Mechanism to select algorithms.” Section “Example of algorithms selecting” gives an application of the mechanism. Section “Conclusion” concludes the article.
Current energy efficiency evaluation metrics
Among the current energy efficiency evaluation metrics for data compression, compression ratio is the most important and most widely applied; the other metric involves
Compression ratio (Rc )
Compression ratio is a ratio between the volume of the data reduced by data compression and the raw one. Its expression is shown as follows
In the formula,
As the most important evaluation metric of data compression in traditional applications, compression ratio also plays an irreplaceable role in the energy efficiency evaluation of compression algorithms in WSNs. Different compression algorithms have different compression ratios for the same data. The higher the compression ratio, the less the amount of data to be transmitted. Therefore, it affects the energy efficiency of algorithms directly.
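The verbal definition above (the volume of data removed by compression, relative to the raw volume) can be written out as follows. The symbols L_raw and L_comp are placeholders introduced here for the raw and compressed data volumes; they are not taken from the original equation.

```latex
R_c = \frac{L_{\mathrm{raw}} - L_{\mathrm{comp}}}{L_{\mathrm{raw}}}
    = 1 - \frac{L_{\mathrm{comp}}}{L_{\mathrm{raw}}}
```

Under this reading, compressing 512 bytes to 384 bytes would give R_c = 128/512 = 0.25.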
D/E
In equation (2),
ESB
In equation (3),
Compression error
Compression error is defined as the error between the decompressed data and the raw data; it applies mainly to lossy data compression algorithms. Signal-to-noise ratio (SNR), root mean square error, and peak error are the error indexes generally used in data compression. Compression error reflects the accuracy of compressed data, and it also influences the compression ratio. For the same algorithm, the larger the compression error, the higher the compression ratio.
The complexity of algorithms
While data compression algorithms can reduce the amount of data to be transmitted, running them consumes energy, and this energy is closely related to the complexity of the algorithms. The complexity of algorithms includes time complexity and space complexity. The higher the complexity, the longer the running time and the larger the energy consumption. In general, the complexity of algorithms is not necessarily associated with the compression ratio.
In conclusion, compression ratio, compression error, and complexity of algorithms are not enough to reflect the energy efficiency of compression algorithms in WSNs. In terms of the definitions,
Evaluation scheme
According to the energy saved by compressing data and the energy consumption for running data compression algorithms in WSN node, the net energy income obtained by running an algorithm to compress given data is calculated as shown in Figure 1.

Figure 1. Net energy income obtained by data compression.
Since
To reflect the effectiveness of net energy income obtained by compressing data, the percentage of
In equation (6),
In equations (7) and (8),
It is obvious that
As shown in equation (9), in addition to
The factors only related to hardware are defined as hardware coefficient
It is shown in equation (10) that
The factors only related to algorithms and data exchanged are defined as software coefficient
As shown in equation (11),
From equations (9) and (11),
It is shown in equation (12) that energy efficiency of data compression algorithms is not only related to the compression ratio
According to
According to
Equations (13) and (14) are substituted into equation (10) as
When MCU and RF chip have the unified voltage source,
With offline evaluation,
It can be found from equation (17) that there are two ways to obtain
It can be known from equation (12) that
From equation (18), it can be found that when
So equation (12) can not only be used to evaluate the energy efficiency of different compression algorithms under the different hardware conditions, but can also reflect the applicability of algorithms under the specific hardware condition and the energy-saving effect of algorithms. It also points out a possible direction to further improve the energy efficiency of algorithms.
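The dependence described above can be sketched numerically. The snippet below is a minimal illustration, not the article's formula verbatim. It assumes that the net energy income is the transmission energy saved minus the compression energy, normalized by the energy of sending the raw data. Under that assumption the efficiency reduces to the compression ratio minus the ratio of the software coefficient s to the hardware coefficient k. The function names and the units chosen for s and k are assumptions, and the exact form should be checked against equation (12).

```python
def energy_efficiency(rc: float, s: float, k: float) -> float:
    """Net energy saved by compression as a fraction of raw transmit energy.

    rc: compression ratio, (raw - compressed) / raw
    s:  software coefficient, assumed here to be CPU cycles spent per raw bit
    k:  hardware coefficient, assumed here to be the number of CPU cycles
        that cost as much energy as transmitting one bit
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return rc - s / k


def is_worth_compressing(rc: float, s: float, k: float) -> bool:
    # Compression pays off only when the net energy income is positive.
    return energy_efficiency(rc, s, k) > 0


# Hypothetical numbers, for illustration only.
print(energy_efficiency(rc=0.5, s=20.0, k=100.0))
print(is_worth_compressing(rc=0.1, s=20.0, k=100.0))
```

In this form a larger k (relatively expensive radio) or a smaller s (cheaper algorithm) pushes the efficiency toward its upper bound, the compression ratio itself.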
Evaluation experiment and analysis
In this evaluation experiment, nodes and compression algorithms are selected from Sadler and Martonosi 19 and Marcelloni and Vecchio. 20 The nodes selected are T-mote Sky and TinyNode, and the tested algorithms are S-LZW and S-Huffman, both of which are generally used in WSNs. LZW is a dictionary-based lossless compression algorithm, and S-LZW is an adapted version of LZW. Huffman coding is a statistical compression algorithm, and S-Huffman is a modified version of Huffman coding. S-LZW and S-Huffman are both designed specifically for resource-constrained sensor nodes.
The evaluation of coefficient k of T-mote Sky and TinyNode
The hardware parameters of T-mote and TinyNode are shown in Table 1.
Table 1. The hardware parameters of T-mote and TinyNode.
The parameters in Table 1 are substituted into equation (16), respectively, each coefficient
It can be seen from equations (19) and (20) that when the energy consumption for executing instructions is the same as that for transmitting 1 bit, MCU can execute 136 CPU cycles in T-mote. Similarly, MCU can execute 397 CPU cycles in TinyNode.
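The reading of k as "CPU cycles that cost as much energy as one transmitted bit" can be sketched as follows. This is a minimal illustration, assuming the MCU and RF chip share one supply voltage so that the voltage cancels and only currents, clock frequency, and bit rate matter. The function name and the numeric inputs are hypothetical; they are not the values of Table 1.

```python
def hardware_coefficient(i_tx_a: float, bit_rate_bps: float,
                         i_mcu_a: float, f_mcu_hz: float) -> float:
    """CPU cycles whose energy equals that of transmitting one bit.

    Assumes a common supply voltage, which then cancels out of the ratio.
    """
    charge_per_bit = i_tx_a / bit_rate_bps    # coulombs drawn per bit sent
    charge_per_cycle = i_mcu_a / f_mcu_hz     # coulombs drawn per CPU cycle
    return charge_per_bit / charge_per_cycle


# Illustrative inputs only: 20 mA radio at 250 kbit/s, 0.4 mA MCU at 1 MHz.
k = hardware_coefficient(20e-3, 250e3, 0.4e-3, 1e6)
print(round(k))
```

A radio that draws more current per bit, or an MCU that draws less per cycle, yields a larger k, which is why the same algorithm can score differently on different nodes.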
The evaluation of coefficient s of S-LZW and S-Huffman
The data to be compressed are generated by MATLAB as shown in Figure 2. In data series (1) ∼ (3) of Figure 2, the white noises whose variances are, respectively 0.5, 3, and 6 are added to original signals, and the length of data is 512 byte (

Figure 2. The data sets with different variances.
MATLAB is used to generate the test data because compressibility tests are easily affected by the redundancy or dispersion of the data itself. With measured data, it is difficult both to rule out the influence of the data’s own characteristics on the test results and to control its redundancy or dispersion. Although data generated in MATLAB are not guaranteed to be exactly the same each time, their dispersion or redundancy is convenient to control. This method makes it possible to test different algorithms on data with the same distribution features, and to test the compression effect of an algorithm on data of different dispersion by changing the variance.
The data in Figure 2 are compressed by S-LZW and S-Huffman, respectively.
The test results of compression by S-LZW and S-Huffman.
According to the evaluation scheme proposed in this article,
Evaluation results of S-LZW in T-mote Sky.
Evaluation results of S-Huffman in TinyNode.
As shown in Tables 3 and 4, the value of
Mechanism to select algorithms
Network connectivity and coverage are two important problems in WSNs, and questions such as how to reasonably connect randomly deployed sensor nodes and how to balance the network’s energy are crucial to prolonging network lifetime. Power control technology 21 for WSNs is a good solution. With power control, sensor nodes in a distributed network optimize the relevant application performance by selecting an appropriate level of radio transmitting power; adjustable transmission power is the prerequisite for power control. For a node, changing the wireless transmission power changes the energy efficiency of compression algorithms. Therefore, a power-aware mechanism for selecting algorithms is proposed in this article.
If
The relationship between

Diagram of the relationship between
In the past, most work on the design and evaluation of compression algorithms either did not consider possible adjustment of the node’s wireless communication power or used a fixed value. But when the single-hop communication distance of WSNs changes with the network topology or other factors, the transmission power of the node also tends to be adjusted. In that case, selecting the compression algorithm with better performance maintains high energy efficiency and saves more energy.
The necessity of selecting data compression algorithms
According to equation (21) and Figure 3, it can be seen that
Although the increase in transmitting power can improve energy efficiency, the improvement is limited. If
It is obvious that when the transmitting power is different, the energy efficiency of algorithms is also different. So when the transmit power of a node is changed, a compression algorithm with better energy efficiency should be selected to ensure high energy efficiency or more energy saving.
The possibility of selecting data compression algorithms
For most compression algorithms, the compression ratio is proportional to the complexity of algorithms. The relationship between the compression ratio and the complexity of algorithms provides a basis to decide when the algorithm is changed. Suppose the compression ratios are

Figure 4. The schematic diagram of possibility to switch algorithms.
As shown in Figure 4,
When
In equation (25), [
In conclusion, if data compression algorithms can be selected, the algorithms must meet the conditions in equation (24) and
The program to select algorithms is shown as follows.
Through the adjustment of algorithms, the curve of energy efficiency can be expressed as a piecewise function shown in equation (26).
The curve is the outer envelope of the two curves shown in Figure 4, regardless of what the value of
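A power-aware selection step of the kind described in this section might look like the sketch below. It is not the authors' program. It assumes that each candidate algorithm has been characterized offline by a compression ratio and a software coefficient s, that the hardware coefficient k is known for the current transmit power level, and that efficiency is estimated as the compression ratio minus s/k. All names and numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple

# name -> (compression ratio Rc, software coefficient s); made-up values
Candidates = Dict[str, Tuple[float, float]]


def select_algorithm(candidates: Candidates, k: float) -> Optional[str]:
    """Return the candidate with the highest estimated efficiency at k.

    Returns None when no candidate yields a positive net energy income,
    in which case sending raw data is the better choice.
    """
    best_name, best_eta = None, 0.0
    for name, (rc, s) in candidates.items():
        eta = rc - s / k  # assumed efficiency estimate
        if eta > best_eta:
            best_name, best_eta = name, eta
    return best_name


algos = {"light": (0.30, 10.0), "heavy": (0.50, 60.0)}
for k in (20.0, 150.0, 1000.0):
    print(k, select_algorithm(algos, k))
```

Evaluating this choice at every available power level traces the outer envelope of the individual efficiency curves: a cheap algorithm with a modest compression ratio wins at low k, and a costlier algorithm with a higher ratio takes over once k is large enough.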
Example of algorithms selecting
The power-aware scheme of data compression selects the algorithm with better performance when the transmitting power is changed. Its purpose is to make
Node type and hardware parameters: the processor is MSP430F2618 MCU,
Data sets: five groups of slow-changing signals are generated by MATLAB and their variance is 1.
Data compression algorithms: S-LZW, 20 LEC, 14 b-RLE. 22 They are suitable to compress slow-changing signals.
The results tested using the algorithms above to compress the five data sets of slow-changing signals are shown in Table 5.
Table 5. The test results of different algorithms on five data sets.
As shown in Table 5, the
The combinations of algorithms and their adjustability.
According to equation (24), algorithm combinations [S-LZW, LEC] and [S-LZW, b-RLE] meet the conditions.
Since CC2420 has eight levels of transmitting power, combined with the information of microprocessor and according to equation (29),
The values of
It can be seen from Table 7 that the range of
The energy efficiency curves of the three algorithms are shown in Figure 5. The operations of the scheme are as follows: (1) when the wireless transmitting power is −7 dBm or less,

Figure 5. Energy efficiency curves of different algorithms under different wireless transmitting power.
Figure 5 shows that a compression algorithm that is highly efficient at one transmit power level may be less efficient at another. That is, being the most efficient algorithm at a certain transmit power does not guarantee being the most efficient at a different power level. Figure 5 also clearly shows that when the transmit power of the node is changed, the energy efficiency obtained after switching the compression algorithm according to the proposed scheme is greater than the energy efficiency obtained using only a single algorithm.
Conclusion
For resource-constrained WSN nodes, a new evaluation scheme for energy efficiency of data compression according to hardware and software implementations is proposed in this article. Based on the scheme, it is known that energy efficiency
Because the different performance requirements of WSNs conflict with one another, no compression algorithm meets all of them. The evaluation scheme proposed in this article is designed to maximize the energy efficiency of a node. It does not consider the real-time requirements for data compression, nor does it guarantee that the communication load generated by the selected algorithm is minimal. Because the compression efficiency evaluation of lossy compression algorithms is affected by the allowable limit of compression error, and we have not yet obtained appropriate test results for lossy compression, the effectiveness of the proposed evaluation scheme is limited to lossless compression. The applicability of this scheme to lossy compression algorithms will be the next research task. In the future, by combining learning algorithms23,24 and prediction and optimization algorithms 25 that are suitable for WSNs, we will further improve the efficiency of the proposed evaluation scheme.
Footnotes
Handling Editor: Wenbing Zhao
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
