Abstract
Wireless sensor networks are widely used in many mission critical applications such as environment monitoring, military monitoring, and healthcare. The highly sensitive nature of collected information and various potential security threats make security in wireless sensor networks a crucial concern. For resource-constrained nature of these special networks, the design of lightweight cryptography has become active research topic over the last several years. In this article, we provide experimental performance analysis of several lightweight block ciphers and message authentication codes to help for choosing better algorithms for wireless sensor networks in terms of efficiency. Unlike the previous simulation-based benchmarking projects, we employ representative three real-world devices of wireless sensor networks: 8-bit AVR (Arduino Uno), 16-bit MSP (Tmote), and 32-bit ARM (Raspberry Pi 2) microcontrollers. We also introduce an efficient implementation of rotations that can improve the performance of lightweight block ciphers with 32-bit rotation operations on both 8-bit AVR and 16-bit MSP microcontrollers and verify this methodology. We evaluate and compare performance of selected block ciphers and message authentication codes to recommend better solutions for wireless sensor networks. Based on our results, we show that the simulation-based results could be inaccurate and different from results of real implementations. As a result of our experimental performance analysis, we identify lightweight block ciphers and message authentication codes suitable for wireless sensor networks and provide their feasibilities.
Keywords
Introduction
Wireless sensor networks (WSNs) are emerging technologies composed by a large number of low-cost, low-power, less-memory, and limited computational power sensor nodes communicating at short distance through wireless links. WSNs have a great potential to be deployed in wide mission critical applications, such as civilian, military monitoring, and healthcare applications. However, due to the distributed nature, deployment in remote areas, and wireless communication, WSNs are vulnerable to numerous security threats such as node capture, physical tampering, eavesdropping, packet injection, and denial of service (DoS). Such security threats cause privacy breaches of the people, malfunction of the networks, and serious problems to the networks. In particular, WSNs for healthcare monitor and collect the health data of patients using wearable or implantable sensor nodes. Since patient data are generally categorized as high sensitive information, unauthorized collection and alteration of data and unwanted information leakage result in an invasion of privacy of patient and further life-threatening risks to the patient. Security thus needs to be taken into account at the start step in the design of WSNs.
It is essential to ensure privacy protection, data confidentiality, data integrity, and authenticity typically obtained by means of encryption algorithms and message authentication mechanisms. Those cryptographic mechanisms can prevent and thwart various security threats. However, due to the highly resource-constrained sensor nodes, security in WSNs poses more severe challenges compared to the traditional networks. Thus, many works on lightweight cryptography including lightweight block ciphers and message authentication mechanisms have been proposed in order to consume fewer resources while providing the claimed level of security. Several benchmarking projects have also been promoted on different hardware and software platforms to evaluate the performance of those cryptographic primitives, but few articles present results of real software implementations on the real-world devices in WSNs. Additionally, the benchmark results could be different from the results based on real software implementations.
In this article, we evaluate performance of several representative lightweight block ciphers and message authentication codes (MACs) suitable for WSNs by conducting experiments on three real-world devices used as typical sensor motes: 8-bit AVR (Arduino Uno), 16-bit MSP (Tmote), and 32-bit ARM (Raspberry Pi 2) microcontrollers. Since most lightweight block ciphers are optimized for high-performance devices (i.e. 32-bit and 64-bit platforms), their performance tends to be slow in the low-performance devices (i.e. 8-bit AVR and 16-bit MSP microcontrollers). Fair Evaluation of Lightweight Cryptographic Systems (FELICS), a most recent and widely used simulation-based benchmarking framework,1,2 pointed out this issue and provided the optimal assembly code for rotations of 32-bit values. 3 This efficient implementation of rotations can improve the whole performance of lightweight block ciphers with rotate operations. Seo et al. 4 also presented compact implementations of LEA block cipher on AVR 8-bit microcontrollers based on the similar methodology. However, it is impossible to ascertain how much the optimal rotations enhance the performance of block ciphers. We thus introduce the efficient implementations of rotations in detail and verify the performance of efficient implementation of rotations by applying it to three lightweight block ciphers (SIMON, SPECK, and LEA) and by means of three target devices. We measure the execution time of three lightweight block ciphers based on experimental analysis and compare their performance. We then compare our results to FELICS and perform a simple statistical test to confirm the difference between real experimental results and simulation-based results. By conducting experiments on the target devices, we also measure and compare performance of MAC algorithms: cipher-based MAC (CMAC) and keyed-hash MAC (HMAC). For CMAC, we consider the above three lightweight block ciphers with the efficient implementation as well as one typical block cipher standard, Advanced Encryption Standard (AES).
This article is organized as follows: section “Efficient implementation of block ciphers for WSNs” presents lightweight block ciphers and introduces benchmarking projects and an efficient implementation of lightweight block ciphers for WSNs. Section “Efficient MACs for WSNs” presents classification of MACs and describes several MAC algorithms. Section “Experimental performance analysis” presents the methodology used to conducting our experiments for performance analysis of block ciphers and MACs for WSNs and provides our results and analysis. Section “Conclusion” concludes this article.
Efficient implementation of block ciphers for WSNs
In this section, we introduce lightweight block ciphers and benchmarking projects for them and present a generalized efficient implementation of rotations.
Lightweight block ciphers
There are two design strategies for lightweight block ciphers: 5 Feistel and substitution–permutation network (SPN). Both strategies have their respective advantages and disadvantages. Feistel networks usually apply a round function to only one half of the input block but require more executions of the round functions to maintain security margin. 6 Implementing the inverse of Feistel construction is almost free, so it is possible to design a circuit that provides functionalities for the both encryption and decryption with minimal overhead. However, most SPNs usually apply a transformation function, slightly complex round function, to the entire block and hence can be implemented using fewer rounds. However, decryption is not free, so this strategy requires more area. Table 1 illustrates properties of lightweight block ciphers in terms of structure, block size, key size, number of rounds, and so on.
Lightweight block ciphers (block and key sizes are expressed in bits).
SPN: substitution–permutation network; AES: Advanced Encryption Standard; GFN: generalized Feistel network.
SPN-based lightweight block ciphers can be divided into two categories: 7 AES-like and other SPN. Lightweight block ciphers such as LED 8 and KLEIN 9 in the AES-like category have a structure derived from that of the AES.10,11 Lightweight block ciphers in the other SPN category have a SPN structure which is not as close to the AES as the others. Noekeon, 12 mCrypton, 13 and PRESENT 14 belong to this category.
Feistel-based lightweight block ciphers can be divided into two categories:
7
two branched and generalized Feistel network (GFN). Two branched category includes all the Feistel networks operating on blocks of size
Many block ciphers employ ARX (modular addition, rotation, XOR) operations because they are relatively fast in hardware and software. Also, ARX-based block ciphers do not require the space that the substitution tables (S-Boxes) and so their implementation is compact and simple. In addition, ARX operations are inherently less conductive to timing attacks than S-Box-based constructions because they run in constant time. Examples include SPECK, HIGHT, and LEA.
Benchmarking projects for lightweight block ciphers
As various lightweight block ciphers have been presented, several benchmarking projects1,21–24 have been promoted to evaluate the performance of cryptographic primitives including lightweight block ciphers on different hardware or software platforms.
European Network of Excellence for Cryptology—Phase II (ECRYPT II) project is a Network of Excellence in the area of cryptology with a duration of 4 years. 21 During ECRYPT II project, Eisenbarth et al. 25 implemented 12 low-cost block ciphers and evaluated their performance on Atmel AVR ATtiny45 8-bit microcontroller in terms of code size, RAM use, cycle count in encryption and decryption, and energy consumption.
The ECRYPT Benchmarking of Cryptographic Systems (eBACS) designed during the ECRYPT II project is a well-known public benchmarking for cryptography. 22 eBACS measured the speed of various cryptographic primitives on personal computers and servers using a benchmarking framework, system for unified performance evaluation related to cryptographic operations and primitives (SUPERCOP). eBACS unifies and integrates ECRYPT Benchmarking of Asymmetric Systems (eBATS), ECRYPT Benchmarking of Stream Ciphers (eBASC), ECRYPT Benchmarking of All Submitted Hashes (eBASH), and ECRYPT Benchmarking of Authenticated Ciphers (eBAEAD).
The BLOC project is a research project aimed at studying the design and analysis of block ciphers dedicated constrained environments. 23 During this project, Cazorla et al.26,27 evaluated 12 lightweight and 5 conventional block ciphers for wireless sensor nodes using a MSP430, a Texas Instrument 16-bit microcontroller running with an external 8 MHz clock.
According to the results of benchmarking frameworks, there is a problem that the performance of all algorithms are slow on both 8-bit and 16-bit microcontrollers. To solve this problem, two research groups presented implementations of SIMON, SPECK, and LEA optimized to 8-bit microcontrollers.4,28
Beaulieu et al. 28 showed high-performance implementations of SIMON and SPECK on AVR 8-bit microcontrollers with regard to RAM-minimizing, high-throughput/low-energy, and flash-minimizing implementations.
FELICS, a free and open-source benchmarking framework, evaluated the performance of lightweight block and stream ciphers with regard to code size, RAM consumption, and execution time on 8-bit AVR, 16-bit MSP, and 32-bit ARM microcontrollers. 1 FELICS provided the optimal assembly code for rotations that can improve the whole performance of lightweight block ciphers on 8-bit and 16-bit microcontrollers. 3
In a similar way, Seo et al. 4 provided compact implementations of LEA block cipher on AVR 8-bit microcontrollers. They took advantages of byte-wise rotations by splitting a 32-bit word operation into four byte-wise operations and by avoiding several rotation operations and reduced the code size.
In this article, we introduce the optimized implementations for rotations.3,4 This methodology shows an efficient implementation of rotate operations considering registers, and it can be applied to all algorithms with rotate operations. We select three lightweight block ciphers, SIMON, SPECK, and LEA, that ranked high in FELICS. We then apply the efficient implementation of rotate operations to LEA, SIMON, and SPECK and evaluate their performance using real-world devices, Arduino Uno as a 8-bit AVR microcontroller, Tmote as a 16-bit MSP microcontroller, and Raspberry Pi 2 as a 32-bit ARM microcontroller. FELICS provides most recent results of the performance of lightweight block ciphers on three widely used microcontrollers. Since FELICS is a simulation-based benchmarking framework, there can be difference between the evaluation results of FELICS and the evaluation results using real-world devices. We compare the results of evaluation on real-world devices to those of FELICS in terms of the execution time.
Efficient implementation of rotation
SIMON, SPECK, and LEA require multiple rounds and perform several rotations for each round to increase security. Table 2 shows the number of rotations for every round required in SIMON, SPECK, and LEA.
Required rotations of SIMON, SPECK, and LEA.
ROL #: rotate left through carry by # bits; ROR #: rotate right through carry by # bits.
The performance of algorithms depends on the speed of rotations because multiple rotations are performed for each round. According to the block size, 64-bit, 96-bit, and 128-bit blocks, SIMON and SPECK perform rotations of 32-bit, 48-bit, and 64-bit words by # bits, while LEA performs rotations of 32-bit words by # bits mentioned in Table 2. Rotations of three algorithms are well-supported and fast in 32-bit and 64-bit platform, but their speed will be degraded in 8-bit and 16-bit platforms. Therefore, an optimized implementation of rotations on 8-bit AVR and 16-bit MSP microcontrollers is needed.
A rotate operation is a circular shift (>>, <<) in which no bits are discarded. The following code (1) is an example of rotations of 32-bit words x by i bits
In a microcontroller with 32-bit registers, the above rotations are efficiently performed using one or two registers, while multiple registers are required to perform them in microcontrollers with 8-bit and 16-bit registers. For example, for a microcontroller with 8-bit registers, four registers and additional registers depending on the situations are required to rotate bits on a 32-bit word.
The following assembly code (2) describes the process of executing left rotations of a 32-bit word.
For the lowest register
The performance of operations is reduced because multiple registers are required for 32bit-wide operations in the 8-bit AVR microcontrollers. However, the 8-bit AVR microcontroller performs a 1-bit left rotation and 8-bit wise (byte-wise) rotation in a special way. A 1-bit rotation can be achieved by only adding the bit with itself. The following assembly code (3) describes a 1-bit rotation implemented by the AVR-GCC compiler.
By the
A byte-wise rotation can be achieved by swapping between registers regardless of the direction. The following assembly code (4) describes a byte-wise left rotation implemented by the AVR-GCC compiler.
After storing the value of the lowest register
Under the special way mentioned above, a 1-bit rotation and byte-wise rotation are more efficient than other rotations. In this regard, if all rotations are replaced with the 1-bit and byte-wise rotation, the performance will be improved. Left rotations by more than 1 bit can be replaced by repeated 1-bit left rotations, left rotations by more than 5 bits can be replaced by one byte-wise left rotation and repeated 1-bit right rotations, and left rotations by more than 8 bits can be replaced by one byte-wise left rotation and repeated 1-bit left rotations. Table 3 illustrates the modified left rotations using only 1-bit and 8-bit rotations. This approach can be easily applied on not only 8-bit AVR microcontrollers but also 16-bit MSP microcontrollers. However, the original approach is more efficient in the 32-bit ARM microcontrollers.
Modified rotation using 1- and 8-bit rotations only.
ROL #: rotate left through carry by # bits; ROR #: rotate right through carry by # bits.
Efficient MACs for WSNs
Classification
A MAC is a small fixed-sized block of data used to ensure data integrity and authenticity by means of a key-dependent authentication tag, also referred to as a MAC. The MAC is appended to a message and provides assurance that the message is unaltered (integrity) and comes from the intended sender (authenticity).
Based on the security, MACs may be both unconditionally secure and computationally secure. Unconditionally secure MACs are based on universal hash function families, pioneered by Carter and Wegman. 29 Unconditionally secure MACs allow for message integrity opposed to forgers with unlimited computational power while computationally secure MACs are only secure when forgers have limited computational power. In computationally secure MACs, keys can be used to authenticate an arbitrary number of messages. However, in unconditionally secure MACs, the authentication key can only be used to authenticate a limited number of exchanged messages. Because of the difficulty of management of one-time keys, unconditionally secure MACs are considered impractical in many applications. Thus, computationally secure MACs have become the main method for real-life applications. Computationally secure MACs can be classified into three categories depending on the main building block used to construct them: block cipher–based, cryptographic hash function–based, and universal hash function family–based. Figure 1 shows the development and variation of computationally secure MACs.

Variation in MAC algorithms.
Block cipher–based MACs use block ciphers for creating authentication tags and thus their security depends on the security block ciphers. Cryptographic hash function–based MACs, first introduced by Tsudik, 30 create authentication tags using cryptographic hash functions and thus their security depends on the security of cryptographic hash functions. Both categories of MACs allow to easily change underlying block ciphers and cryptographic hash functions, respectively. Since the length of authentication tags in cryptographic hash function-based MACs is longer than that of block cipher–based MACs, the cryptographic hash function-based MACs are usually more secure. Universal hash function family–based MACs use universal hash functions to create authentication tags and have the merit of low collisions. However, there is a limit to employ universal hash function family–based MACs to WSNs because they have high overhead due to the use of distinct random numbers for every message and the integer multiplications.
In this article, we focus on the block cipher–based and cryptographic hash function–based MACs appropriate to WSNs where the lightweight MACs are required.
Block cipher–based MAC
MAC algorithms based on block ciphers have designed with a focus on the improvement of security and parallelism.
MAC algorithms with the improvement of security and parallelism have been developed based on cipher block chaining-MAC (CBC-MAC)31,32 and parallelizable MAC (PMAC), 33 respectively. The MAC algorithm specialized in 32-bit microcontrollers, Chaskey, 34 was recently developed.
CBC-MAC, the most well-known MAC algorithm, uses a block cipher for constructing an authentication tag. The message is encrypted with some block cipher algorithm in CBC mode to create a chain of blocks and a part of the last encrypted block will be used for the authentication tag. The CBC-MAC algorithm was adopted as a standard by NIST 31 and ISO/IEC. 32 XOR MACs 35 are MAC algorithms based on the finite pseudorandom function (PRF), and their security depends on the security of the underlying PRF without regard to the length of messages.
CMAC 36 fixes security deficiencies of CBC-MAC, that is, secure only for fixed-length messages. Black and Rogaway 37 first addressed the security deficiencies of CBC-MAC and proposed XCBC, but the XCBC algorithm requires three keys. After then, Iwata and Kurosawa 38 proposed an improvement of XCBC, one-key CBC-MAC (OMAC), reduces the amount of key material required for XCBC. They refined the OMAC algorithm and analyzed its additional security. This refinement of OMAC is OMAC1 (OMAC version one) and it is equivalent to CMAC. CMAC became an NIST recommendation in May 2005. 36
Galois MAC (GMAC) 39 is an authentication-only variant of the Galois/Counter Mode (GCM) 39 uses GHASH, 40 a universal hash function based on the Carter–Wegman design, 29 to create authentication tags. Chaskey is a permutation-based MAC algorithm designed for 32-bit microcontroller architectures 34 and improves the problem of key scheduling in the previous block cipher–based MACs. Although Chaskey is designed to perform well on a wide range of 32-bit microcontrollers, it is compatible with 8-bit and 16-bit microcontroller architectures.
Black and Rogaway 33 proposed PMAC based on Gray codes, but Rogaway 41 reproposed PMAC1, a refinement of PMAC due to the difficulty of implementation of the PMAC algorithm. PMAC1 uses a tweakable block cipher to improve the efficiency of implementation. The security of the PMAC1 algorithm is similar to that of the CMAC algorithm. Sarkar 42 proposed iPMAC to solve the problem of PMAC1 that require the computation of certain discrete logarithms in the design stage. The iPMAC algorithm avoids the tweakable block cipher approach and uses a parallelizable PRF instead. As a result, iPMAC was improved in terms of the performance and flexibility compared with the PMAC1 algorithm. PMAC_Plus 43 is an extensively modified version of PMAC and improved the security of PMAC at a low additional cost. MARVIN was designed for constrained platforms. 44 The MARVIN algorithm is fully parallelizable and creates an authentication tag using a square-complete transform (SCT).
Cryptographic hash function-based MAC
Cryptographic hash function-based MAC is a mechanism for calculating a MAC using cryptographic hash functions. MAC algorithms based on cryptographic hash functions have designed with a focus on the improvement of security and performance.
Preneel and van Oorschot 45 proposed a new forgery attack on all iterated MACs including message authenticator algorithm (MAA) and Data Encryption Standard (DES) CBC-MAC. To solve this problem, they proposed MDx-MAC, 46 a new generic construction which transforms any MD4-family hash functions into a secure MAC algorithm. The underlying hash function can be any of MD5, RIPEMD, SHA, or other similar hash functions except MD4 because of the vulnerability of MD4. Since MDx-MAC mainly uses MD5, it is also known as MD5-MAC.
HMAC is the typical MAC algorithm based on cryptographic hash functions. 47 Any cryptographic hash function such as MD5 or SHA-1 can be used for calculating an HMAC in combination with a secret shared key. HMAC depends on the properties of the underlying hash function but substantially less affected by collisions than their underlying hashing algorithms alone. FIPS PUB 198 generalizes and standardizes the use of HMACs. 47 In 1996, Bellare and colleagues48,49 published the definition and analysis of the HMAC construction and also wrote RFC 2104. They also defined nested MAC (NMAC), the theoretical foundation of HMAC, in the article. 48
Patel 50 proposed enhanced NMAC/enhanced HMAC (ENMAC/EHMAC), enhancements that allow both short and long messages to be message authenticated more efficiently than NMAC/HMAC. Najjar 51 proposed dynamic HMAC (d-HMAC) to improve and enhance the cryptographic characteristics of HMAC. d-HMAC provides stronger resistance against birthday attacks and brute force attacks than HMAC. Yasuda 52 proposed L-Lane HMAC to resolve the problem that the security of HMAC is limited by a birthday attack. Yasuda 53 also proposed H2-MAC to increase efficiency over HMAC by remedying the drawback of managing two secret keys of HMAC. H2-MAC is defined by omitting the outer key of HMAC and proven to be a secure PRF under the assumption that the underlying compression function is a PRF with an AffiX (PRF-AX). 53 H2-MAC keeps the security of HMAC.
Experimental performance analysis
Target devices and experiment methods
To analyze the performance of lightweight block ciphers and MAC algorithms in terms of execution time, we selected three widely used target devices that are representatives of most used 8-bit, 16-bit, and 32-bit microcontrollers in WSNs: Arduino Uno, Tmote, and Raspberry Pi 2. Table 4 describes the main features of the target devices.
Features of target devices.
The execution time measures the number of clock cycles spent on executing a given operation, such as encryption, decryption, or generation of a MAC. The execution time is calculated as the absolute difference between the number of cycles at the beginning of the given operation and at the end of the given operation. Unlike the FELICS simulation, we utilized real-world devices. To extract the number of cycles, we thus used the
Experiment of lightweight block cipher
We selected three lightweight block ciphers, SIMON, SPECK, and LEA, highly ranked in the FELICS block ciphers brief results 2 and presumed to be suitable for WSNs because our aim is to confirm the performance of lightweight block ciphers by conducting experiments on the real-world devices and to make certain that those ciphers are acceptable for WSNs. In addition, since rotate operations have effect on the performance of three ciphers, they can be used to verify the performance of the generalized optimal implementation in section “Efficient implementation of rotation.”
For SIMON and SPECK, we used the block size of 64 bits and the key size of 128 bits because their C codes use only 32-bit operations and the modified implementation is based on rotations of a 32-bit word. For LEA, we used the block size of 128 bits and the key size of 128 bits; LEA basically has 32-bit operations. We employed two kinds of modes of operations widely used for the standard protocols such as IPSec and TLS: CBC and CTR (Counter) modes. We measured the average number of cycles spent on encrypting and decrypting (without key schedule) of 160 bytes of data using CBC mode and only encrypting (without key schedule) of 160 bytes of data using CTR mode because in the CTR mode, encryption and decryption is same.
We first analyzed the performance of efficient implementation of rotations (the modified implementation using a 1-bit rotation and 8-bit rotation only) and compared it with the original implementation of rotations. We then measured the difference between the performance of three block ciphers with regard to two implementations on the three target devices. We finally compared our experimental results using target devices with the results of FELICS simulation in terms of the execution time. FELICS measured the execution time by computing the average number of cycles spent on encrypting and decrypting 128 bytes of data. To ensure a fair evaluation, we computed the execution time of FELICS by converting cycles for 128 bytes of data into cycles for 160 bytes of data. In addition, we used the codes of FELICS and slightly modified those codes to fit in the target devices.
Experiment of MAC
MACs based on universal hash functions have flexibility and low collision probabilities, but there can be overheads due to the large size of blocks and integer multiplication. We considered thus two types of MAC algorithms based on the classical primitives, block ciphers and cryptographic hash functions. Among those types, we selected CMAC and HMAC that are NIST standards and commonly used in building MACs. We employed SIMON, SPECK, and LEA including AES for CMAC and SHA-256 for HMAC as the underlying building blocks. We implemented AES-CMAC and HMAC-SHA-256 using open-source codes. We used the block size of 64 bits and the key size of 128 bits for SIMON and SPECK and the block size of 128 bits and the key size of 128 bits for LEA and AES. We used 23 bytes of data to consider padding in CMAC. We measured the average number of cycles spent on generating MACs of 23 bytes of data using CMAC and HMAC algorithms.
Performance of lightweight block ciphers
Comparison of the modified implementation and the original implementation
We evaluated the number of CPU clock cycles spent on executing rotations varying the number of rotations from 1 to 10 for each microcontroller. Figure 2 shows the performance comparison of original implementation (Figure 2(a), (c), and (e)) and modified implementation (Figure 2(b), (d), and (f)).

Performance comparison of bitwise rotations for 32-bit word in Arduino Uno (a, b), Tmote (c, d), and Raspberry Pi 2 (e, f).
For the 8-bit AVR microcontroller (Arduino Uno), the original implementation deteriorates performance of all rotations except a 1-bit rotation and byte-wise rotation as shown in Figure 2(a), while the modified implementation improves the performance of rotations 10 times more than the original one as shown in Figure 2(b). As shown in Figure 2(c) and (d), the results in the 16-bit MSP microcontroller (Tmote) are similar to those of Arduino Uno; a 1-bit and byte-wise rotations are more efficient than the other rotations. In addition, the performance of modified implementation is three times more than the original implementation.
Figure 2(e) and (f) shows the performance of the original and modified implementations in the 32-bit ARM microcontroller (Raspberry Pi 2), respectively. Unlike the other microcontrollers, the original implementation in the 32-bit ARM microcontroller has uniform and similar performance for all rotations to that of the modified implementation. From the results, we can infer that the original implementation does not deteriorates the performance of all rotations for only 32-bit ARM microcontroller and the modified implementation greatly improves the performance of rotations for 8-bit AVR and 16-bit MSP microcontrollers.
Comparison of lightweight block ciphers
We analyzed the performance of lightweight block ciphers on a Arduino Uno and Tmote in which the modified implementation can improve the performance of algorithms. Figure 3 gives the performance of SIMON, SPECK, and LEA on 8-bit AVR and 16-bit MSP microcontrollers with regard to both implementations. In this figure, “_Rot” means the modified implementation of rotations. According to Figure 3(a), in Arduino Uno, the modified implementation improves performance of all algorithms by almost 78%. All algorithms based on the modified implementation require less than 6100 cycles and SPECK requires less than 2000 cycles. As shown in Figure 3(b), in Tmote, the modified implementation improves performance of all algorithms by almost 48%. Comparing performance in the 8-bit AVR Arduino Uno and 16-bit MSP Tmote, SPECK is the fastest one for both original and modified implementations. SIMON_Rot has lower cycles than LEA_Rot in the Arduino Uno, while LEA_Rot has higher performance than SIMON_Rot in the Tmote.

Performance comparison of lightweight block ciphers on 8-bit AVR ((a) Arduino Uno) and 16-bit MSP ((b) Tmote) microcontrollers: SIMON, SPECK, and LEA (*_Rot: the modified implementation of rotations).
For three microcontrollers, we also compared our experiment results with the results of FELICS simulation. Figure 4 gives the results of comparison. In this figure, bars show our experiment results conducted on the real-world devices and lines show the results of FELICS simulation by computing cycles spent on encrypting and decrypting 160 bytes of data. As shown in Figure 4, in the 32-bit ARM microcontroller, there are differences between our results and FELICS results because FELICS considers the Arduino Due with an 84-MHz ARM core processor as a 32-bit ARM microcontroller. Raspberry Pi 2 used in our experiments has higher performance 10 times than Arduino Due. To find out whether the differences between two types results in both 8-bit and 16-bit microcontrollers are statistically significant, we performed paired t-tests. There were significant differences in performance between our results and FELICS results for both the 8-bit AVR microcontroller (two-tailed paired t-test, p < 0.01) and 16-bit MSP microcontroller (two-tailed paired t-test, p < 0.01). From these results, we can infer that simulation-based results are not always accurate and reliability of them may not be guaranteed in some cases. Therefore, the real-world devices should be considered in order to obtain reliable results of performance evaluation of cryptographic algorithms including block ciphers, especially for a specific environments such as WSN.

Performance comparison of our experiments and FELICS simulation (log scale, *_v#: algorithm version in FELICS).
Performance of MAC algorithms
Since WSNs mainly use resource-constrained devices, lightweight MAC algorithms are required. In this point of view, MACs based on block ciphers have advantages: low requirements related to performance and a variety of choices in the underlying block ciphers. In particular, if lightweight block ciphers are used to MAC algorithms, those algorithms will also become lightweight. Thus, as we mentioned in section “Experiment of MAC,” we used three lightweight block ciphers, SIMON, SPECK, and LEA, as the underlying primitives for CMAC. We additionally considered AES that is the most widely used encryption standard but it is not a lightweight block cipher. In general, hash functions are much faster than the canonical block ciphers such as DES and AES in software implementation. Like CMAC, HMAC also uses the underlying hash functions as a black-box and so it can be implemented by simply calling the existing function. Any block-based hash function can be used with HMAC, for example, MD5 and all SHA algorithms. Among secure hash standards, we selected SHA-256 for HMAC. We also applied two kinds of implementations to experiments of MAC algorithms.
Figure 5 gives the experimental results of performance of MAC algorithms by measuring the number of cycles spent on generating MACs for 23 bytes of data. From this figure, for the most cases, CMAC has higher performance than HMAC and lightweight block ciphers have higher performance than AES in CMAC. When having a closer look at the results on all kinds of microcontrollers, LEA-CMAC based on either the original implementation or the modified implementation is the fastest on all devices. We can make sure that lightweight block ciphers improve the performance of MACs based on block ciphers as expected. In WSNs, MACs are necessary cryptographic algorithms provide message integrity and authentication, but WSNs consist of sensor nodes with limited processing capability and memory and battery power. Thus, lightweight MAC algorithms are required, but research on lightweight MAC algorithms have not been done as much as lightweight block ciphers. We also utilized the modified implementation so as to solve the problem that performance of lightweight block ciphers is degraded in low-performance devices (i.e. 8-bit AVR and 16-bit MSP microcontrollers) because most lightweight block ciphers are optimized for high-performance devices (i.e. 32-bit and 64-bit platforms). From our experiments on the real-world devices, we showed that MAC algorithms based on lightweight block ciphers can be a candidate for lightweight MAC algorithms.

Performance comparison of MAC algorithms (log scale).
Conclusion
In this article, we have presented some experimental evaluation of lightweight block ciphers and MAC algorithms on three types of target devices that can be used as sensor motes in WSNs: Arduino Uno (8-bit AVR microcontroller), Tmote (16-bit MSP microcontroller), and Raspberry Pi 2 (32-bit ARM microcontroller). We have introduced the efficient implementation of rotations and applied it to three block ciphers, SIMON, SPECK, and LEA highly ranked in simulation-based results of FELICS to improve their performance on 8-bit AVR and 16-bit MSP microcontrollers. We have evaluated three lightweight block ciphers and two kinds of MAC algorithms, CMAC and HMAC. The results show that they have different performance on the distinct devices and provide more efficient algorithm according to devices. Further research may include the addition of more algorithms and other microcontrollers.
Footnotes
Handling Editor: Chunming Qiao
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partly supported by the National Research Foundation of Korea (NRF-2016-R1C1B2011095 and NRF-2015-R1A2A2A01004792) and also partly supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-2016-0-00304) supervised by the IITP (Institute for Information & Communications Technology Promotion).
