Abstract
Wireless sensor networks are susceptible to node replication attacks due to their unattended nature. Existing replicas detection schemes can be further improved in regard of detection probabilities, detection overheads, and the balance of detection overheads among sensor nodes. In this paper, we make the following contributions: first, we point out the unrealistic assumption that the replica node would behave honestly as the benign sensor nodes; thus the existing detection schemes would fail if the replica nodes cheat or collude with the compromised node. Then, we propose a location-binding symmetric key scheme forcing the replica nodes to be inserted only in the vicinity of the compromised node. Later, a detecting scheme is presented to inspect the location claims within the neighborhood. Finally, analysis shows that our scheme helps to detect and defend against replication attacks effectively and efficiently. Extensive simulations are conducted and the results show that the detection overheads are low and evenly distributed among all the sensor nodes.
1. Introduction
Wireless sensor networks (WSNs) are generally deployed in the unattended environments for some missions, such as environment monitoring and enemy surveillance. The unattended nature and the lack of tamper-resistant hardware cause wireless sensor networks to be vulnerable to various insider attacks, threatening the operation of WSNs.
Replication attack is one of the insider threats. The attacker captures one or more sensor nodes, tampers with them and obtains the credential materials, such as the identity and keys, then clones some nodes as replica nodes, and surreptitiously inserts these replicas in the network. Subsequently, the attacker may launch a variety of insidious attacks, such as data injection, selecting forwarding, routing loop, or even topology partition. Just as shown in Figure 1, a network was formed by the normal nodes (without frame). The captured and compromised nodes are represented in the solid frame, and replica nodes are represented in the dashed frame.

Replication attacks in wireless sensor network.
Thus, detection of replica nodes becomes one research hotspot in WSN [1]. The first distributed replication detection schemes RM and LSM were proposed by Parno et al. [2]. In RM scheme, nodes broadcast to neighboring nodes the location claim message signed by ID-based public key scheme. Then the neighbors forward such received claim message with a specified probability to randomly selected network nodes, which act as witness. According to the birthday paradox [1], the nodes owning the same ID would select same witness nodes with a big probability. These witness nodes eventually detect replicas successfully. To further increase the detection probability, LSM scheme is also proposed. In this scheme, the nodes in the forwarding path of the claim messages also store and compare the messages. Thus witness line segment is formed from the source to the destination; then the witness line of the same ID will cross at some node with a large probability and the node at the cross point acts as the witness node. Compared with RM scheme, the detection probability is increased at the cost of memory storage. However, LSM scheme has the crowded-center problem because the witness line is prone to cross at the center of network with a big probability.
2. Related Work
The existing detection schemes can be classified as centralized approaches and distributed approaches.
2.1. Centralized Detection Approaches
The schemes in [3–6] assume a central base station to conduct the detection. Choi et al. [3] proposed to detect the replica nodes by set. The network is divided into disjoint subregions. A header node is enumerated to report the member list to the base station in each subregion. The reports from all of the header nodes are computed by set. The intersection of two sets is checked; any nonempty intersection implies the existence of the replica sensor node. Brooks et al. [4] proposed a centralized scheme to detect replication attacks by using random key predistribution. Every sensor node should report the usage of its keys. If the usage of some key exceeded the threshold, then the sensor node was identified to be suspicious. Ho et al. [5] presented a SPRT method for replica detection in mobile sensor networks, in which the base station checks whether the speeds of the mobile sensor nodes exceed the threshold. Based on a state-of-the-art signal processing technique, compressed sensing, Yu et al. [6] proposed CSI to detect replication attacks.
2.2. Distributed Detection Approaches
In distributed approaches [2, 7–14], the replication attacks detection is conducted by reporting the location claim messages to randomly chosen witness nodes in the network. Paradoxes of the location claims indicate the detection of replication attacks. To further improve the detection probability, Conti et al. [7] proposed RED scheme, in which a random seed was shared and upgraded in the network. The same random seed and the same pseudorandom function result in the same witness node chosen by replica nodes and the compromised node. But it is difficult to share and upgrade such random seed across the whole network. Zhu et al. [8] proposed another detection scheme by using localized multicast. Ho et al. [9] proposed to take advantage of the group deployment knowledge to further raise detection probability and lower detection overheads. Zhang et al. [10] proposed four detection schemes, B-MEM, BC-MEM, C-MEM, and CC-MEM, to address the cross-over problem and the crowded-center problem in the detection. Li and Gong [11] proposed RDE scheme to utilize the local neighborhood geographic information for replication attacks detection. Zeng et al. [12] proposed two detection schemes, RAWL and TRAWL, to distribute the witness sensor node to the network. Wang and Shi [13] introduced mobile patrollers to detect replica nodes; the result shows this solution is effective and also energy efficient to prolong the lifetime of network. Xing and Cheng [14] proposed two replication detection schemes from both the time domain and the space domain in MANETs (mobile ad hoc networks). The basic idea is to utilize a cryptographic one-way hash function to force the replica sensor nodes to keep on generating paradoxes.
It is assumed by existing schemes that compromised sensor node and the replica node carry out the detection procedures honestly. However, it is not always true; it is more likely that the program code in the compromised sensor node and the replica sensor node has been modified by the attacker for the purpose of escaping from being detected. To make things worse, the replica node may collude with the compromised node, which will lead to the failure of existing detection schemes.
In this work, we seek to detect and defend against the replication attacks with fewer communication, computation, and memory overheads than previous works. We propose a location-binding pairwise key scheme, forcing the attacker to insert the replica nodes to the vicinity of the compromised node. Then, the neighbor sensor nodes around the replica sensor nodes are the first possible witnesses to detect the replication attacks.
The remainder of this paper is organized as follows: the next section illustrates the network model and assumptions; Section 4 proposes our location-binding pairwise key management scheme which is used in our scheme to defend against the replication attacks; Section 5 presents our replicas detection scheme; Section 6 analyzes the security and efficiency of our scheme, and extensive simulations supporting the analytical findings are also shown. Discussion follows in Section 7. Section 8 concludes the proposed scheme.
3. Network Model and Assumptions
In this paper, we assume that there are only stationary sensor nodes in the wireless sensor network. We also assume that the communications between the stationary sensor nodes are bi-directional, which is also an assumption of most of previous detection schemes.
Stationary nodes can get their geographic location by using positioning device (e.g., GPS device) or positioning algorithms [15–18]. Also, we assume that all the sensor nodes are loosely time synchronized using time synchronization techniques, such as [19, 20]. Consider
Prior to network deployment, we assume that a trusted authority (TA) chooses one t-degree bivariate symmetric polynomial in (1) with the coefficients
Notation summary.
4. Location-Binding Pair-Wise Key Scheme
Most of the existing detection schemes explore the public key algorithms for encryption and digital signature. Even if public key algorithms are feasible and available in wireless sensor networks, the large computation overheads will exhaust the limited battery power and pose great challenges to the resource-limited sensor nodes.
Due to the low cost, small factor, and limited resources in the sensor nodes, symmetric encryption algorithms are more suitable for wireless sensor networks. In [21], the key management in distributed sensor networks is studied. Fei et al. [22] proposed a time-space related symmetric key predistribution scheme. Wang et al. [23] proposed an updateable key management scheme with intrusion tolerance through symmetric polynomial and one-way hash chain.
We assume the sensor nodes adopt a location-binding pair-wise key scheme (LBK), which tightly binds the sensor node's identity with its geographical location to resist against node capture and replica nodes. In the following, we first propose our location-binding pair-wise key management scheme.
4.1. Sensor Deployment and Localization
After sensor nodes are deployed into the targeted region, each sensor node carries out the bootstrap procedure as described by Algorithm 1. The sensor node obtains its geographical location
(1) Start timeout timer (2) Get the location (3) Compute sensor identity (4) (5) Compute key ring polynomial (6) Erase the symmetric polynomial (7) (8) Erase the symmetric polynomial (9) Halt (10)
Once the identity is generated, a sensor node, for example, sensor node u, generates its key ring polynomial
All the sensor nodes are required to conduct the previous process within short time duration
4.2. Neighborhood Pairwise Key Establishment
Every sensor node broadcasts its identity within its radio range to establish its neighborhood. Then each sensor node uses its neighbor lists to calculate the relevant pair-wise key for encryption and decryption. For example, every sensor node establishes pair-wise keys with its neighboring sensor node according to Algorithm 2.
(1) Broadcast (2) Initialize and start timeout timer (3) (4) (5) Receive (6) If (7) Compute pair-wise key (8) Store (9) (10) (11) (12) Erase the key ring polynomial
To defend against node attacks from compromising t sensor nodes and deriving the symmetric polynomial
Also, every sensor node, for example, sensor node u, should establish neighborhood and compute the pair-wise keys with at most
For those sensor nodes that have more than node u sends a negotiation request message to node w, including the identity of u and w, the message type (Nego_Req), the neighbor list of node u, and a nonce. The nonce is used to defend against replay attack; node w receives node u's neighbor list, chooses one common node, for example, node v, generates the shared key node w sends a message to node v, which includes the identities of nodes w and v, the message type (Nego_Ask), and the secret part, containing the identity of nodes u and v, the nonce of w, and the shared key node v replies to node u with a message, including the identities of v and w, the message type (Nego_Rly), and the secret part, which contains the identities of w and u, the shared key

Pair-wise key negotiation.
Node u decrypts to get
5. Replicas Detection
Replica nodes detection can be conducted right after the sensor nodes are initialized and location-based pair-wise keys are generated. Node location inspection is carried out to detect the forged nodes who claim to have the benign nodes' location. If the node passes location inspection, then the encrypted messages are deciphered using the pair-wise key according to the node's identity. If this decryption fails, then the node is identified as a replica node.
Because of the above pair-wise key scheme and the location-binding sensor nodes' identity, the attacker is restricted in placing the replica nodes after compromising benign nodes and making replicas. The best strategy for the attacker is to put the replica nodes in the vicinity of the compromised node. This way, the replica nodes can utilize the stolen credentials to communicate with the neighbor sensor nodes of the compromised node and hide in the network without being detected.
Thus, the replica nodes are restricted to a region around the sensor node. Based on this idea, we propose a replication detection scheme within neighborhood.
Every sensor node broadcasts location claim message to its neighborhood periodically:
(1) (2) (3) return 2 (4) (5) (6) return 1 (7) (8) return 0 (9)
When the sensor node v receives a message from its neighbor node u, it computes the distance to check whether or not the claim is issued from its neighbor sensor nodes. If this check fails, the sensor node who issued this location claim is identified as a replica node.
6. Analysis and Simulations
To inspect our defending and detection scheme, both numerical and simulations results are shown in this section. First, we give the security analysis in Section 6.1, then the numerical results follow in Section 6.2, and simulation results are shown in Section 6.3.
6.1. Security Analysis
Definition 1.
The communication coverage of a sensor node is a circle with center at the sensor node and radius equal to the communication range R. The area is written as
The sensor nodes in the communication coverage of sensor node u are neighbor sensor nodes of u.
Definition 2.
Blind zone is defined as the common communication coverage of sensor node u and its replica sensor node
Definition 3.
Witness zone is defined as the region in the communication coverage of sensor node
We also have the following facts.
Fact 1. Replica node Fact 2. Sensor node u and its replica node
Scenario 1: The Distance between u
and

Replication node
In scenario 1, u and
Lemma 4.
Replica sensor node
Proof.
This lemma holds according to Fact 1 and Fact 2.
This lemma indicates that it's difficult for the attacker to insert replica sensor node
Lemma 5.
If replica sensor node
Proof.
This lemma holds trivially.
This lemma means that we need some new schemes to detect the replica sensor node
Scenario 2: The Distance between u
and
Lemma 6.
If replica sensor node
Proof.
This lemma holds trivially.
Scenario 3: The Distance between u
and
Lemma 7.
In scenario 3 with
Proof.
This lemma holds according to Fact 1.
6.2. Numerical Discussion
In this section, we execute the mathematical analysis to get the numerical results of our proposed defending and detection scheme.
Given that there are n sensor nodes evenly distributed in the network, the area of network deployment region is S, the communication radius of the radio is R, and the detection probability in three scenarios is calculated as follows.
The blind zone in scenario 1 and scenario 2 is the common communication area between the benign node u and the replication node

Blind zone
Sensor nodes in the blind zone cannot distinguish sensor node u from replica node
The number of sensor nodes depends on the distribution of the sensor nodes in the deployment area. Suppose that the sensor nodes are evenly distributed in the area. Then the density of sensor nodes is
To detect the replica node
To facilitate our analysis, we let γ denote the ratio between the distance of
Also, we let B denote the average number of sensor nodes in the communication range. B can be derived as in (8). The greater B is, the more neighbor sensor nodes exist:
Then (5) can be rewritten as function of γ and B, yielding
We can see from (9) that the number of witness sensor nodes on the replica node

The number of witness nodes.

Number of witness nodes.
If N is greater than or equal to 1, it means that the replica node
This means that we can get higher detection probability by increasing the density of sensor nodes. Meanwhile, it also means that the attacker has to insert the replica node close to the compromised sensor node u to stay undetected. The closer the replica node is to the compromised node, the less the potential threat of the replication attacks is
At last, we let P denote the detection probability as in (11). Figure 7 shows that the detection probability changes with the ratio γ when B is fixed to be 1, 5, and 10, respectively. Consider

Detection probability.
6.3. Detection Performance
To inspect the detection scheme, simulations in different scenarios are carried out in the NS2 network simulator [24].
6.3.1. Replication Detection
In the simulations, the WSN deployment area is 5000 meters both in the width and in the length. The wireless nodes' physical radio model is TwoWayGround and the MAC layer protocol is IEEE 802.11. In NS2, the default communication radius of the wireless node is 250 meters.
Simulations are conducted in two network deployments; there are 100 sensor nodes in one deployment, while there are 1000 sensor nodes in the other deployment. In both deployments, the sensor nodes are distributed randomly with a uniform distribution. There is only one replica sensor node in every round of the simulations; the replica sensor node claims to own a location-binding identity, but the identity is forged. The average count of neighbor nodes can be computed with (7), yielding the average sensor nodes count;
We carried out 100 rounds simulations in both deployments. The detection probability was computed as the ratio of the successful detection rounds to the total 100 rounds. We change the distance ratio γ defined in (6) from 0.0125, 0.025, 0.05, 0.1, 0.2, 0.5, 1.0, 1.5 to 2.0 in the simulations, in order to examine the detection probability in 3 scenarios.
The results are shown in Figure 8, in which the analytical results are also shown for comparison. It can be seen that the larger the ratio γ is, the larger the detection probability is, which fits the analytical results.

Simulation results and analytical results.
6.3.2. Detection Overhead
At last, simulations are carried out to analyze the detection costs of our scheme. In the simulations, the total number of sensor nodes is fixed to
Figure 9 shows the results. Every stationary sensor node has to send about 1 data packet to its neighbor sensor nodes, while it has to receive about at most 7 data packets from its neighbor nodes on average. That is, in total, at most 8 data packets have to be transmitted on average.

Detection overheads.
7. Comparison with Related Work
In our scheme, every sensor node asynchronously reports its location-based identity to its neighbor nodes and also has to receive claim messages from all its neighbor nodes. The total communication and computation overheads incurred by the detection are
The detection algorithm is executed by every node in the network in a distributed manner, eliminating the packets forwarding in the network. Thus, the detection overheads are well balanced over the network.
Table 2 compares the detection overheads with RM [2], LSM [2], RED [7], and CSI [6] schemes, in which B represents the average number of sensor nodes in the communication range.
Comparison with other schemes.
As distributed approaches, RM [2], LSM [2], and RED [7] all assume public key schemes, which poses great challenges to the resource-limited sensor nodes. Yet, as a centralized scheme, CSI assumes the base station to conduct detection, which has the common problems of centralized approaches, such as single-point failure and unbalanced overheads among the sensor networks [1].
8. Conclusions
We present a detecting and defending scheme against replication attacks within neighborhood by using location-binding symmetric key scheme. The replica nodes are restricted to the vicinity of the compromised node. The location claim messages would not be forwarded, eliminating the overheads of other sensor nodes out of the neighborhood.
The analysis results show that the average number of sensor nodes and the distance between the compromised sensor node and the replica node determine the detection probability. The greater sensor nodes is, the higher the number of the detection probability; the farther apart the compromised node is from the replica node, the higher the detection probability is. Simulation results confirm the analytical findings. Compared with previous replica detection schemes, our scheme greatly lowers relevant detection overheads.
Footnotes
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant no. 61202474, the Natural Science Foundation of Jiangsu Province under Grant no. BK2011464, and the Project of the Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Anhui University.
