Abstract
Wireless sensor network (WSN) is built of many sensor nodes. The sensors can sense a phenomenon, which will be represented in a form of data and sent to an aggregator for further processing. WSN is used in many applications, such as object tracking and security monitoring. The objects in many situations need physical and location protection. In addition to the source location privacy, sink location privacy should be provided. Providing an efficient location privacy solution would be challenging due to the open nature of the WSN. Anonymity is a key solution for location privacy. We present a network model that is protected against local, multilocal, and global adversaries that can launch sophisticated passive and active attacks against the WSN.
1. Introduction
A wireless sensor node (SN) is a simple autonomous host device. It can sense a phenomenon, convert the sensed information into data, process the data, and then transmit the data to a base station (BS) for further analysis. The SN host is very limited in terms of storage space, memory, processing power, communication bandwidth, and battery energy [1]. This work focuses on monitoring and tracking applications, such as tracking animal in the wildlife or a fellow soldier in the battlefield. When the SN senses an object, it reports data by sending it hop-by-hop to the BS. One of the most common applications discussed in the literature is the panda monitoring game [1]. When a sensor node detects a Panda in a certain area, it should report the data via a message to the BS. The fact that the data is sent out of the SN in open environment will be a reason for the adversary (ADV) to know the location of the SN and consequently the location of the object. In order to protect the Panda from the ADV, we need to implement in place an efficient source location privacy scheme (SLP). SLP is even more important in military, homeland security, and law enforcement, in addition to many civilian applications [2]. We need, as well, to provide location privacy for the BS (BSLP), which is the data aggregator and the controller of the WSN. The solution needs to provide anonymity where the ADV cannot know the identity of the SNs. We have to reduce the capture likelihood and increase safety period. Safety period is the time before the first SN is captured. All transmissions in the WSN happen in the open air where the SNs are unattended. One of the most important issues that any solution needs to account for is the energy conservation to maintain a longer life for the WSN. Thus, applying regular privacy and security algorithms might not be suitable for such networks.
2. Problem Statement
Privacy in WSN is typically categorized into two categories: data privacy and context privacy [3]. Data privacy focuses on data aggregation and data query. Contextual privacy involves location privacy, identity privacy, routing privacy, and temporal privacy. Providing SLP and BSLP could be achieved using many schemes such as random walk, geographic routing, delay, dummy data sources, cyclic entrapment, anonymization, cross-layer routing, separate path routing, network coding, and other schemes [1, 4].
In this work, we will use anonymization to provide SLP and BSLP. One of the first works to classify context privacy was done by Kamat et al. [5, 6], where they addressed the Panda hunter game. They claim that the routing scheme is responsible for hiding source location of an object. They have used two metrics to measure SLP: first, the safety period, which is the number of messages a source sends before it is captured, and second, the capture likelihood, which is the probability that an adversary can capture the source within a certain period. There are generally two ways to locate a source using passive attacks: traffic analysis [3, 7] and packet tracing [3, 8, 9]. We provide a framework that can be tested against other solutions using five metrics: security: the probability that the adversary successfully identifies the source, the intermediary SNs, or the BS; energy cost, where the solution will provide privacy with reasonable energy conservation; storage memory cost and delivery time, where the solution should consume a reasonable storage space and provide reasonable delays; safety period, where the solution will guarantee transmission of reasonable amount of messages before the first SN is captured. The rest of this paper is organized as follows. In Section 3, we give some background and literature survey. In Section 4, we will explain the suggested system model, network model, threat model, and the traffic model. In Section 5, we will introduce our anonymity protocol. In Section 6, we will have a thorough security analysis. In Section 7, we will have performance analysis and evaluation. In Section 8, we will summarize our work and suggest some additional models to the framework in the future work.
3. Background and Literature Survey
There are many solutions, which have been presented to solve the problems of SLP and BSLP. Conti et al. [1] categorized the solutions into groups. They have discussed many solutions and compared them in terms of the threat model, view of the network, power consumption, exposed information, and efficiency in providing “location privacy.” Recently, anonymity has become a concern for WSNs. We have identified important literature discussing solutions for anonymity in WSN, such as HIR: Hashing-Based ID Randomization [10]; RHIR: Reverse HIR [10]; SAS: Simple Anonymity Scheme [11]; CAS: Cryptographic Anonymity Scheme [11]; MAQ: Max Query Aggregation [12]; APR: Anonymous Path Routing [13]; ACS: Anonymous Communications Scheme [14]; DCARPS: Destination Controlled Anonymous Routing Protocol for Sensor Nets [2]; PhID: Phantom ID [15]. None of these solutions provides location privacy against global ADVs and active attacks. A reasonable solution against global adversary introduced by Chen et al. [16] called efficient anonymous communication (EAC), which provides sender, link, and sink anonymity. The solution consists of three phases. Every SN creates multiple pseudonym IDs, which are global anonymous identity, anonymous broadcast identity, and anonymous one-hop identity, and anonymous acknowledgement identity. Each transmission should have a new anonymous pseudonym. The solution is relatively light; however, it easily breaks pseudonym synchronization. In addition, it fails to provide BSLP and it is not secure against traffic rate analysis attacks. We know that anonymization is not enough to achieve end-to-end privacy. There are some solutions based on dummy data sources where SNs send out fake packets to other nodes within the network. A fake packet does not contain any information about any real event but it helps to obfuscate the real traffic and to divert the adversary. The literature shows reasonable solutions using dummy sources. Some of them are designed to handle local ADV. However, few are suitable for global ADV. Some of the literatures presume a certain routing scheme, topology, network, and threat models. Ouyang et al. [17] introduced three different solutions to handle the global ADVs. In this work, we will enhance EAC, the efficient anonymous communicating protocol [16, 18]. EAC does not handle the pseudonyms synchronization very well. There are many situations where the system will get unsynchronized. It also could not handle multi-colluding adversaries and lacks a mechanism for time correlation attack. There is an extension for EAC presented in [4] which has reasonably addressed some of the issues of EAC. Most of the other solutions do not handle global or multi-colluding adversaries. Each solution focuses on certain scenarios of attacks. Our work is aimed to be comprehensive.
4. Preliminaries
Our contribution in this work provides sender anonymity, receiver anonymity, link anonymity, SLP, BSLP, and data privacy. The system is fed with inputs, such as the nature of the ADVs in the network and the residual energy in the SNs. We assume bidirectional links where two nodes are considered neighbors if they can hear each other [2]. The network considers one BS which aggregates sensed data from all the SNs. The BS works as an interface to the wired network [11]. Data packets generated by SNs are addressed to the BS; however, they go through multihop paths. Control packets are sent from the BS to the SNs by unicast or broadcast messages. To enhance BSLP, the BS acts like a normal SN in the network when it communicates with other SNs, to make it indistinguishable. The WSN runs in two phases: startup phase and communication phase. We assume that the SNs have the ability to obfuscate the addresses at the MAC level header [11]. All sensors are time synchronized [11]. There are many algorithms for SNs time synchronization, which is out of the scope of this work. The WSN will need a protocol for network topology discovery that allows the BS to view the global topology of the network without revealing the BS location [2]. The adversary nodes have very strong capabilities compared to the SNs. They are resource-rich; they have sufficient energy supply, computation capabilities, and unlimited storage memory. An adversary could run both passive and active attacks. We presume that only few compromised nodes could exist at one time due to the implementation of intrusion detection system (IDS) which is able to detect compromised SNs. We assume a global adversary, which can monitor the traffic of the entire network and can determine the node responsible for the initial transmission. Assuming a global adversary means the following: (a) the worst-case scenario for area coverage where colluding sensors can cooperate to cover the whole network area [21]; (b) the worst-case scenario for timing where the coverage area of the adversary is not known at any time [21]. We also assume that the adversary is capable of observing SNs transmissions over extended periods. It is, however, not able to break the encryption algorithms or the hash functions used for securing data during transmission. We presume abundant traffic where sensors detect and transmit many packets, such as in the applications of environment monitoring.
5. Proposed Anonymous Model
The communication process is divided into two phases, namely: startup phase and communication phase.
5.1. Startup Phase
Prior to actual distribution of the sensors in the field of application, the SNs need to be tested, fully charged, and preloaded with some parameters which are needed during the startup phase and then during the communication phase. All parameters and terms used for the proposed solution are mentioned in Notations [4, 16, 19].
It is typical to presume that the WSN is considered secure for some short period after SNs' deployment in the field and before the steady communication phase. Zhu et al. [22] presented that WSN has a lower bound on the time interval that the adversary needs to be able to compromise a SN. During this time, the SNs can communicate and exchange all preloaded and calculated parameters safely. The SNs need to know their relative locations to the BS and to the neighboring SNs. Likewise, the BS needs to know the location of all the SNs participating in the WSN. There are Nemours localization schemes which are proposed in the literature [11, 16, 23, 24]. After the localization process is completed, each SN will know its smallest hop-count to the BS
5.1.1. Issuing the Pseudonyms
A SN uses any issued pseudonym only once. That means we need to use a one disposable pseudonym per a transmission. There are five kinds of transmissions that could happen in the WSN [4, 16]: (i) multihop transmission between a SN and BS, (ii) transmission between two neighbor SNs, (iii) broadcast sent by a SN or the BS, (iv) acknowledgement, and (v) fake broadcast. The process starts by creating a pseudonym ID for each

Two SN neighbors share the required pseudonyms.
5.1.2. Deleting Security Information
After storing all required pseudonyms, parameters, and keys in
5.2. Communication Phase
During the steady communication phase, when sensing and sending data by nodes to the BS takes place, seven operations continue until the network lifetime ends. These operations are [4] (i) sensing and sending a message to a neighbor, (ii) forwarding a message to a neighbor, (iii) broadcasting a message, (iv) acknowledgement, (v) sending a fake message, (vi) SN addition, and (vii) SN removal. A SN will have three roles, in terms of data transmission, during the communication phase [4, 16, 25]: (i) role as a sensor, (ii) as a forwarder or intermediary node, and (iii) as a broadcaster. In the following sections, we will use
5.2.1. Transmission as a Sensor
When
5.2.2. Transmission as a Forwarder
When
5.2.3. Acknowledgment
As expected in data networks, message could be lost or become corrupted. In either case, retransmission is required. Because SNs change PIDs after each transmission, synchronizing PIDs is crucial. Updating the pseudonyms depends on successful message transmission and reception. Technically, a sender and a receiver should alter the pseudonym value only after making sure that the data are sent correctly and received perfectly by the BS. The lack of direct connection between the sender and the receiver makes it a complicated process. The BS cannot send direct acknowledgement to the source if it is multiple hops away. We have to depend on multiple acknowledgements along the path between the source and the destination. Transmitting data through a route of y hops needs y acknowledgements.
Scenario 1.
The packet sent by
Scenario 2.
The packet is received correctly by

Sliding window for received PIDs [19].
5.2.4. Transmission as a Broadcaster
Typically, the BS is required to broadcast messages for control and management purposes. Likewise, a sensor might need to broadcast a message to the BS for network setup, maintenance, and other management issues. A sensor, as well, could broadcast a message for emergency or urgency reasons depending on the application at hand. The framework requires keeping all the packets, transmitted throughout the network, indistinguishable by the adversary. Thus, all the messages need to have the same size. Broadcast is one-to-many transmission; a
5.2.5. Limited Broadcast Messages
A sensor inside network maze can only recognize the neighbor sensors and the BS. When SN broadcasts a message uplink, then all sensor neighbors should hear it. When intermediary sensor gets broadcast message, it should rebroadcast the message if and only if the message is received from SN with bigger HC. This way the message is directed to the BS through multiple routes while saving unnecessary traffic. The
5.2.6. Fake Broadcast Message
The sensors need to send fake messages to prevent time correlation attack and statistical analysis. The rate of the fake messages follows a certain protocol. Fake message is a one-hop broadcast message. However, to prevent correlation, the message needs to behave similar to real messages. Therefore, the message needs to be encrypted and has similar size as real messages. This will make the fake messages indistinguishable from the real messages. Since it has to carry a dummy data, we chose to make it carry residual energy (Δ) of the source sensor. This information will be extracted by the recipient neighbors and saved in the related tuple in the table TBL. The fake broadcast message sent by
5.3. SN Removal
There are many reasons why we need to remove a sensor from the network. For instance, when the battery of the sensor is about to deplete, it should refrain from participating in transmission in preparation of removing it from the network. This would protect against lost data and keep the pseudonyms synchronized.
In some cases, WSN uses intrusion detection system (IDS) [26], to protect against active attacks. Once SN is captured by the ADV, SN must be removed from the network and banned from participation in data transmission. Procedurally, if
5.4. SN Addition
To add
6. Security Analysis
We need to analyze our solution for both passive and active ADV attacks. The ADV has global view of the WSN. Usually, ADV starts by monitoring transmission somewhere in the network and then attempts to capture the location of source SNs or the BS. Once ADV determines the identity and location of a source SN, it consequently launches active attacks against certain SNs and tries to disrupt the operation of the entire WSN. The main strength of passive ADV is the fact that neither SNs nor the BS will know about their existence since they act passively. However, active attacks can be detected if the WSN implements a good IDS.
6.1. Security against Passive Attacks
SNs use disposable pseudonyms to recognize each other instead of using the real IDs. There is absolutely no real ID stored in the SN and absolutely no one pseudonym is used more than once. In addition, data is encrypted all the way from the source to the destination using pairwise shared keys between sources and the BS.
Eavesdropping and Content Analysis. ADV can intercept messages without being able to decrypt them because they are encrypted. The only information ADV can get of a captured message is the pseudonyms which are all temporary IDs and have no further use except to calculate new pseudonyms. However, the ADV cannot get, from the captured messages, important parameters
Hop-by-Hop Trace. ADV can track hop-by-hop messages from one node to another by overhearing the transmitted messages to capture either the source SN or the BS. The ADV will be faced with many real and fake transmissions, which are identical, during the lifetime of the WSN. Furthermore, each node uses different routes and the message content changes completely after each transmission.
Size Correlation. This attack would not work for our framework since all the messages have commensurate size whether it is real or fake.
Identity Correlation. ADV cannot relate overheard IDs to SNs. It is not possible since SNs use different pseudonyms every time a message is transmitted.
Rate Monitoring. ADV tries to find different transmission rates in the network, such as having higher transmission rate nearby the BS or busy SNs. This is handled by issuing fake messages all over the network to maintain similar transmission rate over the lifetime of the WSN.
6.2. Security against Active Attacks
ADV could physically compromise SN which gives the ADV the advantage of monitoring messages sent and received through the captured SN. We assume ADV knows encryption protocols used by the system while the system hides encryption keys and IDs. In such attacks, ADV tries to compromise SNs to get information related to security of the system. Consequently, it will monitor all packets going through the compromised SN to capture the identities and important parameters. Once ADV captures some privacy information, it reports it to external executers to do further damages (such as killing the Panda). Yet the IDS cannot detect that the SN is physically captured due to the passive behavior of ADV. In some other attacks, ADV captures SNs and actively forge messages, send replay messages, and do other forms of invasive attacks. Moreover, ADV could load powerful computers with captured credentials to launch more catastrophic attacks.
If ADV physically compromises a SN, then it captures two sets of information:
information related to the node itself: the current pseudonym PID, the parameters used to calculate the pseudonyms, the hash functions, the keys, and other information as listed in Notations; information related to the neighbors as listed in Table 1.
In this case, the ADV will have all it needs to issue new pseudonyms and send messages out to the neighbors.
Scenario 1.
If ADV physically compromises
Scenario 2.
If ADV physically compromises multiple SNs, let us call it set
Scenario 3.
If the message sent by source
Scenario 4.
If a message sent by source
It is unrealistic situation to have many compromised SNs. However, this proves that one or very few physically compromised SNs do not for sure leak the ID and location of the SN. None of the literature we have seen discussed this worse-case situation where you have physically compromised SNs. We have proved that even if we have some physically compromised nodes, our system still can limit the leak of the IDs and locations.
6.3. BS Security
ADV can learn about a SN receiving a message in two ways: (i) when SN retransmits the message, which is traced by ADV and (ii) the ADV being able to correlate the ID with the recipient SN. ADV cannot locate the BS by one compromised SN because every message uses a different pseudonym. It will need to compromise multiple SNs along the path to the BS, providing that compromised SNs can collude together. The goal of this model is to delay the capturing of the BS in the presence of multiple captured and colluding SNs.
Let us presume
6.4. SLP and BSLP
SLP and BSLP are achieved at first by having successful source, link, and BS anonymity, which was thoroughly argued earlier. ADV cannot infer any information from the intercepted messages. Passive attacks will not endanger the location privacy. Having IDS will handle active attacks. However, having active attacks by physically compromised SNs could hinder the location privacy if many SNs are captured in one area.
6.5. Timing Privacy
By using fake messages at variable interval times, it becomes super hard for the ADV to correlate messages being transmitted over the network as exhibited in Figure 3.

6.6. Routing Privacy
Although shortest path routing is used in this framework, choosing the next hop is done according to certain probabilistic algorithm, which accounts for the residual energy levels and usage frequency to increase the route privacy, as exhibited in Algorithm 1 and Figure 4. ADV cannot relate routes to nodes [20]. Even if two messages follow the exact same route, ADV will see them as if they are two different routes since each node along the route has different PIDs.
Procedure: SELECT_FORWARD_NODE_PROC; Input: Output: Messages out for i in Average_ for i in
x = Average_ if (SN J = SN X = SN
If j != null Return; for i in
x = Average_Δ; if (SN J = SN X = SN
If j != null Return; for i in
x = Average_Δ; if (SN J = SN X = SN
If j != null Return;

6.7. Data Privacy
Messages are encrypted before transmission and reencrypted at every hop along the route to the BS. Data will be authenticated by a message digest created by a secure hash function. The physically compromised nodes are able to inject data in the network but good IDS can detect the falsified data. The system adopts a secure procedure to remove compromised SNs from the WSN and adopts another procedure to add sensors as needed.
6.8. Energy and Location Safety Period
In this work, we will assume a simple energy consumption model [20, 27, 28]. The radio consumes ℇ nJ/bit for both transmission and reception by the SN's circuitry. In addition, it consumes ε nJ/bit/m2 for the transmitter amplifier to achieve an acceptable signal to noise ratio. Therefore, to transmit k bits for r distance, the total transmission energy dissipation will be
Moreover, the receiver would consume for reception of k-bit message
The exact value of q cannot be known since it is an event driven value, which depends on the situation of the WSN at that time. The total energy consumed in the network to send fake messages in one interval where
Energy transmission efficiency ETE [20] can be calculated as
The energy safety period is the time when the first sensor is depleted. However, transmission of data is distributed among all sensors to make sure the safety period lasts longer. The level of the energy in every sensor is measured and considered as a decision factor (among others) for selecting the next-hop for the message. This will definitely help to distribute the network accumulative energy consumption over the whole WSN. The location safety period is the period between capturing the first packet and finding out the location of the source or the BS. It is measured as the ratio between the period of discovering the location and the interval period [25]. We have implemented a WSN of 300 SNs uniformly distributed over 200 × 200 area. The average distance between SNs is 15. For the simulation, a SN sends only 10% fake message of the total messages. That means that, among ten real messages, it will send only one fake message since we presumed a busy network. This could be scaled up or down. Increasing the fake message will increase the power dissipation.
Figure 5 shows that EEAC provides improved source safety period compared to EAC [4]. Figure 6 shows that EEAC also provides a stronger BS safety period compared to EAC [4]. However, the source safety period is much more than the sink safety period in both schemes. It is always harder to achieve BSLP due to the volume of transmissions nearby the BS. The efficiency of energy dissipation is due to adopting Algorithm 1 for selecting the forward SN where it takes in consideration the residual energy levels of the SNs, which was not considered in EAC. Figure 7 shows the delay in EEAC is very comparable to the EAC due the extra processing and security implementation. Furthermore, Figure 8 shows that the average total delivery time source-to-BS is also very comparable to EAC.

Source safety period.

BS safety period.

Transmission delay.

Message delivery time.
7. Performance Evaluation
In this section, we evaluate the performance of our system, by evaluating the storage, processing, computational, and communication costs.
7.1. Storage Evaluation
There are two sets of information stored in a
If we presume that the keys, the random numbers, the pseudonyms, and the hash functions are all n bits long in average and the required bits for miscellaneous data altogether are two bytes and the average number of neighbors
7.2. Processing and Computational Evaluation
Hash functions are used to calculate the pseudonyms and symmetric cryptography is used to encrypt the messages. Because we need to calculate three pseudonyms and one acknowledgement after each transmission, thus using encryption to create pseudonyms was avoided since it requires more processing power compared to hash functions. When a SN senses data, it needs to have OWH calculations for PID, OHPID, and APID at the sender and OHPID at the receiver. If the system opts for data authentication, then another hash function is needed. The source node needs only one encryption for the data if
7.3. Communication Cost Evaluation
The most expensive operation for power consumption is transmission of bits from one node to another. We use two stages for air communication in our framework: (i) the startup phase and (ii) the communication phase. The data transmission during startup phase is minimal since the BS beacon and the neighbors' information exchange happens once before the steady communication phase. During communication phase, data will be forwarded hop-by-hop to the BS. Every packet is equally sized to prevent time and size correlation. We have introduced a probabilistic fake packet transmission scheme which none of the other protocols adopted. When a SN does not have real message, it will send fake message according to the protocol discussed earlier.
The cost per message at one interval time is
8. Conclusions and Future Work
The scheme presented in this work provides source, link, and BS anonymity, SLP and BSLP. Most of the previous work assumed local adversary view and passive attack model. This work addressed local and global adversary network view. It handles both passive and active attack models. Anonymity cannot provide temporal privacy. To provide temporal privacy, the global adversary needs to see a maze of transmissions happening all over the network. Fake messages were introduced. However, using fake messages needs to be adjusted to manage the energy consumption. The presented model can handle both homogenous and heterogeneous sensor nodes in terms of initial energy levels. It uses the energy level as the main indicator in deciding how to route packets forward. We have demonstrated that our framework can withstand most of the known passive and active attacks. We have discussed many scenarios and provided solutions. The storage cost was mathematically analyzed for the framework. We also have discussed the complexity of computational operations performed in the system, which includes encryption and hash functions. To provide security against multi colluding active attackers, we have introduced onion encryptions. The future work would include enhancement on the probabilistic scheme for fake messages usage. We also will implement our model for clustered networks.
Footnotes
Notations
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
