A Collaborative Self-Governing Privacy-Preserving Wireless Sensor Network Architecture Based on Location Optimization for Dynamic Service Discovery in MANET Environment

Abstract

Because of the characteristics of a MANET, none of the existing solutions for service discovery works well in decentralized mobile environments. In this paper, we propose a collaborative self-governing privacy-preserving wireless sensor network architecture to address the issue of service discovery in a MANET environment. The proposed architecture dynamically switches its working mode between directory-based and directory-less modes according to the network status of the MANET, which is monitored by a wireless sensor network. Two location optimization algorithms are developed for topology control and for the preservation of user location privacy. Simulation results show that the proposed architecture can greatly improve the performance of service discovery.

1. Introduction

With the rapid development of the Internet, more and more services are deployed on the web, and the need for service discovery mechanisms is becoming increasingly urgent. Much research attention has been devoted to the design of such mechanisms. Meditskos and Bassiliades [1] developed a web service discovery framework using OWL-S advertisements. Junghans et al. [2] provided a review of formalisms for semantic web service discovery and developed distinct formalisms to describe functionalities and service requests. The proposal presented by Santhana and Balasundaram [3] employed k-means clustering to address the problem of web service discovery. In addition, service discovery methods which deal with functional and nonfunctional properties of services have been studied extensively [4]. Some techniques for web service discovery have also been contributed in the form of a patent [5]. In general, service discovery can be conducted in three ways: reactive, proactive, and hybrid. In the reactive way, a user node which needs some kind of service issues a service query. Other nodes which receive the service query may answer it by issuing a service reply. The node which answers the service query could be a directory node, a service node, or a user node. In the proactive way, service information is periodically advertised by service nodes, and both directory nodes and user nodes receive the service advertisements. A user node with a potential need for a service which happens to receive an appropriate service advertisement can use the service information directly, rather than issuing a service query to search for it.

Generally, the reactive way generates network traffic only when there is a need, which keeps the message overhead low. During the query process, the original service query might be forwarded by a node which is unable to answer it. To avoid flooding the whole network, the time-to-live (TTL) value must be set carefully. The dilemma is that a small TTL value limits the transmission range and thus degrades service discovery, while a large TTL value expands the transmission range and floods the query over a wide area. Since service nodes do not actively issue service advertisements in the reactive way, both the latency and the failure rate of service discovery are high. Similarly, the TTL value of periodic service advertisements should be selected cautiously in the proactive way. Moreover, to mitigate the flooding effect caused by periodic service advertisements, the advertisement frequency should be tuned carefully. As service nodes periodically issue service advertisements regardless of whether there is a need for service information, both the latency and the failure rate of service discovery in the proactive way are lower than those in the reactive way. However, this in turn constantly imposes a certain amount of message overhead on the whole network. As the reactive way and the proactive way each have advantages and disadvantages, the hybrid way combines the two complementary ways. However, since the disadvantages of both ways are also introduced, making the two ways cooperate smoothly is challenging.
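The TTL dilemma can be made concrete with a minimal sketch. The adjacency map `chain` and the helper `nodes_reached` are hypothetical names introduced only for illustration; the sketch assumes a static topology and simply counts which nodes a flooded query reaches for a given TTL.

```python
from collections import deque


def nodes_reached(graph, source, ttl):
    """Return the set of nodes that receive a query flooded from `source`.

    Each forwarding hop consumes one unit of TTL; a node that receives
    the query with no TTL left does not forward it any further.
    """
    reached = {source}
    queue = deque([(source, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted: stop forwarding here
        for neighbor in graph[node]:
            if neighbor not in reached:
                reached.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return reached


# Hypothetical five-node chain: A - B - C - D - E
chain = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
```

With a TTL of 1 the query issued by A only reaches B, so a service offered by E is never found, whereas a TTL of 4 reaches every node at the cost of forwarding the query across the whole chain.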

In this paper, we propose a collaborative self-governing privacy-preserving wireless sensor network (WSN) architecture for dynamic service discovery in a MANET environment. Our model is based on the well-known Chord protocol and can operate in both directory-based mode and directory-less mode. The WSN, which consists of directory nodes, greatly facilitates service discovery. Topology control for the WSN is conducted through local location optimization and global location optimization. The location privacy of a user is preserved by a direction-probing algorithm used in the local location optimization. The key function of our model is the autonomic switch between the directory-based mode and the directory-less mode.

The rest of this paper is organized as follows. Section 2 reviews related work on service discovery architectures and discusses the existing problems. Section 3 elaborates our collaborative self-governing privacy-preserving service discovery architecture. In Section 4, we evaluate the proposed model by simulation and present a detailed analysis of the simulation results. Section 5 concludes the paper and highlights future work.

2. Review of Service Discovery Architectures

Traditionally, service discovery architectures are classified into two types: directory-based architecture and directory-less architecture.

2.1. Directory-Based Architecture

In directory-based architecture, a directory structure could be implemented either in centralized manner or in distributed manner.

For a centralized directory, all service information is stored in the same place. All services in the whole network should be registered with a centralized repository. At the same time, the centralized repository is responsible for all service queries. Examples of centralized directory are as follows: the service location protocol (SLP) [6] suggested by IETF, the Jini technology [7] developed by Sun Microsystems, the salutation protocol [8] proposed by IBM, the universal plug and play device architecture (UPnP) [9] specified by Microsoft, the Bluetooth service discovery protocol [10] presented by Bluetooth consortium, the secure wide-area service discovery service (SDS) [11], and the intentional naming system (INS) [12]. Generally speaking, a centralized manner means attractive simplicity, inevitable single point of failure, poor scalability, and weak mobility. On the contrary, a distributed manner indicates certain complexity, robustness, good scalability, and strong mobility. Taking the characteristics of a MANET environment into consideration, we hold that the distributed manner is preferable. Hence, we omit the details of the above centralized directories and focus on the instances of distributed directory.

For a distributed directory, the central problem is wide-area service discovery. It has two aspects: the replication of service registrations and the forwarding of service queries. Since directory agents reside in various locations, the service registrations they receive usually differ, so each directory agent knows only a portion of the service information in the whole network. If there is no replication scheme in a distributed directory, service information is available only locally, around the particular directory agent with which it was registered. In contrast, a distributed directory with a replication scheme enables communication among directory agents. Through replication of service information, each directory agent becomes aware of all services registered in the whole network. Suppose instead that a distributed directory does not support replication of service registrations. To improve the success rate of service discovery, a service query which cannot be answered locally should then be forwarded to another directory agent. This operation relies on a forwarding scheme provided by the distributed directory. Recently, a considerable amount of work has been done on distributed directories. Based on the organization of the directory agents, distributed directories can be divided into four categories: backbone-based [13–15], cluster-based [16–19], DHT-based [20, 21], and other distributed directories [22, 23].

Kozat and Tassiulas [13] provided a virtual backbone which facilitates the registration and search of services. The virtual backbone is constructed from a subset of nodes in the network, and these nodes compose a dominating set. Though there is no replication scheme, nodes belonging to the backbone forward service queries to each other when the queries cannot be answered locally. However, a flaw is that the destination of the forwarding operation is random. The proposal [14] presented by Sailhan and Issarny remedied the flaw by forwarding an unsolved service query to the nodes which are expected to possess the corresponding service information. The selection of the destination to which a service query is forwarded is based upon the exchange of profiles among directory agents. The protocol [15] proposed by Koubaa and Fleury aimed to build a backbone which could cover all service information in the whole network. Nevertheless, as only service providers are allowed to join the backbone, an intact backbone may break down into several isolated parts. In this case, the global communication of the backbone is cut off, and the handling of service queries is crippled.

Klein et al. [16] presented a semantic overlay of hierarchical service rings. A service ring is composed of a cluster of service providers and is considered to be a directory, since the functions of accepting service registration and answering service query are contributed in the form of a service ring. Although replication is not supported, a service ring forwards a service query which it is unable to answer. Klein and König-Ries [17] proposed that services are dynamically organized into multilayer clusters. The authors make an assumption that there exists a common ontology which could be used to describe the services within the whole network. Upon receiving a service query, a node checks its leaf-level cluster. If the service query cannot be answered, higher-level clusters are examined in the pursuit of a suitable service. It is worth mentioning that nodes are clustered based on the physical proximity and the semantic proximity [16, 17]. Schiele et al. [18] suggested that clusters are formed according to the similarity of mobility patterns. A clusterhead is always ready to answer service queries. The remaining nodes in a cluster go to sleep when idle. Tyan and Mahmoud [19] presented a service discovery approach which clusters the nodes according to physical proximity. A gateway of each cluster plays the role of a directory. Wide-area service discovery is feasible since an unsolved service query could be forwarded to another gateway.

As mentioned above, when a replication scheme is not present, an unsolved service query should be forwarded to other nodes. However, it is not easy to determine an appropriate destination for an unsolved service query. Unlike backbone-based and cluster-based models, DHT-based approaches inherently rely on hash techniques, which facilitate the provision of location information. Seada and Helmy [20] developed a rendezvous-based architecture. The basic idea is that the whole network is divided geographically and each geographical area is responsible for a portion of the service information in the network. Service information is mapped to a geographical area according to a hash-table-like mapping scheme. A subset of the nodes within a geographical area is selected to maintain the mapped information. Since the hash function used for a service query is the same as the hash function used in the mapping of service information, the proper forwarding destination of a service query can be readily obtained by a directory agent. Sivavakeesar et al. [21] described a mechanism which is similar to the approach proposed in [20], except that the mapped information is handled by all nodes within a geographical area.

Other approaches which do not involve backbone, cluster, or DHT techniques are also notable. Lee et al. [22] presented a service discovery and delivery protocol called Konark. The proposed method is decentralized and targeted at a MANET environment. Under a pure peer-to-peer design, each node in the network has a database which stores the service information provided by other nodes. Jeong et al. [23] developed an ad hoc name service (ANS) system for IPv6 MANET to facilitate the service discovery and name-to-address resolution. It is assumed that each node could be configured with a site-local scoped IPv6 unicast address by ad hoc stateless address autoconfiguration [24, 25].

2.2. Directory-Less Architecture

As a MANET environment inherently has no infrastructure, it is argued that a directory-less architecture is more appropriate than a directory-based architecture [26]. The important distinction between the two is that there are no directory agents in a directory-less architecture; that is, there are only user agents and service agents in the whole network. The absence of organization and maintenance for directory agents reduces the complexity of the service discovery architecture: user agents issue service queries and service agents issue service advertisements. Though service queries are generated on demand, the number of replicas of a service query during the forwarding process could become very large. Furthermore, in order to decrease latency and increase service availability, the number of service advertisements must also be considerable. In this case, the pursuit of good service discovery performance is prone to cause a serious flooding problem.

Much research attention has been devoted to the handling of service queries and service advertisements. Chakraborty et al. [27] proposed a group-based service discovery (GSD) protocol for pervasive environments. The protocol consists of two components. The first component is a P2P caching scheme dealing with the spread of service advertisements. A node stores the received service advertisements in a local database called the service cache. A least-remaining-lifetime cache replacement policy is used to update the cache. The second component is a group-based intelligent forwarding strategy for regulating the forwarding of service queries. Specifically, an unsolved service query is forwarded only to the nodes which possess matching service information. The determination of destination nodes is based on the information contained in the service cache. The reply info caching enhanced flexible forward probability (RICFFP) service discovery protocol presented by Gao et al. [28] also consists of two parts: reply info caching (RIC) and flexible forward probability (FFP). The RIC is a technique similar to the P2P caching scheme in [27]. In particular, a node can cache the service information in a received service reply. For a received service query, if there exists a match in the cache, the service query can be answered directly. The essence of the FFP is that the forward probability of a service query is inversely proportional to the number of hops it has already traveled. Nedos et al. [29] introduced an optimal forwarding mechanism for service advertisements. By monitoring its neighbors, a node forwards the received service advertisements to only a part of its one-hop neighbors. Therefore, the broadcasting of service advertisements is turned into several unicasts. Nidd [30] set the target application environment to single-hop short-range wireless systems. For each node, the broadcasting of service advertisements is scheduled to take place at regular intervals. The task of scheduling is performed by an exponential back-off algorithm. The dissemination strategies proposed by Campo et al. [31] and Lee et al. [32] are similar in that they both cache service information contained in received service advertisements. Moreover, only service advertisements which have not expired and have not been seen recently are broadcast.

2.3. Problems

As the directory-based architecture and the directory-less architecture are complementary, a hybrid architecture is another option for service discovery in a MANET environment. For a node in a hybrid architecture, if there is no directory agent within its radio range, both service advertisements and service queries are sent in the same way as in a directory-less architecture. Otherwise, service information contained in service advertisements is registered with directory agents, and service queries can be answered by both directory agents and service agents. Though a great deal of research has been done on the above three types of service discovery architectures, it is still controversial which one is superior to the other two. It is recognized that the difficulty of comparison is largely due to the characteristics of a MANET environment. For instance, the nodes in a MANET might possess various degrees of mobility. Moreover, both the velocity and the moving direction of a node are unpredictable. The regular communication among nodes in the whole network should not be interrupted by node mobility. Since a MANET is self-organized, there are no constraints on the joining and leaving of a node. The appearance of a new participant or the departure of an existing participant could bring a significant change to the network topology. Again, the communication in the whole network should not be affected.

Considering a MANET with a directory-based architecture, the TTL value of a service query should at least guarantee that the query can reach a directory node. If the frequency of service queries is low, the message overhead introduced by service registration and directory maintenance is likely to surpass the message overhead caused by service advertisements in a directory-less architecture. For directory-less and hybrid architectures, large TTL values for both service queries and service advertisements could improve the performance of service discovery. However, there is a trade-off between the flooding problem and the performance enhancement. When there are a large number of service queries, a directory-less architecture has to increase the number of service advertisements. But since both service queries and service advertisements are simply broadcast, the resulting risk of network congestion means that satisfactory service discovery performance still cannot be guaranteed. Ververidis and Polyzos [33] pointed out that it could be useful to develop a flexible architecture which is able to adjust its parameters and switch between directory-based and directory-less working modes according to the status of a MANET.

3. A Collaborative Self-Governing Privacy-Preserving WSN Architecture for Dynamic Service Discovery

3.1. Network Model

3.1.1. Nodes

In our architecture, there are three possible roles for a mobile node: service provider, service requestor, and service directory. In order to perform the operations which facilitate the service discovery process, the above three types of nodes are equipped with three key components: service agent (SA), user agent (UA), and directory agent (DA), respectively. For the sake of simplicity, in the remainder of this paper, we refer to service provider nodes, service requestor nodes, and service directory nodes as SAs, UAs, and DAs, respectively. To facilitate the presentation of our model, we consider a MANET which consists of DAs, SAs, and UAs. In addition, there is a WSN composed of all DAs.

3.1.2. Directory Agent

Each DA is a wireless sensor node. DAs provide storage for service information, and SAs register service information with DAs. DAs then answer the service queries issued by UAs based on the service information they hold. The purpose of DAs is to improve the effectiveness of the service discovery architecture. The WSN constituted by all the DAs is based on Chord [34]. Chord is a scalable distributed lookup protocol which has been applied to service discovery [35–37]. The most important operation that Chord provides is mapping a given key onto a node. By consistent hashing, each data item is associated with a unique key. The $\langle \mathit{key}, \mathit{data\ item} \rangle$ pair is stored at the node to which the key maps. The mechanisms which handle the joining and leaving of nodes guarantee the normal operation of the whole network. In classic sensor networks, all sensor nodes are scheduled to send the obtained information to a single sink node. Though this approach eases routing, the authors of [38, 39] point out that the nodes in the vicinity of the sink node bear a heavy relaying workload. As a result, the batteries of these nodes are drained much earlier than those of other nodes. Thus, there is no sink in our WSN of DAs. Instead, the status of the whole network is summarized and presented in a monitoring token. In clustered sensor networks, the principle of clusterhead rotation is studied extensively in [40–42]. However, the election and rotation of clusterheads inevitably introduce extra overhead. In contrast, since the consistent hashing mechanism employed by Chord maps the keys of data items uniformly, the service querying process is inherently load-balanced across all the DAs. This feature also helps to avoid quick energy exhaustion of particular DAs.
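The key-to-node mapping described above can be illustrated with a minimal sketch. It covers only the assignment rule of Chord (a key belongs to the first node whose identifier is equal to or follows the key on the identifier circle) and omits finger tables, joins, and leaves. The helper names `chord_id` and `successor` are hypothetical, and the use of the full 160-bit SHA-1 digest as the identifier is an assumption based on the hash function adopted in Section 3.2.1.

```python
import hashlib
from bisect import bisect_left

M = 160  # assumed identifier length in bits (full SHA-1 digest)


def chord_id(text):
    """Hash a string onto the identifier circle [0, 2**M)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def successor(node_ids, key):
    """Return the first node identifier equal to or following `key`."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key)
    # Wrap around to the smallest identifier past the end of the circle.
    return ring[index % len(ring)]
```

For a toy ring with node identifiers 1, 8, 14, and 21, key 10 is stored at node 14, while key 25 wraps around and is stored at node 1.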

3.1.3. Unified Service Information Management

Our service discovery architecture employs a unified service information management scheme. All DAs, SAs, and UAs which participate in the architecture must be aware of the scheme. In particular, a service is profiled by a list of properties, each of which has a number of values. We denote the list of properties and values by L, and all nodes in the network have knowledge of this list. For example, a property called scope may indicate the application domain of a service, such as search engine, news archive, and anonymous FTP. We assume that there exists a predetermined list of properties and corresponding values. The properties are denoted by the set $\mathit{property} = \{pr_{1}, pr_{2}, \dots, pr_{np}\}$. The values of $pr_{i}$ are denoted by the set $p_{i} = \{pv_{i1}, pv_{i2}, \dots, pv_{im}\}$. We suggest that the value of one property occupies one byte, so a property can have up to $m = 256$ different values. We now introduce the definition of a service description.

Definition 1.

By the description of a service we mean the character string $\mathit{desp} = pv_{1s}, pv_{2j}, \dots, pv_{(np-i)k}$, which is formed by concatenating the values of $np - i$ properties sequentially, where $0 \leq i < np$ and $1 \leq s, j, k \leq 256$. When $i = 0$, we call $\mathit{desp}$ a complete description. Otherwise, we call $\mathit{desp}$ an incomplete description.

Hence, there are $\prod_{i=1}^{np} |p_{i}|$ different complete descriptions in total for profiling services. The format of a service description is shown in Figure 1.

Figure 1

Service description.
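As an illustration of Definition 1, the following minimal sketch encodes a description with one byte per property. The property list is hypothetical: only the scope property and its three values come from the text, while the cost property is invented for the example. The sketch also assumes that a value is encoded as its zero-based position in the list, which is one possible way of fitting 256 values into a byte.

```python
from math import prod

# Hypothetical property list L: property name -> ordered list of values.
PROPERTY_LIST = {
    "scope": ["search engine", "news archive", "anonymous FTP"],
    "cost": ["free", "paid"],  # invented property, for illustration only
}


def encode_description(chosen):
    """Concatenate one byte per property, in the order of PROPERTY_LIST.

    `chosen` maps property names to values and is expected to cover a
    leading run of the properties; encoding stops at the first property
    without a value, which yields an incomplete description.
    """
    encoded = bytearray()
    for name, values in PROPERTY_LIST.items():
        if name not in chosen:
            break
        encoded.append(values.index(chosen[name]))
    return bytes(encoded)


def is_complete(description):
    """A description is complete when it carries a value for every property."""
    return len(description) == len(PROPERTY_LIST)


def count_complete_descriptions():
    """Product of the number of values of each property."""
    return prod(len(values) for values in PROPERTY_LIST.values())
```

With this list there are 3 × 2 = 6 complete descriptions, and a query which only fixes the scope produces a one-byte incomplete description.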

3.2. Directory-Based Mode

In directory-based mode, all DAs in the MANET form a Chord network. All nodes in the MANET are mobile. SAs register their service information with DAs. A service query issued by a UA is handled by DAs.

3.2.1. Service Registration and Deregistration in WSN

For service registration, the service information submitted to a DA consists of three parts: description of the service, ID of the service provider (e.g., company name), and URL for accessing the service. Note that only a complete service description can be used for service registration. There are 32 bytes to accommodate the ID of service provider. Though the HTTP protocol does not stipulate the maximum length of a URL, we limit it to a maximum of 2048 bytes. The format of service registration is shown in Figure 2.

Figure 2

Service registration.

Since the WSN is based on Chord, we denote the size of the identifier space of the Chord network by $2^{m}$ (here $m$ is the number of identifier bits, not the number of property values used in Section 3.1.3). In order to accommodate all $\prod_{i=1}^{np} |p_{i}|$ different complete descriptions for profiling services, the inequality $2^{m} \geq \prod_{i=1}^{np} |p_{i}|$ should be satisfied.
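As a worked check of this inequality, assume that every property uses all 256 values allowed by the one-byte encoding of Section 3.1.3, so that $|p_{i}| = 2^{8}$ for every $i$. Then

$$\prod_{i=1}^{np} |p_{i}| = \left(2^{8}\right)^{np} = 2^{8 \cdot np} \leq 2^{m} \iff np \leq \frac{m}{8}.$$

If, in addition, the identifier is taken to be the full 160-bit SHA-1 digest ($m = 160$), the condition holds for up to $np = 20$ properties. Both premises are assumptions made for this illustration, not figures reported for the architecture.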

In our model, we use SHA-1 [43] as the consistent hash function to generate keys for both DAs and services. We denote the hash function by $\mathit{sHash}(\mathit{string})$. Upon receiving a valid service registration, a DA first sends a service registration acknowledgment to the SA which issued the service registration. Then, the DA extracts the description and the ID parts as a character string $\mathit{str}$. The key of the service is obtained by $\mathit{key} = \mathit{sHash}(\mathit{str})$. The $\langle \mathit{key}, \mathit{URL} \rangle$ pair is then sent to the DA which is responsible for the key. An SA can register its service information with an arbitrary DA within its radio range. Once an SA receives a service registration acknowledgment, there is no need to register the service again. If the corresponding DA moves out of radio range while an SA is registering a series of services, the SA can contact another DA to register the remaining services. Service deregistration is conducted in a similar way to service registration.
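The key generation step can be sketched as follows. The function names `s_hash` and `registration_key` are hypothetical, and padding the provider ID with zero bytes to the 32-byte field is an assumption; the text only states that 32 bytes are reserved for the ID.

```python
import hashlib

PROVIDER_ID_BYTES = 32  # size reserved for the provider ID in a registration


def s_hash(data):
    """SHA-1 digest of `data`, interpreted as an integer key."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def registration_key(description, provider_id):
    """Key of a service: hash of the description concatenated with the ID.

    `description` is the byte string of a complete description and
    `provider_id` is the provider name (for example, a company name).
    """
    raw_id = provider_id.encode("utf-8")
    if len(raw_id) > PROVIDER_ID_BYTES:
        raise ValueError("provider ID exceeds 32 bytes")
    # Assumed layout: the ID is right-padded with zero bytes to 32 bytes.
    padded_id = raw_id.ljust(PROVIDER_ID_BYTES, b"\x00")
    return s_hash(description + padded_id)
```

The receiving DA would then send the pair made of this key and the service URL to the DA responsible for the key. Two providers offering services with the same description obtain different keys, because the provider ID is part of the hashed string.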

3.2.2. Key Generation and Searching in WSN

In an unstructured P2P network, searching based on a keyword is feasible. However, the consistent hashing of filenames in a structured DHT prohibits such an operation in Chord. As a result, only exact matching works. The inverted index proposed in [44] employs the mapping $\langle \mathit{keyword}, \mathit{node} \rangle$, so that keyword-based searching is implemented by redundant storage. The major drawback of this approach is the problem of common keywords: the nodes which are responsible for common keywords bear a heavy burden. In our model, a key is generated by hashing the character string of the description of a service concatenated with the service provider ID. The inclusion of the service provider ID resolves the above problem of common keywords.

When a UA has a potential need for a certain service, it selects several values of interest from the list L. Then, a service query with the same format as a service description is sent to a DA. Upon receiving a service query, a DA first checks whether there are missing values for properties. For a complete description, the DA which received the service query simply appends the ID of each service provider known to it and hashes each character string composed of the description and an ID. Then, the DA searches for the obtained keys. However, it is very likely that the values of some properties are missing due to uncertainty, in which case we have an incomplete description. For $i$ missing values, each one is filled with several of the most popular values queried in the past. We denote the number of most popular values by $\mathit{nmpv}$.

After the filling operation on the original incomplete description, we have $\mathit{nmpv}$ complete descriptions. Then, the IDs of the service providers known to the DA are appended to the complete descriptions. We denote the number of service providers a DA is aware of by $\mathit{nsp}$. For a scenario of $i = 1$, $\mathit{nmpv} = 3$, and $\mathit{nsp} = 2$, the filling and appending process of an incomplete description is illustrated in Figure 3. Since there are six character strings of descriptions concatenated with service provider IDs, we obtain six different keys for searching. Missing property values are allowed, as they give the UA a more open-ended search result. However, since the keys are generated from service information which is speculatively constructed by a UA and a DA, it is worth noting that both a complete description and an incomplete description may yield no search results. For an existing service with a search key, the responsible DA sends a service reply to the original DA which issued the search. Then, the search result is fed back to the UA by the original DA.

Figure 3

Filling and appending.
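The filling and appending process of Figure 3 can be sketched as follows. The function `candidate_keys` and its arguments are hypothetical names. The text describes the case of a single missing value; combining the popular values of several missing properties through a Cartesian product is an assumption of this sketch.

```python
import hashlib
from itertools import product


def candidate_keys(known_values, popular_values, provider_ids):
    """Expand a (possibly incomplete) description into search keys.

    known_values:   values supplied by the UA for the leading properties
    popular_values: one list of popular values per missing property
    provider_ids:   IDs of the service providers known to the DA
    """
    keys = []
    # Filling: every combination of popular values completes the description.
    for filling in product(*popular_values):
        description = "".join(known_values) + "".join(filling)
        # Appending: one character string, and thus one key, per provider.
        for provider_id in provider_ids:
            text = (description + provider_id).encode("utf-8")
            keys.append(hashlib.sha1(text).hexdigest())
    return keys
```

For one missing value, three popular values, and two known providers, the sketch produces the six keys of the scenario above. For a complete description, `popular_values` is empty and one key per known provider is produced.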

3.2.3. Topology Control and Function Tuning for WSN

Since DAs are mobile in directory-based mode, the mobility has a strong impact on the topology of the WSN. Moreover, the unpredictable node failures also exert influence on the topology of the WSN. In our model, the topology control for WSN consists of two parts: local location optimization and global location optimization.

The local location optimization (LLO) is performed between a DA and the UAs which are communicating with the DA. For an individual DA, we divide its 360-degree radio range into six equal adjacent areas: $a_{i} (i = 1,2, \dots, 6)$ . As shown in Figure 4, the upward direction represents north.

Figure 4

Six areas.

The two salient features of the local location optimization are privacy preservation and energy conservation. As a long communication distance requires more energy for radio transmission, it is beneficial that the communication distance between a DA and a UA could be shortened once they get in touch with each other. In our model, the result of a service query is sent to a UA by the original DA it queried. This procedure makes the shortening of communication distance even more important. Since the mobile nodes are equipped with omni-directional antennas, a DA has no idea of where a UA is and therefore is unable to determine a correct moving direction. For the sake of privacy protection, a UA is unwilling to share its location in any form (e.g., GPS coordinates). To address this problem, we develop a direction-probing algorithm to determine a moving direction for a DA. As shown in Figure 5, this algorithm contains a four-step movement: $s_{i} (i = 1,2, \dots, 4)$ .

Figure 5

Four-step movement.

Considering a DA whose original location is $(x_{0}, y_{0})$ and a UA whose location is unknown, the DA performs the following movements consecutively: $s_{1}(N, \Delta y)$, $s_{2}(E, \Delta x)$, $s_{3}(S, \Delta y)$, and $s_{4}(W, \Delta x)$. The values of $\Delta x$ and $\Delta y$ are both positive. After a four-step movement, the DA returns to its original location $(x_{0}, y_{0})$. During each step, the variation of signal intensity is recorded. We denote an increase and a decrease of signal intensity by + and −, respectively. Then, the moving direction of the DA is determined according to Table 1.

Table 1

Moving direction determination.

Area	N	W	S	E
$a_{1}$	+	−	−	−
$a_{2}$	+	−	−	+
$a_{3}$	−	−	+	+
$a_{4}$	−	−	+	−
$a_{5}$	−	+	+	−
$a_{6}$	+	+	−	−
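The lookup defined by Table 1 can be sketched as a small dictionary. The names `AREA_BY_PATTERN`, `signal_variation`, and `determine_area` are hypothetical. To avoid any ambiguity about ordering, the sketch keys each recorded variation by its direction and builds the pattern in the column order of Table 1 (N, W, S, E), which differs from the order in which the four steps are executed (N, E, S, W).

```python
# Rows of Table 1, written in the column order of the table (N, W, S, E).
AREA_BY_PATTERN = {
    "+---": "a1",
    "+--+": "a2",
    "--++": "a3",
    "--+-": "a4",
    "-++-": "a5",
    "++--": "a6",
}


def signal_variation(before, after):
    """Return "+" for an increase of signal intensity and "-" otherwise."""
    return "+" if after > before else "-"


def determine_area(variation_by_direction):
    """Look up the area of a UA from the variation recorded per direction.

    `variation_by_direction` maps "N", "E", "S", and "W" to "+" or "-".
    Returns None when the pattern does not appear in Table 1.
    """
    pattern = "".join(variation_by_direction[d] for d in ("N", "W", "S", "E"))
    return AREA_BY_PATTERN.get(pattern)
```

For example, a signal which increases only during the northward step maps to $a_{1}$, and a signal which increases during the northward and eastward steps maps to $a_{2}$.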

When several UAs are communicating with the DA, the desired moving directions related to different UAs may differ. Generally, the energy consumption is considered to be in direct proportion to the communication distance. Hence, in order to reduce the total energy consumption, the sum of the distances between the DA and each UA should be as small as possible. Considering two UAs $\mathit{ua}_{1}$ and $\mathit{ua}_{2}$, the communication distances between the two UAs and the DA are denoted by $r_{1}$ and $r_{2}$, respectively. In addition, we denote the signal intensities of the two UAs by $s_{1}$ and $s_{2}$, respectively. We assume that the signal intensity and the communication distance follow the inverse square law $s_{i} = \delta / r_{i}^{2}$, where $\delta$ is a constant coefficient and $i = 1, 2$. In the inequality $r_{1} + r_{2} \geq 2\sqrt{r_{1} \cdot r_{2}}$, equality holds if and only if $r_{1} = r_{2}$. Moreover, $r_{1} = r_{2}$ is equivalent to $s_{1} = s_{2}$. Similarly, for $n$ UAs, equality in $r_{1} + r_{2} + \dots + r_{n} \geq n\sqrt[n]{r_{1} \cdot r_{2} \cdot \dots \cdot r_{n}}$ holds if and only if $r_{1} = r_{2} = \dots = r_{n}$, that is, $s_{1} = s_{2} = \dots = s_{n}$. Thus, the moving direction of the DA is adjusted based on the idea of equalizing the signal intensities of all UAs. However, when there are several UAs, this equalization process could be complicated. Alternatively, we choose to synthesize the $n$ moving directions determined for the $n$ UAs with weight values. The weight values of the UAs are calculated in inverse proportion to their signal intensities. The detailed direction-probing algorithm is given in Algorithm 1. Once a final moving direction of the DA is obtained, the DA starts to move along that direction. When the $n$ signal intensities become approximately equal to each other, the DA stops moving. Each time a new UA gets in touch with the DA or an existing session is terminated, a new direction-probing procedure is initiated.

Algorithm 1: Direction-probing procedure.

Direction-Probing $(d a, u a [n])$

(1) original signal intensity $u a [j] . s (0)$

(2) for $i \leftarrow 1$ to 4

(3) for $j \leftarrow 1$ to n

(4) if $u a [j] . s (i) > u a [j] . s (i - 1)$ then

(5) signal variation $u a [j] . v (i) \leftarrow$ “+”

(6) else signal variation $u a [j] . v (i) \leftarrow$ “−”

(7) end if

(8) $s [j] [i] \leftarrow u a [j] . v (i)$

(9) end for

(10) end for

(11) for $j \leftarrow 1$ to n

(12) switch $(s [j])$

(13) case “ $+ - - -$ ”: $u a [j] . d = a_{1}$ break

(14) case “ $+ - - +$ ”: $u a [j] . d = a_{2}$ break

(15) case “ $- - + +$ ”: $u a [j] . d = a_{3}$ break

(16) case “ $- - + -$ ”: $u a [j] . d = a_{4}$ break

(17) case “ $- + + -$ ”: $u a [j] . d = a_{5}$ break

(18) case “ $+ + - -$ ”: $u a [j] . d = a_{6}$ break

(19) end switch

(20) end for

(21) ${u a}^{'} [n] = o r d e r (a s c e n d i n g, u a [n], u a [n] . s (0))$

(22) $s . s u m \leftarrow 0$

(23) for $j \leftarrow 1$ to n

(24) $s . s u m \leftarrow s . s u m + {u a}^{'} [j] . s (0)$

(25) end for

(26) for $j \leftarrow 1$ to n

(27) ${u a}^{'} [j] . w = s . s u m / {u a}^{'} [j] . s (0)$

(28) end for

(29) $d a . d = s y n t h e s i s ({u a}^{'} [j] . d, {u a}^{'} [j] . w)$
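Steps (21) to (29) of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the text does not specify the synthesis operator, so the sketch assumes that each per-UA direction is given as an angle in radians and that the synthesis is a weighted sum of unit vectors. The function name `synthesize_direction` is hypothetical.

```python
import math


def synthesize_direction(directions, intensities):
    """Combine per-UA directions (radians) into a single DA heading.

    Weights follow steps (22)-(27) of Algorithm 1: w_j = sum(s) / s_j,
    so a UA with a weaker signal pulls the DA harder.
    """
    if not directions or len(directions) != len(intensities):
        raise ValueError("need one intensity per direction")
    if any(s <= 0 for s in intensities):
        raise ValueError("signal intensities must be positive")

    total = sum(intensities)
    weights = [total / s for s in intensities]

    # Assumption: synthesis is a weighted sum of unit vectors.
    x = sum(w * math.cos(a) for w, a in zip(weights, directions))
    y = sum(w * math.sin(a) for w, a in zip(weights, directions))
    return math.atan2(y, x), weights
```

With two UAs at right angles, the resulting heading leans toward the UA whose signal is weaker, which is the behavior that the inverse weighting is meant to produce.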

The global location optimization (GLO) is performed between DAs and is based on the direction-probing algorithm described for the local location optimization. Unlike the local location optimization, the global location optimization concentrates on topology maintenance.

As a DA in the WSN has a maximum transmission range, in order to keep the connectivity of the WSN, the relative positions among the DAs should be carefully controlled. The global connectivity of the WSN could be decomposed into several groups of connectivity. In particular, we develop a connectivity maintenance algorithm to maintain the local connectivity of three DAs. Considering an individual DA which is denoted by $d a$ , by the Chord protocol, $d a$ has a predecessor and a successor. We denote the two DAs by $d a . p r e d e c e s s o r$ and $d a . s u c c e s s o r$ , respectively. The signal intensity between $d a$ and $d a . p r e d e c e s s o r$ is denoted by $d a . p r e d e c e s s o r . s$ . The signal intensity between $d a$ and $d a . s u c c e s s o r$ is denoted by $d a . s u c c e s s o r . s$ . When $d a . p r e d e c e s s o r . s$ or $d a . s u c c e s s o r . s$ keeps decreasing, $d a$ becomes aware of the fact that its predecessor/successor is moving away from it. Once $d a . p r e d e c e s s o r . s$ or $d a . s u c c e s s o r . s$ is lower than a threshold $s_{l o w}$ , $d a$ initiates a direction-probing procedure to determine the direction of the predecessor/successor. Meanwhile, it sends a SLOW_DOWN message to the predecessor/successor for notification. Once a moving direction is determined, $d a$ starts to move toward the predecessor/successor in order to prevent the signal intensity from getting too low. In addition, the predecessor/successor which receives the SLOW_DOWN message will adjust its movement for the purpose of getting close to $d a$ . Once the signal intensity between $d a$ and the predecessor/successor reaches a certain level, the predecessor/successor stops getting close to $d a$ . We denote the signal intensity of this level by $s_{n o r m a l}$ . Note that when both $d a . p r e d e c e s s o r . s$ and $d a . s u c c e s s o r . s$ are greater than or equal to $s_{n o r m a l}$ , the $d a$ stops moving. 
When the signal intensities of both $d a . p r e d e c e s s o r$ and $d a . s u c c e s s o r$ fall below the threshold, a compromise between the two directions might be needed. In this case, a composite moving direction is used by $d a$ . The detailed connectivity maintenance algorithm is illustrated in Algorithm 2. Since the energy of a DA is limited, the function of connectivity maintenance stops when the remaining energy of a DA is critically low; this is indicated by the last statement of Algorithm 2. A complete function tuning algorithm is described later.

Algorithm 2: Connectivity maintenance procedure.

$C o n n e c t i v i t y M a i n t e n a n c e (d a, d a . p r e d e c e s s o r, d a . s u c c e s s o r)$

(1) repeat

(2) $F L A G_{1} \leftarrow f a l s e$

(3) $F L A G_{2} \leftarrow f a l s e$

(4) if $d a . p r e d e c e s s o r . s < s_{low}$ then

(5) $F L A G_{1} \leftarrow t r u e$

(6) $d a . s e n d (SLOW_DOWN, d a . p r e d e c e s s o r)$

(7) $d a . d_{1} = d a . d i r e c t i o n - p r o b i n g (d a, d a . p r e d e c e s s o r)$

(8) if $d a . p r e d e c e s s o r$ received SLOW_DOWN then

(9) $d a . p r e d e c e s s o r . d = d a . p r e d e c e s s o r . d i r e c t i o n - p r o b i n g (d a . p r e d e c e s s o r, d a)$

(10) $d a . p r e d e c e s s o r . m o v e (d a . p r e d e c e s s o r . d)$

(11) end if

(12) end if

(13) if $d a . s u c c e s s o r . s < s_{low}$ then

(14) $F L A G_{2} \leftarrow t r u e$

(15) $d a . s e n d (SLOW_DOWN, d a . s u c c e s s o r)$

(16) $d a . d_{2} = d a . d i r e c t i o n - p r o b i n g (d a, d a . s u c c e s s o r)$

(17) if $d a . s u c c e s s o r$ received SLOW_DOWN then

(18) $d a . s u c c e s s o r . d = d a . s u c c e s s o r . d i r e c t i o n - p r o b i n g (d a . s u c c e s s o r, d a)$

(19) $d a . s u c c e s s o r . m o v e (d a . s u c c e s s o r . d)$

(20) end if

(21) end if

(22) if $F L A G_{1} & & F L A G_{2} = = t r u e$ then

(23) $d a . p r e d e c e s s o r . w = (d a . p r e d e c e s s o r . s + d a . s u c c e s s o r . s) / d a . p r e d e c e s s o r . s$

(24) $d a . s u c c e s s o r . w = (d a . p r e d e c e s s o r . s + d a . s u c c e s s o r . s) / d a . s u c c e s s o r . s$

(25) $d a . d = s y n t h e s i s (d a . d_{1}, d a . d_{2}, d a . p r e d e c e s s o r . w, d a . s u c c e s s o r . w)$

(26) else if $F L A G_{1} = = t r u e$ then

(27) $d a . d = d a . d_{1}$

(28) else if $F L A G_{2} = = t r u e$ then

(29) $d a . d = d a . d_{2}$

(30) end if

(31) end if

(32) end if

(33) $d a . m o v e (d a . d)$

(34) if $d a . p r e d e c e s s o r . s ⩾ s_{normal}$ then

(35) $d a . p r e d e c e s s o r . s t o p (d a . p r e d e c e s s o r . d)$

(36) end if

(37) if $d a . s u c c e s s o r . s ⩾ s_{normal}$ then

(38) $d a . s u c c e s s o r . s t o p (d a . s u c c e s s o r . d)$

(39) end if

(40) if $(d a . p r e d e c e s s o r . s ⩾ s_{normal}) & & (d a . s u c c e s s o r . s ⩾ s_{normal})$ then

(41) $d a . s t o p (d a . d)$

(42) end if

(43) until $E (d a) ⩽ e_{γ}$
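The direction choice in steps (22) to (29) of Algorithm 2 can be sketched as below. The sketch is illustrative only: it assumes that the probed directions $d_1$ and $d_2$ are available as two-dimensional vectors and that the synthesis is a weighted vector sum, neither of which is fixed by the text. The name `choose_heading` is hypothetical.

```python
def choose_heading(pred_s, succ_s, d1, d2, s_low):
    """Return the heading of da as a 2-D vector, or None if no move is needed.

    pred_s, succ_s: signal intensities toward predecessor and successor.
    d1, d2: probed directions toward predecessor and successor (assumed vectors).
    """
    flag1 = pred_s < s_low  # predecessor is drifting away
    flag2 = succ_s < s_low  # successor is drifting away

    if flag1 and flag2:
        # Weights as in steps (23)-(24): the weaker link gets the larger weight.
        total = pred_s + succ_s
        w1 = total / pred_s
        w2 = total / succ_s
        # Assumption: synthesis is a weighted vector sum.
        return (w1 * d1[0] + w2 * d2[0], w1 * d1[1] + w2 * d2[1])
    if flag1:
        return d1
    if flag2:
        return d2
    return None
```

Returning `None` when neither flag is set makes explicit that $d a$ has no direction to move along in that iteration.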

As energy conservation is critical to both the DAs and the lifetime of the WSN, we develop a function tuning algorithm based on the remaining energy of a DA. Moreover, this algorithm also assists the topology maintenance. We denote the remaining energy of $d a$ by $E (d a)$ . Specifically, the remaining energy of a DA is classified into four statuses according to three critical values $e_{α}$ , $e_{β}$ , and $e_{γ}$ , where $e_{α} > e_{β} > e_{γ}$ . The four energy statuses $S_{n}$ , $S_{l}$ , $S_{a}$ , and $S_{s}$ are detailed in Table 2.

Table 2

Energy statuses and descriptions.

Status	Description
$S_{n}$	Normal status. The remaining energy is $E (d a) > e_{α}$
$S_{l}$	Low status. The remaining energy is $e_{β} < E (d a) ⩽ e_{α}$
$S_{a}$	Alert status. The remaining energy is $e_{γ} < E (d a) ⩽ e_{β}$
$S_{s}$	Serious status. The remaining energy is $E (d a) ⩽ e_{γ}$

When $d a$ is in the normal status, it is fully functional. Namely, it accepts service registrations, handles service queries, sends service replies, performs local location optimization and global location optimization, and so forth. Once $d a$ enters the low status, it stops performing local location optimization. When it turns into the alert status, an ALERT message is sent to both $d a . p r e d e c e s s o r$ and $d a . s u c c e s s o r$ . As soon as $d a . p r e d e c e s s o r$ and $d a . s u c c e s s o r$ receive the ALERT message, they both stop the local location optimization temporarily (even when they are in the normal status). Now, the three DAs $d a$ , $d a . p r e d e c e s s o r$ , and $d a . s u c c e s s o r$ are in preparation for the future serious status of $d a$ . Once $d a$ enters the serious status, it disables all functions except for performing a data transfer. By the Chord protocol, a node which is about to leave the network should transfer all the $〈k e y, d a t a i t e m〉$ pairs it is responsible for to its successor. When the transfer completes, the leaving node notifies its predecessor to change successor to the successor of the leaving node. Similarly, the successor of the leaving node is notified to change predecessor to the predecessor of the leaving node. Finally, the leaving node is completely isolated from the Chord network, the successor of its former predecessor is now its former successor, and the predecessor of its former successor is now its former predecessor. Then, the former predecessor of $d a$ sets the availability of $d a$ to LOGOUT in a monitoring token; the information about the monitoring token is provided later in Section 3.4.

As described above, when the status of a DA changes from normal to low, there is only one function that is removed, namely, the local location optimization. All the remaining functions of a DA are operational in three statuses: normal, low, and alert. The function of global location optimization is removed when the status of a DA changes to serious. Hence, this function tuning algorithm significantly assists the connectivity maintenance. The detailed function tuning algorithm is illustrated in Algorithm 3.

Algorithm 3: Function tuning procedure.

$F u n c t i o n T u n i n g (d a, e_{α}, e_{β}, e_{γ})$

(1) if $E (d a) > e_{α}$ then $S = S_{n}$

(2) else if $e_{β} < E (d a) ⩽ e_{α}$ then $S = S_{l}$

(3) else if $e_{γ} < E (d a) ⩽ e_{β}$ then $S = S_{a}$

(4) else if $E (d a) ⩽ e_{γ}$ then $S = S_{s}$

(5) end if

(6) switch $(S)$

(7) case $S_{n}$ : break

(8) case $S_{l} : d a . f u n c t i o n \leftarrow d a . f u n c t i o n$ ∖ ${LLO}$

(9) break

(10) case $S_{a} : d a . s e n d (ALERT, d a . p r e d e c e s s o r)$

(11) $d a . s e n d (ALERT, d a . s u c c e s s o r)$

(12) break

(13) case $S_{s} : d a . f u n c t i o n \leftarrow \emptyset$

(14) $d a . t r a n s f e r (〈 k e y, URL 〉, d a . s u c c e s s o r)$

(15) $d a . s e n d (CHANGE_SUCCESSOR (d a . s u c c e s s o r), d a . p r e d e c e s s o r)$

(16) $d a . s e n d (CHANGE_PREDECESSOR (d a . p r e d e c e s s o r), d a . s u c c e s s o r)$

(17) break

(18) end switch
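The classification and the function sets implied by Table 2 and Algorithm 3 can be sketched in Python as follows. The function labels in `ALL_FUNCTIONS` are hypothetical names chosen for illustration, and the message sending and data transfer of the alert and serious statuses are omitted.

```python
from enum import Enum


class Status(Enum):
    NORMAL = "S_n"
    LOW = "S_l"
    ALERT = "S_a"
    SERIOUS = "S_s"


# Hypothetical labels for the functions of a DA listed in the text.
ALL_FUNCTIONS = frozenset({"registration", "query", "reply", "LLO", "GLO"})


def classify(energy, e_alpha, e_beta, e_gamma):
    """Map the remaining energy E(da) to a status, following Table 2."""
    if not e_alpha > e_beta > e_gamma:
        raise ValueError("expected e_alpha > e_beta > e_gamma")
    if energy > e_alpha:
        return Status.NORMAL
    if energy > e_beta:
        return Status.LOW
    if energy > e_gamma:
        return Status.ALERT
    return Status.SERIOUS


def active_functions(status):
    """Functions that remain enabled in each status."""
    if status is Status.NORMAL:
        return ALL_FUNCTIONS
    if status in (Status.LOW, Status.ALERT):
        # Local location optimization is dropped from the low status onward.
        return ALL_FUNCTIONS - {"LLO"}
    # Serious status: everything is disabled; only the data transfer remains.
    return frozenset()
```

Because the thresholds are inclusive on the upper side, a DA whose remaining energy equals $e_{α}$ is already in the low status.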

3.3. Directory-Less Mode

In directory-less mode, all DAs in the MANET are stationary. However, they still form a Chord network. The other nodes in the MANET, namely, UAs and SAs, are mobile. Though the WSN based on the Chord protocol still exists, there is no topology control or key searching based on the Chord network. Thus, the directory-less mode is much simpler than the directory-based mode. For DAs and SAs, the service registration is operational as before, in preparation for a mode switch. The mode switch between the directory-based mode and the directory-less mode is detailed later in Section 3.4. In classic directory-less architectures, there are only UAs and SAs. Hence, there is no so-called relaying function associated with DAs. The service discovery is conducted only through service queries issued by UAs and service advertisements issued by SAs. In our model, the DAs provide a relaying function for service queries, service advertisements, and service replies. Moreover, since DAs possess a large amount of service information, a DA could issue a service reply to answer a service query based on the service information it has (service information it originally knows and service information it learns through the relaying). For the purpose of mitigating the flooding problem and improving the performance of service discovery, we develop a two-hop zone scheme for the directory-less mode.

3.3.1. Two-Hop Zone Scheme

For a node $n_{i}$ in the MANET network, the two-hop zone of $n_{i}$ consists of two parts: one-hop neighbors and two-hop neighbors. A one-hop neighbor $n_{o}$ of node $n_{i}$ is a node which is one hop away from node $n_{i}$ . Though there might be several paths between $n_{i}$ and $n_{o}$ , the shortest path between them is one hop. The sufficient and necessary condition for a node $n_{o}$ to be a one-hop neighbor of node $n_{i}$ is that they are able to directly communicate with each other. The degree of node $n_{i}$ , namely, the number of one-hop neighbors of $n_{i}$ , is denoted by $d_{(i, 1)}$ . Then, we denote the set of one-hop neighbors of $n_{i}$ by

\begin{matrix} N_{(i, 1)} = \{n_{i 1}, n_{i 2}, \dots, n_{i d_{(i, 1)}}\} . \end{matrix}

(1)

Similarly, a two-hop neighbor $n_{t}$ of node $n_{i}$ is a node which is two hops away from node $n_{i}$ . Though there might be several paths between $n_{i}$ and $n_{t}$ , the shortest path between them is two hops. The sufficient and necessary condition for a node $n_{t}$ to be a two-hop neighbor of node $n_{i}$ is that they are not able to directly communicate with each other and node $n_{t}$ is a one-hop neighbor of node $n_{i}$ 's one-hop neighbor(s). The number of two-hop neighbors of $n_{i}$ is denoted by $d_{(i, 2)}$ . The set of two-hop neighbors of $n_{i}$ is

\begin{matrix} N_{(i, 2)} = N_{(i 1,1)} \cup N_{(i 2,1)} \cup \dots \cup N_{(i d_{(i, 1)}, 1)} ∖ N_{(i, 1)} . \end{matrix}

(2)

By the definition of a two-hop neighbor, the number of two-hop neighbors of node $n_{i}$ can be calculated as

\begin{matrix} d_{(i, 2)} = |⋃_{n_{i j} \in N_{(i, 1)}} N_{(i j, 1)} ∖ N_{(i, 1)}| - 1 . \end{matrix}

(3)

Note that node $n_{i}$ is a one-hop neighbor of its one-hop neighbor(s); namely,

\begin{matrix} n_{i} \in ⋃_{n_{i j} \in N_{(i, 1)}} N_{(i j, 1)} ∖ N_{(i, 1)} . \end{matrix}

(4)

Thus, node $n_{i}$ should be excluded from the calculation of $d_{(i, 2)}$ .
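The neighbor sets and the count in (3) can be checked with a short Python sketch. It assumes that the topology is given as an adjacency dictionary mapping each node to its one-hop neighbors; this representation and the function names are hypothetical.

```python
def two_hop_neighbors(adj, i):
    """Nodes whose shortest path to i is exactly two hops."""
    n1 = set(adj[i])
    union = set()
    for j in n1:
        union |= set(adj[j])
    # Remove the one-hop neighbors and node i itself.
    return union - n1 - {i}


def two_hop_count(adj, i):
    """Equation (3): size of the union minus N_(i,1), minus one for node i."""
    n1 = set(adj[i])
    if not n1:
        return 0  # an isolated node has no two-hop neighbors
    union = set()
    for j in n1:
        union |= set(adj[j])
    return len(union - n1) - 1
```

On a path of four nodes 0, 1, 2, 3, node 0 has the single two-hop neighbor 2, and both functions agree on the count.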

For an individual node $n_{i}$ in the MANET network, we denote the frequency of service advertisement of $n_{i}$ by $d f_{i}$ . A service advertisement contains a set of service records. We denote the set of service records contained in a service advertisement by $D (n_{i})$ , where $|D (n_{i})| ⩾ 1$ . During a certain measurement period, the average number of service records sent by node $n_{i}$ is $\bar{d_{i}}$ per advertisement. Then, we have $\bar{d_{i}} ⩾ 1$ . In order to facilitate the presentation of our two-hop zone scheme, we make the following premise.

Premise 1. Let $[t_{a}, t_{b}]$ be a measurement period, where $t_{a} < t_{b}$ . We assume that the period is sufficiently long to observe the behaviors of node $n_{i}$ together with its one-hop neighbors and two-hop neighbors. Moreover, the behavior characteristics of every node considered remain unchanged, such as the frequency of service advertisement, the frequency of service query, and the average number of service records per advertisement.

Node $n_{i}$ receives the service advertisements from all its one-hop neighbors and updates its local database according to the service information contained in them. Since the frequencies of service advertisement vary from node to node, the service advertisements from the one-hop neighbors of node $n_{i}$ arrive asynchronously. Moreover, the frequency of service advertisement of a particular node may change from time to time. Hence, the service advertisements from the one-hop neighbors of node $n_{i}$ arrive randomly. For simplicity, we assume that the frequencies of service advertisement of all nodes in the network remain constant during the measurement period $[t_{a}, t_{b}]$ . In essence, this assumption is given by Premise 1.

We denote the total number of service records contained in the service advertisements received by node $n_{i}$ by $s d_{(i, r)}$ , and it is computed as

\begin{matrix} s d_{(i, r)} = \sum_{n_{i j} \in N_{(i, 1)}} {\bar{d}}_{(i j)} \cdot ⌊d f_{(i j)} \cdot (t_{b} - t_{a})⌋ . \end{matrix}

(5)

For each one-hop neighbor $n_{i j}$ of $n_{i}$ , the number of service records advertised by $n_{i j}$ is calculated by multiplying the average number of service records per advertisement by the number of service advertisements issued during the measurement period $[t_{a}, t_{b}]$ . Premise 1 guarantees that each one-hop neighbor of $n_{i}$ performs the service advertisement at least once during the measurement period $[t_{a}, t_{b}]$ . Thus, $d f_{(i j)} \cdot (t_{b} - t_{a})$ is greater than or equal to 1. When this product is not an integer, the floor operation rounds it down to the largest integer that does not exceed $d f_{(i j)} \cdot (t_{b} - t_{a})$ .

If node $n_{i}$ is a UA or DA, it forwards the received service advertisements to all its one-hop neighbors. When node $n_{i}$ is an SA, besides forwarding the received service advertisements, it also issues service advertisements. We denote the total number of service records sent by node $n_{i}$ by $s d_{(i, s)}$ , and it consists of two parts:

\begin{matrix} s d_{(i, s)} = s d_{(i)} + s d_{(i, f)} . \end{matrix}

(6)

Node $n_{i}$ issues its own service advertisements to all its one-hop neighbors at the frequency of $d f_{i}$ . The total number of service records issued by node $n_{i}$ is denoted by $s d_{(i)}$ , which is computed as

\begin{matrix} s d_{(i)} = \bar{d_{i}} \cdot ⌊d f_{i} \cdot (t_{b} - t_{a})⌋ . \end{matrix}

(7)

$s d_{(i)}$ is calculated by multiplying the average number of service records per advertisement by the number of service advertisements issued during the measurement period $[t_{a}, t_{b}]$ . Premise 1 guarantees that node $n_{i}$ performs the service advertisement at least once during the measurement period $[t_{a}, t_{b}]$ . Hence, $d f_{i} \cdot (t_{b} - t_{a})$ is greater than or equal to 1. When this product is not an integer, the floor operation rounds it down to the largest integer that does not exceed $d f_{i} \cdot (t_{b} - t_{a})$ .

Node $n_{i}$ forwards the service advertisements received from other nodes. The total number of service records forwarded by node $n_{i}$ is denoted by $s d_{(i, f)}$ . Among the service advertisements received by $n_{i}$ , those whose TTL values are greater than 0 should be forwarded by node $n_{i}$ . As our two-hop zone scheme stipulates that the initial TTL value of a service advertisement is 2, the TTL value of each service advertisement received by node $n_{i}$ is either 1 or 0. Thus, an original service advertisement issued by node $n_{i}$ reaches only its one-hop and two-hop neighbors. In other words, the propagation range of an original service advertisement is limited to the two-hop zone of the issuing node. This characteristic mitigates the flooding problem and reduces the message overhead imposed on the whole network.
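The effect of the initial TTL on the propagation range can be illustrated with a small flooding sketch in Python. The sketch makes two assumptions that the text does not state explicitly: the sender decrements the TTL before transmitting, and each node processes a given message only once. The adjacency-dictionary representation and the name `coverage` are hypothetical.

```python
from collections import deque


def coverage(adj, source, initial_ttl):
    """Return the set of nodes reached by a message flooded from source.

    Assumed convention: the TTL is decremented on transmission, and a
    receiver forwards the message only if the TTL it received is > 0.
    """
    seen = {source}
    queue = deque([(source, initial_ttl)])
    while queue:
        node, ttl = queue.popleft()
        if ttl <= 0:
            continue  # TTL exhausted: the message is not forwarded further
        for neighbor in adj[node]:
            if neighbor not in seen:  # assumed duplicate suppression
                seen.add(neighbor)
                queue.append((neighbor, ttl - 1))
    return seen - {source}
```

Under these assumptions, an initial TTL of 2 confines a message to the two-hop zone of its issuer, and a larger initial TTL extends the range by one hop per unit.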

Suppose the TTL value of the received service advertisements follows a Poisson distribution

\begin{matrix} P \{T T L = k\} = \frac{λ_{d}^{k} \cdot e^{- λ_{d}}}{k!}, λ_{d} > 0, k = 0,1 . \end{matrix}

(8)

In our two-hop zone scheme, node $n_{i}$ is supposed to forward the received service advertisements to all its one-hop neighbors. However, it is necessary that the service advertisements sent to $n_{i}$ by its own one-hop neighbor $n_{i j}$ should not be forwarded back to $n_{i j}$ by $n_{i}$ . Though the corresponding one-hop neighbor might vary from one service advertisement to another, we denote it by a node $n_{i y}$ universally. Then, the total number of service advertisements forwarded by node $n_{i}$ is computed as

\begin{matrix} λ_{d} \cdot e^{- λ_{d}} \cdot \sum_{n_{i j} \in N_{(i, 1)}} (⌊d f_{(i j)} \cdot (t_{b} - t_{a})⌋ \cdot (d_{(i, 1)} - 1)) . \end{matrix}

(9)

The universal node $n_{i y}$ is excluded by the minus one operation. Note that (9) is the number of service advertisements rather than the number of service records. As mentioned above, the number of service records contained in a valid service advertisement is greater than or equal to one. The total number of service records forwarded by node $n_{i}$ is computed as

\begin{array}{rl} s d_{(i, f)} = & λ_{d} \cdot e^{- λ_{d}} \\ & \cdot \sum_{n_{i j} \in N_{(i, 1)}} ({\bar{d}}_{(i j)} \cdot ⌊d f_{(i j)} \cdot (t_{b} - t_{a})⌋ \cdot (d_{(i, 1)} - 1)) . \end{array}

(10)

Combining (5) and (10) we have the total number of service records forwarded by node $n_{i}$ as

\begin{matrix} s d_{(i, f)} = λ_{d} \cdot e^{- λ_{d}} \cdot (d_{(i, 1)} - 1) \cdot s d_{(i, r)} . \end{matrix}

(11)

Then, (6) could be rewritten as

\begin{array}{rl} s d_{(i, s)} = & \bar{d_{i}} \cdot ⌊d f_{i} \cdot (t_{b} - t_{a})⌋ \\ & + λ_{d} \cdot e^{- λ_{d}} \cdot (d_{(i, 1)} - 1) \cdot s d_{(i, r)} . \end{array}

(12)

That is to say, each one-hop neighbor of node $n_{i}$ receives $s d_{(i, s)}$ service records from $n_{i}$ .
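As a numerical check of (5), (11), and (12), the following Python sketch evaluates the three quantities for a given set of one-hop neighbors. Each neighbor is represented by a hypothetical pair of its average number of records per advertisement and its advertisement frequency; the function names are likewise illustrative.

```python
import math


def records_received(neighbors, period):
    """Equation (5): neighbors is a list of (d_bar, df) pairs."""
    return sum(d_bar * math.floor(df * period) for d_bar, df in neighbors)


def records_forwarded(neighbors, period, lam_d):
    """Equation (11): P{TTL = 1} * (d_(i,1) - 1) * sd_(i,r)."""
    p_ttl_one = lam_d * math.exp(-lam_d)
    degree = len(neighbors)
    return p_ttl_one * (degree - 1) * records_received(neighbors, period)


def records_sent(d_bar_i, df_i, neighbors, period, lam_d):
    """Equation (12): own records plus forwarded records."""
    own = d_bar_i * math.floor(df_i * period)
    return own + records_forwarded(neighbors, period, lam_d)
```

For example, with two neighbors described by (2, 0.5) and (3, 0.25) and a period of length 10, the neighbors advertise 5 and 2 times, so $s d_{(i, r)} = 2 \cdot 5 + 3 \cdot 2 = 16$ .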

When node $n_{i}$ receives a service query, it checks its local database. If the service query could be answered locally, there is no need to send it out. Otherwise, node $n_{i}$ sends it to its one-hop neighbors. Every one-hop neighbor of $n_{i}$ will receive a duplicate of the service query. While a service query is traveling through the network, the intermediate nodes will be recorded successively in a travel path. This path information stored in the service query plays an important role for the transmission of a service reply.

We denote the total number of service queries sent by node $n_{i}$ by $s s_{(i, s)}$ , and it consists of two parts:

\begin{matrix} s s_{(i, s)} = s s_{(i)} + s s_{(i, f)} . \end{matrix}

(13)

Suppose node $n_{i}$ issues its own service queries to all its one-hop neighbors at the frequency of $s f_{i}$ . If the average number of service requests contained in the service queries issued by node $n_{i}$ is $\bar{s_{i}}$ per service query, the total number of service requests issued by $n_{i}$ is denoted by $s s_{(i)}$ , which is computed as

\begin{matrix} s s_{(i)} = \bar{s_{i}} \cdot ⌊s f_{i} \cdot (t_{b} - t_{a})⌋ . \end{matrix}

(14)

In our two-hop zone scheme, we stipulate that a service query contains exactly one service request. Hence, $\bar{s_{i}}$ is identically equal to 1. In the remainder of this paper, we use the terms service request and service query interchangeably.

For the service queries received from other nodes, node $n_{i}$ checks its local database to determine whether a service query could be answered locally. If a service query could be answered locally, node $n_{i}$ will issue a service reply designated to the node which originally issued the service query. The service queries which cannot be answered by node $n_{i}$ are handled according to their TTL values. If the TTL value of a service query is 0, it will be dropped by node $n_{i}$ without any forwarding. The service queries whose TTL values are greater than 0 will be forwarded by node $n_{i}$ . Our two-hop zone scheme stipulates that the initial TTL value of a service query is 4. Thus, the TTL values of the service queries received by node $n_{i}$ are within the set $\{3,2, 1,0\}$ . Meanwhile, the coverage of an original service query issued by node $n_{i}$ is two two-hop zones. In other words, an original service query is restricted to two two-hop zones of the issuing node. As with the service advertisement, the upper bound on the initial TTL of a service query alleviates the flooding problem and reduces the message overhead imposed on the whole network.

Suppose the TTL value of the received service queries follows a Poisson distribution

\begin{matrix} P \{T T L = k\} = \frac{λ_{s}^{k} e^{- λ_{s}}}{k!}, λ_{s} > 0, k = 0,1, 2,3 . \end{matrix}

(15)

We assume that all the service queries sent to node $n_{i}$ come directly from its one-hop neighbors. We denote the total number of service queries received by node $n_{i}$ by $s s_{(i, r)}$ ; it could be calculated as

\begin{matrix} s s_{(i, r)} = \sum_{n_{i j} \in N_{(i, 1)}} {\bar{s}}_{(i j)} \cdot ⌊s f_{(i j)} \cdot (t_{b} - t_{a})⌋ . \end{matrix}

(16)

Suppose that node $n_{i}$ is able to answer $a_{i}$ percent of the service queries sent to it. As with the service advertisement, a service query sent to $n_{i}$ by its one-hop neighbor $n_{i j}$ should not be forwarded back to $n_{i j}$ by $n_{i}$ . Then, the number of service queries that should be forwarded by node $n_{i}$ is obtained analogously:

\begin{array}{rl} s s_{(i, f)} = & (\frac{λ_{s} e^{- λ_{s}}}{1!} + \frac{λ_{s}^{2} e^{- λ_{s}}}{2!} + \frac{λ_{s}^{3} e^{- λ_{s}}}{3!}) \\ & \cdot (d_{(i, 1)} - 1) \cdot (1 - a_{i}) \cdot s s_{(i, r)} . \end{array}

(17)

Then, (13) could be rewritten as

\begin{array}{rl} s s_{(i, s)} = & ⌊s f_{i} \cdot (t_{b} - t_{a})⌋ + (\frac{λ_{s} e^{- λ_{s}}}{1!} + \frac{λ_{s}^{2} e^{- λ_{s}}}{2!} + \frac{λ_{s}^{3} e^{- λ_{s}}}{3!}) \\ & \cdot (d_{(i, 1)} - 1) \cdot (1 - a_{i}) \cdot s s_{(i, r)} . \end{array}

(18)
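Equations (16) to (18) can be evaluated with the following Python sketch. It uses $\bar{s} = 1$ as stipulated by the scheme and treats $a_{i}$ as a fraction between 0 and 1, matching the factor $(1 - a_{i})$ in (17). The function names and the list of neighbor query frequencies are hypothetical.

```python
import math


def p_forwardable(lam_s):
    """Probability mass on TTL = 1, 2, 3 under the Poisson form in (15)."""
    return sum(
        lam_s ** k * math.exp(-lam_s) / math.factorial(k) for k in (1, 2, 3)
    )


def queries_received(neighbor_sf, period):
    """Equation (16) with one service request per service query."""
    return sum(math.floor(sf * period) for sf in neighbor_sf)


def queries_forwarded(neighbor_sf, period, lam_s, a_i):
    """Equation (17)."""
    degree = len(neighbor_sf)
    received = queries_received(neighbor_sf, period)
    return p_forwardable(lam_s) * (degree - 1) * (1 - a_i) * received


def queries_sent(sf_i, neighbor_sf, period, lam_s, a_i):
    """Equation (18): own queries plus forwarded queries."""
    own = math.floor(sf_i * period)
    return own + queries_forwarded(neighbor_sf, period, lam_s, a_i)
```

A larger answer ratio $a_{i}$ reduces the forwarded term linearly, which reflects the benefit of nodes that hold more service information locally.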

Compared with service advertisements and service queries, service replies are few in number. In general, the occurrence of a service reply is unpredictable and irregular, so it is not meaningful to impose restrictions on the sending frequency of service replies. Moreover, in order to improve the performance of service discovery, we do not assign a specific upper bound to the initial TTL value of a service reply. A service reply uses the path information contained in the corresponding service query. The service replies designated to node $n_{i}$ correspond to the service queries issued by $n_{i}$ .

We denote the total number of service replies sent by node $n_{i}$ by $s r_{(i, s)}$ ; it consists of two parts:

\begin{matrix} s r_{(i, s)} = s r_{(i)} + s r_{(i, f)} . \end{matrix}

(19)

$s r_{(i)}$ denotes the number of service replies corresponding to the service queries which are answered by node $n_{i}$ itself. As mentioned above, node $n_{i}$ is able to answer $a_{i}$ percent of the service queries sent to it. Hence, $s r_{(i)}$ could be computed as

\begin{matrix} s r_{(i)} = a_{i} \cdot s s_{(i, r)} . \end{matrix}

(20)

Besides sending its own service replies, node $n_{i}$ also forwards the service replies designated to other nodes. Among the service replies received by node $n_{i}$ , those designated to node $n_{i}$ will not be forwarded by $n_{i}$ . For a service reply designated to another node, node $n_{i}$ will try to forward it to the next node according to the path information contained in the service reply. If that next node is not available, the service reply will be dropped by node $n_{i}$ . Now, we denote the total number of service replies received by node $n_{i}$ by $s r_{(i, r)}$ . Suppose that $b_{i}$ percent of the service replies received by node $n_{i}$ are designated to it. Among the service replies which should be forwarded by node $n_{i}$ , we assume that $f_{i}$ percent of them cannot be delivered to the next node, because the next node might have failed or moved out of the radio range of node $n_{i}$ . Then, the total number of service replies forwarded by node $n_{i}$ could be calculated as

\begin{matrix} s r_{(i, f)} = (1 - b_{i}) \cdot (1 - f_{i}) \cdot s r_{(i, r)} . \end{matrix}

(21)

Combining (19), (20), and (21), the total number of service replies sent by node $n_{i}$ could be rewritten as

\begin{matrix} s r_{(i, s)} = a_{i} \cdot s s_{(i, r)} + (1 - b_{i}) \cdot (1 - f_{i}) \cdot s r_{(i, r)} . \end{matrix}

(22)

3.3.2. Level of Connectivity

In order to facilitate a further investigation of the two-hop zone scheme, we propose a novel method to calculate the level of connectivity for the whole network based on the two-hop zone scheme. In the first place, we formulate the level of connectivity of two arbitrary nodes under the two-hop zone scheme. For two arbitrary nodes $n_{i}$ and $n_{j}$ , the relationship between them is denoted by $f_{k} (n_{i}, n_{j})$ , where k indicates the number of hops in the shortest path between $n_{i}$ and $n_{j}$ . As depicted in Figure 6, the shortest path between two arbitrary nodes $n_{i}$ and $n_{j}$ can be divided into five cases. The intermediate nodes are denoted by $n_{m i}$ .

Figure 6

Node relationships under the two-hop zone scheme.

In Figure 6(a), node $n_{i}$ and node $n_{j}$ are able to directly communicate with each other. The number of hops between them is 1. In this case, we have

\begin{matrix} f_{1} (n_{i}, n_{j}) ~ (n_{i} \in N_{(j, 1)}), (n_{j} \in N_{(i, 1)}) . \end{matrix}

(23)

In Figure 6(b), the number of hops between node $n_{i}$ and node $n_{j}$ is 2. They share the same one-hop neighbor $n_{m 1}$ . Node $n_{j}$ is within the two-hop zone of node $n_{i}$ . In this case, we have

\begin{array}{l} f_{2} (n_{i}, n_{j}) & ~ (n_{i} \in N_{(j, 2)}), (n_{j} \in N_{(i, 2)}), \\ (n_{m 1} \in N_{(i, 1)} \cap N_{(j, 1)}) . \end{array}

(24)

In Figure 6(c), the number of hops between node $n_{i}$ and node $n_{j}$ is 3. Node $n_{i}$ has a one-hop neighbor node $n_{m 1}$ and a two-hop neighbor node $n_{m 2}$ . Since node $n_{m 2}$ is a one-hop neighbor of node $n_{j}$ , $n_{j}$ is within the two-hop zone of node $n_{m 2}$ . In this case, we have

\begin{array}{l} f_{3} (n_{i}, n_{j}) & ~ (n_{m 1} \in N_{(i, 1)}), (n_{m 2} \in N_{(i, 2)}), \\ (n_{m 1} \in N_{(j, 2)}), (n_{m 2} \in N_{(j, 1)}) . \end{array}

(25)

In Figure 6(d), the number of hops between node $n_{i}$ and node $n_{j}$ is 4. Node $n_{i}$ has a one-hop neighbor node $n_{m 1}$ and a two-hop neighbor node $n_{m 2}$ . Since node $n_{m 2}$ is a two-hop neighbor of node $n_{j}$ , node $n_{j}$ is within the two-hop zone of node $n_{m 2}$ . Node $n_{i}$ and node $n_{j}$ share the same two-hop neighbor node $n_{m 2}$ . In this case, we have

\begin{array}{l} f_{4} (n_{i}, n_{j}) & ~ (n_{m 1} \in N_{(i, 1)}), (n_{m 2} \in N_{(i, 2)} \cap N_{(j, 2)}), \\ (n_{m 3} \in N_{(j, 1)}), (n_{m 2} \in N_{(m 1,1)} \cap N_{(m 3,1)}) . \end{array}

(26)

In Figure 6(e), the number of hops between node $n_{i}$ and node $n_{j}$ is greater than 4. Node $n_{i}$ has a one-hop neighbor node $n_{m 1}$ and a two-hop neighbor node $n_{m 2}$ . Node $n_{j}$ has a one-hop neighbor node $n_{m 4}$ and a two-hop neighbor node $n_{m 3}$ . The dotted line ellipse indicates that nodes $n_{m 2}$ and $n_{m 3}$ are connected in the same way as node $n_{i}$ and node $n_{j}$ in the previous four cases, namely, $f_{1}$ , $f_{2}$ , $f_{3}$ , and $f_{4}$ . In this case, the relationship between $n_{i}$ and $n_{j}$ is expressed recursively by

\begin{array}{l} f_{> 4} (n_{i}, n_{j}) & ~ (n_{m 1} \in N_{(i, 1)}), (n_{m 2} \in N_{(i, 2)}), (n_{m 3} \in N_{(j, 2)}), \\ (n_{m 4} \in N_{(j, 1)}), f_{k - 4} (n_{m 2}, n_{m 3}), \end{array}

(27)

where $f_{k - 4} (n_{m 2}, n_{m 3})$ indicates the relationship between $n_{m 2}$ and $n_{m 3}$ .

More complex situations can be decomposed and modeled through the iteration of different cases illustrated in Figure 6.

The level of connectivity between node $n_{i}$ and node $n_{j}$ is denoted by $d (n_{i}, n_{j})$ . It is defined as

\begin{matrix} d (n_{i}, n_{j}) = \begin{cases} 0, & k = 0 \\ \frac{1}{k}, & k = 1,2, 3,4 \\ \frac{1}{4 + 1 / d (n_{i}^{''}, n_{j}^{''})}, & k > 4 . \end{cases} \end{matrix}

(28)

For $k = 0$ , there is no path between $n_{i}$ and $n_{j}$ , and we define $d (n_{i}, n_{j}) = 0$ . When $k ⩾ 1$ , the level of connectivity between two nodes monotonically decreases as the number of hops between them increases. For $k = 1,2, 3$ , and 4, the reciprocal of k is used as the value of $d (n_{i}, n_{j})$ . For $k > 4$ , the calculation of $d (n_{i}, n_{j})$ cannot be done within the two-hop zones of $n_{i}$ and $n_{j}$ , so the value of $d (n_{i}, n_{j})$ is determined recursively. A two-hop zone of a node covers two hops, so the two-hop zones of $n_{i}$ and $n_{j}$ together cover four hops. Thus, each time k exceeds a multiple of 4, another calculation of the level of connectivity is required. Then, the number of recursions is $R = ⌊(k - 1) / 4⌋$ .

In the case of $k > 4$ , node $n_{i}^{''}$ is the particular two-hop neighbor of node $n_{i}$ which lies on the shortest path between $n_{i}$ and $n_{j}$ , and node $n_{j}^{''}$ is the corresponding two-hop neighbor of node $n_{j}$ on that path. Taking Figure 6(e) as an example, the level of connectivity between node $n_{i}$ and node $n_{j}$ is computed by

\begin{matrix} d (n_{i}, n_{j}) = \frac{1}{4 + 1 / d (n_{m 2}, n_{m 3})} . \end{matrix}

(29)
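The recursion in (28) and (29) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes that the inner pair $n_{i}^{''}$ , $n_{j}^{''}$ is exactly $k - 4$ hops apart, as suggested by the term $f_{k - 4}$ in (27).

```python
def connectivity(k):
    """Level of connectivity d for two nodes that are k hops apart, per (28).

    Assumption: for k > 4 the inner pair (n_i'', n_j'') is k - 4 hops apart.
    """
    if k < 0:
        raise ValueError("hop count must be non-negative")
    if k == 0:
        return 0.0                      # no path between the two nodes
    if k <= 4:
        return 1.0 / k                  # handled inside the two-hop zones
    return 1.0 / (4.0 + 1.0 / connectivity(k - 4))


def recursions(k):
    """Number of recursions R = floor((k - 1) / 4) for a k-hop path."""
    return (k - 1) // 4 if k > 0 else 0
```

Under the stated assumption, the recursion reduces to $1 / k$ for every $k ⩾ 1$ , because $1 / (4 + (k - 4)) = 1 / k$ .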

Suppose the set of nodes which node $n_{i}$ could communicate with is denoted by set $N_{c} = \{n_{1}, n_{2}, \dots, n_{c}\}$ . Then, we define the level of connectivity for node $n_{i}$ as

\begin{matrix} d (n_{i}) = \sum_{j = 1}^{c} d (n_{i}, n_{j}) . \end{matrix}

(30)

For N nodes in the whole network, there exist $C_{N}^{2} = N (N - 1) / 2$ different pairs among them. We define the level of connectivity for the network as

\begin{matrix} d (N) = \sum_{\forall n_{i} \in N, \forall n_{j} \in N} d (n_{i}, n_{j}), \quad n_{i} \neq n_{j} . \end{matrix}

(31)
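As a rough illustration of (30) and (31), the following Python sketch derives hop counts with a breadth-first search and sums the pairwise levels of connectivity. The adjacency-dictionary representation and the function names are assumptions made for this example. The sketch uses $d = 1 / k$ for every reachable pair, which is what (28) evaluates to if the inner pair in the recursive case is taken to be $k - 4$ hops apart, and it sums over ordered pairs exactly as (31) is written.

```python
from collections import deque


def hop_counts(adj, src):
    """Breadth-first search: hop count from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def node_connectivity(adj, i):
    """d(n_i) as in (30); unreachable nodes contribute 0 (the k = 0 case)."""
    return sum(1.0 / k for n, k in hop_counts(adj, i).items() if n != i)


def network_connectivity(adj):
    """d(N) as in (31), summed over all ordered pairs with n_i != n_j."""
    return sum(node_connectivity(adj, i) for i in adj)


# Example: a three-node path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
```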

Theoretically, with a sufficiently large value of $R$ , there always exists a path between two arbitrary nodes, irrespective of the magnitude of $N$ . As $R$ increases, the value of $d (N)$ keeps increasing until the inflection point $(\hat{l}, \hat{d})$ is reached, where $\hat{d}$ is the maximum of $d (N)$ . For $R ⩾ \hat{l}$ , $d (N) \equiv \hat{d}$ . We define $\hat{l}$ as the degree of convergence of the network: the smaller $\hat{l}$ is, the better the network converges.

3.4. Autonomic Switch between Two Modes

In order to facilitate information updating and network status gathering in the WSN, we introduce a monitoring token which is passed continuously around the Chord network. The autonomic switch between the directory-based mode and the directory-less mode is based on the gathered network status. A round means that the monitoring token has been transmitted by every DA once. The current mode of the service discovery architecture is indicated by a flag called MODE, whose value is either DA-BASED or DA-LESS. In addition, the monitoring token contains two lists: the SA list $sal$ and the DA list $dal$ . The structure of the monitoring token is shown in Figure 7.

Figure 7

Monitoring token.

Though the unified service information management scheme described in Section 3.1.3 stipulates that all nodes in the MANET have knowledge of the properties and values of services, the DAs initially know nothing about any SAs. Thus, we use the monitoring token to perform the information updating associated with SAs. In general, the appearance of a new SA is accompanied by a service registration initiated by the SA. A DA which receives a service registration updates its local database with the service information. When the monitoring token arrives at the DA, the DA checks whether the SA list of the monitoring token contains the SA of the service registration. If the SA is already listed, the DA does nothing for the SA. Otherwise, the DA adds the SA to the SA list of the monitoring token. Meanwhile, the DA checks whether the SA list contains SAs which are unknown to it, and it updates its local database with the information of these SAs. That is to say, when the monitoring token arrives at a DA, the DA initiates a two-way update of the information about SAs.

When an existing SA is about to leave the MANET, it issues a logout message to an arbitrary DA within its radio range. The DA which receives the logout message first removes the service information associated with the SA from its local database. When the monitoring token arrives at the DA, the DA marks a logout flag for the SA in the SA list. During the transmission of the monitoring token, other DAs become aware of the logout of the SA through the two-way update and remove the service information associated with the SA from their local databases. When the monitoring token arrives at the DA the next time, the DA issues a logout acknowledgment to the SA. Once the SA receives the logout acknowledgment, it can leave the network freely and there is no need to log out again.

Although there is no topology control or Chord-based key searching in the directory-less mode, the transmission of the monitoring token remains operational. Thus, the two-way update, service registration, and SA logout are performed as usual. The detailed information updating procedure is shown in Algorithm 4.

Algorithm 4: Information updating procedure.

$InformationUpdating (da, SReg [n], token, SA.logout [m], round (0))$

(1) for each $SA \in token.sal$

(2) if $SA \notin da.sal \;\&\&\; SA.availability \neq LOGOUT$ then

(3) $da.sal \leftarrow da.sal \cup \{SA\}$

(4) end if

(5) if $SA \in da.sal \;\&\&\; SA.availability == LOGOUT$ then

(6) $da.sal \leftarrow da.sal \setminus \{SA\}$

(7) $da.database \leftarrow da.database \setminus \{\text{records of } SA\}$

(8) end if

(9) end for

(10) for $i \leftarrow 1$ to $n$

(11) if $SReg [i] . SA \notin da.sal$ then

(12) $da.sal \leftarrow da.sal \cup \{SReg [i] . SA\}$

(13) end if

(14) if $SReg [i] . SA \notin token.sal$ then

(15) $token.sal \leftarrow token.sal \cup \{SReg [i] . SA\}$

(16) end if

(17) end for

(18) for $j \leftarrow 1$ to $m$

(19) if $t == round (0)$ then

(20) $da.sal \leftarrow da.sal \setminus \{SA.logout [j]\}$

(21) $da.database \leftarrow da.database \setminus \{\text{records of } SA.logout [j]\}$

(22) $da.set (token.sal.entry [SA.logout [j]] . availability, LOGOUT)$

(23) else if $t == round (1)$ then

(24) $da.send (LOGOUT\_ACK, SA.logout [j])$

(25) end if

(26) end for
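To make the two-way update more concrete, the following Python sketch mirrors the three loops of Algorithm 4. It is a simplified illustration, not the authors' implementation: the dictionary-based data structures, the `AVAILABLE` marker, the `first_round` flag (standing in for the round test on the token), and the returned list of SAs to acknowledge are all assumptions made for this example.

```python
LOGOUT = "LOGOUT"
AVAILABLE = "AVAILABLE"   # hypothetical marker for an SA that has not logged out


def information_updating(da, registrations, token, logouts, first_round):
    """One visit of the monitoring token at a DA (sketch of Algorithm 4).

    da            -- {"sal": set of SA names, "database": {SA name: records}}
    registrations -- SA names registered at this DA since the last visit
    token         -- {"sal": {SA name: availability}}
    logouts       -- SA names that sent a logout message to this DA
    first_round   -- True on the first visit after the logout, False on the next
    Returns the SAs to which a logout acknowledgment should be sent.
    """
    acks = []
    # Token -> DA: learn unknown SAs and drop SAs flagged as logged out.
    for sa, availability in token["sal"].items():
        if availability != LOGOUT:
            da["sal"].add(sa)
        elif sa in da["sal"]:
            da["sal"].discard(sa)
            da["database"].pop(sa, None)
    # DA -> token: publish newly registered SAs.
    for sa in registrations:
        da["sal"].add(sa)
        token["sal"].setdefault(sa, AVAILABLE)
    # Logout handling: flag on the first visit, acknowledge on the next one.
    for sa in logouts:
        if first_round:
            da["sal"].discard(sa)
            da["database"].pop(sa, None)
            token["sal"][sa] = LOGOUT
        else:
            acks.append(sa)
    return acks
```

Using sets and dictionaries makes the membership tests of the pseudocode implicit, since adding an element that is already present has no effect.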

Besides information updating, the monitoring token serves another vital function: network status gathering. A DA in the WSN senses the local network status in its vicinity. The monitoring token gathers these local network statuses, which together reveal a high-level overview of the network status. This overview drives the autonomic switch between the directory-based mode and the directory-less mode. Suppose our service discovery architecture is in the directory-less mode. If the frequency of service query decreases, the frequency of service advertisement should be decreased accordingly. However, if the frequency of service query keeps decreasing significantly, a switch to the directory-based mode is needed. Now, assume our service discovery architecture is in the directory-based mode. If the frequency of service query increases, the frequency of service advertisement should be increased. If the frequency of service query keeps increasing significantly, a switch to the directory-less mode is needed.

In particular, each time the monitoring token arrives at a DA, if the DA finds that the frequency of service query it perceives locally is at or above a threshold $qf_{0}$ , it adds a plus sign to its entry in the monitoring token. If the locally perceived frequency of service query is lower than the threshold $qf_{0}$ , the DA checks whether its entry in the monitoring token contains a plus sign and, if so, removes one. Suppose our service discovery architecture is in the directory-based mode. When a DA keeps adding plus signs for several rounds, a local overload is occurring. If a considerable number of DAs have encountered a local overload, the directory-less mode should be used. When the number of DAs which have encountered a local overload is small, the directory-based mode should be used. We denote the number of plus signs which indicates a local overload by $lo$ (local overload), and we denote the percentage of locally overloaded DAs which triggers a switch between the two modes by $go$ (global overhead). When the monitoring token arrives at a DA, the DA first updates the number of plus signs in its entry. Then, it examines the entries of all DAs. In the directory-based mode, if the percentage of locally overloaded DAs has reached $go$ , the DA changes the value of MODE in the monitoring token from DA-BASED to DA-LESS in order to inform other DAs to switch their modes. Similarly, a reverse switch from DA-LESS to DA-BASED is triggered when the percentage of locally overloaded DAs is smaller than $go$ . The detailed autonomic mode switch procedure is shown in Algorithm 5.

Algorithm 5: Autonomic mode switch procedure.

$AutonomicModeSwitch (da [n], qf_{0}, lo, go, token)$

(1) if $da [i] . qf\_perceived ⩾ qf_{0}$ then

(2) $token.dal.entry [i] . status \leftarrow token.dal.entry [i] . status$ + “+”

(3) else if $token.dal.entry [i] . status \neq \emptyset$ then

(4) $token.dal.entry [i] . status \leftarrow token.dal.entry [i] . status$ − “+”

(5) end if

(6) $nlo \leftarrow 0$

(7) for $k \leftarrow 1$ to $n$

(8) if $| token.dal.entry [k] . status | ⩾ lo$ then

(9) $nlo \leftarrow nlo + 1$

(10) end if

(11) end for

(12) if $token.MODE == \text{DA-BASED}$ then

(13) if $nlo / n ⩾ go$ then

(14) $da [i] . set (token.MODE, \text{DA-LESS})$

(15) end if

(16) else

(17) if $nlo / n < go$ then

(18) $da [i] . set (token.MODE, \text{DA-BASED})$

(19) end if

(20) end if
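The mode switch logic can also be sketched in Python. In this illustration, which is an assumption-laden simplification rather than the authors' code, the string of plus signs in each DA entry is modeled as an integer counter, the token is a plain dictionary, and the function name and argument order are hypothetical.

```python
DA_BASED, DA_LESS = "DA-BASED", "DA-LESS"


def autonomic_mode_switch(token, i, qf_perceived, qf0, lo, go):
    """Processing at DA i when the monitoring token arrives (sketch of Algorithm 5).

    token -- {"mode": DA_BASED or DA_LESS, "dal": [plus-sign count per DA]}
    go    -- fraction of locally overloaded DAs that triggers a switch (e.g. 0.75)
    """
    plus = token["dal"]
    if qf_perceived >= qf0:
        plus[i] += 1                    # add a plus sign
    elif plus[i] > 0:
        plus[i] -= 1                    # remove an existing plus sign
    # Count the DAs whose entry indicates a local overload.
    nlo = sum(1 for count in plus if count >= lo)
    ratio = nlo / len(plus)
    if token["mode"] == DA_BASED:
        if ratio >= go:
            token["mode"] = DA_LESS
    elif ratio < go:
        token["mode"] = DA_BASED
    return token["mode"]
```

With `lo = 4` and `go = 0.75`, for example, a token whose four entries hold 4, 4, 3, and 0 plus signs switches to DA-LESS as soon as the third DA adds its fourth plus sign.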

4. Numerical Results

4.1. Experimental Environment and Simulation Parameters

To evaluate our model, we developed a MANET platform based on NS-2 [45]. All simulation results were obtained on an HP machine with an Intel Core 2 Quad Q9550 at 2.83 GHz and 4 GB RAM, running Debian 2.6.32-48squeeze1 (Linux version 2.6.32-5-686) with gcc 4.3.5 (Debian 4.3.5-4).

In our platform, the number of DAs in the WSN is 32. The identifier space of the underlying Chord network is $2^{m} = 2^{16}$ . The numbers of SAs and UAs are 25 and 200, respectively. The number of different complete descriptions for profiling services must satisfy $\prod_{i = 1}^{n p} |p_{i}| ⩽ 2^{16}$ . In [36, 46], the authors use a well-known dataset [47] to conduct their experiments. The dataset contains 2507 real web services on the Internet. In this paper, we generate 8000 records of different service information based on the dataset. A consistent hash function based on SHA-1 is employed to calculate the IDs of DAs and the keys of services. The 8000 keys and the corresponding service information (URL, SA) are mapped to the 32 DAs according to the Chord protocol. The IDs of the 32 DAs and the number of service records for which each DA takes responsibility are shown in Table 3.

Table 3

Information of the 32 DAs.

Number	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
ID	1456	2383	2461	6511	7002	7560	11438	12704	13087	15018	15408	20197	24126	24700	33059	33837
Records	562	124	4	514	55	73	468	153	41	241	34	611	456	71	991	93

Number	17	18	19	20	21	22	23	24	25	26	27	28	29	30	31	32
ID	37271	38208	39176	40539	42247	43982	46340	47226	47396	49090	49122	52350	53654	55839	59680	62300
Records	430	104	135	174	212	221	278	102	15	218	3	423	126	248	485	335
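The mapping of service keys to DAs can be illustrated with a short Python sketch. The paper does not state how the 160-bit SHA-1 digest is reduced to the 16-bit identifier space, so taking the digest modulo $2^{m}$ is an assumption here, as are the function names. The successor rule follows the Chord convention that a key is assigned to the first node whose ID equals or follows the key on the identifier circle.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier space of size 2^16, as in the experimental setup


def chord_id(name, m=M):
    """Hash a DA name or service description to an m-bit identifier.

    Assumption: the SHA-1 digest is reduced modulo 2^m.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(key, da_ids):
    """ID of the DA responsible for key; da_ids must be sorted ascending."""
    idx = bisect_left(da_ids, key)
    return da_ids[idx % len(da_ids)]    # wrap around the identifier circle


# A few DA IDs taken from Table 3, for illustration only.
some_ids = [1456, 2383, 2461, 62300]
```

For instance, with the IDs above, a key of 2400 falls to the DA with ID 2461, and a key of 63000 wraps around to the DA with ID 1456.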

For a DA $d a$ and its predecessor $d a . p r e d e c e s s o r$ on the Chord network, if the IDs of $d a$ and $d a . p r e d e c e s s o r$ are the smallest one and the largest one among all the DAs, respectively, the distance between $d a$ and $d a . p r e d e c e s s o r$ is calculated as

\begin{matrix} d i s (d a, d a . p r e d e c e s s o r) = 2^{m} - d a . p r e d e c e s s o r . i d + d a . i d . \end{matrix}

(32)

Otherwise, we compute the distance between them by

\begin{matrix} d i s (d a, d a . p r e d e c e s s o r) = d a . i d - d a . p r e d e c e s s o r . i d . \end{matrix}

(33)

We define the responsibility ratio of $da$ as the ratio of the number of service records for which $da$ takes responsibility to the distance between $da$ and $da.predecessor$ . It is computed as

\begin{matrix} d a . r r = \frac{|d a . r e c o r d s|}{d i s (d a, d a . p r e d e c e s s o r)} . \end{matrix}

(34)
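Equations (32) to (34) can be evaluated directly from the IDs and record counts in Table 3. The following Python sketch shows one way to do so; the function name and the list-based input format are assumptions made for this example.

```python
def responsibility_ratios(ids, records, m=16):
    """Responsibility ratio of each DA, following (32)-(34).

    ids     -- DA identifiers sorted ascending on the Chord ring
    records -- number of service records held by each DA, in the same order
    """
    ratios = []
    for idx, (da_id, n_records) in enumerate(zip(ids, records)):
        pred_id = ids[idx - 1]          # idx 0 wraps to the largest ID
        if idx == 0:
            dis = 2 ** m - pred_id + da_id   # (32): distance across the wrap
        else:
            dis = da_id - pred_id            # (33)
        ratios.append(n_records / dis)       # (34)
    return ratios


# DA_1, DA_2, DA_3 and DA_32 from Table 3 (DA_32 is the predecessor of DA_1).
ratios = responsibility_ratios([1456, 2383, 2461, 62300], [562, 124, 4, 335])
```

For example, $DA_{3}$ holds 4 records at a distance of $2461 - 2383 = 78$ from its predecessor, which gives a responsibility ratio of about 0.051, and $DA_{1}$ holds 562 records at a distance of $2^{16} - 62300 + 1456 = 4692$ , which gives about 0.120.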

As shown in Figure 8, the responsibility ratios of the 32 DAs are marked as blue crosses and their linear regression line is drawn in red. The responsibility ratios of most DAs are within the range of $[0.08,0.14]$ , except for $D A_{3}$ , whose responsibility ratio is close to 0.05. The IDs of $D A_{2}$ and $D A_{3}$ are 2383 and 2461, respectively. Since $D A_{3}$ is close to its predecessor $D A_{2}$ and the number of service records it is responsible for is small, it takes little responsibility for the service records in the network. As the responsibility is measured by a ratio, it is reasonable that the responsibility ratio of $D A_{27}$ (3 records over a distance of 32, about 0.094) is larger than that of $D A_{3}$ (4 records over a distance of 78, about 0.051), even though $D A_{27}$ holds fewer records.

Figure 8

Responsibility ratio of 32 DAs.

4.2. Performance Criteria

Though there are various criteria which are used to evaluate the effectiveness of service discovery architectures, we adopt three fundamental criteria: service availability, message overhead, and delay. Each of these criteria is presented below.

4.2.1. Service Availability

Gao and Ma [48] define service availability as the probability that a service is available, formulated as the ratio of the number of times a service responds to a user request to the total number of requests made by the user. This definition targets the evaluation of user experience with a known service. The service availability in our model, however, aims to evaluate the performance of service discovery, so we define it from another perspective. In particular, we introduce the service availability of a single node and the service availability of the network. Since service queries and service replies flow through all nodes in the network (including UAs, SAs, and DAs), we define the service availability of a single node as the ratio of the total number of incoming service replies to the total number of outgoing service queries. For a node $n_{i}$ , with $sr_{(i, r)}$ denoting the total number of incoming service replies and $ss_{(i, s)}$ the total number of outgoing service queries, the service availability is

\begin{matrix} s a_{i} = \frac{s r_{(i, r)}}{s s_{(i, s)}} . \end{matrix}

(35)

Then, the service availability of the network is defined as

\begin{matrix} s a_{N} = \frac{\sum_{i = 1}^{N} s a_{i}}{N} . \end{matrix}

(36)
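A minimal Python sketch of (35) and (36) follows. The function names and the input format are assumptions for this example, and returning 0 for a node that sent no service queries is a convention chosen here, since the paper does not specify that case.

```python
def node_availability(replies_in, queries_out):
    """sa_i as in (35): incoming service replies over outgoing service queries.

    Assumption: a node that sent no queries is given an availability of 0.
    """
    return replies_in / queries_out if queries_out else 0.0


def network_availability(counts):
    """sa_N as in (36): mean of sa_i over all N nodes.

    counts -- list of (replies_in, queries_out) pairs, one per node
    """
    return sum(node_availability(r, q) for r, q in counts) / len(counts)
```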

4.2.2. Message Overhead

The concept of message overhead mainly concerns two types of messages: service queries and service replies. In order to improve the performance of service discovery, both kinds of messages are duplicated to a certain extent. Specifically, a service query is duplicated for the purpose of reaching more nodes. A service reply is duplicated in another way: since there are several copies of a service query, several service replies may be originated by different nodes. As service replies are generated in response to the received service queries, we hold that normal service replies should not be considered as message overhead. Instead, the message overhead should be measured by the portion of service queries which are duplicated redundantly. We denote the number of service queries issued by node $n_{i}$ by $ss_{(i)}$ and the number of service replies received by node $n_{i}$ by $sr_{(i, r)}$ . A fraction $b_{i}$ of the service replies received by node $n_{i}$ is designated to it. We denote the number of duplicates of $ss_{(i)}$ by $dup (ss_{(i)})$ . Then, the message overhead of node $n_{i}$ is defined as

\begin{matrix} m o_{i} = \frac{d u p (s s_{(i)})}{b_{i} \cdot s r_{(i, r)}} . \end{matrix}

(37)

Then, the message overhead of the network is defined as

\begin{matrix} m o_{N} = \frac{\sum_{i = 1}^{N} m o_{i}}{N} . \end{matrix}

(38)
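Equations (37) and (38) translate into the following Python sketch. The function names and the tuple-based input are assumptions for this example, and the sketch assumes that every node has received at least one reply designated to it, so that the denominator in (37) is nonzero.

```python
def node_overhead(duplicates, b, replies_in):
    """mo_i as in (37): duplicated queries per reply designated to the node.

    duplicates -- dup(ss_(i)), number of duplicates of the node's queries
    b          -- fraction b_i of received replies designated to the node
    replies_in -- sr_(i,r), number of service replies received by the node
    """
    return duplicates / (b * replies_in)


def network_overhead(nodes):
    """mo_N as in (38): mean of mo_i over all nodes.

    nodes -- list of (duplicates, b, replies_in) tuples, one per node
    """
    return sum(node_overhead(d, b, r) for d, b, r in nodes) / len(nodes)
```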

4.2.3. Delay

The response time is a prominent factor in user experience, and in practice delay is a major concern for many users. In our model, the delay perceived by a UA consists of two parts: the processing time and the transmission time. Since we do not focus on the relationship between the processing time and the transmission time, we measure the delay as a whole. Though the number of service queries issued by node $n_{i}$ is $ss_{(i)}$ , some of the service queries are likely to receive no service reply. We denote the number of service queries which receive at least one service reply by $r_{i} \cdot ss_{(i)}$ , where $0 < r_{i} ⩽ 1$ . For a service query which has at least one service reply, the delay associated with the service query is the time period between the moment node $n_{i}$ sends the service query and the moment node $n_{i}$ receives the first corresponding service reply. We denote this time period by $t$ . Then, the delay which node $n_{i}$ perceives is defined as

\begin{matrix} d e l a y_{i} = \frac{\sum_{j = 1}^{r_{i} \cdot s s_{(i)}} t_{j}}{r_{i} \cdot s s_{(i)}} . \end{matrix}

(39)

Then, the delay of the network is defined as

\begin{matrix} d e l a y_{N} = \frac{\sum_{i = 1}^{N} d e l a y_{i}}{N} . \end{matrix}

(40)
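The delay criteria in (39) and (40) can be sketched as follows. The input format (one list per node, with `None` marking a query that never received a reply) and the function names are assumptions for this example. The paper requires $r_{i} > 0$ for every node; as a defensive convention of this sketch only, nodes without any answered query are left out of the network average.

```python
def node_delay(first_reply_times):
    """delay_i as in (39): mean time to the first reply over answered queries.

    first_reply_times -- one entry per issued query; None if never answered
    Returns None when no query was answered (outside the paper's r_i > 0).
    """
    answered = [t for t in first_reply_times if t is not None]
    if not answered:
        return None
    return sum(answered) / len(answered)


def network_delay(per_node_times):
    """delay_N as in (40), averaged over nodes with at least one answered query."""
    delays = [d for d in map(node_delay, per_node_times) if d is not None]
    return sum(delays) / len(delays)
```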

4.3. Experimental Results and Analysis

The two characteristics of the directory-based mode are energy conservation and function tuning. Both characteristics concentrate on prolonging the network lifetime of the WSN. In practice, as certain applications require all nodes to be operational, the network lifetime may mean the time period before the first node fails due to energy exhaustion. However, in most applications, the network lifetime means the time period before a given number of the nodes in the network exhaust their energy. Notable topology management schemes have been proposed for energy conservation, such as [49–51]. However, a major assumption of these schemes is a high density of sensors. In order to investigate the performance of our model, we compare it with the approach proposed by Pan et al. [51]. The simulation result shown in Figure 9 indicates a slightly better performance of our model.

Figure 9

Operational nodes versus time step.

As shown in Figure 9, the percentage of operational nodes decreases with the time step for both models. In our experiments, we assume that the network lifetime ends when 50% of the nodes have failed. Note that this percentage is used only for the comparison between our model and Pan et al.'s model. In fact, our model is able to operate normally even when there are few nodes left in the network. The percentage of operational nodes falls below 50% around $t = 275$ in our model. Pan et al.'s model performs similarly to our model during the period of $t \in [0,100]$ . When $t > 100$ , the percentage of operational nodes of Pan et al.'s model decreases more quickly than that of ours, and it falls below 50% around $t = 225$ .

Now, we consider our directory-based mode alone. For three fixed values of the frequency of service query $s f_{i} = 0.006$ , 0.008, and 0.01, the service availability of the network and the delay of the network are plotted in Figures 10 and 11, respectively.

Figure 10

Service availability versus time step.

Figure 11

Delay versus time step.

As shown in Figure 10, the values of service availability of the network for $s f_{i} = 0.006$ , 0.008, and 0.01 fluctuate between 60% and 90%. In our model, the $〈key, URL〉$ information of a node which leaves the network is transferred to the successor of the node. Thus, the decrease of operational nodes has only a slight influence on the service availability of the network. Note that service queries are generated based on service information which UAs and DAs create at will. Though the Chord network itself is able to guarantee the normal search of a key, some service queries may lead to no service replies. This is why the values of the service availability of the network are less than 100% even at the early stage of simulation. Since a high frequency of service query places more burden on the DAs than a low one, the slight differences in service availability of the network among the three curves are caused by DAs which are about to fail. These DAs occasionally fail to issue service replies. Thus, the ordering of the service availability of the network is roughly $s f_{i} = 0.006 > s f_{i} = 0.008 > s f_{i} = 0.01$ .

As shown in Figure 11, the values of delay of the network for $s f_{i} = 0.006$ , 0.008, and 0.01 increase with the time step. Since the number of operational nodes decreases as the time step increases, the workload that each remaining DA faces becomes heavier. Thus, a service query may encounter a long processing time. Though there are fewer DAs in the WSN, the shortening of the transmission time does not compensate for the growth of the processing time. Hence, the total delay inevitably increases. In addition, a high frequency of service query results in a larger backlog than a low one. Thus, the ordering of the delay of the network is roughly $s f_{i} = 0.01 > s f_{i} = 0.008 > s f_{i} = 0.006$ .

For the directory-less mode, we adopt a fixed setting of $b_{i} = 0.1$ and $f_{i} = 0.15$ . Namely, a fraction $b_{i}$ of the service replies received by node $n_{i}$ is designated to it, and a fraction $f_{i}$ of the service replies which should be forwarded by node $n_{i}$ cannot be forwarded to the next node. The number of plus signs which indicates a local overload is $lo = 4$ . The percentage of locally overloaded DAs which triggers a switch between the two modes is $go = 75\%$ . Since a sufficiently large value of $R$ ensures that there always exists a path between two arbitrary nodes, we set the maximum of $k$ to 10. Then, the number of recursions is $R = ⌊(k - 1) / 4⌋ = 2$ . In order to facilitate an in-depth analysis of our model, we consider two cases of distribution for the level of connectivity of the network: uniform distribution and proportional distribution. For all nodes in the MANET, the numbers of pairs for 10 levels of connectivity are illustrated in Figure 12. In the case of uniform distribution, the numbers of pairs for the 10 levels of connectivity are uniformly distributed. In the case of proportional distribution, the number of pairs is in proportion to the level of connectivity.

Figure 12

Number of pairs for 10 levels of connectivity.

As previously described, the mode switch between the directory-based mode and the directory-less mode is triggered by the percentage of locally overloaded DAs. Essentially, a local overload of a DA is caused by excessively frequent service queries. In order to investigate the mode switch of our model, we conducted extensive simulations to find an empirical threshold of the frequency of service query. We found that the empirical threshold is $qf_{0} = 0.003$ . Namely, a frequency of service query greater than 0.003 leads to more than 75% locally overloaded DAs and thus triggers a mode switch from directory-based to directory-less.

Figure 13 shows the service availability of the network under the two distributions. When the frequency of service query lies in $[0.0005,0.003]$ , the simulation is in the directory-based mode, and the difference between the two distributions is not significant. When the frequency of service query lies in $[0.003,0.0055]$ , the simulation is in the directory-less mode. Because the uniform distribution provides more node pairs with a low level of connectivity than the proportional distribution does, a node can communicate with more distant nodes. Thus, the values of service availability of the network under the uniform distribution are larger than their counterparts under the proportional distribution.

Figure 13

Service availability versus frequency of service query.

As shown in Figure 14, the delay of the network increases monotonically with the frequency of service query in $[0.0005,0.003]$ and decreases monotonically in $[0.003,0.0055]$ . The simulation result indicates that the differences between the two distributions are not significant in either the directory-based mode or the directory-less mode.

Figure 14

Delay versus frequency of service query.

As shown in Figure 15, when the frequency of service query lies in $[0.0005,0.003]$ , the values of message overhead of the network under both distributions remain at one. When the mode switches to the directory-less mode, the values of message overhead of the network under both distributions increase monotonically with the frequency of service query. In addition, the differences between the two distributions are not significant in either mode. Thus, the directory-based mode is superior to the directory-less mode in terms of message overhead.

Figure 15

Message overhead versus frequency of service query.

Through the experimental results shown in Figures 13–15, it can be observed that the mode switch function of our model could significantly improve the performance of service discovery in terms of service availability, message overhead, and delay.

5. Conclusions

To meet the increasing need for flexible service discovery architectures in MANET environments, we propose and evaluate a collaborative self-governing privacy-preserving wireless sensor network architecture. The proposed architecture is based on the Chord protocol and focuses on location optimization and energy conservation. The key function of the architecture is an autonomic mode switch between the directory-based mode and the directory-less mode. In order to evaluate our model, we developed a MANET platform and conducted extensive simulations regarding three critical criteria of service discovery: service availability, message overhead, and delay. The simulation results indicate that the autonomic mode switch between the directory-based mode and the directory-less mode greatly improves the performance of service discovery in terms of these three criteria. One open issue is how to effectively determine an empirical threshold of the frequency of service query.


Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments on this paper. This work is supported by Program for the Key Program of NSFC-Guangdong Union Foundation (U1135002), Major National S&T Program (2011ZX03005-002), National Natural Science Foundation of China (60872041, 61072066), the Fundamental Research Funds for the Central Universities (JY10000903001, JY10000901034, K5051203010), and the GAD Pre-Research Foundation (9140A15040210HK61).

References

Meditskos

Bassiliades

Structural and role-oriented web service discovery with taxonomiesin OWL-S

IEEE Transactions on Knowledge and Data Engineering 2010 22 2 278 290

10.1109/TKDE.2009.89

2-s2.0-75449105654

Junghans

Agarwal

Studer

Towards practical semantic web service discovery

The Semantic Web: Research and Applications 2010 6089 2 15 29

10.1007/978-3-642-13489-0_2

2-s2.0-77954417968

Santhana

V. A.

Balasundaram

S. R.

Effective web-service discovery using K-means clustering

Distributed Computing and Internet Technology 2013 7753

Berlin, Germany

Springer

455 464 Lecture Notes in Computer Science

10.1007/978-3-642-36071-8_36

Junghans

Agarwal

Web service discovery based on unified view on functional and non-functional properties

Proceedings of the 4th IEEE International Conference on Semantic Computing (ICSC ′10)

September 2010

224 227

10.1109/ICSC.2010.36

2-s2.0-79952059915

Warner

Ladner

Gupta

Petry

F. E.

Aha

D. W.

Philip

System and method for web service discovery and access

U.S. Patent no. 8,209,407, June 2012

Guttman

Veizades

Service location protocol, version 2

1999

Ken

Scheifler

Waldo

Jini Specification 1999

Addison-Wesley Longman

Salutation Consortium Salutation Architecture Specification 1999

The Salutation Consortium Inc.

http://ftp.salutation.org/salute/sa20e1a21.ps

Microsoft Corporation

Universal Plug and Play Device Architecture Version 1.0

June 2000, http://upnp.org/index.php/sdcps-and-certification/standards/referenced-specifications/

10.

Bluetooth Consortium Specification of the Bluetooth System Core Version 1.0 b: Part E, Service Discovery Protocol (SDP) 1999

11.

Hodes

T. D.

Czerwinski

S. E.

Zhao

B. Y.

Joseph

A. D.

Katz

R. H.

An architecture for secure wide-area service discovery

Wireless Networks 2002 8 2-3 213 230

10.1023/A:1013772027164

ZBL1012.68971

2-s2.0-0036498835

12.

Adjie-Winoto

Schwartz

Balakrishnan

Lilley

The design and implementation of an intentional naming system

ACM SIGOPS Operating Systems Review 1999 33 5 186 201

13.

Kozat

U. C.

Tassiulas

Network layer support for service discovery in mobile ad hoc networks

Proceedings of the 22nd Annual Joint Conference on the IEEE Computer and Communications Societies

April 2003

1965 1975

2-s2.0-0041973505

14.

Sailhan

Issarny

Scalable service discovery for MANET

Proceedings of the 3rd IEEE International Conference on Pervasive Computing and Communications (PerCom ′05)

March 2005

235 246

10.1109/PERCOM.2005.36

2-s2.0-33646576167

15.

Koubaa

Fleury

A fully distributed mediator based service location protocol in ad hoc networks

Proceedings of the IEEE Global Telecommunicatins Conference (GLOBECOM ′01)

November 2001

2949 2953

2-s2.0-0035684645

16.

Klein

Konig-Ries

Obreiter

Service rings—a semantic overlay for service discovery in ad hoc networks

Proceedings of the 14th IEEE International Workshop on Database and Expert Systems Applications

September 2003

180 185

10.1109/DEXA.2003.1232020

17.

Klein

König-Ries

Multi-layer clusters in ad-hoc networksan approach to service discovery

Web Engineering and Peer-to-Peer Computing 2002

Springer

187 201

18.

Schiele

Becker

Rothermel

Energy-efficient cluster-based service discovery for ubiquitous computing

Proceedings of the 11th Workshop on ACM SIGOPS European Workshop (EW ′11)

September 2004

10.1145/1133572.1133604

2-s2.0-77951490284

19.

Tyan

Mahmoud

Q. H.

A comprehensive service discovery solution for mobile ad hoc networks

Mobile Networks and Applications 2005 10 4 423 434

10.1007/s11036-005-1555-z

2-s2.0-20844431526

20.

Seada

Helmy

Rendezvous regions: a scalable architecture for service location and data-centric storage in large-scale wireless networks

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS ′04)

April 2004

Santa Fe, NM, USA

218

2-s2.0-12444321998

21.

Sivavakeesar

Gonzalez

O. F.

Pavlou

Service discovery strategies in ubiquitous communication environments

IEEE Communications Magazine 2006 44 9 106 113

10.1109/MCOM.2006.1705986

2-s2.0-33750115168

22.

Lee

Helal

Desai

Verma

Arslan

Konark: a system and protocols for device independent, peer-to-peer discovery and delivery of mobile services

IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans. 2003 33 6 682 696

10.1109/TSMCA.2003.819493

2-s2.0-0347337935

23.

Jeong

Park

Kim

Service discovery based on multicast DNS in IPv6 mobile ad-hoc networks

Proceedings of the 57th IEEE Semiannual Vehicular Technology Conference (VTC ′03)

April 2003

1763 1767

2-s2.0-0041468542

24.

Perkins

IP address autoconfiguration for ad hoc networks

IETF, 2001, http://www.cs.ucsb.edu/~ebelding/txt/autoconf.txt

25.

Jeong

Park

Autoconfiguration technologies for IPv6 multicast service in mobile ad-hoc networks

Proceedings of the 10th IEEE International Conference on Networks: Towards Network Superiority (ICON ′02)

August 2002

261 265

10.1109/ICON.2002.1033321

2-s2.0-77950013572

26.

Barbeau

Service discovery protocols for ad hoc networking

Proceedings of the CASCON 2000 Workshop on Ad Hoc Communications

2000

27.

Chakraborty

Joshi

Yesha

Finin

Toward distributed service discovery in pervasive computing environments

IEEE Transactions on Mobile Computing 2006 5 2 97 112

10.1109/TMC.2006.26

2-s2.0-33646357885

28.

Gao

Z.-G.

Yang

X.-Z.

T.-Y.

Cai

S.-B.

RICFFP: an efficient service discovery protocol for MANETs

Embedded and Ubiquitous Computing 2004

Lecture Notes in Computer Science 3207

Berlin, Germany

Springer

786 795

10.1007/978-3-540-30121-9_75

29.

Nedos

Singh

Clarke

Service*: distributed service advertisement for multi-service, multi-hop MANET environments

Proceedings of the 7th IFIP International Conference on Mobile and Wireless Communication Networks

2005

30.

Nidd

Service discovery in DEAPspace

IEEE Personal Communications 2001 8 4 39 45

10.1109/98.944002

2-s2.0-0035428326

31.

Campo

Muñoz

Perea

J. C.

Marín

García-Rubio

PDP and GSDL: a new service discovery middleware to support spontaneous interactions in pervasive systems

Proceedings of the 3rd IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom ′05)

March 2005

178 182

10.1109/PERCOMW.2005.61

2-s2.0-33646775683

32.

Lee

Helal

Lee

Gossip-based service discovery in mobile ad hoc networks

IEICE Transactions on Communications 2006 89 9 2621 2624

10.1093/ietcom/e89-b.9.2621

2-s2.0-33748776005

33.

Ververidis

C. N.

Polyzos

G. C.

Service discovery for mobile ad hoc networks: a survey of issues and techniques

IEEE Communications Surveys & Tutorials 2008 10 3 30 45

10.1109/COMST.2008.4625803

2-s2.0-56749174768

34.

Stoica

Morris

Karger

Kaashoek

M. F.

Balakrishnan

Chord: a scalable peer-to-peer lookup service for internet applications

ACM SIGCOMM Computer Communication Review 2001 31 4 149 160

35.

Zhang

Huang

Yang

Zhang

A hierarchical and chord-based semantic service discovery system in the universal network

International Journal of Innovative Computing, Information and Control 2009 5 11 3745 3753

2-s2.0-71149100418

36.

Gao

Jiang

On analysis of a chord-based traffic model for web service discovery in distributed environment

Journal of Engineering Science and Technology Review 2013 6 5 129 136

2-s2.0-84893502030

37.

Adala

Tabbane

Discovery of semantic Web Services with an enhanced-Chord-based P2P network

International Journal of Communication Systems 2010 23 11 1353 1365

10.1002/dac.1110

2-s2.0-78649712740

38.

Mhatre

Rosenberg

Design guidelines for wireless sensor networks: communication, clustering and aggregation

Ad Hoc Networks 2004 2 1 45 63

10.1016/S1570-8705(03)00047-7

2-s2.0-4143145711

39.

Mhatre

V. P.

Rosenberg

Kofman

Mazumdar

Shroff

A minimum cost heterogeneous sensor network with a lifetime constraint

IEEE Transactions on Mobile Computing 2005 4 1 4 15

10.1109/TMC.2005.2

2-s2.0-12844249406

40.

Heinzelman

W. B.

Chandrakasan

A. P.

Balakrishnan

An application-specific protocol architecture for wireless microsensor networks

IEEE Transactions on Wireless Communications 2002 1 4 660 670

10.1109/TWC.2002.804190

2-s2.0-33646589837

41.

Lindsey

Raghavendra

C. S.

PEGASIS: power-efficient gathering in sensor information systems

Proceedings of the IEEE Aerospace Conference

March 2002

1125 1130

10.1109/AERO.2002.1035242

2-s2.0-51949089156

42.

Mhatre

Rosenberg

Homogeneous vs heterogeneous clustered sensor networks: a comparative study

Proceedings of the IEEE International Conference on Communications

June 2004

3646 3651

2-s2.0-4143083941

43.

FIPS Publication 180-1. Secure Hash Standard 1995 17

National Institute of Standards and Technology

44.

Reynolds

Vahdat

Efficient peer-to-peer keyword searching

Proceedings of the ACM/IFIP/USENIX International Conference on Middleware

2003

21 40

45.

The Network Simulator - ns-2

http://www.isi.edu/nsnam/ns/index.html

46.

Malik

Bouguettaya

RATEWeb: reputation assessment for trust establishment among web services

The VLDB Journal 2009 18 4 885 911

10.1007/s00778-009-0138-1

2-s2.0-70350543894

47.

Al-Masri

Mahmoud

Q. H.

Discovering the best web service

Proceedings of the 16th ACM International Conference on World Wide Web (WWW ′07)

May 2007

1257 1258

10.1145/1242572.1242795

2-s2.0-35348885486

48.

Gao

A collaborative QoS-aware service evaluation method for service selection

Journal of Networks 2013 8 6 1370 1379

10.4304/jnw.8.6.1370-1379

2-s2.0-84879144752

49.

Cerpa

Estrin

ASCENT: Adaptive self-configuring sensor networks topologies

IEEE Transactions on Mobile Computing 2004 3 3 272 285

10.1109/TMC.2004.16

2-s2.0-4544225837

50.

Chen

Jamieson

Balakrishnan

Morris

Span: an energy-efficient coordination algorithm for topology maintenance in ad hoc wireless networks

Wireless Networks 2002 8 5 481 494

10.1023/A:1016542229220

2-s2.0-0036739784

51.

Pan

Hou

Y. T.

Cai

Shi

Shen

S. X.

Topology control for wireless sensor networks

Proceedings of the 9th Annual International Conference on Mobile Computing and Networking (MobiCom ′03)

September 2003

286 299

2-s2.0-1542358975