RMDS: Ranging and multidimensional scaling–based anchor-free localization in large-scale wireless sensor networks with coverage holes

Abstract

Sensor node localization is a crucial aspect of many location-related applications that utilize wireless sensor networks. Among the many studies in the literature, multidimensional scaling-based localization techniques have been proven to be efficient, obtaining high accuracy with lower information requirements. However, when applied to large-scale wireless sensor networks with coverage holes, which are common in many scenarios, such as underground mines, the transmission path can become deviated, degrading the localization performance of this type of connectivity-based technique. Furthermore, in such complex wireless environments, non-line-of-sight reference objects, the presence of obstacles and signal fluctuations change the communication range and make it difficult to obtain an accurate position. In this article, we present an anchor-free localization scheme for large-scale wireless sensor networks called the ranging and multidimensional scaling–based localization scheme. We use ranging and non-line-of-sight error mitigation techniques to estimate accurate distances between each node pair and attempt to find inflection nodes using a novel flooding protocol to correct transmission paths that have become deviated by a coverage hole. Moreover, we replace the singular value decomposition with an iterative maximum gradient descent method to reduce the computational complexity. The results of the simulations and experiments show that our scheme performs well on wireless sensor networks with different coverage holes and is robust to varying network densities.

Keywords

Wireless sensor networks, localization, multidimensional scaling, inflection node, non-line-of-sight error mitigation

Introduction

Localization techniques for wireless sensor networks (WSNs) have attracted increasing attention due to their wide application in emergency response, intelligent transportation systems, target tracking, and so on.^1,2 In many WSN applications, it is essential for the sensor nodes to be able to discover their locations automatically. In practice, it is infeasible to obtain the position of each node through manual configuration or positioning equipment such as the global positioning system (GPS). Instead, localization protocols usually require some anchor nodes whose locations are known in advance, and the other nodes can localize themselves by calculating the relative locations between themselves and the anchor nodes by measuring the physical distances using ranging hardware.³

In large-scale WSNs with hundreds or even thousands of nodes, which are typically highly resource constrained, it is important to find a relatively simple and energy-preserving method for performing network localization.⁴ Multidimensional scaling (MDS), a technique from mathematical psychology, is used to calculate the positions of all the nodes in a network given only basic connectivity information.^5–7 With the proposed MDS-MAP⁵ localization method, the algorithm can detect the shortest paths between each node pair with simply the given network connectivity information. The algorithm roughly estimates the distance between each pair of nodes and normalizes the resulting coordinates considering the anchor nodes, whose positions are known. This localization scheme first calculates the hop counts between the anchor nodes and a node. Then, the Euclidean distance of two nodes can be approximated through the hop counts multiplied by the average physical hop distance. After obtaining the Euclidean distance, it is easy to locate nodes through a simple calculation. Because of its simplicity, there has been growing interest in localization methods based on connectivity of the network.

The problems facing MDS-MAP lie in the fact that connectivity-based (CB) and constant-anchor (CA) algorithms use only connectivity information to perform localization without any ranging techniques. To improve accuracy, the MDS technique can be extended to full network localization using the measurements of range differences (related to the time of arrival (TOA)) and range difference rates (related to the frequency difference of arrival (FDOA)).^8–10 In these works, an accurate relative distance instead of communication range between two nodes is used to perform localization. Adding the extra distance information can be expected to yield smaller errors, especially in non-line-of-sight (NLOS) scenarios, where classical MDS algorithms are very sensitive to poor and dynamic transmission environments.¹¹ However, in certain application scenarios, such as underground mines and indoors, the complexity of such environments often severely affects localization accuracy. Complications include NLOS reference objects, the presence of obstacles, signal fluctuations or noise, and changes in an environment.^12,13 Estimating the wireless signal propagation parameters is both essential and difficult in such environments because of their complex signal transmission patterns, which makes it hard to obtain an accurate position. Therefore, the performance of traditional ranging-based localization methods will be degraded in such environments.

Furthermore, in ordinary cases, most WSNs have to be deployed randomly with the help of aircraft, such as helicopters, which can hardly guarantee a high coverage quality of the networks.¹⁴ Moreover, for an already-deployed WSN, the sensor nodes can be easily destroyed due to environmental factors such as wind, obstacles, or vibration. Therefore, the coverage hole problem of WSNs is unavoidable and has a negative effect on the performance of CB localization schemes. Obviously, the paths between different nodes around the coverage holes are highly bent rather than being straight lines. The bent distance calculated using the hop counts multiplied by the average physical hop distance strongly deviates from the true Euclidean distance, which leads to large localization errors.

In this article, we present an anchor-free localization scheme, named the Ranging and MDS-based localization scheme (RMDS). We assume that all the nodes in a network with coverage holes can measure the physical distances from their nearby nodes using time-of-flight (TOF) ranging with certain hardware, which is practical in state-of-the-art WSN node hardware architectures.^12,15 To address the above-mentioned problems, we attempt to estimate the degree of curvature around the coverage hole and use it to correct the distances between the node pairs whose communication paths will be bent around the hole. To achieve this, we perform a coverage-aware and link-correlation-balanced flooding (CLF) mechanism to detect the boundary of the hole. Moreover, we embed the TOF ranging in the flooding process and propose an NLOS elimination method based on an adaptive Kalman algorithm to improve the accuracy of the one-hop-measured distance results obtained by refraction and reflection of radio signals in complex wireless environments. Finally, we can obtain a more accurate result of the distance matrix consisting of the smallest distances between every pair of nodes and then calculate the position of each node with an improved MDS algorithm. In summary, our contributions are two-fold.

We propose an efficient and energy-aware flooding mechanism based on link correlation and group one-hop neighbors into clusters to prevent the ACK explosion problem. Then, we estimate the subtree size and inflection degree of each node and detect the inflection nodes.

We combine the TOF ranging and NLOS error mitigation technique with the MDS algorithm to estimate the real Euclidean distance between the node pairs whose transmission paths are modified by the coverage hole. With this method, a more accurate distance matrix consisting of the shortest distances between every pair of nodes can be obtained to improve the localization accuracy.

It is worth noting that although only one coverage hole is considered in this article, our proposed scheme is also efficient in a network with multiple coverage holes. In fact, in a large-scale and dense WSN application, there may exist more than one coverage hole. If there are several adjacent small holes, we can compute the convex hull of all the wireless nodes and treat the holes as one big hole. If the coverage holes are far from each other, we can divide the network into several parts so that each part contains one hole. Thus, a multiple-hole scenario can always be converted into a one-hole scenario, and our scheme remains applicable. It should also be feasible to design the scheme for a network with multiple holes without combining them into one big hole. How to design the scheme considering each individual hole is left as future work.

The remainder of this article is organized as follows: Related work is surveyed in the next section. An overview of the RMDS is given in the third part. Then, the fourth section introduces some important auxiliary nodes used. The next two sections describe the RMDS in detail. The seventh section presents simulation results, and the final part concludes the article.

Related work

Overview of localization techniques

Localization has been extensively researched in many studies on WSNs.^16–19 Current localization approaches can be divided into many groups, including connectivity-only-based, range-based, and range-free approaches. These localization algorithms provide some effective solutions, and we will briefly review such work in the following.

The centroid-based approach proposed by Bulusu et al.²⁰ is one of the earliest works on CB localization. One node estimates its relative position simply by calculating the centroid of several anchors’ positions. In the distance vector (DV)-hop,²¹ at least three anchors are selected, and each node will measure the number of hops to the anchors. Then, the node’s location can be obtained by triangulation. Render Path (REP) is another typical localization technique that only utilizes connectivity information.²² The algorithm includes a method that uses geometric features to correct the estimated length of a path when a coverage hole exists in the network. REP detects the boundaries of the holes and labels the boundary nodes. It then creates a virtual hole at the corner where the path has deviated from the shortest path. The virtual hole induces a new shortest path based on the geometric information. Combining the original path and the new shortest path, REP can determine the real Euclidean distance between the two nodes. The algorithm depends on the full and exact boundary of the network; however, it is nearly impossible to obtain such a boundary. In addition, the real Euclidean distance is approximated by the hop count of the new shortest path multiplied by the average hop distance. This concept may lead to large localization errors because of the bent shortest path.

MDS-based localization

The MDS technique is also often used for node localization in WSNs because of its simplicity and applicability to large-scale networks. In traditional MDS-based schemes, only connectivity information is needed. Every pair of nodes will count the number of hops between them; then, a distance matrix can be built and is taken as an input to calculate the relative coordinates of each node in a centralized manner.^5,7,23
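The connectivity-only pipeline described above can be sketched as follows. This is a minimal illustration, not the article's improved algorithm: it uses classical MDS with an eigendecomposition, and the 4-node chain and the 10 m average hop distance are hypothetical values chosen for the example.

```python
import numpy as np
from collections import deque


def hop_counts(adj):
    """All-pairs hop counts by running BFS from every node.

    adj is a list of neighbor lists; unreachable pairs stay at infinity.
    """
    n = len(adj)
    hops = np.full((n, n), np.inf)
    for src in range(n):
        hops[src, src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if np.isinf(hops[src, v]):
                    hops[src, v] = hops[src, u] + 1
                    queue.append(v)
    return hops


def classical_mds(dist, dim=2):
    """Relative coordinates from a symmetric distance matrix (classical MDS)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering   # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)              # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:dim]                # largest `dim` eigenvalues
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))    # guard tiny negatives
    return eigvecs[:, top] * scale


# Hypothetical 4-node chain 0-1-2-3 with an assumed average hop distance of 10 m.
adj = [[1], [0, 2], [1, 3], [2]]
avg_hop_distance = 10.0
coords = classical_mds(hop_counts(adj) * avg_hop_distance)
```

The resulting coordinates are relative: they are defined only up to rotation, reflection, and translation, which is why anchor-based schemes normalize them against known positions afterwards.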

However, the connectivity information is very limited, and the distance calculated using hop numbers multiplied by communication range is inaccurate, especially in complex wireless environments such as underground mines and indoors. In such environments, wireless links exhibit asymmetry and irregularity and can be easily affected by multipath and reflection.^24,25 Therefore, directly applying traditional MDS-based algorithms in such environments will lead to large errors. Thus, some researchers have attempted to combine ranging techniques with the MDS localization method. The classical ranging-based MDS algorithm assumes that the distance between any two nodes is measurable or calculable so that nodes can be located through matrix operations on the distance matrix. In practice, distances exceeding the range of communication cannot be measured. Therefore, obtaining distances beyond the range remains a problem.

Some modified MDS algorithms have achieved various results through some indirect methods for calculating and correcting distances. Wang et al.²⁶ proposed an MDS-based localization method with multipath mitigation for passive ultra-high-frequency band (UHF) radio frequency identification (RFID) systems. They attempted to measure the distance between two nodes and eliminate the multipath error to improve the localization accuracy of the MDS method. MDS-DMC (MDS-Distance matrix) has been investigated as a range-based algorithm in Popescu and Hedley.²⁷ When there is a coverage hole in the network and when boundary nodes cannot be found around holes, this will lead to some distances being missed in the matrix, eventually resulting in localization failure. Y Wang et al.²⁸ proposed a localization scheme for large-scale sensor networks with complex shapes and possibly with holes. The scheme first selects landmarks on network boundaries with a certain density and then constructs a landmark diagram and its dual combinatorial Delaunay complex on these landmarks. This scheme faces a challenge with respect to networks with holes: determining the correct network layout. In addition, when the node number is large, this algorithm for configuring the network runs very slowly, and if the node range is in a non-convex area or an inner ring, an illegal triangle is produced. Similarly, landmark and edge nodes are also used in the connectivity-based and anchor-free three-dimensional localization (CATL) scheme,²⁹ which performs well in both two-dimensional (2D) and three-dimensional (3D) spaces with coverage holes. However, due to the iterative procedure, CATL suffers from error propagation and thus cannot be used in complex wireless environments.

When a coverage hole exists in the wireless network, the communication paths between different nodes will always overlap with the holes, thus deviating from the ordinary paths and becoming bent paths. Obviously, the bent paths will deviate greatly at the hole boundary, which makes the communication distances larger than their Euclidean distances, as illustrated in Figure 1. In the example, there exists a triangular coverage hole in the WSN area. The three vertices of the hole are expressed as black dots, which are called inflection nodes (for simplicity, the ordinary nodes are omitted, and we assume that a sufficient number of nodes are placed in the area). Moreover, the dotted lines represent the data transmission of message flows from one node to another node (it is worth noting that Figure 1 is just a sketch to show the concept of inflection node and its connection to the coverage hole). It can be easily determined that the message flow will be performed in a near straight-line manner when the hole is not present. However, with the existence of the coverage hole, the message flows will be bent at the inflection nodes, and obviously, the transmission times of inflection nodes will be greater than those of other nodes. Therefore, we select two types of auxiliary nodes, named boundary nodes and landmark nodes, to collect the number of message flows. Both auxiliary nodes are real and not virtual. After the auxiliary nodes are selected, the landmarks will start to flood messages across the entire network to form a tree architecture. To avoid the ACK explosion problem and reduce energy consumption, we propose a coverage-aware and link-correlation-based mechanism in this article to perform an efficient flooding process. Obviously, in Figure 1, the data packets passed around the hole will be forced to deviate to the inflection nodes, resulting in a large-scale subtree.

Figure 1.

Example of message flow crossing the inflection nodes.

Thus, in this article, we attempt to calculate the scale of each node’s subtree to detect the inflection nodes and modify the distance between two nodes whose communication path contains the inflection nodes. To the best of our knowledge, this is the first research on combining ranging techniques and the MDS algorithm that also considers complex wireless environments.

It is worth noting that the performance of our proposed algorithm degrades when the coverage hole is located at the edge of the network, because the inflection nodes are then selected inappropriately, especially at the junction of the coverage hole and the network boundary. This affects the bent-length estimation and the final localization result. However, a coverage hole at the edge of the network rarely occurs, and how to deal with this scenario is left as future work, which we are currently working on.

Key ideas of RMDS

In this section, we present an overview of the RMDS, including the key ideas of the RMDS and the main procedures.

When using the MDS method to calculate the positions of all nodes, it is crucial to obtain an accurate distance matrix that consists of the shortest paths between each pair of nodes. However, as mentioned above, when coverage holes exist in the network, the paths between different nodes around coverage holes will deviate from straight lines, thus producing large errors if we treat the bent paths as ordinary paths and use hop counts multiplied by the average physical hop distance to calculate their lengths. Thus, the heart of RMDS is a method for identifying the inflection nodes and correcting the lengths of bent paths in the distance matrix. As described above, the inflection nodes are nodes near or at the network corners, where transmission paths are bent and deviate from their straight-line course. They are critical for identifying coverage holes and calculating the lengths of bent paths.
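To give a rough sense of the error involved, consider an assumed geometry that is not taken from the article: the shortest path between two nodes consists of two straight legs of lengths $a$ and $b$ that meet at an inflection node with interior angle $θ$. The law of cosines relates the bent length to the true Euclidean distance.

```latex
d_{\text{bent}} = a + b, \qquad
d_{\text{Euclid}} = \sqrt{a^{2} + b^{2} - 2ab\cos\theta}
% Illustrative values: a = b = 10, \theta = 90^{\circ}
% d_{\text{bent}} = 20, \quad d_{\text{Euclid}} = \sqrt{200} \approx 14.14
% so the uncorrected path length overestimates the distance by about 41\%
```

An overestimate of this size in even a few entries of the distance matrix distorts the MDS embedding, which is why the bent entries are corrected before localization.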

Obviously, every single node and its multi-hop neighbors in a wireless network will form a tree architecture, and every node in the tree has its array of subtree sizes. It can be easily determined that if the network has no holes, nodes at the same depth in the tree have approximately the same subtree sizes for any hops. However, this rule will be violated in a network with holes or boundaries. Messages that flow in a nearly straight-line manner will deviate and be forced to travel via the inflection nodes, which contributes to creating a fatter subtree. In other words, an inflection node is more likely to have a large subtree. For example, in Figure 1, more messages flow along the dotted line crossing the inflection nodes than along other nodes, which will produce a fatter subtree in the tree architecture. Based on the statistics of each node in these trees created by boundary nodes, every node can compare its subtree sizes with some standards, and the inflection degree of each node will increase with the subtree sizes over the standards. If a node finds that its own inflection degree is higher than a certain threshold, it will be taken as an inflection node. There are many paths between any two nodes in these trees, and thus, we select the path with the minimum number of hops as the shortest path.
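The subtree statistic can be sketched as follows, assuming the flooding tree is approximated by a breadth-first tree and that a node counts itself in its own subtree; both are simplifications made for illustration. The function names and the toy topology are hypothetical.

```python
from collections import deque


def bfs_parents(adj, root):
    """Parent pointers of a breadth-first tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def h_hop_subtree_sizes(parent, h):
    """Number of tree descendants within h levels below each node, plus the node itself."""
    size = {u: 1 for u in parent}
    for u in parent:
        ancestor, steps = parent[u], 1
        # Credit u to each of its ancestors that is at most h levels above it.
        while ancestor is not None and steps <= h:
            size[ancestor] += 1
            ancestor, steps = parent[ancestor], steps + 1
    return size


# Toy topology: node 1 funnels three children, node 2 is a leaf at the same depth.
adj = {0: [1, 2], 1: [0, 3, 4, 5], 2: [0], 3: [1], 4: [1], 5: [1]}
sizes = h_hop_subtree_sizes(bfs_parents(adj, root=0), h=2)
```

Here nodes 1 and 2 sit at the same depth but have sizes 4 and 1; a gap of this kind relative to a per-depth reference is what would raise a node's inflection degree.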

In a large-scale WSN, the energy consumption and ACK explosion problem of traditional flooding protocols^30,31 are unacceptable. In the RMDS, we propose a high-efficiency and energy-aware flooding mechanism to estimate the subtree size and inflection degree of each node. The flooding protocol is based on link correlations for grouping one-hop neighbors into clusters and electing cluster heads for aggregate ACKs to prevent the ACK explosion problem. By collecting the aggregate ACKs from each cluster head, the sender calculates its real-time probability of achieved reliability, which is relatively poor in complex wireless environments. Then, the sender node can decide which node will rebroadcast and how many times to retransmit the message based on the real-time probability and opportunistically. In addition, a comprehensive consideration of rapid coverage and balanced energy that allocates different priorities to one-hop neighbors can suppress the transmission of nodes that have less residual energy and fewer non-covered neighbors. Thus, we can detect the size of the subtree and inflection degree of each node to determine the positions of coverage holes and inflection nodes. In addition, by embedding the TOF ranging into the message transmission of the flooding protocol, we can easily measure the one-hop distance between two nodes. In addition, to improve the ranging accuracy in complex wireless environments, such as underground mines and indoors, we propose an adaptive Kalman filter algorithm with colored measurement noise to eliminate NLOS errors, which we have proven to be non-Gaussian-distributed noise.¹² Finally, we can correct the distance between the node pairs whose transmission paths are bent around the coverage hole using the inflection degree, and thus, the MDS localization process is able to be performed successfully.

Auxiliary nodes

In the following, we will present the meanings and methods for selecting two types of auxiliary network nodes, called network boundary nodes and network landmark nodes, which exist in the wireless networks and are not virtual nodes. The purpose of the two types of nodes is to assist the flooding process for identifying inflection nodes and the shortest paths.

Boundary nodes

Boundary nodes are a set of nodes on or close to the edge of a network. Nodes broadcast messages, and each node that receives a message responds with an ACK. With the flooding protocol, every node can obtain its one-hop neighbors. In addition, information is exchanged between nodes so that every node learns the number of nodes within its two-hop distance. Then, each node exchanges its number of two-hop neighbors and compares it with the average degree. If the number of neighbors is substantially smaller than the average degree, the node marks itself as a boundary node. It is worth noting that the boundary nodes need not coincide with the network boundary. Under certain rough conditions, they may not cover the entire network edge; sometimes, they cover the network boundary only approximately. We do not look for full or exact boundaries, as they are not required by our algorithm. These boundary nodes merely help us select some nodes for flooding and thus do not need to be full or exact in any sense. Figure 2(a) shows the original network with a hole, and Figure 2(b) shows the selected boundary nodes.
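A centralized sketch of this test is shown below. The article does not quantify "substantially smaller", so the `ratio` parameter is a hypothetical knob, and the grid topology is only a toy example.

```python
def grid_adjacency(rows, cols):
    """4-neighbor grid used only as a toy topology for the example."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            adj[(r, c)] = [(r + dr, c + dc)
                           for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                           if 0 <= r + dr < rows and 0 <= c + dc < cols]
    return adj


def two_hop_count(adj, u):
    """Number of distinct nodes within two hops of u, excluding u itself."""
    seen = {u}
    for v in adj[u]:
        seen.add(v)
        seen.update(adj[v])
    return len(seen) - 1


def boundary_nodes(adj, ratio=0.8):
    """Nodes whose two-hop neighbor count is below ratio * network average.

    `ratio` is an assumed threshold, not a value given in the article.
    """
    counts = {u: two_hop_count(adj, u) for u in adj}
    average = sum(counts.values()) / len(counts)
    return {u for u, c in counts.items() if c < ratio * average}


grid = grid_adjacency(5, 5)
outer_ring = boundary_nodes(grid, ratio=1.0)   # everything below the average
corners_only = boundary_nodes(grid)            # stricter default threshold
```

On this 5 × 5 grid, `ratio=1.0` marks the whole outer ring while the stricter default marks only the corners, which shows how the threshold trades completeness of the boundary against false positives.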

Figure 2.

An example of boundary and landmark nodes. (a) Original network with a coverage hole. (b) Boundary nodes in the network. (c) Landmark nodes in the network.

Landmark nodes

The landmark nodes are nodes uniformly sampled from the original network with a density controlled by a small parameter $L$ , which is a hop number. Initially, there are no landmark nodes in the network. Once the localization process begins, all nodes start to flood a message to their L-hop neighbors. If a node has been marked as a landmark node and receives the message, it will respond with an ACK. If a message sender has not received any ACK, in other words, there are no previously established landmark nodes in its neighborhood, the sender marks itself as a landmark and then notifies its neighbors. During this process, if a node has already received such a notification before its own flooding, then it cancels its action. The process is run from the network boundary to the whole network. Figure 2(c) shows the landmark nodes in the network. The parameter $L$ only helps in sampling nodes from the network and thus does not impact the whole process. The parameter is generally set according to the density of the nodes in the network. When the density is small, the parameter should be set as a large value in consideration of the coverage. In our article, the parameter $L$ is set as three hops. After the two types of auxiliary nodes have been selected, nodes that are both boundary nodes and landmark nodes mark themselves as boundary landmarks.
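The sampling rule can be sketched sequentially as follows, assuming nodes act one at a time in a given order; in the real protocol the floods are distributed, so this only illustrates the resulting spacing. The function names and the 10-node chain are hypothetical.

```python
from collections import deque


def within_hops(adj, src, max_hops):
    """Set of nodes reachable from src in at most max_hops hops (src included)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue                      # do not expand beyond the hop limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def select_landmarks(adj, order, L=3):
    """A node becomes a landmark only if no landmark already lies within L hops."""
    landmarks = set()
    for u in order:
        if not (within_hops(adj, u, L) & landmarks):
            landmarks.add(u)
    return landmarks


# Hypothetical 10-node chain, processed from one end to the other.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
landmarks = select_landmarks(chain, order=range(10), L=3)
```

With $L = 3$ the chain yields landmarks at nodes 0, 4, and 8: any two landmarks are more than $L$ hops apart, and every node has a landmark within $L$ hops.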

Procedure of the RMDS

In the following, we describe the main steps of the RMDS.

Select two types of auxiliary nodes to assist in flooding.

Calculate the subtree size and inflection degree using an efficient flooding protocol and identify the inflection nodes.

Perform TOF ranging, NLOS mitigation, and MDS localization.

We assume that there is a coordinator for the network that controls the flooding process and the whole localization process. There are some notations and symbols used to describe the RMDS algorithm, which are listed in Table 1 for ease of reference.

Table 1.

Notation and definition.

Notation	Definition
$L$	Hop count used for sampling landmark nodes
$α$	Energy priority
$β$	Coverage priority
$R$	Preset expected reliability
$E (k)$	Residual energy of node $k$
$C (k)$	Relative coverage of node $k$
$P_{s} (k)$	Retransmission priority of node $s$ ’s neighbor $k$
$E R_{sk}$	Local expected reliability of node $s$ ’s neighbor $k$
$l_{sk}$	Link error rate from $s$ to $k$
$n (s)$	Retransmission times of node $s$
${ER}_{sk}^{n (s)}$	Local expected reliability of node $s$ ’s neighbor $k$ , with retransmission times of $n (s)$
$E R_{s}$	Local expected reliability of node $s$
$M_{1} (s)$	One-hop neighbors of node $s$
$M_{2} (s)$	Two-hop neighbors of node $s$
$M'_{1} (s)$	Retransmission subset of node $s$
$B_{sv} (i)$	A bitmap representing the ith reception state, where $v$ is the receiver and $s$ is the transmitter
$L C_{k} (m | k)$	Link correlation between nodes $m$ and $k$
$C_{TH}$	Clustering threshold
$Size_{s} (le, h)$	h-hop subtree size of node $s$ with level $le$
$S P_{\min (m, k)}$	The shortest path between nodes m and k
$(\bar{d})$	Average node degree
$d (i)$	Inflection degree of node $i$
$t_{m, k}$	Measured distance between nodes $m$ and $k$
$S_{m, k}$	Corrected bent length of the path $S P_{\min (m, k)}$
$Δ$	Distance matrix
$Y$	Number of nodes in the network

Network structure and detection of shortest paths

In this section, every boundary landmark will flood the network to detect inflection nodes that are described in subsection “MDS-based localization” of section “Related work” and the shortest path between any two nodes. Before this action, we have to cluster the neighbors of nodes with an effective flooding.

Coverage-aware and link-correlation-balanced flooding (CLF) mechanism

In large-scale networks with frequent flooding, the energy consumption of nodes cannot be ignored. These large numbers of floods expend energy rapidly, leading to nodes with less residual energy. To reduce the energy consumption of flooding, we divide the one-hop neighbors into clusters. Nodes within a cluster reply with fewer ACKs during flooding. Meanwhile, to guarantee the performance of the flooding, it is usually assumed that there is no packet loss during data transmission. However, in the above-mentioned complex wireless environment, this assumption is no longer valid.³¹

Calculating retransmission priorities

First, a node that will broadcast a message (or rebroadcast a received message) will calculate the priorities of all neighbors. Suppose that the sender is $s$ , the number of neighboring nodes is $M_{1} (s)$ , and each neighbor $k$ has residual energy $E (k)$ and relative coverage $C (k)$ . Then, we can define the priority $P_{s} (k)$ of every neighbor, which is given by

P_{s} (k) = \frac{E (k)^{α} C (k)^{β}}{\sum_{j = 1}^{M_{1} (s)} E (j)^{α} C (j)^{β}}

(1)

where $α$ and $β$ are defined as energy priority and coverage priority, respectively. If $α > β$ , route selection has the characteristic of tending toward energy balance instead of rapid coverage and vice versa.

This CLF mechanism must resolve two issues to determine the retransmission priority when a packet loss occurs. The first is the residual energy of all one-hop neighbors, which requires each node to maintain a one-hop neighbor table recording the real-time residual energy of its neighbors. The second is the relative coverage of each one-hop neighbor, that is, the number of that neighbor’s own neighbors that have not yet been covered by the packet from the sender. The multi-hop neighbor tables are updated as follows.

Detecting the residual energy of all neighbors. Each node monitors its residual energy in real time and informs its neighbors by sending a periodic broadcast. A neighbor who receives this broadcast will update its local one-hop neighbor table corresponding to the sender’s index. Thus, the priority of nodes with low residual energy can be decreased in the next round of selecting retransmission nodes. For instance, in Figure 3, the black circles including $s_{1}$ and $s_{2}$ are nodes that received the same message from a common sender. The white circles are neighbors of $s_{1}$ and $s_{2}$ that are not covered by the message that is received by $s_{1}$ and $s_{2}$ . Each block that is adjacent to $s_{1}$ and $s_{2}$ represents 20% of the initial energy, whereas the black circles represent residual energy. As shown in the example in Figure 3, since the residual energies of $s_{1}$ and $s_{2}$ are 60% and 40% of the initial energy, respectively, the competitiveness of $s_{1}$ is enhanced.

Determining the relative coverage. In Figure 3, except for the black nodes, which have already received the sender’s message, the remaining white circles represent the relative coverage of $s_{1}$ and $s_{2}$ . Therefore, the relative coverage of $s_{1}$ is 4 and that of $s_{2}$ is 3. It is obvious that $s_{1}$ covers a larger new area and has more residual energy; hence, it has a higher priority than $s_{2}$ .
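Equation (1) applied to the Figure 3 values can be sketched as follows. The residual energies (60% and 40%) and relative coverages (4 and 3) come from the example above; setting $α = β = 1$ is an assumption made only for illustration, and the function name is hypothetical.

```python
def retransmission_priorities(energy, coverage, alpha=1.0, beta=1.0):
    """Equation (1): normalized product of energy**alpha and coverage**beta."""
    weights = {k: (energy[k] ** alpha) * (coverage[k] ** beta) for k in energy}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Values read from the Figure 3 example; alpha = beta = 1 is assumed.
energy = {"s1": 0.6, "s2": 0.4}
coverage = {"s1": 4, "s2": 3}
priority = retransmission_priorities(energy, coverage)
```

Under these assumptions $s_{1}$ receives a priority of 2/3 and $s_{2}$ a priority of 1/3. Raising $α$ relative to $β$ shifts weight toward nodes with more residual energy, and the reverse favors rapid coverage.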

Figure 3.

Relative coverage and residual energy.

Local expected reliability

During the flooding process, it is important to ensure that every packet can be transmitted correctly. However, in a complex wireless environment, it is very difficult to achieve a 100% successful transmission rate.³² Thus, we usually have an expected reliability (ER) $R$ under certain retransmission attempts; this reliability may be 90% and can vary between different application scenarios. It is worth noting that $R$ is a whole-network metric indicating the percentage of data and control packets that can be transmitted to their destinations on average. Similarly, each node can measure its own local ER.

After assigning priorities to its one-hop neighbors, the sender $s$ calculates its local ER to decide how many nodes should rebroadcast and how many times to retransmit. The local ER can be calculated by

E R_{sk} = \begin{cases} 1 - l_{sk} & \text{if } k \in M_{1} (s) \\ (1 - l_{sm}) (1 - l_{mk}) & \text{if } k \in M_{2} (s), m \in M_{1} (s) \end{cases}

(2)

where $l_{sk}$ indicates the link error rate from $s$ to $k$ , which can be maintained and updated with a periodic “Hello” message, and $M_{1} (s)$ and $M_{2} (s)$ are the one-hop and two-hop neighbors of node $s$ , respectively.

First, $s$ sets its retransmission times $n (s)$ to 1 and inserts the node that has the highest priority among all one-hop neighbors into the retransmission subset (which is denoted as $M'_{1} (s)$ ). Then, it calculates the local ER to its neighbors; if $\exists m, E R_{sm} < R$ , $s$ adds the remaining neighbor with the highest priority to $M'_{1} (s)$ . All the nodes in $M'_{1} (s)$ will retransmit the data packets received from $s$ . If adding all one-hop neighbors to $M'_{1} (s)$ still cannot satisfy the target reliability $R$ , $s$ increases $n (s)$ to 2, clears the previous subset, and starts selecting again. The local ER from $s$ to a one-hop neighbor $k$ increases to $1 - l_{sk}^{n (s)}$ if $s$ decides to transmit the message $n (s)$ times. However, for the two-hop neighbors, $s$ cannot know how many times these nodes will transmit because each node decides its number of transmissions independently. Therefore, we make the conservative assumption that all one-hop neighbors between the sender and the sender’s two-hop neighbors perform the same number of transmissions as $s$ . Therefore, the following equation holds

{ER}_{sk}^{n (s)} = \begin{cases} 1 - l_{sk}^{n (s)} & \text{if } k \in M'_{1} (s) \\ (1 - l_{sm}^{n (s)}) (1 - l_{mk}^{n (s)}) & \text{if } k \in M_{2} (s), m \in M'_{1} (s) \end{cases}

(3)

Thus, overall, for the sender $s$ , its local ER is given by

E R_{s} = \frac{\sum_{k \in M'_{1} (s) \cup M_{2} (s)} {ER}_{sk}^{n (s)}}{| M'_{1} (s) | + | M_{2} (s) |}

(4)

$s$ stops selecting the retransmission subset or increasing the retransmission number once the achieved ER is greater than $R$ . To avoid the calculation of the retransmission number becoming an endless loop problem, we set a maximum number of retransmissions, and the node that has the maximum number of retransmissions will be considered invalid.
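The selection loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the `loss` dictionary keyed by directed links, and the rule of crediting a two-hop neighbor with its best relay in the subset are all hypothetical choices made here, since the text does not say which relay $m$ is used in equation (3).

```python
def local_er(sender, subset, two_hop, loss, n):
    """Average expected reliability of `sender`, following equations (3) and (4).

    `loss[(u, v)]` is the link error rate from u to v; a missing key means no link.
    """
    total = 0.0
    for k in subset:
        total += 1.0 - loss[(sender, k)] ** n
    for k in two_hop:
        # Assumption: use the most reliable relay in the subset that reaches k.
        best = 0.0
        for m in subset:
            if (m, k) in loss:
                er = (1.0 - loss[(sender, m)] ** n) * (1.0 - loss[(m, k)] ** n)
                best = max(best, er)
        total += best
    return total / (len(subset) + len(two_hop))


def select_retransmitters(sender, ranked_one_hop, two_hop, loss, target, max_n):
    """Grow the subset by priority; restart with more transmissions if needed."""
    for n in range(1, max_n + 1):
        subset = []  # the previous subset is cleared whenever n increases
        for k in ranked_one_hop:  # already sorted, highest priority first
            subset.append(k)
            er = local_er(sender, subset, two_hop, loss, n)
            if er >= target:
                return subset, n, er
    return None  # target not reachable within the retransmission limit


loss = {("s", "a"): 0.1, ("s", "b"): 0.1, ("a", "c"): 0.1, ("b", "c"): 0.5}
result = select_retransmitters("s", ["a", "b"], ["c"], loss, target=0.9, max_n=3)
print(result)
```

In this toy topology a single transmission cannot reach the target of 0.9 even with both neighbors relaying, so the sketch falls back to two transmissions with only the highest-priority neighbor in the subset.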

Link correlation and clustering

To avoid the ACK explosion problem in the traditional flooding algorithm, we exploit link correlations in the clustering and aggregating ACK process to reduce the energy consumption during both data and ACK transmissions. Nodes that have similar link conditions are grouped into one cluster, and one ACK from the cluster head can acknowledge on behalf of all the nodes in the cluster. This effectively ameliorates the ACK explosion problem. First, we define the link correlation $LC$ as

L C_{k} (m | k) = \frac{\sum_{i = 1}^{W} B_{sk} (i) \, \& \, B_{sm} (i)}{\sum_{i = 1}^{W} B_{sk} (i)}

(5)

where $s$ is the sender and $k$ and $m$ are two sink nodes of $s$ . Supposing that $k$ receives a broadcast from $s$ , $B_{sk} (i)$ is the bit representing the receiver’s reception state of the ith basic flooding message sent from $s$ , where “1” means that the reception is successful and “0” means that the reception failed. In this article, taking the required memory space and overhead of the control message into account, each node only keeps the last 10 messages, and thus, $W$ equals 10.

For example, in Figure 4, a bitmap of “1101100111” from node $m$ indicates that $m$ does not receive the fourth, fifth, and eighth messages transmitted by $s$ (counting from the rightmost bit), and node $k$ does not receive the fourth, fifth, eighth, and ninth messages. Node $k$ can use equation (5) to calculate the link correlation, which is expressed as $L C_{k} (m | k) = \frac{1 \& 1 + 0 \& 1 + 0 \& 0 + 1 \& 1 + 1 \& 1 + 0 \& 0 + 0 \& 0 + 1 \& 1 + 1 \& 1 + 1 \& 1}{1 + 1 + 0 + 1 + 1 + 0 + 0 + 1 + 1 + 1} = \frac{6}{7} \approx 85.7\%$

Figure 4.

Example of calculating link correlation.
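The bitmap arithmetic of equation (5) can be reproduced with a short Python sketch. The function name and the string representation of the bitmaps are assumptions made for illustration. The second bitmap, “1001100111”, is inferred from the description of the messages that node $k$ misses; the value 6/7 of the worked example is obtained when the denominator counts the successful receptions in “1101100111”.

```python
def link_correlation(reference: str, other: str) -> float:
    """Share of the messages received on the reference link that the other link also received.

    Both arguments are reception bitmaps of equal length ("1" = received, "0" = lost).
    """
    if len(reference) != len(other):
        raise ValueError("bitmaps must cover the same window of messages")
    received = reference.count("1")
    if received == 0:
        return 0.0  # nothing received on the reference link, so no evidence of correlation
    both = sum(1 for a, b in zip(reference, other) if a == "1" and b == "1")
    return both / received


print(round(link_correlation("1101100111", "1001100111"), 3))  # 6 of 7 receptions are shared
```

Swapping the two arguments conditions on the other link and generally gives a different value, which is why the direction of the conditioning matters when the sender compares receivers.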

ACK aggregation

After the sender collects all the receivers’ bitmaps and link correlations, it begins to group neighbors into clusters and aggregate ACKs. As mentioned above, we choose the link error rate and the link correlation as the clustering metrics.

Although the bitmaps capture the distinguishing features of link correlation, it is unclear how to find a criterion that precisely divides one-hop neighbors into clusters based on link correlation. Furthermore, the impact of such a criterion on the final clustering results differs considerably between circumstances. In CLF, we do not pursue an ideal, unique criterion. Instead, we use a very simple linear function and let the clustering threshold change adaptively during the clustering process. As mentioned above, in a complex wireless environment, we usually have a preset ER $R$ within a certain number of retransmissions; we therefore set an initial clustering threshold $C_{TH} = R$ for all neighbors and decrease $C_{TH}$ by $0.1 R$ in each round in our setting. The initial threshold can be made even higher. A high threshold means that nodes do not readily consider their links to be correlated, so the strict criterion generates clusters of only a few nodes. In each round, a new cluster that satisfies the updated $C_{TH}$ is generated from the remaining unassigned nodes.

During the clustering process, the sender checks whether $C_{TH} < 0.5$ . A threshold this low means that the receptions of a broadcast message at different receiving nodes are almost independent; therefore, the clustering process stops, and each remaining unassigned node becomes its own cluster head. The adaptive mechanism saves us the trouble of searching for an optimal clustering threshold for all possible scenarios in a case-by-case manner. The mechanism also guarantees that every node is assigned to a cluster.
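One possible reading of the adaptive threshold can be sketched as follows. Several details are assumptions introduced only for this sketch and are not specified in the text: membership is decided by the correlation with a single seed node, the correlation is taken as the smaller of the two conditional values, and the largest candidate group is chosen in each round. All names are hypothetical.

```python
def correlation(reference, other):
    """Conditional reception correlation of two bitmaps, in the form of equation (5)."""
    received = reference.count("1")
    if received == 0:
        return 0.0
    both = sum(1 for a, b in zip(reference, other) if a == "1" and b == "1")
    return both / received


def mutual(a, b):
    # Assumption: require the correlation to hold in both directions.
    return min(correlation(a, b), correlation(b, a))


def cluster_neighbors(bitmaps, target, floor=0.5):
    """Form one cluster per round while lowering the threshold by 0.1 * target."""
    unassigned = sorted(bitmaps)
    clusters = []
    round_index = 0
    while len(unassigned) > 1:
        threshold = target * (1.0 - 0.1 * round_index)
        if threshold < floor:
            break  # receptions are treated as almost independent below the floor
        groups = [
            [seed] + [n for n in unassigned
                      if n != seed and mutual(bitmaps[seed], bitmaps[n]) >= threshold]
            for seed in unassigned
        ]
        best = max(groups, key=len)
        if len(best) > 1:
            clusters.append(best)
            unassigned = [n for n in unassigned if n not in best]
        round_index += 1
    clusters.extend([n] for n in unassigned)  # leftovers become their own heads
    return clusters


bitmaps = {"a": "1111111111", "b": "1111111110", "c": "0000011111",
           "d": "0000011110", "e": "1010101010"}
print(cluster_neighbors(bitmaps, target=0.85))
```

With these made-up bitmaps the first round groups `a` and `b`, the relaxed threshold of the second round groups `c` and `d`, and `e` is left as a single-node cluster.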

Figure 5 shows an example of the clustering process based on link correlations, wherein five receivers are divided into two groups. Clearly, the link states represented by the bitmaps of nodes grouped in the same cluster are strongly correlated, so a single ACK from the cluster head can represent the acknowledgments of all the members. Specifically, with a high link correlation, nodes within the same cluster have a high probability of receiving the same packet, and the reception at all these nodes can be acknowledged with one ACK. The node with the highest link error rate in each cluster is selected as the cluster head to send the ACK. For instance, in Figure 5, nodes $m_{2}$ and $m_{5}$ are selected as the cluster heads of the corresponding clusters for $s_{1}$ . In addition, for cluster 1 of $s_{1}$ , the probability that the entire cluster has received a broadcast packet given successful reception at the cluster head can be easily calculated to be 100%.

Figure 5.

Example of the clustering process.

We summarize the coverage-aware and link-CLF mechanism in Algorithm 1.

Algorithm 1. Coverage-aware and Link-correlation-balanced flooding
Require: $\forall s, M_{1} (s), M_{2} (s), R$
1: $M'_{1} (s) = \emptyset$ , $n (s) = 1$ , ${ER}_{s}^{n (s)} = 0$
2: for all $k \in M_{1} (s)$ do
3:   Calculate $P_{s} (k)$ with Equation 1
4: end for
5: Sort $M_{1} (s)$ by priority
6: while (( ${ER}_{s}^{n (s)} < R$ ) && ( $n (s)$ < Maximum retransmission times)) do
7:   Select the node $k$ with the highest priority from $M_{1} (s)$
8:   $M'_{1} (s) = M'_{1} (s) \cup \{k\}$
9:   Calculate ${ER}_{s}^{n (s)}$ with Equation 4
10:   if $M'_{1} (s) = M_{1} (s)$ then
11:     $n (s) + +$
12:     continue
13:   end if
14: end while
15: Calculate link correlation with Equation 5
16: Cluster, select cluster heads
17: for $\forall k \in M'_{1} (s)$ do
18:   retransmit the data message
19:   if $k$ is cluster head then
20:     transmit ACK to $s$
21:   end if
22: end for

Detecting inflection nodes

We now describe how to detect the inflection nodes using the above-mentioned efficient flooding protocol. During this step, boundary landmarks flood the whole network using the coverage-aware and link-CLF mechanism, while the network coordinator monitors the flooding process and records the total number of flooding sources $M$ , which is also the number of boundary landmarks. A boundary landmark first sets its level to zero and floods a message. The message carries a hop counter such that every node can learn its own level; the nodes are also clustered based on the link correlation described in subsection “Coverage-aware and link-CLF mechanism” of section “Network structure and detection of shortest paths.” Suppose that node $s$ is flooding the message and that its level is $le$ ; it divides its neighbors into clusters and assigns a head to each cluster. Each of the neighbors marks its own level as $le + 1$ , and the cluster head replies to node $s$ with an ACK representing the cluster. If a node’s level has been marked, it will not be marked again. A node that does not receive any ACK for its forwarded message after a certain period of time regards itself as a leaf node. When the flooding process finishes, all nodes constitute a tree whose root is a boundary landmark.

Leaf nodes begin to return reports up the tree. The reports contain an array of a node’s subtree sizes, defined as $size_{s} (le, 1), size_{s} (le, 2), \dots, size_{s} (le, h)$ , where $le$ is the level of node $s$ and $h$ is the number ID in the same level. When $s$ has collected reports from all its children, it calculates its own subtree sizes and sends a report back up the tree. In this article, we use a method similar to that proposed in Tan et al.²⁹ to detect the inflection nodes. Every node saves its own subtree sizes, and after the root obtains its sizes, the detection process is completed. If $s$ is not a root node, it checks whether its subtree sizes satisfy the criteria $size_{s} (le, h) > k \times h$ and $h > 3$ , where $h$ is the depth of the node in the tree and $k$ is a parameter. If the criteria are satisfied by a certain node, the node increases its inflection degree by one.

It is clear that a node’s inflection degree is no greater than the number of flooding sources, that is, $M$ . In this article, we define the inflection degree threshold as $0.1 M$ . Every node compares its inflection degree with the threshold, and nodes with a greater inflection degree mark themselves as inflection nodes. Then, we can determine the transmission path of each node pair in the tree. First, the source and sink nodes in the pair, denoted $m$ and $k$ , for example, simultaneously send messages containing their node IDs to their parents. Each message is forwarded to ancestor nodes until a node finds that it is their common ancestor. With the nodes recorded along the way, we can extract the path between $m$ and $k$ in the tree. Because each boundary landmark roots its own tree, we obtain multiple paths between $m$ and $k$ . Among these paths, we select the one with the minimum number of hops as the shortest path, that is, $S P_{\min (m, k)}$ . In Figure 6, the straight bold line indicates the true path between two nodes. The shortest path is bent by the hole and crosses the inflection node.
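The path extraction through a common ancestor can be illustrated with a small Python sketch. The representation of each flooding tree as a child-to-parent dictionary and the function names are assumptions made for illustration; in the scheme itself this is carried out with messages rather than on a central data structure.

```python
def tree_path(parent, m, k):
    """Path from m to k in one flooding tree given as a child -> parent map.

    The root maps to None. Both nodes are assumed to belong to the tree.
    """
    chain = []  # m followed by all of its ancestors up to the root
    node = m
    while node is not None:
        chain.append(node)
        node = parent[node]
    position = {n: i for i, n in enumerate(chain)}

    tail = []  # k and its ancestors below the common ancestor
    node = k
    while node not in position:
        tail.append(node)
        node = parent[node]
    return chain[: position[node] + 1] + tail[::-1]


def shortest_tree_path(trees, m, k):
    """Pick the path with the fewest hops among the trees of all flooding sources."""
    return min((tree_path(parent, m, k) for parent in trees), key=len)


tree_1 = {"r": None, "a": "r", "b": "r", "m": "a", "k": "b"}
tree_2 = {"m": None, "k": "m", "a": "m", "b": "k", "r": "a"}
print(tree_path(tree_1, "m", "k"))
print(shortest_tree_path([tree_1, tree_2], "m", "k"))
```

In the first tree the two nodes only meet at the root, giving a four-hop path, whereas the second tree connects them directly, so the second path is the one selected.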

Ranging between nodes and filtering

After detecting the inflection nodes, we can measure the distance between any two nodes and amend the distances of the node pairs whose transmission path includes inflection nodes, meaning that their path is bent around a coverage hole. TOF is a ranging method that utilizes the radio flight time between nodes to measure their distances.³³ Because a node is able to communicate with its one-hop neighbors directly, TOF ranging is feasible, although it requires hardware support. A node performs TOF ranging with its one-hop neighbors and saves its measurements. In a harsh communication environment, these measurements always contain NLOS error, which is the main factor degrading location accuracy. NLOS error adds a positive bias to the measured range and has been modeled in different ways in the literature, for example, as exponentially distributed,³⁴ uniformly distributed,³⁵ Gaussian distributed,³⁶ or constant along a time window.³⁷ Typically, the model depends on the wireless propagation channel and the specific technology under consideration.³⁸ In our previous work,¹² we showed that this bias error can be accurately modeled as colored noise, that is, as the sum of an accumulated error and zero-mean white Gaussian noise. Therefore, in this step, we use an NLOS mitigation algorithm based on the adaptive Kalman filter similar to the algorithm presented in Li et al.¹² The system equation and the observation equation of the Kalman filter can be given as follows

x_{k + 1} = A x_{k} + B α_{k}

(6)

r_{k} = C x_{k} + β_{k}

(7)

where $A$ , $B$ , and $C$ are known constant matrices, $α_{k}$ is the process noise, and $β_{k}$ is the ranging error. The colored noise at step $k$ consists of both the previous noise and zero-mean white Gaussian noise. Therefore, $β_{k}$ can be calculated as follows

β_{k} = N_{k - 1} β_{k - 1} + γ_{k}

(8)

where $N_{k}$ is an auto-regression coefficient and $γ_{k}$ is white Gaussian noise. Based on these equations, the filter process is as follows

P_{k, k - 1} = A P_{k - 1, k - 1} A^{T} + BQ B^{T}

(9)

H_{k - 1} = [CA - N_{k - 1} C]

(10)

G_{k} = (A P_{k - 1, k - 1} H_{k - 1}^{T} + BQ B^{T} C^{T}) {(H_{k - 1} P_{k - 1, k - 1} H_{k - 1}^{T} + CBQ B^{T} C^{T} + {\hat{R}}_{k - 1})}^{- 1}

(11)

P_{k, k} = (A - G_{k} H_{k - 1}) \cdot P_{k - 1, k - 1} A^{T} + (I - G_{k} C) BQ B^{T}

(12)

{\hat{x}}_{k / k} = A {\hat{x}}_{k - 1 / k - 1} + G_{k} \cdot (r_{k} - N_{k - 1} r_{k - 1} - H_{k - 1} {\hat{x}}_{k - 1 / k - 1})

(13)

The process consisting of the above formulas is iterative, and the detailed derivation can be found in Li et al.¹² The output of the iteration is the distance required for localization. These optimized distances help us to calculate the smallest distances and localize nodes in the subsequent steps.
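To make the recursion concrete, the following Python sketch runs the update equations (10) to (13) for a scalar state. It is a simplified illustration under assumptions that are not taken from the source: $A = B = C = 1$ (a static distance observed directly), a fixed auto-regression coefficient, and fixed noise variances. The adaptive estimation of these quantities described in Li et al.¹² is not reproduced, and all parameter names and default values are hypothetical.

```python
def filter_ranges(ranges, n_coef=0.5, q=0.01, r_var=1.0, x0=None, p0=100.0):
    """Scalar colored-noise Kalman recursion over a list of range measurements."""
    a = b = c = 1.0                      # assumed scalar system matrices
    h = c * a - n_coef * c               # equation (10)
    x = ranges[0] if x0 is None else x0  # initial state estimate
    p = p0                               # initial error covariance
    previous = ranges[0]
    estimates = []
    for measurement in ranges[1:]:
        # Equation (11): gain from the covariance of the previous step.
        gain = (a * p * h + b * q * b * c) / (h * p * h + c * b * q * b * c + r_var)
        # Equation (13): the differenced measurement removes the correlated part of the noise.
        innovation = measurement - n_coef * previous - h * x
        x = a * x + gain * innovation
        # Equation (12): covariance update.
        p = (a - gain * h) * p * a + (1.0 - gain * c) * b * q * b
        previous = measurement
        estimates.append(x)
    return estimates


# Noise-free check: a constant 10 m range is recovered from a poor initial guess.
estimates = filter_ranges([10.0] * 201, x0=0.0)
print(round(estimates[-1], 3))
```

The noise-free input is used only to show that the recursion converges to the true distance; with real TOF data the coefficient and variances would have to be estimated, which is the adaptive part of the original algorithm.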

Bent length of path and distance matrix

In a network with coverage holes, there often exist two nodes whose distance cannot be measured directly due to indirect communication. In this case, the distance is usually calculated based on the sum of all measurable distances in the path. However, because of the existence of coverage holes and inflection nodes, the distances between node pairs whose transmission paths are bent around the hole obtained from such a sum will be very different from the true distances. From the description above, it is obvious that inflection degree can be used to measure how bent the path is. In Figure 6, we can determine that the greater the number of inflection nodes along the transmission path of one node pair, the curvier the path is, which means that the real distance will be much shorter than the result calculated using the sum of the measured distances between each hop in this path. To avoid errors and obtain an accurate distance matrix, we can correct the bent length for a one-hop distance along the shortest path $S P_{\min (m, k)}$ for nodes $m$ and $k$

[X (i) \frac{\bar{d}}{d (i)} + (1 - X (i))] t_{i, j}

(14)

where $X (i)$ is an indicator function that takes a value of 1 when node $i$ is an inflection node on $S P_{\min (m, k)}$ and 0 otherwise; $\bar{d}$ and $d (i)$ are the average node degree and the inflection degree of node $i$ mentioned in the above section; and $t_{i, j}$ is the distance measured using the above-described TOF ranging and NLOS mitigation algorithm from node $i$ to node $j$ , which is the next node after $i$ on $S P_{\min (m, k)}$ . The intuition underlying equation (14) is simple: if node $i$ is an inflection node, then the length of the transmission path passing through $i$ should be corrected, and the greater the inflection degree of $i$ is, the more bent the path is. The corrected distance between nodes $m$ and $k$ is then the sum of the corrected hop distances along $S P_{\min (m, k)}$ . Based on this, we can construct the distance matrix of the entire network. Suppose that there exist $Y$ nodes in the 2D space, and for any two nodes $m$ and $k$ , the shortest distance is $S_{m, k}$ . The distance matrix can be expressed as

Δ = [\begin{matrix} S_{1, 1} & S_{1, 2} & \dots & S_{1, Y} \\ S_{2, 1} & S_{2, 2} & \dots & S_{2, Y} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ S_{Y, 1} & S_{Y, 2} & \dots & S_{Y, Y} \end{matrix}]

(15)

Figure 6.

The shortest path deviated by a coverage hole.
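The per-hop correction of equation (14) and its summation along the shortest path can be sketched as follows. The data layout (a list of node IDs for the path and dictionaries for hop distances and inflection degrees) and the numerical values are hypothetical and serve only to illustrate the formula.

```python
def corrected_path_length(path, hop_distance, inflection_degree, average_degree):
    """Sum the hop distances along `path`, scaling hops that start at an inflection node.

    `inflection_degree` contains only the nodes marked as inflection nodes,
    so membership in it plays the role of the indicator X(i) in equation (14).
    """
    total = 0.0
    for i, j in zip(path, path[1:]):
        t_ij = hop_distance[(i, j)]
        if i in inflection_degree:
            total += average_degree / inflection_degree[i] * t_ij  # X(i) = 1
        else:
            total += t_ij                                          # X(i) = 0
    return total


path = ["m", "i", "k"]
hop_distance = {("m", "i"): 50.0, ("i", "k"): 60.0}
print(corrected_path_length(path, hop_distance, {"i": 20}, average_degree=10.0))
```

Here only the hop leaving the inflection node `i` is scaled, by the factor 10/20, so the 110 m hop sum is reduced to 80 m. Repeating this for every node pair fills the entries $S_{m, k}$ of the distance matrix.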

MDS-based localization

After the distance matrices of all the nodes are calculated, we utilize the multidimensional scaling algorithm to calculate the relative coordinates. The heart of the algorithm is the use of the similarity between nodes to establish a mapping and then obtaining the distribution of nodes in the space. Usually, the core of the location calculation using the MDS algorithm is the distance matrix decomposition, which can be implemented using singular value decomposition (SVD).^39,40 However, in a large-scale network, the computational complexity of SVD operations is high and will introduce errors into the final localization results. In this article, we replace the SVD with the iterative maximum gradient descent method to improve the localization accuracy while reducing the computational complexity.

Assume that the coordinate vector of all $Y$ nodes in the 2D space is $Z = [z_{1}, z_{2}, \dots, z_{Y}]$ , where $z_{i} = [x_{i}, y_{i}]^{T}$ . Then, we can define the global cost function as

\begin{aligned} f & = \frac{1}{2} \sum_{i = 1}^{Y} \sum_{j = 1}^{Y} {[S_{i, j} - d_{i, j}]}^{2} \\ & = \frac{1}{2} [\sum_{i = 1}^{Y} \sum_{j = 1}^{Y} S_{i, j}^{2} + \sum_{i = 1}^{Y} \sum_{j = 1}^{Y} d_{i, j}^{2} - 2 \sum_{i = 1}^{Y} \sum_{j = 1}^{Y} S_{i, j} d_{i, j}] \end{aligned}

(16)

where $d_{i, j} = \sqrt{{(x_{i} - x_{j})}^{2} + {(y_{i} - y_{j})}^{2}}$ means the estimated distance between nodes $i$ and $j$ .

This can be converted into matrix form as follows

f = \frac{1}{2} \sum_{i = 1}^{Y} \sum_{j = 1}^{Y} S_{i, j}^{2} + Y \, trace (ZH Z^{T}) - 2 \, trace (Z Γ (Z) Z^{T})

(17)

where $H = I - (1 / Y) E$ , $I$ is the identity matrix of size $Y$ , $E = 11^{T}$ , and $trace (A)$ denotes the trace of the matrix $A$ . $Γ (Z)$ is a symmetric matrix of size $Y$ , whose elements are defined as

Γ_{i, j} (Z) = w_{i, j} = \begin{cases} - \frac{S_{i, j}}{d_{i, j}} & \text{if } i \neq j, d_{i, j} \neq 0 \\ - \sum_{p = 1, p \neq i}^{Y} w_{i, p} & \text{if } i = j \end{cases}

(18)

The optimal coordinate vector $\hat{Z}$ satisfies $\hat{Z} = \arg\min_{Z} f$ . Differentiating equation (16) gives $\frac{\partial f}{\partial Z} = 2 [YZH - Z Γ (Z)]$ , and thus, we can obtain the final coordinates of each node with the well-known maximum gradient descent method.
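A minimal Python sketch of this iteration is given below. It uses the gradient obtained by differentiating the cost in equation (16) node by node, and it assumes a fixed step size of $1 / (2Y)$ , a random initial layout, and a fixed number of iterations; none of these choices is stated in the text, and the function names are hypothetical. With this particular step size each update coincides with the classical stress-majorization update for unit weights, so the cost does not increase from one iteration to the next.

```python
import math
import random


def stress(S, Z):
    """Cost f of equation (16) for a distance matrix S and a list of 2D points Z."""
    n = len(S)
    return 0.5 * sum((S[i][j] - math.dist(Z[i], Z[j])) ** 2
                     for i in range(n) for j in range(n))


def gradient(S, Z):
    """Gradient of f with respect to each point: 2 * [Y (z_i - mean) - sum_j S_ij / d_ij (z_i - z_j)]."""
    n = len(S)
    mean = [sum(z[a] for z in Z) / n for a in range(2)]
    grad = []
    for i in range(n):
        g = [2.0 * n * (Z[i][a] - mean[a]) for a in range(2)]
        for j in range(n):
            d = math.dist(Z[i], Z[j])
            if j != i and d > 0.0:  # coincident points contribute nothing
                for a in range(2):
                    g[a] -= 2.0 * S[i][j] / d * (Z[i][a] - Z[j][a])
        grad.append(g)
    return grad


def mds_descent(S, iterations=200, seed=0):
    """Gradient descent on f from a random start; returns relative 2D coordinates."""
    n = len(S)
    rng = random.Random(seed)
    Z = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(n)]
    step = 1.0 / (2.0 * n)  # assumed step size
    for _ in range(iterations):
        grad = gradient(S, Z)
        Z = [[Z[i][a] - step * grad[i][a] for a in range(2)] for i in range(n)]
    return Z


truth = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
S = [[math.dist(p, q) for q in truth] for p in truth]
print(stress(S, mds_descent(S, iterations=0)), stress(S, mds_descent(S)))
```

The result is a relative layout that is only determined up to translation, rotation, and reflection, which is why reference nodes are still needed to obtain absolute coordinates.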

Simulation

To evaluate the performance of the proposed algorithms, we design simulations and compare the RMDS with two other localization methods. We deploy 1600 wireless sensor nodes in an area of $3200 \times 3200 m^{2}$ . The parameter settings are summarized in Table 2. The coordinator of the WSN is assumed to be placed at the center of the area and to be able to receive all the signals transmitted by the other sensor nodes.

Table 2.

Parameter setting of the simulation.

Parameters	Values
Network topology	Random deployment
Shape of coverage hole	Rectangle and triangle
Size ( $m^{2}$ )	$3200 \times 3200$
Number of nodes	1600
Communication range (m)	80
Wireless propagation model	Shadowing model with path loss exponent of 3.1
TOF measurement noise model	Sum of a zero-mean Gaussian variable and a constant ranging error, similar to Li et al.¹²

Performance of CLF

First, we evaluate the performance of the proposed flooding mechanism. As mentioned before, in a complex wireless environment, it is very difficult to achieve the expected transmission reliability due to the relatively high link error rate and the limited number of transmissions. Thus, during the experiments, we select two metrics: the achieved reliability (network-wide)-to-target reliability ratio (ATR) and the number of transmitted packets (excluding ACK packets) per node (NTP). We compare CLF with another classic flooding protocol called POFA,³⁰ which is also a probabilistic and opportunistic algorithm that considers the retransmission subset and real-time retransmission results. In the comparison, we choose 0.9 and 0.95 as the expected transmission reliabilities instead of 1, which is infeasible in most applications. The performances of the two flooding schemes are then evaluated under link error rates of 0.05, 0.1, and 0.15, which are common values in complex wireless environments.³² Figures 7–9 present the performance of the algorithms in terms of the ATR and NTP. Clearly, the performance of CLF is as high as that of POFA in terms of reliability and the overhead of transmitted packets (not including ACKs). Note that CLF effectively ameliorates the ACK explosion problem, conserves energy on ACKs, and balances the residual energy of sensor nodes in real applications. The fact that the network is steadier and that the lifetime is longer also indicates the effectiveness of the proposed strategy.

Figure 7.

Performance of the flooding mechanism with a link error rate of 0.15.

Figure 8.

Performance of the flooding mechanism with a link error rate of 0.10.

Figure 9.

Performance of the flooding mechanism with a link error rate of 0.05.

In CLF, we take rapid coverage and balanced energy into account when allocating priorities to one-hop neighbors and suppress the transmissions of nodes that have less residual energy and fewer uncovered neighbors. With this design, we can maintain a balanced residual energy over the whole network area. We also perform experiments to show this effect, as illustrated in Figures 10 and 11. In the evaluation, we set the initial energy of each node to 3 mJ and the energy consumption of each transmission to 30 μJ. Then, we compare the imbalance of the energy consumption of all the wireless nodes, which is evaluated by the variance of the residual energy. Figures 10 and 11 show that the residual energy imbalance increases with the number of flooding rounds, which means that the performance is adversely impacted by fading effects. However, CLF lowers the transmission priority of nodes with less residual energy, which helps to balance energy depletion among sensor nodes. From Figure 10, we observe that the variance of CLF is lower than that of POFA from the third round of flooding onward when the energy priority $α$ is 5.0 and the coverage priority $β$ is 1.0. Moreover, the growth rate of the variance becomes smaller over time when the energy priority is strengthened from 5.0 to 9.0. This matches our conclusion in part “Calculating retransmission priorities.”

Figure 10.

Variance of residual energy when link error rate is 0.05 and expected reliability is 0.95.

Figure 11.

Variance of residual energy when link error rate is 0.1 and expected reliability is 0.9.

Figure 12 reveals that CLF saves more than 50% of the ACK packets on average in most cases while achieving the target reliability. When the link error rate is small (0.05), a single broadcast by the sender together with rebroadcasts by part of its one-hop neighbors already makes the ER greater than the target reliability $R$ . Therefore, each cluster head can transmit an aggregate ACK to the sender. In particular, this performance gain increases as the link error rate becomes larger. Both CLF and POFA increase the number of retransmissions aggressively because they seek to achieve the target reliability. Therefore, as the link error rate increases, not only a node but also its one-hop and two-hop neighbors retransmit the same message more times. When the neighbors of a node $s$ have received the same message from $s$ more than once, the cluster heads will send ACKs to $s$ . In addition, the average number of ACKs of POFA rises dramatically when the link error rate is 0.15.

Figure 12.

Average number of ACK packets.

Localization performance under different network topologies

We compare the RMDS against two 2D localization schemes from previous works. One scheme is the classic MDS-MAP scheme,⁵ and the other scheme is REP,²² which is also a CB localization scheme. MDS-MAP utilizes only the network connectivity information and estimates the distance between each pair of nodes from the hop counts. When MDS-MAP is applied in a large-scale area with coverage holes, the communication path between nodes at the boundary of a hole deviates from the ordinary path and becomes bent. The bent path deviates greatly at the hole boundary, which makes the estimated distance larger than the Euclidean distance and introduces a large error into the localization result. REP can also detect the coverage hole and amend the distance between two nodes located at the boundary of the hole. However, it calculates the coordinates of each node in a triangulation manner with the assistance of three seeds. Our proposed scheme calculates the distance between each pair of nodes, and all the coordinates can then be obtained more precisely by decomposing the distance matrix.

The localization accuracy is expressed quantitatively using the average localization error, defined as the average Euclidean distance between the obtained and true coordinates of all the nodes. To make the comparison meaningful, we calculate the absolute coordinates, achieved by setting the physical coordinates of five preselected nodes, and the other nodes can estimate their real positions using the relative coordinates to the preselected ones. The resulting network thus has the same scale and orientation as the original network.

In the simulations, we generate three types of node deployments, with rectangular, triangular, and randomly shaped coverage holes, as illustrated in Figure 13(a), (c), and (e). The calculated node deployments are shown in Figure 13(b), (d), and (f). For the coverage holes with regular shapes, that is, the rectangular and triangular holes, the average localization errors of the RMDS are 2.37 and 1.46 m, respectively. Compared with the network size, these errors indicate a very small distortion in the network layout. In contrast, MDS-MAP and REP perform worse, with average errors of 3.37 and 2.69 m for the rectangular hole and 2.90 and 2.26 m for the triangular hole. For the coverage hole with a randomly generated shape, the RMDS also performs well, with a localization error of 3.12 m, while MDS-MAP and REP yield errors of 4.21 and 3.85 m. The error difference indicates that the RMDS can perform well in wireless networks with different coverage holes.

Figure 13.

The network topology of the boundary and landmark nodes. (a) The original network with a rectangular hole and an average node degree of 12.2. (b) Network topology calculated by the RMDS, with an average localization error of 2.37 m. (c) The original network with a triangular hole and an average node degree of 13.8. (d) Network topology calculated by the RMDS, with an average localization error of 1.46 m. (e) The original network with a randomly generated shape hole and an average node degree of 11.4. (f) Network topology calculated by the RMDS, with an average localization error of 3.12 m.

Localization performance under different node densities

During the experiments, we also change the node density to demonstrate the performance of the RMDS by changing the communication range of each node with a small step size. Obviously, the larger the communication range is, the greater the average node degree is, resulting in a higher node density. The lowest average node degree is 8. For the two different types of network deployments, the localization performances of the different algorithms are illustrated in Figures 14 and 15. In Figures 14 and 15, the X-axis indicates the average connectivity of the whole network, and the Y-axis is the average localization error. It can be observed that the performance of our proposed algorithm is substantially better than its counterparts. In the network topology with the rectangular hole, the localization error decreases by 21.1% and 12.8% compared with MDS-MAP and REP, and the error decreases by 36.4% and 28.5% in the network with the triangular hole. Moreover, the localization error decreases with increasing network connectivity (density of node coverage).

Figure 14.

Localization error with a rectangular hole.

Figure 15.

Localization error with a triangular hole.

Finally, we calculate the cumulative distribution function (CDF) of the localization error of the three algorithms by calculating the percentage of nodes whose absolute localization error is no greater than a certain value (indicated by the X-axis). The results are illustrated in Figures 16 and 17. From the results, we can see that the RMDS produces good results and is quite robust to varying network densities.

Figure 16.

CDF of localization error with a rectangular hole.

Figure 17.

CDF of localization error with a triangular hole.
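The CDF curves described above are empirical distributions of the per-node errors. A minimal sketch of that computation is shown below; the function name and the sample error values are made up for illustration and are not simulation results.

```python
def empirical_cdf(errors, thresholds):
    """Fraction of nodes whose localization error does not exceed each threshold."""
    n = len(errors)
    return [sum(1 for e in errors if e <= t) / n for t in thresholds]


sample_errors = [1.0, 2.0, 3.0, 4.0]  # hypothetical per-node errors in meters
print(empirical_cdf(sample_errors, [0.5, 2.0, 5.0]))
```

Each returned value corresponds to one point on the Y-axis of a CDF plot for the matching threshold on the X-axis.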

Conclusion

This article discusses the localization problem in large-scale WSNs with coverage holes in complex wireless environments, which is a very common scenario in underground mines or indoors. When the classic MDS-based localization algorithms are applied in such situations, calculating the bent path distances between nodes using hop counts multiplied by the average physical hop distance results in large deviations from the true Euclidean distance, which, in turn, leads to a large localization error. In this article, we present a localization scheme called RMDS to tackle these problems; it estimates the degree of curvature around a coverage hole and uses that estimate to amend the distances between the node pairs whose communication paths are bent around the hole. To achieve this, we perform a coverage-aware and link-CLF mechanism to detect the boundary of the hole. Moreover, we embed the TOF ranging in the flooding process and propose an NLOS elimination method based on the adaptive Kalman algorithm to improve the accuracy of the one-hop measured distances affected by the refraction and reflection of radio waves in complex wireless environments. The simulation experiments show that the proposed algorithm outperforms its counterparts, works well in large-scale WSNs containing different coverage hole shapes, and is robust to varying network densities.

Footnotes

Academic Editor: Vicente Traver

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported in part by grants from the National Natural Science Foundation of China (61301114 and 51304058).

References

1. Akyildiz, Sankarasubramaniam, et al. Wireless sensor networks: a survey. Comput Netw 2002; 38(4): 393–422.

2. Patwari, Ash, Kyperountas, et al. Locating the nodes: cooperative localization in wireless sensor networks. IEEE Signal Proc Mag 2005; 22(4): 54–69.

3. Xiao, Sun, et al. Toward collinearity-aware and conflict-friendly localization for wireless sensor networks. Comput Commun 2012; 35(13): 1549–1560.

4. Iliev, Paprotny. Review and comparison of spatial localization methods for low-power wireless sensor networks. IEEE Sens J 2015; 15(10): 5971–5987.

5. Shang, Ruml, Zhang, et al. Localization from mere connectivity. In: Proceedings of the 4th ACM international symposium on mobile ad hoc networking and computing (MobiHoc), Annapolis, MD, 1–3 June 2003, pp.201–212. New York, NY: ACM.

6. Costa, Patwari, Hero AO. Distributed weighted multidimensional scaling for node localization in sensor networks. ACM Trans Sens Netw 2006; 2(1): 39–64.

7. Wheeler, Ying, et al. Localization from connectivity in sensor networks. IEEE T Parall Distr 2004; 15(11): 961–974.

8. de Abreu GTF, Destino. Super MDS: source location from distance and angle information. In: 2007 IEEE wireless communications and networking conference (WCNC), Kowloon, China, 11–15 March 2007, pp.4430–4434. San Francisco, CA: IEEE.

9. Wei, Peng, Wan, et al. Multidimensional scaling analysis for passive moving target localization with TDOA and FDOA measurements. IEEE T Signal Proces 2010; 58(3): 1677–1688.

10. Lin, Chen, Lin, et al. Multidimensional scaling algorithm for mobile location based on hybrid SADOA/TOA measurement. In: 2008 IEEE wireless communications and networking conference (WCNC), Las Vegas, NV, 31 March–3 April 2008, pp.3015–3020. Piscataway, NJ: IEEE.

11. Chen, Wan, Jiang, et al. Dynamic multidimensional scaling algorithm for mobile location. In: 2006 IEEE region 10 conference (TENCON), Hong Kong, China, 14–17 November 2006, pp.1–4. New York: IEEE.

12. Li, et al. A novel adaptive Kalman filter based NLOS error mitigation algorithm. IFAC-PapersOnline 2015; 48(28): 1118–1123.

13. Wang, Chen, et al. NLOS error mitigation for TOA-based localization via convex relaxation. IEEE T Wirel Commun 2014; 13(8): 4119–4131.

14. Zhang, Wang, et al. Virtual edge based coverage hole detection algorithm in wireless sensor networks. In: 2013 IEEE wireless communications and networking conference (WCNC), Shanghai, China, 7–10 April 2013, pp.1488–1492. New York: IEEE.

15. Sun, et al. Fingerprint and assistant nodes based Wi-Fi localization in complex indoor environment. IEEE Access 2016; 4: 2993–3004.

16. Biswas, Lian, Wang, et al. Semidefinite programming based algorithms for sensor network localization. ACM Trans Sens Netw 2006; 2(2): 188–220.

17. Aspnes, Eren, Goldenberg, et al. A theory of network localization. IEEE T Mobile Comput 2006; 5(12): 1663–1678.

18. Halder, Ghosal. A survey on mobility-assisted localization techniques in wireless sensor networks. J Netw Comput Appl 2016; 60: 82–94.

19. Win, Conti, Mazuelas, et al. Network localization and navigation via cooperation. IEEE Commun Mag 2011; 49(5): 56–62.

20. Bulusu, Heidemann, Estrin. GPS-less low-cost outdoor localization for very small devices. IEEE Pers Commun 2000; 7(5): 28–34.

21. Niculescu, Nath. DV based positioning in ad hoc networks. Telecommun Syst 2003; 22(1–4): 267–280.

22. Liu. Rendered path: range-free localization in anisotropic sensor networks with holes. IEEE/ACM Trans Netw 2010; 18(1): 320–332.

23. Fan, Zhang, Dai. D3D-MDS: a distributed 3D localization scheme for an irregular wireless sensor network using multidimensional scaling. Int J Distrib Sens N 2015; 11(2): 103564.

24. Bagrodia. Impact of complex wireless environments on rate adaptation algorithms. In: 2011 IEEE wireless communications and networking conference (WCNC), Cancun, Mexico, 28–31 March 2011, pp.168–173. New York: IEEE.

25. Geng, Liu, et al. CC-KF: enhanced TOA performance in multipath and NLOS indoor extreme environment. IEEE Sens J 2014; 14(11): 3766–3774.

26.

Wang

Zhao

et al . A multipath mitigation localization algorithm based on MDS for passive UHF RFID. IEEE Commun Lett 2015; 19(9): 1652–1655.

27.

Popescu

Hedley

. Range data correction for improved localization. IEEE Wireless Commun Lett 2015; 4(3): 297–300.

28.

Wang

Lederer

Gao

. Connectivity-based sensor network localization with incremental Delaunay refinement method. In: The 28th annual IEEE international conference on computer communications (INFOCOM), Rio De Janeiro, Brazil, 19–25 April 2009, pp.2401–2409. New York: IEEE.

29.

Tan

Jiang

Zhang

et al . Connectivity-based and anchor-free localization in large-scale 2D/3D sensor networks. ACM Trans Sens Netw 2013; 10(1): 6.

30.

Chang

Cho

Choi

et al . A probabilistic and opportunistic flooding algorithm in wireless sensor networks. Comput Commun 2012; 35(4): 500–506.

31.

Shuo

et al . Opportunistic flooding in low duty-cycle wireless sensor networks with unreliable links. In: Proceedings of the 15th annual international conference on mobile computing and networking (Mobicom), Beijing, China, 20–25 September 2009, pp.133–144. New York: ACM.

32.

Han

Lee

. Efficient packet error rate estimation in wireless networks. In: The 3rd international conference on testbeds and research infrastructure for the development of networks and communities (TridentCom), Lake Buena Vista, FL, 21–23 May 2007, pp.1–9. New York: IEEE.

33.

Hua

Meng

Zhou

et al . Accurate and simple wireless localizations based on time product of arrival in the DDM-NLOS propagation environment. IEEE J Sel Top Signa 2015; 9(2): 239–246.

34.

Chen

. A non-line-of-sight error mitigation algorithm in location estimation. In: 1999 IEEE wireless communications and networking conference (WCNC), vol. 1, New Orleans, LA, 21–24 September 1999, pp.316–320. New York: IEEE.

35.

Venkatesh

Buehrer

. A linear programming approach to NLOS error mitigation in sensor networks. In: 2006 5th international conference on information processing in sensor networks (IPSN), Nashville, TN, 19–21 April 2006, pp.301–308. New York: IEEE.

36.

Jourdan

Roy

. Optimal sensor placement for agent localization. In: 2006 IEEE/ION position, location, and navigation symposium (PLANS), Coronado, CA, 25–27 April 2006, pp.128–139. New York: IEEE.

37.

Riba

Urruela

. A non-line-of-sight mitigation technique based on ML-detection. In: 2004 IEEE international conference on acoustics, speech, and signal processing (ICASSP), vol. 2, Montreal, QC, Canada, 17–21 May 2004, pp.153–156. New York: IEEE.

38.

Guvenc

Chong

. A survey on TOA based wireless localization and NLOS mitigation techniques. IEEE Commun Surv Tutorials 2009; 11(3): 107–124.

39.

Jackle

Fischer

Schreck

et al . Temporal MDS plots for analysis of multivariate data. IEEE Trans Vis Comput Graph 2016; 22(1): 141–150.

40.

Robinson

Bennett

. A typology of deviant workplace behaviors: a multidimensional scaling study. Acad Manage J 1995; 38(2): 555–572.