Abstract
The paper introduces a method that improves the localization accuracy of the signal strength fingerprinting approach. According to the proposed method, the entire localization area is divided into regions by clustering the fingerprint database. For each region a prototype of the received signal strength is determined, and a dedicated artificial neural network (ANN) is trained by using only those fingerprints that belong to this region (cluster). The final estimate of the location is obtained by fusing the coordinates delivered by the selected ANNs. Sensor nodes have to store only the signal strength prototypes and the synaptic weights of the ANNs in order to estimate their locations. This approach significantly reduces the amount of memory required to store a received signal strength map. Various ANN topologies were considered in this study. An improvement of the localization accuracy as well as a speedup of the learning process was achieved by employing fully connected neural networks. The proposed method was verified and compared against state-of-the-art localization approaches in a real-world indoor environment by using both stationary and mobile sensor nodes.
1. Introduction
Localization of sensor nodes is a necessary function for various emerging applications of wireless sensor networks (WSNs), such as road traffic control [1] and target tracking [2]. Accurate estimation of the sensor node location is important for efficiency of routing and location-aware services. In many cases sensor readings collected in WSN are not useful without the location information. Thus, there is a growing research interest in the localization methods due to their potential use in a variety of WSN applications [3].
Localization methods can be categorized into two main classes with regard to type of utilized information: range-based methods need information about node-to-node distances or angles for estimating locations; range-free methods do not need the distance or angle information as they estimate the location based on proximity of several reference nodes.
Since range-free methods are usually more efficient in terms of hardware and computational requirements, they have become more popular than the range-based methods in WSN localization [4]. A popular example of the range-free methods is the centroid localization (CL) algorithm, which estimates the position of a node as the centroid of the positions of all neighboring reference nodes [5, 6]. The positions of reference nodes are fixed or calculated during the initialization stage [7].
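The centroid rule can be illustrated with a minimal Python sketch. The function name `centroid_localization` and the sample coordinates are assumptions introduced here for illustration; they are not taken from [5, 6].

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def centroid_localization(reference_positions: Sequence[Point]) -> Point:
    """Estimate a node position as the centroid of its neighboring reference nodes."""
    if not reference_positions:
        raise ValueError("at least one neighboring reference node is required")
    n = len(reference_positions)
    # Average the x and y coordinates of all neighboring reference nodes.
    x = sum(p[0] for p in reference_positions) / n
    y = sum(p[1] for p in reference_positions) / n
    return x, y


# Hypothetical example: four reference nodes at the corners of a 4 m x 4 m square.
neighbors = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
estimate = centroid_localization(neighbors)
```

In this example the estimate falls in the middle of the square, regardless of where inside the square the node actually is, which shows why the accuracy of this scheme depends on the density of reference nodes.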
For range-based methods, the distance information can be obtained by analyzing time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), or received signal strength indicators (RSSI) [8]. The TOA algorithm calculates the distance on the basis of the known transmission time and signal propagation speed. It requires high-resolution clocks to be installed at sensor nodes. In the case of the AOA algorithm, the sensor node needs several narrow-beam receivers or an antenna array to determine the direction of the received signal. TDOA uses two transmission signals with different propagation speeds. Therefore, it requires two different transmitters and receivers on each node. The above range-based localization techniques have little practical use in WSNs due to the necessity of additional hardware, which increases the cost, size, and energy consumption of sensor nodes. RSSI algorithms estimate the node-to-node distances by using a signal propagation model. However, in real-world dynamic environments the propagation models are not capable of accurately predicting the impact of all environmental factors on the received signal strength. The use of the RSSI approach is especially difficult for indoor applications that experience multipath signal propagation.
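To make the role of a propagation model concrete, the sketch below uses the log-distance path loss model, which is one commonly used choice. This is an assumption: the text does not state which model the cited RSSI algorithms apply, and the parameter values (`rssi_d0`, path loss exponent `n`, reference distance `d0`) are illustrative only.

```python
import math


def rssi_from_distance(d: float, rssi_d0: float = -40.0, n: float = 3.0, d0: float = 1.0) -> float:
    """Predicted RSSI (dBm) at distance d (m) under the log-distance model."""
    return rssi_d0 - 10.0 * n * math.log10(d / d0)


def distance_from_rssi(rssi: float, rssi_d0: float = -40.0, n: float = 3.0, d0: float = 1.0) -> float:
    """Invert the model: estimated distance (m) for a measured RSSI (dBm)."""
    return d0 * 10.0 ** ((rssi_d0 - rssi) / (10.0 * n))


# With these assumed parameters, a 6 dB fluctuation caused by multipath
# changes the distance estimate considerably.
d_clean = distance_from_rssi(-70.0)
d_noisy = distance_from_rssi(-76.0)
```

Because the distance depends exponentially on the RSSI, a fluctuation of a few decibels shifts the estimate by a large factor, which is consistent with the difficulty described above for indoor multipath conditions.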
The multipath signal propagation issues can be addressed by using methods that are based on fingerprinting [9]. According to the general fingerprinting approach, RSSI values from reference nodes have to be collected and stored in a database along with the information about positions of the reference nodes. In order to estimate location of a node, its fingerprint is compared with the fingerprint database. In this scheme, an infrastructure of existing wireless networks (e.g., Wi-Fi access points) can be utilized instead of dedicated reference nodes.
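A minimal sketch of the comparison step is given below. It assumes the simplest matching rule (nearest neighbor under Euclidean distance) and a database of (RSSI vector, position) pairs; both the rule and the sample values are illustrative assumptions rather than the procedure of [9].

```python
import math
from typing import List, Sequence, Tuple

Fingerprint = Tuple[Sequence[float], Tuple[float, float]]


def nearest_fingerprint(measurement: Sequence[float], database: List[Fingerprint]) -> Tuple[float, float]:
    """Return the position stored with the database fingerprint closest to the measurement."""
    # Each entry is (rssi_vector, position); the vectors must have equal length.
    best = min(database, key=lambda entry: math.dist(entry[0], measurement))
    return best[1]


# Hypothetical database: RSSI (dBm) from three access points at three known positions.
database = [
    ([-40.0, -70.0, -80.0], (0.0, 0.0)),
    ([-75.0, -45.0, -60.0], (10.0, 0.0)),
    ([-80.0, -65.0, -42.0], (10.0, 8.0)),
]
position = nearest_fingerprint([-73.0, -48.0, -63.0], database)
```

This lookup requires the whole database to be stored and searched at the node, which is the memory cost that ANN-based fingerprinting aims to reduce.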
Currently, there is a considerable research interest in developing fingerprint localization methods based on artificial neural networks (ANNs) [10]. An important advantage of this approach is that the ANN enables accurate recognition of node location in case of noisy RSSI measurements. When using ANNs, the detailed information about indoor environment and locations of the reference nodes is not necessary. ANN interpolates the data collected in the fingerprint database to approximate a mapping between the multidimensional fingerprints space and the coordinates of nodes. In training phase, the collected RSSI vectors are used to tune weights of connections between neurons in the ANN. Although training can be time-consuming, the localization process is much faster than analytical estimation of the node location.
In this paper a method is proposed that improves localization accuracy of the ANN-based fingerprinting. According to the introduced method, the entire localization area is divided into regions by clustering the fingerprint database. A separate ANN is trained for each region by using only those fingerprints that belong to this region (cluster). During clustering, a prototype RSSI vector is determined for each region. When localization process starts, those prototypes are selected that are most similar to the vector of current RSSI measurements. The ANNs that correspond to the selected prototypes are used to estimate the node coordinates. Final estimation of the location is obtained by fusion of the coordinates delivered by ANNs. Further improvement of the localization accuracy as well as speedup of learning process was achieved by employing fully connected neural networks [11].
The paper is organized as follows. Section 2 describes previous works related to ANN applications for localization in WSN. The proposed clustered fingerprinting localization method based on fully connected neural networks (FCNNs) is presented in Section 3. Section 4 includes experimental results and comparison of the introduced method with state-of-the-art approaches. Conclusions and discussion are given in Section 5.
2. Related Works and Contribution
Several localization methods have been recently proposed for WSNs in the related literature. These methods are based on various theoretical frameworks, including ANNs [12], Voronoi diagrams [13], cooperative localization [14], Gaussian mixture models [15], particle filters [16], average distance per hop estimation [17, 18], and particle swarm optimization [19]. As discussed in Section 1, the application of ANNs is especially advantageous when dealing with node localization in dynamic indoor environments. This section provides a survey of the previous ANN-based localization methods and discusses the main contributions of this study.
The multilayer perceptron (MLP) is the most popular type of ANN in recent applications for range-free wireless sensor node localization. In [20] the MLP ANN was used for fingerprint-based localization in WSN. The accuracy of this method was evaluated with thirteen backpropagation training algorithms. A similar approach was proposed in [21]. To tackle changes in the wireless channel, the ANN training was updated at regular time intervals. An ensemble, which consists of four MLP ANNs with different numbers of inputs, was presented in [22]. According to that method, when the localization process has to be performed, the ANN is selected that has as many inputs as there are currently connected reference nodes. This approach does not scale well; thus the maximum number of reference node connections was set to four. The localization accuracy obtained for the ANN ensemble was better than for approaches based on a fuzzy learning system and a genetic algorithm.
The use of multiple MLP ANNs improved the Bluetooth-based localization accuracy in [23]. This method assumes that a different neural network is trained for each route of a user equipped with a Bluetooth device.
Recently, S. Y. M. Vaghefi and R. M. Vaghefi [24] have investigated the effect of the number of hidden layers of an MLP on the accuracy of a WSN localization approach designed to moderate the negative effect of miscellaneous noise sources and harsh factory conditions. So-In et al. [25] have examined the application of different soft computing techniques in RSSI fingerprint-based localization. In that study, the performance of the MLP was compared against a fuzzy logic system, a genetic algorithm, and a support vector machine.
Guo et al. [26] have proposed a localization method, which applies radial basis function (RBF) ANN for the RSSI fingerprinting. The authors have suggested that utilization of redundant RSSI information can improve the positioning performance. The reliability and precision of localization were improved by taking difference of the received signal strength as additional input of the ANN. To improve the positioning accuracy in dynamic environments, an online training mechanism was introduced for RBF ANNs. The RBF ANNs handle nonlinear problems well and are easy to train, but they need a hidden neuron for every training pattern. Thus, the number of neurons used in RBF networks may become very large.
There are also several range-based methods available that use ANNs. In [27] the MLP ANN was applied to reduce the localization error of TOA and AOA algorithms in non-line-of-sight environments. Shareef et al. [28] have compared the performance of different ANN types in filtering the noise of distance measurements obtained from the TDOA algorithm. The ANN types that were applied in that study include MLP, RBF ANN, and recurrent ANN. The RBF ANN has performed better than the other networks but it has higher memory and computation costs. On the other hand, the MLP network has the lowest memory and computation costs.
Abdelhadi et al. [29] have combined fuzzy inference system and MLP for three-dimensional localization of sensor nodes. The coordinates of sensor node are inferred by MLP on the basis of distances to reference nodes.
Rahman et al. [30] have proposed a location estimation algorithm for WSN, which combines generalized regression neural networks (GRNNs) and weighted centroid localization (WCL). In that algorithm, two GRNNs are trained separately for “x” and “y” coordinates. The GRNNs are used to provide a rough estimate of sensor node location. Subsequently the nearest reference nodes are selected and final estimation of the location is obtained by WCL.
A multilayer ANN, called artificial synaptic network, was introduced in [24] for sensor node localization based on the TOA algorithm. The artificial synaptic network model was defined as a collection of many artificial synaptic networks that work on data clusters. Only one selected synaptic network is used for each localization task. The training data were clustered by using the k-means algorithm. In simulation experiments this approach has shown better performance and efficiency in TOA localization than the RBF ANN and the MLP.
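The main contributions of this paper include the application of FCNNs to fingerprint-based localization and a clustering-based approach in which multiple ANNs are used to estimate the node location.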
In traditionally connected ANNs, such as the MLP or RBF, neurons are organized in layers and connections are introduced from one layer to the next layer. The FCNNs have additional connections across layers [11]. In [31] it was demonstrated that when comparing FCNNs with traditionally connected ANNs the latter ones require about twice as many neurons to perform a similar task. With connections across layers in FCNNs, there are fewer neurons on the signal paths, and, as a result, training algorithms converge faster. Reduced number of neurons and increased training speed are very important benefits for WSN localization applications as they enable lower consumption of the limited sensor node resources (memory, computational time, and energy) and faster adaptation in dynamic environments.
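The effect of connections across layers can be illustrated with a small forward-pass sketch. It assumes a cascade arrangement in which every neuron receives all network inputs and the outputs of all preceding neurons, with tanh activations; this arrangement, the function names, and the weights are assumptions for illustration, and [11] may define the topology differently.

```python
import math
from typing import List, Sequence


def fcnn_forward(inputs: Sequence[float], weights: List[Sequence[float]]) -> float:
    """Forward pass of a cascade network with connections across layers.

    weights[i] holds the bias followed by one weight for every network input
    and for the output of every preceding neuron, i.e. 1 + len(inputs) + i values.
    """
    signals = list(inputs)
    for w in weights:
        if len(w) != 1 + len(signals):
            raise ValueError("unexpected number of weights for this neuron")
        net = w[0] + sum(wi * s for wi, s in zip(w[1:], signals))
        # The neuron output becomes an additional input for all later neurons.
        signals.append(math.tanh(net))
    return signals[-1]


def fcnn_weight_count(n_inputs: int, n_neurons: int) -> int:
    """Number of weights (including biases) that a node must store."""
    return sum(1 + n_inputs + i for i in range(n_neurons))


# Hypothetical network: two inputs, two neurons.
output = fcnn_forward([1.0, -1.0], [[0.5, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]])
stored_weights = fcnn_weight_count(n_inputs=2, n_neurons=2)
```

The weight count is the quantity that matters for sensor node memory, since only the weights (and the prototypes) have to be stored at the node.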
To improve the localization accuracy, a clustering-based approach is introduced in this study, which allows multiple ANNs to be used for location estimation in different regions of the considered area. According to this approach, current location of sensor node can be estimated by several ANNs. The location coordinates determined by different ANNs are merged together by using a fusion algorithm. The fusion of ANNs outputs improves the localization, especially for those sensor nodes that are close to borders of the regions.
3. Proposed Localization Method
In this section details of the proposed approach are presented. The main elements of this approach are related to the training and localization processes. The localization process of a sensor node consists of three steps: selection of ANNs, estimation of the node location by the selected ANNs, and fusion of the estimated locations. During the training process the available RSSI data are clustered and an ANN is trained for each cluster. These operations are repeated for different parameter settings until the required precision of localization is achieved. The precision is estimated by using a validation dataset. Parameters that determine the number of clusters and the number of ANNs used for localization are modified at each training iteration. In this way the parameter settings can be appropriately adjusted. An overview of the method is presented in Figure 1.

Figure 1: Overview of the proposed localization method.
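The clustering step of the training process can be sketched as follows. This is a plain k-means implementation written for illustration, with hypothetical function names and toy RSSI data; it produces the prototypes and the per-cluster training subsets, while the training of the ANN for each subset is omitted.

```python
import math
import random
from typing import List, Sequence, Tuple


def kmeans(vectors: List[Sequence[float]], n_clusters: int, iterations: int = 50, seed: int = 0):
    """Cluster RSSI vectors; return (prototypes, labels)."""
    rng = random.Random(seed)
    # Initialize the prototypes with randomly chosen fingerprints.
    prototypes = [list(v) for v in rng.sample(vectors, n_clusters)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every fingerprint to its nearest prototype.
        labels = [min(range(n_clusters), key=lambda j: math.dist(v, prototypes[j])) for v in vectors]
        updated = []
        for j in range(n_clusters):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            # Keep the old prototype if a cluster happens to be empty.
            updated.append([sum(c) / len(members) for c in zip(*members)] if members else prototypes[j])
        if updated == prototypes:
            break  # converged: labels are consistent with the prototypes
        prototypes = updated
    return prototypes, labels


def split_training_set(fingerprints, positions, labels, n_clusters) -> List[List[Tuple]]:
    """Group (fingerprint, position) pairs by cluster, one subset per ANN."""
    subsets: List[List[Tuple]] = [[] for _ in range(n_clusters)]
    for rssi, pos, lab in zip(fingerprints, positions, labels):
        subsets[lab].append((rssi, pos))
    return subsets


# Toy data: two groups of fingerprints from two access points.
fingerprints = [[-40.0, -80.0], [-42.0, -78.0], [-41.0, -79.0],
                [-80.0, -40.0], [-79.0, -42.0], [-78.0, -41.0]]
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (10.0, 9.0), (9.0, 10.0)]
prototypes, labels = kmeans(fingerprints, n_clusters=2)
subsets = split_training_set(fingerprints, positions, labels, n_clusters=2)
```

Each element of `subsets` would then serve as the training set of one ANN, and `prototypes` would be stored at the sensor node together with the ANN weights.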
3.1. Training Process
The set of training data includes vectors
There are n ANNs in the proposed ensemble, one for each cluster. The selection of training vectors for particular ANN depends on applied clustering algorithm. In case of k-means clustering, vector
In case of fuzzy c-means clustering, the training of jth ANN is performed based on vectors
3.2. Localization Process
Figure 2 shows a block diagram of the ANN ensemble, which is used for sensor node localization. Inputs of the ANNs are indicated by dashed lines, which means that not all RSSI values have to be used as inputs of a given ANN. This assumption allows the topology of an ANN to be simplified when some regions (clusters) are not covered by the signal range of all reference nodes.

Figure 2: Neural network ensemble for sensor node localization.
Fusion module selects the ANNs that are useful for estimation of the unknown position. According to the proposed approach the selection is based on the k nearest neighbors approach. Euclidean distances between vector
Final estimate of the sensor node position
In case of k-means algorithm, the coordinate x is calculated as follows:
If fuzzy c-means is the applied algorithm, then coordinate x is determined by using the formula:
For estimation of the y coordinate, appropriate versions of (2) and (3) can be easily obtained by substituting x with y.
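The selection and fusion steps can be sketched as follows. The sketch assumes that the k prototypes nearest to the measured RSSI vector are selected and that the coordinates returned by the corresponding ANNs are combined by a plain average; the exact weighting used in (2) and (3) is not reproduced here. The `estimators` are hypothetical callables that stand in for the trained ANNs.

```python
import math
from typing import Callable, List, Sequence, Tuple

Estimator = Callable[[Sequence[float]], Tuple[float, float]]


def localize(rssi: Sequence[float], prototypes: List[Sequence[float]],
             estimators: List[Estimator], k: int) -> Tuple[float, float]:
    """Fuse the outputs of the ANNs whose prototypes are nearest to the measurement."""
    if not 1 <= k <= len(prototypes):
        raise ValueError("k must be between 1 and the number of prototypes")
    # Rank the clusters by Euclidean distance between the measurement and each prototype.
    order = sorted(range(len(prototypes)), key=lambda j: math.dist(rssi, prototypes[j]))
    estimates = [estimators[j](rssi) for j in order[:k]]
    # Assumed fusion rule: unweighted mean of the selected estimates.
    x = sum(e[0] for e in estimates) / k
    y = sum(e[1] for e in estimates) / k
    return x, y


# Hypothetical setup: three clusters with constant stand-in estimators.
prototypes = [[-40.0, -80.0], [-80.0, -40.0], [-60.0, -60.0]]
estimators = [lambda r: (0.0, 0.0), lambda r: (10.0, 10.0), lambda r: (4.0, 6.0)]
fused = localize([-45.0, -75.0], prototypes, estimators, k=2)
```

Here the first and third prototypes are the nearest ones, so only their two estimates contribute to the fused location.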
4. Experiments
The experiments were performed on an RSSI dataset collected from sensor nodes in the 6-level building of the Institute of Computer Science at the University of Silesia, Sosnowiec, Poland. The dimensions of the building are approximately 60 m (length) by 18 m (height). The sensor nodes were equally distributed, with 11 locations on each floor. They were used to periodically collect the RSSI values for signals of Wi-Fi access points (APs) and then send the collected data to a sink node. In total, the RSSI data from 66 nodes within the building were registered. The sensor nodes in the WSN use a standard omnidirectional antenna.
The sensor node is presented in Figure 3(a). It is constructed from an 8-bit microcontroller, a Wi-Fi module, and a 7200 J energy source. The mobile node (Figure 3(b)) was built on the Raspberry Pi 2, which has a 32-bit ARM architecture. The parameters of the mobile node were sufficient for implementing the proposed localization method.

Figure 3: The sensor nodes used in experiments: (a) node collecting RSSI data, (b) localized mobile node.
Additionally, the building structure with network infrastructure was modeled in OMNeT++ (with INET/MiXiM extensions). The implemented WSN model takes into account Wi-Fi radio propagation [34] and obstacles. The simulation parameters are listed in Table 1.
Table 1: Simulation parameters of WSN network based on 802.11 protocol.
The model was implemented to compare the RSSI results obtained from the real-world WSN with values obtained via simulation. The comparison of the RSSI values obtained for a WSN node as a function of the distance from the signal source is presented in Figure 4.

Figure 4: RSSI values obtained in simulation versus real RSSI values.
Figure 4 shows that, for the simulation without obstacles and interference, the RSSI values change regularly with distance and the localization task is quite simple. However, analysis of the real data and of the models with obstacles and interference shows that the RSSI value can increase or decrease with distance; thus the localization task is more difficult. Localization in the second and third dimension only makes it more demanding. Therefore, various ANN algorithms were investigated as a method to tackle these complex dependencies.
An example of the RSSI values of a single AP registered by a grid of sensor nodes in the building is illustrated in Figure 5. The RSSI values presented in Figure 5 show that the signal propagates easily within a floor level but its strength decreases significantly between floor levels. A ceiling is usually thicker than a wall; thus the signal strength decreases faster between floors. Within a floor level, the values are enhanced by reflection; thus for some locations it can be observed that the signal strength increases with distance from the AP (local distortions).

Figure 5: RSSI values measured in building.
Due to the dynamic character of the experimental environment, multiple readings of AP RSSI are needed for correct localization. 72 unique AP MACs were registered in the university building during 7 days of the research. However, only 46 of them were registered both by sensor nodes used to collect training dataset and by the mobile sensor node used for verification (testing RSSI dataset). The sensor nodes collect the RSSI value 10 times a day, so 660 readings a day are collected. In practice, due to high density of traffic in university network on average 630 readings a day were sent without an error (6300 in total). The testing set covers 400 readings of the mobile node for one day.
The localization error is determined as the Euclidean distance between the estimated location and the real location of the sensor node:
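The error measure described above can be computed as in the following sketch. It assumes that ME denotes the mean of the individual localization errors over a set of localization attempts; the function names and sample coordinates are illustrative.

```python
import math
from typing import Sequence


def localization_error(estimated: Sequence[float], actual: Sequence[float]) -> float:
    """Euclidean distance between the estimated and the real location."""
    return math.dist(estimated, actual)


def mean_error(estimates: Sequence[Sequence[float]], actuals: Sequence[Sequence[float]]) -> float:
    """Mean localization error over a set of localization attempts (assumed meaning of ME)."""
    if len(estimates) != len(actuals) or not estimates:
        raise ValueError("estimates and actuals must be non-empty and of equal length")
    errors = [localization_error(e, a) for e, a in zip(estimates, actuals)]
    return sum(errors) / len(errors)


# Hypothetical example with two localization attempts (coordinates in meters).
me = mean_error([(3.0, 4.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 1.0)])
```

Since `math.dist` accepts points of any dimension, the same functions apply if a floor coordinate is included as a third component.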
4.1. Parameter Calibration
The initial research was conducted to find a minimal ANN architecture that would be sufficient to perform localization with the lowest error (ME) value. To train the ANNs efficiently, the neuron by neuron algorithm [35] was selected after preliminary research, as it converges after at most 50 iterations, which is a better result than that of the improved backpropagation algorithm [36]. The optimal structure of the ANNs was found by increasing the number of neurons in the hidden layers until the ME stopped decreasing. The validation was performed on a separate subset of the training data. The elements of the validation subset are equally distributed over the localization area.
Three ANN architectures were considered in the research: MLP, RBF ANN (called RBF later on), and FCNN. Localization errors obtained for the examined ANN architectures are presented in Figure 6. The ME values obtained for all three ANNs were on a comparable level; however, the FCNN achieved a lower error by using only 10 neurons in one hidden layer. In the case of the MLP, stable error values are obtained for 16 neurons in two hidden layers. Similarly, the RBF achieved the lowest ME value for the number of clusters (c) equal to 24. These parameters were used in further localization experiments.
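The structure search described above can be sketched as a simple loop. The callback `validation_error` is hypothetical: it is assumed to train a network with the given number of hidden neurons and return its ME on the validation subset. The error values in the example are invented for illustration and are not experimental results.

```python
from typing import Callable, Tuple


def smallest_sufficient_size(validation_error: Callable[[int], float],
                             max_neurons: int, tolerance: float = 0.0) -> Tuple[int, float]:
    """Grow the network until the validation error stops decreasing."""
    best_size = 1
    best_error = validation_error(1)
    for size in range(2, max_neurons + 1):
        error = validation_error(size)
        if error < best_error - tolerance:
            best_size, best_error = size, error
        else:
            break  # no further improvement: keep the smaller network
    return best_size, best_error


# Invented validation errors (in meters) for networks with 1 to 5 hidden neurons.
toy_errors = {1: 5.0, 2: 3.0, 3: 2.5, 4: 2.5, 5: 2.0}
size, error = smallest_sufficient_size(toy_errors.__getitem__, max_neurons=5)
```

The loop stops at the first size that brings no improvement, so a later improvement (size 5 in the toy data) is not explored; a `tolerance` greater than zero makes the stopping rule less sensitive to small fluctuations of the validation error.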

Figure 6: Searching for optimal ANN architecture: (a) ANN scheme, (b) localization error for different numbers of clusters.
The FCNN was applied as a part of the proposed method as it requires the smallest number of neurons and provides the lowest ME value. The calibration of both considered clustering methods was performed by using the training set. In case of k-means algorithm only two parameters are tuned: number of clusters (n) and number of prototypes selected for localization (k). The calibration results are illustrated in Figure 7.

Figure 7: Localization error for different parameter settings of k-FCNN algorithm.
The calibration results show that the lowest error can be achieved by using 4 clusters and 2 prototypes. The optimal number of clusters as well as the number of selected prototypes increases with reduction of the dataset. The low values of k and n indicate that the dataset contains redundant data.
The calibration of the proposed localization method with fuzzy c-means clustering requires determination of four parameters. Optimal values of these parameters can be selected by using brute-force search; however such approach would require long computational time. Thus, for better performance, the parameter values were selected by using improved particle swarm optimization (iPSO) method [37]. The fitness function in this case corresponds to the ME evaluated for validation dataset. The optimal parameters calculated via iPSO are
4.2. Localization Error Analysis
The localization error was examined for methods based on FCNN with k-means and fuzzy c-means clustering. Hereinafter, these methods are referred to as k-FCNN and c-FCNN, respectively. The proposed localization methods were compared with two versions of the RBF-based localization algorithm proposed in [26] named RBF + v1 and RBF + v2. Additionally, the solution was verified against the three tuned ANN architectures, applied directly for localization purposes, that is, FCNN, MLP, and RBF. In previous research [9], the authors noticed the relation between time of acquisition of training data and localization accuracy. Therefore, this aspect was further investigated in this work. The error of ANN-based localization algorithms is presented in Figure 8.

Figure 8: Error of ANN-based localization for different time period of training RSSI data acquisition.
The experiments were conducted for 400 localization attempts. The ANN weights and RSSI vector prototypes were determined based on data collected by sensor nodes at present (same day) as well as using previous measurements (1, 3, and 7 days old). Additionally, location was estimated by using data from not only current day but also the last 2, 3 up to 7 days. Intuitively, the localization accuracy decreases with the aging of training data. When using the dataset which is 7 days old, the localization error increases by 100%. However, if current data are combined with readings from previous days, the localization error decreases. The decrease of error is noticeable up to 3 days. For longer periods of data collection, the error of localization increases again. Therefore, 3-day period was considered in further experiments. The proposed method with k-means clustering gives slightly better results than RBF + v1 and results comparable to RBF + v2 [26]. The method based on fuzzy c-means provides the lowest error in terms of ME. The results for the testing dataset of 3 days are presented in Table 2.
Table 2: Localization errors obtained for dataset of the last 3 days.
Examples of localization results are presented in Figure 9. The highest localization error is usually obtained near edge of the building, while lowest localization error can be found in the middle of the building, where the largest number of APs is available. The distribution of errors suggests that additional APs on edges of a building can improve the localization significantly.

Figure 9: Localization examples for compared methods.
4.3. Computational Complexity Analysis
The proposed method, which is based on FCNN, decreases the number of neurons in hidden layer. The computational complexity for ANN can be calculated as (
The localization was performed on the Raspberry Pi 2, which has a 32-bit ARM processor. The location was estimated 10 times for 40 different localization processes. The average localization time of the proposed methods (k-FCNN and c-FCNN) is 0.05 s and 0.07 s, respectively. In comparison, the time for direct application of FCNN is 0.03 s, while for RBF + v2 the time is 0.06 s. The experimental results confirm the theoretical analysis and the possibility of real-time applications.
5. Conclusions
In recent years, a growing interest in location-based services has stimulated active research on localization techniques for WSN. In case of indoor localization, the methods that utilize ANNs have received special attention as they are robust to noise and fluctuations of RSSI values.
The experimental results presented in this paper confirm that, for an indoor environment, the RSSI values obtained from propagation models and from measurements differ significantly. Therefore, the range-free localization methods that use ANN-based fingerprinting were considered in this study. Tests were performed for the ANN architectures typically used in WSN localization: MLP and RBF. The effectiveness of these architectures was compared against the proposed application of FCNN. The results show that, when recent ANN training algorithms are used, the obtained localization error is on a similar level for all of the examined architectures. However, FCNN allows the same localization task to be performed by using a smaller number of neurons, which leads to decreased complexity of the computations.
The localization method proposed in this paper combines the RSSI fingerprinting approach with application of FCNN and clustering algorithms (k-means and fuzzy c-means). The clustering algorithm determines regions with similar RSSI values. A FCNN is trained for each region to estimate coordinates of the localized sensor node. Experiments performed in the dynamic real world indoor environment have shown that the proposed method can improve the reliability and precision of the RSSI fingerprinting localization.
The application of a clustering algorithm improves the localization accuracy of the ANN-based fingerprinting approach. This improvement was observed for all considered ANN architectures: MLP, RBF, and FCNN. The proposed method provides superior localization accuracy when compared with state-of-the-art ANN-based fingerprinting methods. The lowest localization error was obtained when using the introduced method with fuzzy c-means clustering. The disadvantage of the fuzzy c-means algorithm is related to the calibration process, which requires tuning of four parameters. This drawback was overcome by applying the PSO algorithm for parameter calibration.
Moreover, the impact of the temporal coverage of the RSSI database on the localization error was analyzed in this study. The obtained results show that the RSSI data collected during the last three days enabled the most accurate localization. The localization error increases when recent RSSI measurements are not taken into account and when the RSSI dataset covers too long a period (more than three days). It was also found that the optimal number of clusters for the k-means and fuzzy c-means algorithms changes with the temporal coverage of the RSSI dataset.
Further research will be conducted to develop an intelligent sensor node, which will select and report such RSSI data that are necessary for the ANN training. Additionally, the usefulness of FCNNs for localization purposes will be further investigated.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
