Abstract
Positioning and navigation are the core capabilities that determine the trajectory and motion of a mobile robot. Conventional mobile robot positioning and navigation systems rely heavily on the fusion of multiple costly sensors, which hinders mass production. This paper aims to use a readily available technology, WiFi, which is reliable, pre-deployed, and present in most buildings. The system is based on an indoor positioning system (IPS) that uses a crowdsourced fingerprinting method. The work seeks to improve the performance of the crowdsourced fingerprinting database by solving the issue of device diversity, or heterogeneity, among different devices. To use the crowdsourced fingerprinting database as the location estimation method for the robot application, a deep neural network (DNN) is employed. The proposed method, namely ratio and ranged-based (RRB), shows an improvement of 60% by applying a pre-processing technique to the raw data before feeding it to the DNN. A comparison with other methods shows that RRB is more accurate under three validation techniques: root mean square error (RMSE), distance error, and accuracy between true and estimated position. This improvement could facilitate practical positioning systems that use the existing WiFi infrastructure for mobile robots in the near future.
Introduction
The precise positioning of a mobile robot is a fundamental requirement for achieving true autonomy, particularly in the context of the rapidly evolving and interconnected landscape of the Fourth Industrial Revolution (IR 4.0). In indoor environments, conventional approaches to obtaining accurate and precise positioning typically involve the integration and fusion of multiple sensor systems onboard the robot.1-3 However, the reliance on multiple sensors introduces the need for additional hardware, complicating the system architecture and increasing costs. This study proposes an alternative approach by leveraging the ubiquitous presence of WiFi signals within buildings, thereby eliminating the necessity for additional hardware and streamlining the positioning process.
Various technologies have been explored as alternatives to conventional sensors used in mobile robots, as highlighted in the literature.4-6 Among these, utilizing WiFi is identified as one of the most cost-effective solutions due to its widespread availability in most buildings and the minimal requirement for complex system architecture. However, direct utilization of WiFi signals for positioning is challenged by their inherent instability. In complex indoor environments, WiFi signal propagation is susceptible to multiple forms of interference, including multipath effects, reflection, refraction, and diffraction. 7 These factors collectively contribute to positioning inaccuracies, complicating the reliable localization of mobile robots.
Several techniques have been proposed for indoor positioning using WiFi, including methods such as active badge systems, 8 Zigbee, 9 and the fingerprinting approach employed in the RADAR system. 10 Among these, the fingerprinting method is widely regarded as the most effective indoor positioning system (IPS). 11 Unlike propagation models that require a line-of-sight (LOS) environment, the fingerprinting method does not rely on direct signals from the WiFi source. Instead, it captures unique WiFi signal characteristics and stores them in a fingerprinting database, which serves as a map for positioning. By leveraging the distinct signal strength patterns at various locations, fingerprinting can achieve precise localization, as each location is associated with a unique identifier within the database.
Despite its advantages, the fingerprinting technique has inherent limitations. The creation of a fingerprinting database requires the expertise of a skilled surveyor to meticulously collect WiFi signals, a process that is both time-consuming and costly due to the need to employ specialized personnel. 12 Furthermore, the fingerprinting database is highly sensitive to changes in the environment or equipment; any alterations render the existing database obsolete, necessitating a complete re-survey to update the signal data. This presents a significant challenge, particularly in large indoor environments where frequent updates are impractical. Although automated methods for generating fingerprinting databases have been developed, they remain vulnerable to environmental changes, which continue to impact the accuracy and reliability of the database. 13
To address the limitations of traditional fingerprinting, crowdsourcing or organic fingerprinting database collection methods have been introduced.14,15 These approaches eliminate the need for expert surveyors by leveraging data contributions from random individuals or volunteers who willingly share their WiFi signal data. This system operates autonomously, continuously updating the fingerprinting database with new data, making it not only cost-effective but also adaptive to environmental changes. As a result, the system can maintain accuracy over time without the need for repeated, labor-intensive surveys.
A significant challenge in developing organic fingerprinting through crowdsourcing is the issue of device diversity, which arises from the varying hardware characteristics of different volunteers’ devices. 16 This phenomenon, also known as device heterogeneity, leads to inconsistencies in signal measurements, as different devices may produce varying outputs even when located at the same position. The uncertainties contributing to this issue include the randomness of database collection with respect to time, user, and hardware. 17 This paper focuses on addressing this core problem, with a detailed discussion on device diversity provided in the related work section to underscore its impact on signal distribution consistency and positioning accuracy.
Contribution
This paper makes several key contributions to the field of indoor positioning, specifically targeting the positioning of mobile robots in indoor environments. While indoor positioning and mobile robotics have seen significant advancements over the years, existing technologies still present notable limitations. The primary objective of this paper is to build upon the work of previous researchers and enhance the performance and reliability of positioning systems, particularly in the context of autonomous mobile robot applications. The specific contributions of this paper are as follows:
Experimental Findings: The results demonstrate that linear transformation methods are inadequate for addressing the variations between robots and devices. To address this issue, a novel ratio-based transformation approach has been proposed.
Signal Accuracy Enhancement: The deployment of multiple access points has been utilized to differentiate similar signals from nearby locations, significantly improving the accuracy of position prediction.
Device Diversity Mitigation: A new pre-processing technique has been introduced to tackle the device diversity problem in crowdsourced fingerprinting databases. This technique leverages ratio and range-based methods to standardize signal data across diverse devices.
Integration of Deep Neural Networks (DNNs): DNNs have been integrated into the system to enhance prediction accuracy as the number of volunteers contributing to the crowdsourced fingerprinting database increases.
Related work
Device diversity is a critical challenge in constructing organic or crowdsourced fingerprinting databases. Previous research has identified device heterogeneity as a major obstacle in the development of reliable fingerprinting systems, as evidenced by the experiments in Refs.18,19 For instance, an experiment conducted with four different mobile devices in an indoor environment revealed significant discrepancies in WiFi received signal strengths (RSS) due to device heterogeneity. 20 These differences underscore the difficulties in achieving accurate indoor localization when different devices are used during the training and positioning phases. However, the observed similarities in the shapes of the RSS curves suggest that focusing on the relative shapes of these curves, rather than directly matching RSS fingerprints, could enhance localization accuracy. Further experiments 21 demonstrated that variations in transmit power significantly impact WiFi RSSI, with two smartphones recording different signal strengths even when stationary and positioned 4 m apart. These findings highlight the challenges faced by fingerprint-based indoor localization systems due to signal variations caused by device heterogeneity. Despite these challenges, using stable RSSI gradients instead of absolute values provides a more robust localization method, mitigating the complications of signal strength transformations and better accommodating device-induced variations. Although the location remains constant for each WiFi signal collected, the resulting characteristics and distributions differ due to the hardware variations among different users’ devices, as illustrated in Figure 1.

The hardware is not standardized, so one solution would be to promote a hardware setup standard among manufacturers. This is unlikely to be achievable, as the design of each device may affect the signal. A feasible solution is linear transformation, which maps the signals of two devices with different signal intensities onto each other based on their correlation. 24 Similarly, the authors of Ref. 25 propose a linear transformation based on the inter-correlation of the devices. Taking the example from Ref. 25, Figure 2(a) shows the distributions of three different signals taken with three different devices. The distributions are not aligned, as their local maxima do not coincide. The solution is to transform the signals so that the local maxima fall within an acceptable region by using the correlation between the devices. We repeated this method and found no correlation between the devices. Therefore, we propose a new method, namely ratio and ranged-based (RRB), which could potentially transform the uncorrelated signals into one uniform distribution suitable for position fingerprinting.

Device diversity solution. 25
Recent efforts have attempted to address the device diversity problem, though not specifically within the context of crowdsourced fingerprinting databases. For example, He et al. 26 proposed a calibration-free localization system known as SLAC, which simultaneously locates the target and calibrates for device and user heterogeneity. This system employs an optimization framework that jointly considers RSSI calibration and step counter measurements to determine a target's location. By leveraging correlations between location estimates, RSSI readings, and step data, SLAC transparently calibrates the variations in device and user models. However, a significant concern with this approach is the increased prediction time and complexity in feature training due to the fusion of additional sensors, which could potentially impact the system's overall efficiency.
Li et al. 24 introduced a prototype model for a multiple-surveyor-multiple-client system designed to localize mobile users based on a crowdsourced fingerprinting approach. The system employs a linear regression model to calibrate the data across the various training devices. Additionally, the conditional likelihood of a client detecting an access point (AP) that was not visible during the training phase is estimated using a geometric distribution. However, for the linear regression method to be effective, a strong correlation between the signals from different devices is necessary. In this experimental work, it was demonstrated that no significant correlation exists between the devices used to collect fingerprinting data. Consequently, the linear regression approach proves ineffective in this context. This limitation will be explored in greater detail in the methodology section.
JUIndoorLoc, 27 developed by Roy et al., is a system that presents a comprehensive indoor localization dataset, covering various domains such as spatial, temporal, context, and device heterogeneity. RSSI data was collected from cell sizes as small as 1 m × 1 m across three floors of a building using a purpose-built Android application. The system also introduces a framework for indoor localization, incorporating an ensemble of condition-specific classifiers to address context and device variability. However, a significant concern with JUIndoorLoc is the complexity introduced by the combination of multiple classifiers, which increases the computational demands and prolongs the time required for position estimation.
The fusion of multiple RSS measurements in fingerprinting methods for accurate indoor positioning has been proposed by Guo et al., 28 known as WiFi-FAGOT. This approach utilizes three distinct RSS collection methods: raw RSS from the fingerprinting database, signal strength difference (SSD) between pairs of access points (APs), and hyperbolic location fingerprint (HLF), which normalizes the ratio of RSS values between AP pairs. The WiFi-FAGOT system employs the K-nearest neighbor (KNN) algorithm for position estimation. In contrast, Xue et al. 29 introduced a machine learning-based approach to address the challenges of device heterogeneity and environmental changes. Their system, HAIL, uses a back propagation neural network (BPNN) to measure fingerprint similarities based on absolute RSS values and incorporates AP ranking to prioritize APs according to their signal strength during training. The proposed method achieves an average distance error ranging from 1.36 m to 1.78 m. For a fair comparison, similar methods with aligned objectives are considered. In this work, both HAIL and WiFi-FAGOT are compared with our proposed approach, as they aim to tackle device diversity and enhance the accuracy of fingerprinting methods.
Proposed method
The concept of utilizing the signal fingerprinting technique specifically for autonomous mobile robots is illustrated in Figure 3(a). In this setup, the robot is equipped with only a WiFi receiver, which collects signal information from surrounding APs. The robot determines its position by comparing the received signal data with a pre-collected fingerprinting database created by user devices. As depicted in Figure 3(b), the fingerprinting technique operates in two stages. The first stage, the offline phase, involves user devices collecting and storing signal data in the fingerprinting database. In the subsequent online phase, the robot's position is estimated by matching the current signal data against the stored fingerprinting database.

Crowdsourced fingerprint flows.
In the crowdsourced fingerprinting database, signal data is collected by a diverse group of individuals using different devices, as illustrated in Figure 3(a). During the offline phase, the varying signal strengths from these different devices are recorded and stored in the fingerprinting database. In the online phase, the robot estimates its position by comparing its received signal with the signals stored in the database. The device diversity arises from the crowdsourcing process, where differences in device hardware lead to variations in signal strength, introducing challenges in achieving consistent and accurate positioning.
In the crowdsourced database, WiFi signal fingerprints are collected by volunteers using various devices, as shown in the top left-most cell of Figure 3(a). These data are stored in the offline stage as a raw fingerprinting database, which includes vectors of locations and their corresponding WiFi signal strengths. During the online stage, the robot receives WiFi signals and attempts to “match” them with the raw signals stored in the fingerprinting database to compute its location. However, due to the variability in signal distributions across different devices in the raw database, this matching process can result in positioning errors.
This paper utilizes WiFi signals from the existing APs within the building. The overall system architecture, as illustrated in Figure 4, is structured into three stages: the offline stage, the pre-processing stage, and the online stage (prediction phase). Traditionally, fingerprinting methods involve two stages—offline and online. However, in our proposed approach, pre-processing is introduced as a third stage that occurs before the online prediction phase. During the offline stage, a crowdsourced fingerprinting database is constructed using data collected from multiple devices. The Received Signal Strength (RSS) of the WiFi signals is mapped across a grid within a controlled environment, where APs remain constant and only minimal human movement is considered. Volunteers with different hardware devices contribute to building this fingerprinting database. Before the estimation algorithm is applied in the online stage, pre-processing is performed to refine the data. As highlighted in Figure 8, outliers in the signal data can significantly increase prediction errors. To mitigate this, the robot's signal undergoes outlier removal using the interquartile range (IQR) outlier detection method. From 1000 signals collected at each location by four different devices, outliers are identified and discarded based on median computations provided by the IQR method,

Overview of RRB.
Then, the IQR is calculated as the difference between the third quartile (Q3) and the first quartile (Q1), that is, IQR = Q3 − Q1.
The IQR is multiplied by 1.5 and added to the third quartile (Q3); any value exceeding this threshold is discarded as an outlier. Similarly, for the lower boundary, the IQR is multiplied by 1.5 and subtracted from the first quartile (Q1); any value falling below this threshold is also discarded as an outlier.
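The outlier rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiments: the function name `remove_outliers_iqr`, the sample values, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) are assumptions, since the quartile formula is not reproduced in the text.

```python
import statistics


def remove_outliers_iqr(samples, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR].

    The quartile convention (method="inclusive") is an assumption;
    other conventions give slightly different Q1 and Q3.
    """
    q1, _median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr  # lower boundary: Q1 - 1.5 * IQR
    upper = q3 + k * iqr  # upper boundary: Q3 + 1.5 * IQR
    return [s for s in samples if lower <= s <= upper]


# Hypothetical RSS readings in dBm with one obvious outlier (-80).
rss = [-50, -51, -49, -52, -50, -48, -51, -80]
cleaned = remove_outliers_iqr(rss)
```

With these made-up readings only the −80 dBm sample falls outside the boundaries, so the remaining seven values are kept.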
After discarding the outliers, the pre-processing technique is applied by transforming the robot's signal distribution. Once transformed, the signal is fed into the DNN model for training. The system integrates and compares signals from multiple devices with the transformed pre-processed signal. The position prediction model is then constructed using the DNN, trained on this processed data. For validation, random signals from the robot are used to predict its location, allowing the model's accuracy to be assessed.
Pre-processing
The primary contribution of this paper is the introduction of a pre-processing technique designed to address the issue of device diversity in crowdsourced fingerprinting databases. The radio map derived from these databases is transformed into a format usable by the robot. The core idea behind this approach is to mitigate device diversity by aligning each signal distribution with a known reference signal from the robot. This concept is illustrated in Figure 5. Since the WiFi RSS data collected from different users varies significantly, the proposed method transforms each distribution to match the robot's signal characteristics. Previous research employed linear transformations to convert signal intensities from one device to another. However, our experimental analysis, using the Pearson correlation coefficient to evaluate the linear correlation between device signals and the robot's signal (as shown in Equation (3)), indicates that there is no significant correlation between the signals from different devices and the robot.
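As a rough illustration of the correlation test, the Pearson coefficient between a device's samples and the robot's samples can be computed as follows. The function name `pearson_r` is hypothetical, and pairing the two sample lists by collection order is an assumption, since the pairing used in the experiment is not described.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A magnitude near 1 would support a linear device-to-robot mapping, whereas a magnitude near 0 suggests that a linear transformation has little to work with.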

Graphical concept of data preprocessing by signal transformation.
The correlation test between the robot and the devices is validated by,
Given the lack of correlation between device signals, as identified in Ref. 25, we propose a simpler yet effective method called RRB. RRB operates using a ratio-based approach, transforming signals from one device to another by leveraging the ratio of the desired transformation. The mean signal strengths for both the devices and the robot are obtained from the crowdsourced WiFi fingerprinting database. The ratio between the robot's signal and the device signals is then calculated. This ratio is applied, as shown in Equation (5), to transform the device signals into a new signal that closely approximates the robot's signal.
If there is no correlation between the device and the robot, then the ratio is given as,
A critical challenge with the transformation process is when the device signal falls outside the range of the robot's signal. During the prediction phase, the DNN relies on the probability distribution of the signals. If the signals used for training closely match those encountered during testing, the DNN is more likely to deliver accurate predictions. Therefore, our approach aims to closely replicate the robot's signal distribution during the transformation process, ensuring that the transformed signals align as closely as possible with the robot's original signal distribution.
RRB employs a range-based method to discard signals that fall outside the robot's signal range. The pseudo code is outlined in Algorithm 1. The method determines the range of both the device and the robot by identifying their maximum and minimum signal intensities. These values are then compared: if the robot's maximum intensity exceeds the device's maximum intensity, or if the robot's minimum intensity is lower than the device's minimum intensity, any signal outside the robot's range is discarded. Otherwise, the device's signal is retained as it was after the transformation.
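The two RRB steps, the ratio-based transformation followed by the range-based filter, can be sketched as below. This is a simplified reading under stated assumptions, not a reproduction of Algorithm 1: the function name `rrb_transform` is hypothetical, the ratio is taken as the robot's mean RSS divided by the device's mean RSS, and every transformed sample outside the robot's minimum-to-maximum range is discarded without the preliminary comparison of the two ranges.

```python
def rrb_transform(device_rss, robot_rss):
    """Scale device RSS by the ratio of means, then keep only the
    transformed values that lie inside the robot's observed range.

    Simplifying assumption: the range filter is always applied.
    """
    device_mean = sum(device_rss) / len(device_rss)
    robot_mean = sum(robot_rss) / len(robot_rss)
    if device_mean == 0:
        raise ValueError("device mean RSS must be non-zero")
    ratio = robot_mean / device_mean  # ratio step
    transformed = [s * ratio for s in device_rss]
    low, high = min(robot_rss), max(robot_rss)  # robot's signal range
    return [s for s in transformed if low <= s <= high]  # range step


# Hypothetical readings in dBm: device mean is -49, robot mean is -44.
device = [-50, -49, -48, -49]
robot = [-45, -44, -43, -44]
aligned = rrb_transform(device, robot)
```

In this made-up example all four transformed samples fall inside the robot's range and their mean moves from −49 dBm to −44 dBm. A device sample far from the device mean would land outside the robot's range after scaling and be dropped.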
Estimation using DNN
A DNN is a machine learning model that utilizes deep learning architecture, rooted in artificial intelligence concepts. The primary objective of a DNN is to optimize the designed model using a backpropagation technique, which iteratively adjusts the model to achieve the best possible performance. In the context of crowdsourced data, where the dataset is large and continually growing, the DNN is particularly effective. It adaptively optimizes itself based on the input data, making it well-suited for handling the complexities and scale of crowdsourced fingerprinting databases.
The DNN approach is a supervised machine learning method that estimates location in two main steps. The first step involves training the model using the provided data. Once a robust model is developed, the DNN is used to forecast the robot's position. During training, the DNN iteratively refines the model to improve its predictive accuracy, enabling it to more accurately estimate the robot's location during the prediction phase.
To begin utilizing a DNN classifier, the data must first be labeled. This labeling is done by using the signals from three APs as the features. The labeled data then undergoes the training phase, during which the model's hyperparameters are selected heuristically, as detailed in Table 1. In this experiment, the focus is on the impact of different devices, so the hyperparameters will remain constant across all devices. The DNN model will use two hidden layers: the first with 250 nodes and the second with 20 nodes. To minimize overfitting, a dropout rate of 0.2 will be applied. The activation functions for both hidden layers are sigmoid, as shown in Equation (8), since they are capable of handling negative values. In the output layer, the SoftMax function is employed as the activation function. This function, suitable for multinomial probability distributions, uses the highest value in the output nodes to predict the user's location.
Hyperparameter values.
Finally, the Adam optimizer is selected to optimize the model. The backpropagation technique is used to learn from the output, iteratively adjusting the model by minimizing the output error to improve accuracy.
The DNN computes its output from the input features supplied to it. The primary objective of classification using a DNN is to identify the output with the highest probability, which the model has optimized through training. This selection process is governed by a series of mathematical formulas. Initially, each input xi is multiplied by its corresponding weight ωi, and the products are summed to compute the value u passed to the next node in the hidden layer.
The weighted sum u is then passed through the non-linear activation function of the hidden layer, the sigmoid, to obtain an output value between 0 and 1,
The output of the last layer is then passed to the SoftMax function, which gives the ratio of each output's exponential to the sum of all the exponentials. The output node with the maximum probability is taken as the prediction.
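The forward computation described above (weighted sum, sigmoid, SoftMax) can be sketched in plain Python. This is an illustrative sketch only: it uses a single small hidden layer for brevity rather than the two hidden layers of 250 and 20 nodes, the bias terms are an assumption (the text mentions only weights), and all function names are hypothetical.

```python
import math


def sigmoid(u):
    """Numerically stable sigmoid; the output lies between 0 and 1."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def softmax(z):
    """Exponential of each entry divided by the sum of all exponentials."""
    m = max(z)  # subtracting the max avoids overflow, result is unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Weighted sum u = sum(w_i * x_i) + b for every node in a layer.

    weights holds one row of input weights per node; biases are assumed.
    """
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """One hidden sigmoid layer followed by a SoftMax output layer."""
    hidden = [sigmoid(u) for u in dense(x, hidden_w, hidden_b)]
    return softmax(dense(hidden, out_w, out_b))


# Hypothetical input: RSS (dBm) from three APs, two hidden nodes, two classes.
probs = forward([-50.0, -60.0, -55.0],
                [[0.01, -0.02, 0.01], [-0.01, 0.02, 0.0]], [0.0, 0.0],
                [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])
predicted_class = probs.index(max(probs))
```

The index of the largest probability plays the role of the predicted location label.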
After forward propagation, backpropagation is executed to adjust the weights based on the prediction error between the output layer and the labeled location. This process utilizes a cost function and gradient descent to optimize the weight adjustments. The Adam optimizer, which combines the advantages of root mean square propagation (RMSProp) and the adaptive gradient algorithm (AdaGrad), is employed to accelerate the optimization process. This approach is consistent with the methodology outlined in our previous publication. 30
Experimental data collection
The data collection was conducted in the Solid Mechanics and Acoustics lab at Universiti Malaysia Perlis, encompassing 42 reference points, arranged with 6 points along the x-axis and 7 along the y-axis. This setup is based on a cell-level environment with a granularity of one meter. The experimental area, depicted in Figure 6, was specifically designed for indoor mobile robot applications. Data collection occurred in a semi-controlled environment where all configurations were kept constant, with the only variable being the position of the mobile devices used to collect data at each reference point (RP). To replicate real-world conditions as closely as possible, human movement through the fingerprinting area was considered. This approach ensures that the data collected provides a realistic representation of the environment, allowing for meaningful analysis of the impact of various devices. The data gathering process followed the methodology outlined in Ref. 30

Experimental test bed.
Three TP-Link (TL-WR841N V14) routers operating at a nominal frequency of 2.4 GHz were employed, along with four different devices: a Galaxy A7 (2017), a Galaxy Tab 8.0, a Galaxy Tab A, and a Mi A2 Lite. A Nitro 5 Laptop AN515-45 served as the robot's processing unit, supporting 802.11a/b/g/n/ac/ax standards, compatible with most WiFi routers. At each reference location, 1000 samples were collected over a 20-minute period. The impact of the four distinct devices on position estimation was then assessed at each RP. This analysis will explore in detail how the use of multiple devices affects positioning accuracy. The signals from the four devices were studied and integrated into a single composite signal, simulating the use of multiple devices. To evaluate positioning accuracy, 250 samples from each device were combined into one comprehensive database.
Figure 7 illustrates the grid system of the signal fingerprinting database's reference points. To ensure that the unique devices are the primary variable under investigation, all other factors were kept constant, with only the devices varying. The database consists of 42 reference points, spaced 1 m apart. The vertical axis of the fingerprinting grid spans 6 m, while the horizontal axis spans 5 m. Data collection was conducted using three APs, designated AP1, AP2, and AP3. Multiple APs were chosen to address the issue of similar signal strengths at nearby locations, as some signals from random locations may be identical. Using three APs, which act as beacons in a trilateration approach, is the optimal configuration. The intersections of signals from three APs reduce the possible location estimates to a single point, providing a higher accuracy and fewer errors compared to using only two APs. While the focus of this experiment is not on the APs themselves, their number was fixed at three throughout the data collection process to maintain consistency.
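For a classifier, each of the 42 reference points needs a class label. One possible mapping between labels and grid coordinates is sketched below. The origin at (0,0), the row-major label order, and the helper names are assumptions made for illustration; the labeling scheme used in the experiment is not specified in the text.

```python
NX, NY = 6, 7      # 6 points along x, 7 points along y
SPACING_M = 1.0    # grid spacing in meters


def xy_to_label(x, y):
    """Map grid indices (x, y) to a class label (row-major, assumed)."""
    if not (0 <= x < NX and 0 <= y < NY):
        raise ValueError("grid index out of range")
    return y * NX + x


def label_to_xy(label):
    """Inverse mapping from a class label back to grid indices."""
    if not 0 <= label < NX * NY:
        raise ValueError("label out of range")
    return (label % NX, label // NX)


# All reference points in meters, assuming the origin sits at (0, 0).
reference_points = [
    (x * SPACING_M, y * SPACING_M)
    for x, y in (label_to_xy(i) for i in range(NX * NY))
]
```

With this layout the x coordinates span 0 m to 5 m and the y coordinates span 0 m to 6 m, matching the stated grid dimensions.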

Experimental area.
This experiment utilizes three APs by employing three routers, each equipped with two fixed antennas, operating at the nominal 2.4 GHz. The reference points are mapped along the X and Y axes, as illustrated in Figure 7. The database is represented by the string
Validation of proposed work
Indoor positioning systems (IPS) offer significant accuracy within indoor environments. To assess the performance of the proposed system, validation is essential. In the context of mobile robots, the validation process follows established IPS validation methods, as the primary objective is to ensure the accurate positioning of the mobile robot.
The proposed system will be validated using multiple methods, including accuracy metrics, distance error, and root mean square error (RMSE). These three validation methods will be employed to assess the effectiveness of the IPS for autonomous mobile robot applications. The primary goal is to minimize the distance error between the true location and the predicted location. A detailed discussion of these validation methods will be provided in this section.
The data is analyzed by calculating the difference between the true RP and the estimated position. One of the key evaluation metrics is the distance error, which is calculated using Euclidean distance, as shown in Equation (10). The results are presented in a cumulative distribution function (CDF) graph, where the frequency of distance errors is examined by assessing the minimum, average, and maximum distance errors.
Accuracy is determined as the ratio of correct predictions to the total number of prediction samples, as in Equation (11).
Comparison of existing method.
RMSE expresses the error in terms of standard deviation. The RMSE value is used for comparison with other works in a similar domain.
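The three validation metrics can be sketched as follows. The function names are hypothetical, and the RMSE is assumed here to be taken over the Euclidean distance errors, since its equation is not reproduced in the text.

```python
import math


def distance_error(true_xy, est_xy):
    """Euclidean distance between the true and estimated positions."""
    return math.hypot(true_xy[0] - est_xy[0], true_xy[1] - est_xy[1])


def accuracy(true_pts, est_pts):
    """Ratio of exactly correct predictions to the total number of samples."""
    correct = sum(1 for t, e in zip(true_pts, est_pts) if t == e)
    return correct / len(true_pts)


def rmse(true_pts, est_pts):
    """Root mean square of the distance errors (assumed definition)."""
    squared = [distance_error(t, e) ** 2 for t, e in zip(true_pts, est_pts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical true and estimated reference points on a 1 m grid.
true_pts = [(1, 1), (2, 3), (0, 0)]
est_pts = [(1, 1), (2, 4), (3, 4)]
errors = [distance_error(t, e) for t, e in zip(true_pts, est_pts)]
```

For these made-up points the distance errors are 0 m, 1 m, and 5 m, so one prediction in three is exactly correct.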
Results and findings
The first analysis focuses on device diversity among the four devices, underscoring the core motivation of this paper. Figure 8 illustrates the signal variations recorded by four different devices at the same location, specifically at coordinates (1,1). It is evident that the signal distributions differ significantly, despite being collected in the same location under identical environmental conditions. This variation arises from the differing hardware characteristics of each device, as discussed in earlier sections. Such discrepancies can significantly impact the accuracy of mobile robot positioning, as the fingerprinting technique relies on matching the robot's signal with the radio map of collected data. If the signal variation is too pronounced for the same location, it poses a substantial challenge during the matching stage.

Behavior of different devices at the same location.
In Figure 8, the signal data from four devices at the same location, specifically at coordinates (0,1), is shown. Notably, there is an outlier in the data from the Galaxy A7 (2017). This highlights a common issue in fingerprinting techniques, where outliers can lead to significant errors during the matching process. To mitigate this, the outlier is removed using the IQR outlier detection method, ensuring more accurate and reliable matching.
Previous studies have suggested that devices within crowdsourced databases exhibit a linear correlation. However, a deeper analysis reveals the opposite. In this work, the cross-correlation between the four devices and the robot is presented in Table 3. Using the Pearson correlation coefficient R, whose magnitude ranges from 0 to 1 (with values closer to 1 indicating strong correlation and values closer to 0 indicating weak correlation), the correlations of the Mi A2 Lite, Galaxy Tab A, Galaxy Tab 8.0, and Galaxy A7 (2017) with the robot were found to be 0.051, 0.067, 0.014, and 0.062, respectively. These results clearly indicate that there is almost no correlation between the devices and the robot. Consequently, the linear transformation method commonly employed by researchers is ineffective in this context.
Pearson correlation between robot and devices.
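For reference, the Pearson coefficient between a device's readings and the robot's readings can be computed as in the sketch below. It assumes the two series are paired sample by sample and have equal length; how the readings were paired in the experiments is not stated, and the sample values are hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# Hypothetical paired RSS readings (dBm) at the same location.
robot = [-44, -45, -43, -46, -44]
device = [-49, -48, -50, -49, -51]
r = pearson_r(robot, device)
```

A value of `r` near zero, as reported in Table 3, means that no straight-line mapping from device readings to robot readings explains the data well.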
The underlying idea is to transform the signal from a device so that it matches the robot's signal, as illustrated in Figure 5. However, the correlation analysis indicates that a linear transformation is not feasible. Instead, a ratio-based method is proposed, which maps the device signal to the robot signal by multiplying it by the ratio of the robot's mean signal to the device's mean signal. This approach forms a key part of the pre-processing stage, aligning the data from the devices with the robot's signal.
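Written as a formula, the ratio-based transformation takes the following form. The notation is introduced here for illustration only: $s_d(i)$ is the $i$-th RSS reading of device $d$ at a given location, $\bar{s}_d$ is the mean of that device's readings there, and $\bar{s}_r$ is the mean of the robot's readings at the same location.

```latex
\tilde{s}_d(i) = s_d(i) \cdot \frac{\bar{s}_r}{\bar{s}_d},
\qquad\text{so that}\qquad
\frac{1}{N}\sum_{i=1}^{N} \tilde{s}_d(i) = \bar{s}_d \cdot \frac{\bar{s}_r}{\bar{s}_d} = \bar{s}_r .
```

By construction, the mean of the transformed device readings equals the robot's mean, while the spread of the device readings is scaled by the same factor.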
Multiple access points
To address the issue of signal similarity across different locations, multiple APs were introduced. As illustrated in Figure 9, data collected from two distinct locations, (1,1) and (3,1), which are 2 m apart, exhibited high signal similarity, both averaging around −51.5 dBm. In fingerprinting techniques, where location estimation relies on a matching system, such similarities can lead to significant errors. To mitigate this problem, three APs were used to generate a more accurate and distinct radio map, thereby reducing the potential for errors caused by overlapping signals.

RSS at different locations.
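The effect of adding APs can be illustrated with a small sketch. Only the shared value of about −51.5 dBm for the first AP comes from the text; the readings for the second and third APs are made-up values chosen purely for illustration, and `fingerprint_distance` is a hypothetical helper.

```python
import numpy as np

# Mean RSS (dBm) per AP at two locations. The AP1 value (-51.5) follows the
# text; the AP2 and AP3 values are hypothetical.
fingerprints = {
    (1, 1): np.array([-51.5, -60.0, -70.0]),
    (3, 1): np.array([-51.5, -68.0, -58.0]),
}

def fingerprint_distance(a, b, n_aps):
    """Euclidean distance between two fingerprints using the first n_aps APs."""
    return float(np.linalg.norm(a[:n_aps] - b[:n_aps]))

one_ap = fingerprint_distance(fingerprints[(1, 1)], fingerprints[(3, 1)], 1)
three_aps = fingerprint_distance(fingerprints[(1, 1)], fingerprints[(3, 1)], 3)
```

With a single AP the two locations are indistinguishable (`one_ap` is zero), whereas the three-AP fingerprints are clearly separated in signal space.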
Pre-processing
Pre-processing transforms the signal data from the various devices to align with the robot's signal, where the Nitro 5 Laptop AN515-45 serves as the processing unit. The fingerprinted signals from the devices are first compared with the robot's signal. As illustrated in Figure 10, the transformation is achieved by multiplying each device signal by the ratio of the robot's mean signal to the device's mean signal. The initial signal distributions, as shown on the graph, differ in intensity. In our proposed system, after outliers are excluded and the ratio technique is applied, as depicted in Figure 11, the device signals align with the robot's signal distribution. The average RSS of the device, initially at −49 dBm, is transformed to −44 dBm, matching the robot's signal intensity. This simple ratio-based approach addresses the device diversity problem without the need for complex algorithms or methodologies.

Before and after transformation.
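The transformation from −49 dBm to −44 dBm can be reproduced with the short sketch below. The individual readings are hypothetical values chosen so that their means match the figures quoted in the text, and `ratio_transform` is an illustrative name rather than the authors' implementation.

```python
import numpy as np

def ratio_transform(device_rss, robot_rss):
    """Scale device readings by (robot mean / device mean)."""
    device = np.asarray(device_rss, dtype=float)
    robot = np.asarray(robot_rss, dtype=float)
    ratio = robot.mean() / device.mean()
    return device * ratio

# Hypothetical readings (dBm): robot mean is -44, device mean is -49.
robot = [-43, -44, -45, -44]
device = [-48, -49, -50, -49]
transformed = ratio_transform(device, robot)
```

After the call, the mean of `transformed` is −44 dBm, equal to the robot's mean. Because every reading is multiplied by the same factor, the spread of the device readings is scaled as well, which is why a further range check against the robot's signal is still useful.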

Range exclusion based on robot's RSS.
After the device signals are transformed to align with the robot's signal distribution, a new issue arises: some of the transformed device signals span a wider range than the robot's signal, as shown in Figure 11. To address this, our second approach excludes the excess signals using a range-based method.
The robot's signal, represented by the blue line, spans a narrower range than that of the Mi A2 Lite. The goal is to restrict the device signals to the range of the robot's signal. To achieve this, we apply a range-based exclusion approach. After the transformation, a device signal that falls within the robot's signal distribution is retained, whereas a device signal that lies outside the robot's range is excluded. This ensures that the processed device signals closely match the robot's signal, which facilitates the DNN training phase.
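The range-based exclusion can be sketched as a simple filter. The sketch assumes that the robot's range is defined by the minimum and maximum of the robot's readings at the same location; the text does not specify the exact bounds, and the names and sample values are hypothetical.

```python
import numpy as np

def range_exclude(transformed_rss, robot_rss):
    """Keep transformed device readings inside [min(robot), max(robot)].

    Using the robot's min/max as the bounds is an assumption.
    """
    t = np.asarray(transformed_rss, dtype=float)
    robot = np.asarray(robot_rss, dtype=float)
    low, high = robot.min(), robot.max()
    return t[(t >= low) & (t <= high)]

# Hypothetical readings (dBm): the robot spans -46 to -42.
robot = [-46, -44, -45, -43, -42]
transformed = [-50, -45.5, -44, -41, -43]
kept = range_exclude(transformed, robot)  # -50 and -41 fall outside the range
```

Only the readings that the robot itself could plausibly observe at that location remain in the training set.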
DNN before and after using RRB
The raw data from four different devices is combined to simulate a crowdsourced fingerprinting database. After merging the device signals, the raw data is directly input into the DNN algorithm for training. Following the training phase, the robot's signal is used as a validation test for the DNN model. As shown in Figure 12, the validation accuracy fluctuates between 6% and nearly 12%, highlighting the significant impact of device diversity on the accuracy of the robot's localization. These results underscore the challenges posed by device heterogeneity in achieving precise localization.

Accuracy before and after pre-processing.
After the RRB method is applied for signal transformation, the data from the four devices are fed into the DNN algorithm for training. The model is trained using the pre-processed data from these devices, and the robot's signal is then used for validation. As depicted in Figure 12, the validation accuracy, with the robot's signal used for location prediction, shows a significant improvement. The accuracy rises from 45% to 63% over 100 epochs, a nearly sixfold increase over the approximately 10% obtained with the raw data without pre-processing. This substantial improvement highlights the effectiveness of the transformation process in enhancing localization accuracy.
A comparison between the results before and after applying the RRB method was plotted, revealing a significant improvement. The graphs indicate that the accuracy increased nearly sixfold after pre-processing, demonstrating the effectiveness of our proposed method for transforming signals for autonomous robot location estimation. Although some fluctuations are observed in the estimation, the average accuracy remains around 60%, which is a substantial improvement compared to the approximately 10% accuracy achieved without pre-processing, underscoring the solution's ability to mitigate the challenges posed by device diversity.
A cumulative distribution function (CDF) graph was plotted to compare the distance error before and after applying the RRB technique. As shown in Figure 13, prior to pre-processing, the maximum distance error reached 4 m, with an average error of 2.25 m and a minimum error of 1 m. After the proposed pre-processing method was applied, the maximum error was reduced to 3 m, an improvement of about 1 m. The average distance error dropped to 0 m, compared to 2.25 m without pre-processing, and the minimum error also reached 0 m. These results demonstrate a significant improvement after pre-processing, with the accuracy increasing nearly sixfold and the distance error decreasing by 1–2 m. This improvement leads to more accurate localization for the autonomous robot.

Cumulative distribution of the distance error.
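The three validation measures used in this paper can be computed as in the sketch below. It assumes that grid coordinates are expressed in meters, that the distance error is the Euclidean distance between the true and estimated positions, and that RMSE is the root mean square of those distances; the exact definitions are not given in the text, and the sample positions are hypothetical.

```python
import numpy as np

def distance_errors(true_xy, est_xy):
    """Euclidean distance (m) between true and estimated positions."""
    true_xy = np.asarray(true_xy, dtype=float)
    est_xy = np.asarray(est_xy, dtype=float)
    return np.linalg.norm(true_xy - est_xy, axis=1)

def empirical_cdf(errors):
    """Sorted errors and the cumulative fraction of samples at or below each."""
    x = np.sort(errors)
    p = np.arange(1, len(x) + 1) / len(x)
    return x, p

# Hypothetical true and estimated grid positions (m).
true_xy = [(1, 1), (1, 1), (3, 1), (3, 1), (0, 1)]
est_xy = [(1, 1), (1, 1), (3, 1), (1, 1), (0, 1)]

err = distance_errors(true_xy, est_xy)
rmse = float(np.sqrt(np.mean(err ** 2)))   # root mean square error
accuracy = float(np.mean(err == 0))        # fraction of exact matches
x, p = empirical_cdf(err)                  # points of the CDF curve
```

Plotting `p` against `x` gives a curve of the kind shown in Figure 13; summary statistics such as `err.max()`, `err.mean()`, and `np.median(err)` can be read from the same array.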
Comparison with other methods
The methods used by previous researchers are varied, with different data collection approaches and techniques, making direct comparisons challenging. However, meaningful comparisons can be made by focusing on studies that are closely related to our experimental work. To ensure an effective comparison, we will evaluate these studies based on their objectives, while disregarding the specific methods used by the authors. In this paper, the validation of our proposed system is conducted through a comparison with previous methods addressing device heterogeneity, assessing how well our system predicts based on accuracy (%), RMSE, and distance error.
In the literature, various methods have been discussed, but two systems, HAIL and FAGO, are the most closely aligned with the objectives of this paper. Table 2 presents a comparison between our proposed system, referred to as RRB, and these previous works, focusing on the issue of device diversity in crowdsourced fingerprinting databases. While our system is specifically designed for autonomous mobile robots, the underlying challenges and goals are similar across the compared methods. The results in Table 2 demonstrate that our proposed system outperforms the others in terms of RMSE, with an error of less than 1 m, compared to FAGO's 2.7-m error. The HAIL method shows an average distance error ranging from 1.38 m to 1.78 m and a maximum error of 6.2 m, whereas our method achieves a 0-m average error and a 3-m maximum distance error, indicating superior performance.
Conclusion
This paper aims to advance the field of indoor positioning for autonomous mobile robots. Despite significant progress in developing accurate, precise, and robust systems through extensive experimentation, there remains room for improvement. In this study, we developed an IPS that relies solely on WiFi for the localization of autonomous mobile robots. Traditional linear transformation methods were found inadequate for signal transformation between devices due to the lack of correlation, as demonstrated in this paper. We propose a new pre-processing technique that utilizes ratio- and range-based methods to address the issue of device diversity in crowdsourced fingerprinting databases. Additionally, multiple APs were employed to mitigate signal similarities across different locations. A DNN was designed and implemented to adapt to the updated crowdsourced fingerprinting database.
The proposed technique (RRB) significantly outperforms the raw data approach, with accuracy improving from approximately 10% to approximately 63%, a nearly sixfold increase. The distance error was also reduced, with the maximum error decreasing from 4 m without pre-processing to 3 m with pre-processing. Compared to similar previous work, our system demonstrates superior performance in terms of accuracy, distance error, and RMSE.
Future work will focus on addressing environmental variations that affect the robot's signal, as changes in the environment can impact positioning accuracy. This will be a key area of contribution for indoor positioning in mobile robot applications. Additionally, optimizing the DNN hyperparameters will be a priority to achieve the most efficient design, further enhancing the IPS specifically for autonomous mobile robots.
Footnotes
Acknowledgment
The authors would like to acknowledge the support of the Ministry of Education Malaysia through the Fundamental Research Grant Scheme (FRGS) under grant number FRGS/1/2018/ICT05/UNIMAP/02/4.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Japan Science and Technology Agency (JST), Japan, “Precursory Research for Embryonic Science and Technology” (PRESTO) (grant number JPMJPR2135), Japan Society for the Promotion of Science (JSPS), Japan “KAKENHI” (grant number JP21H00901), and the Ministry of Higher Education, Malaysia (grant number FRGS/1/2018/ICT05/UNIMAP/02/4).
