Abstract
Because it works in almost every scenario, mobile localization based on cellular networks has attracted increasing interest in recent years. Since received signal strength indication (RSSI) information is available in all mobile phones, RSSI-based techniques have become the preferred method for GSM localization. Although the GSM standard allows a mobile phone to receive signal strength information from up to seven base stations (BSs), most mobile phones use only the position of the associated cell as their estimated position. The accuracy of GSM localization is therefore seriously limited. In this paper, an algorithm for GSM localization is proposed that combines RSSI with Pearson's correlation coefficient (PCC). The information of seven cells, namely the serving cell and six neighboring cells, is used to estimate the mobile location accurately. With this redundant information, the proposed algorithm reduces the error of Cell-ID and is robust against environmental change. Because it needs no additional device or prior statistical knowledge, the proposed algorithm can be implemented on common mobile devices. In the practical test, its maximum error is below 550 m, which is 100 m better than that of Cell-ID, and its mean error is below 150 m, which is 250 m better than that of Cell-ID.
1. Introduction
With the proliferation of mobile devices and the corresponding improvements in wireless communication, localization in mobile networks has become one of the hottest topics in wireless and mobile computing research [1, 2]. However, acquiring sufficient localization accuracy for location-based services (LBSs) remains a key problem. According to the U.S. Enhanced 911 (E-911) regulations adopted by the U.S. Federal Communications Commission, all emergency calls made from cellular phones have to be localized within an accuracy of 125 m in 67% of the cases [3]. GPS is attractive because of its high accuracy, but it is the most power-consuming positioning method. When the GPS receiver is turned on, the battery of a current mobile phone lasts only a few hours.
Despite its low accuracy, Cell-ID positioning is regarded as the basic positioning method in most cellular communication systems [4, 5]. It reports the identity or a geographical description of the cell to which the terminal is connected. Cell-ID takes the center of the associated cell (usually the nearest one) as the estimated position. Because Cell-IDs are transmitted over the control channel, they are easy to obtain. The method can be used on any GSM device without additional hardware or prior statistical knowledge, so it is applicable wherever there is cellular coverage. Another advantage of the Cell-ID method is its short response time, because the Cell-ID is generally stored in the mobile terminal together with other basic information related to the connection. Owing to its simplicity and low cost, Cell-ID has become the most widely preferred method for mobile localization.
The main drawback of Cell-ID is that its accuracy is bounded by the cell size, because the estimate is always the center of the cell area [6, 7]. The conventional Cell-ID positioning method can provide only low accuracy because the cell size in GSM networks, especially in rural areas, is relatively large. This has led to different attempts to enhance the accuracy of the Cell-ID positioning method. For instance, the timing advance value has been used to reduce the effective cell size and improve the accuracy [8].
In this paper, an improved Cell-ID method is proposed for GSM localization. Rather than relying on the Cell-ID of the serving BS alone, the proposed method fully utilizes the information of all seven cells. With received signal strength indication (RSSI) and Pearson's correlation coefficient (PCC), the proposed method can accurately estimate the mobile location. Furthermore, the redundant information effectively suppresses environmental interference and shadow fading. Experimental results show that the proposed method achieves higher accuracy than conventional Cell-ID and its enhanced version.
The contributions of this paper can be summarized as follows:
(1) Compared with TOA, AOA, city-wide WiFi, and augmented sensor-based systems, our proposed method uses RSSI, which requires no additional hardware. It is therefore easier to implement on common mobile devices. Furthermore, the proposed algorithm does not depend on prior statistics, so it can be used in any place covered by a GSM signal.
(2) We fully utilize the information of all available cells, including the serving cell and its six neighboring cells, to attain better localization accuracy. By contrast, typical Cell-ID methods use only the information of the serving cell.
(3) We use Pearson's correlation coefficient, instead of the Euclidean distance between the two vectors, as the evaluation function, which provides better robustness.
2. Related Works
A broad spectrum of solutions, such as received signal strength (RSS), time of arrival (TOA) [9], time difference of arrival (TDOA) [10], and angle of arrival (AOA) [11], has been proposed to achieve mobile localization by measuring the radio signal traveling between a mobile terminal and base stations (BSs) [12, 13]. Other proposed methods include fingerprinting and max-min-box [5]. These techniques, except for RSS, often depend on additional hardware and databases, which means additional cost and a heavier computational burden. For example, AOA-based methods require an antenna array to identify the signal's angle, while TOA-based methods often need strict time synchronization. Furthermore, cellular radio propagation often degrades these methods. When obstacles exist in the propagation path of the signal, these methods suffer from non-line-of-sight (NLOS) and multipath propagation. Because of the electromagnetic propagation properties, particularly in urban areas [14], NLOS errors are very likely to corrupt the original signal and increase the estimation error significantly [15]. By comparison, fingerprint positioning can attain good performance, but it requires a time-consuming site survey [5] and cannot adapt to a dynamic environment.
As the fundamental positioning method of most cellular communication systems, Cell-ID has been extensively researched and implemented. For typical Cell-ID based positioning, the area under study is divided into several cells. Generally, the shape of the cells is irregular and depends strongly on the propagation environment. For the classical Cell-ID approach, the smaller the cells are, the better the accuracy of Cell-ID based localization [6]. Therefore, an investigation of cell sizes gives a rough idea of the accuracy that can be obtained. In [8], the statistical modeling of user motion and the measurements are done via a hidden Markov model (HMM). The results show smaller cells in the Pre-WiMAX network than in the GSM network. Hence, Cell-ID positioning in the Pre-WiMAX network provides better accuracy than in the GSM network. However, this might not hold in other parts of the world, because Pre-WiMAX networks cover only some countries.
In many places in the world, the density of cell towers is so low that the cell tower information available for localization is very limited. To enhance the accuracy of localization, a probabilistic approach can be used. In [13], the signal strength history from only the associated cell tower is utilized to achieve accurate GSM localization. Compared to current RSSI-based GSM localization systems, the authors report at least 156% enhancement in median error in rural areas and 68% in urban areas. To some extent, uncertainties in power-distance mapping and the dynamics of propagation models can degrade the performance of the positioning system. In [16], the authors present a Cell-ID Aided Positioning System (CAPS), which leverages near-continuous mobility and the position history of a user. CAPS is designed on the insight that users exhibit consistency in the routes they travel and that the Cell-ID transition points a user experiences can uniquely identify a position on a frequently traveled route. With a Cell-ID sequence matching technique, CAPS estimates the user's position based on the history of Cell-ID and GPS positions. In [17], the authors propose a time-delay neural network to efficiently learn the mobile location from sequential received signal strength. By embedding the temporal structures of RSS into the spatial structures of networks, the proposed algorithms can extract location information from the temporal variation of RSSs rather than removing it.
Generally, positioning techniques based on RSSI give a more precise estimate than Cell-ID [18]. They do not require directional antennas or extra time synchronization hardware. In fact, some enhanced Cell-ID algorithms exploit radio measurements to determine a distance to the terminal. Measurements of path loss or round-trip time (RTT) have been proposed. However, the path-loss measurement suffers from shadow-fading effects. In [7], the author proposes an algorithm that groups all the reference points into several clusters and allows multiple reference points per region for mobile localization. Each cluster is tagged according to the detected set of neighbor cells, auxiliary connection information, and auxiliary measurements that are performed simultaneously with high-precision positioning. This method can produce areas, with a high prespecified confidence, of a size equal to 20%–50% of the original cell. It can also be viewed as a robust fingerprinting algorithm. Collecting realistic RSS data in the target area may also reduce the uncertainties, but it requires a time-consuming site survey. To develop a calibration-free RSS-based localization system, [19] proposes to utilize the pairwise information between base stations to localize the user based on multidimensional scaling. This approach further considers the geometric structure between base stations to compensate for errors in distance estimation and can therefore achieve better accuracy.
The mobile station continuously measures signal strengths from both the serving cell and its neighboring cells. Undoubtedly, more information is better for target positioning. In [20], a novel Cell-ID localization algorithm based on a hidden semi-Markov model (HSMM) is proposed. All the Cell-IDs detected by mobile nodes are utilized, and the positioning results are obtained by maximizing a posteriori estimation criterion via the HSMM. The method can obtain superior positioning accuracy of 455 m compared with the classical Cell-ID approach on average. By evaluating measurements from each neighboring site presented in the network measurement report (NMR), the original area obtained from the Cell-ID (and possibly time arrival) can be cropped down to a smaller one by removing the parts that are unlikely to enclose the terminal's location [21, 22]. Absolute RSS values received from a base station change with time, but the relative RSS (RRSS) values, which refer to the relations of the RSS values between different BSs, are more stable. In [23, 24], the authors propose the Database Correlation Method (DCM) on the basis of a database of premeasured RSS. Field tests show that the mean positioning accuracy is about 29 m in urban areas and that the velocity estimation is about 1 km/h in rural areas.
Some enhanced Cell-ID algorithms utilize the signals of neighboring cells perceived by the GSM device. In [25], the authors consider the effect of shadow fading and obtain several propagation distance candidates between the MS and each BS. A Gaussian model is built to represent the RSSI of each BS. One of its disadvantages is that some parameters must be obtained from an empirical model determined by the environment around the user. Therefore, it cannot be widely used in different environments without modification. Another data-fusing algorithm to enhance Cell-ID (ECID) is proposed in [26]. This method is based on a standard parameter separation least-square algorithm and follows the convergence of the gradient-descent algorithm to determine the MS location. Its iterative algorithm is time-consuming on an embedded system. Furthermore, the least-square algorithm only solves the problem of using different antennas but ignores environmental change. Similar problems occur in most fingerprint-based or database-matching algorithms [27, 28]. These algorithms work only with databases that are built beforehand. Additionally, a database built for one particular phone model will not fit all other phones.
Cell-ID localization accuracy may also be improved by further techniques, such as map-snapping, movement prediction, or combination with other technologies [29]. Some research also proposes combining GPS and GSM Cell-ID positioning, in which case energy efficiency must be considered [30].
3. Mobile Localization Algorithm Based on RSSI and Pearson's Correlation Coefficient
3.1. Problems in Triangulation
With the RSSI and the output power of the base station (BS), it is possible to estimate the distance between the mobile device and the BS. If the device can obtain its distance from three nearby cells or BSs, RSSI-based mobile localization becomes the well-known triangulation positioning problem, which is shown in Figure 1.

Figure 1: A sample of triangulation positioning.
This problem can be described as below:

Figure 2: Triangulation suffering from interference and shadow fading.
Equation (2) may not have an analytical solution. Although it might be solved with numerical algorithms, for example, Newton iterations, the computational cost is too high for low-end embedded systems.
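The difficulty can be illustrated with a minimal sketch. It assumes a log-distance path-loss model to turn RSSI into range (the reference power `p0_dbm` and exponent `n` are illustrative values, not parameters from this paper) and linearizes the three circle equations by subtracting the first from the other two. The function names are hypothetical. With ideal ranges the linear system recovers the position exactly; once one range is perturbed, as shadow fading would do, the three circles no longer meet in a single point and the closed-form answer is only an approximation.

```python
import math

def rssi_to_distance(rssi_dbm, p0_dbm=-40.0, n=3.5):
    """Invert the assumed model rssi = p0 - 10*n*log10(d); d in metres."""
    return 10 ** ((p0_dbm - rssi_dbm) / (10.0 * n))

def trilaterate(bs, dists):
    """Estimate (x, y) from three BS positions and three ranges.

    Subtracting circle 1 from circles 2 and 3 cancels the quadratic
    terms, leaving a 2x2 linear system solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = bs
    d1, d2, d3 = dists
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-9:
        raise ValueError("base stations are collinear")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

# Illustrative geometry (not from the paper).
bs = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
true_pos = (300.0, 400.0)
exact = [math.hypot(true_pos[0] - bx, true_pos[1] - by) for bx, by in bs]
ideal_fix = trilaterate(bs, exact)

# Overestimate one range by 30%, as a faded signal would cause.
faded = [exact[0], exact[1] * 1.3, exact[2]]
faded_fix = trilaterate(bs, faded)
faded_error_m = math.hypot(faded_fix[0] - true_pos[0], faded_fix[1] - true_pos[1])
```

A single biased range is enough to move the fix by a large margin, which is why redundancy from more than three cells is attractive.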
3.2. Enhanced Cell-ID Algorithms
From each neighboring cell, a GSM device can obtain information such as the RSSI, the Cell-ID number, and the LAC number. The latter two determine the position of the cell. The device can therefore obtain the location and RSSI of seven cells: six neighboring cells and one serving cell. These seven cells are shown in Figure 3.

Figure 3: Serving cell and neighboring cells.
With Bayesian estimation, an enhanced Cell-ID method can be formed [25]. A Gaussian model is built to represent the RSSI of each BS:
The long-term median
Note that
Here, the maximum value of
3.3. Pearson's Correlation Coefficient Localization
In this paper, we propose a localization algorithm for GSM mobiles based on RSSI and Pearson's Correlation Coefficient (RPCC). The main idea is to make good use of the seven-cell information, which is more than is needed to locate a two-dimensional position, and to use this redundancy to minimize the influence of disturbances, barriers, or NLOS errors on performance.
GSM signal propagation and attenuation can be described by the Okumura-Hata model [31]:
For GSM localization, f (frequency),
Evidently, there is a linear relationship between L and R.
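As a hedged illustration, the sketch below implements the widely published urban Okumura-Hata formula with the small/medium-city mobile antenna correction. It assumes that the linear relationship noted above is the one between path loss and the logarithm of distance; the default frequency and antenna heights are illustrative choices, and the function names are hypothetical.

```python
import math

def hata_urban_loss(d_km, f_mhz=900.0, h_bs=30.0, h_ms=1.5):
    """Median path loss in dB (urban Okumura-Hata, small/medium city).

    d_km: distance in km, f_mhz: carrier frequency in MHz,
    h_bs / h_ms: base-station and mobile antenna heights in metres.
    """
    log_f = math.log10(f_mhz)
    # Mobile antenna height correction for a small or medium-sized city.
    a_hm = (1.1 * log_f - 0.7) * h_ms - (1.56 * log_f - 0.8)
    return (69.55 + 26.16 * log_f - 13.82 * math.log10(h_bs) - a_hm
            + (44.9 - 6.55 * math.log10(h_bs)) * math.log10(d_km))

def hata_slope(h_bs=30.0):
    """dB of extra loss per decade of distance; depends only on BS height."""
    return 44.9 - 6.55 * math.log10(h_bs)

# Loss grows by the same amount for every doubling of distance,
# i.e. it is affine in log10(d).
step_1_to_2 = hata_urban_loss(2.0) - hata_urban_loss(1.0)
step_2_to_4 = hata_urban_loss(4.0) - hata_urban_loss(2.0)
```

Frequency, antenna heights, and gain enter only through the intercept and the slope of this line, which is why a measure that ignores offset and scale is a natural fit.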
Even if
The absolute value of the PCC never exceeds 1. The closer it is to 1, the stronger the linear relationship between the two vectors.
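A short sketch of why the PCC is more robust than the Euclidean distance as an evaluation function: applying a gain offset and a scale change to a signal-strength vector leaves the PCC at 1 while the Euclidean distance becomes large. The seven sample values are made up for illustration, and the helper names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("PCC is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Illustrative predicted signal strengths for seven cells (dBm).
predicted = [-60.0, -75.0, -80.0, -83.0, -85.0, -88.0, -90.0]
# Same shape, but with a different scale and a 5 dB offset.
measured = [0.9 * p + 5.0 for p in predicted]

pcc = pearson(predicted, measured)
dist = euclidean(predicted, measured)
```

Here `pcc` stays at 1 up to rounding, whereas `dist` is several dB, so a Euclidean match would penalize a candidate that is in fact perfectly consistent with the measurements.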
As shown in Figure 4, for any location

Figure 4: The grid in the serving cell.
Set
If the location

Figure 5: Environment of the area.

Figure 6: Distribution of
There are seven BSs in the area. Assuming that
As shown in Figure 6, the closer to the device, the less
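A minimal sketch of a grid search in this spirit is given below. It is one plausible reading, not the authors' exact procedure: it assumes that the score of a grid point is the PCC between the measured RSSI vector and the negated log-distances from that point to the seven BSs, and that the best-scoring point is taken as the estimate. The hexagonal layout, the 1000 m spacing, the 20 m grid step, and all function names are illustrative assumptions.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient; 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def hex_layout(spacing_m):
    """Assumed layout: serving BS at the origin, six neighbours on a ring."""
    ring = [(spacing_m * math.cos(math.radians(60 * k)),
             spacing_m * math.sin(math.radians(60 * k))) for k in range(6)]
    return [(0.0, 0.0)] + ring

def log_distances(pos, stations):
    """log10 of the distance to each BS, clamped to 1 m to avoid log(0)."""
    return [math.log10(max(math.hypot(pos[0] - sx, pos[1] - sy), 1.0))
            for sx, sy in stations]

def synth_rssi(pos, stations, offset_db, slope_db):
    """Toy RSSI, affine in log-distance; offset and slope model the environment."""
    return [offset_db - slope_db * v for v in log_distances(pos, stations)]

def rpcc_grid_search(rssi, stations, half_width_m=300, step_m=20):
    """Return the grid point whose log-distance profile best matches rssi."""
    best_pos, best_score = None, -2.0
    ticks = range(-half_width_m, half_width_m + 1, step_m)
    for gx in ticks:
        for gy in ticks:
            profile = [-v for v in log_distances((gx, gy), stations)]
            score = pearson(rssi, profile)
            if score > best_score:
                best_pos, best_score = (float(gx), float(gy)), score
    return best_pos, best_score

stations = hex_layout(1000.0)
truth = (120.0, -80.0)
# Two different "environments": different offset and slope, same position.
est_a, _ = rpcc_grid_search(synth_rssi(truth, stations, -40.0, 35.0), stations)
est_b, _ = rpcc_grid_search(synth_rssi(truth, stations, -52.0, 31.0), stations)
```

Because the score depends only on the shape of the RSSI vector, the search needs neither the transmit power nor the propagation slope, which matches the claim that no prior statistical knowledge is required. The cost is one seven-element correlation per grid point, with no iteration.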
4. Algorithm Evaluation
4.1. Simulation Test on ECID and RPCC
To compare with RPCC, we calculate the distribution of
In this simulation, I is subject to Gaussian distribution:
The expectation of I is zero, and the standard deviation is 8% of the mean value of L. Figure 7 shows the results.
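The noise model stated above can be sketched as follows, assuming the interference is drawn independently for each BS and added to the path loss in dB. The function name and the sample loss values are hypothetical.

```python
import random

def add_interference(losses_db, rel_sigma=0.08, rng=None):
    """Add zero-mean Gaussian interference to each path-loss value.

    The standard deviation is rel_sigma times the mean of losses_db,
    i.e. 8% of the mean loss by default.
    """
    rng = rng or random.Random()
    sigma = rel_sigma * (sum(losses_db) / len(losses_db))
    return [loss + rng.gauss(0.0, sigma) for loss in losses_db]

# Illustrative losses for seven cells (dB), with a fixed seed for repeatability.
clean = [110.0, 128.0, 131.0, 133.0, 135.0, 137.0, 140.0]
noisy = add_interference(clean, rng=random.Random(0))
```

With losses around 130 dB, this setting corresponds to a standard deviation of roughly 10 dB.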

Figure 7: Distribution of
As shown in Figure 7, ECID works well because all parameters are set to match the environment. RPCC gives a similar estimate in this situation.
To observe the environmental influence, we changed some environmental parameters including the height of the BSs, the height of the MS, and the frequency of the signal. At the same time, the antenna gain is also changed. According to the simulation, the results of RPCC and ECID are shown in Figures 8 and 9, respectively.

Figure 8: Distribution of

Figure 9: Distribution of
As Figure 9 shows, the distribution of
4.2. Statistics Test without Environmental Change
To test the performance of the proposed algorithm thoroughly, we run a simulation in the area shown in Figure 10. There are 500 MSs in the area of the center cell, so the center cell is the serving cell of every MS and the other six cells are the neighboring cells of every MS. The locations of the MSs are randomly generated within the center cell only. In a cellular network, if the serving cell of a device is not on the edge of the network, the device must be in a serving cell surrounded by six neighboring cells. The situation in this simulation is therefore an ordinary case in the real world.
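One way to generate such positions is rejection sampling inside a regular hexagon, sketched below. The cell radius is not stated here, so the 500 m value, the hexagon orientation (a vertex on the x-axis), and the function names are assumptions made for illustration.

```python
import math
import random

def in_hexagon(x, y, radius_m):
    """True if (x, y) lies inside a regular hexagon centred at the origin.

    radius_m is the circumradius; one vertex lies on the positive x-axis.
    """
    ax, ay = abs(x), abs(y)
    root3 = math.sqrt(3.0)
    return ay <= root3 / 2.0 * radius_m and root3 * ax + ay <= root3 * radius_m

def sample_ms_positions(count, radius_m, rng=None):
    """Draw `count` uniformly distributed points inside the hexagonal cell."""
    rng = rng or random.Random()
    half_height = math.sqrt(3.0) / 2.0 * radius_m
    points = []
    while len(points) < count:
        # Sample the bounding box and keep only points inside the cell.
        x = rng.uniform(-radius_m, radius_m)
        y = rng.uniform(-half_height, half_height)
        if in_hexagon(x, y, radius_m):
            points.append((x, y))
    return points

ms_positions = sample_ms_positions(500, 500.0, random.Random(1))
```

The hexagon fills three quarters of its bounding box, so on average about one candidate in four is rejected.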

Figure 10: Environment of simulation.
The location of every device is calculated, respectively, according to the ECID and the RPCC. The RSSI in this simulation is calculated according to (12), regardless of the environmental change.
As Figure 11 shows, ECID and our proposed algorithm have similar performance. Both algorithms have a maximum error of about 50 m. With a maximum error of about 100 m, Cell-ID is less accurate than both ECID and RPCC.

Figure 11: Accuracy comparison without environmental change.
4.3. Statistics Test with Environmental Change
Although ECID and RPCC perform well in the previous test, the environmental parameters differ from place to place. An algorithm can be widely used only if it adapts to different environments without any modification. In this test, we increased the gain of the antenna by 0.2 dB from 2 dB. Simultaneously, we added a random variation of −10 m to +10 m to the height of the BSs to emulate BSs of different heights and increased the height of the MS by 3 m. We also changed the frequency from 900 MHz to 1800 MHz. Figure 12 shows the accuracy of Cell-ID, ECID, and RPCC under the changed environmental parameters.

Figure 12: Accuracy comparison with interference and shadow fading.
According to Figure 12, Cell-ID does not suffer from the environmental change and shows the same performance as in Figure 11. ECID, by contrast, loses accuracy significantly when the environmental parameters change, and its maximum error is over 150 m. The accuracy of RPCC and Cell-ID stays the same as in the previous test. Under environmental change, the probability of an error below 50 m is about 50% for ECID and almost 100% for RPCC.
5. Implementation and Field Test
As a localization algorithm for mobile devices, RPCC is easy to implement on any mobile device with an MCU, for example, a smart phone. Figure 13 shows the flowchart of a program that implements RPCC on a smart phone.

Figure 13: Flowchart of the program.
The details of subprocess to calculate the position using RPCC are shown in Figure 14.

Figure 14: Details of the subprocess to calculate the position using RPCC.
To evaluate RPCC in practice, we implemented it on a smart phone with an embedded system. The smart phone can obtain information, such as the RSSI, LAC code, and Cell-ID code, of the seven nearest cells, that is, the serving cell and six neighboring cells. With this information, the locations of the seven cells can be acquired from a free cell localization service on the Internet [32]. The location of the smart phone can then be calculated by its MCU using RPCC and Cell-ID. MSL is not used because it needs the transmit power of the base station (which is not necessary for RPCC or Cell-ID); this value is easy to obtain in a simulation but hard to obtain in a practical test. Additionally, MSL proved to be worse than Cell-ID in the simulation test, so it is not necessary in the practical test.
We put the device in a pocket and rode through the city. The collected data are shown in Figure 15.
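The data handled at this step can be sketched as a small record per cell, together with the Cell-ID baseline and an error measure. This is an illustrative sketch only: the field names, the `serving` flag, and the use of the haversine great-circle distance against a reference fix are assumptions, not details given in this paper.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class CellObservation:
    lac: int            # location area code reported by the phone
    cell_id: int        # Cell-ID reported by the phone
    rssi_dbm: float     # measured signal strength
    lat: float          # cell latitude from a lookup service (assumed)
    lon: float          # cell longitude from a lookup service (assumed)
    serving: bool = False

def cell_id_estimate(cells):
    """Cell-ID baseline: the position of the serving cell."""
    serving = next(c for c in cells if c.serving)
    return (serving.lat, serving.lon)

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlam = math.radians(q[1] - p[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))

# Hypothetical observations: one serving cell and one neighbour.
cells = [
    CellObservation(lac=1, cell_id=100, rssi_dbm=-63.0, lat=0.0, lon=0.0, serving=True),
    CellObservation(lac=1, cell_id=101, rssi_dbm=-81.0, lat=0.0, lon=0.01),
]
baseline = cell_id_estimate(cells)
```

The same error function can score both the Cell-ID baseline and the RPCC estimate against a reference position, which keeps the comparison like for like.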

Figure 15: Result of the practical test.
The green line is the route the device follows. The red line with squares shows the error of RPCC, and the blue line with plus signs shows the error of Cell-ID. RPCC clearly shows higher accuracy than Cell-ID in the practical test. The result of the statistical analysis is shown in Figure 16.

Figure 16: Result of the statistical analysis.
The accuracy of RPCC and Cell-ID is lower than in the simulation because the density of BSs in the real test is lower than in the simulation. Additionally, the distribution of the cells in practice is not as regular as in the simulation. The maximum error of RPCC is less than 550 m, and the probability of an error below 300 m is about 80%. The maximum error of Cell-ID is less than 650 m, and the probability of an error below 300 m is about 20%. The RPCC curve lies above the Cell-ID curve, which means that RPCC is more accurate than Cell-ID.
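Statistics of this kind are read off an empirical cumulative distribution of the errors. A minimal sketch follows; the error values are toy numbers for illustration only and are not the measurements of this test, and the function names are hypothetical.

```python
def fraction_below(errors_m, threshold_m):
    """Share of samples whose error is strictly below threshold_m."""
    return sum(1 for e in errors_m if e < threshold_m) / len(errors_m)

def empirical_cdf(errors_m):
    """Sorted (error, cumulative probability) pairs for plotting."""
    ordered = sorted(errors_m)
    n = len(ordered)
    return [(e, (i + 1) / n) for i, e in enumerate(ordered)]

# Toy error samples in metres (illustrative only).
toy_errors = [50.0, 120.0, 280.0, 310.0, 540.0]
share_under_300 = fraction_below(toy_errors, 300.0)
curve = empirical_cdf(toy_errors)
```

When two such curves are plotted together, the one that lies higher reaches any given probability at a smaller error, which is the sense in which the upper curve is the more accurate method.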
6. Conclusion
The main idea of RPCC is to fuse the information of seven BSs to reduce the influence of interference and shadow fading on mobile localization. The proposed RPCC algorithm is compared with another data-fusing algorithm, ECID, which also uses the information of seven BSs. Both work well in the simulation test without environmental changes. However, RPCC is more immune to changes in the environmental parameters in the second simulation test, which changes the parameters of the environment and the antenna gain.
RPCC can estimate the position precisely without any additional devices or prior statistical knowledge. Since it does not rely on complex computation, it is easy to implement on most mobile devices. Compared with Cell-ID, which is widely used in mobile devices, RPCC performs better in both the simulation and the practical test.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61471110 and 61273078), China Postdoctoral Science Special Foundation (2014T70263 and 2012M511164), and Chinese Universities Scientific Foundation (N130404023).
