Abstract
This article puts forward an indirect cooperative relative localization method to estimate the position of unmanned aerial vehicles (UAVs) relative to their neighbors based solely on distance and self-displacement measurements in GPS denied environments. Our method consists of two stages. Initially, assuming no knowledge about its own and neighbors’ states and limited by the environment or task constraints, each unmanned aerial vehicle (UAV) solves an active 2D relative localization problem to obtain an estimate of its initial position relative to a static hovering quadcopter (a.k.a. beacon), which is subsequently refined by the extended Kalman filter to account for the noise in distance and displacement measurements. Starting with the refined initial relative localization guess, the second stage generalizes the extended Kalman filter strategy to the case where all unmanned aerial vehicles (UAV) move simultaneously. In this stage, each unmanned aerial vehicle (UAV) carries out cooperative localization through the inter-unmanned aerial vehicle distance given by ultra-wideband and exchanging the self-displacements of neighboring unmanned aerial vehicles (UAV). Extensive simulations and flight experiments are presented to corroborate the effectiveness of our proposed relative localization initialization strategy and algorithm.
Keywords
Introduction
Inter-unmanned aerial vehicle (UAV) relative localization (RL) is the pre-requisite for UAV teaming and swarming in the case when absolute localization information such as information from global positioning system (GPS) is unavailable or inaccurate,1,2 which happens in indoor, urban, and forest environments. Therefore, RL is critically important in multi-UAV systems and there are pressing needs to study the inter-UAV RL problem.
A convenient solution for RL is to extract the distance and bearing information by using cameras. 3 However, cameras can only operate within a limited range and suffer greatly from occlusion and lighting conditions. On the other hand, distance measurement can be obtained from different types of sensing devices such as ultra-wideband (UWB), radars, and lidars, which can operate over a much larger range. In particular, UWB technology stands out in accurate ranging due to its ability to alleviate multi-path effects. Different from the space-sweep-ranging method of a widely used laser scanner, i.e. Hokuyo (≤30 m), UWB modules used herein are utilized for peer-to-peer ranging with bidirectional communications.
In this article, we focus on the active RL problem of determining the position of a moving UAV relative to its neighbors in a 2D plane through active sensing (measuring relative distance) and communications (exchanging self and neighbor’s displacements). Our proposed method constitutes two stages. Initially, without knowledge of its own and neighbors’ states, each UAV solves for an initial relative position estimate with respect to a static hovering UAV (a.k.a. beacon). To be specific, limited by the environmental space or task constraints, we aim to design some ranging path along which the moving UAV is able to effectively reduce its localization error. Particularly, the case of a moving UAV under motion constraint is considered, namely that the UAV can only move in a limited range. Such scenario is not rare when the UAV team is required to conduct cooperative tasks in a cluttered environment, say in a forest where neighboring UAVs are separated by trees. Besides, some specific task may require a quick solution of RL, which only allows a small movement.
Assuming noise-free displacement measurements, we reformulate the active RL problem as optimal sensor placement under constraint. Given the maximum allowed moving distance D, we manage to find the minimum mean square error (MMSE) in terms of D and the sample size n (the number of collected data for initialization). Based on this result, we then design a ranging path to effectively reduce the localization error. In the presence of displacement noise, we leverage the extended Kalman filter (EKF) to deal with the RL problem.
Starting with the refined initial RL estimate from the first stage, the second stage generalizes the EKF strategy to the case where all including beacon UAVs move simultaneously. In this stage, each UAV will transmit its displacement to its neighbors whenever a range request is triggered by any neighboring UAV. Thereupon, the self and neighbor’s displacements and their relative distance are fed into the proposed EKF estimator for updates.
The main contributions of the article are highlighted below:
A motion constraint RL is reformulated as an optimal sensor placement problem under constraint and according to theoretical analysis, an initialization stage for RL problem with the MMSE performance is designed. A workable RL solution based on the EKF for miniature UAVs using light-weight UWB sensors (less than 60 g) is presented and extensive flight experiments in a large field (e.g. inter-UAV distance from 20 to 50 m) are conducted to verify our RL system.
The rest of the article is organized as follows. We first review some related works in “Related works” section, and formulate the problem in “Preliminaries and problem formulation” section. A rough initial RL estimate in the case of a single beacon is introduced in “RL with a single beacon” section and we analyze the corresponding MSE, based on which we, respectively, design the RL algorithm in both the cases of noise-free and noise-corrupted displacements. A cooperative RL method for simultaneous movements is presented in “Cooperative RL without beacon” section. Simulations and experiments are conducted, respectively, in “Simulation results” and “Experiments with UAVs” sections and verify our approach, and we conclude our work in “Conclusion and future work” section.
Related works
In general, the main RL algorithms are classified into two categories: beacon-based and beacon-free. In this article, the proposed single beacon based RL (first stage) serves as the initialization of the beacon-free RL (second stage). Furthermore, a motion constraint, which is reformulated herein as optimal sensor placement under constraint, is considered in the first stage. The related works are introduced as follows.
RL with a single beacon
In the case of a static beacon, the RL problem can be viewed as a source localization problem. By treating the source location in 3D space as parameters to be estimated and assuming exact distance and self-position measurements, Dandach et al. 4 proposed a continuous-time adaptive method to achieve the exponential convergence of the location estimation, given that the mobile agent regularly avoids the planar motion.
By including the distance measurement as an augmented state, Batista et al. 5 transformed the original nonlinear system of the source location into a linear time-varying system and applied Kalman filter for its location estimation. The necessary motion to guarantee a convergent estimate was given by analyzing the observability of the time-varying system. A similar method was also adopted in the discrete-time case Batista et al. 6 With a periodic motion of the mobile agent, a recursive least squares fading memory filter was applied to the squared distance measurements in Indiveri et al. 7 However, note that the above approaches require the agent to move a long distance before getting a convergent solution, and the design of a ranging path to effectively reduce error is not considered. Also, the squaring of distance measurements may result in a larger estimation error. Mueller et al., 8 who is the earliest adopter of UWB for quadcopter state estimation and control, utilized accelerometers, gyroscopes, and UWB radios to estimate the dynamic state of a quadrocopter and flight tests were successfully conducted along a 4 m × 3 m horizontal rectangle. However, five UWB radios had to be used as anchors for only one quadrocopter localization and the dynamic model of this quadrocopter was necessarily required. Wang et al. 9 novelly combined moving horizon estimate (MHE) and convex optimization to perform 3D multirobot localization with constraints and unknown initial poses. However, this method was applied in a centralized manner and suffered from a heavy computational burden. Therefore, at present, it is difficult to be implemented on multiple miniature UAVs.
Cooperative RL
Based on rigidity theory, a robust quadrilaterals algorithm, Moore et al. 10 was proposed to cope with landmark-free localization using distance measurements between agents. More recently, Diao et al. 11 developed a generalized barycentric coordinate representation and algorithm for determining the sensor locations in a randomly deployed sensor network and this method was implemented in an iterative and distributed manner. These methods seem close in spirit to our work. However, these algorithms aimed primarily at static sensor networks. The majority of work on localization generally either require stationary nodes or are not limited to range only measurements for static sensor networks.12–14
Localization for mobile robots in 2D using range only measurements have been demonstrated in literature.15–18 In these works, the relative position between robots is estimated after the robots travel through a sequence of positions and orientations where the experimental verifications were not considered therein. Recently, an on-board Bluetooth-based RL method is proposed by Coppola et al. 19 for collision avoidance in UAV swarms. In practical tests, three UAVs (using offline velocity estimate) or two UAVs (using on-board optical flow) flying in a 4 m × 4 m space collided once over a cumulative flight-time of around 3 min as a result of the disturbances of the on-board velocity estimate. Note that, in this latest research result, the flight tests were conducted only in a small space due to the limitation of Bluetooth signals. By using UWB signals, our proposed system can operate within at least a 100-level 20 region.
In our work, for the first stage, we adopt a similar approach as in the optimal sensor placement problem to solve the active initial RL problem under the constraint of the UAV’s motion. Actually, assuming an accurate self-displacement measurement, solving the RL problem is equivalent to estimating the relative position at the starting point, and the motion constraint can be projected on the sensor configuration. As a result, the active RL problem is reformulated as a constrained optimal sensor placement and we seek the optimal configuration to minimize the estimation error bound under the constraint. As shown later in Theorem 4.1, the lower error bound is dependent on the maximum allowed moving distance D and the sample size n, and hence one only needs to enlarge the ranging span to effectively reduce the estimation error. Such an idea will be applied to the design of ranging path, as seen in the Error analysis section. In the second stage, starting with the initial RL guess from the first stage, an EKF strategy is proposed to dynamically sustain the RL estimation and deal with the noise in distance and displacement measurements. Besides, in the literature, most of the works focus on the RL algorithm design. Field experiment solutions especially for miniature UAVs are rare.
Preliminaries and problem formulation
For convention, we use vector
Before delving into details, it is necessary to state the UWB based indirect cooperative RL problem. We consider a team of mobile UAVs labeled Distances, displacements, and relative position of three UAVs during simultaneous movement.
RL with a single beacon
Initial relative position estimation
To achieve an accurate RL estimate, we first assign one static UAV as a beacon while other UAVs range to and communicate with it. In this section, we shall formulate the problem in the perspective of nonlinear least squares (NLS).
As shown in Figure 2, O is the location of the static beacon and the mobile UAV starts from P0 and takes the distance measurement di at each point Pi, Relative localization to a static beacon.
Denote
The above equation can be rewritten as
If An initial guess is needed to solve (1), which comes from the solution of a related linear least squares problem with the squared distances Navidi et al.
22
To be exact, the distance measurements ri is corrupted by noise. However, to achieve an initial guess, we temporarily assume that ri is equal to di regardless of noise, i.e. Although equation (5) is able to give an exact solution when there is no ranging error, the introduction of squared distances would increase the error of calculation and only achieves a rough estimate in the presence of measurement noises. In fact, assume that the range measurement NLS-GN based Initial Relative Estimation 1: 2: 3: Starting from P0, the UAV moves along some nonlinear path as described in Error analysis section and collects n distance measurements ri. 4: 5: Form the linear equations as (5) based on the overall available ranging measurements ri and 6: calculate an initial rough estimate 7: 8: Set the maximum iteration step as S and the positive threshold of estimation error as ε. 9: 10: Calculate GN iteration (4) 11: 12: 13: 14: 15: Remark 4.1
Remark 4.2
Algorithm 1
Error analysis
In the sequel, we seek to find a lower bound of the estimation error of the problem (1) as well as the corresponding path configuration, given that the mobile UAV can only move by a maximum distance of D in the whole duration of RL. The optimum configuration then serves to guide the design of the ranging path, in an effort to effectively reduce the estimation error under the motion constraint.
In this section, we shall first analyze the RL error based on (4) to find its lower bound of MSE, and then propose an effective sampling strategy to reduce the error. Two different cases are studied, depending on whether or not the displacement measurements are corrupted by noise.
By (4), the error covariance matrix can be approximated by
Rewrite
Now the MSE is given by the trace of E as
Note that the error covariance matrix E is also the Cramer–Rao lower bound, or the inverse of the Fisher Information Matrix (FIM). In the GPS literature, the squared root of the MSE is defined as the geometric dilution of precision (GDOP) Levanon.
23
Remark 4.3
Given a fixed number of range measurements to minimize MSE is to maximize the cost function
Denote
Assume that the maximum allowed moving distance
Angle tags and maximum ranging span within the moving radius.Theorem 4.1

Correspondingly,
Notice that each individual term is the squared sine of a linear function of From the above result, we can see that the localization error is dependent on the maximum spanning angle Θ, as well as the sample size. Intuitively, a larger field of view and more distance measurements help reduce the localization error. By (12) and (8), we can see that
Proof
Remark 4.4
RL algorithm with a single beacon
Noise-free case
From Remark 4.4, we know that the key point of reducing localization error is to enlarge the spanning angle between Two-phase Active RL (TPARL)
Two-phase active RL with a single beacon. 1: 2: Phase I (Nonlinear): Starting from P0, the UAV moves along some nonlinear path of length D1 collecting distance measurements, and produce an initial rough estimate 3: Phase II (Piecewise linear): Assume that the mobile UAV is to move N steps of linear path of fixed length l and 4: 5: 6: 7: 8: Move along a straight line with the heading α 9: Collect new rangings and produce new estimate 10: 11: Several points have to be noted from the above algorithm. The first is the choice of length of the nonlinear path. Although an initial guess can be generated from three samples on a nonlinear path as seen from (5), the estimation error may be so large that a refined estimate in phase II may have a counter direction to that of the rough estimate, thus reduce the ranging span as a whole. Therefore, the nonlinear path should be long enough to guarantee a relatively accurate initial estimate. The second point is with regard to the choice of the shape of the nonlinear path. Notice that we assume no prior knowledge of the relative position
It should be noted that each change of heading actually serves to maximize the span from the initial position to the next waypoint based on the newest relative position estimate, and is only needed when there is a much different estimate from the last one. For simplicity, here we set a fixed length movement for each step, but it can be variable in practice.
Algorithm 2

Remark 4.5
Remark 4.6
Noise-corrupted case
In this subsection, we consider the design of EKF for the same strategy as that of the case without noise to take the displacement noise into account and the initial RL estimate from Algorithm 2 will be fed into the proposed EKF for the first step calculation. The sampling algorithm remains the same, while the estimate is provided by EKF.
For the EKF design, we need to determine the state equation and observation equation. The state space representation consists of the initial relative position
On the other hand, the observation consists of the relative range rk, and the displacement
In the state equation (14), we split the relative position into the initial relative position
Remark 4.7
Cooperative RL without beacon
In this section, we consider the case of a dynamic team as depicted in Figure 1, where the UAV serving as a “beacon” in RL algorithm with a single beacon is also mobile. For any two team members, say UAVi and UAVj, their kth relative position estimate
The observation consists of the relative range rk, and the displacement difference
Combined with the initial RL algorithm described in Algorithms 1 and 2, the proposed cooperative RL is summarized in Algorithm 3. Cooperative localization without beacon 1: 2: Data communication: UAVi sends range request to its neighbor UAVj and receive this neighbor’s displacement and height information (which helps transfer a 3D distance measurement into 2D). 3: Data matching: The collected UAVj’s information and measured distance are stored together with UAVi’s displacement and height measurement by querying the closest UAVi’s sensor data in the buffer. 4: Based on the initial relative estimate (Algorithm 2) and selected historical data (e.g. m samples), execute m steps EKF as (14) and (15) to initialize the estimation of the states of UAVs for the simultaneous movement. 5: 6: 7: 8: 9: 10: carry out EKF 11: update 12: calculate 13: 14: Algorithm 3
Simulation results
RL with a single beacon
In this section, the performance of the proposed distance-based active RL Algorithm 2 is evaluated by simulation, respectively, for the cases of noise-free and noise-corrupted self-displacements. Specifically, in the former case, we compare the performance of the proposed algorithm with that of a constant heading in the piecewise linear phase, in order to show the necessity of adaptive heading. Besides, we also compare the performance of our approach with that of other two methods in the literature.
For a comprehensive evaluation, the initial starting point P0 is placed at different circles centered at O, over a span of radius from 20 to 50 m with an interval of 5 m, as seen from the abscissa in Figures 5 and 6. On each circle, P0 will be randomly generated by the uniform distribution and 50 Monte–Carlo simulations are repeated, with the whole absolute localization error distribution on each direction represented in the form of box plot. Note that the blue box contains an approximate 95% of the data in case of Gaussian distribution, and the red one covers the data around the sample mean within one unit of standard deviation.
Absolute error comparison between switching heading and constant heading in noise-free case. (a) Absolute estimation error with switching heading in noisefree case, (b) Absolute estimation error with constant heading in noisefree case. Absolute error comparison in noise-corrupted case. (a) Absolute estimation error of the proposed RL, (b) Absolute estimation error by including augmented state Batista et al. (2013), (c) Absolute estimation error by recursive least squares method Indiveri et al. (2012).

Some common parameters in the simulation are listed as follows. The UAV moves at the speed of 1 m/s, the ranging frequency is given by 5 Hz, and the ranging noise
Noise-free case
With the noise-free displacement measurement, the RL problem is equivalent to estimating
Note that the mean values of the estimation errors in the constant heading case are still less than 0.5 m. In practice, without strict accuracy requirements, we may simply fix the heading after the initial rough estimation and the RL estimates still can be accepted.
Remark 6.1
Noise-corrupted case
In this subsection, we evaluate the performance of EKF in RL algorithm with a single beacon when the displacement measurement is corrupted by noise. Under the assumption of
Comparison with other methods
Here we compare the performance of our approach with that of Batista et al. 6 and Indiveri et al. 7 under the same setting in RL algorithm with a single beacon, as respectively shown in Figure 6(b) and (c). We can see that the absolute estimation error in Figure 6(b) generally ranges from 2 to 5 m, much larger than that in Figure 6(a), while the error in Figure 6(c) can be as large as dozens of meters. Such a poor performance when compared with the approach in this article can be expected, if we note that the augmented system in Batista et al. 6 requires more movement and samples to produce a convergent estimate, and the squared distance measurements result in an enlarged error and a poor estimate when the ranging path adopted in this article is not so nonlinear as that in Indiveri et al. 7
Cooperative RL without beacon
In this section, we will demonstrate the accuracy of the cooperative RL for 3 UAVs. In the simulation, the initial RL (with a single beacon) and its corresponding covariance are fed into the cooperative RL algorithm (without beacon) directly for the first step estimate. The noise of displacement difference
In practice, the relative position among UAVs is leveraged for achieving multi-UAV performance such as formation shape control, flocking, and coverage control, where the trajectory of each UAV is controlled by their behavior manager. To verify the proposed RL algorithm independently, a serpentine path of UAV1 is simply applied while UAV2 and UAV3 just fly straight and keep the same heading as that at the end of phase II (Algorithm 2). Besides, no control laws (e.g. obstacle avoidance, formation control, and path planning) are applied in the simulation.
The flight trajectories of three UAVs are depicted in Figure 7 and the statistic accuracy (350 tests as introduced in Section 6.1) of RL estimate during the simultaneous flight phase is shown in Figure 8. Figure 9 shows the indirect RL estimate of UAV3 to UAV2 ( Flight trajectories of three UAVs for 600 s in noise-corrupted case. Absolute RL estimation error with measurement noise during the simultaneous flight phase (600 s) in Figure 7. Absolute UAV3 to UAV2 RL error of indirect estimate (RL31-RL21) during simultaneous flight.


From Figure 7, it can be seen that the UAV2 and UAV3 fly a short nonlinear and linear path (RL initialization) before UAV1 starts to move. Note that cross trajectories occur during their simultaneous flight. That is because no control laws are imposed in the simulation. Figure 8 depicts that the estimation errors of
From the preceding simulations, the statistical errors of RL estimation for a relative long-term flight (612 s) can be achieved acceptably. Despite this, there still exists a possibility of RL estimate drift or even pointlessness over a long-time test, i.e. estimates getting worse and worse, or EKF divergence due to various unreasonable measurements or long-term data dropout. For safety issue, by setting a reasonable sample window (balancing the sampling rate with the moving speed), Algorithm 1 can be operated concurrently as a backup of RL estimate or for EKF recovery.
Remark 6.2
Experiments with UAVs
Methodology
We implemented the active RL algorithm on quadcopters and conducted tests in a sports field and near woods environment, respectively, as shown in Figures 10 and 16(b), where the moving UAVs navigate to their relative targets based on the beacon’s position and their relative position estimates to this beacon. Due to the large bandwidth (from 3.1 to 5.3 GHz), UWB is robust to multipath and non-line-of-sight effects, and provides a reliable long distance ranging with an accuracy of 10 cm. The quadcopter is equipped with Pixhawk integrating inertial measurement unit and flight controller on board, which is also connected to a GPS module for safety and displacement generator purpose. The detailed descriptions of the experimental hardware can be found in Guo et al.
25
It should be noted that although the distance measurement provided by UWB is a range in 3D space, the projected distance on 2D plane will be almost the same if the 3D distance is much larger than the height, say in our case, the mobile UAV always moved at a distance further than 20 m from the beacon, and at a height of no greater than 2 m.
Test environment.
To evaluate the active RL Algorithm 2, flight tests for two UAVs were carried out first as shown in Figure 10, where the moving UAV and the beacon are, respectively, circled in yellow and red. UWB modules were installed on both quadcopters, with one of them placed on the ground as the static beacon. Ten different starting (launching) points were chosen for ten different vectors
UWB based RL workflow
Figure 11(a) illustrates the workflow of the UWB-based RL system. The UWB module on a hovering quadcopter actively sends ranging requests to neighboring UAVs for distance measurement and communication based on the two-way time of flight ranging method described in Guo et al.
25
Once a distance is obtained by UAV2 and UAV3, it will be calibrated by linear regression where the calibration parameters are determined by a series of experiments in different environments.
25
This calibrated range then goes through the outlier detection before it is stored in the database. To reduce the computation and avoid excessive repetition of similar data, only selected distance measurements and neighbor’s information are recorded and stored. Note that RL initialization algorithm is executed first and its output serves as the initial state estimate for EKF. The follow-up localization is sustained by EKF in a recursive way. Finally, the RL estimate update will be fed into the flight controller for quadcopter navigation as depicted in Figure 11(b).
System structure. (a) UWB-based relative localization workflow, (b) Diagram of flight control workflow.
Evaluation
Flight tests for two UAVs
Absolute estimation error at launching points.
Absolute estimation error at landing points.

Variation of estimation error and flight trajectories. (a) Comparison of relative position estimate and ground truth, (b) Trajectories of 10 flight tests.
We also examine the variation of the estimation error of the starting point during the whole test, as shown in Figure 13. Note that totally seven different parts of path are involved in the whole duration of RL, corresponding to seven different estimates. From Figure 13, we observe that the error keeps decreasing with the increasing traveling distance, and the ranging along the piecewise linear path drastically reduces the error of the initial estimation. Such a result demonstrates the efficacy of reducing error by enlarging the ranging span as revealed in Theorem 4.1. On the other hand, the decrease of error after each change of heading suggests the necessity of changing heading. We also notice that after some steps the estimation error tends to stabilize, implying that the increase of ranging span is no longer able to counter the displacement error. It would be interesting to explore the relationship when the displacement error increases the estimation error would increase.
Variation of estimation error at the launching point. (a) |x| error, (b) |y| error.
Flight tests for three UAVs
To further validate the performance of the proposed RL method, flight tests of three UAVs were conducted at Dover in Singapore with an area of 30 m × 40 m (Figure 15(b)). First, to support RL estimation among three UAVs, we had to design a new UWB ranging and communication topology, which is different from that utilized for two UAVs. Without clock synchronization among UWB modules, a master-slave ranging mode is adopted to avoid sensing conflicts as shown in Figure 14. Specifically, in our RL context, the module installed on the UAV1 (denoted as M) will actively send its ranging request and odometry data to the module on UAV2 and UAV3 (denoted as S, assume that they are equivalent). The responder module (UAV2 or UAV3) will automatically respond to the ranging request it receives from UAV1 and at almost the same time, collect the ranging and odometry data, which has been synchronized by the master node one time slot afore.
Ranging topology in master-slave pattern. Fight demo of three UAVs at Dover. (a) Demo scenario at Dover, (b) Flight video shot of 3 UAVs at Dover. Flight performance based on RL estimate. (a) Performance of the relative localization for 3 UAV's (b) Accuracy of the relative localization measured after landing.


Subsequently, the UWB module M installed on UAV1 transmits and receives ranging command signal as well as odometry information from UAV3 at
Since the reference of our relative position estimate can be any customized coordinates, a baseline herein, which is parallel to a ditch nearby with 233° relative to the north, was first selected for calibrating the ground truth. UAV2 and UAV3 are desired to navigate to their destinations based on the gap between the preconfigured destinations and their real-time RL estimates to UAV1. In this test, the destination points relative to UAV1 were set preliminarily based on the method in “Methodology” section and marked by yellow flags as shown in Figure 16(a). UAVs will hover over their destination points once they reach them. Such performance is able to demonstrate the practicability of our UWB-based RL and the accuracy of landing reveals the applicability of the proposed RL method. The following gives a detailed description on the flight experiment.
Initially, UAV1 was set in the middle of UAV2 and UAV3 with 20 m interval as shown in Figure 15(a). Once all of the UAVs were powered on, UAV1 started automatically to range and communicate with UAV2 and UAV3. After taking off, UAV1 hovered over its launching point while UAV2 and UAV3 flied along a preset non-linear trajectory for collecting distance data and calculating the initial relative position, namely phase I. The RL estimates were improved when UAV2 and UAV3 flied along a piecewise linear path during phase II. Based on the difference between the on-line RL estimates and their preset destinations, in phase III, UAV2 and UAV3 navigated directly to their desired target positions (the red stars in Figure 15(a)). The flight video shot is shown in Figure 15(b). Finally, all of these three UAVs hovered over their destination as seen in Figure 16(a) with the position error less than 1 m (Figure 16(b)). More details concerning the trajectories of three UAVs are depicted in Figure 17 where UAV2 and UAV3 eventually hovered over their relative targets to UAV1, at (0,30) and (0,15), respectively, and the RL estimation errors both fell within the 1 m error bound (cyan circle). Note that the RL error was measured after UAVs landed and in fact there always existed an inevitable drift during landing (e.g. avoiding to land on the marker, wind disturbance, near-ground effect, and control accuracy). Such drift would cause an undesirable deviation from each RL estimate. Despite this, a RL accuracy with less than 1 m can be acceptable when considering the maximum inter-UAV distance reaches 30 m in this scenario.
a
Flight paths of three UAVs.
Conclusion and future work
In this article, we studied the UWB based RL problem for an UAV team. An active RL problem of a moving UAV with respect to a static UAV under motion constraint was introduced first. Specifically, the UAV is only allowed to move by a small distance D and we aim to find a ranging path to reduce the RL error as much as possible. We considered the problem under the assumption of noise-free self-displacements, and reformulated it as an optimal sensor placement under constraint. A lower bound of estimation error was obtained in terms of D and the sample size n, and it was found that one only needs to enlarge the ranging span and increase the number of rangings to reduce the error. In this light, we designed an active RL algorithm to enlarge the ranging span within the distance D, and applied the EKF to account for the noise in the displacement measurements. Simulations and experiments were conducted to validate the algorithm.
Second, we extend the EKF based RL method to a dynamic UAV group where the beacon UAV starts to move and the RL estimate from the first stage is fed into the proposed cooperative RL algorithm for initialization. To validate the accuracy of the cooperative RL estimate during UAVs’ simultaneous flight, a simulation without control laws was conducted and the error bound falls within 0.8 m on average.
In the future, we shall extend the current result to 3D case under more general constraint, and an UWB based formation flight will be explored and tested.
Footnotes
Acknowledgements
We would like to thank Mr. Abdul Hanif Zaini, Mr. Thien Minh Nguyen, Mr. Bing Cheng Yeap and Mr. Dong Wei for their assistance in numerous flight tests.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
