Abstract
Abstract In this paper, we work to develop a path planning solution for a group of Unmanned Aerial Vehicles (UAVs) using a Mixed Integer Linear Programming (MILP) approach. Co-operation among team members not only helps reduce mission time, it makes the execution more robust in dynamic environments. However, the problem becomes more challenging as it requires optimal resource allocation and is NP-hard. Since UAVs may be lost or may suffer significant damage during the course of the mission, plans may need to be modified in real-time as the mission proceeds. Therefore, multiple UAVs have a better chance of completing a mission in the face of failures. Such military operations can be treated as a variant of the Multiple Depot Vehicle Routing Problem (MDVRP). The proposed solution must be such that m UAVs start from multiple source locations to visit n targets and return to a set of destination locations such that (1) each target is visited exactly by one of the chosen UAVs (2) the total distance travelled by the group is minimized and (3) the number of targets that each UAV visits may not be less than K or greater than L.
Keywords
1. Introduction
Small autonomous vehicles have paved the way for numerous military and civil applications, which are currently an active area of research. Military applications may involve the use of UAVs, where operations may not be restricted to just one depot or launch site and so the scenario may require several different launch and landing sites. Unmanned Aerial Vehicles (UAVs) not only eliminate threat to pilot's life, they require less power, leading to longer mission times and more flexible manoeuvring. To utilize the full potential of a team of UAVs, efficient methods of decentralized sensing and path planning must be employed.
The UAV task assignment problem has been explored from many different angles. Auction-based approaches have been used for task allocation to multiple robots to find feasible or even guaranteed quality solutions easily using PRIM's algorithm [1]. Sariel et al. [2] propose a framework that employs an incremental task allocation method, dynamically adapting to the changes in the environment. Gao et al. [3] present a decentralized allocation algorithm, which combines a sequential single-task auction with a task transfer among robots of the same sub-team. The tasks allocated to the sub-team are clustered using an orthogonal genetic algorithm, which further improves the efficiency of task transfer. In the absence of threats, the multivehicle target assignment problem is related to the dual or multiple vehicle problem where the sum of the distances travelled by the vehicles is sought to be minimized. This approach is presented in [4], where seven vehicles are used to visit fifty targets. However, this approach does not incorporate load balancing. Methods for solving the multivehicle, target assignment problem in the presence of threats are presented by Maddula et al. in [5]. The objective, in this case, is to minimize the maximum path length.
Use of a centralized planner such as GRAMMPS [6] has also been made for constrained task allocation and planning. Centralized algorithms have been used to solve the path planning problem. Cui et al. [7] propose a search method based on Probabilistic Road Maps (PRM) to obtain the Pareto-optimal coordination solution for multiple agents. A decoupled approach is used, consisting of two parts, building the roadmap in a coordination space and searching for Pareto-optimal solutions in the co-ordination roadmap. Decoupled planning does not increase the dimensionality of the configuration space and is quite practical [8].
Desaraju et al. [9] present the Decentralized Multi-Agent Rapidly exploring Random Tree algorithm by extending the Closed Loop Rapidly exploring Tree (CL-RRT) strategy to the multi-agent case. It gives priority to re-plan to agents with greater potential to improve the path and allows an agent to modify its teammates' plan in order to select paths that reduce their combined cost. The strategy suits scenarios where a large number of agents operate in cluttered environments. However, it requires that the agents have emergency nodes in their trajectory where they can halt to avoid collision/deadlocks. The approach needs to be modified for air vehicles which cannot have zero forward speed.
Though many different approaches have been used to solve the task allocation problem, an efficient integer programming formulation gives flexibility, whereby additional constraints can be easily added to obtain a realistic, optimal and feasible solution to the path planning problem. For example, the UAV may have a constraint on the turning radius. Avoidance of travel legs with a risk factor above a certain threshold may be modelled as a constraint. An attempt at balancing loads across different UAVs can be incorporated. Thus, a variety of constraints can be included in the mathematical programming model presented here.
Prior to this work, MILP has been combined with clustering techniques [10] to reduce the problem size. An attempt to use MILP for computing fuel-optimal paths for multiple vehicles was made by Schauwenaars et al. [11]. In this work, MILP is applied in a receding horizon framework to design a series of short trajectories to the goal instead of one long trajectory. The Receding Horizon (RH) approach was adopted as opposed to a fixed arrival time to reduce solution time. Simulations of moving a single vehicle around ten obstacles demonstrate its feasibility using different horizon lengths. The approach was extended to calculate the trajectories of Autonomous Aerial Vehicles by Bellingham et al. [12, 13].
In this paper, we propose a mixed integer linear programming model to be used for m multiple vehicles. The vehicles carry sensors that have footprints several times greater than 2r, where r is the minimum turn radius of the vehicle. Assuming the targets to be sufficiently spread, reasonably flyable, optimal paths may be computed. Path planning is carried out prior to mission execution using the information available a-priori. However, as the real-time assessments, carried out using on-board sensors, reveal obstacles and/or high risk zones, more informed actions are needed. The team may suffer the loss of one or more of its members. Paths need to be re-computed in real-time to cater for such contingencies. Each UAV must have on-board computational resources to compute a new plan using the information acquired. The communication of solutions is implicit, that is, it is accomplished by all vehicles running the same model in parallel. Information collected by on-board sensors is shared among all the vehicles. The vehicle travelling longest determines the duration of the mission. Thus load-balancing among team members plays a crucial role in reducing mission execution time.
The problem of resource allocation is well known as being NP-hard. An important point to be mentioned here is that NP-classification is defined in terms of worst-case behaviour, not average case behaviour. For most practical situations, the main interest is in average case behaviour. Asides from this application, lower bounds can still be derived and used in branch and bound solvers of commercial software to get faster results. Rathinam et al. [14] have extended the lower bound of Held-Karp [15], the best-known result for the single Travelling Salesman Problem (TSP), to the multiple depot, multiple TSP.
2. Related Work
Over the past two decades the MDVRP has attracted greater attention due to its applicability in many real-life scheduling and routing problems. Bektas et al. [16] gives a comprehensive review of these applications as well as the known approaches. MILP based methods have been gaining in popularity as a means of solving a wide variety of planning problems because they are provably optimal and can handle a wide variety of constraints. The main drawback is that when they are applied to uncertain environments, the solutions must be obtained quickly. Furthermore, the problem may need to be resolved quite frequently. The work presented in this paper tries to overcome these issues through combining MILP with less optimal methods to improve the computation time for large problems.
Specific cases of resource allocation problems can also be formulated as a MILP problem. Related versions of this problem can be found in [17–19]. Kara et al. [20] propose a polynomial-size integer linear programming model for this problem, with improvements in subtour elimination constraints (SECs). Thus, a practical formulation that is solvable in real-time with a commercial code will pave the way for real-time implementations.
In previous work by Bellingham [12], paths are optimized to compensate for UAV loss and to exploit the coupling effects of cooperation between UAVs to improve survival probability. Since MILP uses discrete points along the trajectory, the computational complexity grows with the length of the horizon and the number of obstacles. Simulation results using 4 vehicles and 8 targets are given. It takes 60 minutes to find a feasible solution, yet the problem is not solved to completion. Results in [13] show that for fixed-wing autonomous vehicles, with 200 discrete steps and eight obstacles, the computation takes over 600 seconds. In our work, UAV attrition is taken into account and risky paths are avoided. Since our work deals with small UAVs in sparse environments and targets are scattered, the paths computed are flyable. For small and medium-sized instances, MILP solutions are computed quickly, qualifying for real-time applications.
Two recent papers, Kim et al. [21] and Alidaee et al. [22] have considered the problem of m identical UAVs that are to serve n targets. The problem is to find a sequence of visiting targets for each UAV, starting from a depot, such that all targets are visited only once by a UAV. The formulations also have an additional bound on the maximum number of targets each UAV can visit. In our implementation, we limit both the minimum and maximum number of nodes that a UAV should visit which allows the decision maker to assign appropriate work-loads to each vehicle. Post-optimality analysis can be conducted simply by varying the parameters of the problem and observing the changes in the solution values.
Julia et al. [23] provide some insight into the path planning strategies to be used for different applications. It is shown that coordination improves if agents use techniques that provide close to global optimal solutions.
The main contributions of this work are: (i) a different approach to catering for real-time path planning for moderate-sized problems; (ii) plans are computed in a decentralized manner yet solutions are optimal since all vehicles share the same information; (iii) use of a strategy to avoid collisions and static obstacles; (vi) plans are reconfigurable in real-time in case of UAV attrition.
Computational analysis shows that moderate-sized instances can be solved to optimality via the proposed formulations. Since the problem considered in [22] is closely related to the one being considered in this work, the two formulations will be examined for performance in Section 3. In Section 4, simulation and experimental results will be presented. The computed path plans will also be compared to those computed in [3] and to those produced by a variant of the open-source Matlab code available at www.matlabcentral.com by Joseph Kirk.
3. Problem Description
Let T = {t1, t2, … tn} be the set of targets to be visited. There are m UAVs initially located at source vertices S = {s1, s2, … sm}. The cardinality of S is m and that of T is n. Each depot sends out one vehicle so that the number of depots is equal to the number of UAVs. Each UAV is required to visit at least K and at most L targets, and then reach a final location. There are m final locations denoted by the set F = {f1, f2, … fm}. The problem is to assign a sequence of targets to each vehicle such that its path contains one source vertex, a subset of target vertices and one final location. Given a set of m UAVs such that m < n where n is the number of targets to be monitored. The objective is to minimize the total distance travelled by all the vehicles.
Minimize
In this case cij is the cost between targets i and j,xij is a binary variable, which is ‘1’ if the link between i and j, is part of the optimal solution and ‘0’ otherwise. The vehicles share information acquired by their sensors. Using the updated information, each UAV can find the shortest paths in real-time which may need to be modified whenever obstacles are detected.
Our problem is an Open Multiple Depot Vehicle Routing Problem (OMDVRP) as UAVs end their journey at one of the specified final m locations and do not return to the starting depots. The problem is formulated as a variant of the model given by Kara et al. [20], where each source vertex has one UAV. The total number of UAVs would thus be the same as the number of source vertices. The model is solved subject to the following constraints.
subject to
It is to be noted that for each i ∈ S and j ∈ F, one inward and one outward edge is guaranteed by constraints (i) and (ii). This is true as one UAV starts from each source vertex and terminates its flight at one of the final locations. The degree of each of these vertices is one. Equations (iii) and (iv) are the degree constraints for the target node ensuring that each node entered must also be left. Thus, each target node has a degree of two. Constraints (v) and (vi) impose bounds on the number of target nodes a UAV visits together with initializing the value of the ui as 1 if i is the first target visited on the mission. For any UAV, ui is the sequence number of targets visited on that vehicle's path from the source vertex up to the final location (i.e. visit number of the ith target), L is the maximum number of targets an agent may visit; thus 1 ≤ ui ≤ L for all i ≥ k. In addition, if K is the minimum number of nodes an agent must visit, i.e., if xi1 = 1, then K ≤ ui ≤ L must be satisfied.
Constraint (vii) prevents a UAV path consisting of only one vertex from S and one vertex from F, since such a tour does not visit any target. Finally, constraint (viii) is the Sub Elimination Constraints (SECs) that break all sub-tours between target nodes. Each path contains exactly one vertex from S, one vertex from F and a subset of vertices from T. Constraint (x) allows only links with risk factor rij ≤ rthreshold to be used in the solution. The risk factor assigned to a particular link will be based on information collected by the sensors. rij is the risk factor of the link joining nodes i and j. It is to be assigned a value between 1 and 10 based on how risky traversal along that arc is considered. In this formulation, a link will not be used for traversal if the risk associated with it is greater than 9. Table 1 provides the details of variables and constraints for the formulation in [22], referred to as AWL, as well as our formulation. The working is shown for the case where UAVs do not return to their depots. However, they must end their journey at one of the designated end point locations. Table 2 shows a comparison of the two formulations in terms of the total variables and constraints for the benchmarks of VRP [24] and TSP [25] used in this work. As can be seen from the table, there is a significant reduction in the number of variables and constraints.
List of variables and constraints for AWH and our formulation
Variables and constraints for the benchmark instances used.
4. Simulation Results
An integer programming model based on the problem formulated in Section 3 was simulated using Lingo 11.0 optimization software to study the behaviour of UAV teams and the time required to solve various instances to optimality. To the best of our knowledge no specific benchmarks have been proposed for open vehicle routing problems. VRP benchmarks without any hard constraints have been used to verify the performance of the MILP model.
4.1 Static Environments
Experiments were carried out to study the different aspects of the path planning problem. A team of identical UAVs starts from the source vertices, information relating to the environment is known a-priori and it is thus static. Vehicles complete the mission without suffering any losses and their journey terminates when the UAVs reach the final locations. The optimal plan generated using the MILP formulation is checked for cross-over using a Path Analyzer in Matlab. If there are no intersections between the paths of any vehicles, the mission is executed. However, in case vehicle paths are found to cross-over, the time interval between crossings is computed using the heading angle between successive targets on a given path and the UAV speed. If the interval is found to be greater than tthreshold seconds, no change is needed. However, if the interval between crossings is less than tthreshold seconds, the velocity of one of the UAVs is reduced such that the time difference becomes equal to or greater than tthreshold. For example, it would not be harmful if the vehicles cross-over the same point with some time difference, but it could cause a collision of the vehicles if the cross-over occurs at the same time. If UAV1 crosses the point x1, y1 at time t1, while UAV2 crosses the same point at time t2, it would cause no harm if t2 – t1 ≥ tthreshold. For example, a value of 0.5 for tthreshold would be sufficient as UAV1 and UAV2 would be 50 meters apart if both are moving with a velocity of 100 meters/second. The paths thus obtained are collision-free and avoid known threat zones by using only those links which have an associated risk factor rij ≤ rthreshold.
Figure 1 shows the allocation results of a VRP benchmark a-n32-k5 among four vehicles using Ortho Cluster Algorithm Allocation (OCA_Alloc) [3]. The start locations are the first four coordinates in the benchmark respectively and are marked as ‘0’ and the maximum path length is 176 units. A modified version of the Matlab code by Joseph Kirk available at www.mathworks.com/matlabcentral [26] was also used, which computes the path using a genetic algorithm. It will be referred to as JK implementation in our work. This gives a maximum path length of 134, when evolved over 5000 generations. A plot corresponding to this is shown in Figure 2.

Paths computed using OCA-Alloc algorithm.

Paths computed using JK's implementation.
However, the maximum path length is further reduced to 121 units when the MILP model is solved using Lingo11.0 as shown in Figure 3. This indicates an improvement of 31% over OCA_Alloc [3]. The results in Table 3 show the average value of 30 simulations for each instance. All except the first benchmark used five UAVs. A comparison of the path length with an auction-based representative algorithm, OCA_Alloc [3] indicates that a performance enhancement of up to 34% is achieved. The last column shows the time taken to solve the benchmark to optimality. For the instances kroa100 and kroa200, ‘*’ in the MILP solution time column shows that an optimal solution could not be obtained within a time limit of three hours.
Performance of OCA_Alloc versus MILP.

Optimal paths computed by solving the MILP model.
The optimization software Lingo 11.0 has been used to solve each formulation on a Core2 Duo 2.20 GHz processor running Windows 7.0. The results indicate our approach can be used to solve moderate-size path planning problems in real-time situations. Table 4 shows the performance of the JK implementation and the MILP model. Each instance was solved 30 times and the average value obtained is shown.
Comparison of JK and MILP.
More experiments were carried out to compare the performance of solutions obtained using the MILP model with the benchmarks given in [3]. Figure 4 depicts the paths for a-n80-k10, indicating the maximum distance travelled as 153 units. The paths are conflict-free with reasonably balanced loads. The start locations are the first five coordinates of the benchmark.

Optimal paths for a-n80-k10 benchmark.
However for large-sized problems, it sometimes becomes difficult to find the best solution in a limited time. The feasible path found for the instance of kroa100 at the end of three hours has been plotted in Figure 5. Though intersections are seen in the paths, these cross-overs are allowed after analysis by the Path Analyzer. An alternate path-planning strategy for large-sized problems was to partition the entire region of the benchmark into reasonably small-sized chunks. The path for each partition is then planned which takes a comparatively smaller time, as can be seen from the last column in Table 5. These results indicate that the solution, though not optimal, may still be better than those obtained with the use of other methods such as OCA-Alloc and JK implementation.
Reduced solution times for MILP with partitioning.

Feasible paths found for kroa100 benchmark.
The entire region for the benchmark kroa100 is partitioned into two parts. Path planning for both can be done simultaneously. Each partition shows three paths. The start (4,1) and end (15, 22) nodes for two vehicles are contained in partition 1 as can be seen from Figure 6.

Path planned for partition 1 of kroa100 benchmark
However, for the third vehicle, the start node (2) lies in partition 2, while the end node (6) lies in partition 1. So the third vehicle will travel a length of 4342 (node 2–23) plus 2020 (node 23–6), total distance of 6362 units. Figure 7 shows that the fourth and fifth vehicles have their start (3, 5) and end (71, 52) nodes in partition 2 and hence travel a shorter distance. Thus, the total number of vehicles used is five. The traversal path of each can be seen in Figures 6 & 7.

Path planned for partition 2 of kroa100 benchmark.
4.2 Uncertain Environments
In the next set of experiments, teams of UAVs start from source vertices and move along their paths a certain distance. Since UAVs used for surveillance missions may be lost or significantly damaged during operations, employing multiple UAVs can increase the chances of success. However, their efficiency depends on the extent of their collaboration. In such cases, the surviving vehicles must collaborate to share the load of the destroyed vehicle.
Figure 8 depicts this scenario for the benchmark a-n32-k5, where one UAV has been lost and the remaining three collaborate to complete the mission. The graph originally used to solve the model is then modified. The target nodes already visited by the team are deleted along with their links. This is evident from Figure 8, where the four start nodes as well as the visited target nodes (9, 13, 20, 23 27) have been deleted. The present positions of the vehicles are taken as the depot nodes and the path plan is recomputed in order for them to visit the remaining targets. However, the complexity of the problem is reduced as the problem size is smaller than the original one and consequently the time required to find a new solution is reduced.

Path re-computed after the loss of one UAV.
Figure 9 shows the loss of one UAV being shared by the remaining four for the benchmark kroa100. Re-computing the path took thirty- six seconds.

Path re-computed after the loss of one UAV.
Lastly, during the mission, changes in the environment may be detected. Ellipsoidal obstacles may be revealed along the traversal path and must be avoided. For obstacle avoidance, the path is slightly changed by adding two way-points tangential to the ellipse. The targets on either side of the ellipsoidal obstacle are now connected through one of these way-points. The modified path will consist of two line segments replacing the single line connecting the two consecutive targets representing the UAVs' traversal path. In this way, only a slight modification of the path occurs which does not affect the optimality of the solution in any major way. Figure 10 illustrates the changes in the planned path as obstacles pop up. The mission now takes a little longer to complete. Figure 11 illustrates the avoidance of obstacles that are detected along the travel path for kroa100.

Altered path plan to avoid obstacles.

Obstacle avoidance for larger instances.
5. Conclusions
A real-time optimization solution for UAV path planning in dynamic situations is discussed. A MILP model which can be readily solved with commercially available tools is proposed. The vehicles carry on-board sensors which can collect information, as well as communication devices to transmit the data.
Initially an optimal path is computed by the computing resources on-board each UAV. The vehicles traverse multiple target regions and avoid collision with obstacles and other team members in a dynamic hostile environment. Most of the computational processing and communication must be carried out by each vehicle in real-time on-board. These UAVs operate in situations where the available information about the environment is known with a limited level of certainty. Failures of system components are likely to occur during missions, e.g. failures in communication links or the loss of a team member. The planning-system must be able to adapt to unexpected failures and reconfigure the mission plan. This will enhance the chances of achieving success in completing a particular mission.
6. Acknowledgments
We are grateful to Professor Ping-an Gao and Professor Fanggeng Zhao for providing us with their test problems and answering our queries regarding their approach. We would also like to thank Joseph Kirk for help with the open source Matlab code on matlabcentral. The research was funded and supported by the University of Engineering and Technology, Taxila, Pakistan.
