Abstract
In this article, a tree search algorithm is proposed to find the near optimal conflict avoidance solutions for unmanned aerial vehicles. In the dynamic environment, the unmodeled elements, such as wind, would make UAVs deviate from nominal traces. It brings about difficulties for conflict detection and resolution. The back propagation neural networks are utilized to approximate the unmodeled dynamics of the environment. To satisfy the online planning requirement, the search length of the tree search algorithm would be limited. Therefore, the algorithm may not be able to reach the goal states in search process. The midterm reward function for assessing each node is devised, with consideration given to two factors, namely, the safe separation requirement and the mission of each unmanned aerial vehicle. The simulation examples and the comparisons with previous approaches are provided to illustrate the smooth and convincing behaviours of the proposed algorithm.
Introduction
The applications of unmanned aerial vehicles (UAVs) in military and civilian fields have achieved great success in recent years. UAVs have advantages over manned planes, such as low manufacturing and operational costs, flexibility in accommodating different payloads, risk reduction of human lives (no pilot or crew), and so on. The advantages and prosperous experiences have spurred great number of communities and organizations to research on UAVs. However, most UAVs are only allowed to fly in segregated airspace nowadays, which has largely restricted the application of UAVs because the segregated airspaces are often far away from populated area. Therefore, the most fundamental problem in UAVs application promotion is airspace integration. 1 The UAVs air traffic management (ATM) is a necessity when UAVs are permitted to fly into nonsegregated airspace. In order to meet with the “do not harm” criterion, the first and foremost duty of ATM is to guarantee the airspace safety. Naturally, the airspace conflict detection and resolution (CDR) problem is a primary concern. 2
The CDR on manned plane considers less on optimizing the trajectories of airplanes. Unnecessary fuel is wasted, and atmospheric pollution increases because of this inefficient mechanism. Studies have already been processing on improving the efficiency of manned CDR. 3 –5 However, there are differences between UAVs and manned planes: there is no pilot in the cabins; the sizes of civil UAVs are much smaller. Furthermore, most UAVs cannot load as many devices as manned planes, such as Traffic Collision Avoidance System II.
The objective of conflict resolution is to guide conflict-related UAVs return to predefined paths efficiently. Kinds of conflict resolution methods are proposed in the literature, such as trajectories planning methods, reactive methods. The safe separation distances between UAVs are anticipated to be smaller than that of manned planes because of their characteristics, such as platform sizes and maneuverability. As a result, the dynamic environment, such as wind, may have more obvious effluences on the states of UAVs. The long-term trajectories planning methods are time-consuming and inefficient to deal with uncertainties of environment. The reactive methods are lack of the ability to consider the objectives of CDR comprehensively, which are to keep the safe separation between UAVs, and to minimize the deviation from nominal paths. In this article, we study on the tree search-based CDR algorithm. We consider the circumstance that the ATM system keeps on communicating with UAVs. It guides UAVs to take cooperative conflict-free maneuvers when they are predicted to lose separation. There are three kinds of strategies to resolve airspace conflicts, namely heading control, speed control, and attitude control. UAVs usually fly on track laterally and to maintain a particular airspeed or Mach number when they are en route. This article researches on conflict resolution by heading control.
The tree search algorithm is widely used in AI field. It is a technique for solving practical problems in combinatorial optimization. In the conflict resolution problem, a decision is very difficult to assess before having a complete solution, therefore requiring the simulation of the process to probe into the quality of the decision. However, the uncertainty dynamics would disturb the simulation. As it is difficult to construct accurate mathematical models for the uncertainty of environment and system, various kinds of techniques are proposed to approximate the plants, such as the fuzzy logic systems (FLS) and neural networks (NNs) 6,7 Fang Wang et al. propose to approximate the continuous uncertainty function by FLS. 6 Yan-Jun Liu et al.’s study on the optimization control problem for discrete-time system. 8 Meanwhile, NNs are also widely applied to approximate the unmodeled dynamics. Shitie Zhao et al. propose to approximate the unknown nonlinear function by affine-type NN. 9 Radial basis functions NNs are applied to approximate the continuous function 7,10 Since wind varies in dynamic environment. In order to meet with the online approximation requirement, the back propagation NNs (BPNNs) are utilized to approximate the wind effect. 11 The weights of the NNs are updated according to the change of environment.
To meet with the online planning requirement and adapt to the change of the environment, we propose to search conflict-free maneuvers in a limited time interval (look-ahead interval) PN . Such a method requires that the optimization task is solved periodically to ensure that the moving look-ahead interval is always placed in the future. The decentralized conflict resolution method is efficient in searching the near optimal de-conflict policy. The tree search algorithm may not be able to expand to the goal states in each online search process as PN is limit. The midterm reward function is designed to evaluate the intermediate nodes of the tree. Several features are considered in midterm reward function, namely safe separation constraint and flight efficiency (priority, maneuver, and costs). 12
The remainder of this article is organized as follows. Section “Related works” scrutinizes related works; the tree search algorithm is presented in section “Tree search conflict resolution algorithm”. In section “Conflict detection and resolution analysis”, we study on the conflict resolution problem and devise the midterm reward function. The multi-UAV conflict resolution problem is studied and the tree search midterm reward function is proposed in section “Multi-UAV conflict resolution”. The performance of our tree search algorithm is demonstrated by comparing it with existing algorithms in section “Simulation experiments.” Conclusions on our works are presented in “Conclusions” section.
Related works
The literature that deals with the CDR problem is rapidly growing in volume. 13 Various methods have been proposed. Trajectories optimization methods are centralized methods. They aim to find a series that guarantee conflict-free maneuvers between UAVs with minimum overall consumption, such as genetic algorithm-based method 14,15 and centralized mixed integer quadratic programming (MIQP). 16,17 However, these methods are time-consuming, even the linearized methods. 18,19 Distributed algorithms are therefore proposed. The fast reactive local motion planning method is applied in many scenarios. It is efficient in keeping aerial vehicles apart. It is typically applied to respond to unexpected events or errors in the environment model. 20 However, it neglects minimizing the fuel consumption and time delay in searching conflict-free solutions. Some researchers consider taking the returning to the nominal paths into reactive methods, such as the navigation function-based method, 21 collision cone-based algorithm, 22 potential field theory-based algorithm, 23 and velocity obstacles collision avoidance algorithms. 24,25 There are some other distributed algorithms that are proposed to deal with airspace conflict resolution problem. Santosh Devasia et al. bring forth the concept of conflict-resolution procedures. This is a very brilliant idea in ATM. 5 David Šišlák et al. propose a distributed agent-based cooperative collision avoidance algorithm. 3 They suppose that each aircraft knows part of the future intentions of neighboring planes during CDR process. Planes would plan conflict-free trajectories cooperatively if conflict is detected. In summary, the planning algorithms and distributed algorithms have been researched for many years and perform well in many CDR problems.
In order to improve the airspace efficiency, the safe separation distance between UAVs should be smaller. In addition to that, the dynamic factors, such as wind, may have effect on UAVs states easily because UAVs are much lighter. The CDR problem would be intractable when UAVs are flying in congested and dynamic environment. The development of model predictive control (MPC) has spurred its application on multi-agent navigation problem. Martin Saska et al. propose to coordinate and control heterogeneous vehicles in virtue of MPC. 26 The safe separation is guaranteed in trajectory following process. The MPC pattern is of advantage in dealing with the navigation problems in dynamic environment. 27 Some of the MPC problems are solved by standard mathematical programming methods while some MPC problems are solved by tree search method. 28,29 As the unmodeled dynamics are difficult to describe in an analytical form, we propose to approximate the unmodeled dynamics by NNs. 30 In this article, the cooperative conflict resolution maneuvers are obtained by heuristic tree search method.
Tree search conflict resolution algorithm
UAV model
The tree search algorithm is based on the model of UAVs. The point mass aircraft model described in the study by Menon et al. 31 is commonly accepted to represent dynamical effects in civil aviation. Due to the complexity, there is a problem that searching safe separation maneuvers of several interfering aircrafts with such a model would require too much computational effort. When UAVs are flying in a constant attitude, a good approximation of aircraft point mass model dynamics is used in the study by Bicchi and Pallotino. 32 The state of Ai at time t is represented by vector
where
where function
where
where ϕmax is the max roll angular and γmax is the max pitch angular, n and g are constants.

UAV motion modification. UAV: unmanned aerial vehicle.
We study on the CDR problem in two-dimensional (2-D) planar space. The safe region is a circle with safe radius ri
in 2-D space, and the state of A
i in 2-D space is
Approximate the effect of wind by NNs
The UAV model in “UAV model” section is a simplified linear model. There are bounded high-order nonlinear effects in UAVs system. In addition to that, there are external disturbances from the environment, such as wind. This would inevitably lead to unmodeled dynamics because of mismatching between the navigation design model and the actual plant. We define the unmodeled dynamics parts
The BPNN is trained by leaning the effect of unmodeled dynamics on UAVs. In this article, the effect of the unmodeled dynamics on UAVs is expressed by error between the predicted state and real state
In the system, the main factor in the unmodeled dynamics is the wind effect. According to the dynamics of UAVs, the effect of wind on each UAV is relevant to the variables
where ui is the heading maneuver. The state prediction error
The detail of BPNN can refer to the study by Haykin. 35 The real state of UAV A i is estimated by equation
Model-based tree search in conflict resolution
Primary concept
The tree search conflict resolution algorithm computes a motion series, which is a continuous function from time variable into state space,
Cooperative tree search motion planning algorithm plans tracks for multiple vehicles. For N vehicles, the composite configuration space
starting at the initial states of vehicles,
avoiding loss separation of N cooperative UAVs,
minimizing additional fuel consumptions and reducing deviations from nominal paths,
respecting the kinematic and dynamic constraints of UAVs.
Several problems should be discussed when applying the tree search method in UAV CDR.
A. Action
In the tree search algorithm, actions are candidate safe separation maneuvers. In this article, the conflict resolution is solved by headings control. The control variable of A
i is angular velocity wi. The action in multi-UAV conflict resolution is
B. budget of expansion
The tree search algorithm should be defined with budget of expansion because of the following reasons. Firstly, the computation time and memory space is limited, it is inappropriate to expand the tree to the goal state. Secondly, the NN cannot give a perfect precise estimation on the effect of unmodeled dynamics, the prediction errors would accumulate along the expanded nodes.
C. Middle-term rewards
A reward function maps perceived states (or state–action pairs) of the environment to a single dimensionless number. A reward value indicates the intrinsic desirability of the state. The sole objective of the tree search algorithm is to maximize the total reward that it receives in the long run.
As the budget of expansion is limit, the tree search algorithm may not be able to expand to the goal states in the search process. Therefore, this article proposes to criticize nodes by midterm rewards. The algorithm expands nodes based on the midterm rewards.
The reward function
Tree search algorithm
The state of UAV A
i is estimated by equation (7). As it is discussed above, the middle-term reward of each node is obtained by the reward function. For each terminal node
where
Equation (9) denotes that
Then we consider how to expand the tree such that we could arrive at a near-optimal decision within allowed computational budget. We apply the best first tree search method. The best first search method develops trees non-uniformly. It expands nodes to a deeper depth that look ‘promising’. In order to do this, the algorithm chooses to expand node with the highest value
Fiorini et al. show theoretically that non-uniform trees developed by the score will never perform worse than uniform trees for the same budget of expansion. 37 The performance of non-uniform tree search method is problem specific. In conflict resolution problem, the midterm reward value denotes safe separation relationship between UAVs and the effect of maneuvers on returning to goal states. The efficiency of the tree search algorithm is largely determined by the structure of the midterm reward function.
CDR analysis
UAV CDR problem analysis
The safe region of UAV A
i is defined as
where
In the future, UAVs would be connected with ATM by communication devices, such as the Automatic Dependent Surveillance-Broadcast system. The local ATM system is able to obtain the real-time states of UAVs. We define the alert distance as
Supposing that the conflict resolution process persists on M periods, the transition points can be obtained by the following equation,
CDR midterm reward
Horizontal maneuver discussion
The relationship between A
i and A
j is studied in local frame. We define
There is potential conflict hazard between UAVs if the following condition is satisfied
where
The differential unit position of A j relative to A i is
The displacement of A
j relative to A
j in
Conflict resolution midterm reward
One of the most important issues in tree search algorithm is to design a reasonable midterm reward function. In the tree search algorithm, airspace conflicts are expected to be resolved by sequential (multistep) maneuvers, which would reduce the deviations of UAVs from nominal paths.
To avoid the potential conflict between A
i and A
j, the cooperative conflict avoidance maneuver policy
UAVs should keep separation from each other during each maneuvers interval
UAVs should keep the safe separation after heading maneuver.
To minimize the fuel cost, the distance between two UAVs may be close to

The separation distance might be respected at t k and t k+1 while not being respected inside interval [t k, t k+1].
Constraint (a) means that two UAVs are required to keep the safe separation during the maneuver time period. It is assumed that the distance between UAVs at tk is larger than the relative safe distance. To guarantee the safe separation of two UAVs in

Newton downhill-based loss separation checking algorithm.
where ε is a sufficient small positive constant,
Constraint (b) combines the objective of safe separation and returning to nominal paths. The reward of each maneuver pair is approximated by the states of UAVs. For example, A
i and A
j take the heading maneuver

Action prediction and value evaluation.
Let
From the perspective of safety, larger
We propose that
The value of tc would be high if
where nc, ςr, and tσ are constant coefficients.
We then discuss on evaluating the effect of heading maneuver
Function
where α and σ are constant coefficients, ξ is an indicative function
Function
where
Therefore, the midterm reward function
Multi-UAV conflict resolution
We have discussed the dynamics of UAVs and proposed the midterm reward function for pair-wise conflict resolution problem. There may be scenarios that more than two UAVs are involving in local conflict in multi-UAV environment.
The multi-UAV conflict is regarded as a series of pair-wise conflicts. These pair-wise conflicts are coupled. The tree search algorithm considers all these relevant pair-wise conflicts. In this article, we propose to analyze multi-UAV conflict problem by graph. As it is shown in Figure 5, the conflict relations of conflict-relevant UAVs are expressed by constraint graph

(a) A configuration with four robots. (b) The constrained graph generated based on the states.
UAVs that involve in one specific conflict construct a connected graph. Therefore, UAVs in the local airspace can be grouped into disconnected sub clusters.
To deal with the conflict resolution problem for UAVs in each sub cluster, we define the multi-UAV conflict resolution midterm reward. When the algorithm expands a leave node
L
s = 1 if the action set u lead to loss separation between any pair of UAVs. The tree search algorithm would not expand nodes
Simulation experiments
To demonstrate the tree search conflict resolution algorithm, we compare the tree search algorithm with other algorithms. The experiments are based on the Matlab/Simulink platform. We suppose that Pioneer UAVs fly on their predefined courses. The simulated air traffic controller would send navigation commands to UAVs if these UAVs are predicted to meet with conflict. The navigation command includes three-dimensional velocity and future positions. The autopilots on these UAVs would control four units based on the navigation commands, namely pitch, roll, yaw angular and throttle, to control UAV platforms.
Algorithm efficiency analysis
In the first simulation experiment, we devise four UAVs airspace conflict scenery. UAVs fly at the attitude of 60 m flatly with the speed of 60 m/s. We set the safe radius of UAVs to be 250 m based on the following reasons: Pioneers are substantially smaller and lighter than commercial planes; the effects of weak vortex for pioneers are more slender than commercial planes; pioneers are permitted to take large amplitude maneuvers to keep the safe separation as there are not human beings in the cabin. To guarantee the safety of UAV platforms, the maximum angular velocity is restricted at 0.2 rad/s. The alert distance
In the experiment, wind is regarded as dynamic disturbances. Wind effect would result in deviations of UAVs from their predefined waypoints, which may cause conflict dangers.
As shown in Figure 6, four UAVs are planned to pass the dark zone P at sequence {U3, U4, U1, U2}. The minimum distance between UAVs is 252 m if they fly in accordance with their plans. The transverse wind would change the ground speed of UAVs. Therefore, four UAVs would arrive at P zone almost simultaneously, which leads to loss separation dangers.

Four UAVs predefined flight plan. UAV: unmanned aerial vehicles.
In this simulation experiment, we compare the tree search CDR algorithm with other algorithms. As the time constraint for online conflict resolution and the effect of unmodeled dynamitic would be accumulate step by step, we would not compare our algorithm with trajectory planning algorithms. We compare our algorithm with reactive algorithms. Steven Roelofsen et al. propose to avoid the conflict between UAVs by navigation function-based method. 21 In the rule-based algorithm, UAVs would abide by the same collision avoidance rules to keep the safe separation. 40 The BPNN online learning method is applied in the tree search algorithm to estimate the effect of unmodeled dynamics. The conflict resolution results are shown as Figure 7. Figure 7(a) and (b) depict the conflict-free traces by the rule-based algorithm; Figure 7(c) and (d) depict the conflict-free traces by the navigation function-based algorithm. Figure 7(e) and (f) show the conflict-free traces by cooperative tree search algorithm. In the rule-based algorithm, each UAV would take half of the responsibility in each pair-wise conflict, which leads to additional fuel consumption. In the navigation function, UAVs would take large detours to avoid other UAVs. In our algorithm, conflicts are solved by cooperative tree search method. Each UAV would take different maneuvers to avoid the conflict. Therefore, UAVs would not take unnecessary fuel consumptions. The deviations from nominal paths are shown in Figure 8. It is shown that the navigation function-based method would cause larger deviations from nominal paths. The rule-based algorithm would have minor impact on UAVs flight path. The tree search algorithm lead to the least influence on UAVs flight paths.

Conflict-free traces, (a), (c), and (e) depict 3-D traces. (b), (d) and (f) depict 2-D traces.

Comparison of conflict resolution consumption. The deviation from nominal paths.
The distances between each pair of UAVs during the flying time 40 s to 120 s are shown in Figure 9. The orange line is the minimum safe separation distance between UAVs. The result shows that the tree search algorithm can guarantee the safe separation between UAVs.

Distance between UAVs in CDR process. UAV: unmanned aerial vehicle; CDR: conflict detection and resolution.
CDR in complex scenario
In the second scenario, we demonstrate our algorithm in copying with the conflict resolution in the real environment. There is a disaster in a city region. The ground communication devices are destroyed because of disaster. In order to search and rescue the survivals, four communication-relay UAVs are allocated to fly above this region to build the communication. One scout UAV is issued to search important spots. The velocities of these five UAVs are 56 m/s, 54 m/s, 48 m/s, 66 m/s, and 30 m/s. In this scenario, we set the safe region of each UAV based on their speed. We define different safe radius for different UAV according to its velocity. The safe radius of each UAV A
i is set as

Predefined traces of UAVs in search and rescue scenario. UAV: unmanned aerial vehicles
These UAVs may meet with loss separation dangers because of unexpected disturbances. The communicate-relay UAV are permitted to take large detours during the flight while the scout UAV should keep fly close to the preplanned trace. In the experiment, we set the priority of scout UAV as 9 and the priorities of communicate-relay UAVs as 1. The algorithm takes the obstacles into consideration when devising the conflict-free policy. The conflict-free trajectories of these UAVs are shown in Figure 11.

Conflict-free traces for search and rescue scenario.
As Figure 11 shows that these UAVs can avoid collision with obstacles. The trace of UAV 5 has not been largely modified. During the conflict avoidance maneuver process. The communicate-relay UAVs have taken most of the work load to avoid the conflict dangers. The distances between UAVs are shown in Figure 12.

The distances between UAV pairs. UAV: unmanned aerial vehicle.
As the velocities of these UAVs are different, the ranges of safe regions of UAVs are not equal. As it is shown in Figure 12 (a) to (d), the distance between each pair of UAVs is larger than
Conclusions
We study on UAVs conflict resolution problem in dynamic environments in this article. We first propose the tree search method to deal with UAVs conflict resolution problem. The BPNNs are applied to approximate the unmodeled dynamics. Secondly, the midterm reward for pairwise conflict between UAVs is defined as weighted sum of two factors, namely, loss separation punishment and goal attraction. We then discuss on multi-UAVs conflict resolution problem. Finally, we demonstrate the proposed algorithm by simulation experiments.
As the communication condition may be imperfect in some circumstances and the number of UAVs would increase. The distributed cooperative CDR method is required in these situations. We would study on the distributed cooperative CDR in our future work.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by Natural Science Foundation of China (number: 61403410), China Postdoctoral Science Foundation (number: 2014M552687) and Graduate Student Innovation Project of Hunan province (number: CX2014B013).
