Pseudospectral optimal control approach-based cooperative planning for UAV formations with mission-oriented formation selection

Abstract

Cooperative path planning for unmanned aerial vehicle (UAV) systems is challenged by complex kinematic constraints, coordination constraints, and the need to maintain optimized formations. This work characterizes mission-oriented performance constrained formations and integrates them into cooperative path planning under multiple constraints. A mission-oriented performance constrained formation model covering reconnaissance, penetration, damage, and communication capabilities is proposed, and the simulated annealing particle swarm optimization (SAPSO) algorithm is used to obtain the desired formation. A cooperative path planning algorithm based on the Radau pseudospectral method (RPM) is then proposed: by converting the optimal control problem into a nonlinear programming problem, a cooperative path planning method under multiple constraints is realized. Simulations of a UAV formation validate that the proposed approach generates smooth, collision-free trajectories that maintain the optimized formation while satisfying all kinematic, performance, and cooperative constraints.

Keywords

multi-UAV systems cooperative path planning Radau pseudospectral method nonlinear systems formation optimization

Introduction

Multi-UAV systems can be applied to cooperative path planning in complex environments to accomplish missions that are difficult for a single UAV to complete.^1–3 Multi-UAV cooperative path planning is a research field that has emerged in recent years, involving collaboration among multiple UAVs to achieve path planning under multiple constraints. The multi-UAV path planning problem is highly complex, with challenges that stem primarily from environmental complexity and the need for cooperative operations. These challenges require ensuring that the UAVs can coordinate with each other, avoid collisions during flight, and maintain a desired formation to maximize operational effectiveness.^4–8

For single-UAV path planning, several classic algorithms have been proposed.^9–11 Compared to the single-UAV case, the cooperative path planning of a multi-UAV system must satisfy additional constraints such as communication topology, collision avoidance, and formation maintenance, which poses significant challenges. In recent years, several methods for multi-UAV cooperative path planning have been introduced, with common approaches including the artificial potential field method,^12–14 graph search methods,^15–17 heuristic algorithms,^18–21 reinforcement learning methods,^22,23 and model predictive control.^24,25 However, the aforementioned methods for formation flight path planning, obstacle avoidance, and cooperative control often model the UAVs as point masses. This simplification ignores the 6 degree-of-freedom (6-DOF) dynamic characteristics of the UAVs as well as the effects of unavoidable lumped disturbances, so the resulting plans deviate from the complexities of real-world multi-UAV formation flight.

The pseudospectral method is an effective numerical technique for solving optimal control problems.^26 Due to its advantages, such as convenient constraint handling, rapid convergence, high accuracy, and relatively low sensitivity to initial values,^27,28 it has been widely applied in the trajectory optimization of hypersonic vehicles during their climb, glide, and re-entry phases,^29–31 the trajectory optimization of combined-cycle powered vehicles,^32,33 and missile guidance trajectories.^34 The pseudospectral method has thus achieved significant success in vehicle trajectory optimization, but it has rarely been applied to multi-UAV cooperative trajectory optimization. The existing research using pseudospectral methods for multi-UAV trajectory optimization has not yet considered key challenges such as cooperative constraints, obstacle avoidance strategies, and formation keeping, making it difficult to meet the collaborative mission requirements of practical flight. Regarding combat formations, traditional design approaches primarily evaluate formation metrics using methods such as bi-level programming models^35,36 and potential field models.^37 These methods rely on inter-UAV distances and situational assessment for formation selection. Although their models are simple and easy to implement, they fail to effectively integrate weapon system performance with operational effectiveness, making it difficult to guarantee the optimality of the formation for mission efficiency and safety.

To address the above challenges, this paper investigates multi-UAV cooperative path planning using the RPM integrated with mission-oriented formation optimization. To overcome the multi-constraint challenges posed by complex dynamics and cooperative relationships, a multi-objective path planning approach based on the RPM is proposed, which transforms the problem into a nonlinear programming formulation under multiple constraints and achieves both high solution accuracy and computational efficiency. In addition, an evaluation model is established to assess formation performance under mission-oriented constraints, and the SAPSO algorithm is employed to solve the formation optimization problem, thereby enhancing cooperative combat capability. Simulation results demonstrate that the proposed method, RPM-based cooperative path planning with mission-oriented formation optimization, can simultaneously satisfy dynamic and cooperative constraints while effectively designing and maintaining formations with superior operational effectiveness, thereby validating the methodology.

The innovations of this paper include:

Aiming at the key performance indicators of multi-UAV cooperative operations, a formation optimization modeling method based on mission performance constraints is proposed, and the SAPSO algorithm is used to achieve the desired formation design, providing a theoretical basis for the formation design of multi-UAV systems.

To address the inadequacy of existing nonlinear programming models in representing multi-UAV cooperation, a model capable of characterizing multi-UAV cooperative relationships is established, which provides a complete representation of cooperative constraints such as synchronous arrival, formation keeping, and inter-agent collision avoidance.

To overcome the insufficient representation of real 6-DOF UAV dynamics in multi-UAV path planning, a multi-constraint RPM is proposed, which comprehensively considers dynamic equation constraints, cooperative relationships, and mission-oriented formation keeping.

The content of this paper is arranged as follows. Section 2 introduces the UAV and formation modeling and the related problems. The mission-oriented performance constrained formation design using the SAPSO optimization algorithm is proposed in Section 3. Section 4 presents the multi-UAV cooperative path planning algorithm based on the RPM. Simulation studies are demonstrated in Section 5. The conclusion and future prospects are discussed in Section 6.

Problem statement

UAV and formation modeling

The dynamic model of a UAV primarily describes its motion in three-dimensional space, represented by changes in position, velocity, and orientation. Let $μ$ denote the flight path bank angle, $α$ the angle of attack, and $β$ the sideslip angle, and let $T_{x}$, $T_{y}$, and $T_{z}$ denote the thrust components along the body axes. Treating these variables as reference commands, the UAV dynamic model is given as follows:

{\dot{x}}_{i} = V_{i} \cos γ_{i} \cos χ_{i}

(1)

{\dot{y}}_{i} = V_{i} \cos γ_{i} \sin χ_{i}

(2)

{\dot{z}}_{i} = V_{i} \sin γ_{i}

(3)

{\dot{V}}_{i} = \frac{1}{M_{i}} [- \mathrm{Drag}_{i} + (\mathrm{Side}_{i} + T_{y c_{i}}) \sin β_{c_{i}} - M_{i} g \sin γ_{i} + T_{x c_{i}} \cos β_{c_{i}} \cos α_{c_{i}} + T_{z c_{i}} \cos β_{c_{i}} \sin α_{c_{i}}]

(4)

\begin{aligned} {\dot{χ}}_{i} = & \frac{1}{M_{i} V_{i} \cos γ_{i}} [\mathrm{Lift}_{i} \sin μ_{c_{i}} + (\mathrm{Side}_{i} + T_{y c_{i}}) \cos μ_{c_{i}} \cos β_{c_{i}} + T_{x c_{i}} (\sin μ_{c_{i}} \sin α_{c_{i}} - \cos μ_{c_{i}} \sin β_{c_{i}} \cos α_{c_{i}})] \\ & - \frac{T_{z c_{i}}}{M_{i} V_{i} \cos γ_{i}} (\cos μ_{c_{i}} \sin β_{c_{i}} \sin α_{c_{i}} + \sin μ_{c_{i}} \cos α_{c_{i}}) \end{aligned}

(5)

\begin{aligned} {\dot{γ}}_{i} = & \frac{1}{M_{i} V_{i}} [\mathrm{Lift}_{i} \cos μ_{c_{i}} - M_{i} g \cos γ_{i} - (\mathrm{Side}_{i} + T_{y c_{i}}) \sin μ_{c_{i}} \cos β_{c_{i}}] + \frac{T_{x c_{i}}}{M_{i} V_{i}} (\sin μ_{c_{i}} \sin β_{c_{i}} \cos α_{c_{i}} + \cos μ_{c_{i}} \sin α_{c_{i}}) \\ & + \frac{T_{z c_{i}}}{M_{i} V_{i}} (\sin μ_{c_{i}} \sin β_{c_{i}} \sin α_{c_{i}} - \cos μ_{c_{i}} \cos α_{c_{i}}) \end{aligned}

(6)

where the subscripts i and c denote the i-th UAV within a multi-UAV formation and the reference commands, respectively. The variables x, y, and z constitute the position vector of the UAV relative to an inertial ground frame, and V represents the velocity. $χ$ and $γ$ are the flight path azimuth and climb angle, respectively. $\mathrm{Lift}$, $\mathrm{Drag}$, and $\mathrm{Side}$ denote the aerodynamic forces (lift, drag, and side force) resolved in the wind-axes system, and M represents the mass of the UAV.

The aerodynamic forces can be expressed as $\mathrm{Lift} = Q S (c_{L 0} + c_{L}^{α} α)$, $\mathrm{Drag} = Q S (c_{D 0} + c_{D}^{α} α)$, and $\mathrm{Side} = Q S c_{Y}^{β} β$. The parameters $c_{L 0}$ and $c_{D 0}$ are the zero-alpha lift coefficient and zero-lift drag coefficient, respectively. $c_{L}^{α}$ and $c_{D}^{α}$ denote the derivatives of the lift and drag coefficients with respect to angle of attack, respectively. $c_{Y}^{β}$ is the side force coefficient derivative with respect to sideslip. Q and S denote dynamic pressure and reference area, respectively.
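As an illustration, Equations (1)-(6) and the aerodynamic force model can be sketched in Python for a single UAV. The function and argument names below are illustrative and do not come from the paper. The speed equation is coded in the standard wind-axes form, in which the axial thrust $T_{x}$ multiplies $\cos β \cos α$; this is an assumption about the intended form of Equation (4).

```python
import math


def aero_forces(q_dyn, s_ref, alpha, beta, c_l0, c_la, c_d0, c_da, c_yb):
    """Lift, drag and side force from the linear coefficient model."""
    lift = q_dyn * s_ref * (c_l0 + c_la * alpha)
    drag = q_dyn * s_ref * (c_d0 + c_da * alpha)
    side = q_dyn * s_ref * c_yb * beta
    return lift, drag, side


def state_derivative(state, cmd, forces, mass, g=9.81):
    """Right-hand side of Equations (1)-(6) for one UAV.

    state  = (x, y, z, V, chi, gamma)
    cmd    = (mu, alpha, beta, Tx, Ty, Tz), the reference commands
    forces = (lift, drag, side)
    """
    _, _, _, v, chi, gam = state
    mu, alpha, beta, tx, ty, tz = cmd
    lift, drag, side = forces
    sm, cm = math.sin(mu), math.cos(mu)
    sa, ca = math.sin(alpha), math.cos(alpha)
    sb, cb = math.sin(beta), math.cos(beta)

    # Kinematics, Equations (1)-(3)
    x_dot = v * math.cos(gam) * math.cos(chi)
    y_dot = v * math.cos(gam) * math.sin(chi)
    z_dot = v * math.sin(gam)

    # Speed, Equation (4); assumed form with Tx on the cos(beta)cos(alpha) term
    v_dot = (-drag + (side + ty) * sb - mass * g * math.sin(gam)
             + tx * cb * ca + tz * cb * sa) / mass

    # Azimuth, Equation (5)
    chi_dot = (lift * sm + (side + ty) * cm * cb
               + tx * (sm * sa - cm * sb * ca)
               - tz * (cm * sb * sa + sm * ca)) / (mass * v * math.cos(gam))

    # Climb angle, Equation (6)
    gam_dot = (lift * cm - mass * g * math.cos(gam)
               - (side + ty) * sm * cb
               + tx * (sm * sb * ca + cm * sa)
               + tz * (sm * sb * sa - cm * ca)) / (mass * v)

    return x_dot, y_dot, z_dot, v_dot, chi_dot, gam_dot
```

In straight and level flight with lift equal to weight and axial thrust equal to drag, the last three derivatives vanish, which is a quick sanity check on the signs.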

The modeling of a multi-UAV formation is primarily developed based on the relative kinematic relationships between any two aircraft within the formation. This paper adopts the leader-follower structure for formation modeling due to its advantages, which include a simple architecture and strong robustness, as well as its ability to simplify control design based on the positional relationships among the UAVs. A leader-follower formation model is typically realized by establishing the relationship between a single leader aircraft and multiple follower aircraft. The leader is responsible for guiding the formation's overall motion, while the followers maintain the formation by tracking the leader's trajectory. In this configuration, the tracking model for the leader is consistent across all followers. Consequently, the collective motion of the multi-UAV formation can be analyzed by examining the relative motion between a single follower and the leader. The leader-follower structure of the multi-UAV formation is shown in Figure 1.

Figure 1.

Leader-follower formation structure.

In the figure, the subscripts L and F denote the leader and follower UAVs, respectively. The $O - X Y$ coordinate system represents the inertial frame, while the $O_{C} - X_{C} Y_{C}$ system represents the formation frame, where the origin $O_{C}$ corresponds to the leader aircraft. The terms $V_{L}$ and $V_{F}$ represent the velocities of the leader and the follower, respectively, while $χ_{L}$ and $χ_{F}$ denote the azimuth angles of the velocity vectors $V_{L}$ and $V_{F}$ with respect to the $O X$ axis. The variables $d x$ and $d y$ represent the longitudinal and lateral separation distances, respectively, between the leader and the follower within the formation coordinate frame.

Based on the geometric relationships shown in the figure, the transformation of the relative longitudinal and lateral positions, $d x$ and $d y$ , from the inertial frame to the formation frame can be obtained as follows:

d x = (x_{F} - x_{L}) \cos χ_{L} + (y_{F} - y_{L}) \sin χ_{L}

(7)

d y = (x_{F} - x_{L}) \sin χ_{L} - (y_{F} - y_{L}) \cos χ_{L}

(8)

Simultaneously considering the relative altitude relationship between the UAVs in the formation, let $d z$ represent the relative altitude. If $z_{L}$ and $z_{F}$ are the altitudes of the leader and follower, respectively, it can be readily established that:

d z = z_{F} - z_{L}

(9)
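A short Python sketch of Equations (7)-(9) follows. The function name and the tuple layout of the inputs are illustrative choices, and the leader azimuth is assumed to be given in radians.

```python
import math


def relative_position(leader, follower, chi_l):
    """Follower offset (dx, dy, dz) in the formation frame.

    leader, follower: (x, y, z) positions in the inertial frame.
    chi_l: leader azimuth angle in radians.
    """
    ex = follower[0] - leader[0]
    ey = follower[1] - leader[1]
    dx = ex * math.cos(chi_l) + ey * math.sin(chi_l)  # Equation (7)
    dy = ex * math.sin(chi_l) - ey * math.cos(chi_l)  # Equation (8)
    dz = follower[2] - leader[2]                      # Equation (9)
    return dx, dy, dz
```

For example, with the leader heading along the $O X$ axis ($χ_{L} = 0$), a follower 10 m behind the leader in x has $d x = -10$.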

Mission-oriented performance constrained formation framework

Multi-UAV formation design refers to the selection of an appropriate formation configuration to ensure mission efficiency and safety. Traditional approaches use inter-UAV distance and situational assessment as the basis for formation selection and are characterized by their simplicity and ease of implementation. However, they do not effectively integrate weapon system performance with operational effectiveness, and therefore struggle to guarantee the optimality of the formation when complex mission-oriented formation selection is required.

This paper proposes a multi-UAV formation model based on mission-oriented formation selection. The model encompasses four key aspects: cooperative detection capability, cooperative maneuvering penetration capability, target destruction capability, and communication command capability. The specific components of this formation evaluation model are detailed as follows:

The cooperative detection capability: It refers to the capacity for comprehensive and precise awareness, ensuring the UAV formation can accurately ascertain the distribution of targets, threat status, and dynamic changes within the operational area.

The cooperative maneuvering penetration capability: It represents the formation's ability to evade enemy defense systems through agile maneuvering, ensuring the multi-UAV system can adapt to complex operational environments and enhance its survivability.

The target destruction capability: It describes the capacity for precision strikes against designated targets, ensuring the combat effectiveness of the multi-UAV formation during the execution of coordinated strike missions.

The communication command capability: It signifies the ability of individual UAVs within the formation to maintain real-time, reliable communication, which serves as the foundation for the entire formation to sustain cohesive and coordinated flight.

Based on the preceding analysis, the structure of the mission-oriented performance constrained UAV formation evaluation model is illustrated in Figure 2.

Figure 2.

The mission-oriented performance constrained UAV formation evaluation model.

Multi-UAV cooperative path planning framework

Cooperative path planning for multi-UAV systems aims to enable formation flight while adhering to a set of constraints, including dynamic characteristics, performance limitations, and no-fly zones. To ensure the formation exhibits high performance and efficiency throughout the path planning process, specific optimization objectives must be satisfied. Furthermore, to achieve synchronous and safe arrival in a designated formation, the cooperative relationships among the UAVs must be maintained.

Therefore, multi-UAV cooperative path planning must satisfy three fundamental requirements: safety, efficiency, and coordination. Safety, the paramount consideration, involves respecting the UAVs’ dynamic and performance constraints to operate within safe operational envelopes, adhering to no-fly zone restrictions, and ensuring continuous collision avoidance between the formation and any obstacles. Efficiency aims to enhance overall operational effectiveness by optimizing for metrics such as minimum flight distance or minimal control effort, identifying optimal paths that minimize flight time and expenditure while adhering to all safety criteria and mission objectives. Coordination demands that all UAVs in the formation cooperate to prevent inter-agent collisions and to arrive at the target location synchronously in the desired formation.

To fulfill these requirements, the modeling of multi-UAV cooperative path planning must comprehensively account for the dynamic characteristics, performance limitations, and no-fly zone constraints of each formation member, as well as the optimization objectives and cooperative relationships of the group.

Formation design for mission-oriented formation selection using the SAPSO optimization algorithm

This section, guided by actual mission requirements, establishes a comprehensive evaluation model for multi-UAV combat formations based on mission-oriented formation selection. This model incorporates four key capabilities: cooperative reconnaissance, maneuver and penetration, target destruction, and command and communication.

Subsequently, the section introduces the SAPSO optimization algorithm. This algorithm is then integrated with the formation evaluation model to solve for and determine the desired formation layout. This approach facilitates a multi-UAV formation design that is explicitly optimized for mission-oriented formation selection.

Modeling of the mission-oriented performance constrained formation

This subsection proposes an evaluation model for multi-UAV formations based on mission-oriented formation selection. The model comprises four key aspects: cooperative reconnaissance capability, maneuver and penetration capability, target destruction capability, and command and communication capability.

Building upon the relative positions $d x$, $d y$, and $d z$ established in the preceding multi-UAV formation modeling, and with the formation structure in Figure 2 as the objective, the mission-oriented formation selection model is developed. The modeling function for this formation evaluation is defined as

f_{j} (\mathrm{UAV}_{fm})

(10)

where $j = 1, 2, 3, 4, 5$ corresponds to radar detection width, radar detection depth, maneuvering safety distance, weapon effective coverage area, and communication response time, respectively. Furthermore, the modeling equations have been normalized to remove dimensional units, which keeps the evaluation logically consistent and makes the applied weights comparable.

Radar detection width

Radar detection width refers to the maximum lateral distance at which a multi-UAV formation can detect a target. A greater detection width indicates that the radar system can cover a larger area, thus possessing a greater capability to detect and identify more targets. The detection volume of a radar is typically modeled as a conical region in three-dimensional space, where the maximum detection range is represented by the slant height of the cone $R_{r a d a r}$ , and the angular coverage is represented by the cone's apex angle $γ_{r a d a r}$ . The modeling of the radar detection width for a multi-UAV formation is illustrated in Figure 3. As shown, the maximum detection width of the formation is the sum of the diameters of the conical bases of each member's detection volume.

Figure 3.

Schematic diagram of radar detection width.

To maximize the detection width and avoid significant blind zones, the multi-UAV formation should adopt a more concentrated or compact pattern, ensuring that the distances between members are not too dispersed. Therefore, the evaluation model for radar detection width can be expressed as

f_{1} (\mathrm{UAV}_{fm}) = \begin{cases} 1 - e^{(d y_{i} - \sum_{i = 1}^{N} y_{i_{radar}})}, & d y_{i} \leq \sum_{i = 1}^{N} y_{i_{radar}} \\ 1, & d y_{i} > \sum_{i = 1}^{N} y_{i_{radar}} \end{cases}

(11)

where $i = 1, 2, \dots, N$ denotes the i-th UAV in the formation, N is the total number of UAVs in the formation, and $y_{i_{radar}}$ is the detection width of the i-th UAV. Based on the geometry of the cone, the detection width can be readily expressed as

y_{i_{radar}} = R_{radar} \sin (γ_{radar})

According to this radar detection width model, it is necessary to prevent the spacing between formation members from becoming so large that discontinuous blind zones appear in the detection coverage. From Equation (11), it follows that if the lateral distance between the leader and follower exceeds the sum of the individual members’ detection widths, the value of the radar detection width modeling equation is set to 1, according to the normalization method.

Radar detection depth

Radar detection depth signifies the maximum longitudinal distance at which the multi-UAV formation can detect a target. A greater detection depth implies that the radar system can cover a larger area, thus possessing a greater capability to detect and identify more targets. Following the same modeling approach as for the radar detection width, the evaluation model for radar detection depth can be expressed as

f_{2} (\mathrm{UAV}_{fm}) = \begin{cases} 1 - e^{(d x_{i} - \sum_{i = 1}^{N} x_{i_{radar}})}, & d x_{i} \leq \sum_{i = 1}^{N} x_{i_{radar}} \\ 1, & d x_{i} > \sum_{i = 1}^{N} x_{i_{radar}} \end{cases}

(12)

where $x_{i_{radar}}$ represents the detection depth of the i-th UAV, which can be expressed as

x_{i_{radar}} = R_{radar} \cos (γ_{radar})
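The two radar terms share the same piecewise form, so they can be sketched with one helper. This is a minimal Python sketch under stated assumptions: all UAVs carry identical radars, so the sum over members is N times the single-UAV value; the separations are taken as magnitudes; and all lengths are already normalized, as the paper states for its evaluation equations. The function names are illustrative.

```python
import math


def coverage_penalty(separation, total_coverage):
    """Shared piecewise form of Equations (11) and (12)."""
    if separation > total_coverage:
        return 1.0  # blind zone between members: maximum penalty
    return 1.0 - math.exp(separation - total_coverage)


def radar_width_depth(r_radar, gamma_radar):
    """Single-UAV detection width and depth from the cone geometry."""
    width = r_radar * math.sin(gamma_radar)
    depth = r_radar * math.cos(gamma_radar)
    return width, depth


def radar_terms(dx, dy, r_radar, gamma_radar, n_uav):
    """Return (f1, f2) for a leader-follower offset (dx, dy)."""
    width, depth = radar_width_depth(r_radar, gamma_radar)
    f1 = coverage_penalty(abs(dy), n_uav * width)  # detection width term
    f2 = coverage_penalty(abs(dx), n_uav * depth)  # detection depth term
    return f1, f2
```

Within the covered range the penalty falls to zero as the separation approaches the combined coverage, and it jumps to 1 once the separation exceeds it.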

In the process of modeling both radar detection width and depth, the slant height of the UAV's conical radar detection area is equivalent to the maximum radar detection range, $R_{r a d a r}$ . In practical applications, this maximum detection range is influenced by numerous factors, including the radar system's delay time, transmitted power, receiver sensitivity, and antenna gain. This relationship can be specifically expressed by the radar range equation:

R_{radar} = {[\frac{P_{t} G_{t}^{2} λ^{2} σ_{radar}}{{(4 π)}^{3} S_{radar}^{\min}}]}^{1 / 4}

(13)

where $R_{radar}$ is the maximum search radius of the radar, $P_{t}$ and $G_{t}$ are the radar's transmitted power and antenna gain, respectively, $σ_{radar}$ is the target's radar cross-section (RCS), $A_{e}$ is the effective area of the antenna (related to the gain by $G_{t} = 4 π A_{e} / λ^{2}$), $λ$ is the wavelength of the radar signal, and $S_{radar}^{\min}$ is the minimum detectable signal power.
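Equation (13) can be evaluated directly, as in the following Python sketch. The function and argument names are illustrative, and consistent SI units are assumed for all inputs.

```python
import math


def max_radar_range(p_t, g_t, wavelength, rcs, s_min):
    """Maximum radar search radius from Equation (13).

    p_t: transmitted power, g_t: antenna gain (linear, not dB),
    wavelength: signal wavelength, rcs: target radar cross-section,
    s_min: minimum detectable signal power.
    """
    numerator = p_t * g_t ** 2 * wavelength ** 2 * rcs
    denominator = (4.0 * math.pi) ** 3 * s_min
    return (numerator / denominator) ** 0.25
```

Because of the fourth root, the range is only weakly sensitive to each parameter: a 16-fold increase in target RCS doubles the range.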

While Equation (13) analyzes the impact of various radar system parameters on the maximum search radius, the formation evaluation model designed in this paper primarily focuses on analyzing the effect of this maximum search radius on the multi-UAV formation's geometry. Therefore, the evaluation models designed in Equation (12) and previously in Equation (11) represent a further stage of research that commences after the maximum radar search radius has already been determined from considerations of factors such as system delay time and the parameters in the radar range equation.

Maneuvering safety distance

In multi-UAV formation operations, the relative distance between members is critically important, as it directly affects whether their maneuvers interfere with one another. When a formation executes an obstacle avoidance task, it often requires agile, high-g maneuvering. Therefore, it is essential to ensure that the relative distance between any two members in the formation is greater than the UAV's minimum maneuvering radius. This guarantees that members do not impede each other during maneuvers, effectively preventing inter-agent collisions and ensuring flight safety. The evaluation model for maneuvering safety distance can be expressed as

f_{3} (\mathrm{UAV}_{fm}) = \begin{cases} \prod_{i = 1}^{N} \frac{R_{mane}}{\min (d_{i j})}, & \min (d_{i j}) \geq R_{mane} \\ 1, & \min (d_{i j}) < R_{mane} \end{cases}

(14)

where $\min (d_{i j})$ represents the minimum distance between any two UAVs in the multi-UAV formation, and $R_{mane}$ is the UAV's minimum maneuvering radius. The evaluation model for the maneuvering safety distance in Equation (14) aims to minimize the total penalty. When the minimum distance between formation members $\min (d_{i j})$ is less than the minimum maneuvering radius $R_{mane}$, the result of the equation is 1, representing the maximum penalty value. As the distance between formation members increases, the penalty value decreases in proportion to $R_{mane} / \min (d_{i j})$.
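One possible reading of Equation (14) is sketched below in Python. It assumes that, for each UAV i, $\min (d_{i j})$ is the distance to its nearest neighbor, and that the maximum penalty applies as soon as any pair is closer than $R_{mane}$. The function name and the position layout are illustrative, and at least two UAVs are assumed.

```python
import math
from itertools import combinations


def maneuver_safety(positions, r_mane):
    """Maneuvering safety term f3 for a list of (x, y, z) positions."""
    n = len(positions)
    nearest = [math.inf] * n  # nearest-neighbor distance of each UAV
    for i, j in combinations(range(n), 2):
        d = math.dist(positions[i], positions[j])
        nearest[i] = min(nearest[i], d)
        nearest[j] = min(nearest[j], d)
    if min(nearest) < r_mane:
        return 1.0  # some pair violates the maneuvering radius
    # Product over UAVs of R_mane / nearest-neighbor distance
    return math.prod(r_mane / d for d in nearest)
```

For three UAVs at (0, 0, 0), (10, 0, 0), and (0, 20, 0) with $R_{mane} = 5$, the nearest-neighbor distances are 10, 10, and 20, so the term is 0.5 × 0.5 × 0.25 = 0.0625.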

Effective coverage area

The methods employed by UAVs to neutralize targets are not limited to traditional munition delivery and missile interception. Low-cost, direct-impact kinetic strikes can also be used to inflict direct physical damage and achieve high kill effectiveness. Therefore, in a departure from considerations for traditional fighter aircraft, a multi-UAV formation in a cooperative strike mission must consider the total size of the area it can effectively cover, rather than focusing solely on weapon factors like missile range and attack precision. A larger coverage area corresponds to a stronger capability to destroy targets. By evaluating this effective kill area, the attack capability of the multi-UAV formation and its potential for target destruction can be quantified.

Therefore, the evaluation model for the weapon's effective coverage area can be expressed as

f_{4} (\mathrm{UAV}_{fm}) = 1 - \frac{S}{S_{\max}}

(15)

where S represents the actual coverage area of the multi-UAV formation, and $S_{\max}$ represents the maximum possible area for the same formation geometry. When the actual coverage area is equal to the maximum area, the effective kill range is maximized.

Communication response time

Communication response time is a critical factor for the command and communication capability of a multi-UAV formation. A long response time leads to communication latency, which can cause a lack of information synchronization between the leader and follower aircraft. This asynchrony may result in the formation losing control or becoming unstable, and could even lead to collisions between UAVs.

For a given multi-UAV formation, the relationship between its command and communication capability and its geometric parameters primarily depends on the communication distance between the leader and the farthest follower. The shorter this distance, the shorter the response time and the stronger the communication link for the followers tracking the leader. The evaluation model for communication response time can be expressed as

f_{5} (\mathrm{UAV}_{fm}) = \frac{\max (d_{i})}{d_{link}}

(16)

where $d_{i} = \sqrt{d x_{i}^{2} + d y_{i}^{2} + d z_{i}^{2}}$ represents the distance between the i-th follower UAV and the leader, and $d_{link}$ represents the maximum communication distance of the multi-UAV formation's communication system.
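The coverage and communication terms of Equations (15) and (16) reduce to simple ratios, as the following Python sketch shows. The names are illustrative, and how the coverage area S is computed from the formation geometry is left outside the sketch because the paper does not specify it here.

```python
import math


def coverage_term(area, area_max):
    """Effective coverage term f4 = 1 - S / S_max, Equation (15)."""
    return 1.0 - area / area_max


def comm_term(offsets, d_link):
    """Communication term f5, Equation (16).

    offsets: list of follower offsets (dx, dy, dz) relative to the leader.
    d_link: maximum communication distance.
    """
    farthest = max(math.hypot(dx, dy, dz) for dx, dy, dz in offsets)
    return farthest / d_link
```

Both terms are penalties: full coverage gives $f_{4} = 0$, and a farthest follower at half the link range gives $f_{5} = 0.5$.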

Formation design based on the SAPSO optimization algorithm

The SAPSO algorithm employed in this paper is an adaptive optimization method based on Particle Swarm Optimization (PSO) and Simulated Annealing (SA), designed to address the inherent problems of using either PSO or SA alone. Traditional PSO algorithms are prone to premature convergence, often resulting in a local optimum. Conversely, the precision and speed of the SA algorithm are highly dependent on the initial temperature setting and the cooling schedule, which can make the optimization slow. To overcome these issues, this paper utilizes the SAPSO algorithm, which combines PSO and SA.

The objective of the SAPSO algorithm is to leverage the global search capabilities of PSO and the local search capabilities of SA, allowing them to complement each other to enhance both the accuracy and speed of the optimization. In this manner, the SAPSO algorithm overcomes the respective limitations of standalone PSO and SA, offering a more effective solution for optimization problems.

A critical step in the SAPSO algorithm is the use of the Metropolis acceptance criterion to determine whether to accept the iterative solution for each particle in each generation. This criterion governs the probability of accepting a new solution, thereby facilitating a more effective search of the global solution space. When a new solution is superior to the current one, it is always accepted. When a new solution is inferior, it still has a certain probability of being accepted according to the Metropolis criterion, which helps the algorithm avoid becoming trapped in a local optimum. If a new solution is not accepted, the particle is iterated again until the algorithm terminates, yielding the optimal solution.
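The acceptance rule described above can be sketched in Python as follows. The paper does not give the formula explicitly, so the standard Metropolis form $e^{- Δ J / t}$ for a minimization problem is assumed here, with a positive temperature; the function name and the `rng` argument are illustrative.

```python
import math
import random


def metropolis_accept(j_new, j_current, temperature, rng=random):
    """Decide whether to accept a candidate solution (minimization).

    A better or equal candidate is always accepted. A worse one is
    accepted with probability exp(-(j_new - j_current) / temperature).
    """
    if j_new <= j_current:
        return True
    accept_probability = math.exp(-(j_new - j_current) / temperature)
    return rng.random() < accept_probability
```

At high temperature almost any candidate is accepted, which favors exploration; as the temperature is lowered, the rule approaches a strict greedy acceptance.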

Additionally, a constriction factor is introduced in the SAPSO algorithm. Its function is to eliminate the need for boundary limits on velocity (velocity clamping), permitting particles a greater range of movement during the search process. This helps to augment the exploration of the search space and improves the algorithm's global search capability.

To ensure the algorithm's convergence, appropriate parameters must be selected, including the acceptance probability within the Metropolis criterion and the degree of influence of the constriction factor. Through rational parameter selection, a balance between the accuracy and speed of the SAPSO algorithm can be achieved during the search process.

First, the update equations for the position and velocity of the PSO algorithm incorporating a constriction factor are introduced as follows:

v_{i, j} (k + 1) = χ [v_{i, j} (k) + c_{1} r_{1} (p_{i, j} (k) - x_{i, j} (k)) + c_{2} r_{2} (p_{g, j} (k) - x_{i, j} (k))]

(17)

x_{i, j} (k + 1) = x_{i, j} (k) + v_{i, j} (k + 1)

(18)

where $x_{i, j}$ and $v_{i, j}$ represent the position and velocity of a particle, respectively; $p_{i, j}$ and $p_{g, j}$ are the personal best position and the global best position; $c_{1}$ and $c_{2}$ are the acceleration constants; $r_{1}$ and $r_{2}$ are random numbers within a specified interval; and k denotes the current iteration number. The term χ is the constriction factor, which can be expressed as

χ = 2 / (| 2 - C - \sqrt{C^{2} - 4 C} |)

(19)

where $C = c_{1} + c_{2}$ and $C > 4$.
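The constriction-factor update of Equations (17)-(19) can be sketched in Python as below. The defaults $c_{1} = c_{2} = 2.05$ are a common choice in the PSO literature, not values taken from this paper, and the random numbers are assumed to be uniform on [0, 1). The function names are illustrative.

```python
import math
import random


def constriction_factor(c1, c2):
    """Constriction factor chi from Equation (19); requires c1 + c2 > 4."""
    c = c1 + c2
    if c <= 4.0:
        raise ValueError("c1 + c2 must be greater than 4")
    return 2.0 / abs(2.0 - c - math.sqrt(c * c - 4.0 * c))


def pso_step(x, v, p_best, g_best, c1=2.05, c2=2.05, rng=random):
    """One particle update following Equations (17) and (18).

    x, v, p_best, g_best are equal-length lists (one entry per dimension).
    Returns the new position and velocity lists.
    """
    chi = constriction_factor(c1, c2)
    new_x, new_v = [], []
    for xj, vj, pj, gj in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()
        vj_next = chi * (vj + c1 * r1 * (pj - xj) + c2 * r2 * (gj - xj))
        new_v.append(vj_next)          # Equation (17)
        new_x.append(xj + vj_next)     # Equation (18)
    return new_x, new_v
```

With $c_{1} = c_{2} = 2.05$ the factor is approximately 0.7298, so each step contracts the velocity and no explicit velocity clamping is applied.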

As can be seen in the velocity update Equation (17), the standard algorithm uses the global best position, $p_{g, j}$ . If this position does not correspond to the true global optimum, it can cause all particles to converge prematurely toward this local optimum, thereby limiting the algorithm's global search capability. To resolve this issue of local convergence, an SA-based mechanism is used to select a position $p_{i, j}^{'}$ from among the multiple personal best positions, $p_{i, j}$ . This selected position is then used to replace $p_{g, j}$ in Equation (17). This means that instead of exclusively using the best position found by the swarm as a reference point, a more broadly selected position from the entire population is used as a reference. This increases search diversity, enables particles to better explore the search space, and enhances the global search capability of the algorithm.

Therefore, the velocity update Equation (17) can be transformed to

v_{i, j} (k + 1) = χ [v_{i, j} (k) + c_{1} r_{1} (p_{i, j} (k) - x_{i, j} (k)) + c_{2} r_{2} (p_{i, j}^{'} (k) - x_{i, j} (k))]

(20)

By incorporating the SA algorithm, the personal best position ( $p_{i, j}$ ) is treated as a special solution that is potentially inferior to the global best position ( $p_{g, j}$ ) to ensure that $p_{i, j}$ values with good performance have a higher probability of being selected. The transition probability, $P b$ , of a personal best position, $p_{i, j}$ , relative to the global best, $p_{g, j}$ , is defined with respect to an annealing temperature as

P b = \frac{e^{- (J_{p_{i}} - J_{p_{g}}) / t}}{\sum_{j = 1}^{P} e^{- (J_{p_{j}} - J_{p_{g}}) / t}}

(21)

where J is the objective function of the optimization algorithm; $J_{p_{i}}$ and $J_{p_{g}}$ represent the values of the objective function for the personal best position $p_{i, j}$ and the global best position $p_{g, j}$, respectively; t is the annealing temperature; and P is the population size. This transition probability is then treated as the fitness value of the individual solution $p_{i, j}$, allowing it to be selected with a certain probability to replace the global best solution, $p_{g, j}$.

Finally, the selection of $p_{i, j}^{'}$ is performed via roulette wheel selection from the population of $p_{i, j}$ that has been evaluated using the SA-based criterion described above. By replacing the global best position in this manner, the algorithm overcomes the tendency of the standard PSO algorithm to become trapped in a local optimum.
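The selection of the replacement reference position can be sketched in Python as follows. The sketch assumes a minimization problem in which the global best is the best of the personal bests, and a positive annealing temperature; the function names and the `rng` argument are illustrative.

```python
import math
import random


def transition_probabilities(j_pbest, temperature):
    """Transition probability of each personal best, Equation (21).

    j_pbest: objective values of the personal best positions.
    """
    j_gbest = min(j_pbest)  # assumption: global best = best personal best
    weights = [math.exp(-(j - j_gbest) / temperature) for j in j_pbest]
    total = sum(weights)
    return [w / total for w in weights]


def roulette_select(probabilities, rng=random):
    """Index chosen by roulette wheel selection."""
    u = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if u < cumulative:
            return index
    return len(probabilities) - 1  # guard against rounding at the end
```

Personal bests with lower objective values receive larger slices of the wheel, and lowering the temperature concentrates the probability on the best ones.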

Having detailed the design of the SAPSO algorithm, the establishment of the optimization algorithm's objective function will now be specified. The design and performance of an optimization algorithm are directly influenced by the characteristics and requirements of its objective function, making the objective function a core component of the algorithm. A well-designed objective function can provide effective guidance for the optimization, enabling the algorithm to converge more rapidly to the optimal solution.

The optimization objective of this paper is to design a multi-UAV formation that maximizes operational effectiveness. Assuming a multi-UAV formation composed of N aircraft, and based on the relative positions $d x$ , $d y$ , and $d z$ from the modeling in Equations (7)-(9), the optimization parameters for the formation design are defined as

X = [\begin{matrix} U A V_{1} & U A V_{2} & \dots & U A V_{N} \end{matrix}] = [\begin{matrix} d x_{1} & d y_{1} & d z_{1} & d x_{2} & d y_{2} & d z_{2} & \dots & d x_{N} & d y_{N} & d z_{N} \end{matrix}]

(22)

where $d x_{i}$, $d y_{i}$, and $d z_{i}$ are the relative longitudinal distance, lateral distance, and altitude between the leader and the i-th follower UAV in the formation, respectively. It is also assumed that $d x_{i}$, $d y_{i}$, and $d z_{i}$ are all measurable quantities.

Using the optimization parameters defined above, and in conjunction with the mission-oriented performance constrained formation model proposed in Section 3.1, the objective function for the formation optimization algorithm is established as

J = \sum_{j = 1}^{5} ω_{j} f_{j} (U A V_{f m})

(23)

where $f_{j} (U A V_{f m})$, $(j = 1, 2, \dots, 5)$, are the formation evaluation modeling Equations (11)-(16), and $ω_{j}$ represents the weighting factor for the j-th evaluation equation.

To conclude the discussion of the SAPSO optimization algorithm design, Figure 4 provides a visual depiction of the SAPSO algorithm's workflow, summarizing the entire process.

Figure 4.

Flowchart of the SAPSO algorithm.

Multi-UAV cooperative path planning algorithm based on the RPM

Cooperative path planning for multiple UAVs must account for a variety of factors, including performance constraints, synchronous arrival, obstacle avoidance, and performance optimization. Traditional path planning methods are often unable to find optimal solutions due to the problem's complexity and computational inefficiency. The RPM, however, is a numerical technique designed for solving nonlinear dynamic systems that can effectively address the nonlinear, multi-constraint, and multi-objective optimization challenges inherent in path planning.

This method is well-suited for the demands of multi-UAV cooperative path planning, such as handling dynamic equation constraints, path planning with obstacle avoidance, and ensuring simultaneous arrival, which enhances the adaptability of the path planning to various scenarios. Furthermore, the RPM utilizes high-order polynomials to approximate the path curves, yielding smoother and more continuous flight trajectories that are better aligned with the mission requirements of UAVs.

The fundamental concept of the RPM is to discretize the state and control variables at a set of Legendre-Gauss-Radau (LGR) collocation points. These discrete points are then used as nodes to construct Lagrange interpolation polynomials that approximate the state and control trajectories. By differentiating the global interpolation polynomial for the state, the time derivative of the state is approximated, thereby transforming the system's differential equations into algebraic constraints.

Concurrently, integral terms within the performance index or control effort are calculated using Legendre-Gauss-Radau quadrature at the same collocation points, and the terminal state is determined through the integration of the system dynamics from the initial state.

Through this transformation, the multi-UAV cooperative path planning problem is converted into a parameter optimization problem subject to a series of algebraic constraints, which can be solved as a Nonlinear Programming (NLP) problem.

Modeling for multi-UAV cooperative path planning

The modeling of multi-UAV cooperative path planning requires consideration of the dynamic characteristics, performance constraints, no-fly zone constraints, optimization objectives, and cooperative relationships for each member of the formation. These components are specified as follows:

Dynamic Equations Constraints

The dynamic characteristics of the multi-UAV formation are described by the dynamic model presented previously in Equations (1)-(6). This ensures that the generated trajectories are dynamically feasible and continuously adhere to the inherent flight dynamics of the UAVs. By manipulating inputs such as the aerodynamic angles, which include the angle of attack and sideslip angle, and the engine thrust, the changes in each UAV's position, velocity, and orientation in three-dimensional space are controlled.

Performance constraints

To account for the structural integrity and performance stability of the UAV, its performance constraints typically include load factor constraints and control input constraints.

The structural strength of a UAV directly impacts its flight safety and durability, particularly during agile, high-g maneuvers which cause an increase in the load factor. If the aircraft's structural strength is insufficient to withstand these high loads, it can lead to structural damage, fracture, or failure, potentially resulting in accidents. Therefore, including a load factor constraint is a critical element for ensuring flight safety. The constraint on the load factor n can be expressed as:

n_{i} = \frac{\sqrt{L i f t_{i}^{2} + D r a g_{i}^{2}}}{M_{i} g} \leq n_{max}

(24)

where $n_{max}$ represents the maximum load factor that the UAV can withstand.
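As a minimal sketch of how Equation (24) can be evaluated at a single trajectory point, the following snippet computes the load factor from the lift, drag, and mass and compares it with $n_{max}$. The gravitational acceleration value, the function names, and the numerical inputs are assumptions made for illustration only.

```python
import math

G = 9.81  # m/s^2, assumed value of gravitational acceleration


def load_factor(lift, drag, mass, g=G):
    """Eq. (24): resultant aerodynamic force divided by the weight."""
    return math.hypot(lift, drag) / (mass * g)


def satisfies_load_limit(lift, drag, mass, n_max):
    """True if the load factor does not exceed the allowed maximum."""
    return load_factor(lift, drag, mass) <= n_max


# Illustrative values (not from the paper): forces in N, mass in kg.
n_cruise = load_factor(lift=30000.0, drag=4000.0, mass=1000.0)
ok_cruise = satisfies_load_limit(30000.0, 4000.0, 1000.0, n_max=5)
ok_pullup = satisfies_load_limit(60000.0, 4000.0, 1000.0, n_max=5)
```

In the transcribed problem this check would be imposed as an inequality at every collocation point rather than evaluated after the fact.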

The UAV's angle of attack, sideslip angle, and flight path bank angle serve as the control command inputs for trajectory planning and directly influence the aircraft's stability. The angle of attack command affects the UAV's lift and drag; an excessive angle of attack can lead to an aerodynamic stall. Similarly, changes in the sideslip angle impact the UAV's lateral stability, and an excessive sideslip angle can cause a lateral loss of control. The flight path bank angle command input ensures stable turns and good maneuverability. Therefore, the constraints on these control inputs and their rates of change can be expressed as:

α_{min} \leq α_{i} \leq α_{max}, β_{min} \leq β_{i} \leq β_{max}, μ_{min} \leq μ_{i} \leq μ_{max}

(25)

{\dot{α}}_{min} \leq {\dot{α}}_{i} \leq {\dot{α}}_{max}, {\dot{β}}_{min} \leq {\dot{β}}_{i} \leq {\dot{β}}_{max}, {\dot{μ}}_{min} \leq {\dot{μ}}_{i} \leq {\dot{μ}}_{max}

(26)

where $α_{min / max}$, $β_{min / max}$, and $μ_{min / max}$ represent the minimum and maximum allowable values for the angle of attack, sideslip angle, and flight path bank angle, respectively. Correspondingly, ${\dot{α}}_{min / max}$, ${\dot{β}}_{min / max}$, and ${\dot{μ}}_{min / max}$ represent the minimum and maximum rate limits for the change in the angle of attack, sideslip angle, and flight path bank angle.

No-fly zone constraints

In complex operational environments, it is imperative to fully consider constraints imposed by various no-fly zones to ensure the safety of the UAVs and the success rate of the mission. These primarily include constraints related to physical obstacles and enemy radar and artillery threats.

Physical flight obstacles are typically modeled as infinitely tall cylinders. The constraint for avoiding such an obstacle can be expressed as:

(x_{i} - x_{o b})^{2} + (y_{i} - y_{o b})^{2} > R_{o b}^{2}

(27)

where the subscript i denotes the i-th UAV in the multi-UAV formation, ($x_{o b}$, $y_{o b}$) are the coordinates of the center of the cylindrical obstacle's base, and $R_{o b}$ is the radius of the cylinder.

Radar and artillery systems constitute primary threats from enemy weaponry. Specifically, enemy radar systems can detect friendly UAVs, while artillery systems can engage them. Therefore, it is necessary to consider these threats comprehensively to ensure the UAVs can effectively evade enemy detection and attack. These radar and artillery threats are modeled as hemispherical zones, and the avoidance constraint can be expressed as:

(x_{i} - x_{e t})^{2} + (y_{i} - y_{e t})^{2} + (z_{i} - z_{e t})^{2} > R_{e t}^{2}

(28)

where ($x_{e t}$, $y_{e t}$, $z_{e t}$) are the coordinates of the center of the threat hemisphere's base, which is typically ground-based, and $R_{e t}$ is the radius of this threat zone.
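The two no-fly zone conditions in Equations (27) and (28) can be checked point by point, as in the following sketch. The zone parameters mirror the values reported in the simulation tables of this paper; the function names and the test points are hypothetical, and Equation (28) is implemented exactly as written, i.e., as a distance test against the zone center.

```python
def outside_cylinder(p, center_xy, radius):
    """Eq. (27): True if p = (x, y, z) lies strictly outside the infinite cylinder."""
    dx = p[0] - center_xy[0]
    dy = p[1] - center_xy[1]
    return dx * dx + dy * dy > radius ** 2


def outside_threat(p, center, radius):
    """Eq. (28): True if p lies strictly outside the radar/artillery threat zone."""
    return sum((a - b) ** 2 for a, b in zip(p, center)) > radius ** 2


# Zone parameters in meters, taken from the simulation scenario.
OBSTACLE = {"center_xy": (1600.0, 2100.0), "radius": 600.0}
THREAT = {"center": (3000.0, 3000.0, 3200.0), "radius": 1000.0}


def is_admissible(p):
    """True if the point violates neither no-fly zone."""
    return outside_cylinder(p, **OBSTACLE) and outside_threat(p, **THREAT)


start_ok = is_admissible((0.0, 0.0, 4000.0))            # leader start point
target_ok = is_admissible((4000.0, 4000.0, 4200.0))      # leader target point
over_obstacle = is_admissible((1600.0, 2100.0, 4000.0))  # above the cylinder axis
near_threat = is_admissible((3000.0, 3000.0, 3500.0))    # 300 m above the threat center
```

Because the cylinder is modeled as infinitely tall, the altitude of a point plays no role in the first test.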

Optimization objective

In trajectory planning, the selection of an optimization objective depends on the specific mission requirements and performance indices. Common objectives include minimizing path length, minimizing flight time, and minimizing control effort. Considering the advantages of minimizing control effort in cooperative multi-UAV flight, such as extending flight endurance and reducing operational costs, this study adopts control effort minimization as the primary optimization objective.

Therefore, the objective function is formulated as a quadratic form with respect to the control variable u, and can be expressed as:

J = \int_{t_{0}}^{t_{f}} u_{t p_{i}}^{T} (t) u_{t p_{i}} (t) d t

(29)

where $u_{t p_{i}}$ represents the input vector in the optimization, defined as $u_{t p_{i}} = [\begin{matrix} α_{i} & β_{i} & μ_{i} \end{matrix}]$. The components $α_{i}$, $β_{i}$, and $μ_{i}$ are the UAV's angle of attack, sideslip angle, and flight path bank angle, respectively. The terms $t_{0}$ and $t_{f}$ represent the initial and terminal times of the path planning horizon.

Cooperative constraints

In multi-UAV path planning, cooperative constraints typically include safety distance constraints, formation-keeping constraints, and synchronous arrival requirements. Unlike single-vehicle flight, a primary concern in cooperative multi-UAV path planning is maintaining a safe separation distance between vehicles to ensure flight safety. To address this, a safety distance constraint is formulated to guarantee that the distance between any two UAVs remains greater than a prescribed minimum safety threshold throughout the trajectory. This constraint effectively ensures collision-free cooperative flight. The multi-UAV safety distance constraint can be expressed as:

‖ p_{i} (t) - p_{j} (t) ‖_{2} \geq l_{min}, (i \neq j, \forall t \in [t_{0}, t_{f}])

(30)

where the subscripts i and j denote any two distinct UAVs in the formation, $p_{i} = [\begin{matrix} x_{i} & y_{i} & z_{i} \end{matrix}]$ is the position vector of the i-th UAV, and $l_{min}$ represents the minimum safe flight distance between UAVs.
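Equation (30) can be illustrated at a single time instant by computing the smallest pairwise distance in the formation. The sketch below uses the initial positions of the five-UAV scenario considered in this paper and the threshold $l_{min} = 30$ m; the function name is hypothetical, and in the transcribed problem the same test would be imposed at every collocation point.

```python
from itertools import combinations

import numpy as np


def min_pairwise_distance(positions):
    """Smallest Euclidean distance between any two UAV positions (rows)."""
    positions = np.asarray(positions, dtype=float)
    return min(
        float(np.linalg.norm(positions[i] - positions[j]))
        for i, j in combinations(range(len(positions)), 2)
    )


# Initial positions of UAV1..UAV5 in meters (x, y, z).
initial_positions = [
    (0.0, 0.0, 4000.0),
    (0.0, -100.0, 3900.0),
    (0.0, -200.0, 3800.0),
    (-100.0, 0.0, 3850.0),
    (-200.0, 0.0, 3950.0),
]

L_MIN = 30.0  # minimum safe separation in meters
d_min = min_pairwise_distance(initial_positions)
is_safe = d_min >= L_MIN
```

For these initial positions the closest pairs are separated by about 141.4 m, well above the threshold.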

In cooperative multi-UAV trajectory planning, the desired formation selected based on mission requirements ensures both efficiency and safety. The design of this formation, based on mission-oriented formation selection, was completed in Section 3. Therefore, the path planning will be based on this optimal geometry, and a formation keeping constraint is imposed to ensure the stability and accuracy of the formation throughout the flight. Based on the previously established formation model, the formation keeping constraint can be expressed as:

x_{i} = x_{L} + d x_{i}^{o p t} \cos χ_{L} + d y_{i}^{o p t} \sin χ_{L}

(31)

y_{i} = y_{L} + d x_{i}^{o p t} \sin χ_{L} - d y_{i}^{o p t} \cos χ_{L}

(32)

z_{i} = z_{L} + d z_{i}^{o p t}

(33)

where the subscript L denotes the leader aircraft, and $d x_{i}^{o p t}$ , $d y_{i}^{o p t}$ , and $d z_{i}^{o p t}$ represent the desired relative positions of the i-th UAV with respect to the leader in the optimized formation.
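Equations (31)-(33) map the leader's position and heading to the desired position of each follower. A minimal sketch is given below; the function name is hypothetical, the heading angle is assumed to be in radians, and the altitude offset is set to zero for illustration because only the horizontal offsets are taken from the optimized formation.

```python
import math


def follower_position(leader_pos, chi_L, offset):
    """Eqs. (31)-(33): desired follower position from leader state and offsets."""
    x_L, y_L, z_L = leader_pos
    dx, dy, dz = offset
    x = x_L + dx * math.cos(chi_L) + dy * math.sin(chi_L)
    y = y_L + dx * math.sin(chi_L) - dy * math.cos(chi_L)
    z = z_L + dz
    return (x, y, z)


leader = (0.0, 0.0, 4000.0)
offset_uav2 = (-98.4, -95.3, 0.0)  # dz = 0 is an assumption for this example

pos_heading_0 = follower_position(leader, 0.0, offset_uav2)
pos_heading_90 = follower_position(leader, math.pi / 2, offset_uav2)
```

With a heading of zero the follower sits 98.4 m behind the leader along the x-axis; rotating the leader's heading by 90 degrees rotates the offsets with it, so the formation geometry is preserved relative to the leader's direction of flight.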

Synchronous arrival requires that all UAVs reach their designated target points at the same time, which is critical for mission coordination and maximizing operational effectiveness. To achieve this, the cooperative strategy begins with a time prediction step, in which the feasible arrival times of each UAV at its target position are estimated. Through optimization, the minimum and maximum feasible flight times of the i-th UAV, denoted as $t_{f i, min}$ and $t_{f i, max}$ respectively, are determined.

Following this, a time coordination step uses these individual feasible intervals $[t_{f i, min}, t_{f i, max}]$ to establish a common time window, $[T_{f, min}, T_{f, max}]$ , for the entire formation, where:

\begin{aligned} T_{f, min} &= \max \{t_{f 1, min}, \dots, t_{f N, min}\} \\ T_{f, max} &= \min \{t_{f 1, max}, \dots, t_{f N, max}\} \end{aligned}

(34)

Finally, in the cooperative trajectory optimization phase, a single terminal time, $t_{f}$ , is selected from within this coordinated time window $[T_{f, min}, T_{f, max}]$ and is applied as a terminal constraint to the path planning problem, thereby ensuring the synchronous arrival of all UAVs.
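The time coordination step in Equation (34) amounts to intersecting the individual feasible intervals. A minimal sketch follows; the function name and the interval values are hypothetical, and the explicit error for an empty intersection is an added safeguard that is not described in the text.

```python
def common_time_window(intervals):
    """Eq. (34): intersect per-UAV feasible arrival-time intervals."""
    T_min = max(lo for lo, _ in intervals)
    T_max = min(hi for _, hi in intervals)
    if T_min > T_max:
        # Assumption: an empty window means no synchronous arrival time exists.
        raise ValueError("feasible arrival-time intervals do not overlap")
    return T_min, T_max


# Illustrative (t_f_min, t_f_max) pairs in seconds for three UAVs.
intervals = [(30.0, 40.0), (32.5, 45.0), (35.8, 50.0)]
window = common_time_window(intervals)
```

Any terminal time chosen inside the returned window is reachable by every UAV, which is what allows a single $t_{f}$ to be imposed on the whole formation.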

Next, building upon the established model for multi-UAV cooperative path planning, a study of the trajectory optimization method using the RPM will be conducted.

Optimal control discretization via the RPM

The RPM employs LGR collocation points to discretize the optimal control problem into an NLP problem. For multi-UAV cooperative path planning, the problem can thus be reformulated as an optimal control problem with multiple constraints. Accordingly, the procedure for transforming the path planning problem into an NLP is as follows.

First, collocation points are selected. In RPM, optimization is performed using LGR points. Suppose that Z collocation points are chosen; then, the LGR points are defined as the roots of the polynomial $P_{Z - 1} (ϖ) + P_{Z} (ϖ)$ , where $P_{Z} (ϖ)$ denotes the Legendre polynomial of degree Z, which can be expressed as:

P_{Z} (ϖ) = \frac{1}{2^{Z} Z!} \frac{d^{Z}}{d ϖ^{Z}} [(ϖ^{2} - 1)^{Z}]

(35)
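The LGR points can be obtained numerically as the roots of $P_{Z - 1} (ϖ) + P_{Z} (ϖ)$. The sketch below does this with NumPy's Legendre-series root finder; the function name `lgr_points` is hypothetical, and the approach is only one of several ways to compute these nodes.

```python
import numpy as np
from numpy.polynomial import legendre


def lgr_points(Z):
    """Roots of P_{Z-1} + P_Z, i.e., the Z Legendre-Gauss-Radau points."""
    if Z < 1:
        raise ValueError("Z must be at least 1")
    coeffs = np.zeros(Z + 1)  # coefficients in the Legendre basis
    coeffs[Z - 1] = 1.0
    coeffs[Z] = 1.0
    return np.sort(np.real(legendre.legroots(coeffs)))


points_2 = lgr_points(2)
points_3 = lgr_points(3)
```

The first point always equals -1 and the last point is strictly less than 1, which is why the terminal time has to be handled separately in the discretization.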

According to the flight conditions in trajectory optimization, let $t_{0}$ and $t_{f}$ denote the initial and terminal times, respectively. The trajectory optimization is typically defined over the time interval $[t_{0}, t_{f}]$, while the LGR points of the RPM lie in the half-open interval $[- 1, 1)$. Therefore, it is necessary to transform the trajectory optimization time domain into the pseudospectral time domain. Defining $ϖ$ as the mapped time from the physical interval $[t_{0}, t_{f}]$ to $[- 1, 1]$, the transformation is given by:

ϖ = \frac{2 t}{t_{f} - t_{0}} - \frac{t_{f} + t_{0}}{t_{f} - t_{0}}

(36)

As described above, the path planning time interval is $[t_{0}, t_{f}]$. According to Equation (36), it is transformed into the standard pseudospectral interval $[- 1, 1]$, whereas the LGR points of the RPM are defined over $[- 1, 1)$ and therefore exclude the terminal time. To ensure consistency of the time intervals, an additional node is introduced at the terminal time $ϖ = 1$. Consequently, the collocation point set K in RPM consists of the Z LGR points together with the terminal time point, yielding $K = Z + 1$. After selecting the points, the state variables, control variables, and various constraints from the cooperative path planning problem must be approximated and discretized.
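The affine time mapping of Equation (36) and its inverse can be written in a few lines. In the sketch below the function names are hypothetical and the horizon of 35.8 s is used only as an illustrative value.

```python
import math


def to_pseudospectral_time(t, t0, tf):
    """Eq. (36): map physical time t in [t0, tf] to the interval [-1, 1]."""
    return 2.0 * t / (tf - t0) - (tf + t0) / (tf - t0)


def to_physical_time(tau, t0, tf):
    """Inverse of Eq. (36): map tau in [-1, 1] back to [t0, tf]."""
    return 0.5 * (tf - t0) * tau + 0.5 * (tf + t0)


t0, tf = 0.0, 35.8
tau_start = to_pseudospectral_time(t0, t0, tf)
tau_mid = to_pseudospectral_time(17.9, t0, tf)
tau_end = to_pseudospectral_time(tf, t0, tf)
t_back = to_physical_time(tau_mid, t0, tf)
```

The factor $0.5 (t_{f} - t_{0})$ that appears when differentiating the inverse map is the same scaling that multiplies the dynamics and the integral cost after discretization.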

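In RPM, the state variable $x_{P} (ϖ)$ and the control variable $u_{P} (ϖ)$ within each collocation interval are approximated by Lagrange interpolating polynomials built on K and $K - 1$ nodes, respectively, and therefore of degree at most $K - 1$ and $K - 2$. Specifically, they can be expressed as: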

x_{P} (ϖ) \approx X_{P} (ϖ) = \sum_{i = 1}^{K} L_{i} (ϖ) X_{P} (ϖ_{i})

(37)

u_{P} (ϖ) \approx U_{P} (ϖ) = \sum_{i = 1}^{K - 1} {\hat{L}}_{i} (ϖ) U_{P} (ϖ_{i})

(38)

where $X_{P} (ϖ)$ and $U_{P} (ϖ)$ are the polynomial approximations of the state and control variables, respectively. $L_{i} (ϖ)$ and ${\hat{L}}_{i} (ϖ)$ are the corresponding Lagrange basis polynomials for each node. The general form for a Lagrange polynomial is

L_{i} (ϖ) = \prod_{j = 1, j \neq i}^{K} \frac{ϖ - ϖ_{j}}{ϖ_{i} - ϖ_{j}}, {\hat{L}}_{i} (ϖ) = \prod_{j = 1, j \neq i}^{K - 1} \frac{ϖ - ϖ_{j}}{ϖ_{i} - ϖ_{j}}

(39)

It should be noted that the basis polynomials $L_{i} (ϖ)$ for the state approximation in Equation (37) are constructed using the full set of $K = Z + 1$ points which consists of the LGR points plus the terminal point. In contrast, the basis polynomials ${\hat{L}}_{i} (ϖ)$ for the control approximation in Equation (38) are constructed using only the Z LGR points. This is because the control variable is not typically defined at the final point of the interval in this formulation. Consequently, the number of collocation points for the state and control variables are different, being K and $K - 1$ , respectively.

By differentiating the state approximation with respect to $ϖ$ within the collocation interval, we obtain the derivative of the state at the collocation points,

{\dot{x}}_{P} (ϖ_{z}) \approx {\dot{X}}_{P} (ϖ_{z}) = \sum_{i = 1}^{K} {\dot{L}}_{i} (ϖ_{z}) X_{P} (ϖ_{i}) = \sum_{i = 1}^{K} D_{z, i} X_{P} (ϖ_{i})

(40)

where the differentiation matrix $D_{z, i} \in R^{Z \times (Z + 1)}$ is expressed as the derivative of the Lagrange basis polynomial evaluated at the collocation point $ϖ_{z}$:

D_{z, i} = {\dot{L}}_{i} (ϖ_{z}) = \begin{cases} \frac{(1 + ϖ_{i}) {\ddot{P}}_{Z} (ϖ_{i}) + 2 {\dot{P}}_{Z} (ϖ_{i})}{2 [(1 + ϖ_{i}) {\dot{P}}_{Z} (ϖ_{i}) + P_{Z} (ϖ_{i})]}, & i = z \\ \frac{(1 + ϖ_{z}) {\dot{P}}_{Z} (ϖ_{z}) + P_{Z} (ϖ_{z})}{(ϖ_{z} - ϖ_{i}) [(1 + ϖ_{i}) {\dot{P}}_{Z} (ϖ_{i}) + P_{Z} (ϖ_{i})]}, & i \neq z \end{cases}

(41)
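As a numerical cross-check of the role of the differentiation matrix, the entries $D_{z, i} = {\dot{L}}_{i} (ϖ_{z})$ can also be computed directly by differentiating the Lagrange basis of Equation (39) with the product rule, instead of using the closed form above. The following sketch does this for $Z = 3$; the function name is hypothetical, and the direct triple loop is chosen for clarity rather than efficiency.

```python
import numpy as np


def lagrange_diff_matrix(nodes, eval_points):
    """D[z, i] = dL_i/dtau at eval_points[z], with L_i the Lagrange basis on nodes."""
    nodes = np.asarray(nodes, dtype=float)
    eval_points = np.asarray(eval_points, dtype=float)
    K = nodes.size
    D = np.zeros((eval_points.size, K))
    for z, tau in enumerate(eval_points):
        for i in range(K):
            total = 0.0
            for m in range(K):
                if m == i:
                    continue
                # Product rule: differentiate the m-th factor, keep the others.
                term = 1.0 / (nodes[i] - nodes[m])
                for j in range(K):
                    if j != i and j != m:
                        term *= (tau - nodes[j]) / (nodes[i] - nodes[j])
                total += term
            D[z, i] = total
    return D


# Z = 3 LGR points (roots of P_2 + P_3) plus the terminal node at +1.
s6 = np.sqrt(6.0)
lgr = np.array([-1.0, (1.0 - s6) / 5.0, (1.0 + s6) / 5.0])
nodes = np.append(lgr, 1.0)

D = lagrange_diff_matrix(nodes, lgr)  # shape (Z, Z + 1)

# Differentiating x(tau) = tau^3, sampled at the nodes, is exact for this degree.
dx_approx = D @ nodes ** 3
dx_exact = 3.0 * lgr ** 2
```

Each row of the matrix sums to zero because the derivative of a constant state is zero, which is a convenient sanity check when implementing the defect constraints.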

In summary, to handle the dynamic equations constraints, the differential equations governing the dynamics are transformed into a set of discrete algebraic equality constraints, represented at each collocation point as

\sum_{i = 1}^{K} D_{z, i} X_{P} (ϖ_{i}) - 0.5 (t_{f} - t_{0}) f (X_{P} (ϖ_{z}), U_{P} (ϖ_{z}); t_{0}, t_{f}, ϖ_{z}) = 0

(42)

Path constraints are conditions that restrict the behavior of the multi-UAV system during its trajectory. The performance constraints, no-fly zone constraints, and safety distance constraints, which were discussed in detail previously, all fall into the category of inequality path constraints. These constraints can be approximated in the following general form, which must hold at each collocation point:

B (X_{P z}, U_{P z}, ϖ_{z}; t_{0}, t_{f}) \leq 0, z = 1, 2, \dots, K - 1

(43)

Boundary constraints impose specific requirements on the initial and terminal points of the trajectory to ensure feasibility and to meet mission or system objectives. The synchronous arrival requirement, as discussed previously, is enforced using boundary constraints. This can be expressed in a general form as:

ξ (X_{P 0}, X_{P f}, t_{0}, t_{f}) = 0

(44)

The optimization objective designed previously, after discretization using the pseudospectral method, can be approximated as:

J = Ξ (X_{P 0}, X_{P f}, t_{0}, t_{f}) + 0.5 (t_{f} - t_{0}) \sum_{z = 1}^{K - 1} ω_{z} ι (X_{P z}, U_{P z}; t_{0}, t_{f}, ϖ_{z})

(45)

where $ω_{z} = \int_{- 1}^{1} {\hat{L}}_{z} (ϖ) d ϖ$ are the LGR quadrature weights, $Ξ$ represents the non-integral Mayer term of the objective function, and $ι$ represents the integral Lagrange term.
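The quadrature weights, defined as integrals of the Lagrange basis polynomials over the Z LGR points, can be computed by solving a small moment-matching linear system, and the integral cost then reduces to a weighted sum. The sketch below illustrates this for $Z = 2$ with a scalar control; the function name and the test integrand $u (ϖ) = ϖ$ are assumptions made for illustration.

```python
import numpy as np


def interpolatory_weights(nodes):
    """Weights w_z = integral of the z-th Lagrange basis over [-1, 1]."""
    nodes = np.asarray(nodes, dtype=float)
    n = nodes.size
    k = np.arange(n)
    V = np.vander(nodes, n, increasing=True).T       # V[k, z] = nodes[z] ** k
    moments = (1.0 - (-1.0) ** (k + 1)) / (k + 1)    # integral of tau^k on [-1, 1]
    return np.linalg.solve(V, moments)


# Z = 2 LGR points: roots of P_1 + P_2.
lgr = np.array([-1.0, 1.0 / 3.0])
w = interpolatory_weights(lgr)

# Control-effort cost with u(tau) = tau on a horizon t0 = 0, tf = 2,
# so that 0.5 * (tf - t0) = 1 and the exact integral of u^2 is 2/3.
t0, tf = 0.0, 2.0
u = lgr
J_approx = 0.5 * (tf - t0) * np.sum(w * u ** 2)
```

With Z points the rule integrates polynomials up to degree $2 Z - 2$ exactly, so the quadratic integrand in this example is recovered without error.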

In summary, the workflow of the multi-UAV cooperative path planning algorithm based on the RPM is depicted in the flowchart in Figure 5. The algorithm follows a three-stage process. The first stage involves initializing the problem constraints, such as the initial positions and the dynamic models. In the second stage, the algorithm solves for the optimal trajectory of each UAV individually, subject to its respective dynamic, no-fly zone, and performance constraints, while optimizing for the primary objective. In the final stage, cooperative constraints—including safety distances, formation keeping, and synchronous arrival—are introduced to refine the individual paths into a set of multi-constraint, coordinated trajectories for the entire formation.

Figure 5.

Flowchart of the cooperative path planning algorithm using the RPM.

Mathematical simulation

Numerical implementation

The multi-UAV cooperative path planning problem described in the previous sections is solved with the RPM, using the parameter values listed in Table 1. The optimal control problem is transcribed into a large-scale nonlinear programming (NLP) problem via the RPM and is subsequently solved using the NLP solver IPOPT. The simulations are executed on a standard personal computer equipped with an Intel Core i7-12700K CPU @ 3.60 GHz and 32 GB of RAM.

Table 1.

Parameters used in the simulation.

Parameters	Value	Parameters	Value
$μ_{min}$	$- 20^{\circ}$	$μ_{max}$	$20^{\circ}$
${\dot{μ}}_{min}$	$- 3^{\circ}$	${\dot{μ}}_{max}$	$3^{\circ}$
$α_{min}$	$- 10^{\circ}$	$α_{max}$	$10^{\circ}$
${\dot{α}}_{min}$	$- 2^{\circ}$	${\dot{α}}_{max}$	$2^{\circ}$
$β_{min}$	$0^{\circ}$	$β_{max}$	$0^{\circ}$
${\dot{β}}_{min}$	$0^{\circ}$	${\dot{β}}_{max}$	$0^{\circ}$
$l_{min}$	$30 m$	$n_{max}$	5
$C_{L_{0}}$	0.0063	$C_{L_{α}}$	0.264
$C_{D_{0}}$	0.00032	$C_{D_{α}}$	0.011
$C_{Y_{β}}$	−0.1661	$S$	$84 m^{2}$

For the pseudospectral discretization setting, a dynamic hp-adaptive mesh refinement strategy is adopted,³⁸ with a relative error tolerance of $10^{- 6}$, actively adjusting the polynomial degree between 4 and 16 per segment. Regarding the number of nodes and segments, the time domain is uniformly divided into 10 segments, with 4 collocation nodes assigned per segment. An automatic bounds-based scaling formulation³⁹ is applied to map all states and controls into a nondimensional computational space. To guarantee continuous-time feasibility, between-node constraint verification is enforced by reconstructing trajectories via Lagrange polynomial approximation over a dense inter-node grid,⁴⁰ ensuring that the dynamic and path constraints are satisfied to within tolerance.

For the numerical optimization settings, the NLP error tolerance is set to $10^{- 6}$, and the maximum number of solver iterations is limited to 3000. To handle the complex constraints, the dynamic equation constraints and synchronous arrival constraints are transcribed as algebraic equality constraints, whereas the performance constraints, no-fly zone constraints, and collision constraints are formulated as bounded inequality path constraints. All constraints are handled by the solver's internal interior-point algorithm.

The initial positions of the UAV formation are fixed, but in the cooperative formation trajectory planning problem, the terminal position of each UAV is uncertain; only the terminal position of the UAV at the center of the formation is determined by the target position. The initial and terminal positions for the multi-UAV cooperative path planning scenario are specified in Table 2. Simultaneously, the initial velocity of each UAV in the formation is set to 200 m/s, while the terminal velocity is uncertain. For the remaining states, specifically the climb angle $γ$ and flight path azimuth angle $χ$ , both their initial and terminal values are treated as free. Accordingly, the initial guesses for each UAV are summarized in Table 3.

Table 2.

The initial and terminal positions for multi-UAV trajectory planning.

UAV ID	Initial position	Terminal position
UAV1	(0, 0, 4000)	(4000, 4000, 4200)
UAV2	(0, −100, 3900)	Free
UAV3	(0, −200, 3800)	Free
UAV4	(−100, 0, 3850)	Free
UAV5	(−200, 0, 3950)	Free

Table 3.

The initial guesses for each UAV.

UAV ID	$x$	$y$	$z$	$V$	$γ$	$χ$	$t$	$μ, \dot{μ}, α, \dot{α}, β, \dot{β}, T$
UAV1	[0; 4000]	[0; 4000]	[4000; 4200]	[200; 200]	[30; −30]	[40; −40]	[0; 30]	[0; 0]
UAV2	[0; 4500]	[−100; 4500]	[3900; 4500]	[200; 200]	[30; −30]	[40; −40]	[0; 30]	[0; 0]
UAV3	[0; 4500]	[−200; 4500]	[3800; 4500]	[200; 200]	[30; −30]	[40; −40]	[0; 30]	[0; 0]
UAV4	[−100; 4500]	[0; 4500]	[3850; 4500]	[200; 200]	[30; −30]	[40; −40]	[0; 30]	[0; 0]
UAV5	[−200; 4500]	[0; 4500]	[3950; 4500]	[200; 200]	[30; −30]	[40; −40]	[0; 30]	[0; 0]

Simulation results and analysis of the formation optimization

To validate the effectiveness of the multi-UAV formation design, which is based on mission-oriented formation selection, a five-UAV formation flight scenario was designed.

The simulation parameters were set as follows: a radar detection radius of $R_{r a d a r} = 2000 m$, a radar detection apex angle of $γ_{r a d a r} = 45^{\circ}$, a minimum maneuvering radius of $R_{m a n e} = 200 m$, and a maximum communication distance of $d_{l i n k} = 100 m$. The weighting factors for the objective function were set to $ω_{1} = ω_{2} = ω_{5} = 0.2$, $ω_{3} = 0.3$, and $ω_{4} = 0.1$.

Using the mission-oriented formation selection model and the SAPSO optimization algorithm, the desired formation relative positions were obtained and are presented in Table 4. In this formation, UAV1 is designated as the leader, while UAV2 through UAV5 are designated as the followers.

Table 4.

Optimized relative positions for the desired multi-UAV formation.

UAV ID	$d x (m)$	$d y (m)$
UAV1	0	0
UAV2	−98.4	−95.3
UAV3	−98.4	95.3
UAV4	−105	0
UAV5	−179.3	0

The response of the SAPSO optimization fitness function and the resulting optimized formation are shown below.

As can be seen from Figure 6(a), the optimization algorithm converges after 50 iterations, which indicates that the formation optimized by SAPSO, based on the mission-oriented performance constrained model, has reached the highest operational effectiveness found by the search. Based on the relative positions of the leader and followers in the formation, as shown in Table 4, and in conjunction with the leader-follower multi-UAV formation structure from Figure 1, the resulting optimized UAV formation is depicted in Figure 6(b). In this figure, the dashed lines outline the shape of the multi-UAV formation. The result is a diamond-shaped UAV formation, which is taken as the optimal configuration for a reconnaissance and strike scenario. Subsequently, this optimized diamond formation will be used as the input for future research on cooperative guidance and control.

Figure 6.

Formation optimization simulation results: (a) fitness function curve; (b) formation illustration.

Path planning simulation results and analysis

The coordinates of the radar and artillery threats for the simulation are provided in Table 5.

Table 5.

Location of radar and artillery threats.

Threat type	$x_{e t} (m)$	$y_{e t} (m)$	$z_{e t} (m)$	$R_{e t} (m)$
Radar/Artillery	3000	3000	3200	1000

The coordinates of the flight obstacle zone are given in Table 6.

Table 6.

Location of the flight obstacle zone.

Obstacle type	$x_{o b} (m)$	$y_{o b} (m)$	$R_{o b} (m)$
Obstacle Zone	1600	2100	600

In the simulation scenario, UAV1 is designated as the leader, with the remaining aircraft acting as followers. The formation constraint is defined by the desired relative positions from the desired formation shown in Table 4. The simulation results for the multi-UAV cooperative path planning are presented below.

As illustrated in Figure 7(a) and (b), which show the 3D and X-Y plane trajectories for the multi-UAV cooperative path planning simulation, respectively, all members of the formation are able to arrive at the designated location in the specified formation despite starting from different initial states. The results demonstrate that the formation effectively avoids the flight obstacle zone and the radar/artillery threats. Furthermore, the trajectories are smooth and do not violate the safety distance constraints, which validates the effectiveness of the cooperative trajectory planning, obstacle avoidance, and overall cooperative design.

Figure 7.

Position trajectories of the cooperative path planning with five UAVs: (a) three-dimensional; (b) X–Y plane.

Figure 8 shows the response curves for the position states x, $y$ , and z of each member in the multi-UAV formation. The plots demonstrate that the cooperative path planning ensures the UAVs fly according to the specified formation and ultimately stabilize at their desired altitudes, confirming that the trajectory responses are consistent with realistic flight conditions.

Figure 8.

Position states of the cooperative path planning with five UAVs: (a) x; (b) y; (c) z.

Figure 9 depicts the response curves for each UAV's velocity V, climb angle $γ$, and flight path azimuth angle $χ$, respectively. It can be observed that the state responses for each UAV are smooth and stable. The climb angle and flight path azimuth angle of each formation member are consistent with the overall trajectory's trend, which validates the effectiveness of incorporating the UAV's dynamic equations constraints into the multi-UAV cooperative path planning design.

Figure 9.

Intermediate states of the cooperative path planning with five UAVs: (a) v; (b) $γ$ ; (c) $χ$ .

The responses of the control inputs and constraints for the multi-UAV cooperative path planning are shown in Figures 10 and 11. These results show that the control commands for each UAV in the cooperative path planning scenario exhibit a rapid response. The responses of the flight path bank angle $μ$, angle of attack $α$, sideslip angle $β$, and their respective rates of change $\dot{μ}$, $\dot{α}$, $\dot{β}$ are all smooth and remain within their specified constraint boundaries.

Figure 10.

Control inputs of the cooperative path planning with five UAVs: (a) $μ$ ; (b) $α$ ; (c) $β$ .

Figure 11.

Constraints of the cooperative path planning with five UAVs: (a) $\dot{μ}$ ; (b) $\dot{α}$ ; (c) $\dot{β}$ ; (d) load factor n.

Figure 11(d) illustrates the load factor n response for each member of the multi-UAV formation. As can be seen, the load factor throughout the planned trajectories does not exceed the constraint value, indicating a good dynamic response and ensuring the safety of the cooperative flight. Meanwhile, the oscillation period of the load factor shown in the figure is typically around 3–5 s, corresponding to a frequency of roughly 0.2–0.33 Hz. This is well below 1 Hz, lies within the response bandwidth of the actuator, and will not damage the structure of the UAV. This demonstrates the effectiveness of the performance constraint design within the cooperative path planning framework.

Figure 12 presents the relative distances between each UAV and all other members of the formation throughout the cooperative path planning simulation. As shown in the plots, each member of the multi-UAV formation successfully maintains the minimum safe separation distance from all other aircraft during the entire trajectory, thereby satisfying the flight safety distance constraint.

Figure 12.

Relative distance between each UAV and all other formation members: (a) UAV1; (b) UAV2; (c) UAV3; (d) UAV4; (e) UAV5.

To quantitatively evaluate the effectiveness and constraint satisfaction of the proposed multi-UAV cooperative path planning, several key numerical metrics are extracted from the optimization results and summarized in Table 7.

Table 7.

Quantitative evaluation metrics for each UAV.

Evaluation metrics	UAV1	UAV2	UAV3	UAV4	UAV5
Flight time (s)	35.8	35.8	35.8	35.8	35.8
Path length (m)	6287.02	6237.92	6483.65	6206.28	6067.87
Min inter-UAV distance (m)	39.78	39.78	57.11	58.75	74.36
Min distance to obstacle (m)	155.36	158.66	147.61	158.67	155.19
Min distance to threat (m)	60.31	84.61	72.04	59.00	94.80

As shown in Table 7, the identical flight time of 35.8 s for all five UAVs verifies that the simultaneous arrival constraint is satisfied. Although the entire mission involved maneuvering around no-fly zones, the minimum inter-UAV distance recorded during the flight was 39.78 m, which exceeds the 30 m collision avoidance threshold. Similarly, the minimum clearances to the physical obstacle (at least 147.61 m) and to the radar/artillery threat area (at least 59.00 m) remained positive for every UAV, ensuring the safety of the formation flight. These quantitative indicators demonstrate the feasibility and cooperative performance of the proposed algorithm.

In conclusion, the simulation results verify that the proposed approach, based on the RPM, can simultaneously satisfy multiple objectives and constraints. These include the dynamic equations constraints, performance limitations, obstacle avoidance, minimum control effort, safety distance requirements, and formation keeping constraints. Ultimately, the method successfully achieves synchronous arrival at the destination, which demonstrates the effectiveness of the multi-objective cooperative path planning methodology designed in this paper.

To demonstrate the algorithmic superiority, the proposed RPM is compared against Particle Swarm Optimization (PSO)²⁰ and Model Predictive Control (MPC)²⁴ under identical dynamic models and constraints. It is important to emphasize that the cited PSO algorithm integrates a violation function into its cost evaluation to effectively handle constraints, such as dynamic equations and performance boundaries, thus ensuring a fair comparison. The simulation results are illustrated in Figures 13–15.
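The idea of folding a violation function into the PSO cost can be illustrated with a generic penalty formulation. The sketch below is an assumption about the general technique, not the implementation of the cited PSO algorithm: the function name `penalized_cost`, the penalty weight, and the numerical values of the candidate solution are all hypothetical.

```python
def penalized_cost(objective, violations, weight=1e3):
    """Objective plus a weighted sum of constraint violations.

    violations: values g_k for constraints written in the form g_k <= 0.
    Only positive values (actual violations) contribute to the penalty.
    """
    penalty = sum(max(0.0, g) for g in violations)
    return objective + weight * penalty


# Hypothetical candidate: objective value 6300, one violated separation constraint.
d_min = 30.0                       # required separation (m)
separation = 27.5                  # hypothetical separation reached by the candidate (m)
g_separation = d_min - separation  # 2.5 > 0, so this constraint is violated
g_bank = 15.0 - 20.0               # bank angle below an assumed 20 degree limit, satisfied

cost = penalized_cost(6300.0, [g_separation, g_bank])
print(cost)  # 6300 + 1000 * 2.5 = 8800.0
```

A feasible particle is scored by its objective alone, whereas an infeasible one is pushed back toward the feasible region in proportion to the size of its violation. This treats constraints as soft penalties, in contrast to the RPM formulation, where they enter the nonlinear program as explicit constraints at the collocation points.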

Figure 13.

3D trajectories of the cooperative path planning with five UAVs using different methods: (a) PSO; (b) MPC.

Figure 14.

$μ$ of the cooperative path planning with five UAVs using different algorithms: (a) PSO; (b) MPC.

Figure 15.

$α$ of the cooperative path planning with five UAVs using different algorithms: (a) PSO; (b) MPC.

As shown in Figures 13–15, although the PSO algorithm respects the geometric avoidance boundaries, its angle of attack exhibits severe high-frequency chattering, yielding a non-smooth flight trajectory. The MPC generates smooth trajectories and chatter-free states, but its flight path bank angle varies considerably, resulting in aggressive maneuvering. In contrast, the proposed RPM executes collision avoidance while adhering to both the dynamic equations and the performance constraints, and the resulting trajectories and states remain smooth, consistent with the principles of actual flight.

To quantify these advantages, Table 8 summarizes the relevant indices for the different methods. Compared with the PSO and MPC methods, the proposed RPM algorithm achieves a smaller maximum flight path bank angle, a shorter flight time, and a shorter average path length. These quantitative indices support the effectiveness and feasibility of the proposed RPM algorithm.

Table 8.

Comparison of metrics from different algorithms.

Algorithm	Maximum flight path bank angle $μ_{max}$ (deg)	Flight time (s)	Average path length (m)
PSO	19.3	36.1	6309.2
MPC	17.9	36.9	6395.1
RPM	13.8	35.8	6256.5
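
The differences in Table 8 can be expressed as relative reductions with respect to each baseline. The short Python sketch below performs this arithmetic on the tabulated values only; the helper name `reduction_percent` is an illustrative choice.

```python
# Rows of Table 8: (max bank angle in deg, flight time in s, average path length in m).
table8 = {
    "PSO": (19.3, 36.1, 6309.2),
    "MPC": (17.9, 36.9, 6395.1),
    "RPM": (13.8, 35.8, 6256.5),
}
metrics = ("max bank angle", "flight time", "average path length")


def reduction_percent(baseline, value):
    """Relative reduction of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline


for name in ("PSO", "MPC"):
    for label, base, rpm in zip(metrics, table8[name], table8["RPM"]):
        print(f"RPM vs {name}, {label}: {reduction_percent(base, rpm):.1f}% lower")
```

On these values the largest gain is in the maximum bank angle, which is roughly 28% lower than for PSO and roughly 23% lower than for MPC, while the reductions in flight time and average path length are all at or below about 3%.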

To further evaluate the practical applicability of the proposed RPM-based multi-UAV cooperative path planning algorithm, a computation efficiency and scalability analysis is conducted. The empirical computation times for different UAV formation scales are summarized in Table 9.

Table 9.

The empirical computation times for different UAV formation scales.

Number of UAVs	Total computation time
3	$\sim 6$ s
5	$\sim 14$ s
8	$\sim 45$ s
10	$\sim 125$ s

As illustrated in Table 9, the computation time rises steeply with the number of UAVs. The fundamental scalability bottleneck of this method is the rapid growth in the number of constraints as the formation expands. Consider the cooperative safety distance constraint: according to Equation (30), the collision avoidance requirement between any two distinct UAVs (i.e., $i \neq j$) is transformed through RPM discretization into inequality path constraints that must be satisfied at every collocation point, as formulated in Equation (43). A formation of $N$ UAVs contains $N(N-1)/2$ distinct pairs, so the number of such constraints grows quadratically with the formation scale, which substantially increases the computational load on the NLP solver. Consequently, from an engineering perspective, this RPM-based approach is well suited to fine-grained, offline, high-fidelity trajectory generation for small-to-medium-sized formations, but it struggles to meet the demands of real-time collision avoidance for large-scale swarms.
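The scaling argument can be made concrete with a small calculation. The Python sketch below counts the pairwise separation inequalities for the formation sizes in Table 9 and fits an empirical growth exponent to the reported computation times. The number of collocation points (`N_COLLOCATION = 40`) is an assumption chosen for illustration, since the actual mesh size is not restated here, and the times are the approximate values from the table.

```python
import math


def pairwise_constraint_count(n_uavs, n_collocation):
    """Number of inter-UAV separation inequalities after discretization:
    one per unordered UAV pair at each collocation point."""
    return math.comb(n_uavs, 2) * n_collocation


# Approximate computation times from Table 9 (seconds), keyed by formation size.
times_s = {3: 6.0, 5: 14.0, 8: 45.0, 10: 125.0}
N_COLLOCATION = 40  # assumed number of collocation points, for illustration only

for n, t in times_s.items():
    count = pairwise_constraint_count(n, N_COLLOCATION)
    print(f"N = {n:2d}: {count:5d} separation constraints, ~{t:.0f} s")

# Least-squares slope of log(time) against log(N): an empirical growth exponent.
xs = [math.log(n) for n in times_s]
ys = [math.log(t) for t in times_s.values()]
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
print(f"empirical growth exponent: {slope:.2f}")
```

With only four approximate data points the fitted exponent, about 2.4, is indicative at best. It is nevertheless consistent with a solver cost that grows at least as fast as the quadratic increase in pairwise constraints.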

Conclusion and future work

This paper studies multi-UAV cooperative path planning that integrates mission-oriented, performance-constrained formation design with multiple planning constraints. A mission-oriented performance constrained formation model was established, accounting for cooperative detection, cooperative maneuvering penetration, target destruction, and communication command capability, and the SAPSO algorithm was employed to derive the desired formation. Building on this foundation, a cooperative path planning algorithm based on the RPM was proposed, transforming a complex optimal control problem with 6-DOF dynamics, performance bounds, no-fly zones, formation maintenance, and synchronous arrival into a nonlinear programming problem. Simulations verified that the proposed approach generates smooth, safe trajectories that satisfy all dynamic and cooperative constraints while preserving the optimized formation geometry and ensuring synchronous arrival. Future research will extend the approach to more dynamic scenarios with moving obstacles, real-time replanning, and heterogeneous swarms, with practicality and robustness validated through hardware-in-the-loop simulations and outdoor multi-UAV flight experiments.

Footnotes

ORCID iD

Zequn Liu

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

1. Javaid, Ullah, Khan, et al. Communication and control in collaborative UAVs: recent advances and future trends. IEEE Transactions on Intelligent Transportation Systems 2023; 24: 5719–5739.

2. Liu. A modified hp-adaptive pseudospectral method for multi-UAV formation reconfiguration. ISA Transactions 2022; 129: 217–229.

3. et al. Optimization of multi-target continuous dynamic trajectory for un-manned aerial vehicles. Aerospace Science and Technology 2024; 150: 108958.

4. Chen, Bai, Zhao, et al. Closed-loop optimal control based on two-phase pseudospectral convex optimization method for swarm system. Aerospace Science and Technology 2023; 143: 108704.

5. Cao, et al. Cooperative path planning optimization for multiple UAVs with communication constraints. Knowledge-Based Systems 2023; 260: 110164.

6. Jin, Zhang, et al. Cross-platform mission planning for UAVs under carrier delivery mode. Defence Technology 2025; 53: 76–97.

7. Wang, et al. Carrier platform-enhanced multiple-UAV cooperative task assignment with dual heterogeneities. Artificial Intelligence Review 2025; 58: 248.

8. Wang, Gao, Wang, et al. Resilient multi-objective mission planning for UAV formation: a unified framework integrating task pre- and re-assignment. Defence Technology 2024; 45: 203–226.

9. Raheem, Al-Obaidi ASM, Ali, et al. Path planning algorithm using D* heuristic method based on PSO in dynamic environment. American Scientific Research Journal for Engineering, Technology, and Sciences 2018; 49: 257–271.

10. Peng, Guo, et al. Three-dimensional integrated guidance and control for strapdown interceptor under divert and attitude control system. IEEE Transactions on Aerospace and Electronic Systems 2026; 62: 9331–9344.

11. Zhang, Dai, et al. Optimal path planning with modified A-star algorithm for stealth unmanned aerial vehicles in 3D network radar environment. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering 2021; 236: 72–81.

12. Zhang, Wang, et al. Research on multi-UAV obstacle avoidance with optimal consensus control and improved APF. Drones 2024; 8: 248.

13. Zhu, Yin, Lyu. Automatic collision avoidance algorithm based on route-plan-guided artificial potential field method. Ocean Engineering 2023; 271: 113737.

14. Wang, Tao, et al. A warm-started trajectory planner for fixed-wing unmanned aerial vehicle formation. Applied Mathematical Modelling 2023; 122: 200–219.

15. Multi-UAV search and rescue with enhanced A* algorithm path planning in 3D environment. International Journal of Aerospace Engineering 2023; 2023: 8614117.

16. Jabbar, Abass, Hasan. A modification of shortest path algorithm according to adjustable weights based on Dijkstra algorithm. Engineering and Technology Journal 2023; 41: 359–374.

17. Huang, Chen. Multi-UAV cooperative online searching based on Voronoi diagrams. IEEE Transactions on Aerospace and Electronic Systems 2024; 60: 3038–3049.

18. Abid, El Kafhali, Amzil, et al. Optimization of UAV flight paths in multi-UAV networks for efficient data collection. Arabian Journal for Science and Engineering 2025; 50: 7207–7232.

19. Chen, Liang, Meng. A UAV path planning method for building surface information acquisition utilizing opposition-based learning artificial bee colony algorithm. Remote Sensing 2023; 15: 4312.

20. Salamat, Tonello. Stochastic trajectory generation using particle swarm optimization for quadrotor unmanned aerial vehicles (UAVs). Aerospace 2017; 4: 27.

21. Jiao, Chen, Xin, et al. Three-dimensional path planning with enhanced gravitational search algorithm for unmanned aerial vehicle. Robotica 2024; 42: 2453–2487.

22. Zhao, Yang, Zhong, et al. Multi-UAV path planning and following based on multi-agent reinforcement learning. Drones 2024; 8: 27.

23. Aouf, Song. Explainable deep reinforcement learning for UAV autonomous path planning. Aerospace Science and Technology 2021; 118: 107052.

24. Yildiz, Keskin. Dual-objective model predictive control for longitudinal tracking and connectivity-aware trajectory optimization of fixed-wing UAVs. Drones 2025; 9: 719.

25. Jacquet, Kivits, Das, et al. Motor-level N-MPC for cooperative active perception with multiple heterogeneous UAVs. IEEE Robotics and Automation Letters 2022; 7: 2063–2070.

26. Chen, Yang. Multistage linear Gauss pseudospectral method for piecewise continuous nonlinear optimal control problems. IEEE Transactions on Aerospace and Electronic Systems 2021; 57: 2298–2310.

27. Koeppen, Göttlich, Leugering, et al. Fast mesh refinement in pseudospectral optimal control. Journal of Guidance, Control, and Dynamics 2019; 42: 711–722.

28. Wang, Liang, et al. Mapped Chebyshev pseudospectral methods for optimal trajectory planning of differentially flat hypersonic vehicle systems. Aerospace Science and Technology 2019; 89: 420–430.

29. Mahmoud, Soliman, Abdelrahman, et al. Trajectory optimization for ascent and glide phases using Gauss pseudospectral method. International Journal of Modeling and Optimization 2016; 6: 289–295.

30. Chai, Tsourdos, Savvaris, et al. Real-time reentry trajectory planning of hypersonic vehicles: a two-step strategy incorporating fuzzy multiobjective transcription and deep neural network. IEEE Transactions on Industrial Electronics 2020; 67: 6904–6915.

31. Zhang, Wang, et al. Time-optimal memetic whale optimization algorithm for hypersonic vehicle reentry trajectory optimization with no-fly zones. Neural Computing and Applications 2020; 32: 2735–2749.

32. Cheng, Zhang. Efficient ascent trajectory optimization using convex models based on the Newton–Kantorovich/pseudospectral approach. Aerospace Science and Technology 2017; 66: 140–151.

33. Song. The ascent trajectory optimization of two-stage-to-orbit aerospace plane based on pseudospectral method. Procedia Engineering 2015; 99: 1044–1048.

34. Chen, Gong, et al. A study of morphing aircraft on morphing rules along trajectory. Chinese Journal of Aeronautics 2021; 34: 232–243.

35. Fan, Xing, et al. Bi-level programming modeling and hierarchical hybrid algorithm for antimissile dynamic firepower allocation problem with uncertain environment. Pattern Analysis and Applications 2017; 20: 287–306.

36. Sun. Continuous transportation network design problem based on bi-level programming model. Procedia Engineering 2016; 137: 277–282.

37. Yang, Zhang. Modeling of situation assessment in regional air defense combat. Journal of Defense Modeling and Simulation 2018; 16: 91–101.

38. Darby, Hager, Rao. An hp-adaptive pseudospectral method for solving optimal control problems. Optimal Control Applications and Methods 2011; 32: 476–502.

39. Sagliano. Performance analysis of linear and nonlinear techniques for automatic scaling of discretized control problems. Operations Research Letters 2014; 42: 213–216.

40. Patterson, Hager, Rao. A ph mesh refinement method for optimal control. Optimal Control Applications and Methods 2015; 36: 398–421.