Abstract
Multi-robot path planning has evolved from research to real applications in warehouses and other domains; the knowledge on this topic is reflected in the large amount of related research published in recent years on international journals. The main focus of existing research relates to the generation of efficient routes, relying the collision detection to the local sensory system and creating a solution based on local search methods. This approach implies the robots having a good sensory system and also the computation capabilities to take decisions on the fly. In some controlled environments, such as virtual labs or industrial plants, these restrictions overtake the actual needs as simpler robots are sufficient. Therefore, the multi-robot path planning must solve the collisions beforehand. This study focuses on the generation of efficient collision-free multi-robot path planning solutions for such controlled environments, extending our previous research. The proposal combines the optimization capabilities of the A* algorithm with the search capabilities of co-evolutionary algorithms. The outcome is a set of routes, either from A* or from the co-evolutionary process, that are collision-free; this set is generated in real-time and makes its implementation on edge-computing devices feasible. Although further research is needed to reduce the computational time, the computational experiments performed in this study confirm a good performance of the proposed approach in solving complex cases where well-known alternatives, such as M* or WHCA, fail in finding suitable solutions.
Introduction
Technological advances make the management of logistic centers easier by using robots in the transportation tasks, either with homogeneous or heterogeneous robots [1]; in this study, we restrict ourselves to this latter category, with all the robots moving in the same common space, with same type of movements and the same speed. The problem of determining the best route for each robot moving in a shared space without collisions nor human intervention is known as collision Free Multi-robot Path Planning (MPP). MPP has applications in multiple fields such as intelligent laboratories[2, 3], space exploration [4], warehouse management [5], education [3], surveillance [6], manufacturing [7], unmanned aerial vehicles [6, 8, 9, 10], and many others.
In a MPP problem, each robot moves from its initial point -either in a depot or in a free parking area- to their current destination on a common workspace. It has been shown that, although a NP-problem [11], optimization techniques can solve it effectively [12, 13]. The most cited algorithm used in single-robot path planning is A* [14], that is, A* determines the optimal route for a robot from its initial position to its goal. Since its introduction, several different alternative algorithms have been proposed to cope with the MPP problem: Collaborative A*, Hierarchical Collaborative A* (HCA*) and Windowed HCA* [15],
A different alternative is proposed in [26], where a construction of a mixed-split graph for every two connected nodes is used; a new extended version of the original graph is constructed by repeating this structure T times -with T the expected time of arrival- and introducing extra edges representing whether a robot stays in its node or moves to the next one. Then, a linear programming algorithm is used to determine the route for each robot. Finally, the constrain based search was proposed in [27], with two abstraction layers: on the high level, agents with collision free routes are found, for each one searches on the low level are performed until a feasible combination is determined.
Optimization metaheuristics have been found performing similarly or better than classical methods for the MPP problem [23]; in this study, a comparison of different alternatives, including Genetic Algorithms (GA), Differential Evolution (DE), Particle Swarm Optimization (PSO), Cuckoo Search, or the Dijkstra algorithm, among others. In [28], Ant Colony Optimization (ACO) has been reported for determining the optimal routes. An hybridization of PSO and Gray-Wolf Optimization has been proposed in [29], relying the collision detection to local resolutions using the robot sensory system and local search. Similarly, the studies in [30, 31, 32] split the problem in two: a centralized stage determining the optimal routes for each robot and a decentralized stage in charge of solving the collisions. To do so, the authors in [30] proposed solving some linear equations based on constraints related with the robot goals, while in [31] the authors proposed to use Lion Optimization for this decoupled task. In contrast, DE was proposed as the main heuristic in [32].
PSO has not only been applied by itself to the MPP problem, but also hybridized with other techniques, such as differential perturbed velocity [33] or gravitational search [34]. Authors in [35] solved the path planning problem based on a two-step process: the first makes use of Artificial Potential Fields (APF) to generate the initial feasible routes, and then, the second stage uses of GA in order to optimize the paths and collision avoidance. Similarly, [36] solves MPP problems using APF and fuzzy inference systems. Not related with metaheuristics but worth mentioning is the use of Reinforcement Learning for MPP [37, 38].
This study is restricted to grid-based problems, such as warehouses or virtual remote labs, where the space is represented by means of a grid and, mainly, the 4-way connectivity rule [39]. Applied in this context, Windowed Hierarchical Cooperative A* (WHCA*) has been employed in solving the MPP using the number of turns as heuristic for the A* algorithm [40]. On the other hand, the capability of D*Lite to cope with a changing environment is used in [41] for solving a warehouse MPP problem. The same problem is addressed for a warehouse in [5], where a specific algorithm is develop determining some intermediate points and joining them with straight lines. A different proposal is presented in [39, 42], using the so-called DDM algorithm (Diversified-path Database-driven Multi-robot Path Planning) able to compute near-optimal solutions to large-scale MPP. DDM follows the decoupled paradigm and first creates a shortest path for each robot from its initial vertex to goal vertex, ignoring other robots. Then, a simulated execution is carried out. As conflicts are detected, they are solved within local sub-graphs with the help of the main heuristics. Similarly, two coordinated layers are proposed in [43] for solving the MPP problem in an industrial warehouse: one layer focuses on the abstract representation of the problem, and the second one focuses on determining the paths. A very similar approach is reported in [44], dividing the warehouse in sectors; a top layer determines the number of vehicles in each sector, while A* with conflict-based searching strategy is used in the path generation at the lowest coordination level.
In this research, a metaheuristic based solution for the MPP problem is proposed, enhancing our previous research results reported in [45]. The solution construction is split in three stages: firstly, an A* initial optimal route finding process is performed, followed by an A*-based alternatives search, and finalizing with a co-evolutionary optimization process that examines and modifies the available pool of routes among all the conflicting robots until a feasible solution is found. This research solves the high computational cost issues reported in the initial research, greatly improving the performance and making it capable of solving problems faster and with a significantly higher number of robots.
The structure of this documents is as follows. Section 2 explains the proposal, remarking the differences with the previous research. Next, Section 3 gives details of the experimentation set up, while Section 4 shows the results and discussion. Finally, Section 5 draws the conclusions extracted from this study.
A collision free multi-robot path planning proposal
This proposal focuses on MPP in grid-mapped environments represented as {M, R, S, G}, where M is the map of the environment stored as a 2D graph where the robots are located in and including the obstacles, R is the set of robots {
A simple example of a maze representation of a problem. The path for the blue, yellow and green robots (colored squares) from their starting points to their goals (colored circles) are LDDDDLLULLDDLLL, DRDD and DRRURRDRUURRRRRRR, correspondingly, assuming (L)eft, (R)ight, (U)p and (D)own movements. 
Let us denote by
To solve the MPP problem a three stages procedure is proposed (see Fig. 2 and Algorithm 2): i) searching for optimal path for each robot using A*, ii) searching for alternative paths for robots with collisions using A*, and iii) producing feasible paths using a co-evolutionary algorithm.
Block diagram of the proposed approach for solving the MPP.
General algorithmeach robot r Calculate the A* route Detect collisions as explained in [15]There exist any collision Alternative path search Co-evolutionary collision resolution
Basically, the first stage runs the A* algorithm for each robot, so an initial optimal set of routes is available; however, these routes might collide and alternatives are searched in the remaining two stages. The next subsections explain these two stages; a final subsection is devoted to remark the differences with our previous research.
This stage is an iterative process that i) detects any collision among the routes, and ii) searches for alternative A* routes for the robots when the collision cells are considered static obstacles. This procedure is repeated at most a predefined maximum number of times (
The collision detection is performed in a similar fashion as reported in [15] for the Cooperative A*, using a reservation table. Thus, new A* routes are calculated for those colliding robots by introducing the collision point as an obstacle in their own map M
Alternative path search
Co-evolutionary algorithm for collision resolution
The co-evolutionary algorithm is the responsible of finding new routes for those colliding persistent robots, those whose
Although it would be much faster if we select one colliding robot -that of smallest number of collisions- to be also kept constant and finding routes for the remaining ones, we decided against it because cross-collisions might lead to introduce more heuristics in the selection of the optimal robot to be kept fixed. Given that the low convergence speed of co-evolutionary algorithms, we chose to avoid this stage introducing all colliding robot in the process.
The next subsections describe the co-evolutionary design, the co-evolutionary evaluation, and the co-evolutionary process.
The co-evolutionary design
In a co-evolutionary algorithm we keep several populations that evolve independently, sharing their effort to obtain a valid final global solution. In this study, a
The
The co-evolutionary scheme makes use of elitism, so the best K individuals from population
Generation of a populationSelect the best individuals to keep as the elite subpopulation next population is not complete Select the parent individual for evolution using tournament Apply mutation to the parent individual resulting a new individual the new individual is valid Introduce it in the new population
The
Mutation operation when the HALT movement is introduced. The route goes from 
Mutation operation when two movements are introduced. The route changes from 
Mutation operation when only one movement is introduced. The route changes from 
However, if the selected operation is one of the other four movements, a second point called
It is worth noticing that in this research we restrict the MSP to be chosen among the movements before a collision occurs, which reduces the search space. The same can be reasoned for the MEP, so it also must be before the collision occurs.
Finally, the mutation operator considers a different probability for each movement in
There are two co-evolutionary evaluations: the local co-evaluation, which is computed on each individual, and the global evaluation, which is computed for each robot population.
In the
To calculate the estimated number of collisions, firstly, Q routes are selected from each sub-population, with Q taking a relatively small number -for example, 3 or 5; the probability of a route to be selected is proportional to its local evaluation. The mean of the number of collisions occurring in each of the possible combination of routes, together with the collision-free routes represents the estimated number of collisions. This collision counting is calculated similarly as for the initial A* stage.
The routes in a population are sorted according to the local co-evaluation, first using the estimated number of collisions and, then, using the length of the path. This design decision aims to promote finding collision-free solutions rather than shorter paths. However, this can be changed to either pursue shorter paths or even sort according to Pareto non-dominance.
Besides, the
The co-evolutionary process
Co-evolutionary algorithms [46, 47, 48] are proposed for the evolution of the colliding paths in search of the collision-free solutions for every robot. The co-evolutionary algorithm is depicted in Algorithm 2.2.3, where r refers to a colliding robot, nE is the defined number of generations,
The co-evolutionary algorithm
Differences with previous research
As mentioned before, this research represents a step forward with respect to our previous study in [45]. Several enhancements have been included, which have introduced a great improvement in the performance -as will be shown in the experimentation results-:
A reduction in the number of A* routes, reducing the second stage to the colliding robots. Enhancement in the mutation operator due to the probabilities of the different movements. Reducing the search space with the constraints in the MSP and MEP. Improvements in the individual representation and co-evolutionary process.
With all these changes, the new solution shows as more competitive than before and finding solutions where other proposals fail as it will be shown in the next section.
The goal of the experiments is to establish whether the enhancements show better performance or not. To do so, we compare our solution with the original one in [45] -referred to as Kiadi2022. Moreover, we also compare our proposal with some well-known algorithms in the MPP problem domain: M* [19, 20] and WHCA [40]. M* has been chosen because it represents an enhancement based on A*; WHCA has been selected because of the specific application field where it was developed for -warehouse MPP-.
The comparison replicates the two scenarios proposed in [45]: the first one with a single room and fixed obstacles; the second one with 4 connected rooms. In both cases, the number of robots is varied to evaluate the collision-avoidance capabilities and the time consumption in finding a solution. In addition, a third scenario is proposed with a warehouse facility extracted from [41].
Figure 6 shows the first experimental scenario: a single 10
Parameters used in the experimentation
Parameters used in the experimentation
The first parameter belongs to the general method, while the remaining ones belongs to the co-evolutionary algorithm. The size of the elite population kept from one generation to the next is relatively small.
Graphical representation of scenery 1 used on experimentation.
Graphical representation of scenery 2 used on experimentation.
Graphical representation of scenery 3 used on experimentation.
To compare the effectiveness of the solutions we incorporate the standard measurements [26]: i) the maximum distance travelled by any single-robot, and ii) the aggregated total distance. The total path-planning time for the MPP algorithms will be registered as well. Up to 10 runs of each algorithm and scenario will be run in order to estimate the statistics of the performances. During the experimentation, the parameters in the Table 1 were used.
Mean (MN) and standard deviation (SD) of the path-planning time in seconds needed for each method and each scenario to obtain a solution according to the number of robots
Mean (MN) and standard deviation (SD) of the path-planning time in seconds needed for each method and each scenario to obtain a solution according to the number of robots
Max and aggregated length of the paths obtained from each method for each scenario according to the number of robots
Comparative graphs of the evolution of the execution time concerning the number of robots for each method. Scenario 1, 2 and 3 are shown at the upper, central and lower part, respectively. No results for M* are included for scenario 3 because its runtime for 10 robots’ is far too large.
Comparative graphs of the length of the paths concerning the number of robots for each method. Scenario 1, 2 and 3 are depicted at the upper, central and lower parts.
As mentioned before, two well-known metrics have been used: the execution time and the path length. Table 2 shows the numerical values obtained for the execution time metric, with gray cells for those methods that were not able to find a suitable solution for the corresponding scenario and number of robots. Interestingly, for scenario 2 our method was the single one that coped with the task satisfactorily.
Figure 9 graphically represents the execution cost of each method according to the scenario and the number of robots including the variance in the metric; as long as only our method found solutions for the second scenario and Table 2 includes the standard deviation, the corresponding figure is omitted. Probably, these results are due to the lack of heuristic search in the other methods as long as they are based on graph search. Interestingly, even though our method has a longer computational time in the first scenario, it can also obtain collision-free routes for a larger number of robots.


Example of the routes in the third scenario.
No comparison can be made for the second scenario because only our method found solutions for the considered number of robots. It is clear that restricting the passages between rooms with very narrow corridors made the scenario so complex that only our proposal was able to find solutions.
For the third scenario, which is the most reliable representation of a warehouse environment, our method has the shortest execution time. The proposal in [45] was directly dismissed due to its poor performance in the previous scenarios. M* only found solutions for the smallest number of robots while WHCA proposed solutions for some number of robots, but in those cases the computation time was higher than our proposal. As for scenario 2, the problem of very restricted passageways seems to burden the M* and the WHCA.
The second used metric, the length of the paths, is included in Fig. 10 and Table 3. Figure 10 depicts the boxplots of the length of the paths from the different runs of the methods, while Table 3 includes i) the length of the longest path and ii) the sum of the lengths of all the paths of a solution. The same behaviour observed for the calculation time in the three scenarios is also present for this metric. In case of the first scenario, the lengths obtained with M* or WHCA were shorter than those obtained with our method, while the aggregated length among all the robots were comparable among all of them.
Several differences can be seen when analysing and comparing the routes proposed in [45] and those obtained in this research: in our method, keeping the collision-free A* paths forces the colliding robots to explore different alternatives and to make a better use of the available space. This can be clearly seen in Fig. 11 and Fig. 12; these figures plots the path generated for each robot using the two methods: the former corresponds with the first scenario and 10 robots, the latter with the second scenario and 3 robots per room. In the first scenario, the variety in the routes introduced by our approach is the one responsible of finding solutions where other proposals fail; however, exploring the domain has a cost in computation time. For the second scenario, the main difference is that the HALT operator reduces the collisions without further exploration of the space.
This study proposes a co-evolutionary algorithm with several improvements to solve the multi-robot path planning with collision avoidance, outperforming our previous research in [45]. The enhancements include i) keeping the collision-free robots’ A* routes, only evolving routes for the colliding robots, ii) reducing the search space by restricting the mutation points, and iii) tuning the probabilities of the movements to be considered in the mutation operator.
To evaluate this proposal, an experimentation in three different scenarios is proposed comparing our approach against two state-of-the-art methods -M* and WHCA- and also our previous research [45]. The obtained results show that our method is capable of finding solutions where the other three methods fail; however, this has a cost in computational time. The quality of the solutions in terms of length are comparable for scenarios such as warehouses.
Future research includes using different metaheuristics for optimization to reduce the computational time, which implies adapting them to the co-evolutionary process. Moreover, the solution space can be reduced by fixing some colliding robots -without collisions between them- and only finding routes for the remaining colliding robots. This can be solved using a sort of integer optimization to promote the A* alternatives. Considering multi-objective techniques to evaluate each route with the different metrics is still pending. These developments must also consider newly proposed learning algorithms, such as Neural Dynamic Classification algorithm [50], Dynamic Ensemble Learning Algorithm [51], and Finite Element Machine for fast learning [52]; these techniques can be used in the improvement of the metaheuristic design.
Footnotes
Acknowledgments
This research has been funded by European Union’s Horizon 2020 research and innovation program (project DIH4CPS) under Grant Agreement no 872548. Furthermore, this research has been funded by the Spanish Ministry of Economics and Industry, grant PID2020-112726RB-I00, by the Spanish Research Agency (AEI, Spain) under grant agreement RED2018-102312-T (IA-Biomed), by CDTI (Centro para el Desarrollo Tecnológico Industrial) under projects CER-20211003 and CER-20211022, by and Missions Science and Innovation project MIG-20211008 (INMERBOT). Also, by Principado de Asturias, grant SV-PA-21-AYUD/2021/50994 and by ICE (Junta de Castilla y León) under project CCTT3/20/BU/0002.
