Abstract
We consider the NP-hard problem of multirobot informative path planning in the presence of communication constraints, where the objective is to collect higher amounts of information of an ambient phenomenon. We propose a novel approach that uses continuous region partitioning into Voronoi components to efficiently divide an initially unknown environment among the robots based on newly discovered obstacles enabling improved load balancing between robots. Simulation results show that our proposed approach is successful in reducing the initial imbalance of the robots’ allocated free regions while ensuring close-to-reality spatial modeling within a reasonable amount of time.
Introduction
Multirobot informative path planning (MIPP) 1 is a highly practical problem in robotics. It is encountered in several applications of multirobot systems, including unmanned exploration of unknown environments, robotic search and rescue, collecting samples of ambient phenomena like fire, nuclear radiation, soil moisture, or algal concentration, from difficult and hazardous environments, such as postdisaster scenarios, combat environments, forests, or water bodies, such as oceans and lakes. 1 –3 The main challenge in MIPP is to coordinate a set of robots so that they can collect samples of the ambient phenomenon and report it back to a central location such as a base station. The robots need to perform this collection task in an energy-efficient manner so that they can collect the maximal information from the environment with the limited battery energy available with them. The problem is further complicated because the locations in the environment where the most information is available are not known a priori and the robots have to explore the environment and discover these locations in real time. Also, the robots have to coordinate their movements and collection actions so that they do not end up collecting redundant samples and are also able to avoid collisions and path conflicts with each other. 3,4 In this research, we consider another practical aspect of MIPP in the form of communication constraints—also not known a priori are the locations in the environment where the robots can form a partial or complete network via wireless connectivity to exchange local knowledge.
However, we observe that in an unknown environment, where the locations, numbers, and shapes of the obstacles are unknown, it might happen that the areas of an initial Voronoi partition are imbalanced. 5 –7 As the robots navigate the environment and discover obstacles, their perceptions about their allocated regions change. Therefore, we posit that their allocated partitions may benefit from dynamic adjustments as each robot discovers inaccessible subregions (e.g. due to obstacles) within its current partition, over time, rendering the load of the information collection task better balanced across all robots. Robots partition the region based on current perceptions of the environment; so, if perceptions include newer information, then there is an opportunity to refine partitions upon regaining connectivity with neighboring robots. We have verified our proposed approach with different numbers of robots having different communication and light detection and ranging (LIDAR) device ranges performing the task of information collection within environments having different distributions of information samples using the Webots simulator. Our results show that the proposed approach is able to successfully reduce the difference between the optimal and initially allocated area distribution among the robots while incurring a reasonable time for up to eight robots.
A preliminary version of this work was published as a short article at the FLAIRS 2019 conference. 8 The conference paper does not contain any experimental evaluation. This article adds extensive simulation experiments and clarified algorithmic details, accordingly expanding upon the abstract, introduction, and related work sections.
The contributions in this article are as follows: To the best of our knowledge, it presents the first formalization of the MIPP problem under communication constraints in an unknown environment. It proposes a greedy informative path planning strategy combined with a novel load-balancing algorithm to recursively repartition the multiple robots’ allocated Voronoi components and achieve a better load-balanced information collection system. It extensively tests our framework with varying numbers of robots as well as varying communication and laser ranges within a high-fidelity robotics simulator, Webots.
This article is organized as follows. First, we discuss the related literature. We next describe the problem under study in this article, including the initial Voronoi partitioning strategy and the greedy informative path planning technique our approach employs. After that, we discuss the recursive Voronoi repartitioning strategy and detail the experimental evaluations and simulation results. Certain limitations of our approach, and their implications in our study, are also identified. Finally, we conclude and suggest potential future directions.
Related work
One of the very first studies on the studied NP-hard MIPP problem is due to Singh et al.
1
They have proposed a branch-and-bound based budgeted path planning technique to solve the problem. This work uses Gaussian processes (GPs)
9
to model environmental spatial phenomena. GP is a popular regression model, which has been used not only in MIPP studies but also for general information collection algorithms using static wireless sensors.
10
The authors have used two metrics to study the information collection quality: entropy and mutual information. In our work, we use entropy to be our quality metric. The MIPP problem with mobile robots having periodic connectivity has been studied by Hollinger and Singh.
11
In this study, the robots have to come within each other’s communication ranges periodically. Another periodic connectivity approach for information collection is proposed by Viseras et al.,
12
where they have employed a sampling-based path planner. Recently, Dutta et al.
4
proposed a solution for the MIPP problem, where the robots need to maintain a connected network throughout the information collection process. The authors have used bipartite matching to avoid inter-robot collisions while collecting maximal information and
Greedy informative path planning strategies are highly popular in this field 2,13 and the proposed strategies also provide provable near-optimal guarantees. However, the solution is centralized and does not scale well with the number of robots. A polylogarithmic approximation bound in a metric space is provided by Lim et al. 14 A greedy information collection strategy has also been used by Dutta and Dasgupta, 15 where a group of robotic modules forms user-defined configurations while collecting maximal information from an environment. Our work in this article uses a similar greedy informative path planning strategy. A sampling-based path planner is used for information collection by Hollinger and Sukhatme, 16 where the authors have proposed a variant of the RRT* algorithm. 17 An MIPP algorithm has recently been proposed by Luo and Sycara, 18 where they used a Gaussian mixture model to better prediction results. Partially observable Markov decision process has also been used for modeling the information collection phenomena. 19 Singh et al. studied a nonmyopic version of the MIPP problem. 20 A similar strategy in a spatiotemporal field is used by Meliou et al. 21 Environment monitoring with a team of underwater vehicles is studied by Kemna et al. 22 and Lermusiaux et al. 23 In contrast to the work proposed by Woosley et al., 24 where all robots explore and select locations from the entire environment, this article partitions the environment a priori to enable robots to divide the exploration task between themselves so that they can select and visit information sampling locations more rapidly. Ma et al. 3 have used information collection techniques for ocean monitoring while employing a bipartite matching approach. However, most of these discussed studies on MIPP do not take the communication constraints of the robots, a highly practical consideration, into account. Moreover, the environment structure, that is, the obstacle locations and shapes are supposed to be known. In this article, we relax these assumptions and the robots collect information from an initially unknown environment while gracefully handling their restrictive communication ranges.
Voronoi partitioning is a scheme of decomposing an area into disjoint cells, where every point in each cell is closer to the center of that cell than any other cell. 7 Voronoi partition has been used in solving several problems in robotics. Choset and Burdick 25 use a generalized Voronoi diagram for a robot path planning problem. Zhou et al. 26 recently proposed an online collision avoidance mechanism using buffered Voronoi cells. Hungerford et al. 27 proposed to decompose the environment into Voronoi cells and each robot is made responsible for covering its corresponding Voronoi cell. Moreover, if part of one robot’s allocated cell is inaccessible to that robot, other robots cover the inaccessible part of the cell. A similar area coverage strategy using a group of networked robots has been studied by Breitenmoser et al., 28 where the environment and the obstacles are assumed to be nonconvex. Coverage path planning in an initially unknown environment using a multirobot system has been studied by Guruprasad et al., 29 where the authors have proposed an online Voronoi partition-based coverage algorithm. Multirobot coverage path planning using Voronoi partitioning has been studied by Cortes et al., 30 where each robot communicates its position while dynamically adapting the partitions with nearby robots complete coverage. Recently, a Manhattan distance-based Voronoi partitioning scheme is proposed by Nair and Guruprasad 31 for multirobot area coverage. In a different work, Nair and Guruprasad have combined geodesic and Manhattan distance-based partitioning strategies for coverage in presence of obstacles. 32 Schwager et al. 33 proposed a control mechanism that allows a swarm of robots to position themselves to optimize the measurement of sensory information in the environment. A Voronoi repartitioning strategy based on the unbalanced areas of the initial partitions has been presented by Tewolde et al. 5 We use their proposed algorithm for Voronoi cell repartitioning among the communicating robots. Decentralized area partitioning strategy has been previously used by Acevedo et al. 34 for area patrolling. However, the authors modeled the solution in a way that the robots periodically come into each other’s contact for sharing information, which is not required in our approach. Moreover, the article does not handle unknown obstacles in the environment.
Our proposed method aims to bring the abovementioned concepts together: how to use the idea of region partitioning for better load-balancing among the robots for information collection. The closest study to this work is due to Kemna et al., 35 which we complement and extend in the following way: Kemna et al. does not consider any obstacle in the environment, whereas our repartitioning algorithm works based on robots’ explored regions and the unknown obstacles that they discover. Moreover, the work by Kemna et al. 35 assumes the environment to be restrictive for communication, whereas we consider a different but realistic constrained-communication model, where the robots have limited communication ranges while the environment itself is not communication-restrictive.
Problem setup
Let
Initial Voronoi partitioning
The Voronoi partition is a widely used mechanism for separating a space into nonoverlapping components based on the “nearness” concept. Given a convex polygonal region E and n points in the set
where
Recall that all robots R are initialized with common knowledge of starting locations

Eight robots (circles) navigate through their initial Voronoi components (separated by black-solid lines) of an environment with obstacles (green polygons). Robots r 2, r 5, and r 6 eventually move within communication range (blue circles), where the union of their respective components (shaded gray) is repartitioned (separated by red-dashed lines) to rebalance the remaining collection load based on updated perceptions of the free versus inaccessible subregions.
Informative path planning
Consider a set of information collection points, or the points of interest (POI), that cover environment E in a grid structure. Each robot moves from one
We model the observable environmental phenomena using GPs,
36
which assume that all the collection locations in the environment (POI) generate information according to a Gaussian random vector
with
The diagonal elements of the covariance matrix
and objectively quantify which unvisited locations in the environment, when considered individually, contain the highest amount of uncertainty,
10
that is, contribute most to the mean-square-error when predicting future information measurements via the posterior mean vector
Our multirobot planning approach to uncertainty reduction will adopt the straightforward “greedy” informative path generation strategy: each robot ri simply employs equation (3) to calculate the best neighboring location to visit (within its Voronoi component Ei ) based on the measurements from all previously visited locations V and up-to-date knowledge of discovered obstacles Eobst. More formally, given robot ri is currently in location poicurr, it will select informative location poinext according to
with
Algorithms
As the robots do not have any prior knowledge about the locations and the shapes of the obstacles in the unknown environment, it may happen that the initial partitioning of the environment is imbalanced, that is, some robots have more free informative locations in its Voronoi cell and the others have less. As the robots move in the environment and collect information, they may also discover obstacles. When they later meet other robots, they share their updated local knowledge of the environment and may repartition their currently allocated Voronoi cells.
Voronoi repartitioning for load balancing
In this section, we present the load balancing algorithm used in our recursive region partitioning algorithm. This algorithm is an adaptation of the virtual force-based load-balancing algorithm proposed by Tewolde et al. 5
Assume that during the task of information collection, a set
The underlying idea is to allocate more cells to a robot whose workload is low, thereby reducing workloads of the overloaded robots.
The algorithm with all its technical details is presented in Algorithm 1. Refer to Figure 2 for an illustration using two robots. Note that in the original algorithm,
5
the number of iterations is not specified. However, in our experiments, we have found 1000 to be a reasonable upper-bound on the number of iterations required to achieve an equilibrium in the load-balancing process; we denote this upper bound by K. Also, in our experiments, we terminate the load-balancing process when the standard deviation of the wi
reaches a value of τ or less. In our implementation, we have set
The load-balancing algorithm

An illustration of the dynamic load balancing is shown using two robots
Recursive region partitioning for information collection
Each robot, ri
, will continue to generate a new path within its designated Voronoi cell, Ei
, and will follow that path until one of the following two things happen: The path length covered so far by ri
exceeds the assigned budget, B,
ri
meets (i.e. comes within the communication range of) another robot rj
.
While visiting new poi cells within its designated region, ri will continuously broadcast messages containing its unique ID, current location, and its currently allocated region, that is, Ei , a polygonal region. If ri meets another robot, then depending on this received information from the other robot(s), they will decide whether they should repartition the union of their current Voronoi cells to achieve load balancing or they should ignore repartitioning and continue to stay within their currently designated cells.
Recursive region partitioning for information collection
To keep track of contacted robots and their corresponding Voronoi cells along with their discovered obstacles so far, each robot, ri , maintains a data-structure, named perception of world (PoW i ). The PoW i can be imagined as a 2D array of time-stamped objects indexed by poi’s and contains information about the following: Ei ’s, discovered obstacle locations, and the sensed information measurements. It might happen that robot ri discovered obstacles in robot rj ’s current Voronoi cell, which rj might or might not have any knowledge about. But when they meet, both of them will have the knowledge about these obstacles.
As the obstacles are assumed to be static, if PoW are to be exchanged between ri and rj , this perception will be shared by them and the corresponding repartitioning strategy will be implemented. Note that PoW i is a local data structure, that is, different robots’ PoW i might be different in terms of stored content depending on which robot(s) they came into contact with. This data structure will help the robots to partition their respective regions if required in a more intelligent way as we will see next.
When to repartition? Let
ri
has never communicated with rj
before.
ri
has communicated with rj
before. perceptions of ri
and/or rj
about the other robots’ partitions (PoW
i
and/or PoW
j
) have changed after ri
last communicated with rj
. PoW
i
and PoW
j
are still the same from ri
and rj
’s last communication.
We will handle these cases one by one. We will start off with the simplest case (2b)—ri and rj ’s perceptions about the other robots’ partitions (PoW i ,PoW j ) have not changed from the last time these two robots have met. In this case, they do not have to update their plans and/or their respective Voronoi cells and they can plan their next best poi’s to visit within their current Voronoi cells (line 12 in Algorithm 2).
If ri
and rj
are meeting each other for the first time (case 1), they might or might not have current partition information about each other. In any case, they will repartition their current Voronoi cells based on their latest knowledge about the obstacles in their respective PoW. As the environment is unknown to the robots in the beginning and they are discovering shapes and locations of the obstacles in the environment as they explore more, the initial Voronoi cells might not be “balanced” in terms of the free space available to explore. Consequently, one robot might be tasked with exploring significantly larger/smaller amount of region in the environment compared to the others. To minimize this potential high level of variance and bring balance to the obstacle-free areas of robots’ exploration regions, we use a Voronoi cell repartitioning technique described in Algorithm 1. As a result of this repartitioning strategy, the participating robots are allocated to a unique and new polygonal region (possibly concave) in the environment for information collection. The algorithm virtually shifts the sites for the Voronoi partitions on the plane such that the new cells are balanced in terms of the number of free poi cells in them and it terminates when the standard deviation of the load distribution is sufficiently low enough (see Algorithm 1 for details). Note that the repartitioning algorithm is only executed on the area bounded by the union of the Voronoi cells of the coordinating robots (
In case ri
and/or
Proof. Coverage of E is complete when every
Evaluation
In this section, we discuss the simulation settings and present the results found via rigorous simulation experiments within a 3D robot simulator Webots. 37
Settings
We have tested our proposed informative path planning approach in simulation, which is performed on a desktop computer with a 4.6 GHz. Intel Core i7-8700 processor, 16 GB RAM, and an AMD Radeon Rx 550 4 GB graphics card. We have used a simulated model of the Pioneer 3AT robot within the Webots simulator that is equipped with a GPS and a Sick LMS 291 LIDAR sensor (Figure 3). Two different 49 × 49 m2 environments (shown in Figure 4) are considered, choosing initial robot locations as well as rectangular obstacles at random. We have varied the obstacle amount in the environment among 5%, 10%, and 15%. LIDAR ranges are varied between

(a) An instance of the simulation environment with eight robots is shown along with a zoomed-in look at a single Pioneer 3AT robot equipped with a Sick LIDAR and (b) the corresponding initial Voronoi partitioning.

The 2-D environmental information models used in this article with pairwise covariance parameters (a)
List of parameters used in our experiments.
LIDAR: light detection and ranging.
Modeling environmental information
Our experiments assume a 49 × 49 m2 grid, where the corner points comprise the 49 × 49 poi’s underlying a homogeneous, isotropic Gaussian Markov random field using exponential pairwise covariance functions: for any pair of poi’s i and j at spatial locations
where
Results
In this section, we present the results obtained by our proposed approach in different environmental settings. This section is divided into six subsections, each of which discusses the results for one particular performance metric.
Entropy
First, we present the results for our optimization metric, entropy. As shown in equation (4), the objective of the robots is to collect maximal possible information from the environment through maximizing the entropy in every iteration. To decide the next “best” location, each robot finds the neighbor poi that yields the maximum entropy. In our experiments, the robots use all the visited poi’s to predict the information measurements in the unvisited poi’s. The results of the entropy metric are shown in Figure 5. Our results are consistent with the findings of Cao et al. 2 and it can be observed that the greedy method cannot exploit the low spatial correlation in the environment very well. Therefore, the entropy plots do not show a consistent trend. The myopic nature of the used planning strategy reinforces it further. Our proposed framework is generic—a more sophisticated informative path planning strategy 2,18,21 can be used in place of this greedy strategy without impacting the proposed novel coordination mechanism.

Amount of entropy collected by the robots in
Root mean square error
Root mean square error (RMSE) is a standard metric to evaluate the quality of a regression model. 9 In our case, lower RMSE indicates that the GP prediction can closely model the underlying spatial phenomena shown in Figure 4. The results are shown in Figures 6 and 7. LIDAR range is fixed to 10 m in these plots. We can observe that varying communication ranges of the robots do not have a significant effect on the RMSE metric. On the other hand, with an increasing number of robots in the environments, RMSE values go down consistently. This result is consistent with the results found by the previous information modeling strategies. 1,2

RMSE in prediction for

RMSE in prediction for
We only present the results for the laser range of 10 m as we believe it to be the best representative. For
Variance
Following Krause et al., 10 we are also interested in studying the posterior variance found in the diagonal elements of the covariance matrix (equation (2)). The results for this metric are shown in Figures 8 and 9. We can see that with time, the posterior variance values generally go down. The slope is usually steeper with more number of robots in the environment.

Posterior variance for

Posterior variance with
Time
Next, we show the execution time of our proposed approach with different test settings. The result is presented in Figure 10. From this plot, we can observe that with increasing robot count, the run time does not increase significantly. The slight increment is due to the potentially more repartitioning among the robots, while the budget of each robot remains the same. The average run time with 5% obstacles in

Run time of the proposed approach for
The execution time of the proposed approach has not varied significantly in
Load
To demonstrate the effectiveness of the use of the Voronoi repartitioning strategy, we use the following terminologies. Let
To show the effectiveness of the repartitioning approach, we calculate the differences between
The results are shown in Figures 11
to 16. The y-axes in these plots show the differences of

Average load of the robots in

Average load of the robots in

Average load of the robots in

Average load of the robots in

Average load of the robots in

Average load of the robots in
Similarly, a longer laser range helps the robots to detect farther obstacles. Along with a higher communication range, this enables the robots to better reduce the Δ metric. This is also evident from our results. For example, with

(a, b) Average load of the robots in
On the other hand, a similar trend can be noticed for different obstacle percentages in the environment. For example, with
Detected obstacles
We are also interested in finding how many obstacles are detected by the robots. This is important for applications like environment mapping. The results are shown in Figures 18 to 20. The detected obstacle percentages are plotted against the laser ranges of the robots’ on-board LIDARs. Higher the range, farther obstacles are detected by the robots. One should note that the results presented here represent the union of all discovered obstacles by all the robots, that is, if an obstacle is observed by more than one robot, it is counted only once.

Detected obstacle percentage for

Detected obstacle percentage for

Detected obstacle percentage for
As expected, with a higher laser range, generally, the robots were able to detect more obstacles in the environment. On the other hand, with more robots present in the environment, more obstacles are detected in general. This phenomenon is evident in Figures 18
to 20. For example, with eight robots,
Sample paths
Finally, we show (Figures 21 to 23) a set of sample informative paths planned by different numbers of robots in the two tested environments. The red circles represent the sequence of cells visited by the robots, that is, paths and the black + signs indicate the detected obstacles in the environments. As each robot’s local GP model is trying to reduce the information measurement uncertainty, we can notice that the robots are usually scattering in the environment while collecting information.

Paths followed (red circles) by n robots in

Paths followed (red circles) by eight robots in

Paths followed (red circles) by 8 robots in
Key findings
The key findings from the results can be summarized as follows: (1) Repartitioning is more effective in a cluttered environment; (2) with higher values of C and L (i.e. better sensors), load balancing is more effective; (3) with a higher number of robots present in the environment, the prediction model is better; and (4) the percentage of obstacles identified in the environment is proportional to the number of robots used.
Limitations and discussion
Our results show that when two or more robots come within communication range, they are successfully able to reduce the imbalance in their workloads by repartitioning their current Voronoi cells. Although our approach can handle online repartitioning successfully with convex obstacles, it does not work with concave obstacles. Online repartitioning with concave obstacles will require further sophisticated communication mechanism. As our model does not require the robots to form a connected network at any point in time, we cannot guarantee a complete allocation of all regions with concave obstacles present in the environment with our current approach. As the environment size increases, each robot will have a larger region to cover. Thus, it will increase the run time to repartition such environments. Furthermore, the GP-based learning would also be more time consuming due to the increased POI set-size. Our proposed approach also does not handle a couple of corner cases, as shown in Figure 24. The first scenario is depicted in Figure 24(a), where two robots ri and rk are within their communication ranges but their current Voronoi cells do not intersect with each other. As these cells are disjoint, if the robots were to do repartition their cells, the union of them would produce multiple disjoint polygons. Our algorithm does not particularly handle this scenario. Thus, ri and rk continue their information collection work in their current cells without executing the repartitioning procedure.

Two limiting scenarios are illustrated: (a) The robots
Secondly, if two robots ri and rj are within their communication ranges but their respective Voronoi cells intersect on a single point, the robots do not participate in the repartitioning process. As the Voronoi cells intersect on a point only, if one of the robots were to go to the other side of that intersecting point after the repartitioning, it would need to go through that single point. We do not explicitly handle this situation and the robots continue to collect information in their current cells. The scenario is shown in Figure 24(b). On the other hand, if ri and rj are within each other’s communication range and rj and rk are within each other’s communication range but ri and rk are outside of each other’s communication range, then the repartitioning is possible through rj (Figure 24(a)).
Our work in this article is the first to study the MIPP problem under communication constraints in unknown environments. Given that there is no guarantee that two or more robots will meet each other during the information collection process, it is difficult to provide optimal load balancing. However, we have modeled the solution in a way that whenever robots meet with each other, they locally balance their loads. Although we have used a greedy informative path planning strategy that has been proved to yield reasonably good solutions in the literature, 2,4 any other sophisticated information collection strategies, 18 or mutual information-based optimization technique instead of entropy-based, 10 can be used in its place. This will not change the overall coordination strategy for the communication constrained robots in an initially unknown environment proposed in this article. This will help the future researchers in this topic to better reason about the impact of the highly practical yet less studied communication constraints of the robots, especially in environments, where the obstacle locations and shapes are initially unknown. Moreover, our proposed framework can easily be adapted for coverage path planning, that is, to cover all the free locations in the environment while handling the robot’s limited communication ranges in an unknown environment.
Conclusion and future work
In this article, we have proposed a continuous Voronoi partitioning-based informative path planning algorithm to collect the maximal amount of information from an unknown environment using a set of mobile robots. In our model, the robots initially do not have any knowledge about the environment. As the robots move to collect more information, they possibly discover new obstacles and their corresponding perceptions about their allocated workloads, that is, the amount of area they need to cover for information collection, change. When two or more robots come within each other’s communication ranges, they share their current perceptions about the environment, and based on their local knowledge bases, they repartition their initial Voronoi cells, which have been allocated to them without taking the obstacles of this unknown environment into account. This helps the robots to achieve a better workload balance. Our used path planning approach is distributed in nature and each robot plans its local path only for one future informative way-point at a time. Results show that our proposed strategy helps the robots to model the underlying spatial phenomenon that is close to reality while successfully reducing the imbalance in the amount of allocated free areas to the robots.
Physical robot implementation and experiments will be part of future work. Details of physical implementation such as motion planning around obstacles and noise from sensors will be addressed as part of physical implementation along with appropriate techniques from handling these issues. Right now, we have made simplifying assumptions for these aspects to focus on the computational aspects of the problem that is the main contribution of the article. We also plan to incorporate various connectivity constraints proposed in the literature, for example, periodic and continuous, into our proposed framework.
Footnotes
Authors’ note
The work by PD was done when he was a full-time faculty with the Computer Science Department at the University of Nebraska, Omaha.
Acknowledgement
The authors AD, AB, and OPK are thankful to the University of North Florida Foundation Board for partially funding this research.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The University of North Florida Foundation Board for partially funding this research.
