Abstract
The efficiency of exploration in an unknown scene and full coverage of the scene are essential for a robot to complete simultaneous localization and mapping actively. However, it is challenging for a robot to explore an unknown environment with high efficiency and full coverage autonomously. In this article, we propose a novel exploration path planning method based on information entropy. An information entropy map is first constructed, and its boundary features are extracted. Then a Dijkstra-based algorithm is applied to generate candidate exploration paths based on the boundary features. The dead-reckoning algorithm is used to predict the uncertainty of the robot’s pose along each candidate path. The exploration path is selected based on exploration efficiency and/or high coverage. Simulations and experiments are conducted to evaluate the proposed method’s effectiveness. The results demonstrated that the proposed method achieved not only higher exploration efficiency but also a larger coverage area.
Introduction
In recent years, autonomous exploration path planning and simultaneous localization and mapping (SLAM) in an unknown environment has drawn increasing attention from robotics and computer vision communities. 1 –6 In the research fields of driverless cars, 7 –9 unmanned aerial vehicles, bionic robot, 10 and household robot, a real-time, high-precision environmental map is the basis for these applications to achieve other high-level jobs such as navigation, surveying, and picking and dropping items. However, the actual application scenarios are ever-changing, and it is impossible to preestablish maps for each scene. The lack of autonomous environmental exploration capabilities will significantly limit the widespread use of robot technology. Therefore, many scholars began to study the robot’s autonomous exploration path planning problem in an unknown environment, enabling the robot to explore and establish the map actively. 11 –21 They combine the local observation data with a partially established map to get an optimal exploration trajectory. The robot autonomously conducts mapping and navigation along this trajectory. Some scholars have also considered the optimization of the sensor’s pose during the exploration. 22 –24 The classic frontier-based approach 11,14,25 can achieve a full coverage observation of the environment by selecting the nearest accessible and unvisited frontier, but it is a one-step greedy method and cannot guarantee that the final exploration trajectory is optimal. The exploration efficiency may decrease significantly due to frequent retrace the route and the fragmentation of the unexplored region. The possibility of positioning loss will also increase without constraints of the uncertainty in the robot’s location. Some scholars have adopted an optimization-based multistep approach to obtain the optimal path by maximizing the information gain during the exploration. 15,23,26 –30 However, the computation cost increases significantly as the length of the generated path increases. As a result, the path length has to be limited, or the map resolution is reduced. The optimization-based methods tend to have a high efficiency of exploration at the beginning, but at the end of the exploration, as the fragmentation of the exploration environment increases, the exploration efficiency reduces drastically.
In this article, a novel exploration path planning method based on information entropy has been proposed. The proposed method tries to follow the boundary of the unexplored space and the explored space, avoid breaking the unexplored region into pieces and maintain the integrity of the unexplored area. This mechanism achieves a higher and more stable exploration efficiency throughout the exploration process. Information entropy can fully reflect the quantity of information, which is more accurate and has more advantages than the semantic concepts in the frontier-based methods. First, this representation for the environment makes the proposed method not only has the same global convergence as the typical frontier-based methods but also overcomes the disadvantage of the efficiency reduction due to the fragmentation. Second, the representation simplifies the calculation of information gain. The information entropy gain can be obtained by a more straightforward summation operation rather than a large number of multiplication operations as needed to calculate the posterior probability. A Dijkstra algorithm-based exploration path generation strategy is proposed to get a set of candidate paths under the consideration of the robot’s pose uncertainty along these paths. Then an efficiency-optimal selection strategy is used to choose the final exploration path. The results of simulations and experiments show that the proposed method has a significant improvement compared to two classic exploration methods.
The main contributions of this article are summarized as follows: A path generation method based on information entropy maps and boundary features is proposed to generate efficient exploration paths. Uncertainty changes in the pose of the robot are also taken into account in the candidate exploration paths. Practical validation of the proposed method on a mobile robot.
Related work
The exploration in an unknown environment to active SLAM have been studied by many researchers. 11,14 –17,25,26,29,31 –42 We will give a brief review of these methods as the following three categories.
Frontier-based exploration methods
The frontier-based exploration methods
11,14,25,38,42
in unknown scenes can be traced back to the seminal work of Yamauchi.
11
The frontier refers to the boundary between the open space and unexplored space. The robot explores new areas by moving to the nearest frontier until all the frontier in the environment has been explored. They also extern the frontier-based methods to multiple robots.
42
Freda and Oriolo mainly use a data structure that they called the Sensor-based Random Tree (
Environmental characteristics-based method
Mu et al. use the features extracted from the environment to construct the geometric representation of the environment, which they called topological feature graph (TFG). 43 They use a sampling-based method to generate the exploration path with TFG. Xu et al. use the time-varying tensor field to represent the environment and guide the robot to move. 23 At the same time, an RGB-D camera is attached to the robot arm to scan the environment, and the camera’s 3-D trajectory is optimized simultaneously to move smoothly to obtain high-quality scans. Vallvé et al. build potential information filed by evaluating the path and map entropy reduction. Then they compute an exploration path in the gradient descent direction of the potential information filed. 28 We extract the boundary features of the environment to generate the exploration path. The boundary feature is stable and has lower computational complexity than the method 11 and the method. 29
Information-theoretic methods
Leung et al. treat the robot’s exploration path planning in an unknown environment as an optimal path planning. 26 They assume that the features in the environment are static, and use model predictive control to obtain a control sequence of a robot by maximizing the information entropy gain. Vallvé et al. propose a method based on rapidly exploring random tree (RRT). 40 They use RRT to generate a set of candidate paths, and then they use POSE SLAM to evaluate these candidate paths by predicting changes in the uncertainty of the robot’s path and the information entropy of the map to select the final exploration path. Stachniss et al. use a particle filter to predict changes in maps and robot’s poses after moved to each possible position. 37 They proposed a method to select the exploration actions by trading off between the expected information gain and the possible sensor observation. Bourgault et al. propose a weighted utility evaluation function based on the uncertainty of the localization and information gain, and used the optimization method to obtain the path of exploration. 15 As the length of the planning path increases, the computational cost will increase dramatically. Bai et al. propose a Bayesian optimization-based approach. 29 Firstly, some random exploration targets around the robot are generated, and mutual information at these targets are calculated. Moreover, a Gaussian process (GP) is trained to predict the distribution of mutual information. After obtaining the initial GP model, they use the Bayesian optimization and the GP model to find the best exploration target, and the mutual information moving to it and add them to the training data set to train a new GP model. Then they use this model to calculate the best target and mutual information. This iterative process is repeated several times. Finally, the final exploration goal is obtained. Since the acquisition of the posterior probability is a computationally complex calculation, the existing methods mostly address this problem by using a shorter planning path or reducing the resolution of the map. However, the boundary features used in this article cover only a small portion of the map. Therefore, the number of planning paths is significantly reduced.
Approach overview
We refer to the problem that a robot plans exploration paths autonomously in an unknown environment and performs simultaneous mapping and localization as the active SLAM. The active SLAM can be separated into two subproblems, SLAM and the active exploration path planning. The traditional SLAM approach passively receives observation data and solves the problem of localization and mapping. Based on the partial information of the existing environment, the active algorithm plans an efficient path for the next exploration steps. This path must be traded off between the benefits of exploring new areas and revisiting areas to improve accuracy. In order to ensure that the exploration process does not fail, the uncertainty of the robot’s pose will also be added to the constraints of the exploration path planning.
Figure 1 shows the pipeline of our proposed method. In this article, an open-source SLAM 44 has been used to provide an occupied grid map, robot positioning, and the uncertainty estimation of the robot’s pose. In Figure 1(a), the local grid map and sensor data are used to update the information entropy map incrementally. Changes in the grid map will be updated synchronously to the information entropy map. The steps for generating the candidate paths are as shown in Figure 1(b). First, the boundary features in the grid map and the information entropy map will be extracted by a convolution operation and a threshold segmentation algorithm. Then, a path generation algorithm based on the extracted boundary features will be used to generate a set of candidate exploration paths. The length and number of these candidate paths are modified by the candidate path correction module to produce a set of candidate exploration paths that cover the unexplored area. In Figure 1(c), the candidate paths obtained from Figure 1(b) are evaluated by the proposed comprehensive path selection strategy, and finally, an optimal exploration path is obtained. We repeat the above process until the information entropy of all the grids in the information entropy map is reduced to a sufficiently low threshold.

Overall scheme of our exploration path planning approach for the active SLAM. In Figure (a), the local grid map and sensor data are used to update the information entropy map incrementally. In Figure (b), the boundary features are extracted, and a set of candidate exploration paths is generated. The length and number of these candidate paths are corrected. In Figure (c), these candidate paths are evaluated by a comprehensive path selection strategy, and an optimal exploration path is obtained.
Information entropy map
We construct a grid information entropy map form the occupied grid map and the sensor data. The entropy map is used to represent the perception of the environment. The occupied grid map can only give a state of each grid its occupied or idle state, and it is susceptible to noise as the state of the grid changes too fast with the observed data, which has a significant impact on the path planning processing. A continuous information entropy map is created and updated synchronously with the SLAM process to solve the issue. The entropy of a grid is gradually and smoothly updated as the observation progresses, which reflects the average state of the observations, and thus, the grid information entropy map is less sensitive to noise.
Information entropy is a quantitative measure of information. For a discrete random variable
For each grid, its information entropy is calculated and continuously updated as the observation progresses. There are only two states in each grid, that is, idle or occupied, so the information entropy in this article is defined as
where
where
Candidate exploration path generation
The active SLAM is mainly concerned with how to use local observation information to plan the exploration path, and simultaneously locate itself and establish a complete environment map with or without a prior environmental map. In this section, we will first introduce the candidate path generation method based on the local observation information, and then we will give the final optimal exploration selection strategy in the next section.
Boundary feature extraction
After obtaining data from the sensor, we update the probability estimations in the occupied grid map

Boundary feature extraction and exploration path determination. We build and update an information entropy map shown in Figure (b) with the same scale as the occupied map shown in Figure (a). Figure (c) presents the area marked by the red box in Figure (a). In Figure (c), the thick pink line
We obtain candidate exploration paths based on two kinds of boundary features. The entropy map boundary feature
We first select an appropriate partition threshold to divide the information entropy map into high information entropy areas and low information entropy areas. A convolution kernel
where
The frontier-based method moves toward to the frontier to explore, and the proposed method is to try to explore along with the proposed boundary feature. Different from the classic
Different direction selection strategies result in significant efficiency differences. The proposed method tends to plan a path that tries to surround the unexplored area, rather than directly go toward the unexplored areas and fragment it. Hence, we have fewer repetitive observations to improve coverage and keep a higher exploration efficiency. Simultaneously, we consider the optimal distance of observation and the robot’s positioning uncertainty in path planning.
We define an analogous parameter
where
where

The definition of the effective radius
Only about half of the area is unexplored when exploring along the boundary, which is shown in Figure 4. The overlapping areas were observed almost twice. It is generally not optimal to use boundary features for candidate path generation. Thus we introduced a parameter

The overlapping area between two adjacent observations. By adjusting the coefficient
When the updated radius is twice the effective observations radius, that is,
When we increase the value of
Candidate path generation
As mentioned above, the exploration along the boundaries of information entropy is very efficient in an unknown environment. We use the
In this article, we use a graph structure to represent the environment, which is similar to J Vallvé’s.
40
A graph
We treat the current position of the robot on the grid map as the starting node and the grids on the boundary as the ending nodes. Then we use the
where
The

Candidate paths generation. The red marker point
Correction of candidate paths based on uncertainty estimation
Rodriguez-Arevalo et al. 45 reported that the monotonicity of spatial propagation of uncertainty is preserved when the determinant of the covariance matrix is used as criteria in 2-D space. Inspired by their work, we use the dead-reckoning algorithm to estimate the covariance matrix and use the determinant of the covariance matrix as a criterion to quantify the uncertainty of the robot’s pose as the robot moves. We constrain the uncertainty of the candidate path by choosing a threshold according to the monotonicity property.
The pose of the coordinate
The covariance matrix
where the
Let
So from equation (10) we can get
Carrillo et al. 17 have reported that the monotonicity of the uncertain criteria is preserved under linearized assumptions when using the determinant of the covariance matrix as the uncertain criteria and equation (14) is established.
When the pose of the robot changes from step
We assume that the distribution of
where
The
By evaluating the uncertainty of the
Equation (19) means that if the uncertainty constraint corrects the candidate path, the uncertainty of the robot’s location is guaranteed not to exceed
Exploration path evaluation and selection
Information gain (IG) metrics for evaluation
We use information gain as metrics to evaluate each candidate path. The information gain in this article is measured in the number of grids which are covered by a candidate path on the information entropy map
Because we maintain a map of information entropy synchronously with the update of the occupied map, the calculation of the information entropy gain becomes very straightforward. For a certain path
where
Reexplore metric for evaluation
When the uncertainty of the robot positioning increases to a critical value, continuing to explore along the path of high entropy may result in the loss of positioning. Exploring alone candidate paths that are in the previously explored area will lead to a meager information entropy gain, but reexploring along these paths can improve the accuracy of the map and reduce the uncertainty of the robot’s location
Here we assume that the updated information entropy is the same as the predetermined entropy
Comprehensive evaluation and selection algorithm
The exploration path planning method proposed in this article is listed in Algorithm 1. The algorithm uses a grid map
Mobile robot autonomous exploration path planning algorithm.
First of all, the information entropy map is initialized as the same size as the grid map and expand as the grid map. We use the convolution kernel described in the fourth section to convolve the information entropy map
Then we build a graph
Finally, we first select the path with the highest information entropy gain from these paths. If the path does not exist, this indicates that we have explored all the areas, or according to the uncertainty constraints. If we continue to explore the unknown area, we will lose the location when a decrease in the average information entropy within the specified
Simulations and experiments
We compared the proposed algorithm with two other methods 11,29 in the simulation. We also verified our algorithm’s adaptability and efficiency in three real-world experiments.
Evaluation in synthetic scenes
We use ROS-Stage package
47
and three public grid maps to set up the simulation environment. The robot used in the simulation is equipped with an odometric sensor with noise covariance
Figure 6 shows the generated map in the simulation and the recorded exploration trajectory. Figure 7 shows the information entropy changes with exploration. Table 1 represents all statistical results. It gives the total reduction of the

The comparison experiments with the other two methods on three different open-source map. The picture on the left shows the experiment on the

The experimental result corresponding to Figure 6. The curve represents the change of the information entropy of the environment when different exploration methods are adopted. The blue line (the proposed method) shows the shortest exploration path length in three different scenarios, and the information entropy of the map decreases faster throughout the process.
Comparative data in the three simulation experiments.
Bold face value indicates that the proposed method is better or has a greater improvement than the other two methods.
We performed experiments in the same simulation environment. When the robot moves forward 10 cm, we calculate and save the information entropy of the map, the pose of the robot, the covariance matrix
In the beginning, the three methods have a similar exploration efficiency. We can see that the decline rate of information entropy is very close to Figure 7. As the exploration progresses, the unexplored area using the comparative methods is broken into pieces, and as a result of the efficiency of the exploration using other methods
11,29
has dropped a lot. This can be seen from different experiments is shown in Figure 7(a) to (c). As the fragmentation of the unexplored area is aggravated, the maximum information gain under control is becoming harder to predict by using
The proposed method is to explore along the boundary instead of going to the border to explore. Because the boundary-based exploration method avoids the fragmentation of the unexplored area, the efficiency of the exploration can be very and efficient constant. Table 1 shows that the average decline rate of
Under the same conditions, we conduct comparative experiments of the proposed method with the classical frontier-based approach 11 and the Bayesian optimization-based approach. 29 Figure 8 provides a comparison plot of information entropy gain over the simulation step in three simulations, respectively. The blue circle indicates the end of the exploration. As the exploration progresses, the frontier-based exploration strategy intensifies the fragmentation of unexplored areas and has a significant impact on the efficiency of exploration. As a result, we can see that the computational speed of the method 11 is the lowest in the three simulations. The method 29 is a Bayesian optimization-based method. At each step, it needs to solve a Bayesian optimization problem to find the exploration target. The experimental results show that the proposed method has higher computational efficiency than other methods.

Comparison of the computational speed of the proposed method, the method,
11
and the method,
29
measured by information entropy gain over time step in the three simulation experiments. Figure (a) shows the experiment on the
Evaluation in real scene
We also verify our proposed method in real scenes. The robot used in the experiment is a

The structure of the real experiment robot. It is built on a Turtlebot robot, with Velodyne VLP-16 Lidar and an Intel UNC mini PC installed on it.
Figure 10 shows the exploration trajectory, and the generated 3-D point cloud map in the three different real scenes. In the experiment of Figure 10(a) and (b), we set

The figure shows the exploration trajectory and the generated 3-D point cloud map after the robot actively explores the three different real scenes. Figure (a), Figure (b), and Figure (c) are top views of Figure (d), Figure (e), and Figure (f), respectively. We choose
Since the proposed method is based on the path planning method, it is possible to dynamically adjust the exploration trajectory according to changes in an environment. For a dynamic object, the exploration path does not change as long as it moves outside the safe area of the robot. The exploration algorithm is triggered to replan the exploration path only when a dynamic object enters the safe area of the robot. When the dynamic object is removed, the area where the previous dynamic object located will also be considered again in the future exploration path. In Figure 10(e) and (f), we can see that there are many pedestrians in the point cloud being built, but from the exploration trajectory in the above picture, we can hardly see the impact of these pedestrians.
Figure 11 shows the information entropy gain of the map over the traveled distance recorded at a specific distance interval during the exploration process. Table 2 lists the experiment results. The proposed method has a relatively stable rate of information entropy reduction in different scenarios.

The above figures show the information entropy gain of the map over the traveled distance.
Comparative data in the three real experiments.
IG: Information gain.
We also study the effects of uncertainty constraints on the robot’s exploration. From Figure 12, we can see that after adding the uncertainty constraint, the uncertainty of the localization of the robot is limited to a specific range, which significantly improves the adaptability of the algorithm to the environment.

It shows the changes in the uncertainty of the robot’s pose in the three real environments using the proposed method.
Conclusion
In this article, we address the exploration path planning for a robot to achieve SLAM in an unknown environment. We proposed an exploration path planning method, which is based on boundary features and uncertainty estimations. The proposed path planning method tries to generate candidate paths along the boundaries. It avoids breaking the unexplored region into pieces and can be adjusted by
In our further work, we will perform more real-world experiments, and verify the ability of the proposed approach in an outdoor environment. We will study the ways for multiple robots to share perceptual information and study the mechanism of multi-robot cooperative exploration and then extend the proposed method to multiple robots to improve the overall exploration efficiency.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the National Natural Science Foundation of China under Grant U1713222, 61773378, and 61703401; and in part by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China under Grant 61421004, and in part by Beijing Science and Technology Project under Grant Z181100003118006.
Supplemental material
Supplemental material for this article is available online.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
