Abstract
The exploration of an unknown environment by a robot system is a well-studied problem in robotics; however, although many of the proposals made in this field represent efficient tools in terms of exploration paradigm, most of them are not efficient for time critical applications since the robot may visit the same place more than once during backtracking. In this way and considering these limitations, this article presents a novel approach called the random exploration graph, which addresses the problem of exploring unknown environments by building a graph structure created incrementally by the random choice of the free frontier in the observation range of the robot. In addition, the random exploration graph algorithm uses a new concept called “frontier control” introduced in this work, used to store nodes left behind in the graph structure that have not been fully explored and that will be used to guide the exploration process in an efficient way, when the algorithm needs to go back to previously visited areas to continue exploration. The frontier control concept next to the versatility of the graph structure used for the exploration process is the main contribution of this work.
Introduction
One of the fundamental challenges of contemporary robotics is to obtain robust and efficient mechanisms by which robotic systems can obtain real autonomy. In mobile robotics, this problem is addressed by developing path planning algorithms that allow robots be conducted to areas of interest where they can perform the task for which they were created. The main utility of robots is that they can be used to reduce costs, improve efficiency, or perform tasks where human intervention would be too risky or the characteristics of the environment make it impossible.
From the above, it is possible to understand the importance that path planning has in the field of robotics; however, this concept requires prior knowledge of the environment where the robot is performing its task. Thus, although many robotic systems have environment maps to operate in particular environments, most lack of this information, so motion planning is an impossible task, and therefore, it is not feasible to course the environment.
Path planning in unfamiliar environments, also known as the problem of “exploration of unknown environments,” is a technique that answers the question of where the robot has to move next in order to navigate through an uncharted environment, avoiding collisions with the existing obstacles and obtaining maximum environment information for decision-making.
While the benefits of using mobile robots are convincing, the challenge of their operation in uncertain environments has proven to be considerably difficult. The strategies in this area require fast algorithms that can adapt to changes in the environment when new characteristics or obstacles are detected. Thus, many algorithms to explore unknown new environments have been developed in the literature, 1 –8 being the most popular the ones presented by Oriolo et al., whose most known proposal is the random Sensor-based Random Tree (SRT) navigation tree. 6 This navigation tree is based on the incremental construction of a tree-type data structure through the random generation of robot configurations within a local security (LSR) area detected by the robot sensors where the robot can move freely without risk of collision with an obstacle. These configurations are stored in the structure that represents the road map of the scanned area, with their respective representation of the environment segment (LSR region) detected. In this method, the exploration is directed through the selection of free boundaries from a given position to continue the exploration. Once the robot is located in an area where it is not possible to continue with the examination, the robot is automatically sent to the parent node of the current position to find new unexplored areas. The method ends when the automatic return process leads the robot to the root of the structure.
Although the SRT method represents a good option for exploring unknown environments, it has some problems that should be considered. The first problem refers to the time that the method requires for the development of exploration, since the criteria to look back to explore new positions, once the robot is located in completely examined areas, requires that the robot navigates twice the created tree structure, which represents a significant waste of time. Likewise, the lack of knowledge of the current state of the nodes of the tree in terms of exploring options makes it impossible to ensure that the environment will be fully explored.
Franchi et al. 3 also present a new method based on the SRT algorithm for a multirobot case, known as the random graph exploration (Sensor-based Random Graph [SRG]). In this algorithm, the tree structure is transformed into an exploration graph when a safe route to travel between two nodes is found. In this version, the choice of the next location to be explored is based on the use of a probability proportional to the arc length of the free boundaries found in the node where the robot is currently positioned. Another important SRG method feature is the way in which nodes with free frontiers are revisited once the node in which the robot is present has been fully explored. In this case, Franchi’s method builds a minimum spanning tree with all nodes of the adjacent graph to the current node, and the adjacent node tree with the largest weight in relation to the free frontier length is chosen.
Like the SRT method, SRG has certain difficulties to be considered; among them is the computation time required in selecting nodes to be revisited; also, if the number of adjacent nodes and the number of nodes forming the environment are too high, the time to complete the exploration will be large, too.
Also, although the exploration method generates a graph data structure which fusions the structures generated by robots that operate in the environment into one, the new structure is not fully exploited because the form of revisiting nodes to verify unexplored areas creates a tree structure, generating a discontinuous path that forces the robot to go through the parent nodes of the graph, ignoring its versatility.
Considering the problems stated above, El-Hussieny et al. 9 used the SRT exploration method, and add a new heuristic to the backward process revisit the structure nodes. Here, the way the exploration takes place is the same used in the SRT exploration method; however, in the backward process to look for new areas to explore, the tree structure is analyzed in reverse order, starting from the current node and going to parent nodes, and checking if these can give more information. This way, the gain of each of these nodes is calculated based on the ray-casting algorithm. 10 This measure will determine, if it is greater than a threshold node, whether it will be worth visiting the analyzed node, or not. So, if it meets the described criteria, the node is considered the most informative node, and a trajectory to that node is planned using the A* algorithm. Otherwise, if there are no nodes that meet the described criteria, exploration will be considered complete, and the robot will remain in the node where it currently is.
Again, the tree structure used is too rigid, forcing the robot to travel through it without being able to use shortcuts when possible. Also, the content of the nodes is analyzed twice: once, when the structure is constructed during exploration and the second one when verifying whether any of the nodes can contribute more information to the exploration, causing the extension of the scan time because of the lack of a priori information for decision-making. Moreover, the criteria used to revisit nodes and continue exploration makes it impossible to ensure that the environment will be fully explored.
As stated, the problems encountered in the described algorithms are excessive lead time methods for exploration of environments, and the lack of certainty that these algorithms will make a full exploration of the environments, both caused by not knowing information that nodes can provide a priori, and the lack of exploitation of the capabilities of the created structures. Thus, this article contains a new approach to the problem of exploration of environments supported by the basic principles of SRT method, but this one is based on the construction of a graph of exploration that fully exploits the benefits of such structures, and this is in turn combined with an accurate knowledge of the areas, with the possibility of exploration collected as the graph is built incrementally, before making decisions on the areas that need to be revisited to complete the exploration.
Random exploration graph
The random exploration graph (REG) presented in this research is a modified version of the SRT algorithm, in which the tree structure becomes an exploration graph when a safe route is found to travel between two nodes without hierarchical relationship, which is determined based on the intersection of the free frontiers belonging to these nodes (Figure 1(c)). Although our REG method uses the SRT method as a starting point and restructures the tree in a similar way than the one the SRG method uses, the approach presented here fully exploits the generated data structure, using it to find optimal paths when the scan cannot find more valid configurations to explore and must be directed to previously visited locations; this way, the algorithm will not waste time in navigation.

Transformation of the tree structure in an exploration graph. (a) The transformation of the structure is not possible because there is no safe route between nodes
Likewise, the choice of the next position to explore is performed randomly; however, unlike the SRT and SRG methods, our approach is not based on the random release of configurations within the LSR to approach free frontiers, but in the random choice of a free boundary (if there is more than one as a result of the segmentation of these by the presence of obstacles) where the robot will try to approach the free boundary so that the arc length of the free frontier is completely covered by the new LSR detected in the new robot configuration chosen; or alternatively, to cover the widest extension to avoid leaving discontinuous frontiers (Figure 2(c)), as this will have a direct impact on scan time, because when a return to this area is necessary, the system will encounter one unexplored frontier instead of finding two of these in two different directions (Figure 2(b)).

Random choice of a new frontier to explore. (a) LSR frontier segmented in the presence of obstacles. (b) Wrong choice of the next position to explore, as this leaves two new free unexplored frontiers in opposite directions. (c) Election of the new frontier to explore choosing a robot configuration through which frontiers do not leave more segmented frontiers.
Another important aspect of our method is the way in which the nodes with free frontiers will be revisited once the current node, in which the robot stays, has no more free frontiers to explore, because this planning will be executed only with a priori knowledge of the free frontiers nodes through the introduced frontier control concept (Figure 3), which shall be executed as follows: Once a random unexplored frontier of the current node is chosen as the next location to explore, the frontier control will add the rest of the unexplored frontiers of the current node that were left behind (Figure 3(a)) and the arc length of each of them to a list of nodes. This list will be used to plan a route to nodes with possibilities of being explored (Figure 3(b)) when a backward movement is required to unexplored areas.

Control frontier. (a) Semiexplored environment where arches
As it may be observed, the relevance of frontiers control resides in identifying only the information of nodes not completely explored during the development of the exploration, saving free frontiers not analyzed contained in the nodes. This concept allows the systematic exploration of the acquired knowledge, allowing the REG method to plan trajectories to well-identified zones needed to be explored. The REG method works without the need to come back and physically verify each of the nodes within the structure, as the SRT method requires. Using REG, there is also no need to perform a complete analysis of the exploration structure generated during task performance every time there is the need to find a new zone with possibilities to be explored, as required by the SRG method. These differences with both SRT and SRG represent a substantial computational improvement of REG in terms of execution time.
The planning of roads will be executed using the A* method bidirectionally, using the graph structure created by planning the route from the current node to nodes contained in the frontier control and from nodes in the frontier control to the current node, ending when an existing path is found between the current node and any node in the list (Figure 4).

Bidirectional application of the A* method from the current position of the robot to the nodes susceptible of exploration stored in the frontier control.
The REG exploration algorithm
As in the SRT exploration method, in our method, the structure graph built represents the road map of the explored area of the environment, and is gradually built expanding the structure to randomly selected free unexplored frontiers, so that the new configuration to explore is contained within an LSR detected by the sensors in the current node. Each node of the graph is a collision-free configuration

REG exploration algorithm. REG: random exploration graph.
The initial node of the algorithm is considered the beginning and return. This node contains the following information: the starting position
Once the verification and updating of the structure are made, the FRONTIERS function will obtain the remaining
Unlike the traditional SRT method, our approach does not require a validation function to verify that the new direction chosen is not within the LSR of another node, because frontier control removes frontiers contained in other nodes and establishes them as explored (INTERSECTION_ACT function).
Once the random direction is chosen, the DISPLACEMENT function obtains the new position
Once the robot has reached the target position, a new perception process is performed to estimate its surrounding space
Information about the remaining free frontiers in the previous node will classify it as a node with the possibility of exploring its free frontiers, and its head is added to the list of nodes to be explored. When the node is added to the structure of the graph, the curves on the map will be extended with new information using the ADD function, and the cycle will begin again using the information
Typically, when the space where the robot is located has been fully explored, the algorithm will fail to find a free frontier and the exploration will continue in any of the nodes stored in the frontier control list. The search of the next node to explore is performed bidirectionally using the graph search algorithm A* (Figure 4). The method will end when the node list to explore is empty and there are no more free frontiers; in this case, the robot returns to the starting node from which it started exploration.
The use of the bidirectional A* algorithm will extend the path of both the initial position and the desired position to the final position reached by the opposite side, finishing when both paths are on the same node. This strategy is used simultaneously in our method, with all nodes contained in the list and it ends when a path is found. This process is performed with the FIND_PATH function. The reason to search for individual routes from the current position to each node with the possibility of exploration, rather than simply a Euclidean distance to the nearest node, is the existence of inaccessible spaces and other structures because the distance to the node might seem small, whereas the distance to reach it may be too large.
Once the
Results
In this section, we have carried out numerous experiments with simulated data in order to verify and validate the exploration REG method proposed in this research. The data obtained have allowed the verification of the accuracy and consistency of the properties of the algorithm comparing it with existing methods.
Since the method presented here is focused only in exploring unknown environments, in the performed tests, the following explicit assumptions are made: (1) The localization of the robot was performed using an independent module based on the method proposed by Pedraza et al., 11 which will not be discussed in this document. (2) The exploration method is limited to a two-dimensional space, given the nature of the sensors used. (3) The environment does not contain elements that change position during the exploration job.
The efficiency of the exploration method can be analyzed and approved individually by checking typical quantitative variables in this area, such as exploration time, distance covered to explore the environment, and map coverage; therefore, in this section are performed comparative tests between the SRT exploration method, 6 the SRG method, 3 the variation of the SRT method proposed by El-Hussieny et al., 9 and the REG method developed in this article. The methods will be compared analyzing the results of the tests’ measured variables and comparing them.
The tests for the methods were performed simulating a differential Pioneer P3DX robot; the choice was made on the basis that this type of robots has been well studied by other authors and their characteristics as well as its dynamics are well known.
12
The robot is equipped with a laser sensor URG-04lx Hokuyo which has a detection range of 0.02 to about 5 m, with an atypical deviation of 1% of the measure, an angular resolution of 0.36°, and a scanning angle of 240°. In addition, the robot has a ring of 16 ultrasonic (US) sensors, six of which are considered to cover the 120° where the laser sensor cannot see. Although laser sensors are more accurate than US sensors, data coming from both sources are fused to complement the surrounding region of the robot (Figure 6). Differences in accuracy are caused by differences in respective footprints; whereas laser sensor footprint is semiradial,
13
US sensor footprint is semiconvex.
6
It is also pertinent to mention that based on empirical evidence, the radius of coverage exploration algorithm was calculated to be 3 m by setting parameter

Sensorial system used by the robot during the tests performed. Black dots represent the laser sensor, while crosses represent the ultrasonic sensor.
The simulated environments used for our experiments SRT garden and the SRT office-garden, adapted from the literature, 2 are shown in Figure 7.

The SRT environments adapted from the study of Franchi et al. 2 (a) Garden environment. (b) Office-garden environment.
As mentioned, the REG method developed in this work generates a structure that determines the paths which the robot can navigate through. Figures 8 and 9 show the development of the exploration and the structures generated using the SRT garden and the SRT office-garden environments, respectively. Figures 8(a) and 9(a) show the structure of the generated exploration graph, Figures 8(b) and 9(b) show the trajectory followed by the robot during the exploration, and Figures 8(c) and 9(c) show both graphics superposed, where the transition of the robot can be observed on top of the graph structure.

Development of the exploration by the REG method in the SRT garden environment. (a) Generated exploration structure. (b) Trajectory followed by the robot during exploration. (c) Transition performed by the robot on top of the graph structure. REG: random exploration graph.

Development of the exploration by the REG method in the SRT office-garden environment. (a) Generated exploration structure. (b) Trajectory followed by the robot during exploration. (c) Transition performed by the robot on top of the graph structure. REG: random exploration graph.
Furthermore, an additional benefit that comes when having a prior knowledge of what are the areas not fully explored is that the method will guarantee, with a high degree of confidence, and that it will explore in most cases 100% of the unfamiliar environment; to perform this measurement, and being a difficult parameter to measure because the methods are focused on unstructured environments, the working environment was divided into discrete grids, as proposed by Elfes 14 and Moravec. 15 However, for our environment, the division was only used to determine which cells were explored, allowing us to know the total area examined by the methods.
This can be verified in Figures 10 to 12, where data obtained through different methods are compared to verify the assumptions that have been made. The results were obtained on the basis of 30 tests.

Average distance travelled and standard deviation for the REG, SRT, SRG, and El-Hussieny exploration method to cover the SRT garden and SRT office-garden environments, respectively, on the basis of 30 tests. REG: random exploration graph.

Standard deviation and average time needed for the REG, SRT, SRG, and El-Hussieny exploration method to explore the SRT garden and SRT office-garden environments, respectively, on the basis of 30 tests. REG: random exploration graph.

Average surface covered and standard deviation for the REG, SRT, SRG, and El-Hussieny exploration methods in the SRT garden and SRT office-garden environments, respectively, on the basis of 30 tests. REG: random exploration graph.
Finally, based on the information presented in the graphs above, we can conclude that the length of the path and the time needed for exploring the environment are in most cases less for the REG method than for the rest of the analyzed methods. This happens because of the way the REG method chooses the free frontiers to explore in a given position or the positions where the robot should continue exploration once the area where it is located can no longer be scanned. Furthermore, it is noted that the area covered by our approach in most cases is 100% or close to this value, this again because our method has a constant knowledge of the nodes that have not been fully explored and of the areas that need to be revisited for further exploration.
Conclusions
This article has presented a new methodology for the process of exploring unknown environments based on the construction of a graph of exploration. Although the use of this structure has been used to solve these kinds of problems, our solution fully exploits the functionality of the structure for exploration in the scope of familiar surroundings, regardless of the purpose. The new properties proposed in our algorithm can reduce the time and distance the robot must travel to perform the scan and additionally ensure complete coverage of the working environment. This happens because the proposal of the election of the positions to be explored, which are based not only on the free frontiers choice but also on points of the free frontiers that would allow a maximum coverage of unexplored areas.
Also, this research presents a new concept called frontier control, which allows a continuous knowledge of unexplored areas left behind in the process of exploration, which is used in two ways. First, to plan an optimal path to potential exploration areas, once the current position in which the robot is located has been completely covered; and second that knowledge of the not fully explored areas ensures that the robot will visit each of these, and there will not remain unexplored areas in the environment.
Tests on our method consider classic variables to measure the performance of the approach, such as time and distance needed for performing the task, and full coverage of the environment, which were compared with similar methods that have proved very effective and in which the results of our algorithm show better performance.
Although it is well known that multirobot versions of the exploration algorithms have a better performance than single-robot versions, multirobot versions have the characteristic to spend as much time as the time the slowest robot spends performing the task. This way, the objective of the proposed method is to reduce the exploration time of one robot, which may be translated to the multirobot version to improve its performance. Consequently, as future work, we intend to bring the proposal to the multirobot version and to the field of three-dimensional action.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
