Abstract
Intelligent path planning is one of the key techniques that autonomous underwater vehicles need for tasks such as target detection and environmental survey. To realize automatic motion planning, an intelligent cognitive architecture for autonomous underwater vehicle motion planning is proposed for complicated target detection and mobile target following in a disturbed environment. A novel fuzzy rule optimization algorithm based on the fusion of adaptive ant colony optimization and particle swarm optimization is proposed to generate optimized fuzzy rules. Through this algorithm, the preliminary fuzzy rules are optimized to realize intelligent motion planning for complicated operation tasks. Channel following experiments for wall detection and mobile target following experiments in the oceanic environment have verified the validity of the path planning method in detection and operation tasks.
Keywords
Introduction
Currently, autonomous underwater vehicles (AUVs) are increasingly attractive for various underwater tasks such as environment exploration, 1 seabed survey, 2 harbour protection 3 and submarine search and rescue. 4 With the development of artificial intelligence and computer science, autonomous path planning can not only help the vehicle initialize its operation route but also let it react reasonably to the different states of obstacles and targets during operation. Path planning methods include global path planning and local path planning. The global path is generated on the basis of preliminary environmental understanding and modelling. 5 Local path planning, on the other hand, is responsible for generating a local or regional path to handle unknown and temporary obstacles under environmental disturbances. 6
In the last decade, a variety of solutions have been developed for underwater vehicle path planning problems. 7 Fast marching global planning algorithms include graph-based algorithms, 8 heuristic search algorithms, 9 evolutionary optimization algorithms 10 and so on. Zhuang et al. 11 proposed a hybrid optimization algorithm that integrates the particle swarm optimization (PSO) algorithm with the Legendre pseudospectral method (LPM) for AUVs operating in cluttered and uncertain environments; the search process is accelerated through LPM. Cheng et al. 2 proposed a dynamic programming-based genetic path planning algorithm in which the random crossover operator is replaced with a deterministic crossover operator.
Local path planning algorithms include the artificial potential field (APF), 12 the fuzzy path planner 13 and so on. Compared with that of surface vehicles, 14,15 the local path planning of an AUV is more susceptible to environmental disturbance. Potential field algorithms are widely applied because they generate the forward path by constructing an APF that weighs the influences of obstacles and goal points, with better consistency and convergence for AUV local path planning. 16 However, local minima may prevent the goal from being reached. To overcome this problem, Melingui et al. 17 proposed a novel planning and navigation approach that integrates APF and fuzzy logic into a common framework, which utilizes both heuristic knowledge and sampled input–output data pairs. Park et al. 12 presented an advanced fuzzy potential field method for mobile robot obstacle avoidance. The method first generates the repulsive forces of the surrounding obstacles and then handles linguistic variables with fuzzy rules. However, an unknown and complicated coastal environment often causes difficulties in the formulation of fuzzy rules. 18
However, AUV path planning in practical applications often confronts complicated difficulties. Global path planning is used to fulfil missions such as regional coverage, target search and tracking 19,20 in limited time. Local path planning is used to evade obstacles and to observe closely, or even operate on, a specified target, which may sometimes meet unpredicted difficulties and require unbounded effort. 21 The AUV should adjust its route and coordinate global and local tasks while taking into account uncertain disturbances, operation time, local sampling cost and so on.
In comparison with the reflection and reasoning of human beings, the intelligence of AUVs is still in its early stages. Fuzzy reasoning based on invariable rules cannot adapt to varied and complicated environments. 22 The cognition of the human brain, on the contrary, operates on the basis of working memory and generates different fuzzy rules in the face of different conditions, 23 through the integration of perceiving, understanding, reasoning and learning. 24
This article proposes a novel intelligent AUV path planning algorithm based on a cognitive architecture for unknown obstacle avoidance, target following and detection. The contributions of this study are as follows. First, an intelligent fuzzy path planning method based on a cognitive architecture is proposed for AUV target following while avoiding unknown obstacles in a disturbed environment. Second, a novel fuzzy rule optimization algorithm based on adaptive ant colony optimization (ACO) and PSO is proposed to generate optimized fuzzy rules and obtain the optimal path for the proposed operation tasks. Third, canal wall following and oceanic mobile target following experiments have verified the validity of the path planning method in detection and operation tasks.
The rest of this study is organized as follows. The cognitive architecture for AUV path planning is presented in the second section. A novel intelligent path planning method is proposed in the third section, together with the adaptive ACO-PSO algorithm for fuzzy rule optimization. Tank and typical environmental experiments are discussed and analysed in the fourth section. Conclusions are drawn in the last section.
Cognitive architecture for AUV path planning
The proposed cognitive architecture has been designed for the torpedo-shaped intelligent AUV in Figure 1. The AUV has a length of 5.5 m, a maximum diameter of 0.63 m, a maximum submergence depth of 2000 m, a maximum cruising speed of 5 kn and a maximum range of 350 km. It is equipped with a ‘Tritech Super SeaKing DST’ digital scanning sonar, a 740 TV line camera, two vertical channel thrusters, one propeller, vertical rudders, horizontal wings and so on. The detection range of the digital scanning sonar is 300 m, with a vertical beam width of 20° and a horizontal beam width of 3°. Navigation and position reckoning are realized through a Doppler Velocity Log (DVL) and an inertial navigation system. The onboard system is based on a PC104 embedded computer, with C++ as the software language.

Figure 1. Construction of the torpedo-shaped intelligent AUV. (1) Digital scanning sonar. (2) Vertical channel thruster. (3) GPS and wireless antenna. (4) Propeller. (5) Vertical rudders and horizontal wings. (6) Multi-beam bathymeter. (7) Sidescan sonar. (8) Underwater CCD. AUV: autonomous underwater vehicle.
The proposed cognitive architecture of the AUV includes a high-level knowledge-based path planner, a low-level vehicle control and executive module and an environmental perception module (see Figure 2). The high-level autonomous planner and the low-level control module interact with each other simultaneously through environmental perception and state awareness. The executive module carries out the AUV control commands with feedback on the current situation.

Figure 2. Cognitive architecture for AUV path planning. AUV: autonomous underwater vehicle.
The high-level knowledge-based path planner includes the global path planner and the local path planner. The global waypoints are initialized according to the marine map and mission tasks before the mission starts. The global path planner is realized through a revised heuristic search and coverage algorithm. 25 Through the global path planner, the AUV tasks and path waypoints are arranged and organized.
To realize local path planning for target detection and obstacle avoidance in a disturbed environment, the oceanic environment is modelled not only with the digital scanning sonar, which covers obstacles at medium and long range, but also with a camera that closely observes the ambient environment and the operation target. Fuzzy rules are initially proposed and then optimized through the adaptive ACO and PSO rule optimization algorithm.
Intelligent path planning method
Obstacles and target modelling
The digital scanning sonar can not only detect the target and the obstacles but also measure their distances from the vehicle. The scanning sonar covers a 20° sector of radius R. The representation of the sonar detection space is important for environmental modelling and target tracking. In the sonar detection space, an occupancy grid map of the environment is constructed. Each cell of the map in the discrete region is characterized by one of two states: empty or occupied. The state is obtained through the projection of the sonar profile: from the data flow of the projected profile, one can determine whether a cell is occupied or empty. The sonar can be applied to model and find uncertain static and dynamic objects with a Gaussian distribution. The operation profile of the scanning sonar is shown in Figure 3.
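The projection of a sonar profile onto the occupancy grid can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `mark_beam`, the square cell layout, the `UNKNOWN` marker for cells that no beam has crossed and the ray-stepping scheme are all assumptions made for illustration.

```python
import math

EMPTY, OCCUPIED, UNKNOWN = 0, 1, -1  # UNKNOWN is an illustrative extra marker


def mark_beam(grid, cell_size, origin, bearing_rad, hit_range, max_range):
    """Project one sonar beam onto the grid (hypothetical helper).

    Cells crossed before the echo are marked empty; the cell that
    contains the echo is marked occupied. Coordinates are in metres.
    """
    rows, cols = len(grid), len(grid[0])
    ox, oy = origin
    limit = min(hit_range, max_range)
    steps = int(limit / cell_size)
    # Walk along the beam one cell length at a time and clear free space.
    for k in range(steps + 1):
        r = k * cell_size
        x = ox + r * math.cos(bearing_rad)
        y = oy + r * math.sin(bearing_rad)
        col, row = int(x // cell_size), int(y // cell_size)
        if not (0 <= row < rows and 0 <= col < cols):
            break
        if grid[row][col] != OCCUPIED:
            grid[row][col] = EMPTY
    # An echo inside the sonar range marks the end cell as occupied.
    if hit_range <= max_range:
        x = ox + hit_range * math.cos(bearing_rad)
        y = oy + hit_range * math.sin(bearing_rad)
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= row < rows and 0 <= col < cols:
            grid[row][col] = OCCUPIED
    return grid
```

Calling `mark_beam` once per beam of a sonar sweep fills the grid sector by sector; a probabilistic update would replace the hard empty/occupied assignment.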

Figure 3. The operation range of the digital scanning sonar (the space of sonar includes seven radial directions, that is, left (l), front small left (Sfl), front left (fl), front (f), front right (fr), front large right (fr), right).
Through Dempster–Shafer theory, the goal of the occupancy grid is to determine the possibilities that each cell is empty or occupied. If we define O as the state of occupation by obstacles and E as the empty state, one can obtain the set of discernment Θ as Θ = {O, E}.
Each grid in the workspace is defined by the cell state
Each cell
with every cell in this map is initialized as
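For readers unfamiliar with Dempster–Shafer occupancy grids, the following Python sketch shows how evidence for a single cell can be fused with Dempster's rule of combination over the two states E and O. It is a sketch under stated assumptions, not the article's formulation: the dictionary keys, the function name `combine` and the choice to initialize a cell with total ignorance (all mass on the set {E, O}) are illustrative.

```python
def combine(m1, m2):
    """Dempster's rule for the frame {E, O}.

    Each mass function is a dict with keys "E" (empty), "O" (occupied)
    and "EO" (the whole frame, i.e. unknown); the masses sum to 1.
    """
    # Conflict: one source says empty while the other says occupied.
    conflict = m1["E"] * m2["O"] + m1["O"] * m2["E"]
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    norm = 1.0 - conflict
    return {
        "E": (m1["E"] * m2["E"] + m1["E"] * m2["EO"] + m1["EO"] * m2["E"]) / norm,
        "O": (m1["O"] * m2["O"] + m1["O"] * m2["EO"] + m1["EO"] * m2["O"]) / norm,
        "EO": (m1["EO"] * m2["EO"]) / norm,
    }


# Assumed initial state of a cell: nothing is known yet.
prior = {"E": 0.0, "O": 0.0, "EO": 1.0}
# Hypothetical evidence from one sonar return that favours "occupied".
evidence = {"E": 0.1, "O": 0.6, "EO": 0.3}

cell = combine(prior, evidence)   # first observation
cell = combine(cell, evidence)    # a second, agreeing observation
```

Combining with the ignorance prior leaves the evidence unchanged, while repeated agreeing observations shift mass from the unknown set towards the occupied state.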
In the path planning process of potential field, the attractive forces come from the goal and the observing targets, while the repulsive forces come from the obstacles. On the basis of sonar profile, the repulsive force between the vehicle and the obstacles can be obtained to prevent collisions through the sum of gradients of the potential field
where m is the number of ambient obstacles,
In the path planning process, the obstacles must be avoided once they are found, while the targets should be observed and detected with the existence possibilities. According to the theory of potential field, the attractive forces come from the goal and the observing targets. On the other hand, the position of observed target can be described with
where
In the local planning, a novel multilayered potential field approach has been proposed for the target detection. The attractive force includes the goal and the observed target
where
Therefore, the artificial force can be obtained as
From equations (5) to (10), the local obstacle and the target can be remodelled and the path can be planned through Figure 3.
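The combination of attractive and repulsive forces can be sketched in Python as follows. Because the force equations are not reproduced here, the sketch uses the classical artificial potential field forms (a linear attraction and a repulsion that vanishes beyond an influence radius) as a stand-in. Weighting the attraction of an observed target by its existence possibility, and all gains, radii and function names, are assumptions for illustration.

```python
import math


def attractive_force(pos, goal, gain):
    """Linear attraction towards a goal or an observed target."""
    return (gain * (goal[0] - pos[0]), gain * (goal[1] - pos[1]))


def repulsive_force(pos, obstacle, gain, influence_radius):
    """Classical repulsion, active only inside the influence radius."""
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    d = math.hypot(dx, dy)
    # Outside the radius the obstacle is ignored; d == 0 is skipped
    # only to avoid a division by zero in this sketch.
    if d == 0.0 or d > influence_radius:
        return (0.0, 0.0)
    magnitude = gain * (1.0 / d - 1.0 / influence_radius) / (d * d)
    return (magnitude * dx / d, magnitude * dy / d)


def total_force(pos, goal, targets, obstacles,
                k_goal=1.0, k_target=0.5, k_rep=10.0, influence_radius=5.0):
    """Sum the goal attraction, target attractions and obstacle repulsions.

    targets is a list of (position, existence_possibility) pairs.
    """
    fx, fy = attractive_force(pos, goal, k_goal)
    for target_pos, possibility in targets:
        tx, ty = attractive_force(pos, target_pos, k_target * possibility)
        fx, fy = fx + tx, fy + ty
    for obstacle in obstacles:
        rx, ry = repulsive_force(pos, obstacle, k_rep, influence_radius)
        fx, fy = fx + rx, fy + ry
    return (fx, fy)
```

The direction of the returned vector gives a candidate heading; obstacles farther away than `influence_radius` have no effect, which keeps the computation local to the sonar sector.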
Fuzzy logic descriptions
For an AUV cruising in a complicated environment, it should not only detect the target and avoid obstacles but also overcome the oceanic current and uncertain disturbance. The fuzzy logic path planning method includes the following steps:
1. Fuzzification: all the fuzzy logic inputs, including AUV current positions
2. Rule generation and optimization through the adaptive ACO and PSO algorithm.
The initial fuzzy rules are combinations of these corresponding forces and reactions. The number of fuzzy rules nr should correspond to the sum and combinations of environmental conditions such as obstacles, targets, current and disturbances. It depends on the number of input variables and the logical relationships between them. Although the former is easy to obtain, the latter are hard to establish. In this article, the interplay between the AUV state variables and outside information such as velocity, angle and position further complicates the establishment of the fuzzy rules. The sum and combinations of the environmental conditions refer to the internal relationships of the fuzzy inputs.
The nr fuzzy rules are designed using the Takagi–Sugeno fuzzy model of ‘IF-THEN’ rules as follows.
The ith rule Rp
: IF
where
3. Inference mechanism: in this process, the reasoning process includes optimized fuzzy inference rules.
4. Defuzzification: in this process, the defuzzification equation is the weighted average of the fuzzy outputs
where
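The inference and defuzzification steps of a Takagi–Sugeno system can be sketched in Python as below. This is a minimal zero-order example under stated assumptions: Gaussian membership functions, product firing strengths, constant rule consequents and the names `gaussian_membership` and `ts_inference` are illustrative choices, not details confirmed by the article.

```python
import math


def gaussian_membership(x, centre, width):
    """Degree to which x belongs to a fuzzy set centred at `centre`."""
    return math.exp(-((x - centre) / width) ** 2)


def ts_inference(inputs, rules):
    """Zero-order Takagi-Sugeno inference with weighted-average output.

    rules is a list of (antecedents, consequent) pairs, where
    antecedents holds one (centre, width) pair per input variable and
    consequent is the crisp output proposed by that rule.
    """
    weighted_sum = 0.0
    strength_sum = 0.0
    for antecedents, consequent in rules:
        # Firing strength: product of the memberships of all inputs.
        strength = 1.0
        for x, (centre, width) in zip(inputs, antecedents):
            strength *= gaussian_membership(x, centre, width)
        weighted_sum += strength * consequent
        strength_sum += strength
    if strength_sum == 0.0:
        return 0.0  # no rule fired; fall back to a neutral command
    return weighted_sum / strength_sum


# Two hypothetical heading rules: obstacle on the left -> turn right (+10),
# obstacle on the right -> turn left (-10).
rules = [([(-1.0, 1.0)], 10.0), ([(1.0, 1.0)], -10.0)]
command = ts_inference([-0.8], rules)
```

Rule optimization then amounts to tuning the centres, widths and consequents, which is what makes the rule base a natural candidate for encoding as a particle or an ant.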
Adaptive rules generation and optimization algorithm
If the fuzzy rules can complete all the possible conditions and fuzzy inputs, one should take the considerations with

Figure 4. The block diagram of adaptive PSO and ant rule optimization algorithm. PSO: particle swarm optimization.
The first step is to code each rule into a particle or an ant, before the application of ACO and PSO. For the ith input variable and the jth rules solution, the coded result can be obtained as the rules solution individually represented in the first row of Figure 4. For PSO, the rule particles can move and expand quickly in the search space but without global communication. Each rule particle has a position and velocity vector. Each rule particle updates its position and velocity through every learning cycle.
In continuous ACO, by contrast, ant rules are randomly distributed among the particles. Not only can information be exchanged among the ant individuals, but the ACO Gaussian sampling can also push individuals out of local optima as the search proceeds.
In the beginning of this algorithm, if we set
In the second step, the particle rules individuals are generated and expanded to the search space; the generated particle individual number is
where ς is the constrict coefficient,
where
From this approach diagram in Figure 4, the generated fuzzy rules solutions
where ρ is a parameter between 0 and 1, and
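A particle update with a constriction coefficient can be sketched in Python as follows. The sketch uses the widely known constriction form of PSO as a stand-in for the update described above; the default values 0.729 and 2.05, the function name `pso_step` and the list-of-lists layout of the rule particles are assumptions for illustration.

```python
import random


def pso_step(positions, velocities, personal_best, global_best,
             constriction=0.729, c1=2.05, c2=2.05, rng=random):
    """One learning cycle of constriction-coefficient PSO.

    Every particle encodes one fuzzy rules solution as a flat list of
    parameters. Velocities and positions are updated in place.
    """
    for i, (x, v) in enumerate(zip(positions, velocities)):
        new_v, new_x = [], []
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            # Pull towards the particle's own best and the swarm's best.
            vd = constriction * (
                v[d]
                + c1 * r1 * (personal_best[i][d] - x[d])
                + c2 * r2 * (global_best[d] - x[d])
            )
            new_v.append(vd)
            new_x.append(x[d] + vd)
        velocities[i] = new_v
        positions[i] = new_x
    return positions, velocities
```

After each step the caller re-evaluates every particle, refreshes `personal_best` and `global_best`, and repeats until the iteration budget is spent.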
The third step is ACO algorithm, and the ant generation process includes three stages: 1. Ant path selection: the new fuzzy rules solution is generated through ant path which is generated through the elite tournament technique. At the
The fuzzy rules solutions are generated through elite selection with the increase of global search. In the selection process, the node can be selected from fuzzy rules particle or fuzzy rules ant.
2. In the Gaussian sampling process, a new value is drawn near the temporary solution via Gaussian sampling, whose standard deviation is computed as follows
where
3. In the solution refinement process, by considering the attraction of the ants to the optimal individual in the overall population, the solution can be obtained as
where
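The three ant stages (selection, Gaussian sampling and refinement) can be sketched in Python as below. The sketch follows the general pattern of continuous ACO: the standard deviation is taken as a scaled average distance from the chosen solution to the others. The scaling factor `xi`, the tournament size, the random-fraction refinement and all function names are assumptions for illustration, not the article's equations.

```python
import random


def tournament_select(costs, size, rng):
    """Return the index of the lowest-cost solution among `size` random picks."""
    candidates = rng.sample(range(len(costs)), size)
    return min(candidates, key=lambda i: costs[i])


def gaussian_sample(solutions, chosen, xi, rng):
    """Sample a new solution around solutions[chosen] (needs >= 2 solutions).

    Per dimension, the standard deviation is xi times the mean absolute
    distance from the chosen solution to the other solutions.
    """
    k = len(solutions)
    base = solutions[chosen]
    new_solution = []
    for d in range(len(base)):
        spread = sum(abs(s[d] - base[d]) for s in solutions) / (k - 1)
        new_solution.append(rng.gauss(base[d], xi * spread))
    return new_solution


def refine(solution, best, rng):
    """Move each parameter a random fraction of the way towards the best."""
    return [s + rng.random() * (b - s) for s, b in zip(solution, best)]


rng = random.Random(1)
population = [[0.0, 1.0], [0.5, 1.5], [1.0, 0.5]]   # hypothetical rule vectors
costs = [3.0, 1.0, 2.0]                              # lower is better
chosen = tournament_select(costs, 2, rng)
candidate = refine(gaussian_sample(population, chosen, 0.85, rng),
                   population[costs.index(min(costs))], rng)
```

As the population converges, the spread shrinks and the sampling concentrates around good solutions, while a diverse population keeps the search wide.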
To realize rules solution iteration and optimization, reinforcement learning is applied to execute and evaluate the AUV path planning actions. The Q-function maps rules-action pairs to the predicted return. 26 The Q-function-based learning speeds up convergence and improves system performance under disturbance. The output of the Q-function can be updated as follows
where
According to equation (22), the Q value can be updated as follows:
The best Q values correspond with the quality of the rules actions above. For each rule, there will be a greedy action to achieve maximum Q values. For each rules solution and path planning action, the maximum Q value can be obtained from the estimation as
where
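The Q value update and the greedy action choice can be sketched in Python as follows. The sketch uses the standard tabular Q-learning rule as a stand-in; the learning rate `alpha`, the discount factor `gamma`, the string-valued states and actions and the function names are assumptions for illustration.

```python
from collections import defaultdict


def greedy_action(q_table, state, actions):
    """Pick the action with the maximum Q value for this rules state."""
    return max(actions, key=lambda a: q_table[(state, a)])


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard Q-learning update for one executed planning action.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q_table[(state, action)]
    q_table[(state, action)] += alpha * td_error
    return q_table[(state, action)]


# Hypothetical example: turning left near the wall earned a reward of 1.
actions = ["left", "right"]
q_table = defaultdict(float)          # unseen pairs start at 0
q_update(q_table, "near_wall", "left", 1.0, "on_track", actions, alpha=0.5)
```

In the planner, the reward would come from evaluating the executed path segment, so rules that keep the vehicle close to the desired distance accumulate higher Q values.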
Experimental results
The following section describes experimental results of two different cases of motion plan with the AUV of Figure 1. In the experiments, the number of fuzzy rules is set to 8. The population of ACO and PSO was set to 60. The total iteration number was set as 150. Thus, the total evaluation trials per run were
Case 1: Canal following for the wall detection
For an inland canal, wall following is very important for wall detection. Although an AUV can carry out delicate detection of underwater targets, it must cruise close to a wall of complicated contour under disturbances. Therefore, the motion plan should be made according to the detected wall contour and the current distance between the canal walls. To follow the canal wall, distance measurement sonars have been fitted to the AUV. For comparison, three motion planning algorithms are used: the common fuzzy rules-based motion plan of the 'Obstacles and target modelling' and 'Fuzzy logic descriptions' sections, the PSO optimized fuzzy rules-based motion plan and the adaptive ACO-PSO optimized fuzzy rules-based motion plan.
In Figure 5(a), the start point is (0, 0) and the end point is (−120, 1000). The expected distance between the AUV and the canal wall is 4 m. The adaptive ACO-PSO fuzzy planner obtains the most reasonable trajectory for the canal detection. Since all the planned trajectories are obtained according to the real-time state of the AUV, such as heading, speed, obstacle positions and disturbance, the planned trajectories are complicated curves. The motion control of the AUV was realized according to the results of the adaptive ACO-PSO fuzzy planner. As Figure 5 shows, the motion planned with fixed fuzzy rules cannot adapt to the environmental disturbance and the dynamic environment: the fixed rules usually lead to late and inflexible responses, so the more complicated the disturbance and the turning of the wall, the greater the errors. The PSO algorithm with fuzzy rule optimization shows its advantage in changing the rules according to the real state, but it cannot converge quickly in some local regions, which causes large planning errors at times and even oscillation where the disturbance and turnings are complicated. The adaptive ACO-PSO fuzzy planner is more adaptable to canals with bends or sharply changing angles, because ACO and reinforcement learning help the algorithm with regional rule generation and quick convergence, respectively.

Figure 5. Comparisons on the motion plan of canal following. (a) Comparisons on the motion plan of wall following of the canal. (b) Heading plan comparisons of motion plan. (c) Distance of AUV from the canal wall. AUV: autonomous underwater vehicle.
Case 2: Moving target following plan
Moving target following is very important but difficult for advanced AUV detection. The AUV should keep a corresponding distance from the target, with the same speed and heading angle, under real-time disturbance. The target following experiments for motion planning were carried out offshore of Penglai, Shandong Province, China. The oceanic environmental disturbances obtained from the DVL are shown in Figure 6. In the experiments, the planned trajectories from the adaptive ACO-PSO fuzzy planner are obtained according to the real-time state of the AUV, such as heading, speed, target speed and position, and disturbance. The motion control of the AUV was realized according to the results of the real-time adaptive ACO-PSO fuzzy planning. The ideal distance between the AUV and the mobile target for target following is 8.5 m.

Figure 6. Environmental disturbance obtained from DVL. DVL: Doppler Velocity Log.
As Figure 7 shows, the AUV can realize mobile target following through the motion plan of the ACO-PSO fuzzy planning algorithm. The distance between the AUV and the mobile target ranges from 7.5 m to 10 m, that is, within −1.0 m to +1.5 m of the ideal distance of 8.5 m.

Figure 7. Motion plan of target following. (a) Following trajectories. (b) Heading plan for target following. (c) Speed plan for target following. (d) Distance from the target during following.
Conclusions
The AUV can realize high-accuracy submarine detection and target following through intelligent path planning. This article has proposed a novel fuzzy rule optimization algorithm based on the fusion of adaptive PSO and ACO to realize intelligent motion planning. Through this algorithm, the preliminary fuzzy rules are optimized through continuous selection, sampling, refinement and learning. Two typical experimental cases, canal wall following and mobile target following, were carried out to verify the validity of the intelligent path planning method in complicated operation tasks.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by National Natural Science Foundation of China (no. 61633009, 51579053, 5129050) and the China Scholarship Council (201806685042). It is also supported by the Field Fund of the 13th Five-Year Plan for the Equipment Pre-research Fund (no. 61403120301). All these supports are highly appreciated.
