Abstract
The coordination and formation control of underwater vehicles have attracted increasing attention because of their great potential in real-world applications. However, such vehicles are usually underactuated and face environmental difficulties very different from those of land vehicles (robots). This study proposes a novel approach that integrates a potential field with an interval type-2 fuzzy learning algorithm for the formation control of autonomous underwater vehicles, built on a formation system framework. To handle system nonlinearity and the complicated environment, a support vector machine is applied to generate optimal rules for the type-2 fuzzy system. Through classification, this approach can generate optimal and reasonable formation rules for different situations. Furthermore, reinforcement learning is combined with the fuzzy system to deal with the limited communication state during formation. As a result, the autonomous underwater vehicles can not only execute actions through evaluation but also avoid the coupling between communication state and potential field. Finally, extensive simulations and experiments have been performed to validate the proposed methods.
Introduction
Compared with robots on land, underwater vehicles confront a more complicated and extensive environment. It is therefore more important for these vehicles to adapt autonomously to the environment and to cooperate during operations. A group of comparatively cheap and small autonomous underwater vehicles (AUVs) working together can accomplish complex tasks more effectively than a single expensive, multi-functional AUV. 1
Recently, researchers have made significant efforts on the coordinated formation of multiple AUVs. Different approaches have been proposed, such as the generalized coordinate approach,2,3 the leader–follower approach,4,5 the virtual structure approach, 6 the potential field approach, 7 and the behavior-based approach. Compared with robot formation on land, the submarine formation of AUVs has to rely on acoustic communication with propagation delays and channel noise. Moreover, for the purpose of long-distance cruising and operation, an AUV is usually underactuated. 8 It is propelled by one thruster, with vertical rudders for heading and horizontal wings for diving. To realize the formation of multiple AUVs, heterogeneous underwater vehicles are usually employed, particularly in a leader–follower formation, to enhance spatial coverage and improve operation accuracy. The leaders are usually equipped with higher-performance navigation instruments and perform the leading role, while the followers obtain position information from comparatively lower-precision navigation instruments and from positioning relative to the leaders. Strategies for relative positioning appear in the work of Matsuda et al. 9 and Allotta et al. 10
Hou and Cheah from Singapore proposed an adaptive proportional and derivative controller with a multi-layered potential field on the basis of the formation structure 11 and realized formations of 6-degree-of-freedom AUVs in simulations. 12 Parlangeli and Indiveri 13 from Italy established a time-varying nonlinear observability model with state augmentation on the basis of a global observability model, positioned underactuated AUVs during coordinated motion control, and realized formation cruising. Since such AUV groups are composed of heterogeneous AUVs with navigation systems of different accuracies, the leader–follower approach is often used for formation control. 14 Researchers from Portugal and Germany proposed a coordinated tracking strategy for heterogeneous AUVs on the basis of line of sight and verified the strategy with oceanic experiments. 15 Millán et al. 16 from Spain established a linearized model for formation maintenance through the AUV kinematics and Taylor series expansion and proposed an H2/H∞ controller to improve the formation robustness under time-delayed communication. Cui et al. 17 from China constructed a virtual leader AUV in a formation of underactuated AUVs to realize trajectory convergence. Moreover, a position tracking controller was proposed for the followers to track the virtual leader using Lyapunov and backstepping synthesis.
However, since multi-AUV formation in the oceanic environment faces limited communication,9,18 system nonlinearity, and a complicated unknown environment, highly efficient interaction between AUVs may be difficult to achieve with nonlinear formation control algorithms. Therefore, establishing formation rules according to the current state and environment through fuzzy inference and machine learning may improve robustness. 19 Compared with a type-1 fuzzy logic system, a type-2 fuzzy logic system can handle rule uncertainties through its antecedent or consequent membership functions. Although it is likewise characterized by IF-THEN rules, its antecedent or consequent sets are type-2. 20 Type-2 fuzzy sets allow researchers to model and minimize the effects of the uncertainties of a changing environment in rule-based systems. Tsai and Tai 21 present a distributed consensus formation control law based on recurrent interval type-2 fuzzy neural networks; intelligent formation is achieved using online learning and Lyapunov stability theory. Ali et al. 22 propose a type-2 fuzzy ontology simulator to calculate and provide accurate information about collision risk and the marine environment during real-time marine operations. Such a decision-making system provides high-level autonomy for AUV formation. For the complicated oceanic environment and formation state, this article proposes a support vector machine (SVM)–based type-2 fuzzy learning approach for the formation control of multiple AUVs.
The rest of this study is organized as follows. In section “Multiple AUVs formation system framework,” a formation system framework for multiple AUVs is established. An SVM-based type-2 fuzzy learning approach for the formation control of multiple AUVs is proposed in section “Interval type-2 fuzzy learning approach for AUVs formation.” Simulations and experiments are discussed and analyzed in section “Simulations and experiments.” Conclusions are drawn in section “Conclusion.”
Multiple AUVs formation system framework
To realize cooperative control and interaction, a formation control framework has been established (see Figure 1), consisting of a propulsion subsystem, an oceanic survey subsystem, a motion control subsystem, a navigation subsystem, an acoustic communication subsystem, and a formation strategy subsystem. Each subsystem is independent and has its own corresponding hardware. Communication between subsystems is realized through a PC104 bus–based TCP/IP protocol.

Multiple AUVs formation system framework.
Communication between AUVs is realized through the AquaComm networked acoustic communication module. This multi-channel communication module was developed by the Innovation in Digital Signal Processing and Processing Company of Australia. It can transmit broadband signals using spread spectrum with low energy consumption.
In order to establish the topology for multiple AUVs coordinate, an
If
where
To investigate the dynamic state during formation, the multiple-AUV formation is treated as a constrained multibody system in which adjacent AUVs are connected through virtual springs (see Figure 2). The virtual spring between the ith and the jth AUV is therefore set to be at its natural length

Model of multi-AUV formation.
Conversely, the repulsive force is generated
Since the formation of multiple AUVs must maintain vertical and horizontal distances at the same time, the natural length of the virtual spring
The potential energy derived from the attractive and repulsive forces is
where
Since the multiple-AUV formation faces complicated environmental disturbances and limited communication, a virtual spring with multi-layered stiffness has been established. In Figure 3,

Multi-layer region of the ith AUV.
The attractive or repulsive force exerted on the jth AUV is the sum of the attractive or repulsive forces of the virtual springs from the Rth layer to the layer in which the jth AUV is located. Moreover, when the jth AUV is located in the innermost layer, the repulsive energy exerted by the ith AUV on the jth AUV is the sum of the virtual-spring repulsive energies from the Rth layer to the first layer. Because the AUV is underactuated and its vertical and horizontal movements are realized through a thruster with wings or rudders, the order of the potential functions is reduced using a log function to improve the convergence of the controller (see Figure 6 in the simulation)
Thus, a multi-layered region for multiple-AUV formation can be defined as in Figure 4 to improve the formation stability.

The multi-layer region for AUV formation.
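The potential functions themselves are not reproduced in this text, so the following is only a minimal sketch of the idea under assumed forms. It compares a second-order virtual-spring energy with a hypothetical log-reduced variant, ln(1 + 0.5·k·d²), and sums the spring energies from the layer in which the neighbor is located out to the outermost layer. The function names, the layer radii, the stiffness values, and the specific log form are illustrative assumptions, not the authors' equations.

```python
import math


def quadratic_potential(dist, natural_len, k):
    """Second-order virtual-spring energy: 0.5 * k * (dist - natural_len)^2."""
    d = dist - natural_len
    return 0.5 * k * d * d


def log_potential(dist, natural_len, k):
    """Assumed order-reduced form: ln(1 + quadratic energy). Grows far more slowly."""
    return math.log(1.0 + quadratic_potential(dist, natural_len, k))


def located_layer(dist, layer_radii):
    """Index (0 = innermost) of the first layer whose outer radius contains dist."""
    for idx, radius in enumerate(layer_radii):
        if dist <= radius:
            return idx
    return len(layer_radii) - 1  # beyond the last radius: treat as outermost layer


def layered_potential(dist, natural_len, layer_radii, layer_stiffness,
                      base=log_potential):
    """Sum spring energies from the located layer out to the outermost layer."""
    start = located_layer(dist, layer_radii)
    return sum(base(dist, natural_len, k) for k in layer_stiffness[start:])


if __name__ == "__main__":
    radii = [5.0, 10.0, 20.0]      # hypothetical layer radii in meters
    stiffness = [4.0, 2.0, 1.0]    # hypothetical stiffness per layer
    for dist in (2.0, 8.0, 15.0, 40.0):
        q = layered_potential(dist, 10.0, radii, stiffness, base=quadratic_potential)
        g = layered_potential(dist, 10.0, radii, stiffness, base=log_potential)
        print(f"dist={dist:5.1f}  quadratic={q:9.2f}  log={g:6.2f}")
```

Under these assumptions the log-reduced energy stays bounded to a few units where the quadratic energy reaches hundreds, which illustrates why a reduced-order potential may place gentler demands on an underactuated vehicle.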
Interval type-2 fuzzy learning approach for AUVs formation
As an effective and reliable machine learning technique, SVM has been widely used in classification problems. It constructs a hyperplane that separates two classes of data with the largest margin. 23 Compared with neural fuzzy systems, the basic SVM requires less prior knowledge of the problem and can solve the learning problem with a smaller number of samples. Instead of minimizing an objective function based on the training error, SVM attempts to minimize a bound on the generalization error. 24 Since multiple-AUV formation involves tracking and formation maintenance, collision avoidance, oceanic currents, uncertain disturbances, and so on, an enumeration of fuzzy linguistic rules cannot cover all situations. Although SVM is computationally expensive, it can generate optimal and reasonable rules for the current situation through classification. Therefore, the interval type-2 fuzzy learning approach integrates SVM to generate optimal rules for the complicated formation conditions. Furthermore, in combination with reinforcement learning, the fuzzy system can deal with the limited communication state during multiple-AUV formation. Its structure is illustrated in Figure 5.

Structure of interval type-2 fuzzy learning approach.
SVM rules generation algorithm
For the formation state between the ith and the jth AUV, the rules are set in the following: IF (
Evidently, the states include permutations and combinations of various conditions of the potential field, the cruising trajectory tracking state, uncertain oceanic currents, and so on. It is difficult to enumerate linguistic rules and input-output data pairs that cover all the fuzzy situations. This article therefore applies SVM as a tool for determining the fuzzy rules. For the SVM rule generation approach, the decision function is defined as follows
where
Finding a hyperplane that separates the two classes leads to the following optimization problem 24
Thus, the optimal hyperplane is obtained from the following optimization problem
where C > 0 is the regularization parameter, which controls the trade-off between margin and classification error. The primal problem of equation (8) can be solved by the following Lagrangian function
where
where
where
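As an illustration of the soft-margin problem described above, the sketch below trains a linear two-class SVM in plain Python. It is a simplified stand-in, not the authors' rule generator: instead of solving the Lagrangian dual, it applies batch subgradient descent directly to the primal objective 0.5·||w||² + C·Σ hinge loss, and the two input features are hypothetical placeholders for formation-state quantities. The parameter `c` plays the role of the regularization parameter C.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def train_linear_svm(samples, labels, c=1.0, epochs=500, lr=0.01):
    """Batch subgradient descent on 0.5*||w||^2 + c * sum(hinge losses)."""
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)   # gradient of the margin term 0.5*||w||^2
        grad_b = 0.0
        for x, y in zip(samples, labels):
            if y * (dot(w, x) + b) < 1.0:   # sample violates the margin
                for k in range(dim):
                    grad_w[k] -= c * y * x[k]
                grad_b -= c * y
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decide(w, b, x):
    """Decision function: sign of the signed distance to the hyperplane."""
    return 1 if dot(w, x) + b >= 0.0 else -1


if __name__ == "__main__":
    # Hypothetical, already-normalized two-feature samples for two classes.
    samples = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [3.0, 2.0],
               [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0], [-3.0, -2.0]]
    labels = [1, 1, 1, 1, -1, -1, -1, -1]
    w, b = train_linear_svm(samples, labels)
    print("w =", w, "b =", b)
    print([decide(w, b, x) for x in samples])
```

A larger `c` penalizes margin violations more heavily, while a smaller `c` favors a wider margin. In an SVM-based fuzzy system the support vectors would typically be mapped to rules; that step is omitted here because the corresponding equations are not available in this text.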
Interval type-2 fuzzy learning approach
The system contains two components: a type-reducer, which maps a type-2 fuzzy set into a type-1 fuzzy set, and a normal defuzzifier, which transforms a fuzzy output into a crisp output. A type-2 fuzzy set
where
where ∫∫ indicates union over all admissible
where
The fuzzy operation is implemented using the algebraic product, and the result is an interval type-1 fuzzy set. For the ith rule, the firing strength is computed as
where
where Lc and Rc represent the left and right crossover points, respectively, and nr is the number of rules. On the basis of center-of-sets type reduction, the defuzzified output is the average of the left and right end points of the type-reduced interval.
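The inference chain described in this subsection (interval firing strengths from an algebraic product, center-of-sets type reduction with left and right crossover points, and averaging for the crisp output) can be sketched as follows. This is a minimal illustration under stated assumptions: Gaussian membership functions with a fixed mean and an uncertain standard deviation, crisp consequent centers, and an exhaustive search over crossover points in place of an iterative procedure. All names and numeric values are hypothetical.

```python
import math


def gaussian(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def interval_membership(x, mean, sigma_lo, sigma_hi):
    """Assumed interval type-2 Gaussian set: returns (lower, upper) grades."""
    return gaussian(x, mean, sigma_lo), gaussian(x, mean, sigma_hi)


def firing_interval(inputs, antecedents):
    """Product of lower grades and product of upper grades for one rule."""
    lower, upper = 1.0, 1.0
    for x, (mean, s_lo, s_hi) in zip(inputs, antecedents):
        lo, hi = interval_membership(x, mean, s_lo, s_hi)
        lower *= lo
        upper *= hi
    return lower, upper


def type_reduce(firing, centers):
    """Center-of-sets type reduction by trying every crossover point.

    Assumes at least one rule has a nonzero lower firing strength.
    """
    order = sorted(range(len(centers)), key=lambda i: centers[i])
    c = [centers[i] for i in order]
    f_lo = [firing[i][0] for i in order]
    f_hi = [firing[i][1] for i in order]
    n = len(c)

    def weighted(weights):
        return sum(w * ci for w, ci in zip(weights, c)) / sum(weights)

    # Left end point: upper strengths on small centers, lower strengths on large.
    y_left = min(weighted(f_hi[:k] + f_lo[k:]) for k in range(n + 1))
    # Right end point: lower strengths on small centers, upper strengths on large.
    y_right = max(weighted(f_lo[:k] + f_hi[k:]) for k in range(n + 1))
    return y_left, y_right


def defuzzify(firing, centers):
    y_left, y_right = type_reduce(firing, centers)
    return 0.5 * (y_left + y_right)


if __name__ == "__main__":
    # Two hypothetical rules over two inputs; each antecedent is (mean, sigma_lo, sigma_hi).
    rules = [[(0.0, 0.8, 1.2), (0.0, 0.8, 1.2)],
             [(1.0, 0.8, 1.2), (1.0, 0.8, 1.2)]]
    centers = [-1.0, 1.0]
    inputs = [0.4, 0.7]
    firing = [firing_interval(inputs, r) for r in rules]
    print("firing intervals:", firing)
    print("crisp output:", defuzzify(firing, centers))
```

The exhaustive search costs one weighted average per candidate crossover point, which is acceptable for the small rule bases assumed here; an iterative scheme would be preferable for large ones.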
Since reinforcement learning is a suitable algorithm for solving optimization problems through online interactive learning rather than depending on an existing database, 25 it will help AUVs issue and improve formation control commands under uncertain conditions. 26 For multiple-AUV formation, uncertain acoustic communication conditions may greatly affect knowledge of the adjacent AUV's position. The interval type-2 fuzzy learning approach will apply the Q-learning algorithm to optimize the
where
According to equation (19), the Q value will be updated as
For each rule and action, the maximum Q value can be estimated as
where
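The Q-value update equations are not reproduced in this text, so the sketch below shows only the standard tabular Q-learning update as a point of reference. It is not the rule-weighted fuzzy form used with the type-2 system. The state names, action names, learning rate `alpha`, and discount factor `gamma` are hypothetical.

```python
def make_q_table(states, actions):
    """Q table initialized to zero for every state-action pair."""
    return {s: {a: 0.0 for a in actions} for s in states}


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


def greedy_action(q, state):
    """Action with the largest Q value in the given state."""
    return max(q[state], key=q[state].get)


if __name__ == "__main__":
    # Hypothetical discretized communication states and formation actions.
    states = ["link_good", "link_degraded"]
    actions = ["trust_neighbor_position", "hold_last_estimate"]
    q = make_q_table(states, actions)
    q_update(q, "link_degraded", "hold_last_estimate",
             reward=1.0, next_state="link_good")
    print(greedy_action(q, "link_degraded"))
```

In a fuzzy Q-learning setting, each rule would keep its own Q values per candidate action and the update would be weighted by the rule firing strengths; the table above would then be indexed by rule rather than by discrete state.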
Simulations and experiments
Experimental setup
To validate the proposed interval type-2 fuzzy learning approach, a simulation platform with three heterogeneous underactuated AUVs has been established. These AUVs are connected in series, with the relevant parameters presented in Table 1. These parameters correspond to three real AUVs available at our institutions. The simulation platform includes three different AUV control platforms and maneuverability simulation platforms.
Size and maneuverability parameters for the three AUVs.
AUV: autonomous underwater vehicle.
In the AUV control platforms, the S surface control scheme 27 has been applied for AUV control, and the control commands for the propeller, heading rudder, and horizontal wing are generated through thrust allocation to realize cruising velocity, heading, and diving control.
The maneuverability simulation platform has been established on the basis of dynamic equations with the hydrodynamic parameters of the corresponding AUV. Its inputs are the control voltage of the propeller and the angle commands of the heading rudder and horizontal wing. The control voltage is converted into propeller thrust through thrust–volt curves obtained from thruster experiments. The outputs are the real-time AUV positions deduced from the hydrodynamic equations. The formation control and planning sections are set up on the basis of the leader AUV.
Figure 6 illustrates the formation simulation trajectory results for the three AUVs of Table 1. The three AUVs set out from initial positions (0,0,0), (40,40,0), and (60,60,0), respectively. They then make a 6 m dive and a 90° heading change and afterward move northward. Since the AUVs differ in size, gyration radius, and maneuverability, the formation process involves the coupling effects of diving and heading control. To take underactuation into consideration, the order of the potential functions is reduced using a log function to improve the convergence of the controller. As a result, the AUV formation stabilizes after some fluctuation and adjustment.

Potential field comparisons between second order and “ln.”
In the formation simulations of Figure 7, comparisons are made between the adaptive potential formation scheme of Huang et al. 7 and the formation approach of this article. The disturbance is a current with speed vcur = 0.2 m/s, heading west. In Figure 7(a) and (b), the three AUVs are planned to follow a folding line and a round curve in a line-shaped formation; that is, the followers are to keep the same distance one after another. The membership functions of the type-2 fuzzy logic system take the potential field, the expected cruising trajectory, and the oceanic current state as inputs (see Figure 8). The comparisons show that, for underactuated AUVs, a large change of potential energy combined with current disturbance may cause fluctuation of the formation shape, and even failure, since only limited controlled degrees of freedom are available. In contrast, the interval type-2 fuzzy learning approach integrated with the potential field automatically generates fuzzy rules according to the current conditions and reduces the magnitude of the potential energy through the SVM rule generation algorithm. As a result, it not only keeps the formation shape more stable but also improves the robustness of the controller against current disturbance.

Comparison between adaptive potential field (APF) and type-2 fuzzy learning approach integrated with potential field (T2FLP): (a) formation along folding line and (b) formation along round curve.

Membership functions.
Figure 9 illustrates the simulation of the interval type-2 fuzzy learning approach with communication state evaluation. In the simulations, three AUVs form up on the basis of the topology, and the communication state, including packet loss, time delay, and throughput, is prescribed in Figure 9(a)–(c). Before reinforcement learning, the formation of multiple AUVs cannot be realized because of the communication deficiency. After reinforcement learning, as shown in Figure 9(d), the formation approach has been optimized through communication evaluation, actions, and rewards; thus, the formation of multiple AUVs is greatly improved.

Simulation results from communication state evaluation: (a) packet delay, (b) packet loss, (c) throughput, and (d) formation trajectory.
Results from offshore experiments of AUV formation on the water surface are illustrated in Figure 10. The AUV formation traces were given as pre-programmed missions. In the first scenario, a 90° yaw trace was planned to test the formation performance of heterogeneous vehicles. In the second scenario, linear traces were planned. In the third scenario, a polygon path was planned to validate the performance of regional search. The experiments show that vehicles of different sizes, gyration radii, and maneuverability can realize formation cruising under the proposed approach.

Surface experiment on straight and folding trajectory.
Conclusion
Considering AUV operation and the formation framework, this study has developed a novel approach that integrates a potential field with interval type-2 fuzzy learning for the formation control of multiple AUVs, based on a formation system framework. The contributions and novelties of this article are twofold:
To handle the nonlinear character of the system and the complicated formation state, SVM has been integrated into the type-2 fuzzy learning system to generate optimal rules. In the simulations, this approach automatically generates fuzzy rules and reduces the magnitude of the potential energy; as a result, it not only keeps the formation shape more stable but also improves the robustness of the controller against current disturbance.
Reinforcement learning has been combined with the type-2 fuzzy system to deal with acoustic communication conditions during the formation process. Therefore, AUVs can not only execute actions through evaluation but also avoid the coupling between communication state and potential field.
Simulations and comparisons of cruising under disturbances illustrate the effectiveness of the proposed formation strategies. Offshore experiments with the surface formation of heterogeneous AUVs verify its regional coverage performance.
Footnotes
Handling Editor: Tao Feng
Author’s note
Authors Qirong Tang and Hai Huang contributed equally to this work.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project is supported by National Science Foundation of China (Nos 61633009, 51579053, 5129050), Major National Science and Technology Project (2015ZX01041101), the Promotion Funds for the National Significant Requirements of Central Universities (HEUCFP201603) and also funded by the Key Basic Research Project of “Shanghai Science and Technology Innovation Plan” (No. 15JC1403300). All this support is highly appreciated.
