Research on autonomous underwater vehicle wall following based on reinforcement learning and multi-sonar weighted round robin mode

Abstract

When an autonomous underwater vehicle follows a wall, interference between the sonars mounted on the vehicle is a common problem. This article proposes a novel working mode with weighted polling (also called “weighted round robin mode”) that can independently identify the environment, dynamically establish the environmental model, and switch the operating frequency of each sonar. The dynamic weighted polling mode solves the problem of sonar interference, and dynamically switching the operating frequency of the sonars improves the efficiency of wall following. An interpolation algorithm based on velocity interpolation time-registers the data of ranging sonars working at different frequencies, which solves the asynchronous problem of multiple sonars and lets the system output at the frequency of the high-frequency sonar. With the reinforcement learning algorithm, the autonomous underwater vehicle can follow the wall at a certain distance using the distance obtained from the polling mode. Finally, a tank test verified the effectiveness of the algorithm.

Keywords

Autonomous underwater vehicle, weighted polling mode, reinforcement learning, wall following, ranging sonar

Introduction

Water conveyance tunnels have become an important transportation route in hydraulic engineering.¹ When inspecting wall cracks in water conveyance tunnels, the ability to follow the wall at a certain distance is the basic premise of the safe operation of an autonomous underwater vehicle (AUV).² Ranging sonar is widely used to obtain the distance to obstacles: the transducer actively emits acoustic waves and obtains the distance to an obstacle by receiving the echo it reflects.³ Efficient access to stable and accurate distance data is an important performance index of a ranging sonar.

The water conveyance tunnel inspection AUV (AUV-T; Figure 1) in this study is equipped with eight ranging sonars. The ranging sonars are mounted at the top, bottom, left, and right positions of the bow and stern to measure the distance of the AUV from the surrounding walls. The sonar model is DYW-50/200-NB, with a range of 0.6–120 m at 200 kHz and a half-power angle of 7.5°. If the channel width is less than 0.6 m, the stability and accuracy of the sonars decrease rapidly because of the interference caused by the echo.

Figure 1.

“AUV-T” equipped with eight sonars. AUV-T: tunnel inspection autonomous underwater vehicle.

Usually, ranging sonars work in a mode in which several sonars emit acoustic waves simultaneously at a fixed frequency and acquire obstacle information,⁴ but the sonar signals in different directions then interfere with each other. Experiments show that once the channel width falls below a certain threshold, the stability and accuracy of the ranging sonar data decrease rapidly as the channel narrows, and the data may even become erroneous. A polling mode can avoid the interference between the sonars, but it greatly reduces the rate at which data are acquired and thus the obstacle avoidance efficiency of the AUV. Reducing the polling interval improves the efficiency, but in a narrow channel a shorter polling interval causes mutual interference between the sonars, so accurate data cannot be obtained.

On the other hand, the different sampling frequencies of the sensors under the dynamic weighted polling mode (DWPM) cause asynchronous problems. From an engineering point of view, the existing time registration methods, such as least squares, extrapolation, and maximum entropy, have limitations. These methods take the sampling frequency of the low-frequency sensor as the standard, which reduces the utilization of the measured data and the accuracy of the system, so that the weighted polling mode proposed in this article would lose its purpose.

Recently, there has been much research on sonar interference and on the application of reinforcement learning to AUVs. Ohya et al. constructed two ultrasonic ranging systems with different characteristics to investigate the influence of the characteristics of the sensing system.⁵ Li et al. proposed an ultrasonic ranging method using a discrete chaotic phase modulated signal,⁶ which showed sharp autocorrelation and flat cross-correlation. Kleeman presented a sonar system that can produce accurate measurements and on-the-fly single-cycle classification of planes, corners, and edges,³ and showed how to use double pulse coding of the transmitted pulses to suppress interference and classify simultaneously. Kleeman also proposed an approach to rejecting interference between sonar systems based on identifying a transmitter by sending a double pulse with known separation,⁷ and demonstrated it experimentally. Browne and Kleeman proposed a sonar ring refreshing at 60 Hz for a 5.7-m range,⁴ which results in lower latency and denser measurements. Huang et al. designed the environment states and obstacle avoidance behaviors⁸ and then used reinforcement learning to select the state–action combinations; simulation results showed that the AUV can meet the requirements of safe navigation. Liu et al. adopted reinforcement learning to control an AUV,⁹ integrating Q-learning, a back-propagation neural network, and an artificial potential field to implement avoidance planning; a simulation test verified the validity and feasibility of the motion planning.

It can be seen that current research on sonar is mostly limited to improving sonar performance and can hardly solve the mutual interference between sonars caused by a narrow channel, while the application of sonar information in AUVs is limited to obstacle avoidance, with essentially no research on wall following. Considering the current sonar working modes and the shortcomings described above, this article proposes a DWPM for multi-sonar AUVs. In this mode, the system can independently identify the complexity of the environment, dynamically establish the environment model, and switch the working frequency of each sonar. Polling avoids the interference between the sonars and improves the accuracy of the sonar data. The system establishes a dynamic weighted frequency equation from the velocity component and the obstacle distance in each sonar direction and dynamically adjusts the working frequency, so that each sonar adaptively raises or lowers its working frequency according to the environment variables in its direction. When the speed is higher or the obstacle is closer, the obstacle distance in that direction can be acquired more quickly. An interpolation algorithm based on velocity interpolation time-registers the data of ranging sonars working at different frequencies, which solves the asynchronous problem of multiple ranging sonars, and the system outputs the obstacle distance at the frequency of the high-frequency sonar. This approach avoids the asynchronous problem caused by the mismatch of the sonar sampling frequencies and allows efficient data output. When following the wall, the distance between the AUV and the wall obtained from the polling mode is taken as the input of the reinforcement learning algorithm. According to the output action command, the AUV performs the corresponding yaw movement, so as to realize wall following.

The rest of the article is organized as follows. The multi-sonar DWPM steps are presented in the second section. In the third section, the data fusion and time registration algorithm are proposed. The application of reinforcement learning in AUV wall following is shown in the fourth section. Then the results of the experiments are given in the fifth section, and quantitative analysis is performed later. Finally, the sixth section concludes and summarizes the article.

Dynamic weighted polling mode

The multi-sonar weighted polling mode steps presented in this article are shown in Figure 2.

Figure 2.

Weighted polling mode.

Static safe distance

The static safe distance is defined by the positioning error area in the sonar mounting direction.¹⁰ Usually, the positioning error area of the AUV is an ellipse. As shown in Figure 3, the parameters of the ellipse include the standard deviation, variance, and spreading factor of the ranging sonar and the heading angle of the AUV. The equation is

\begin{array}{l} a = σ_{0} \sqrt{1 / 2 [σ_{e}^{2} + σ_{n}^{2} + \sqrt{{(σ_{e}^{2} - σ_{n}^{2})}^{2} + 4 σ_{e n}^{2}}]} \\ b = σ_{0} \sqrt{1 / 2 [σ_{e}^{2} - σ_{n}^{2} + \sqrt{{(σ_{e}^{2} - σ_{n}^{2})}^{2} + 4 σ_{e n}^{2}}]} \\ ϕ = π / 2 - 1 / 2 \arctan (\frac{2 σ_{e n}}{σ_{e}^{2} + σ_{n}^{2}}) \end{array}

Figure 3.

Safety alert distance. (a) $|v| = 0$ and (b) $|v| > 0$ .

In the equation, $σ_{e}$ stands for the standard deviation of the ranging sonar mounted on the left (or the right) side, and $σ_{n}$ stands for the standard deviation of the ranging sonar mounted on the bow (or the stern); $σ_{e}^{2}$ and $σ_{n}^{2}$ stand for the ranging variances of the ranging sonars, and $σ_{e n}$ and $σ_{n e}$ stand for their covariance; a stands for the semi-major axis of the positioning error ellipse, and b stands for its semi-minor axis; ϕ stands for the bow angle, which can be obtained in real time by the attitude sensor; σ₀ stands for the spreading factor, which is obtained empirically and is used to expand the error area. In the two-dimensional case, the credibility is taken as 95% when $σ_{0} = 2.15$ and as 99% when $σ_{0} = 3.03$ . The center of the ellipse is the current positioning position of the AUV.

Safety alert distance

The alert distance threshold consists of a static part and a velocity part:

h_{i} = d \cdot a_{i} + v_{i} \cdot b_{i}

In the equation, h_i stands for the safety alert distance threshold of the AUV in direction i (bow, stern, port, or starboard; the same below), d stands for the static safe distance, a_i stands for the static distance correction factor, v_i stands for the velocity, and b_i stands for the speed–distance correction factor.

Dynamic sonar sampling frequency

The safety distance triggering factor in the dynamic sonar sampling frequency equation is determined by the relationship between the obstacle distance detected by the sonar and the safety alert distance. Here is the equation

\begin{array}{l} f_{i} = f_{0} + h \cdot (v_{i} \cdot m_{i} + s_{i} \cdot n_{i}) \cdot g (x) \\ h = \begin{cases} 0, & h_{i} < s_{i} \\ 1, & h_{i} \geq s_{i} \end{cases} \end{array}

In the equation, f_i stands for the sonar sampling frequency in direction i, f_0 stands for the basic sampling frequency of the sonar, h stands for the safe distance trigger factor, v_i stands for the velocity, m_i stands for the speed–frequency correction factor, s_i stands for the obstacle distance detected by the sonar, n_i stands for the distance–frequency correction factor, and g(x) is the external interface function.

When the safety alert distance threshold is not reached, the AUV polls at the basic sampling frequency, and the obstacle information does not trigger local obstacle avoidance planning. When the threshold is reached, the dynamic sampling frequency mode is triggered: the system uses different sampling frequencies for the sonars in different directions, depending on the speed and the obstacle distance in each direction. In addition, the equation includes a unified external interface function g(x) for adjusting the sampling frequency under complex conditions. For example, if the voltage is low, the sampling frequency of all sonars can be reduced through the external interface function to save energy. g(x) is set to 1 when no external adjustment is required. Through the interface function, the system can use the AUV state and the sea condition information to make a secondary fine adjustment of the sonar sampling frequency.
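The two equations above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function and parameter names are hypothetical, g(x) is passed in as a plain number, and the numeric values in the usage lines are placeholders rather than values from the experiments.

```python
def alert_distance(d, a_i, v_i, b_i):
    """Safety alert distance threshold: h_i = d * a_i + v_i * b_i."""
    return d * a_i + v_i * b_i


def sampling_frequency(f0, h_i, s_i, v_i, m_i, n_i, g=1.0):
    """Dynamic sampling frequency: f_i = f0 + h * (v_i*m_i + s_i*n_i) * g.

    The trigger factor h is 1 when the detected obstacle distance s_i lies
    within the alert threshold (h_i >= s_i) and 0 otherwise, so the sonar
    stays at the basic frequency f0 while the obstacle is far away.
    """
    h = 1 if h_i >= s_i else 0
    return f0 + h * (v_i * m_i + s_i * n_i) * g


# Placeholder values, for illustration only.
h_port = alert_distance(d=2.0, a_i=1.0, v_i=0.5, b_i=2.0)   # threshold of 3.0
far = sampling_frequency(2.0, h_port, s_i=5.0, v_i=0.5, m_i=4.0, n_i=0.1)
near = sampling_frequency(2.0, h_port, s_i=2.0, v_i=0.5, m_i=4.0, n_i=0.1)
print(far, near)
```

With these placeholder factors, the far obstacle leaves the sonar at the basic frequency, while the near obstacle raises it. Passing a value of g below 1 scales only the dynamic part of the frequency, which matches the low-voltage case described in the text.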

Data fusion and time registration

There are three cases of weighted polling mode as shown in Figure 4.

Figure 4.

Data fusion and time registration.

Mode 1

When the distance in every direction is greater than the safety alert distance, the sonars are polled at the basic sampling frequency f_a. In this polling mode, the bow and port sonars transmit first, and after the interval t_a the stern and starboard sonars transmit.

Mode 2

When the sonars in some or all directions adopt the dynamic sampling frequency and the resulting sampling frequency is the same in every direction (denoted f_b), the sonars still work in the polling mode. The bow and port sonars transmit first, and after the interval t_b (t_b < t_a) the stern and starboard sonars transmit.

Mode 3

When the sonars in some or all directions adopt the dynamic sampling frequency and the sampling frequencies in the respective directions are not all the same, each sonar samples at its own frequency. To avoid interference between sonars facing opposite directions, the following algorithm is used.

Assume that the two sonars facing opposite directions are sonar 1 and sonar 2, with real-time sampling frequencies $f_{1}, f_{2} (f_{1} \leq f_{2})$ . Their real-time sampling intervals are

Δ t_{1} = \frac{1}{f_{1}}, Δ t_{2} = \frac{1}{f_{2}}

The smaller real-time sampling interval is $Δ t_{min} = min (Δ t_{1}, Δ t_{2})$ ; the upper limit of the sampling frequency of the ranging sonar is f_max and the lower limit is f_min (the basic sampling frequency); the lower limit of the sampling interval is t_min and the upper limit is t_max; the total working times of the two sonars are t₁ and t₂, respectively; and t is the total working time of the system. In the embedded operating system “VxWorks,” the operating period of the obstacle avoidance system is set to t₀, where t₀ is a common divisor of all sonar sampling intervals, that is, $Δ t_{1} = p \cdot t_{0} (p = 1, 2, 3 ...)$ and $Δ t_{2} = q \cdot t_{0} (q = 1, 2, 3 ...)$ . The delay of one t₀ cycle is achieved by recursive calls through the watchdog timer. When $t_{1} + Δ t_{1} = t_{2} + Δ t_{2}$ (i.e. the two sonars would sample at the same time at the next step), let $Δ t_{min} = 1.5 Δ t_{min}$ , so as to avoid mutual interference between the sonars in opposite directions.
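The collision rule of Mode 3 can be illustrated with a small scheduling sketch in Python. It rests on assumptions that the text does not state: the 1.5× stretch is applied only to the one cycle in which the collision would occur, both sonars start counting from time zero, and the function name `simulate_polling` is hypothetical. Exact fractions are used so that the equality test on firing times is reliable.

```python
from fractions import Fraction


def simulate_polling(dt1, dt2, horizon):
    """Return the firing times of two opposite sonars up to `horizon`.

    dt1 and dt2 are the sampling intervals (integers or Fractions). When
    both sonars are due at the same instant, the one with the smaller
    interval is delayed so that this cycle lasts 1.5 times its interval.
    """
    dt1, dt2 = Fraction(dt1), Fraction(dt2)
    next1, next2 = dt1, dt2          # next scheduled firing times
    fires1, fires2 = [], []
    while min(next1, next2) <= horizon:
        if next1 == next2:
            # Collision ahead: stretch the smaller interval by half of itself.
            if dt1 <= dt2:
                next1 += dt1 / 2
            else:
                next2 += dt2 / 2
        if next1 < next2:
            fires1.append(next1)
            next1 += dt1
        else:
            fires2.append(next2)
            next2 += dt2
    return fires1, fires2


slow, fast = simulate_polling(4, 2, 12)
print([float(t) for t in slow])
print([float(t) for t in fast])
```

In this example the two sonars would both fire at time 4. The faster sonar is pushed back to time 5, after which the two schedules no longer coincide within the simulated window.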

In all three cases, the different sampling frequencies of the sonars cause an asynchronous problem, and time registration is performed by a fusion algorithm based on velocity interpolation. Normally, the AUV moves slowly and its speed does not change suddenly. Therefore, the distance from the low-frequency signal can be interpolated using the velocity component in that direction and the elapsed time, so that the system can output at the rate of the high-frequency signal.

Assume that the frequency of the high-frequency signal is f_h and its total operating time is t_h, that the frequency of the low-frequency signal is f_l and its total operating time is t_l, and that the velocity component in the direction of the low-frequency sonar is v_l. The output distance based on velocity interpolation is then

s_{l} = {s^{'}}_{l} - (t_{h} - t_{l}) \cdot v_{l}

In the equation, s_l stands for the obstacle distance obtained by interpolating the low-frequency sensor data, and ${s^{'}}_{l}$ stands for the obstacle distance last measured by the low-frequency sensor. Whenever the high-frequency sensor samples, the interpolated data of the low-frequency sensor are also output to the system. In this way, the system can output the obstacle distance in all directions at the rate of the high-frequency signal.
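The interpolation equation can be written as a one-line Python function. This is only a sketch: the function name is hypothetical, the sample values are placeholders, and a positive v_l is read here as motion toward the obstacle, which is what the minus sign in the equation implies.

```python
def interpolate_distance(s_last, t_l, t_h, v_l):
    """Velocity-interpolated distance: s_l = s'_l - (t_h - t_l) * v_l.

    s_last : last distance measured by the low-frequency sonar (s'_l)
    t_l    : time of that low-frequency measurement
    t_h    : time of the current high-frequency sample
    v_l    : velocity component along the low-frequency sonar direction
    """
    return s_last - (t_h - t_l) * v_l


# The low-frequency sonar measured 5.0 m at t = 1.0 s; the high-frequency
# sonar samples again at t = 1.2 s while the vehicle closes in at 0.5 m/s.
print(interpolate_distance(s_last=5.0, t_l=1.0, t_h=1.2, v_l=0.5))
```

Because the AUV is slow and its speed changes gradually, a constant velocity over the short gap between the two samples is a reasonable approximation, which is the premise stated in the text.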

Application of reinforcement learning in AUV wall following

AUV wall following is achieved by adjusting the heading of the AUV while it inspects wall cracks in water conveyance tunnels. Because the AUV sails in tunnels whose environment is unknown, the desired heading cannot be set in advance, so the AUV obtains it in real time through a reinforcement learning algorithm. The input of the algorithm is the distance of the AUV from the wall: the AUV obtains the accurate distance from the wall through the DWPM and outputs the appropriate desired heading. Reinforcement learning¹¹ is combined with the artificial potential field¹² to achieve optimal control of the wall following task. In this article, a BP neural network and the Q-learning algorithm are combined.¹³⁻¹⁵ The output of each network corresponds to the Q value of an action, that is, $Q (x, a)$ . The Q function is defined as

Q (x_{t}, a_{t}) = r_{t} + γ \max_{a_{t} \in A} Q (x_{t + 1}, a_{t})

The above equation holds only under the optimal policy. In the learning phase, the error signal is

Δ Q = r_{t} + γ \max_{a_{t} \in A} Q (x_{t + 1}, a_{t}) - Q (x_{t}, a_{t})

where $Q (x_{t + 1}, a_{t})$ is the Q value corresponding to the next state. The error is minimized by adjusting the weights of the network. When Q-learning is realized by a BP neural network, the weight adjustment is

Δ W_{t} = α [r_{t} + γ \max_{a_{t} \in A} Q (x_{t + 1}, a_{t}) - Q (x_{t}, a_{t})]
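The error signal and the weight adjustment above amount to a few lines of arithmetic, sketched here in Python. The function names and the numeric values are illustrative assumptions. The sketch computes only the scalar quantities given by the equations; how that scalar is propagated through the BP network is not specified in the text and is left out.

```python
def td_error(r_t, gamma, q_next, q_current):
    """Error signal: dQ = r_t + gamma * max_a Q(x_{t+1}, a) - Q(x_t, a_t).

    q_next    : Q values of all actions in the next state x_{t+1}
    q_current : Q value of the action actually taken in state x_t
    """
    return r_t + gamma * max(q_next) - q_current


def weight_increment(alpha, delta_q):
    """Scalar weight adjustment: dW_t = alpha * dQ."""
    return alpha * delta_q


# Placeholder values: reward 0.5, discount 0.9, three candidate actions.
delta = td_error(r_t=0.5, gamma=0.9, q_next=[0.1, 0.4, 0.2], q_current=0.3)
print(delta, weight_increment(alpha=0.1, delta_q=delta))
```

A positive error means the action turned out better than the network predicted, so the weights move to raise the corresponding Q value. A negative error lowers it.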

The choice of action is guided by the value of the reinforcement function, and the external reinforcement value is determined by the potential field method. First, the resultant force on the AUV at time t is calculated as follows

F (t) = {(d_{t} - d_{0})}^{2}

As shown in Figure 5, d_t is the distance from AUV to the wall calculated by DWPM at time t, and d ₀ is the desired distance, that is, the following distance from AUV to the wall. Then the resultant force of AUV at time $t - 1$ can be expressed as

F (t - 1) = {(d_{t - 1} - d_{0})}^{2}

Figure 5.

Resultant force of AUV. AUV: autonomous underwater vehicle.

The evaluation function of following distance between AUV and wall is defined as

Δ F (t) = F (t) - F (t - 1)

When $Δ F (t) < 0$ , the distance between the AUV and the wall is approaching the desired distance, and the action should be rewarded. When $Δ F (t) > 0$ , the distance is moving away from the desired distance, and the action should be punished. The reinforcement signal $r (t)$ is therefore defined as

r (t) = \frac{1 - e^{\frac{Δ F (t)}{T}}}{1 + e^{\frac{Δ F (t)}{T}}}
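The reinforcement signal can be computed as in the following Python sketch. The text does not define T, so it is treated here as a positive scaling constant with an assumed default of 1.0, and the function names are hypothetical. The code uses the identity (1 - e^x) / (1 + e^x) = -tanh(x / 2), which gives the same value as the equation but does not overflow for large arguments.

```python
import math


def potential(d, d0):
    """Potential term F = (d - d0)^2 for a measured distance d."""
    return (d - d0) ** 2


def reinforcement_signal(d_t, d_prev, d0, T=1.0):
    """r(t) = (1 - exp(dF / T)) / (1 + exp(dF / T)), dF = F(t) - F(t - 1).

    T is an assumed positive scaling constant. The value lies in (-1, 1):
    positive when the AUV moved toward the desired distance d0 and
    negative when it moved away from it.
    """
    delta_f = potential(d_t, d0) - potential(d_prev, d0)
    return -math.tanh(delta_f / (2.0 * T))


# Desired distance 1.0 m: moving from 2.0 m to 1.5 m is rewarded,
# moving from 1.5 m back to 2.0 m is punished by the same amount.
print(reinforcement_signal(d_t=1.5, d_prev=2.0, d0=1.0))
print(reinforcement_signal(d_t=2.0, d_prev=1.5, d0=1.0))
```

A smaller T makes the signal saturate faster, so small distance changes already produce rewards close to ±1. A larger T keeps the signal nearly proportional to the change in F.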

The following behavior of the AUV along the wall is divided into nine actions: turning left with the maximum output, turning left 30°, turning left 20°, turning left 10°, going straight, turning right 10°, turning right 20°, turning right 30°, and turning right with the maximum output. During wall following, the AUV obtains the accurate distance from the wall through the DWPM and selects the appropriate action using the reinforcement learning algorithm. Through suitable control strategies,¹⁶⁻¹⁸ such as fuzzy dynamic surface control¹⁹ and backstepping sliding mode control,²⁰⁻²² the AUV can achieve precise wall following in accordance with the control instructions.
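As a rough Python sketch of how one of the nine actions might be chosen from the network outputs, the function below picks the action with the largest Q value. Several points are assumptions rather than statements from the text: the action labels are hypothetical, the second group of 10°, 20°, and 30° turns is taken to be right turns so that the set is symmetric, and purely greedy selection is used because the text does not describe an exploration strategy.

```python
# Nine discrete yaw actions, ordered from hardest left to hardest right.
ACTIONS = [
    "left_max", "left_30", "left_20", "left_10",
    "straight",
    "right_10", "right_20", "right_30", "right_max",
]


def select_action(q_values):
    """Return the label of the action with the largest Q value.

    q_values must hold one Q value per action, in the order of ACTIONS.
    Ties are resolved in favor of the earlier action in the list.
    """
    if len(q_values) != len(ACTIONS):
        raise ValueError("expected one Q value per action")
    best = max(range(len(ACTIONS)), key=lambda i: q_values[i])
    return ACTIONS[best]


# Placeholder Q values in which a gentle right turn scores highest.
print(select_action([0.1, 0.0, 0.2, 0.3, 0.4, 0.5, 0.9, 0.2, 0.1]))
```

The selected label would then be converted into a desired heading and handed to the lower-level heading controller.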

Experiment

As shown in Figure 6, experiments are carried out in the pool to verify the effectiveness of the proposed DWPM. The sonars are fixed on a specially customized bracket facing up, down, left, and right, and the bracket is then fixed on the x–y carriage. The carriage can realize precise motion control in the two-dimensional plane, so the motion of the AUV can be simulated by the x–y carriage.

Figure 6.

Pool test of sonar polling mode with x–y carriage.

As shown in Figure 7, the DWPM is implemented in software written in Visual C++ 6.0, which controls the working frequency of the sonars and collects the data of the multiple sonars.

Figure 7.

Sonar data acquisition software.

Because the upper sonar is close to the water surface, the upper and lower sonars interfere slightly with each other, so the left and right sonar data are taken as the example for analysis. The data obtained from the dock test (about 5 m wide) in the static state are shown in Figures 8 and 9. “L” indicates the distance obtained by the left sonar, and “R” indicates the distance obtained by the right sonar.

Figure 8.

Sonar data in non-polling mode.

Figure 9.

Sonar data in polling mode (conventional polling mode on the left and DWPM on the right). DWPM: dynamic weighted polling mode.

In the conventional polling mode, the left and right sonars are polled alternately at the same fixed time interval. When no polling mode is used, the interference between the sonars is relatively large: the standard deviations of the left and right sonar data are 9.52 and 1.92, respectively.

With the conventional polling mode, the standard deviations of the left and right sonar data are 0.027 and 0.023, respectively; with the DWPM, they are 0.023 and 0.007, respectively. The DWPM can therefore obtain more data in the same time while ensuring the same stability and accuracy as the conventional polling mode.

In the pool (about 30 m wide), the x–y carriage moves in the x and y directions simultaneously (the y direction is along the sonar sound direction, and the x direction is perpendicular to it). The speed in the x direction is $v_{x} = 1.0 m/s$ , and the sonar data obtained at different speeds in the y direction are shown in Figure 10. The speeds of the carriage in the upper left, upper right, lower left, and lower right panels of Figure 10 are $v_{y} = 0.1, 0.5, 1, 1.5 m/s$ , respectively.

Figure 10.

Sonar data obtained when the vehicle is moving.

When $v_{y} = 0.5 m/s$ , the average speeds obtained from the left and right sonar data are 0.5122 and 0.4982 m/s, and the errors relative to the actual value (0.5 m/s) are 2.44% and 0.36%. When $v_{y} = 1 m/s$ , the average speeds are 1.0358 and 1.0366 m/s, and the errors relative to the actual value (1 m/s) are 3.58% and 3.66%. When $v_{y} = 1.5 m/s$ , the average speeds are 1.5077 and 1.4922 m/s, and the errors relative to the actual value (1.5 m/s) are 0.51% and 0.52%. It can be seen that sonars working in the multi-sonar DWPM can still obtain stable and accurate data while the vehicle is moving.

As shown in Figure 11, the reinforcement learning algorithm is verified in the pool, and the effectiveness of following the wall is shown in Figure 12.

Figure 11.

Verification test in the pool.

Figure 12.

Wall following effect in pool test.

It can be seen that the AUV can follow the wall stably using the motion instructions obtained by the reinforcement learning algorithm, and the following error is less than 0.5 m. The reinforcement learning algorithm proposed in this article is therefore effective.

Conclusion

With the multi-sonar weighted polling mode proposed in this article, the wall distance information can be obtained dynamically and efficiently, and mutual interference between the sonars is avoided. The time registration algorithm based on velocity interpolation avoids the asynchronous problems caused by inconsistent sensor frequencies. During wall following, the distance between the AUV and the wall obtained from the DWPM is used as the input of the reinforcement learning algorithm, and the corresponding yaw motion is executed according to the output action, so as to realize wall following. Finally, the effectiveness of the proposed algorithm is verified by the pool test.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work was supported in part by the China National Natural Science Foundation (Nos. 51779057, 51709061), and the Equipment Pre-research Project (Project Number 41412030201).

ORCID iDs

Xiangbin Wang

Yushan Sun

Jian Cao

References

1. Acosta, Ibanez OAC, Curti, et al. Low-cost autonomous underwater vehicle for pipeline and cable inspections. New York: IEEE, 2007.

2. Karras, Bechlioulis, Abdella, et al. A robust sonar servo control scheme for wall-following using an autonomous underwater vehicle. In: Proceedings of the IEEE/RSJ international conference on intelligent robots & systems, Tokyo, Japan, 3–7 November 2013.

3. Kleeman. Advanced sonar with velocity compensation. Int J Robot Res 2004; 23(2): 111–126.

4. Browne, Kleeman. A double refresh rate sonar ring with FPGA-based continuous matched filtering. Robotica 2012; 30(7): 1051–1062.

5. Ohya, Ohno, Yuta. Obstacle detectability of ultrasonic ranging system and sonar map understanding. Robot Auton Syst 1996; 18(1–2): 251–257.

6. Wang, Kong. A novel sonar ranging method using discrete tent-map chaotic phase modulated signal. J Comput Inf Syst 2013; 9(14): 5485–5493.

7. Kleeman. Real time mobile robot sonar with interference rejection. Sens Rev 1999; 19(3): 214–221.

8. Huang, Pan, Chen, et al. Simulation research on obstacle avoidance of autonomous underwater vehicle based on single beam ranging sonar. J Xiamen Univ 2014; 53(4): 484–489.

9. Liu L-X, Zhang W-D, Bai J-Y, et al. Design of ranging obstacle avoidance sonar system based on MEMS bionic vector hydrophone. Dianzi Keji Daxue Xuebao/J Univ Electron Sci Technol China 2016; 45(1): 155–160.

10. Sun, Zhang, et al. Obstacle avoidance of autonomous underwater vehicle based on improved balance of motion. In: Proceedings of the 2nd international conference on measurement, information and control (ICMIC), Harbin, China, 16–18 August 2013.

11. Ahmadzadeh, Kormushev, Caldwell. Multi-objective reinforcement learning for AUV thruster failure recovery. In: Proceedings of the computational intelligence, Orlando, FL, USA, 9–12 December 2014.

12. Cao, Sun, Guo. Potential field hierarchical reinforcement learning approach for target search by multi-AUV in 3-D underwater environments. Int J Control 2018; 1–12.

13. Zhang, Dong, et al. Local planning of AUV based on fuzzy-Q learning in strong sea flow field. In: 2009 International joint conference on computational sciences and optimization, Sanya, Hainan, China, 24–26 April 2009.

14. Lei, Luo, Zhou, et al. Multiperspective light field reconstruction method via transfer reinforcement learning. Comput Intell Neurosci 2020; 2020: 1–14.

15. Yushan, Cheng, Zhang, et al. Mapless motion planning system for an autonomous underwater vehicle using policy gradient-based deep reinforcement learning. J Intell Robot Syst 2019; 96(3–4): 591–601.

16. Xiao, Wang, et al. Swarm control with collision avoidance for multiple underactuated surface vehicles. Ocean Eng 2019; 191(106516): 1–10.

17. Xiao, Wang, et al. A novel distributed and self-organized swarm control framework for underactuated unmanned marine vehicles. IEEE Access 2019; 7: 112703–112712.

18. Matveev, Teimoori, Savkin. A method for guidance and control of an autonomous vehicle in problems of border patrolling and obstacle avoidance. Automatica 2011; 47(3): 515–524.

19. Xiao, Wang, et al. Three-dimensional trajectory tracking of an underactuated AUV based on fuzzy dynamic surface control. IET Intell Transp Syst 2019. DOI: 10.1049/iet-its.2019.0347.

20. Xiao, Wan, et al. Three-dimensional path following of an underactuated AUV based on fuzzy backstepping sliding mode control. Int J Fuzzy Syst 2018; 20(2): 640–649.

21. Zhenzhong, Meng, Zhu, et al. Fault reconstruction using a terminal sliding mode observer for a class of second-order MIMO uncertain nonlinear systems. ISA Trans 2020. DOI: 10.1016/j.isatra.2019.07.024.

22. Xiangbin, Zhang, Sun, et al. AUV near-wall-following control based on adaptive disturbance observer. Ocean Eng 2019; 190: 106429.