Intelligent diving guidance with terminal angle and velocity constraints via deep reinforcement learning

Abstract

This paper proposes an adaptive guidance strategy that integrates optimal guidance and deep reinforcement learning to address a highly dynamic terminal guidance problem with terminal position, angle, and velocity constraints. The strategy uses optimal guidance commands to control position and angle, and adds a deep reinforcement learning-based bias to satisfy the velocity constraint. During training, a dual-velocity state space is constructed to improve the strategy's adaptability to different guidance tasks, and prediction-correction and expert knowledge are used to improve both training efficiency and the optimality of the strategy. Simulations demonstrate that the proposed guidance strategy can simultaneously control terminal position, angle, and velocity, and can adapt to different guidance tasks.
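The structure described above, a baseline optimal command plus a learned bias, can be illustrated with a minimal sketch. It is not the authors' implementation. The names `guidance_command`, `bias_policy`, and `constant_bias`, the acceleration limit `a_max`, the contents of `state`, and all numeric values are assumptions made only for illustration. The abstract does not specify the optimal guidance law, the state definition, or any saturation step.

```python
from typing import Callable, Sequence


def clip(value: float, limit: float) -> float:
    """Saturate value to the symmetric interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def guidance_command(
    a_optimal: float,
    state: Sequence[float],
    bias_policy: Callable[[Sequence[float]], float],
    a_max: float,
) -> float:
    """Return the baseline command plus a learned bias, saturated at a_max.

    a_optimal   : command from the position/angle guidance law (assumed given)
    state       : observation passed to the policy (contents are hypothetical)
    bias_policy : stand-in for a trained deep reinforcement learning actor
    a_max       : assumed acceleration limit (not stated in the abstract)
    """
    bias = bias_policy(state)
    return clip(a_optimal + bias, a_max)


def constant_bias(state: Sequence[float]) -> float:
    """Hypothetical placeholder policy: a fixed bias instead of a network."""
    return -5.0


# Placeholder numbers only; the two state entries are illustrative values.
cmd = guidance_command(
    a_optimal=30.0,
    state=[2000.0, 1500.0],
    bias_policy=constant_bias,
    a_max=50.0,
)
print(cmd)
```

In this arrangement the baseline law remains responsible for the position and angle constraints, and the policy only has to learn a correction aimed at the velocity constraint.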

Keywords

Diving flight, intelligent guidance, velocity control, angle control, deep reinforcement learning
