Intelligent multi-target cooperative assignment strategy based on deep reinforcement learning

Abstract

An intelligent assignment strategy based on deep reinforcement learning is proposed for the multi-target allocation problem in interception. To improve training efficiency, a comprehensive reward function is designed around the concept of marginal reward. To address the poor search ability caused by the huge solution space, an action-masking module is introduced to modify the search strategy of the original deep Q-network (DQN) algorithm. The results show that the improved algorithm converges quickly and smoothly. The trained strategy is generalizable and can generate a satisfactory allocation scheme in a short time.
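The two ideas named in the abstract, action masking and marginal reward, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the epsilon-greedy exploration rule, and the reward form (the standard weapon-target-assignment marginal return, target value times current survival probability times kill probability) are assumptions chosen for illustration only.

```python
import numpy as np


def masked_epsilon_greedy(q_values, valid_mask, epsilon, rng):
    """Select an action index while never choosing a masked-out action.

    q_values   : 1-D array of Q-value estimates, one per action.
    valid_mask : 1-D boolean array, True where the action is allowed.
    epsilon    : exploration probability in [0, 1].
    rng        : numpy.random.Generator used for exploration.
    """
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if not valid_mask.any():
        raise ValueError("no valid action available")
    if rng.random() < epsilon:
        # Explore: sample uniformly from the valid actions only.
        return int(rng.choice(np.flatnonzero(valid_mask)))
    # Exploit: invalid actions are set to -inf so argmax cannot pick them.
    masked_q = np.where(valid_mask, np.asarray(q_values, dtype=float), -np.inf)
    return int(np.argmax(masked_q))


def marginal_reward(target_value, survival_prob, kill_prob):
    """Assumed reward form: expected drop in surviving target value
    when one more interceptor is assigned to the target."""
    return target_value * survival_prob * kill_prob


# Hypothetical usage: action 0 has the highest Q-value but is masked out,
# so the greedy choice is taken among actions 1 and 2 only.
rng = np.random.default_rng(0)
q = [5.0, 1.0, 3.0]
mask = [False, True, True]
action = masked_epsilon_greedy(q, mask, epsilon=0.0, rng=rng)
```

Masking both the exploratory and the greedy branch shrinks the effective search space, which is the motivation the abstract gives for the module. How the mask itself is built (for example, which interceptor-target pairs are infeasible) is not specified in the abstract and is left open here.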

Keywords

Multi-target assignment, damage effectiveness, intelligent decision, deep reinforcement learning, deep Q-network

