Abstract
This paper investigates the active defense guidance problem for flight vehicles in the target-pursuer-defender scenario. In practice, the target flight vehicle with active defense attempts to evade the pursuer while being subject to incomplete observation information and observation noise. To handle these limitations, this paper proposes a novel cooperative active defense guidance law based on a convolutional dueling double deep Q-network (CD3QN) reinforcement learning algorithm. First, exploiting the spatiotemporal continuity of flight vehicle motion, a stacking mechanism is introduced that transforms the incomplete observations into a plane tensor. Convolutional neural networks are then employed to extract a feature tensor from the stacked information, which the dueling deep Q-network uses to derive the guidance law. The CD3QN algorithm addresses the partial observability problem by exploiting the correlation between the feature tensor and the underlying state. Moreover, a continuous reward function is shaped from environmental potential functions, which preserves the optimality of the policy and mitigates the sparse reward problem during CD3QN training. Finally, numerical experiments demonstrate the convergence, efficiency, performance, and robustness of the proposed active defense guidance law.
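The two mechanisms named in the abstract, stacking recent incomplete observations into a plane tensor and combining a state value with per-action advantages in a dueling Q-network, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's implementation: the class and function names (`ObservationStacker`, `dueling_q_values`), the stack depth `k`, and the zero-initialized history are all assumptions, and the CNN feature extractor between the two steps is omitted.

```python
from collections import deque
import numpy as np

class ObservationStacker:
    """Sketch of the stacking mechanism: the k most recent (possibly
    incomplete) observation vectors are stacked into a 2-D "plane tensor"
    of shape (k, obs_dim) suitable for a convolutional feature extractor."""

    def __init__(self, k, obs_dim):
        # History is assumed to start zero-filled; the paper may initialize
        # it differently.
        self.buffer = deque([np.zeros(obs_dim)] * k, maxlen=k)

    def push(self, obs):
        """Append the newest observation and return the stacked tensor."""
        self.buffer.append(np.asarray(obs, dtype=np.float64))
        return np.stack(self.buffer)  # shape: (k, obs_dim)

def dueling_q_values(value, advantages):
    """Standard dueling aggregation of a scalar state value V(s) and
    per-action advantages A(s, a):
        Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    advantages = np.asarray(advantages, dtype=np.float64)
    return value + advantages - advantages.mean()
```

In a full CD3QN pipeline the stacked tensor would first pass through convolutional layers, and the resulting feature tensor would feed the value and advantage streams whose outputs `dueling_q_values` combines.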
