Abstract
Reinforcement learning (RL) has achieved significant progress in UAV autonomous navigation. However, existing methods typically rely on either purely discrete or purely continuous action spaces to define the UAV's maneuver mode. A discrete action space is simple to implement and converges quickly, but lacks sufficient control granularity. In contrast, a continuous action space provides higher control resolution but often leads to inefficient training and susceptibility to local optima. Existing RL methods cannot adaptively switch maneuver modes within a unified framework because discrete and continuous action spaces differ fundamentally in structure and control objectives. To address this issue, we propose a hierarchical reinforcement learning framework with a hybrid action space (HAS-HRL). Specifically, the high-level policy adaptively selects the maneuver mode according to the environment context, while the low-level policy consists of a set of primitive navigation skills associated with the hybrid maneuver modes. These skills generate executable control commands, enabling the UAV to perform smooth maneuvers in dense obstacle regions while cruising efficiently in open spaces. Furthermore, an event-triggered control rule is introduced to provide structured prior guidance during the early training stage, thereby improving exploration efficiency and convergence stability. Experiments in various simulation environments demonstrate that the proposed HAS-HRL framework consistently outperforms single-layer RL and HRL baselines in terms of success rate, obstacle-avoidance performance, and training stability. The results show that the hybrid maneuver modes effectively balance flight safety and navigation efficiency, offering a robust and efficient solution for UAV autonomous navigation in complex scenarios.
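The two-level decision loop described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual interface: the mode set, the observation fields, and the hand-coded switching rule (which in HAS-HRL would be a learned high-level policy) are all assumptions made for the example.

```python
import numpy as np

MODES = ["cruise", "avoid"]  # hypothetical discrete maneuver modes

def high_level_policy(obs):
    """Select a maneuver mode from the environment context.
    Placeholder rule standing in for the learned high-level policy:
    switch to 'avoid' when the nearest obstacle is close."""
    return "avoid" if obs["nearest_obstacle_dist"] < 5.0 else "cruise"

def low_level_skill(mode, obs):
    """Primitive navigation skill for the chosen mode, emitting a
    continuous control command (heading change, speed)."""
    if mode == "cruise":
        # open space: fly straight at high speed
        return {"d_heading": 0.0, "speed": 10.0}
    # dense obstacles: steer away from the obstacle at reduced speed
    return {"d_heading": -0.3 * np.sign(obs["obstacle_bearing"]),
            "speed": 4.0}

# One step of the hierarchy: high level picks the mode,
# the mode-specific skill produces the executable command.
obs = {"nearest_obstacle_dist": 3.2, "obstacle_bearing": 0.8}
mode = high_level_policy(obs)
cmd = low_level_skill(mode, obs)
print(mode, cmd)
```

The hybrid structure appears in the action's two parts: a discrete mode choice at the top and a continuous command at the bottom, which is what lets the agent cruise efficiently in open space yet maneuver smoothly near obstacles.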