Abstract
The inherent training instability of the Deep Deterministic Policy Gradient (DDPG) algorithm has critically hindered its practical application to complex, safety-critical tasks such as quadrotor attitude control. To address this challenge, this paper proposes an integrated approach named RS-DDPG (Robust and Stabilized DDPG), designed to enhance training stability and controller robustness. While individual components such as delayed policy updates (adapted from Twin Delayed DDPG (TD3)) and exponential reward functions have been explored previously, our contribution lies in the synergistic integration of these elements with a structured curriculum and evaluation framework, and this holistic approach proves particularly effective for this control problem. Extensive simulations and ablation studies, benchmarked against both standard DDPG and TD3, provide strong evidence for the efficacy of our approach. The resulting controller not only surpasses the baselines in convergence speed and tracking performance but also exhibits exceptional robustness against a wide range of random initial states, persistent external disturbances, and significant model uncertainties. This work demonstrates how the careful integration of existing and novel components can yield a reliable, high-performance, data-driven controller, representing a vital step toward bridging the gap between simulation and real-world deployment in aerial robotics.
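The two borrowed mechanisms named in the abstract can be sketched in a few lines; note that the exponential gain, the error norm, and the update cadence below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def exponential_reward(attitude_error, k=5.0):
    """Exponential reward shaping (illustrative): returns a value in (0, 1],
    equal to 1 at zero attitude error and decaying smoothly with the error
    norm. A dense, bounded signal of this kind is commonly used to stabilize
    DDPG-style training compared with sparse or unbounded quadratic costs."""
    return float(np.exp(-k * np.linalg.norm(attitude_error)))

def should_update_actor(step, policy_delay=2):
    """Delayed policy updates (the TD3 idea the abstract refers to):
    the actor and target networks are updated only once every
    `policy_delay` critic updates, reducing policy-gradient variance."""
    return step % policy_delay == 0
```

In a training loop, the critic would be updated every step, while `should_update_actor(step)` gates the actor update; the shaped reward replaces a raw state-error cost in the environment.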
