Abstract
An improved Deep Deterministic Policy Gradient (DDPG) algorithm that integrates self-attention mechanisms, adversarial training, and prioritized experience replay is proposed to address the limited state-representation capability, sample efficiency, and robustness of DDPG in complex dynamic environments. First, a multihead self-attention module is built into the Critic network; by computing multidimensional state-action association features in parallel, it strengthens the network's spatial modeling of complex environments, improving the accuracy of Q-value estimation and the stability of training. Second, an adversarial training mechanism is introduced in which adversarial samples are mixed into the training data in a fixed proportion, improving the algorithm's robustness to state disturbances. Third, a prioritized experience replay buffer based on the SumTree structure is designed, raising the reuse efficiency of high-value samples. Finally, dynamic and static scenarios are built on the Gazebo simulation platform and real-world experiments are conducted. The results show that, compared with the original DDPG algorithm, the improved algorithm converges 55.3% faster in dynamic scenarios, reduces Critic loss by 80.2%, and generalizes better across scenarios. Real-world tests further confirm the algorithm's advantages in dynamic obstacle avoidance and trajectory smoothness.
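
The abstract does not give implementation details, but the SumTree-based prioritized replay it mentions follows a well-known pattern. The sketch below is a minimal Python illustration of that pattern under standard proportional-prioritization assumptions; the class names and hyperparameters (`capacity`, `alpha`, `eps`) are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of a SumTree-backed prioritized replay buffer (assumed
# proportional prioritization p_i = (|TD error| + eps)^alpha; details are
# illustrative, not from the source paper).
import random

class SumTree:
    """Binary tree whose internal nodes hold the sum of their children's
    priorities, enabling O(log n) priority-proportional sampling."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity - 1)  # internal nodes + leaves
        self.data = [None] * capacity           # stored transitions
        self.write = 0                          # next leaf to overwrite

    def add(self, priority, transition):
        leaf = self.write + self.capacity - 1
        self.data[self.write] = transition
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity

    def update(self, leaf, priority):
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf > 0:                         # propagate change to root
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def sample(self, value):
        """Descend from the root, following the child whose cumulative
        priority range contains `value`."""
        idx = 0
        while 2 * idx + 1 < len(self.tree):     # stop at a leaf
            left = 2 * idx + 1
            if value <= self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx, self.data[idx - self.capacity + 1]

class PrioritizedReplayBuffer:
    def __init__(self, capacity=100_000, alpha=0.6, eps=1e-5):
        self.tree = SumTree(capacity)
        self.alpha = alpha                      # priority exponent
        self.eps = eps                          # keeps priorities nonzero
        self.max_priority = 1.0

    def push(self, transition):
        # New transitions get the current max priority so they are
        # sampled at least once before their TD error is known.
        self.tree.add(self.max_priority ** self.alpha, transition)

    def sample(self, batch_size):
        total = self.tree.tree[0]               # sum of all priorities
        segment = total / batch_size
        leaves, batch = [], []
        for i in range(batch_size):             # stratified sampling
            v = random.uniform(segment * i, segment * (i + 1))
            leaf, data = self.tree.sample(v)
            leaves.append(leaf)
            batch.append(data)
        return leaves, batch

    def update_priorities(self, leaves, td_errors):
        # After a Critic update, re-prioritize sampled transitions by
        # their fresh TD errors so high-value samples are reused more.
        for leaf, err in zip(leaves, td_errors):
            self.max_priority = max(self.max_priority, abs(err) + self.eps)
            self.tree.update(leaf, (abs(err) + self.eps) ** self.alpha)
```

In a DDPG training loop of the kind described, `push` would be called after each environment step, `sample` when drawing a minibatch for the Actor and Critic updates, and `update_priorities` with the Critic's TD errors after each update.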
