Hybrid path planning for UAV using rapidly-expanding random trees and proximal policy optimization in complex environments

Abstract

This paper proposes a hybrid Unmanned Aerial Vehicle (UAV) path planning method that combines the Rapidly-exploring Random Tree (RRT) algorithm with Proximal Policy Optimization (PPO). The proposed method aims to enhance the efficiency and adaptability of UAV path planning in complex and dynamic environments. The RRT algorithm excels at quickly generating a feasible path from a start point to a goal point. However, the quality of its paths is often suboptimal, and it lacks adaptability in dynamic settings. In contrast, PPO, a deep reinforcement learning algorithm, optimizes paths through iterative policy updates, enabling the UAV to adapt to environmental changes. Our approach first employs RRT to generate an initial path, which is subsequently refined by PPO to improve smoothness and adaptability. Experimental results demonstrate that the RRT-PPO hybrid method performs favorably in terms of path length, computational time, and obstacle avoidance capability, effectively improving the task completion efficiency of UAVs in complex environments.
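To make the first stage of the pipeline concrete, the following is a minimal 2D sketch of how an RRT can produce an initial feasible path. It is not the authors' implementation: the circular-obstacle world, the function names `rrt` and `path_length`, and the parameters `step`, `goal_tol`, and `goal_bias` are all assumptions chosen for illustration. The PPO refinement stage is not shown.

```python
import math
import random


def rrt(start, goal, obstacles, bounds, step=0.5, goal_tol=0.5,
        goal_bias=0.1, max_iters=5000, seed=0):
    """Grow a tree from `start` until a node can be connected to `goal`.

    obstacles: list of (cx, cy, radius) circles (a hypothetical 2D world).
    bounds:    (xmin, xmax, ymin, ymax) sampling region.
    Returns a list of (x, y) waypoints from start to goal, or None.
    """
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}  # node index -> parent index

    def collision_free(p, q, samples=10):
        # Test evenly spaced points on the segment p -> q against each circle.
        for i in range(samples + 1):
            t = i / samples
            x = p[0] + t * (q[0] - p[0])
            y = p[1] + t * (q[1] - p[1])
            for cx, cy, r in obstacles:
                if math.hypot(x - cx, y - cy) <= r:
                    return False
        return True

    for _ in range(max_iters):
        # Occasionally sample the goal itself to bias growth toward it.
        if rng.random() < goal_bias:
            sample = goal
        else:
            sample = (rng.uniform(bounds[0], bounds[1]),
                      rng.uniform(bounds[2], bounds[3]))

        # Find the existing node nearest to the sample.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue

        # Steer from the nearest node toward the sample by at most `step`.
        scale = min(step, d) / d
        new = (near[0] + scale * (sample[0] - near[0]),
               near[1] + scale * (sample[1] - near[1]))
        if not collision_free(near, new):
            continue

        nodes.append(new)
        parent[len(nodes) - 1] = i_near

        # Stop once the new node is close to the goal and can reach it.
        if math.dist(new, goal) <= goal_tol and collision_free(new, goal):
            path = []
            i = len(nodes) - 1
            while i is not None:  # walk parent links back to the root
                path.append(nodes[i])
                i = parent[i]
            path.reverse()
            if path[-1] != goal:
                path.append(goal)
            return path
    return None


def path_length(path):
    """Total Euclidean length of a waypoint list."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


# Example call on a made-up 10 x 10 map with one circular obstacle.
example_path = rrt((0.0, 0.0), (9.0, 9.0), [(5.0, 5.0, 1.5)], (0, 10, 0, 10))
```

A path produced this way is typically jagged and longer than necessary, which is the shortcoming the abstract attributes to RRT and the motivation for a subsequent refinement stage.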

Keywords

UAV, path planning, RRT, PPO, complex environments
