Abstract
Excavator path planning in modern smart ports enables efficient and safe access to work areas by accurately locating bulk storage sites. This not only improves operational efficiency and reduces costs but also safeguards the movement of large machinery. However, port bulk cargo areas often contain irregular and dynamic obstacles, posing significant challenges to traditional path planning algorithms. This study addresses local optima and training instability in excavator path planning by introducing AS-TD3, an enhanced deep reinforcement learning (DRL) algorithm based on TD3. The algorithm is evaluated in a continuous-map simulation of a port bulk cargo environment. By integrating an A* heuristic function into the reward mechanism of the TD3 algorithm, AS-TD3 improves the discovery of globally optimal solutions by accounting for distance, time, and state variations during path planning. The A* component provides an efficient heuristic search, while TD3 refines decision-making through reinforcement learning. Additionally, an epsilon-greedy strategy balances exploration and exploitation, yielding smooth convergence of the reward curve. Experimental results show that AS-TD3 reduces the steps required to find the optimal path by 5.7% and accelerates convergence by 58.75%.
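The abstract does not include implementation details, but the two ideas it names, shaping the TD3 reward with an A*-style heuristic and using epsilon-greedy exploration, can be sketched as follows. This is a minimal illustrative sketch under assumed names and parameters (`GOAL`, `STEP_PENALTY`, `EPSILON`), not the authors' actual code.

```python
import math
import random

# Illustrative sketch only: reward shaping with an A*-style heuristic
# (straight-line distance to goal) plus epsilon-greedy exploration for a
# continuous-action policy such as TD3. All names and constants here are
# assumptions for demonstration, not the paper's actual parameters.

GOAL = (9.0, 9.0)
STEP_PENALTY = -0.01   # small per-step cost, encourages shorter paths
EPSILON = 0.1          # probability of taking a random exploratory action

def heuristic(pos):
    """A*-style admissible heuristic: Euclidean distance to the goal."""
    return math.hypot(GOAL[0] - pos[0], GOAL[1] - pos[1])

def shaped_reward(prev_pos, new_pos):
    """Reward progress: the decrease in heuristic distance to the goal,
    minus a small penalty for every step taken (distance + time terms)."""
    return (heuristic(prev_pos) - heuristic(new_pos)) + STEP_PENALTY

def select_action(policy_action, action_low=-1.0, action_high=1.0):
    """Epsilon-greedy wrapper: mostly follow the learned policy's action,
    occasionally sample a uniformly random continuous action."""
    if random.random() < EPSILON:
        return [random.uniform(action_low, action_high) for _ in range(2)]
    return policy_action
```

In this sketch, moving closer to the goal yields a positive shaped reward and moving away yields a negative one, which is one common way to encode the distance and time considerations the abstract mentions.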
