Abstract
With the rapid development of robotic and artificial intelligence technologies, the autonomous decision-making capability of endoscopic surgical robots has been significantly enhanced, accompanied by growing demands for automation in non-decision-making tasks. This study focuses on path planning for complex tasks in surgical robots by proposing a hierarchical reinforcement learning (HRL) framework based on the option framework. Within this framework, the Semi-Markov Decision Process (SMDP) is extended into an augmented Markov Decision Process (MDP) to optimize termination conditions and facilitate long-horizon task training. To address the sparse-reward problem, a hierarchical reward function is designed, with intrinsic temporal rewards implemented specifically for high-level policies. In addition, a Type-Shared Option Policy (TSOP) is proposed to enhance training efficiency. Experimental results demonstrate that the proposed HRL framework effectively improves both the success rate and the stability of path planning for surgical robots in the da Vinci Research Kit (dVRK) simulation environment.
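The option framework the abstract builds on treats each option as a temporally extended action with an intra-option policy and a termination condition, so that one high-level (SMDP) step spans many primitive steps. The sketch below illustrates that structure only; the `Option` class, the toy 1-D environment, and all reward values are illustrative assumptions, not the paper's implementation.

```python
import random

class Option:
    """A temporally extended action: intra-option policy plus termination condition.
    (Illustrative structure, not the paper's TSOP implementation.)"""
    def __init__(self, name, policy, termination):
        self.name = name
        self.policy = policy            # state -> primitive action
        self.termination = termination  # state -> probability of terminating

def run_option(env_step, state, option, max_steps=50):
    """Execute one option until its termination condition fires (one SMDP step)."""
    total_reward, steps = 0.0, 0
    while steps < max_steps:
        action = option.policy(state)
        state, reward = env_step(state, action)
        total_reward += reward
        steps += 1
        if random.random() < option.termination(state):
            break
    return state, total_reward, steps

# Hypothetical 1-D environment: step toward goal position 5; sparse goal reward.
def env_step(state, action):
    nxt = state + action
    return nxt, 1.0 if nxt == 5 else -0.1

move_right = Option(
    "move_right",
    policy=lambda s: 1,                              # always move +1
    termination=lambda s: 1.0 if s >= 5 else 0.0,    # terminate only at goal
)

state, reward, steps = run_option(env_step, 0, move_right)
```

A high-level policy would choose among several such options; extending the SMDP into an augmented MDP, as the abstract describes, additionally makes the termination functions themselves subject to optimization.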
