Abstract
Industry 5.0 emphasizes the human role in intelligent manufacturing, particularly in flexible job shops, where optimizing workforce scheduling is crucial. However, current research on deep reinforcement learning in flexible job shop scheduling mainly focuses on single-objective or single-resource problems, with limited exploration of dual-resource dynamic scheduling that considers human factors. This study considers two dynamic scenarios—new order insertion and machine failure—and proposes a human-machine collaborative flexible job shop dynamic scheduling method based on a multi-proximal policy optimization algorithm integrated with hybrid prioritized experience replay. The algorithm constructs three independent actor networks to enable parallel learning of action policies for job selection, equipment allocation, and worker assignment, thereby enhancing training efficiency. It employs a shared critic network for coordinated value estimation across these components, ensuring consistency in policy updates. A hybrid prioritized experience replay mechanism further improves the algorithm's convergence speed. Numerical experiments demonstrate that, compared with traditional scheduling rules, heuristic algorithms, and other deep learning algorithms, the proposed method exhibits notable superiority across test cases of various scales.
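The three-actor, shared-critic architecture described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the network sizes, state dimension, and action counts (`N_JOBS`, `N_MACHINES`, `N_WORKERS`) are hypothetical placeholders, and the hybrid prioritized experience replay component is omitted.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """One discrete policy head; the method uses three of these in
    parallel (job selection, equipment allocation, worker assignment)."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.distributions.Categorical:
        # Categorical policy over this actor's action set
        return torch.distributions.Categorical(logits=self.net(state))

class SharedCritic(nn.Module):
    """Single value network shared by all three actors, so the three
    PPO policy updates are driven by one coordinated value estimate."""
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state).squeeze(-1)

# Hypothetical problem sizes for illustration only
STATE_DIM, N_JOBS, N_MACHINES, N_WORKERS = 12, 5, 4, 3

actors = {
    "job": Actor(STATE_DIM, N_JOBS),
    "machine": Actor(STATE_DIM, N_MACHINES),
    "worker": Actor(STATE_DIM, N_WORKERS),
}
critic = SharedCritic(STATE_DIM)

# One decision step: each actor samples its own action from the
# shared shop-floor state; the critic supplies a common baseline.
state = torch.randn(STATE_DIM)
actions = {name: actor(state).sample() for name, actor in actors.items()}
value = critic(state)
```

In a full training loop, each actor would be updated with its own PPO clipped-surrogate loss, while advantages for all three would be computed from the single critic's value estimates, which is what keeps the three policy updates consistent.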
