Abstract
Faced with rapidly growing demand for instant delivery, traditional logistics delivery modes struggle to keep pace because of capacity constraints. Autonomous delivery vehicles (ADVs) can compensate for the shortage of human labor. Because ADVs rely on batteries for propulsion, they must occasionally return to battery-swapping stations during delivery to maintain their state of charge. To apply ADVs to instant delivery, we use agent-based modeling to define behavioral rules for the customers, the ADVs, and the distribution center, thereby creating an instant-delivery scheduling simulation environment. A vehicle routing problem with time windows (VRPTW) model is formulated and solved with an adaptive large neighborhood search (ALNS) heuristic to optimize delivery scheduling. Because environmental conditions change dynamically, we use the Dueling Double Deep Q-Network (D3QN) deep reinforcement learning algorithm, which adapts to such changes, to train ADVs to make autonomous battery-swapping decisions. The proposed model is compared with several benchmark policies, including threshold-based strategies, alternative reinforcement learning algorithms, and a fixed strategy in which the ADV swaps its battery on every return to the distribution center. Simulation experiments based on real-world cases show that the proposed model achieves better results: it reduces delay time by approximately 17.55% relative to the average delay of the benchmark policies and decreases the number of battery swaps by approximately 49.06%. Furthermore, the model adapts well to the dynamically changing simulation environment.
