Abstract
Conventionally, urban logistics operates through various independent transport modes, for example, trucks, vans, and motorcycles. Such isolated operation is inefficient and environmentally unfriendly when facing large-scale demand. To enhance the efficiency and sustainability of the next-generation urban logistics system, a comprehensive framework is proposed to identify the critical operation scenarios, the main system components, and the key technologies for applications of smart multimodal urban logistics (SMUL). There are four innovative characteristics in the proposed SMUL system: (1) co-modality for passenger and freight transportation through public transit; (2) integration of drones or other unmanned aerial vehicles, autonomous delivery robots, automated delivery vehicles, and multimodal parcel lockers for last-mile delivery; (3) deployment and operation of robotic smart warehouses for storage; and (4) joint scheduling of various vehicles using large language model (LLM)-based agents. This paper focuses on two application cases, a last-mile home delivery service and a freight-integrated public transit service. A cyber-physical SMUL system is designed as the digital twin for monitoring the operations. A novel simulation platform is designed, using LLM agents, for predicting system dynamics and thus supporting the economic evaluation of an SMUL system. Moreover, stakeholder benefit allocation and operation resilience are discussed for designing a sustainable SMUL system. Finally, insights are provided for policy and practice. For example, logistics service providers are recommended to integrate aerial, road, and underground transport modes in consideration of local regulation, land use, and public trust in technology to achieve synergistic economic, social, and ecological efficiency.
Keywords
Introduction
The logistics system is a critical component of modern cities due to its essential role in service delivery. Conventional forms of urban logistics include commercial delivery, humanitarian relief, and healthcare service (Dukkanci et al., 2024). Among these, commercial delivery, driven by the rapid growth of e-commerce over recent decades, has become the dominant form of urban logistics. However, traditional delivery relying on large trucks contributes to traffic congestion, environmental degradation, and increased damage to road infrastructure. Advances in artificial intelligence (AI) and the emerging low-altitude economy (i.e., an economy built upon transportation by aerial vehicles operating in low-altitude airspace) have significantly transformed both the freight transportation and passenger mobility sectors. First, mixed passenger-freight transport has emerged, leveraging the existing urban transit networks for goods movement through approaches such as crowdshipping (Gajda et al., 2025), bus transfer (Masson et al., 2017), and metro freight (Hu et al., 2025). Second, multimodal urban logistics, integrating drones or other forms of unmanned aerial vehicles (UAVs), trucks, autonomous delivery vehicles, autonomous delivery robots, and public transit modes (e.g., buses and metros) (Huang et al., 2020; Montero-Vega & Estrada, 2025), shows strong potential to enhance logistics efficiency and sustainability. Third, there is a growing trend toward integrating traffic management with logistics operations to mitigate the adverse impacts of freight activities on urban transportation systems, particularly through the strategic utilization of low-altitude airspace for drone deliveries (Filiopoulou et al., 2025).
In recent years, studies and practices have investigated the transformations discussed above. Nevertheless, a systematic analysis and design of the next generation of urban logistics systems, in particular within the context of AI, robots, and the low-altitude economy, remains unexplored. To this end, this paper aims to construct a comprehensive framework of smart multimodal urban logistics (SMUL). SMUL holistically takes into account unmanned drones, automated vehicles (and/or robots), and public transit modes, the operation procedure of each individual transport mode, and the layout of related infrastructures (e.g., warehouses/depots, drone landing pads, and parcel lockers), with consideration of humanity design measures. For example, SMUL can address detailed tasks such as safety planning of the movement trajectories of delivery vehicles or drones, as well as larger-scale optimizations such as the layout of warehouses and the scheduling of forklifts for throughput. As such, the proposed SMUL treats both mode operation and facility design. Moreover, the paper discusses and analyzes the key technologies for SMUL, such as system design, operation optimization, performance assessment, and sustainable development strategy.
Application Scenarios Design for SMUL
There are various forms of urban logistics involving different transport modes and applied in different scenarios. In this paper, we mainly focus on two classic scenarios: last-mile delivery in residential communities and freight-passenger co-modality urban logistics. The former delivers parcels from a neighborhood logistics hub to its adjacent residential community or even to the end users. The latter provides integrated transport solutions for both passengers and freight. The delivery system involves fleets of vehicles and robots as well as warehouses and depots, so both transportation and facility operations are essential parts of SMUL, which will be discussed in detail in the next section.
Four system functional components should be designed:
- At the planning phase, logistics facilities layout is designed for economic viability, and information platform for collecting delivery demands is established to improve the collaboration between different logistics stakeholders.
- As for the phase of system operation, a scheduling model is developed for the complex collaboration of multi-modal vehicles, with the objective of minimizing delivery time duration. This is a challenging problem given the size of the problem and battery constraints.
- Then, the performance of SMUL (e.g., cost-benefit analysis) is to be evaluated dynamically after completing logistics delivery by system simulation and mathematical statistics analysis.
- Finally, benefit distribution among stakeholders (e.g. transport providers) and resilience analysis are implemented for achieving stable cooperation between delivery providers.
Therefore, we establish the framework of SMUL and structure of this paper in Figure 1.
The two application scenarios of SMUL will be discussed in detail below.
Figure 1. Framework of SMUL and Structure of This Paper
Last-Mile Delivery for Residential Community
Last-mile delivery in the form of parcel pickup and first-mile delivery in the form of parcel dispatch are considered the two modes of residential community logistics. According to previous statistics (Pourmohammadreza et al., 2025), last-mile delivery is the most expensive segment of the shipping process, accounting for 53% of the total shipping cost. According to the World Economic Forum’s 2020 report, the demand for urban last-mile delivery is expected to grow by 78% by 2030, leading to 36% more delivery vehicles in 100 cities around the world. As a result, emissions caused by delivery are anticipated to increase by more than 30%, and commuting costs could increase by 21%, with delays of up to 11 min due to road traffic congestion. Hence, it is urgent to adopt advanced modes and technologies to improve delivery efficiency and reduce environmental impact. In recent years, out-of-home delivery (or alternative self-pickup) has become available through parcel lockers and stores. Motivated by previous studies on this mode (Janinhoff et al., 2024; Kundu et al., 2025), an alternative self-pickup model is proposed in this paper, which incorporates delivery options by drones, automated parcel lockers (or robots), or fixed parcel lockers (see Figure 2 for an illustration).
Figure 2. Model of Smart Residential Community Delivery
In operation, the process is as follows. The logistics service providers first collect the daily delivery demands (amount, OD matrix, and time windows) of different residential communities and determine the specific transfer mode for each parcel. Parcels are generally collected at the closest origin hubs and then transferred by trucks or public transit (the so-called crowdshipping) to the corresponding destination hubs. Parcels with strict time windows or remote destinations can be served by UAVs directly from an origin depot. Other ordinary parcels are delivered from destination depots by automated parcel lockers or robots to decrease delivery cost. The differences between choosing lockers and robots will be discussed in the next section.
On the other hand, crowdshipping or crowdsourced delivery is a promising mode of self-pickup delivery, given the wide coverage and reachability of public transportation in cities. In a crowdshipping system, commuters can register as crowdshippers and interact with each other by exchanging parcels, information, and financial transactions at bus stations, metro stations, etc. For the potential economic, environmental, and societal benefits of crowdshipping, readers can refer to Mohri et al. (2023) for a systematic understanding of crowdshipping from sustainability perspectives.
Passenger and Freight Integrated Model for Urban Logistics
With booming e-commerce sales projected to surpass $7.4 trillion by 2025, the rapidly growing volume of last-mile deliveries will place immense pressure on urban transport infrastructure (Galkin et al., 2025). Various measures such as dynamic freight management, goods on transit, and micro hubs have been adopted to address the associated challenges, including traffic congestion, environmental impacts, regulatory constraints, and last-mile inefficiencies. Nevertheless, the outcome is still limited for highly complex scheduling strategies with uncertain deliveries. Therefore, an air-ground-underground co-modality system is proposed to utilize the capacity of the existing transportation network, as illustrated in Figure 3.
Figure 3. Model of Smart Multimodal Urban Logistics
Three different transport modes collaborate in this smart multimodal urban logistics system. Firstly, UAVs deliver high-value goods with strict time constraints directly from depots to customers, even during peak hours in the city center. Secondly, ground vehicles (e.g., buses and trucks) transship goods from origin depots to destination depots, and then automated delivery robots, parcel lockers, or bicycles provide last-mile services to customers. Thirdly, underground logistics, served by the capsule freight pipeline (CFP) system and the metro system, provides stable and large-scale transport service.
There are two types of logistics facilities in this scenario, namely, micro hubs and smart depots. The micro hubs in this model act as delivery depots that conveniently store a limited number of parcels transshipped by buses, metros, or delivery trucks. They also serve a limited number of destinations within a bounded spatial range and allow low-emission or zero-emission vehicles to be used for residential community delivery. As large-scale facilities, the smart depots are located close to public transport infrastructure, away from residential communities, and are used to store and transship large numbers of parcels between buses, metros, and delivery trucks. Smart depots (or smart warehouses) use automated guided vehicles (AGVs) for storing, shipping, and picking items in the warehouse, and IoT technology is used for tracking the flow of AGVs and goods with electronic tags. A reference architecture for smart warehouses in Industry 4.0 has been proposed, and the common and variant features of smart warehouses have been discussed (Van Geest et al., 2021; Zhang et al., 2021).
The innovation of the SMUL model is that the infrastructure and capacity of both public transport and logistics delivery are optimized holistically. Traditionally, the infrastructures of traffic terminals, metro stations, and bus stops are designed for passenger transport, separately from freight logistics, as in most cities. This is not unreasonable, given the nature of the transportation and safety concerns. Nevertheless, a trend of change is emerging. For example, recent studies and practices have demonstrated drones transferring packages from transit vehicles to customers (De Maio et al., 2024) and autonomous delivery robots riding on or fed by public vehicles (Ghiani et al., 2025; Osorio & Ouyang, 2025). There are also reported applications of bus-based passenger and freight co-modality projects in Europe, China, and Japan (Lin & Zhang, 2025) and freight-passenger integrated metro systems in European cities (Hu et al., 2025). Under the proposed SMUL model, all the emerging and advanced transport vehicles for both freight and passengers are included to design the next generation of last-mile urban delivery systems.
Transport Modes and Facilities
According to the model presented in Figure 3, there are multimodal vehicles involved in the operation and management of urban logistics system. As the functions and costs are quite different among these vehicles, it is necessary to introduce main modules of the system, including three types of transport vehicles and smart hub in this section.
Low-Altitude Drones and UAVs
Drones and UAVs are widely used in low-altitude logistics for three-dimensional cargo storage, transport, and delivery. Compared to traditional delivery methods, such as trucks, vans, and bicycles, drone delivery can improve consumer accessibility in rural or remote villages without being constrained by road traffic congestion. It is predicted that more than one million drones will be carrying out retail deliveries in 2026 (Filiopoulou et al., 2025).
Although many drone control and path planning methods have been studied in the past years (Cetinsaya et al., 2024), the operation of drone fleets for last-mile delivery is still challenging. According to Fu et al. (2025), the drone delivery system typically comprises the following five subsystems: (1) The energy and power module manages onboard electrical power and generates thrust throughout the delivery mission. (2) The perception system detects surrounding obstacles and environmental conditions using sensors to enable collision avoidance. (3) The communication subsystem exchanges the drone’s state parameters with the ground control center and receives operational commands for flight control. (4) The drone controller regulates altitude, position, and velocity by tracking a predefined (or reference) trajectory to ensure safe cargo transportation. (5) The logistics dispatching subsystem assigns delivery orders to drones in a manner that minimizes operational cost and travel time, including determining departure times, flight paths, target customers, or depots.
Despite the aforementioned advantages, flight safety and economic efficiency under limited battery and cargo capacity remain major challenges for the large-scale application of drone delivery. It is recommended that drone delivery be prioritized for medical supplies and emergency materials in rural or disaster-stricken areas. In urban areas with intense demand, the integration of multiple modes might be cost-effective compared to traditional delivery trucks (Garg et al., 2023).
Autonomous Delivery Vehicle and Robot
Autonomous delivery vehicles (ADVs) and autonomous delivery robots (ADRs) are also key representatives of new vehicle concepts in Industry 4.0. The adoption of ADVs in last-mile services is strongly connected with the development of autonomous vehicles (AVs), which can be divided into six levels of automation from no automation to full automation (namely, L0: no driving automation, L1: driver assistance, L2: partial driving automation, L3: conditional driving automation, L4: high driving automation, and L5: full driving automation). Assuming AVs of L4 (i.e., high driving automation) are used in the near future, the potential advantages are clear. ADVs can operate 24 hr a day, 7 days a week with optimized routes to save service time as well as decrease human driver and fuel costs. Their advanced navigation and operation systems enable ADVs to serve long-distance deliveries with larger parcels along urban streets or even highways. However, there are some risks related to the adoption of ADVs, such as technical failures or glitches and cyber-attacks. Mobile parcel lockers (MPLs), with smaller capacity than trucks, can also be taken as a special type of ADV. ADRs, by contrast, follow high-frequency scheduled routes from the micro hub to customers, as they have relatively smaller capacity and lower speed within residential communities. There are two types of ADRs, sidewalk ADRs (SADRs) and road ADRs (RADRs), based on their design and operational capabilities (Garus et al., 2024). Compared to ADVs, SADRs are more flexible and easier to deploy in dense urban areas with narrow streets, and they operate mainly on sidewalks at low speed. RADRs are usually larger and faster than SADRs, and they are suitable for suburban areas or less congested city streets. Note that ADR-based services may depend on micro hubs or robot depots close to the targeted residential communities as motherships.
A comparison of different transport modes (i.e., ADRs, bicycles, and light commercial vehicles) in 14 varied European cities concluded that ADRs are efficient in pedestrian-focused, traffic-limited areas, and suggested that decision makers choosing a delivery mode consider factors such as delivery distance, duration, cost, and potential environmental impacts (Garus et al., 2024). In addition, a previous study shows that trust in government and technology companies, satisfaction with public transportation, and cost-benefit calculations regarding the supporting technologies have a positive effect on trust in autonomous delivery services using ADVs or ADRs (Wang & Huang, 2025).
Underground Freight Transport
Underground freight transport is an intelligent freight technology for delivering solid commodities via dedicated underground facilities. It is defined as a network of underground depots and interconnected tunnels or pipelines, supporting 24 hr, all-weather goods movement and automated logistics operations (Liu et al., 2025). The cooperation of aboveground and underground freight can contribute to higher efficiency of the whole logistics operation. At the stage of first-mile transport, parcels are collected by ADVs or ADRs and transferred into front distribution depots close to or inside underground stations. Then, a portion of the freight is transported by metro, tunnel, and capsule pipeline, and the rest is transshipped using surface vehicles. As shown in Figure 3, the underground freight is stored in large-scale smart depots for further transshipment or can be distributed to micro hubs close to residential communities. In this way, ADVs as well as ADRs take micro hubs as their motherships for executing last-mile delivery.
Although many underground freight projects have been tested all over the world in the past few decades, to the best of our knowledge very few of them have continued, owing to high costs, long construction periods, and unpredictable investment risks. However, this barrier is beginning to break with innovative construction technologies and intelligent transportation. Recently, underground freight projects have obtained national policy support and development plans in countries such as Switzerland, China, and Singapore (Li & Yuen, 2025). Both self-pickup and home-entry modes of last-mile delivery integrating underground logistics have been designed, and the specific workflows for five modes have also been investigated (Wei et al., 2024). That case study illustrated that the distribution time using underground logistics is 3.4–5.2 hr, compared with the current 10–12 hr. Moreover, the proposed model can reduce carbon emissions of surface transportation for last-mile delivery by 96%.
Comparison of Different Transport Modes in Urban Logistics
Smart Warehouse/Depot
The warehouse, also denoted as a distribution or fulfillment center, is defined as the intermediate storage of goods between stages in the supply chain and performs basic functions such as receiving, storage, order picking, and shipping (Gu et al., 2007). The design of a warehouse involves five key decisions: overall structure (or conceptual design), sizing and dimensioning of the warehouse, detailed layout of each department, equipment selection, and operational strategies (task assignment, routing problem). According to the relationship between SKUs (stock keeping units) and pickers, warehouses can be classified into two modes: picker-to-parts and parts-to-picker (Boysen & De Koster, 2025). More than 80% of warehouses in Western Europe still follow the traditional picker-to-parts setup, where SKUs are stored in low-level racks and human order pickers collect requested pieces by traveling from shelf to shelf. To avoid the unproductive picker walking of the traditional setup, online retailers such as Amazon or https://JD.com apply the parts-to-picker paradigm, in which SKUs are delivered to stationary pickers (operating at dedicated picking workstations) (Li et al., 2024).
In next-generation warehouse systems, mobile robots such as automated guided vehicles (AGVs) have been widely adopted in both warehouse modes. Both Amazon and https://JD.com have benefited from the adoption of mobile robots to reduce operation costs in the past years. Layout design for AGVs, AGV control policy (collision and deadlock avoidance), dispatching, and routing strategies are important factors of AGVs’ operations that influence the performance of warehouse operation (Qi et al., 2018). Among them, collision avoidance and deadlock resolution are the most challenging in large-scale warehouses.
As the foundational units of delivery networks, micro hubs located close to residential communities are defined as “logistics facilities within urban area boundaries where goods are consolidated, serving a limited number of destinations within a bounded spatial range and enabling a modal shift to low-emission or zero-emission vehicles or soft transportation modes (e.g., walking) for last-mile deliveries.”
Key Technologies of SMUL System Design
Last-mile delivery logistics is a complex system that combines different vehicles and stakeholders for smooth delivery. Different scenarios are expected to emerge, stimulated by current low-altitude economy policy. So far, there are very few uniform frameworks or standards for designing and operating the proposed smart logistics systems. In this section, four key technologies of SMUL system design are analyzed.
Facility Layout and Cyber System Design
As SMUL is an automated and integrated logistics system, its system design is essential for decreasing fixed cost and smoothing coordination among stakeholders. Specifically, the location and layout of warehouses, vehicle depots, and parcel lockers should be considered using classic facility design methods from the field of industrial engineering. As UAVs, autonomous delivery vehicles, and robots are involved in both residential community and urban logistics, an advanced information platform (i.e., digital twin) can be designed and adopted for monitoring, tracking, and managing freight operations in real time.
Facility Location and Layout Design
Logistics-related data should be collected and analyzed for system design, including delivery demands in targeted communities, urban road network geography data, public transit layout and schedules, passenger flow patterns, potential freight capacity, and available land for logistics.
The location and specific layout of smart depots and micro hubs are to be decided considering available land and its cost. Conventional layouts for AGV-based warehouses include the unidirectional layout, the bidirectional layout, and the multi-lane hybrid layout. Note that traffic conflicts might arise as the number of AGVs increases, so a traffic control policy should be designed to avoid deadlock in the warehouse. Well-designed workflows for handling delivery orders at hubs and depots are also necessary for managing warehouses with high efficiency.
Integrated Design of Freight and Passenger Transport
The number of drones, vehicles, and robots is roughly estimated according to those delivery demands at high peaks, which might influence dispatch of delivery order, customer satisfaction, and operation cost.
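As a rough illustration of this peak-demand estimate, the sketch below sizes a fleet from an hourly peak. The function name, its parameters, and the utilization factor are assumptions introduced here for illustration; they are not taken from the paper, and a real study would calibrate them from field data.

```python
import math

def estimate_fleet_size(peak_parcels_per_hour, parcels_per_trip,
                        trip_minutes, utilization=0.8):
    """Rough number of vehicles needed to cover peak-hour demand.

    All parameters are hypothetical planning inputs:
    - parcels_per_trip: parcels one vehicle carries per round trip
    - trip_minutes: round-trip duration including loading and charging
    - utilization: share of each hour a vehicle is actually in service
    """
    if trip_minutes <= 0 or parcels_per_trip <= 0:
        raise ValueError("trip_minutes and parcels_per_trip must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    trips_per_vehicle_hour = 60.0 / trip_minutes * utilization
    parcels_per_vehicle_hour = trips_per_vehicle_hour * parcels_per_trip
    # Round up: a fractional vehicle still means one more vehicle.
    return math.ceil(peak_parcels_per_hour / parcels_per_vehicle_hour)

# Example: delivery robots carrying 10 parcels on a 30-minute loop,
# available half of the time, against a peak of 120 parcels per hour.
robots_needed = estimate_fleet_size(120, 10, 30, utilization=0.5)
```

Sizing for the peak keeps customer waiting low but leaves vehicles idle off-peak, which is the cost and satisfaction trade-off the paragraph above points to.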
Large cities all over the world usually have well-developed public transit facilities, such as bus lines with high accessibility and metro lines with high frequency and speed. Integrated design of freight and passenger transport is essential for achieving both economic and social-ecological efficiency. For instance, specific co-modality bus/metro lines and their timetables should be designed, and the workflow from origin to destination depots through bus/metro stations is also included in this integrated transport design.
Digital Twin Platform Design
The digital twin platform serves as the control center of the logistics system to receive delivery orders, control unmanned vehicles, and exchange information with hubs and depots. A logistics digital twin is a virtual representation of physical objects and transport processes, providing visibility into the operation of their physical counterparts and facilitating decision making through simulations of various scenarios (Tao et al., 2018). Figure 4 illustrates a three-level framework of the logistics digital twin, including a physical layer, a simulation layer, and a cyber layer.
Figure 4. Digital Twin Framework of Smart Multimodal Urban Logistics
The physical layer deploys IoT sensors (GPS, LiDAR, RFID) across drones, trucks, robots, and infrastructures to capture real-time operational data. The simulation engine layer is a Unity-based virtual environment enabling high-fidelity modeling of multimodal logistics processes, including path planning and energy consumption dynamics. The cyber system layer acts as the system’s brain, utilizing cloud computing to analyze the simulated outcomes and generate optimized decisions based on specific algorithms (see next subsection). These three layers form a closed-loop digital twin framework that bridges physical operations with virtual emulation for an end-to-end logistics system. Operationally, the process begins with the physical layer wirelessly transmitting real-time data to the simulation engine to map physical states into the virtual world. Based on these inputs, the simulation models dynamic scenarios such as path planning and energy consumption. Subsequently, the simulation data are transmitted to the cyber system layer for analysis. The cyber system layer then generates control strategies through optimization, which are passed back to the simulation engine. Finally, the cycle concludes with the application of these optimized control strategies to the physical layer to execute operations.
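The closed loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the platform's implementation: `VehicleState`, the linear battery model (`pct_per_km`), the reserve threshold, and the command names are all hypothetical, and the step in which strategies are passed back through the simulation engine before execution is collapsed into a single return value.

```python
from dataclasses import dataclass

@dataclass
class VehicleState:
    vehicle_id: str
    battery_pct: float   # remaining battery, 0..100
    distance_km: float   # remaining distance to destination

def sense(fleet):
    """Physical layer: return a snapshot of the current sensor readings."""
    return [VehicleState(v.vehicle_id, v.battery_pct, v.distance_km) for v in fleet]

def simulate(snapshot, pct_per_km=2.0):
    """Simulation layer: predict battery on arrival (assumed linear drain)."""
    return {s.vehicle_id: s.battery_pct - pct_per_km * s.distance_km
            for s in snapshot}

def decide(predicted, reserve_pct=10.0):
    """Cyber layer: pick a control command from the simulated outcome."""
    return {vid: ("continue" if battery >= reserve_pct else "divert_to_charger")
            for vid, battery in predicted.items()}

def run_cycle(fleet):
    """One pass of the loop: physical -> simulation -> cyber -> commands."""
    snapshot = sense(fleet)
    predicted = simulate(snapshot)
    return decide(predicted)

fleet = [VehicleState("uav-1", 50.0, 10.0), VehicleState("uav-2", 20.0, 10.0)]
commands = run_cycle(fleet)
```

In a deployed system each function would sit behind a network boundary (sensor gateway, simulation engine, cloud service); keeping them as pure functions here only makes the data flow between the three layers easy to follow.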
Operation Optimization
Urban logistics operations cover the whole procedure of parcel pickup, storing, stacking, sorting, transport, and delivery by AGVs or robots through an integrated passenger and goods transportation network. The core technologies of SMUL operation mainly concentrate on the schedule optimization of different types of transport vehicles.
AGV Dispatch Optimization Problem
As mentioned in the Smart Warehouse/Depot section, the number, dispatching, and routing strategies of AGVs are important factors of warehouse operation. The purpose of an AGV dispatching strategy is to assign each order to an AGV that follows a specific path within the given time period. AGV dispatching problems are commonly formulated as mixed integer programming models, queueing network models, and travel time models. Mixed integer programming is a popular method for scheduling AGVs or mobile robots in intelligent warehouses.
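To make the dispatching decision concrete, the sketch below solves a toy one-order-per-AGV assignment by enumeration. It is an illustrative stand-in for the mixed integer programming models mentioned above, not a formulation from the paper: the grid coordinates, the Manhattan travel cost, and the function names are assumptions, and enumeration is only practical for a handful of AGVs.

```python
from itertools import permutations

def manhattan(a, b):
    """Travel distance on a grid warehouse floor (assumed cost measure)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def assign_orders(agv_positions, order_positions):
    """Assign exactly one order to each AGV, minimizing total travel distance.

    Returns (best_cost, assignment), where assignment[i] is the index of the
    order given to AGV i. Exhaustive search over all n! assignments.
    """
    n = len(agv_positions)
    if n != len(order_positions):
        raise ValueError("this sketch expects as many orders as AGVs")
    best_cost, best = None, None
    for perm in permutations(range(n)):
        cost = sum(manhattan(agv_positions[i], order_positions[j])
                   for i, j in enumerate(perm))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, perm
    return best_cost, list(best)

agvs = [(0, 0), (5, 5)]
orders = [(5, 4), (1, 0)]
cost, assignment = assign_orders(agvs, orders)
```

At realistic fleet sizes the same objective would be handed to a MIP or assignment solver; the enumeration is only meant to show what the decision variables and the objective look like.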
Considering the conflicts between AGVs in large-scale warehouses, it is essential to optimize the paths of AGVs through a traffic control policy (e.g., setting waiting times at crossroads) or a conflict-free coordination method (e.g., controlling the velocities of AGVs) (Xie et al., 2024). In practice, dispatching and routing are usually solved in two separate stages. Nevertheless, task assignment and routing can also be optimized jointly, assigning tasks to multiple mobile robots (such as AGVs, rail-guided vehicles, and gantry lifting devices) together with planning the corresponding conflict-free paths (Qiu et al., 2025).
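The waiting-time policy at crossroads can be illustrated with a small sketch. It assumes, for illustration only, that each path is a list of grid cells indexed by discrete time step, that an AGV stays parked at its last cell, and that the second AGV always has lower priority. Only same-cell conflicts are checked; head-on swaps between adjacent cells are not.

```python
def first_conflict(path_a, path_b):
    """Return the first time step at which both AGVs occupy the same cell."""
    horizon = max(len(path_a), len(path_b))
    for t in range(horizon):
        a = path_a[min(t, len(path_a) - 1)]  # parked at goal after arrival
        b = path_b[min(t, len(path_b) - 1)]
        if a == b:
            return t
    return None

def resolve_by_waiting(path_a, path_b, max_waits=50):
    """Delay the lower-priority AGV (b) until it no longer meets AGV a."""
    path_b = list(path_b)
    for _ in range(max_waits):
        t = first_conflict(path_a, path_b)
        if t is None:
            return path_b
        if t == 0:
            raise ValueError("both AGVs start in the same cell")
        # Hold AGV b in its previous cell for one extra time step.
        path_b.insert(t, path_b[t - 1])
    raise RuntimeError("waiting alone cannot resolve this conflict")

# Two AGVs crossing at cell (1, 1) at time step 1.
path_a = [(0, 1), (1, 1), (2, 1)]
path_b = [(1, 0), (1, 1), (1, 2)]
delayed_b = resolve_by_waiting(path_a, path_b)
```

Waiting cannot help when the higher-priority AGV parks on the other's route, which is why the sketch gives up after `max_waits`; cases like that require rerouting or the integrated assignment-and-routing approach mentioned above.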
Classical Vehicle Routing Problem
The vehicle routing problem (VRP) is a classical problem and a primary focus of delivery scheduling in the e-commerce context. It aims to determine optimal routes for trucks, drones, vans, robots, or mobile parcel lockers to serve customers with the objective of lower operation cost and higher delivery efficiency. The constraints of a VRP model may include the capacity, energy, and velocity limits of vehicles, the service sequence, and the delivery time windows of customers. The VRP has been studied extensively, and readers are referred to the existing literature for details.
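As a small illustration of routing under a capacity constraint, the sketch below builds routes with a greedy nearest-neighbor rule. This is a simple construction heuristic chosen for brevity, not an optimal method and not one proposed in the paper; time windows and energy limits are omitted, and the function name and inputs are hypothetical.

```python
import math

def nearest_neighbor_routes(depot, customers, demands, capacity):
    """Greedy capacitated routing.

    Each vehicle starts at the depot and repeatedly visits the closest
    unserved customer whose demand still fits; a new vehicle is opened when
    nothing else fits. Returns a list of routes (lists of customer indices).
    """
    unserved = set(range(len(customers)))
    routes = []
    while unserved:
        route, load, position = [], 0, depot
        while True:
            feasible = [i for i in unserved if load + demands[i] <= capacity]
            if not feasible:
                break
            # Tie-break on index so the result is deterministic.
            nxt = min(feasible,
                      key=lambda i: (math.dist(position, customers[i]), i))
            route.append(nxt)
            load += demands[nxt]
            position = customers[nxt]
            unserved.remove(nxt)
        if not route:
            raise ValueError("a customer demand exceeds the vehicle capacity")
        routes.append(route)
    return routes

routes = nearest_neighbor_routes(
    depot=(0, 0),
    customers=[(1, 0), (2, 0), (0, 5)],
    demands=[1, 1, 1],
    capacity=2,
)
```

Greedy construction like this is typically used only to seed an improvement method or an exact solver, since it can leave distant customers to expensive single-stop routes.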
Joint Distribution Using Multiple Vehicles
There are various forms of joint distribution, such as drone-truck, drone-bus, bus-robot, and truck-robot, as there is no single delivery mode that can outperform other modes in all scenarios. The optimization objectives of joint distribution include delivery efficiency (e.g., reducing delivery time or operation cost, and improving drop-off success rate), energy optimization (e.g., payload management and energy saving), and social-ecological factors (e.g., emission and safety). Real-world constraints consist of physical limitations (e.g., payload volume and battery limit of drone), service demands (e.g., service time window of customer and customer demand), operation limitations (vehicle arriving before drone for landing and replenishment, vehicles returning to warehouse), regulatory (e.g., legal frameworks and safety protocols related to airspace usage and no-fly zones), and traffic and infrastructure constraints (e.g., urban layout, traffic congestion, and depot location).
The challenge is that the interplay between different delivery modes is diverse and sometimes too complex to be clearly captured by mathematical formulations. For instance, drone-truck joint delivery includes single-to-single, single-to-multiple, and multiple-to-multiple models. Most of these joint distribution problems are NP-hard. To solve large-scale joint distribution problems more efficiently, machine learning-based methods can be adopted to overcome the shortcomings of heuristic and exact algorithms (Arishi & Ahuja, 2025; Zhou et al., 2025).
Bus Scheduling Considering Freight Transport
Collaborative transportation of freight and passengers (known as urban co-modality) has recently been tested and applied on metro and bus lines, aiming to utilize the surplus capacity of public transit. The principles for executing urban co-modality include ensuring transit safety with freight on board, reducing the negative impact on existing transit timetables, and limiting the effect on passengers’ travel time. To this end, freight trucks and selected transit services are synchronized and coordinated by predesigning specific workflows and applying optimized scheduling strategies. The problem can be formulated as an operations research (OR) model with the objective of minimizing the total operating cost by determining three decisions: truck size, truck route and schedule, and freight flow allocation among transit stops. In this OR model, the transit timetable, passenger flow, and transit travel time are usually given as known constants.
As passenger-related cost is unchanging during urban co-modality, the total operating cost of service provider mainly consists of fixed costs for trucks, truck traveling costs, co-modality charge, freight transfer costs at co-modality stations, and reconfiguration costs for co-modality stations (Osorio & Ouyang, 2025).
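The cost composition listed above can be written as a simple sum of five terms. The sketch below evaluates that sum for one candidate plan. The data structures, field names, and the assumption that every term is linear in its driver (trucks, kilometers, parcels, handlings, stations) are illustrative simplifications, not the formulation of the cited work.

```python
from dataclasses import dataclass


@dataclass
class CoModalityPlan:
    n_trucks: int
    truck_km: float              # total distance driven by all trucks
    parcels_on_transit: int      # parcels carried by bus or metro
    transfers: int               # parcel handlings at co-modality stations
    stations_reconfigured: int


@dataclass
class UnitCosts:
    truck_fixed: float               # per truck
    truck_per_km: float
    co_modality_charge: float        # per parcel, paid to the transit operator
    transfer: float                  # per handling
    station_reconfiguration: float   # per station


def total_operating_cost(plan: CoModalityPlan, c: UnitCosts) -> dict:
    """Return each cost component and their sum for one plan."""
    parts = {
        "truck_fixed": plan.n_trucks * c.truck_fixed,
        "truck_travel": plan.truck_km * c.truck_per_km,
        "co_modality_charge": plan.parcels_on_transit * c.co_modality_charge,
        "transfer": plan.transfers * c.transfer,
        "reconfiguration": plan.stations_reconfigured * c.station_reconfiguration,
    }
    parts["total"] = sum(parts.values())
    return parts
```

In an OR model this expression would be the objective, with truck size, routes, schedules, and freight flow allocation as decision variables rather than fixed inputs.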
Performance Assessment
Simulation Platform Using LLM-Driven Unity
One of the challenges in designing and operating an SMUL system is to predict what happens under unknown demand scenarios and how supply improvements affect system performance. To evaluate warehouse layouts and operation strategies for first-mile pickup, road transport, and last-mile delivery, a simulation platform is built on Unity and powered by an LLM. The platform uses real-world data for rapid scenario generation, dynamic scheduling, and operational control. Moreover, it provides performance metrics to support method evaluation and economic analysis.
The platform focuses on two complementary and critical low-altitude scenarios: warehouse operations and urban delivery. Warehouse operations determine the system’s throughput ceiling and operational stability, as the efficiency of shelf-aisle-AGV interactions directly influences order turnover and inventory dwell time. Concurrently, urban delivery governs the timeliness and energy costs of the system, since drones and trucks are constrained by terrain, congestion, and delivery time windows. These two domains are intricately coupled through the spatiotemporal constraints inherent in orders, inventory, and transportation. This integration forms a complete, end-to-end multimodal system that spans the entire process from initial pickup to final delivery.
This LLM-driven platform aims to lower the barrier for large-scale modeling and improve simulation efficiency. It combines three main parts, an LLM, the Model Context Protocol (MCP), and Unity, as shown in the framework in Figure 5. The LLM understands and interprets natural language. MCP provides the necessary tools and connects the other parts. Unity is a cross-platform tool that enables the creation of 3D simulation, real-time animation, and architectural visualization. These components are connected through standard APIs and data exchange rules (JSON Schema), which allows them to collaborate smoothly for SMUL simulation. Specific functions are as follows:
(1) The LLM Agent is a controller designed for task comprehension and tool orchestration. It leverages a range of large language models to perform complex understanding and analysis. This agent comprises two specialized sub-components: a Scene Agent and a Traffic Agent. The Scene Agent generates the instruction plans used for static scene and resource configuration. In contrast, the Traffic Agent manages the scheduling strategies and generates the necessary parameters for multimodal simulations.
(2) The MCP Tools and Server is a Python-based component that provides structured, remotely callable interfaces to connect the LLM Agent with the Unity Client. This server hosts a suite of specialized tools for specific functions. For instance, MapData_reading_tool retrieves and processes geographic basemaps. Path_Algorithm_tool generates spline curves that define trajectories for ground vehicles and low-altitude aircraft, along with their operational constraints. Finally, Vehicle_dispatch_tool translates high-level inputs, such as orders or origin–destination pairs with time windows, into detailed job schedules and executable simulation scripts.
(3) The Unity Client constitutes the platform’s execution layer, responsible for both static modeling and dynamic simulation. A key function of the Client is to render the multimodal scheduling solutions in a three-dimensional environment. This visualization is essential for the comprehensive evaluation and iterative optimization of proposed strategies.
Figure 5. Simulation Framework of SMUL Based on LLM-Agent and Unity3D
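The interface of Vehicle_dispatch_tool is not specified in this paper. Purely as an illustration of what "translating orders with time windows into a job schedule" could look like, the sketch below uses a hypothetical function `vehicle_dispatch` with a greedy rule (earliest deadline first, on whichever vehicle becomes free first) and returns a JSON instruction bundle. The field names, the `dispatch/v0` schema label, and the greedy rule are all assumptions, and the function uses only the Python standard library rather than an actual MCP server.

```python
import json


def vehicle_dispatch(orders, vehicles):
    """Greedy sketch: serve orders by earliest deadline, each on the vehicle
    that becomes free first. Times are in minutes from the start of the day.
    Returns a JSON string that a simulation client could parse."""
    free_at = {v["id"]: 0.0 for v in vehicles}
    jobs = []
    for order in sorted(orders, key=lambda o: o["window"][1]):
        vehicle_id = min(free_at, key=free_at.get)
        start = max(free_at[vehicle_id], order["window"][0])
        end = start + order["travel_min"]
        free_at[vehicle_id] = end
        jobs.append({
            "order_id": order["id"],
            "vehicle_id": vehicle_id,
            "origin": order["origin"],
            "destination": order["destination"],
            "start_min": start,
            "end_min": end,
            "on_time": end <= order["window"][1],
        })
    return json.dumps({"schema": "dispatch/v0", "jobs": jobs}, indent=2)


# Hypothetical inputs: three orders and a two-vehicle fleet.
orders = [
    {"id": "o1", "origin": "depot", "destination": "A", "window": [0, 30], "travel_min": 20},
    {"id": "o2", "origin": "depot", "destination": "B", "window": [10, 40], "travel_min": 15},
    {"id": "o3", "origin": "depot", "destination": "C", "window": [0, 25], "travel_min": 10},
]
vehicles = [{"id": "drone-1"}, {"id": "truck-1"}]
bundle = vehicle_dispatch(orders, vehicles)
```

A real tool would also need return trips, vehicle-specific speeds and capacities, and trajectories from the path tool. The point here is only the shape of the input and of the JSON output.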

The specific workflow is structured into the following four steps.
Step 1. Semantic Parsing and Task Planning.
The workflow starts from the user’s natural language input. The LLM agent invokes the internal Scene Agent and Traffic Agent to interpret the request and extract key elements. Following predefined system rules, the complex request is decomposed into a set of interdependent sub-tasks. An explicit execution sequence is then generated.
Step 2. Tool Discovery and Initial Solution Generation.
Based on the task sequence, the MCP Server dynamically retrieves and loads the requisite Python tool scripts. Real-world data is then fed into these scripts to generate an initial solution. The solution is packaged into a standardized JSON instruction bundle and delivered to the Unity Client via MCP Bridges.
Step 3. Scene Instantiation and Simulation Execution.
The Unity Client receives the JSON bundle and parses it into simulation objects and parameter configurations. It first loads static assets to construct the scene and then drives the dynamic entities using the data in the solution. This process visualizes the multimodal logistics operations within a high-fidelity 3D environment.
Step 4. Result Evaluation and Algorithm Iteration.
After each simulation run, a system report containing Key Performance Indicators (KPIs) is automatically generated. If the results fail to meet preset thresholds (e.g., low efficiency or high cost), feedback is routed to the MCP Server for targeted algorithmic parameter tuning. This triggers an iterative cycle until performance requirements are met. A complete iteration log is finally returned to the LLM agent in JSON, summarizing iteration counts, per-round parameters, and solutions. The LLM agent analyzes the log and presents the user with optimized solutions alongside interpretable decision recommendations.
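The evaluate-and-tune cycle of Step 4 can be sketched as a small loop. In the code below, `run_simulation` is a toy stand-in for a Unity run whose formulas are invented only to make the loop executable. The KPI names, the thresholds, and the tuning rule (add one drone per round) are likewise assumptions rather than the platform's actual logic.

```python
import json


def run_simulation(params):
    """Toy stand-in for one simulation run: returns KPIs for a fleet size.
    More drones shorten delivery time but raise the cost per parcel."""
    drones = params["n_drones"]
    return {
        "avg_delivery_min": 60.0 / (1 + 0.5 * drones),
        "cost_per_parcel": 2.0 + 0.3 * drones,
    }


def meets_thresholds(kpis, thresholds):
    """Every KPI is treated as 'lower is better' with an upper limit."""
    return all(kpis[name] <= limit for name, limit in thresholds.items())


def iterate(params, thresholds, max_rounds=10):
    """Run, evaluate, and tune until the thresholds hold or rounds run out.
    Returns the iteration log as a JSON string for the LLM agent to analyze."""
    log = []
    for round_no in range(1, max_rounds + 1):
        kpis = run_simulation(params)
        ok = meets_thresholds(kpis, thresholds)
        log.append({"round": round_no, "params": dict(params), "kpis": kpis, "ok": ok})
        if ok:
            break
        # Simple tuning rule (assumption): add one drone and try again.
        params = {**params, "n_drones": params["n_drones"] + 1}
    return json.dumps({"rounds": len(log), "converged": log[-1]["ok"], "log": log})
```

With these toy formulas, starting from one drone and requiring an average delivery time of at most 20 minutes and a cost of at most 4 per parcel, the loop stops in the fourth round. The cap on rounds prevents an endless cycle when the thresholds cannot be met together.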
Cost-Benefit Assessment
Cost-benefit evaluation is a core dimension of SMUL, which quantifies and compares all costs incurred by SMUL operations against corresponding benefits under a unified time and monetary scale. By identifying key sensitivity factors, cost-benefit assessment provides clear optimization directions for scheduling and route planning. It also offers reliable grounds for scheme selection and investment decisions. System costs include infrastructure, equipment procurement, energy, maintenance, labor, and algorithm development. System benefits are collectively derived from revenues generated by collaborative platforms, drones, autonomous vehicles, and public transit.
Based on the simulation platform established in the Performance Assessment section, real-world data is input to conduct simulation experiments. Operational costs for vehicles and warehouses are estimated from key performance metrics (e.g., task volume, runtime, and equipment utilization). These outputs are then mapped onto a cost-benefit model to evaluate the system’s total transportation costs and potential benefits. Subsequently, the distribution ratio among different vehicles is adjusted to observe how costs and benefits change, with the aim of identifying the optimal trade-off point. Finally, sensitivity analyses are conducted on critical factors such as task volume and energy prices to determine decision thresholds and robustness intervals.
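As a rough illustration of the last two steps, the sketch below varies the share of parcels assigned to drones and sweeps the energy price to see where the preferred share changes. Every number and the linear cost model are assumptions made for the example; in the workflow described above, unit costs would come from simulation outputs instead.

```python
# All numbers below are illustrative assumptions, not values from the paper.
NON_ENERGY_COST = {"drone": 1.0, "truck": 2.0}   # currency units per parcel
ENERGY_USE_KWH = {"drone": 0.4, "truck": 0.15}   # kWh per parcel
REVENUE_PER_PARCEL = 3.0


def unit_cost(mode, energy_price):
    """Cost per parcel = non-energy cost + energy use * energy price."""
    return NON_ENERGY_COST[mode] + ENERGY_USE_KWH[mode] * energy_price


def net_benefit(drone_share, task_volume, energy_price):
    """Revenue minus delivery cost when drone_share of parcels go by drone."""
    drone_parcels = drone_share * task_volume
    truck_parcels = task_volume - drone_parcels
    cost = (drone_parcels * unit_cost("drone", energy_price)
            + truck_parcels * unit_cost("truck", energy_price))
    return REVENUE_PER_PARCEL * task_volume - cost


def best_drone_share(task_volume, energy_price, max_share=0.5, steps=10):
    """Grid search over drone shares between 0 and max_share (a fleet cap)."""
    shares = [max_share * i / steps for i in range(steps + 1)]
    return max(shares, key=lambda s: net_benefit(s, task_volume, energy_price))


def sweep_energy_price(task_volume, prices):
    """Sensitivity analysis: best drone share for each energy price."""
    return [(p, best_drone_share(task_volume, p)) for p in prices]
```

With these toy numbers the two modes cost the same per parcel at an energy price of 4, so the preferred share moves from the cap to zero once the price passes that threshold. Because the model is linear, the optimum always sits at a bound; an interior trade-off point would require nonlinear effects such as capacity limits or congestion, which the simulation is meant to capture.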
Sustainable Development
As is well known, the environmental, economic, and social dimensions are the main aspects of logistics sustainability. In this paper, we focus only on the most frequently addressed economic dimension by analyzing economic feasibility and its resilience under business practices. In our proposed SMUL system, buses can be used to transport goods from logistics depots to bus stations, and then drones or trucks are assigned to provide out-of-home deliveries by picking up parcels from bus stations. Therefore, benefit distribution among stakeholders (e.g., transport providers) is needed to achieve stable cooperation between delivery providers.
Benefit Distribution
Within the SMUL framework, the collaborative operation of multiple transport vehicles generates greater benefits while inevitably introducing the issue of benefit distribution (or profit balancing). Benefit distribution refers to the process of allocating the profits generated by cooperation between different types of transport vehicles and smart hubs among the participating parties according to predetermined rules. As delivery tasks and costs vary across vehicles, participant satisfaction with the distribution plan should be considered during benefit allocation. Within a cooperative alliance, if a transport entity receives benefits below its costs, its satisfaction with the collaboration declines. This undermines the existing cooperation and may even lead to the entity’s withdrawal from the alliance.
In the benefit distribution of SMUL, incorporating cooperative game theory enables the simultaneous consideration of fairness and efficiency in allocation. The cooperative game framework (e.g., the core, the nucleolus, and the Shapley value) provides verifiable allocation criteria for collaborative decision-making among stakeholders. The core emphasizes the stability of sustained alliance cooperation. The nucleolus minimizes coalition dissatisfaction and lowers defection risks, and Shapley values are determined by computing marginal contributions, which reflect contribution-oriented equity among stakeholders. Specific allocation methods should be selected strategically based on the distinct alliance structures and cooperative characteristics within SMUL.
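As a minimal sketch of two of these criteria, the code below computes Shapley values by averaging marginal contributions over all orderings of the players, and then checks whether an allocation lies in the core. The three players and the coalition values are hypothetical numbers chosen for illustration. Enumerating all orderings is only practical for small alliances.

```python
from itertools import combinations, permutations
from math import factorial


def shapley(players, v):
    """Shapley value: average marginal contribution over all join orders.
    v maps each frozenset coalition (including the empty one) to its value."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += v[with_p] - v[coalition]
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: totals[p] / n_orders for p in players}


def in_core(players, v, alloc, tol=1e-9):
    """Core test: the grand-coalition value is fully distributed and no
    sub-coalition could earn more on its own than its members receive."""
    if abs(sum(alloc.values()) - v[frozenset(players)]) > tol:
        return False
    for size in range(1, len(players)):
        for group in combinations(players, size):
            if sum(alloc[p] for p in group) < v[frozenset(group)] - tol:
                return False
    return True


# Hypothetical profits for every coalition of three providers.
players = ["bus", "drone", "truck"]
v = {
    frozenset(): 0,
    frozenset({"bus"}): 10,
    frozenset({"drone"}): 20,
    frozenset({"truck"}): 30,
    frozenset({"bus", "drone"}): 40,
    frozenset({"bus", "truck"}): 50,
    frozenset({"drone", "truck"}): 60,
    frozenset({"bus", "drone", "truck"}): 90,
}
allocation = shapley(players, v)   # bus 20.0, drone 30.0, truck 40.0
stable = in_core(players, v, allocation)   # True for these numbers
```

In this example the Shapley allocation happens to lie in the core, so it is both contribution-based and stable. That is not guaranteed in general, which is one reason the choice of method should depend on the alliance structure.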
Resilience Analysis
Logistics resilience refers to the capacity of logistics systems to sustain operations during, and recover from, disruptions (such as natural disasters and human-made incidents), technical failures, price fluctuations, and demand changes. The implications of logistics vulnerability, robustness, redundancy, and resilience need to be further investigated quantitatively. For instance, a resilience indicator based on the service capability of urban logistics systems is introduced, and a resilience-oriented capacity reallocation strategy is developed for performing reactive measures in response to DC closure and route capacity reduction (Li & Zhou, 2024). Overall, the existing literature on logistics resilience, especially last-mile distribution resilience (Pahwa & Jaller, 2023), is rather limited.
According to the SMUL framework, integrating multimodal vehicles may contribute to high resilience of the logistics system in several respects. First, adopting drones can reduce the negative effects of road network unavailability caused by traffic congestion or disasters. Second, the public transit system can allocate its surplus capacity to transshipping freight in order to adapt to uncertain delivery demand. Third, well-organized scheduling and coordination among different unmanned vehicles (i.e., drones, ADVs, ADRs, and mobile parcel lockers) within residential communities can reduce the operational cost of SMUL, which can improve customer satisfaction through lower prices and the sustainability of logistics systems through lower emissions. There are two types of strategies to enhance logistics resilience (Li & Zhou, 2024). Proactive measures are adopted at the pre-disaster stage to enhance the redundancy of logistics networks so that they can resist sudden node or arc disruptions. Reactive measures are used at the post-disaster stage to alleviate the functionality loss caused by unexpected events.
Conclusion
Urban logistics has been growing rapidly due to the expansion of e-commerce, in which last-mile delivery accounts for a significant portion of logistics costs and plays a critical role in customer satisfaction. Over the past few decades, various logistics-related technologies have been developed and adopted to improve warehouse operations and last-mile delivery efficiency. For instance, different types of automated guided vehicles (AGVs) and robots are deployed in well-designed warehouses to reduce travel delays and enhance throughput in large-scale logistics facilities. Meanwhile, unmanned aerial vehicles (UAVs, i.e., drones) and autonomous delivery vehicles (ADVs) are increasingly utilized for home or out-of-home deliveries in the emerging low-altitude economy. Previous studies and practical implementations are valuable, as urban logistics is a classical interdisciplinary field encompassing logistics engineering, system design, industrial engineering, transportation engineering, automotive engineering, and computer science. However, a notable gap remains between academic research on the human-centered design of multimodal logistics systems and their widespread real-world application. To bridge this gap, this paper proposes a comprehensive framework to identify application scenarios, core components, and key technologies of smart multimodal urban logistics (SMUL), aiming to enhance the integration, efficiency, and sustainability of next-generation urban logistics systems.
The potential scenarios for SMUL include last-mile out-of-home delivery for residential communities and urban logistics integrating freight vehicles and public transit. The proposed SMUL has four distinctive characteristics: co-modality of freight and passenger transport by public transit, integration of UAVs, ADRs, ADVs, and MPLs for last-mile delivery, adoption of smart warehouses with robots, and smart operation driven by LLM-based agents. The features and limitations of UAVs, ADRs, ADVs, and underground transport shared by passengers and goods are briefly discussed to support reasonable adoption in practice. As the main facilities of SMUL, smart warehouses and depots are discussed with regard to their functions and the adoption of AGVs. Accordingly, logistics service providers are recommended to integrate aerial, road, and underground transport modes in consideration of local traffic policies, land use, and public trust in technology to achieve both economic and social-ecological efficiency.
Several key technologies are further introduced to enrich the proposed framework in Figure 1 and for readers’ reference. Specifically, the structure of the SMUL digital twin is established for system design, VRP and integrated scheduling methods are analyzed for optimizing system operation, and an LLM-agent-driven simulation platform is proposed to support the economic analysis of the logistics system. To address the coordination complexities and evaluation challenges in system design, this paper advocates large-scale integrated scheduling and high-fidelity regional simulation as critical strategies. Furthermore, establishing robust service pricing and benefit distribution mechanisms is identified as a core pathway to resolve stakeholder conflicts and ensure the sustainable operation of the SMUL system.
Due to limited space, we mainly focus on the framework design, main components, and key technologies of the SMUL system. Several interesting topics remain for further investigation. On the one hand, the design methodology for smart logistics systems that combine freight logistics and passenger transportation needs to be further explored. On the other hand, applications of LLM agents for simulating the SMUL system and for scheduling large numbers of AGVs in smart warehouses and of different vehicles in the air-ground-underground logistics system are also worth studying.
Footnotes
Author Contributions
Hui FU conceived the study, developed the methodology, conducted experiments, and wrote the original draft. Hongpeng LI, Youbin CHEN, Jiahong ZHAO, and Zhenqi LIANG contributed to investigation and methodology. Zhixuan WU created visualizations. Nan ZHENG validated results and reviewed the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study is supported by Science and Technology Planning Project of Guangdong Province, China (Serial No.: 2024B1212100007), Natural Science Foundation of Guangdong Province, China (Serial No.: 2025A151501020), Opening Fund of Civil Unmanned Aircraft Traffic Management Key Laboratory of Sichuan Province, China (Serial No.: 2025UASKLSP01), and Natural Science Foundation of Sichuan Province, China (Serial No.: 2025ZNSFSC0394).
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Hui FU is a member of the editorial board of this journal. However, the author did not participate in the peer review process of this manuscript. We hereby declare that there are no conflicts of interest.
