Abstract
Nowadays, logistics service providers (LSPs) increasingly consider using a crowdsourced workforce on the last mile to fulfill customers’ expectations regarding same-day or on-demand delivery at reduced costs. The crowdsourced workforce’s availability is, however, uncertain. Therefore, LSPs often hire additional fixed employees to perform deliveries when the availability of crowdsourced drivers is low. In this context, the reliability versus flexibility trade-off that LSPs face over a longer period, for example, a year, remains unstudied. Against this background, we jointly study a workforce planning problem that considers salaried drivers (SDs) and the temporal development of the crowdsourced driver (CD) fleet over a long-term time horizon. We consider two types of CDs, dedicated gig-drivers (DDs) and opportunistic gig-drivers (ODs). While DDs are not sensitive to the request’s destination and typically exhibit high availability, ODs only serve requests whose origin and destination coincide with their own private route’s origin and destination. Moreover, to account for time horizon-specific dynamics, we consider stochastic turnover for both SDs and CDs as well as stochastic CD fleet growth. We formulate the resulting workforce planning problem as a Markov decision process whose reward function reflects total costs, that is, wages and operational costs arising from serving demand with SDs and CDs, and solve it via approximate dynamic programming (ADP). Applying our approach to an environment based on real-world demand data from Grubhub, we find that in fleets consisting of SDs and CDs, ADP-based hiring policies can outperform myopic hiring policies by up to
Introduction
In recent years, on-demand home delivery services have experienced significant growth, especially in urban areas. Global e-commerce sales grew by
To account for heterogeneous CD behavior, we consider two predominant types of CDs: the first type of CDs are dedicated gig-drivers (DDs), whose request acceptance behavior is not sensitive to the request’s destination and who typically exhibit high availability. Example companies relying on this type of workforce are Postmates, Instacart, DoorDash (in the US), or Rappi (in South America). DDs install an app on their phone and receive a notification when a new delivery request arises. They can then either accept the request or wait for another request. When serving the request, they receive compensation, typically proportional to the distance of the request’s route. Second, we consider ODs, which only serve requests whose origin and destination coincide with their private route’s origin and destination. For example, the company Roadie relies on this type of driver. As such a concept leverages pre-existing routes, it potentially reduces delivery traffic, emissions, and costs.
One major challenge with a mixed fleet of SDs and CDs is to ensure minimum service levels, which the LSP achieves by hiring the right number of SDs based on the expected demand and the uncertain CD supply unfolding throughout the planning horizon. While initially hired SDs might become obsolete if the number of CDs grows, not hiring enough SDs early negatively impacts the LSP’s service level in early time periods. Hence, the main focus of this article is to examine the trade-off between hiring a reliable workforce supply via SDs, who receive compensation regardless of their utilization during the contract period, and the uncertain supply from CDs, whose compensation is proportional to their utilization. Hiring obsolete SDs poses an issue especially if the contract duration for SDs is long. Such long contract durations are particularly prevalent in regions with stringent labor regulations, for example, the European Union. If the contract duration is rather short (cf. Amazon, 2023), the LSP is less committed to SDs and can decide ad hoc whether to prolong the SDs’ contracts.
Since the level of required SDs to match the demand depends on operational aspects, for example, request route patterns, we develop a framework that integrates decision-making on two planning levels: the strategic level, where the LSP makes hiring decisions, and the operational level, where the LSP decides on how to route its SDs and which request to outsource to CDs. In the remainder of this section, we relate our work to the existing literature (Section 1.1), state our contribution (Section 1.2), and outline the article’s structure (Section 1.3).
Related Literature
Three streams of literature relate to our work: vehicle routing problems (VRPs) with CDs, strategic workforce planning problems with conventional employees, and studies combining workforce planning and crowdsourcing for general on-demand service platforms and last-mile delivery companies. We detail these streams in the following.
A large body of literature has emerged in the field of VRPs with CDs. Early studies consider an LSP that optimizes route plans for deliveries from a single depot for SDs and expected CDs in a static day-ahead manner (Archetti et al., 2016; Gdowska et al., 2018; Torres et al., 2022). Other papers study delivery with CDs in a multi-depot (Sampaio et al., 2019) or many-to-many network (Raviv and Tenzer, 2018; Voigt and Kuhn, 2022). Further works consider a dynamic delivery problem (DDP) in a crowdsourced context (Arslan and Zuidwijk, 2019; Dayarian and Savelsbergh, 2020; Mak, 2020), which is a special class of a dynamic pick-up and delivery problem (Berbeglia et al., 2010), wherein drivers do not change their route or pick up another request once they have begun serving the current one. More recently, the meal delivery routing problem (MDRP) led to an increased focus on the DDP with CDs. In the MDRP, requests arise dynamically at random regions and must be delivered instantly to their destinations. In this context, some works consider random CD supply (Reyes et al., 2018), while others assume that CDs’ availability is known to the LSP (Ulmer et al., 2021; Yildiz and Savelsbergh, 2019). Our problem corresponds to a DDP in a many-to-many network using CDs, and we refer to the literature review of crowdsourced delivery in Alnaggar et al. (2021) and Savelsbergh and Ulmer (2022) for a comprehensive overview. So far, studies on the dynamic delivery setting consider relatively small instance sizes to benchmark their order matching and courier routing policies, for example, 24 drivers (Ulmer et al., 2021). Moreover, to the best of our knowledge, all works considering the dynamic delivery problem with CDs envision CDs to behave like DDs and neglect the potential of synchronizing demand with ODs.
In the strategic workforce planning problem with conventional employees, the objective minimizes costs from hiring, compensating, promoting, and operating a workforce over a certain time horizon. Several studies model employee hiring, training and learning, and turnover dynamics as a sequential decision-making problem formalized as a Markov decision process (MDP). Gans and Zhou (2002) considered the employee hiring problem of a service organization that wants to serve uncertain demand. They modeled hiring decisions, up-skilling transitions, and employees’ turnover rates, and formulated a total cost minimization objective, including an operational cost element. Similar studies include firing decisions (Ahn et al., 2005), propose heuristics to solve large instances (Song and Huang, 2008), consider worker heterogeneity (Arlotto et al., 2014), account for inter-departmental worker mobility (Dimitriou et al., 2013), model decisions on multiple organizational levels (Guerry and De Feyter, 2012), or focus on a specific application case, for example, healthcare (Hu et al., 2016). Further works use multi-stage stochastic programming combined with linearizations, Benders decomposition, or conic optimization (cf. De Feyter et al., 2017; Jaillet et al., 2022; Zhu and Sherali, 2009). Similar to these studies, we aim at finding total cost minimizing SD hiring policies over a long-term planning horizon. None of these works, however, considers the presence of a partially uncertain workforce whose size cannot be controlled. Incorporating such an uncertain workforce in our long-term SD hiring problem is the focus of our work.
Some studies investigate workforce management in a crowdsourced context. One stream of works analyzes general on-demand platforms controlling the supply of crowdsourced workers indirectly by adjusting the compensation offered for a service. Gurvich et al. (2019) studied such a platform and considered self-scheduling agents that decide to work based on expected compensation and their availabilities. Similar works focus on surge pricing to balance demand and supply (Cachon et al., 2017), on the influence of agents’ independence and customers’ delay sensitivity (Taylor, 2018), and platform commission schemes (Zhou et al., 2019). Similarly to these works, we consider self-scheduling agents as part of our workforce. However, our problem formulation differs significantly from existing works, as we consider them jointly with conventional employees (the SDs) and control our workforce solely through the hiring process of SDs. Finally, studies combining workforce planning and crowdsourced delivery are closest to our work. Dai et al. (2017) studied a problem with in-house drivers (equivalent to permanent employees) and part- and full-time CDs, and derived optimal in-house driver and CD staffing levels at different depots and times of one day based on a deterministic demand scenario. Similarly, Behrendt et al. (2022a), Cheng et al. (2023), and Goyal et al. (2023) considered hybrid crowdsourced fleets with joint SD fleet-sizing and operational decision making, respectively focusing on warehouse allocation decisions, robust workforce management, and order pricing. All of these studies are restricted to a time horizon of one day, similar to Ulmer and Savelsbergh (2020) and Behrendt et al. (2022b), who focus on pure CD fleets and consider two types of CDs: scheduled CDs that announce their availability prior to the operational time horizon and unscheduled couriers that arrive ad hoc while the LSP already operates.
They aim to find the optimal set of schedules for one day to minimize fixed costs associated with scheduled CDs and operational costs. While the former employ a classical value function approximation approach, the latter use neural networks to find the optimal set of shifts. Finally, Lei et al. (2020) also considered a one-day planning horizon and an entirely crowdsourced delivery platform and studied mechanisms to reduce demand-supply imbalance by outsourcing excess requests to drivers willing to prolong their scheduled shifts. While these works study joint SD acquisition and operational planning, they consider short-term planning horizons. Hence, these works do not account for long-term dynamics, for example, workforce turnover or stochastic CD fleet growth. Moreover, the LSPs’ contractual commitment when hiring SDs reduces to one day in the studies above. However, in many legislative systems, contracts for fixed employees must have a minimum duration of a year, even when considering temporary contracts. Changing demand levels or increasing CD supply might make these fixed employees obsolete before their minimum contract duration terminates. Our work addresses this open issue by considering long-term time horizons.
In conclusion, our work closes three gaps in the literature combining crowdsourced delivery and workforce management. First, to the best of our knowledge, no work considers joint SD fleet sizing and operational decision-making on a long-term time horizon; existing works thereby neglect dynamics such as workforce turnover or stochastic CD fleet growth. Second, all works considering the dynamic delivery problem with CDs envision CDs to behave like DDs, hence disregarding the potential to synchronize demand with ODs. Third, studies on the dynamic delivery setting with CDs consider relatively small instance sizes. Yet, the instant delivery market, especially in urban areas, is expected to grow significantly, thus calling for studies accounting for large demand scenarios and large delivery fleets.
Contribution
To close the research gaps outlined above, we develop a novel framework to study the long-term workforce planning problem in the context of hybrid crowdsourced delivery fleets. To account for the interplay between workforce planning and operations, we integrate hiring decisions for a long-term time horizon with operational decisions regarding SD relocation and outsourcing of demand to CDs. Moreover, we consider two CD types, DDs and ODs, which exhibit distinct request acceptance behaviors. While the former is less sensitive to a request’s origin and destination and typically exhibits higher availability, the latter only accepts requests whose origin and destination coincide with their private route’s origin and destination.

Different temporal entities and their dependency. Here,
Specifically, our contribution is threefold. First, we formalize the strategic level planning problem as a novel stochastic workforce planning problem, wherein the LSP needs to decide on how many SDs to hire or fire while taking into account uncertain CD supply. We model the strategic level as a finite-horizon MDP. Here, the objective is to minimize total costs arising from SD wages and operational costs. To obtain the latter term for large fleets within reasonable computation times, we approximate the operational problem with a fluid model. Second, we prove the value function’s convexity along the SD dimension and use this property to develop a lookahead policy based on piecewise linear value function approximation (PL-VFA), which approximately solves our strategic problem. Third, we conduct numerical studies based on real-world data provided by Grubhub (2018), wherein we benchmark our PL-VFA against a myopic policy and a lookahead policy with perfect knowledge of future information. Furthermore, we evaluate sensitivities of strategic and operational levels’ parameters, for example, joining and resignation rates of CDs. Our main findings are as follows: (i) A hiring policy obtained from PL-VFA can yield up to
We structure the remainder of this work as follows. In Section 2, we introduce our problem setting, describing decisions and events on the strategic level and the dynamics of the operational level. In Section 3, we formalize the strategic level as an MDP and introduce a closed queueing network to model the operational level. Moreover, we introduce our PL-VFA for finding the optimal number of SDs and derive a fluid approximation for our operational planning problem. We detail the design of experiments for our numerical study in Section 4 and discuss results in Section 5. We conclude this article with a short synthesis in Section 6.
Problem Setting
In the following, we introduce our problem setting. First, we provide a high-level descriptive overview in Section 2.1, before we formalize and detail the problem dynamics and objectives in Section 2.2.

Sequence of events and decisions in time step
In this article, we focus on an LSP providing on-demand delivery services in an urban area
Mathematical Formalization
We start by describing the sequence of events and decisions during one time step
To obtain
We denote the costs of serving a request with delivery option
Characteristics of SDs, DDs and ODs.
SDs = salaried drivers; DDs = dedicated gig-drivers; ODs = opportunistic gig-drivers.
We now describe the set of constraints we need to fulfill in each time step
Constraints (2.4b) ensure that
Let us for now assume that we obtain some approximation for
Second, we assume that there are no SD supply shortages, as we restrict our problem to urban areas that typically have an abundant workforce supply, especially in the gig economy sector.
Third, severance payments implicitly prescribe the minimum duration for which the LSP must hire SDs. If
Finally, we consider a finite time horizon on the strategic level since LSPs’ strategic workforce planning process relies on finite horizons, for which they can leverage a robust forecast. This is in line with works in the strategic workforce planning literature (cf. Gans and Zhou, 2002).
This section formalizes the problem setting presented in Section 2. We model the strategic level’s problem as an MDP (Section 3.1) and the operational level’s problem as a closed queueing network (Section 3.2). Finally, we present an approximate dynamic programming approach to solve large instance sizes in Section 3.3.
Strategic Level Workforce Planning
In this section, we formalize the LSP’s workforce planning problem, outlined in Section 2, as an MDP. In the following, we will successively describe the state, the feasible actions, the state transition, the policy, and the objective function.
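The components listed above can be illustrated with a minimal sketch of one strategic time step. It is not the article's formal model: the class name FleetState, the function transition, the order of events (decision first, then turnover, then joining), and the binomial turnover draw are assumptions made for illustration. The joining process is passed in as a callable because its distribution is not reproduced here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetState:
    """Hypothetical strategic-level state: fleet sizes per driver type."""
    sds: int  # salaried drivers
    dds: int  # dedicated gig-drivers
    ods: int  # opportunistic gig-drivers


def binomial(n, p, rng):
    """Number of successes in n independent trials with probability p."""
    return sum(rng.random() < p for _ in range(n))


def transition(state, hire_delta, rng, q_sd, q_dd, q_od, join_dd, join_od):
    """One illustrative step: apply the hiring action, then sample turnover.

    hire_delta > 0 hires SDs, hire_delta < 0 fires SDs (assumed action encoding).
    q_* are per-driver resignation probabilities; join_* are callables that
    sample the number of newly joining CDs (distribution left to the caller).
    """
    sds_post = state.sds + hire_delta  # post-decision number of SDs
    if sds_post < 0:
        raise ValueError("cannot fire more SDs than are employed")
    sds_next = sds_post - binomial(sds_post, q_sd, rng)
    dds_next = state.dds - binomial(state.dds, q_dd, rng) + join_dd(rng)
    ods_next = state.ods - binomial(state.ods, q_od, rng) + join_od(rng)
    return FleetState(sds_next, dds_next, ods_next)
```

Only the SD count is controlled directly; the CD counts evolve through resignations and joiners, which matches the idea that the LSP steers its workforce solely through SD hiring.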
Given the necessity to solve the operational level’s problem in each time step of the strategic level, an efficient approximation is essential to preserve computational tractability. Current research on mobility on demand (MoD), similar to the on-demand last-mile delivery problem examined in our study, shows that greedy matching heuristics are only marginally surpassed by lookahead policies based on methods such as model predictive control or deep reinforcement learning (Enders et al., 2023). Therefore, we rely on greedy driver-to-request matching and use a forward-looking SD relocation policy. This approach allows us to employ a fluid approximation model for the operational costs as suggested by Braverman et al. (2019). First, we formalize the operational level’s problem as a closed queueing network. We base our formulation of the queueing network on the model from Zhang and Pavone (2016). Herein, we assume that requests can only be served by drivers in the same region as the request’s origin. This is plausible if the discretization of the area
We now introduce the fluid approximation of the presented closed queueing network as proposed by Braverman et al. (2019). Herein, we consider steady-state conditions, that is,
The fluid approximation reformulates the closed-queueing system as a network flow problem, whose counterparts to the closed-queueing system’s queue lengths
The optimal objective of the LP described by equations
We use the solution of the LP 3.4 to approximate the operational costs obtained in
In this section, we present a stochastic dynamic programming approach to solve the workforce planning problem on the strategic level. Section 3.3.1 briefly discusses a standard backward dynamic programming (BDP) procedure to find the optimal policy

Piecewise linear approximation along the salaried driver (SD) dimension for a fixed
We start by describing a BDP approach, which allows us to determine the optimal workforce planning policy
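To make the backward recursion concrete, the following toy sketch solves a strongly simplified hiring problem by BDP. It is an illustration under stated assumptions, not the article's formulation: the state is reduced to the number of SDs, each driver serves one unit of demand, CD supply per period is given by a small discrete distribution, there is no turnover, and all names and cost parameters (backward_dp, wage, hire, sever, penalty) are hypothetical.

```python
def backward_dp(demand, cd_dist, n_max, wage=1.0, hire=0.0, sever=0.0, penalty=5.0):
    """Toy finite-horizon backward dynamic programming over SD fleet sizes.

    demand[t]  : demand in period t (units one driver can serve, assumed)
    cd_dist[t] : list of (probability, cd_supply) scenarios for period t
    n_max      : largest SD fleet size considered
    Returns (V, policy): V[t][n] is the expected cost-to-go with n SDs at the
    start of period t; policy[t][n] is the SD level chosen for period t.
    """
    T = len(demand)
    V = [[0.0] * (n_max + 1) for _ in range(T + 1)]  # terminal values are zero
    policy = [[0] * (n_max + 1) for _ in range(T)]
    for t in range(T - 1, -1, -1):
        for n in range(n_max + 1):
            best_cost, best_m = float("inf"), n
            for m in range(n_max + 1):  # m = SD level after hiring or firing
                # expected unserved demand given m SDs and uncertain CD supply
                shortfall = sum(p * max(demand[t] - s - m, 0) for p, s in cd_dist[t])
                cost = (wage * m
                        + hire * max(m - n, 0)
                        + sever * max(n - m, 0)
                        + penalty * shortfall
                        + V[t + 1][m])
                if cost < best_cost:
                    best_cost, best_m = cost, m
            V[t][n], policy[t][n] = best_cost, best_m
    return V, policy
```

With zero hiring and severance costs, the recursion reduces to a period-by-period choice. Once firing becomes expensive, the optimal SD level in late periods depends on what was hired earlier, which is the commitment effect the strategic problem captures. The nested loops also show why exact BDP is limited to small instances.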
In this section, we introduce an algorithm that approximates the value of being in post-decision state
We denote the number of SDs in the post-decision state by
Then, we observe samples of
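One common way to maintain a convex piecewise linear approximation along a single integer dimension is to store its slopes and restore monotonicity after every sample-based update. The sketch below shows such a leveling-style update. It is a generic illustration from the ADP toolbox, and whether it matches the exact update used in this article is not stated here; the function names, the step size alpha, and the helper best_sd_level are assumptions.

```python
def update_slopes(slopes, k, sample_slope, alpha):
    """Smooth a sampled slope into segment k, then restore convexity.

    slopes[i] approximates V(i + 1) - V(i) along the SD dimension; convexity
    of V means the slopes are nondecreasing in i.
    """
    v = list(slopes)
    v[k] = (1 - alpha) * v[k] + alpha * sample_slope
    # leveling step: pull violating neighbors to the updated slope
    for i in range(k):
        if v[i] > v[k]:
            v[i] = v[k]
    for i in range(k + 1, len(v)):
        if v[i] < v[k]:
            v[i] = v[k]
    return v


def value(v0, slopes, n):
    """Evaluate the piecewise linear approximation at n SDs."""
    return v0 + sum(slopes[:n])


def best_sd_level(v0, slopes, immediate_cost):
    """Pick the SD level minimizing immediate cost plus approximate cost-to-go."""
    levels = range(len(slopes) + 1)
    return min(levels, key=lambda n: immediate_cost(n) + value(v0, slopes, n))
```

Because the slopes stay nondecreasing, the approximation remains convex after every update, so the one-dimensional minimization over the SD level stays well behaved.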
This section describes our experimental setup for a subsequent managerial analysis. In the first part, we present the setup for the operational level’s problem, which is based on a real-world data set describing spatial and temporal order patterns for on-demand food deliveries. In the second part, we discuss the parameter settings on the strategic level and sensitivities to be analyzed.

Spatial and temporal request patterns in instance 0o100t75s1p100. (a) Origin and destination pairs of orders; (b) no. of orders between
To account for a real-world scenario, we consider a data set provided by Grubhub (2018), which describes anonymized food delivery orders. The data set consists of 10 different instances. Each instance represents one US metropolitan area. Each order is characterized by its origin and destination coordinates. Moreover, each order is described by its placement and ready time. The former is when a customer orders through the Grubhub platform, and the latter is when the order is ready to be delivered. To keep the computational complexity of our experimental evaluation manageable, we randomly chose the instance type with initial digits “0o100” and only used the order information contained within it. The “0” encodes the metropolitan area on which the order information is based. The “o100” describes that 100% of orders are used. The remaining digits encode driver schedules, that is, the times and locations at which drivers start and end their shifts. As we do not model driver shifts on the operational level, we ignore the remaining digits. Figure 4 highlights its spatial and temporal order distribution. Orders occur from minute
We set the strategic time horizon to a year and divide it into
Description of Base Case Parameters and Variations
We now present the parameter settings required for the strategic level MDP and the fluid approximation on the operational level. We start by describing a base case and then present parameter variations.
We motivate the base case resignation probability and joining rate by a statistical evaluation initially made for Uber drivers between 2012 and 2016 (Hall and Krueger, 2018). On the strategic level, we consider a constant resignation probability of
To model the joining process, we assume the number of newly joining CDs
We consider a homogeneous demand growth rate of roughly
In the base case (see Table 2), we set wages for SDs to
Base case parameters and their variations.
Deviation from optimal solution,
PL-VFA = piecewise linear value function approximation; MY = myopic policy; DD = dedicated gig-driver; SD = salaried driver; OD = opportunistic gig-driver.
To assess the results, we evaluate the quotient
In the first part of this section, we validate our PL-VFA (Section 5.1) before analyzing the structural properties of a policy derived by PL-VFA in the base case (Section 5.2). In Section 5.3, we study the policies’ and parameter variations’ impact on total costs from an LSP perspective. Finally, we take the CDs’ perspective and compare different behavioral assumptions. We implemented the strategic level’s MDP in Python and used Gurobi 9.1.2 to solve the operational problem. We performed all experiments on a workstation with a GHz i9-9900 CPU at 16
Validation of PL-VFA
To validate the PL-VFA approach, we evaluate PL-VFA on smaller instances and consider only DDs. In these instances, we can compute a solution with BDP. We study three demand scenarios: constant demand, growing demand, and peak demand. Moreover, we vary the initially available numbers of DDs. We benchmark the results obtained by PL-VFA with a myopic policy (MY), which always hires enough SDs to serve the demand in the current time step
Hiring Policy Comparison in the Base Case
We begin this section by studying the difference in the number of SDs hired by a policy obtained from PL-VFA and MY, which we denote by

Difference between
The number of overhired SDs in
Figure 6 shows

Variation of CD joining rate. (a) Variation of the DD joining rate

The advantage of
To provide a better intuition on how
We now study the effect of

Combined impact of varying joining rates on

Variation of CD costs. (a) DDs’ costs per km
The increase of
Figure 8(b) shows the driver and penalty cost split as a percentage of total costs for varying
In mixed fleets, SDs are the main total costs driver with a cost share of up to

Variation of severance payment and SD fix costs. (a) Variation of severance payment
In Figure 9(a), we report DD costs per km. The cost-saving potential decreases with increasing
The cost saving potential is more sensitive to DD costs than to OD costs. The advantage of
Finally, to understand the impact of firing flexibility in workforce planning, we vary the severance payment
Compared to
Varying the share of CDs being active within

Variation of
When the availability of CDs increases to
To understand the conditions under which an operator prefers certain drivers over others in the delivery process, we perform a scenario analysis: we first construct scenarios that vary by one parameter compared to the base case. Then, we analyze the average share of requests delivered by each driver type over the time horizon
Scenarios and preferred delivery option.
SD = salaried driver; DD = dedicated gig-driver; OD = opportunistic gig-driver; CD = crowdsourced driver.
a Share of delivered requests of
Figure 12 shows the average share of requests served by each driver type over the time horizon

Average share of requests delivered (in %) over
In the base case, when SD wages are low, and when CD supply is high, the delivery options are balanced. However, when payments to DDs and CD supply are low, the share of requests surpasses
In the previous analysis, we observed that ODs play a minor role in the request delivery process. Apart from payments to ODs, two more factors influence ODs’ share in request matches: temporal patterns (

Share of requests delivered by ODs (in %) in
ODs’ share in requests delivered increases to more than
So far, we assumed that CDs leave the LSP according to a fixed resignation rate. In the next section, we study the effect of a resignation probability that depends on the number of unmatched CDs.
Figure 14 reports the average percentage of unmatched DDs and ODs for different joining rates over the entire time horizon. We observe that the percentage of unmatched DDs (cf. Figure 14(a)) is higher than the percentage of unmatched ODs, especially when

Percentage of unmatched CDs when the resignation probability does not depend on the number of unmatched CDs. The
As the joining rate grows, CD supply surpasses demand, and the LSP can no longer outsource demand to CDs. Furthermore, the higher quotient of unmatched DDs is plausible since more DDs are active on the operational level than ODs due to their higher
Figure 15 shows the number of unmatched CDs now assuming that CDs’ resignation probability depends on

Percentage of unmatched CDs when CDs’ resignation probability depends on number of unmatched CDs. The

Difference between the total number of drivers when resignation probability depends on the number of unmatched CDs and when it does not. The
The LSP has to hire
In this article, we studied the strategic workforce planning of an LSP providing on-demand delivery services with a mixed fleet of couriers consisting of SDs and CDs. We integrated long-term strategic SD hiring decisions and short-term operational decisions regarding driver dispatching. We formalized the strategic hiring and firing problem as an MDP and solved it with approximate dynamic programming based on piecewise linear value function approximation, which allows us to study large-scale instances. We incorporated operational costs in the MDP’s cost function using a fluid approximation to account for delivery operations.
We conducted a case study based on a real-world data set from Grubhub for food delivery in a metropolitan area located in the US. Herein, our studies led to several findings, which we synthesize in the following.
Total costs obtained with PL-VFA are either equal to or up to
SDs and DDs are the main cost drivers in the total cost mix with up to a 50% and a 30% share in total costs, respectively. The significance of SD costs in the total cost mix stresses the importance of finding good SD hiring policies, which minimize the number of SDs hired. DDs have the second highest contribution to total costs, when
The LSP has to hire more than
This work opens up a promising new research avenue in the field of crowdsourced deliveries by combining the study of crowdsourced delivery fleets with long-term workforce planning. Specifically, this work provides a foundation for follow-up studies. First, the SD workforce planning problem could be extended by accounting for SDs with different contract durations or working schedules. Moreover, one could introduce uncertainty in the demand dimension on the strategic workforce planning level. Finally, one could implement behavioral components for CDs, for example, discrete choice models based on real-world data, to more accurately represent the CDs’ behavior, for example, regarding resignation processes.
Supplemental Material
Supplemental material, sj-pdf-1-pao-10.1177_10591478241268602 for Strategic Workforce Planning in Crowdsourced Delivery With Hybrid Driver Fleets by Julius Luy, Gerhard Hiermann and Maximilian Schiffer in Production and Operations Management
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship and/or publication of this article.
How to cite this article
Luy J, Hiermann G and Schiffer M (2024) Strategic Workforce Planning in Crowdsourced Delivery With Hybrid Driver Fleets. Production and Operations Management 33(11): 2177–2200.