Abstract
We investigate an ambulatory care scheduling problem derived from a real case in Ontario, Canada that offers multi-appointment, multi-class, multi-priority treatments in geographically distributed campuses with multiple resources. We consider a dynamic setting with uncertain patient arrival and use of the emergency department. This problem is formulated as an infinite-horizon Markov decision process model. Since we cannot solve large-sized instances via conventional approaches, we hybridize this model with a neural network to simplify feasibility constraints while respecting all assumptions. Given the curse of dimensionality, we use an affine approximation architecture to estimate the value function. An equivalent linear programing model is solved through column generation in order to compute approximate optimal policies and derive two easy-to-implement scheduling policies. Simulation results demonstrate that the approximate optimal policy and heuristics outperform alternative scheduling policies. Finally, we demonstrate that the application of our methodology can enhance performance metrics in a large ambulatory care center in Canada. We show that a template-based scheduling rule can result in high resource utilization but poor scheduling decisions. However, an efficient scheduling policy equips a booking clerk with intelligent scheduling rules that are difficult for her to predict in real-time and work well in comparison to scheduling templates.
Keywords
Introduction
Healthcare providers are under pressure to enhance service quality and reduce costs (Papanicolas et al., 2018). Given the greater emphasis on preventive medicine, outpatient services have become an increasingly important element in our health systems (Cayirli and Veral, 2003). Delays in outpatient services are the consequence of insufficient capacity as well as inefficient scheduling policies (Sauré et al., 2012). Thus, the optimal scheduling of outpatients is crucial to decrease wait times and increase the utilization of limited and expensive personnel.
Ambulatory care (AC) denotes health services offered on an outpatient basis (Comerford and Shah, 2019), including diagnosis, observation, consultation, treatment, and rehabilitation services. AC services aim to reduce avoidable inpatient stays, equipping patients with the knowledge to manage their treatment backed by a multidisciplinary team (Cooper and de Lord, 2018). AC is one of the largest-volume health services in Canada, making it a significant component of the health system (CIHI, 2022).
Historically, complex treatments were administered in an inpatient setting due to their uncertainties (Cooper and de Lord, 2018). This put healthcare providers in unviable positions due to increasing expenditures and ever-rising demand. Thus, AC has become the primary provider of various treatments in recent years (e.g., chemotherapy, infusion therapy) (Comerford and Shah, 2019). Despite these advancements, wait times remain excessive, a direct consequence of earlier diagnoses, an aging population, advances in diagnostics, increased survival rate and more effective care procedures (Cooper and de Lord, 2018; Fung-Kee-Fung et al., 2022). Delayed treatments can have a substantial negative impact on outcomes (Fung-Kee-Fung et al., 2022). Long wait times are one of the main sources of patient (and staff) dissatisfaction, depression and anxiety (Frick et al., 2007). Apart from affecting patient and staff satisfaction, long wait times may adversely affect patient adherence to scheduled appointments (O’Neill et al., 2012).
In this paper, we investigate an AC scheduling problem for a hospital with geographically distributed campuses, where all new arrivals are added to a unique waiting list. Patients on the waiting list may end up either using the Emergency Department (ED) or transferring to another AC center due to long wait times (or medical abnormalities). The daily challenge facing the booking clerk is to assign the available capacity between various patient classes in order to decrease the number of patients whose wait time exceeds a preset threshold with greater importance given to any late bookings of higher-priority demand (Patrick et al., 2008). Thus, our research objective is to design a decision support tool to schedule patients in a distributed center with patients’ arrival and use of the ED being uncertain. While we develop our problem assumptions based on a partnership with a hematology and oncology AC center in Canada, our methodology remains practical for various other AC scheduling applications, simply by relaxing some of our assumptions.
The remainder of this paper is organized as follows. Section 2 presents a review of the literature and specifies the contributions. Section 3 provides the problem description and introduces the chief components of a dynamic program for our distributed AC scheduling problem, and Section 4 develops an approximate dynamic programing (ADP) algorithm to solve large-sized instances of this problem. Section 5 offers computational results and managerial implications. Finally, Section 6 provides conclusions and future research directions.
Literature Review
Outpatient scheduling problems have received considerable attention from scholars and practitioners (see Ahmadi-Javid et al. (2017) and Cayirli and Veral (2003) for broad literature reviews). Typically, outpatient scheduling problems are divided into two levels: (i) Advance scheduling (allocating capacity to future demand), and (ii) appointment scheduling (specifying patients’ appointment times) (Pan et al., 2020). In this section, we first review the literature on outpatient scheduling using ADP algorithms. Second, we review chemotherapy appointment scheduling papers. Finally, we discuss the main contributions of our research.
Dynamic Programming for Outpatient Scheduling
The exponential growth of complexity in dynamic programs is known as the “curse of dimensionality,” one of the main downsides of dynamic programing. ADP is a growing research area that aims to overcome this obstacle primarily through three streams: (i) Linear programing-based algorithms, (ii) simulation-based algorithms, and (iii) aggregation-based algorithms. Linear programing-based ADP algorithms have attracted the most attention in the AC scheduling literature. Patrick et al. (2008) studied a multi-priority advance scheduling problem for a diagnostic facility in a public healthcare setting. They proposed a Markov decision process (MDP) model that minimizes wait time, deferral and diversion costs. The authors developed a linear programing-based ADP algorithm with linear basis functions and solved it through column generation (CG). Sauré et al. (2012) addressed a multi-priority multi-appointment advance scheduling problem where patients can receive treatment across multiple days and for irregular lengths of time. The authors formulated this problem as a discounted infinite-horizon MDP and developed an ADP algorithm using an equivalent linear programing formulation and an affine architecture to deal with the intractable numbers of variables and constraints.
Zhou et al. (2022) studied a multi-class advance scheduling problem for a diagnostic facility considering heterogeneous wait time targets and access equity. They developed a finite-horizon MDP model which minimizes the total expected costs. To deal with the curse of dimensionality, the authors reformulated the MDP as a multi-stage stochastic programing model and proposed a modified Benders decomposition algorithm based on new dual integer cuts. Diamant et al. (2018) studied an advance scheduling problem where patients require multiple appointments without any pre-specified order. Due to patient no-shows, Diamant et al. considered overbooking to ensure full capacity utilization. The authors proposed an infinite-horizon MDP that maximizes the discounted profit and solved it through an equivalent linear programing model with the value function approximation and decision variable aggregation. Likewise, Diamant (2021); Göçgün (2018a); Göçgün and Puterman (2014); Parizi and Ghate (2016); Sauré et al. (2020); Wang et al. (2018); Wang and Fung (2015a, 2015b) developed linear programing-based ADP algorithms to address outpatient scheduling problems.
Another stream in the ADP literature is simulation-based algorithms. Sauré et al. (2015) proposed a simulation-based ADP algorithm for advance patient scheduling. This algorithm was based on policy iteration and a post-decision state formulation, and used a non-linear logistic value function approximation. Sauré et al. demonstrated that policies obtained through their proposed algorithm outperform other scheduling policies in most cases. Göçgün (2018b) studied a multi-appointment advance scheduling problem for radiation therapy with stochastic arrivals and appointment cancelations. The author formulated an MDP model and deployed a simulation-based ADP algorithm that relies on least-squares-based approximate policy iteration. Computational results demonstrated better performance of the proposed ADP algorithm in comparison to the myopic heuristic decision rule. Schuetz and Kolisch (2012) considered a multi-class advance scheduling problem for service industries where the service company can either confirm or reject arriving demand. They let the service duration be stochastic and allowed for no-shows and cancelations. Schuetz and Kolisch designed a simulation-based ADP algorithm based on a post-decision state formulation and showed the efficiency of their methodology regarding the objective function, solution time and memory requirement.
Wang and Fung (2015a) addressed a multi-class sequential appointment scheduling problem while respecting patient preferences for the appointment time and solved it using a simulation-based ADP algorithm. The authors demonstrated that a well-designed appointment system can enhance the resource utilization and the revenue of outpatient departments. Lu et al. (2018) studied an advance scheduling problem where all requests receive estimated appointment dates upon their arrival. The authors formulated an MDP model and solved it through a simulation-based ADP algorithm that aggregates multiple days into a single period. This research showed that informing patients of their appointment dates in advance is a cheap yet effective solution to improve their experiences. Similarly, Geng et al. (2011); Li et al. (2018); Lin et al. (2011) deployed simulation-based ADP algorithms to solve outpatient scheduling problems.
Aggregation-based algorithms are the last stream in the ADP literature for AC scheduling problems. Wang et al. (2018) studied an appointment scheduling problem considering patient preference for the appointment time. While appointment requests arrive sequentially within a booking horizon, patients choose between a number of time slots (specified based on the availability of physicians). The authors proposed a dynamic program that maximizes the expected revenue for a given service day. They applied an aggregation-based approximation method which aggregates several time slots into a single slot and used a complete-set policy to decrease the number of system states as well as decisions. Li et al. (2018) also investigated a sequential appointment scheduling problem where patient choices for physicians and time slots are considered. They proposed an MDP model that maximizes the expected patient satisfaction and solved the problem through an aggregation-based ADP algorithm. Li et al. found that different aggregation levels lead to different results and computation times. Gedik et al. (2017) investigated a patient admission planning problem for a proton therapy facility in order to maximize the total expected number of treatment sessions delivered to patients in a planning period with stochastic patient arrivals. The problem was formulated as an MDP model that was simplified using a state aggregation technique developed based on the fixed-weight method.
Hematology and Oncology Appointment Scheduling
A hematology and oncology AC center offers various treatments, including—but not limited to—intravenous immunoglobulin, chemotherapy, iron infusion, paracentesis, phlebotomy, blood transfusion, as well as some diagnostic procedures. Patients of these clinics typically go through an initial consultation session, followed by some regular treatment sessions. Haghi et al. (2023) investigated the simultaneous scheduling of consultation and treatment sessions for different patient classes. The authors evaluated decisions regarding the assignment and sequencing of patients for nurses and beds. They assumed that treatment duration is uncertain and consists of setup and infusion times, where each nurse can only set up one patient at a time and can monitor several patients during the infusion. According to Garaix et al. (2020), the daily care in a hematology and oncology AC center includes (i) consultation, (ii) drug preparation, and (iii) injection. In this paper, the authors addressed all three decisions simultaneously, assuming a series of fixed injection plans once patients are booked for their first injection appointment. They allow nurses to work overtime with a penalty while injection duration uncertainty and patient punctuality.
On the other hand, some papers focus only on the drug injection scheduling decisions. Gul (2023) assumed that each patient has a primary nurse who should be served by the same nurse throughout their course of treatment, which can be violated with a penalty. Their approach decides on patient sequencing, appointment time setting, and patient assignment to nurses and beds. Their model minimizes patient clinic wait time, nurse overtime, and violations from the primary nurse assignment under infusion duration uncertainty. Demir et al. (2021) examined chemotherapy scheduling, taking into account the availability of nurses and beds amid service duration uncertainties. The authors considered a two-component service duration (premedication and infusion). Nurses can handle only one patient for premedication, however, they can proctor the infusion process of multiple patients. Their method sequences patients on a daily appointment list, determines appointment times, and assigns patients to nurses and chairs in order to minimize a weighted combination of patient waiting time, chair idle time, and nurse overtime. Similarly, Benzaid et al. (2020); Karakaya et al. (2023) investigated the drug injection scheduling decisions.
Another stream of studies is the use of scheduling templates for hematology and oncology AC appointment scheduling. Given a fixed scheduling template for a long planning horizon, Hesaraki et al. (2023) focused on the online, daily appointment scheduling for a hematology and oncology AC center. The main purpose of this paper is to minimally adjust the input scheduling template in order to accommodate new and returning patients, optimizing the interests of multiple stakeholders. Hahn-Goldberg et al. (2014) similarly focused on daily appointment scheduling for a hematology and oncology AC center. New appointment requests are scheduled according to a predefined scheduling template. If a request does not fit the template, their proposed approach updates the template using an optimization model. To handle last-minute additions and cancelations, the authors proposed a shuffling algorithm that adjusts appointment start times within a predefined time limit. Their findings show that dynamic template scheduling can improve makespan by up to 20% compared to current practices. Likewise, Condotta and Shakhlevich (2014); Faridimehr et al. (2021); Huang et al. (2019) studied the scheduling templates for hematology and oncology AC patients.
Main Contributions
Our main contributions are four-fold. To the best of our knowledge, our paper is the first to develop an ADP algorithm for a distributed multi-appointment, multi-class, multi-priority, multi-resource AC scheduling problem. Not only do we assume heterogeneity of resources (beds and nurses) and different resource requirements of patient classes, but we also consider that service durations associated with beds and nurses have different settings (i.e., a bed is fully busy during a patient appointment while a nurse may not be). Second, nearly all the literature focuses on either advance scheduling problems or appointment scheduling problems because handling both at the same time is complex (Sauré et al., 2020). Considering both decision levels requires binary variables to formulate the problem, introducing computational challenges (i.e., we cannot solve large-scale instances of the problem within a reasonable time via conventional approaches). To address this, we propose a hybrid methodology that combines ADP with a neural network (NN). This approach effectively incorporates patient allocation and sequencing considerations, enabling us to solve large-scale advance scheduling problems that are more constrained than those typically addressed by infinite-horizon ADP algorithms in the literature. Moreover, the methodology is versatile and can be applied broadly to other optimization problems under complex constraints (e.g., machine scheduling, storage location assignment, crew scheduling, facility layout), where a component of the problem can be partitioned in a similar fashion. Third, we assume multi-component service durations as described in Section 3. Such an assumption is rare in the literature, and to the best of our knowledge, our research is the first to consider three unique components associated with hematology and oncology treatments in the service duration. This assumption is applicable to a wide range of other problems, such as (i) operating room scheduling where surgical procedures have pre- and post-operative phases, (ii) manufacturing scheduling problems with setup, processing and breakdown phases, (iii) home healthcare visits where some procedures involve initial setup by a nurse when they can perform other tasks, (iv) logistics problems with the notion of loading and unloading in freight or transportation. Fourth, we propose two easy-to-implement heuristics for our problem that significantly outperform benchmark policies and perform well in comparison to our primary solution approach. The policies provide generalizable managerial insights that could benefit similar scheduling problems.
Dynamic Programming
This section describes all the assumptions and objectives of the problem under consideration and, then formulates the MDP model.
Problem Description
We focus on a distributed hematology and oncology AC (i.e., the center offers services on multiple campuses), that aims to avoid hospital admissions during the course of treatment by providing timely care. We investigate a setting with nearby campuses and assume patients are indifferent to the campus they are allocated to, as long as they receive all their treatment at the same campus (continuity of care). Each campus in the hematology and oncology AC has a different but known availability of human and physical resources. All campuses have the resources (nurses and beds) to serve benign treatments, and each is also equipped to serve a specific group of malignant treatments. Each campus has fixed operating hours with no planned overtime, meaning our planned schedule includes no built-in overtime. However, if the solutions are implemented, some overtime may still occur due to uncertainties involved in practice.
Demand is prioritized (on a scale from one to five—the lower the priority, the higher the urgency) and may require a single- or multi-appointment treatment (each treatment on a particular day) with a specific treatment frequency. Each patient is associated with a wait time threshold considering her treatment class (or type) and priority level. The wait time is an important performance metric for hematology and oncology AC centers given its impact on health outcomes and patient satisfaction. A deferral cost occurs each time a patient on the waiting list is left unscheduled. The deferral cost ensures that scheduling decisions are not delayed unnecessarily. For example, if a patient arrives on day 25 and has a three-day wait time threshold, scheduling her for day 28 on either day 26 or day 27 will incur no wait time cost. However, making scheduling decisions earlier is preferable and is incentivized by adding the deferral cost. Patients may utilize the ED due to their health condition, which is costly for the health center. Similar to the wait time, each class-priority is associated with a deferral threshold. If a patient is not scheduled within the deferral threshold, she will be transferred to another hematology and oncology AC center (or rejected), incurring a large cost. Therefore, the objective function in this study minimizes a weighted sum of wait time, deferral, ED use, and transfer costs.
Each patient has a ready-to-treat date or earliest permissible day for scheduling, which is determined based on factors such as readiness of test results and personal preparation. Once the first session of a multi-appointment treatment is scheduled, the follow-up appointments are automatically scheduled based on the treatment frequency. Treatments in our hematology and oncology AC may have different service durations but each includes three components: Initialization, passivity and finalization. In the initialization period, a nurse serves the patient and cannot do any other activity (e.g., the nurse prepares the equipment and attaches the blood transfusion set to the patient to start the treatment). The passivity period follows the initialization period and does not require the full involvement of the nurse (e.g., the time blood transfusion takes). In the finalization period, the same nurse is again busy completing the treatment (e.g., the nurse detaches the blood transfusion set, discharges the patient and cleans the bed). The patient occupies a bed during her entire treatment. Note that the nurse can handle multiple patients at the same time, limited to a maximum, but she cannot serve the initialization and/or finalization periods of two patients at the same time.
Finally, we consider two sources of uncertainty. First, we assume that patient demand is uncertain and follows a truncated Poisson distribution for each class-priority. New patients are added to the waiting list until they are scheduled or transferred to another hematology and oncology AC center. Second, we consider the uncertain use of the ED, which follows a truncated Poisson distribution for each class-priority. For patients on the waiting list, the probability of using the ED increases by their priority and wait time. For patients who are already scheduled, whether they have started treatment or not, we assume that this probability increases according to their priority. Patients on the waiting list or already scheduled may have to use the ED due to their health condition. The higher the priority level and the larger the wait time, the higher the chance of using the ED.
Markov Decision Process Model
This sub-section defines the components of an infinite-horizon MDP model for the advance scheduling problem.
Decision Epochs, Decision Horizon and Planning Horizon
The decision epochs correspond to the end of each day. At each decision epoch, we make a decision for the next
State Variable
The state variable takes the form
Decision Variables
The primary decisions are the allocation of patients to campuses such that the total costs are minimized over the decision horizon. Thus,
Transition Probabilities
Two sources of uncertainty occur at each decision epoch: (i) daily arrival of new patients, and (ii) use of the ED. The former source of uncertainty does not depend on the state of the system while the latter does. The state of the system for the next decision epoch, denoted by
Transitions are governed by the following equations.
New arrivals are added to the waiting list (
The set of feasible decisions compatible with state
Constraint set (7) restricts the total number of patients on the waiting list with class
It is noteworthy to mention that we had initially modeled the MDP feasibility constraints via binary variables to make both patient allocation and scheduling decisions. The use of binary variables enables us to directly ensure the feasibility of solutions with regard to the limited availability of beds and nurses, using no overtime (instead of using function
To build a predictive model for function
We build function

Illustration of steps for building function
Data set generation: To be generalizable, NNs should be trained over large-sized and balanced data sets. We create a large data set of inputs and outputs through simulation. The simulation generates random patient allocations for each campus on a single day. For each patient allocation, heuristics are deployed to find feasible patient schedules such that overtime is minimized across all campuses. The simulation may include thousands of patient allocations and finding the minimum overtime value using mathematical programing for each scenario requires significant CPU time. Therefore, we use two different heuristics to increase robustness in computing high-quality solutions. Algorithm 1 outlines this simulation model.
In Heuristic 1, for each campus, we sort eligible patient classes based on the number of beds available (the fewer beds available, the higher the priority). Then, we schedule all allocated patients of each class to feasible “bed-nurse” combinations with the earliest service start times. Here, a “bed-nurse” for a patient is a feasible combination when (i) the bed is appropriate for the patient and is free for her entire service duration, (ii) the nurse is not busy with the initialization or finalization period(s) of other patients during her initialization and finalization periods, (iii) the nurse must not serve more than a predefined number of patients at the same time. Once patient scheduling is complete for a campus, the algorithm calculates the overtime/under-utilization of all beds. If at least one of the beds experiences overtime, the heuristic returns one; otherwise, zero. Algorithm 2 and Figure EC.1 in the E-companion provide a pseudocode and an illustrative example for Heuristic 1, respectively.
Unlike Heuristic 1, Heuristic 2 does not necessarily schedule patients of one class in a row. For campus
Using heuristics and the NN introduces a potential source of approximation in our algorithm, which may misclassify feasible solutions as infeasible and vice versa. In other words, the algorithm may assume a solution is feasible (no overtime), while in fact it would require overtime. However, the effectiveness of our approach remains intact if the error at the design phase is small. It is worth noting that the primary purpose of our approach is to derive scheduling policies rather than directly implementable solutions. While approximation errors may impact the quality of these policies, smaller errors yield policies that are closer to the optimal. Once these approximate policies are developed, they can either be integrated into a mathematical model or used to create simple scheduling rules, ensuring that all overtime constraints are explicitly accounted for (to guarantee strict feasibility).
Network architecture and linearization: Let us assume a fully connected feed-forward NN with
Since costs of earlier stages have more importance for our problem, we use a discounted value function based on pre-decision state variables. As mentioned earlier, we aim to minimize a weighted sum of the wait time, deferral, ED use and transfer costs. The total immediate cost of being in state
It should be noted that we compute the discounted wait time cost
The state, decision and outcome spaces grow exponentially in MDP models as the problem size increases, which inevitably makes such models intractable. This section provides an approximate methodology to find computationally feasible solutions for large-sized instances of the proposed MDP model. An infinite-horizon ADP algorithm is proposed through which we determine appropriate coefficients for basis functions. Once the coefficients of the basis functions are computed (Phase 1), we can use them to determine patient scheduling (Phase 2). Note that we do not need to run the first phase each time we run the second phase. However, we can use the first phase and update the coefficients if the system configuration changes (e.g., demand mix variations).
Policy Determination
To implement our ADP algorithm, we reformulate the optimality equation as the following equivalent linear dynamic programing (LDP) model (Powell, 2007):
Model (24) has a tractable number of variables, but is still intractable due to the number of constraints (one for each state-decision pair). We therefore formulate the dual of Model (24), as the restricted master problem (RMP), and solve it using CG.
Model (26) looks for a state-decision pair with the most violated primal constraint. We add this state-decision pair to Model (25) in order to re-optimize the master problem and update the approximation coefficients. The CG procedure terminates when the algorithm is sufficiently close to the non-violation of primal constraints (Model (26) with objective function value of
Once the approximate optimal policy (AOP) is determined, we use it to compute the cost of scheduling a patient with class
Then, we can use the action coefficient
Despite the use of the NN, our original ADP algorithm is unable to derive policies for medium- and large-sized instances (
Computational Results
This section presents comprehensive numerical analyses to evaluate the proposed solution approach and derive managerial implications. We explain the settings in Section 5.1, introduce new scheduling rules in Section 5.2, and assess the AOP in Section 5.3. Finally, Section 5.4 investigates a case study, and Section 5.5 provides managerial insights. Both master- and SPs of the CG in the ADP algorithm are solved via IBM Ilog CPLEX. The ADP algorithm and all other programs are coded in PyCharm 2020.3.5, in which the interpreter is Python 3.7, and TensorFlow 2.8.0 is used to train the NN. All experiments are run on an Intel(R) Core(TM) i9-10900X CPU @ 3.70GHz with 64 GB of DDR4 RAM.
Analysis Settings
This sub-section describes instance generation, hyperparameter tuning, and benchmark scheduling rules.
Instance Configurations
We assume that a day includes a six-/eight-hour shift (no overtime available), and time is discretized into time slots of 15, 30, or 60 minutes. The decision horizon includes 7 or 14 days, and the planning horizon includes 7 to 70 days. The service durations vary between three and eight time slots. Treatments are associated with 1 to 12 classes and one to four priority levels, and may require one to nine treatment sessions. The treatment frequency for all classes is one week, two weeks, or a month. Similar to patient safety guidelines in our partner hospital, we do not allow a nurse to handle more than four patients at the same time. The wait time and deferral thresholds may be 2 to 12 days and 1 to 24 days, respectively.
We let both the transfer and ED utilization costs be greater than the maximum possible sum of wait time and deferral costs (i.e., for both
Configurations of test instances.
Configurations of test instances.
The proposed ADP algorithm relies on a NN to classify patient allocations based on overtime. To train the NN, our loss function penalizes errors equally for both “predicting feasibility when infeasible” and “predicting infeasibility when feasible.” While this balanced approach may not suit all applications, our method is flexible and can be easily adapted by modifying the loss function to better align with specific preferences. Furthermore, the NN contains hyperparameters that must be tuned before use, including the numbers of hidden layers and their neurons, number of epochs, activation function, optimizer and
Parameter tuning using the iRace package.
ReLU: rectified linear unit; LReLU: leaky rectified linear unit.
To evaluate the NN using the elite configuration, we employed the 10-fold cross-validation method ten times across Instances 6–10. Instead of using the entire validation set, we focus on patient allocations for a campus-day that utilizes between 75% and 125% of the total available bed capacity

Overall performance of the neural network (NN) against logistic regression. (a) Accuracy; (b) area under the receiver operating characteristic curve.
We compare the performance of the AOP with three well-known heuristic scheduling rules.
Myopic policy: This policy ignores the impact of today’s decision on the future and books patients as soon as possible based on the immediate cost function (20). The Myopic policy defers a patient only when no capacity is available within the decision horizon and transfers a patient if the maximum deferral threshold is reached. Adapted Patrick, Puterman, and Queyranne (APPQ) policy: Patrick et al. (2008) suggests booking patients in priority order within their wait time thresholds. We adapt this policy to our problem as follows. For each priority level, we sort classes based on the ascending order of their wait time thresholds. Then, for each class-priority, we prioritize scheduling patients with a greater wait time and fewer treatment sessions required. Then, we book as many patients as possible into days Adapted days with minimum number of booking (ADMB) policy: In the order of priority, this policy books as many patients as possible on the first day and then the day with the minimum number of bookings within the wait time threshold (Sauré et al., 2015). We weigh treatments by their service duration to identify the day with the minimum number of bookings and adapt this policy to our problem similar to the APPQ policy.
New Scheduling Rules
One approach for creating schedules is to solve a mathematical model incorporating AOP in its objective function. However, implementing this approach might not be straightforward. Observing the AOP for various instances and based on insights derived in Section 5.5, we found two generalizable mechanisms that could be deployed in easy-to-implement scheduling rules to improve their performance for our problem. Unlike the APPQ and ADMB policies, which book patients strictly within the wait time threshold, we first extend the booking horizon for the highest-priority patients (e.g., by one day in our instances). Second, we limit capacity for lower-priority patients when the transfer rate for higher-priority patients exceeds a specific threshold (e.g., 5% in our instances). We set maximum allocation limits for each class-priority in our numerical analyses to implement the second mechanism. If the average rejection rate for higher-priority patients exceeds this threshold, we reduce the maximum allocation for lower-priority patients by a specified percentage (e.g., 10% reduction) until either their maximum capacity reaches zero or the rejection rate for higher-priority patients falls below the threshold.
We deploy these two mechanisms in the APPQ and ADMB policies in order to generate two new scheduling policies, called Enhanced APPQ (EPPQ) and ADMB (EDMB). These policies are also evaluated in the following sub-sections.
Theoretical Instances
In this sub-section, we interpret the AOP and compare its performance with established scheduling rules in the literature.
Comparison of Scheduling Rules
We use a simulation model to evaluate scheduling rules over Instances 1–10 with two demand settings. The simulation model consists of 200 warm-up iterations and 1000 main iterations. We run the simulation model 100 times for each instance and present the results in Table 3. Each cell under “Heuristic scheduling rules” includes a 99% confidence interval for the average cost. Highlighted confidence intervals are significantly worse than the ones related to the AOP. Each cell under “Gap (%)” demonstrates the minimum and maximum gaps of the scheduling policies in comparison to the AOP,
Comparison of performance metrics for heuristic scheduling rules.
Comparison of performance metrics for heuristic scheduling rules.
Each cells under “Heuristic scheduling rules” includes a 99% confidence interval for cost. Highlighted intervals are significantly worse than the one related to the AOP. Each cell under “Gap (%)” demonstrates the smallest and largest gaps of the scheduling policies in comparison to the AOP, respectively. AOP: approximate optimal policy; APPQ: adapted Patrick, Puterman, and Queyranne; ADMB: adapted days with minimum number of booking; EPPQ: enhanced APPQ; EDMB: enhanced ADMB; .
The APPQ policy usually outperforms the ADMB policy. Unlike APPQ, which schedules patients from the end of the wait time threshold backward (except for the highest priority levels), ADMB books patients on the day with the fewest bookings within their wait time threshold. This may result in booking lower-priority patients earlier (even when later days have free capacity), using space needed for higher-priority patients and potentially leading to their transfer in the next epochs. The EPPQ and EDMB policies outperform their APPQ and ADMB counterparts (over Instances 4–10) with performance gaps of 30.13% and 63.59%, respectively, as they proactively alleviate excessive demand pressure. Notably, EDMB slightly outperforms EPPQ when extra demand is intelligently handled (as explained in Section 5.2), similar to the observations for the original version of these policies (Sauré et al., 2012).
According to Table 3, the AOP consistently outperforms other policies across Instances 4–10, with the EDMB policy providing the closest performance (a gap of 5.30% and no significant difference in 7 out of 14 instances). In particular, EDMB maintains its performance in larger instances, with a maximum gap of 6.32% compared to AOP, and a gap of only 1.97% in Instance 10, a high-demand setting. Comparing normal and high-demand settings, we observe that performance gaps decrease under high demand (e.g., for EDMB, 8.63% gap in normal demand vs. 1.98% in high demand). This is because higher demand pressures policies to handle transfers more effectively, reducing AOP’s flexibility and bringing the more conservative EPPQ and EDMB policies closer to the AOP’s performance.
Our partner hospital offers treatments on three campuses: Campus 1 with four nurses and eight beds, Campus 2 with three nurses and nine beds, and Campus 3 with three nurses and 13 beds. These campuses serve a total of 15 classes, each with five priority levels. Priority level 1 has a two-day wait time threshold and a five-day deferral limit while priority level 5 has a 28-day wait time threshold and a 45-day deferral limit. All campuses are open for eight hours or 32 time slots (time is discretized into 15 minutes time slots) with no overtime. Service duration varies between four and 28 time slots. For all classes, the initialization/finalization period is one time slot, and the passivity period ranges from two to 26 time slots. The booking clerks of our collaborator hospital suggested assuming a 14-day decision horizon and a 56-day planning horizon (no patient should be scheduled later than two months). Treatments can be provided on a weekly, bi-weekly, or monthly basis (varies based on class), therefore, they may need one to six appointments. The rest of the parameters are defined in our data set repository.
It should be noted that for some treatment classes, the length of treatment may exceed the 56-day planning horizon. In such cases, if the total treatment length is known in advance for a patient, the booking clerk can readily allocate her as soon as the rolling planning horizon includes her future appointment dates. On the other hand, if the total treatment length is unknown, and the patient needs to consult with their physician to assess the necessity of additional appointments, the patient can be added to the list of scheduled patients based on their previous appointments. If this approach is infeasible (e.g., due to lack of capacity), the booking clerk has the flexibility to manually schedule her or add her to the waitlist of high-priority arrivals.
We construct an instance for our case study and compute its AOP. Then, we run a simulation model to compare it against the Myopic, APPQ, ADMB, EPPQ and EDMB policies. We also compare all policies against a template-based scheduling (TBS) approach (i.e., fixed allocation of patient classes to beds over the planning horizon) currently used in our partner hospital. A mathematical program is used to build the scheduling templates (see the E-companion for further details). We utilize the scheduling template and perform advance scheduling similar to the APPQ policy, but with two differences: (i) the capacity for each class is fixed on each day-campus, (ii) new arrivals are scheduled within the decision horizon regardless of their wait time thresholds. Those who cannot be scheduled within the deferral threshold are transferred. Summary results for our case study are presented in Table 4.
Comparison of the scheduling policies for the case study.
Comparison of the scheduling policies for the case study.
Each cells under “Total cost” includes a 99% confidence interval for average cost. Highlighted confidence intervals are significantly worse than the one related to the AOP.AOP: approximate optimal policy; ED: emergency department; APPQ: adapted Patrick, Puterman, and Queyranne; ADMB: adapted days with minimum number of booking; EPPQ: enhanced APPQ; EDMB: enhanced ADMB; TBS: template-based scheduling.
In line with our earlier findings, Table 4 shows that the Myopic policy performs the worst by a large margin, followed by the TBS policy. These policies incur the highest deferral and wait time costs, leading to increased ED use. Additionally, they incur the highest transfer costs due to their failure to account for the dynamic nature of new arrivals, greedily utilizing capacity that could have been reserved for higher-priority patients. The EPPQ and EDMB policies outperform their counterparts and perform significantly better than the Myopic and TBS policies. While the EPPQ and EDMB handle patient transfers better, they incur higher ED costs compared to APPQ and ADMB policies as they more frequently schedule higher-priority patients who utilize the ED more often. The AOP policy performs significantly better than other policies, except for EDMB. It rarely defers patients or books them beyond their wait time thresholds. Under this policy, 94.76% of new arrivals are served on-time, with the EPPQ and Myopic policies offering the best and worst rate at 95.13% and 66.68%, respectively.
This sub-section presents managerial insights on (i) the higher performance of the AOP, (ii) the reasons for the TBS policy’s inferiority, (iii) the impact of considering patients’ campus preferences, and (iv) the effect of objective function coefficients on the performance metrics. Further insights from the AOP are provided in the E-companion.
Superiority of the AOP
We analyze performance metrics for all scheduling policies in Instance 5 with normal demand (where the AOP exhibits the largest gap compared to EDMB policy for instances with one campus—see Table 3). This analysis is depicted in Figure 3. Here, wait time refers to the number of days patients waited to receive their first treatment (excluding transferred patients) and deferral denotes the number of times patients are deferred until they are either scheduled or transferred. We calculate the transfer rate as patient transfers divided by all arrivals. For each priority, the metrics represent average values over all treatment classes. Note that the bed utilization rate is limited to

Performance metrics associated with the scheduling policies for Instance 5 with normal demand. (a) Cost; (b) Utilization; (c) Waiting time; (d) Deferral; (e) Transfer.
According to Figure 3, the Myopic policy performs the worst on most metrics, except for average utilization. The APPQ and ADMB policies show similar performance, never booking late but varying across different priority levels (e.g., waiting time). These policies struggle with transfer rates, transferring around 3% and 4% of the highest priority patients, respectively. They are designed for settings that allow overtime, enabling patient service in overtime to prevent transfers or late bookings. Without overtime, these policies provide lower-priority patients with more booking opportunities due to their longer wait time thresholds, reducing future capacity for higher-priority patients. This leads to higher transfer rates for high-priority patients. The AOP, on the other hand, defers and allows slightly late bookings, especially for higher-priority patients, to reduce transfer rates and total costs. Although late bookings might not be obvious from Figure 3(c) as the figure depicts average values, the AOP offers a slightly longer average wait time for Priority 1 than other policies. Surge capacity is a crucial resource for healthcare centers facing demand that exceeds their capacity (Demirbilek et al., 2019; Kaji et al., 2006). In our setting, transferring patients serves as a surge capacity measure, with lower-priority patients incurring smaller costs. The AOP intelligently uses this surge capacity for lower-priority patients, resulting in much smaller transfer costs.
Based on such observations, we developed the EPPQ and EDMB policies. These policies significantly outperform their counterparts, with EDMB performing very close to the AOP. However, they are more conservative in transferring lower-priority patients than the AOP, leading to unnecessary transfers, lower utilization rates, and higher total costs. A critical parameter affecting their performance is the transfer threshold. A larger transfer threshold may lead to inadequate transfers of lower-priority patients, causing high-priority patient transfers. Conversely, a smaller transfer threshold can reserve too much capacity for high-priority patients. Optimizing the transfer threshold could improve their performance.
Despite our partner hospital utilizing the TBS policy, our experiments revealed its poor performance. The TBS policy allocates fixed spots to different patient classes, alleviating the greedy use of resources seen with the Myopic policy. However, it fails to adapt to the dynamic nature of patient arrivals and disaggregates the decision-making process by separating capacity allocation from actual patient allocations. This separation reduces the flexibility of the optimization process and lowers solution quality (Aringhieri et al., 2015; Moosavi and Ebrahimnejad, 2020). Furthermore, the performance of the TBS policy heavily relies on two bounds that specify the number of permissible spots for each patient class (Constraint set EC.11). Evaluating the impact of these bounds is challenging due to the vast number of combinations and the computational expense involved. Therefore, selecting appropriate bounds is not straightforward.
Patient Preference for Campus
This research examines a hospital with nearby campuses, initially assuming that patients are indifferent to their allocated campus. Alternatively, we consider a scenario where patients have campus preferences. We adapt the heuristic scheduling rules to accommodate these preferences, allocating patients to a campus upon arrival based on a probability distribution reflecting their preference. As a result, all of a patient’s appointments must be scheduled at their chosen campus. In the E-companion, we compare the scheduling policies for scenarios where patients can choose their campus versus scenarios where they cannot.
According to our observation, all heuristic scheduling rules perform worse when patients can choose their campus. This is expected since allowing patient choice reduces the flexibility of the solution process. Although the percentage difference in performance between the best and worst policies narrows when patients can choose their campus, the absolute total increase in costs is more significant for underperforming policies. This means that while the relative performance gap decreases, the overall expenses rise more when patients have the option to choose their campus. In summary, it is preferable to make both campus allocation and appointment scheduling decisions simultaneously in a multi-campus setting, provided there are no strict constraints on campus allocation for patients, such as their preferences or medical eligibility.
Objective Function Coefficients
Objective function (20) incorporates four coefficients ranked by ascending magnitude: Deferral, waiting, ED use, and transfer costs. Initially, we assumed that the ED use and transfer costs were substantially larger compared to the deferral and waiting costs. However, in the E-companion, we explore a scenario where the ED use and transfer costs are significantly reduced. The results reveal that our proposed policies—EPPQ, EDMB, and AOP—maintain superior performance despite the narrowing of the gap difference between the policies. This finding underscores the robustness of our policies across different cost structures and highlights the critical importance of a strategic approach to managing high demand even when transfer costs are comparable to waiting time costs.
Conclusions and Remarks
This research studies a distributed multi-appointment AC advance scheduling problem with heterogeneous resources. We consider multiple patient classes and priority levels, where each class-priority is associated with a wait time threshold and a deferral threshold. Taking into account uncertain patient arrivals and use of the ED, we formulate the problem as an infinite-horizon MDP model. Since we cannot solve this model via conventional methods, we hybridize this model with a feed-forward NN to simplify feasibility constraints while respecting all assumptions. To alleviate the curse of dimensionality, we rely on an affine approximation architecture to approximate the value function and solve an equivalent linear programing model through CG to compute AOPs. We implement two necessary acceleration techniques, including dual stabilization and decomposition in the CG component.
Experiments demonstrated that the NN achieves over 95% accuracy on validation datasets for theoretical instances. We compare the AOP against three benchmark scheduling policies and two proposed heuristics (EPPQ and EDMB). The AOP performs the best, while the myopic policy performs worst, with EDMB closely matching AOP. Using realistic data from an Ontario AC center, we found that while their current TBS policy outperforms the myopic policy, it lags behind all others. EDMB and AOP met wait time thresholds for about 95% of patients, compared to 80% for TBS.
In summary, our work primarily contributes to the literature by introducing a hybridized ADP and NN algorithm. The methodology is versatile, with broad applicability to other optimization problems with complex constraints (e.g., machine scheduling, storage location assignment, crew scheduling, facility layout), where a component of the problem can be similarly partitioned. Moreover, the proposed heuristic scheduling rules not only outperform benchmark policies but also offer generalizable managerial insights that could instruct similar scheduling challenges.
An extension of our problem could explore integrating medical and operational decisions, such as determining oncology and hematology dosages to enhance health outcomes and patient convenience while managing limited operational resources. We utilize a fully-connected classifier NN to solve large-sized instances; however, this model may be unnecessarily complex for certain problems. Future research could examine alternative predictive models and assess the value of interpretability in these contexts. Additionally, our assumptions may not fully capture specific healthcare settings. Future work could address other uncertainties, such as service duration and punctuality, and investigate more flexible appointment frequency schedule beyond fixed weekly, bi-weekly, or monthly intervals.
Supplemental Material
sj-pdf-1-pao-10.1177_10591478251331143 - Supplemental material for Dynamic Distributed Ambulatory Care Scheduling
Supplemental material, sj-pdf-1-pao-10.1177_10591478251331143 for Dynamic Distributed Ambulatory Care Scheduling by Amirhossein Moosavi, Onur Ozturk and Jonathan Patrick in Production and Operations Management
Footnotes
Acknowledgment
The authors gratefully thank the reviewers of Production & Operations Management.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship and/or publication of this article.
How to cite this article
Moosavi A, Ozturk O and Patrick J (2025) Dynamic Distributed Ambulatory Care Scheduling. Production and Operations Management 34(10): 3173–3192.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
