Abstract
Aim:
No-shows are patients who miss scheduled specialist outpatient clinic (SOC) appointments. A predictive scoring model for the risk stratification of no-shows was developed to improve the utilisation of resources.
Method:
The administrative records of new SOC appointments for subsidised patients in 2013 were analysed. Univariate analysis was performed on 16 variables comprising patient demographics, appointment/visit records and historical outpatient records. Multiple logistic regression (MLR) was applied to determine independent risk factors of no-shows. The adjusted parameter estimates from MLR were used to develop a predictive model for risk stratification of no-show. Model validation was performed using 2014 data.
Result:
Out of 75,677 appointments in 2013, 28.6% were no-shows. Univariate analysis showed that 11 variables were associated with no-shows. Six variables (age, race, specialty, lead time, referral source, previous visit status) remained independently associated with no-shows in the MLR model, and their odds ratios were used to develop the weighted predictive scoring model. Weighted scores ranged from 0 to 19, and five levels of no-show risk were derived: extremely low (score: 0–4; odds ratio (OR): 1.0); low (5–6; OR: 2.5); medium (7–8; OR: 5.6); high (9–10; OR: 9.2); and extremely high (11–19; OR: 16.7). The predictive ability of the model was tested using receiver operating characteristic (ROC) curve analysis, where the area under the curve (AUC) was 72%. AUC remained at 72% upon validation with 2014 data.
Conclusion:
The prediction model developed using only administrative data was robust and can be used for the risk stratification of SOC no-show for better resource utilisation to improve access to care.
Introduction
No-shows are patients who miss scheduled specialist outpatient clinic (SOC) appointments that were not cancelled by the patients. No-shows are a common problem internationally. 1 In Singapore, where specialist healthcare resources are limited in the public hospitals and the lead times to see a specialist in public hospitals are being tracked by the Ministry of Health as a quality indicator, non-attendance is a pressing problem to be addressed. A total of 4,921,017 outpatient attendances were made in the public hospitals in 2016.
Internal statistics of Changi General Hospital (CGH) that were reported to the Ministry of Health revealed long appointment lead times. In 2017, the median and 95th percentile of new consultation appointment lead times ranged from seven to 106 days and 21 to 260 days respectively, depending on the specialty. Reports also showed that CGH had a high no-show rate: in 2017, out of 96,502 new appointments made at CGH subsidised clinics, 25,069 (26%) were no-shows. No-shows can lead to wasted resources, suboptimal use of scarce resources and long appointment lead times.
Several studies2–8 suggested that overbooking to compensate for no-shows can reduce long lead times without requiring additional resources. Overbooking refers to booking more appointments than there are available slots. Currently, CGH practises overbooking in a uniform way, whereby each service provider receives the same number of overbooking slots to compensate for no-shows. Applying an average overbooking percentage equal to the overall no-show percentage to every clinic session is not optimal because the number of no-shows is not constant across clinic sessions. In sessions where all patients, including the overbooked ones, turn up, resources are over-utilised, resulting in long patient wait times, clinic overtime, and patient and staff dissatisfaction. In sessions where no-shows exceed the expected average, resources are wasted. The performance of overbooking can be improved by predicting the likelihood of no-show when each appointment is booked, as no-shows do not occur randomly. 1 In this way, overbooking for each session varies according to the likelihood of patients not showing up and a more reliable overbooking strategy can be achieved.
Other than CGH, many healthcare service providers locally9,10 and worldwide1,11 face similar no-show problems. Various statistical and predictive methods2–8 have been implemented to predict no-shows. Samorani and LaGanga 2 used Beta Function to estimate no-shows while Alaeddini et al. 4 used a hybrid probabilistic model based on logistic regression and empirical Bayesian inference to predict the probability of no-shows in real time. However, Alaeddini et al. 4 stated that the drawback is that a stochastic program is only efficient if one scenario is considered, and as the number of scenarios increases, the computational times explode.
This study aimed to develop a scoring tool to predict the risk of no-shows among subsidised new appointments so as to better manage appointment scheduling.
Materials and methods
Setting
CGH is an acute public hospital with over 1000 beds caring for a community of 1.4 million people in eastern Singapore. CGH offers a comprehensive range of medical specialties and services, and operates 14 medical SOCs. CGH provides a mixed public and private outpatient specialist service whereby patients who are Singapore residents can choose to receive subsidised care or private care. Care provided at subsidised clinics is heavily subsidised by the government and makes up 80% of the SOC workload. Patients receiving care in the subsidised clinic are managed by a rotating team of doctors. Private patients do not receive any subsidy for the care provided, and are able to receive care from their specialist of choice. The demand for private care is lower and has a shorter appointment lead time. As a public hospital, the priority is to ensure that the Singapore residents requiring subsidised care have timely access to healthcare services.
Appointment-making process
Patients who have not been previously managed by a particular medical specialty are considered new cases, and their appointments are made for the earliest available slot. The visit type is considered a consultation visit. When patients turn up for their scheduled appointments, the visits are classified as actualised. When patients do not show up for their scheduled appointments, the visit status is classified as no-show.
Study design
This is a retrospective study of CGH SOC visits in January–December 2013. Outpatient and inpatient administrative data from hospital-wide appointments and patient transaction systems were used. The study included consultation visits for new subsidised cases with a visit status of actualised or no-show. Cancelled appointments, walk-ins and consultations related to pre-operations were excluded.
The data used in the study were patient demographics, appointments records and historical records. Patient demographics comprised gender, age, race, nationality and whether the patient lives in the eastern region of Singapore near CGH. Appointment records included specialty, appointment lead time, appointment month, appointment day of week, “was preceding day of appointment a public holiday?” and “was day of appointment a school holiday?”. Historical appointment records considered the referral source, previous visit type and previous visit status, which referred to the latest SOC visit that the patient had to any CGH specialty during the past year.
Descriptive analysis was performed for each variable. Continuous variables such as age and appointment lead time were categorised for further analysis based on the findings of the descriptive analysis.
Bivariate analysis was performed to determine the significant factors associated with no-shows. Missing data of each variable were excluded from the bivariate analysis. Logistic regression analysis was used to develop the predictive model. The odds ratios (ORs) of each predictor were determined and then normalised into weighted scores. The weighted scores were fed into logistic regression to derive the scoring model.
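For readers less familiar with how ORs arise from logistic regression, the standard textbook relationship is sketched below. It is not a reproduction of the authors' SPSS output. The symbol B_j corresponds to the parameter estimates labelled B in the tables, while p (the probability of no-show) and x_j (the indicator for category j of a predictor) are notation assumed here for illustration.

```latex
\[
\log\frac{p}{1-p} = B_0 + \sum_{j} B_j x_j,
\qquad
\mathrm{OR}_j = e^{B_j}
\]
```

Under this form, a category with OR greater than 1 has higher odds of no-show than its reference category, which is what allows the ORs to be converted into additive weighted scores.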
Due to the large data set, the level of statistical significance for all analyses was set at 0.01. To avoid over-fitting of the predictive model, only 30% of the 2013 data were randomly selected for training the model, while the rest were used for validation. The model was further tested using SOC visit data from 2014.
SPSS version 22 was used to develop the model.
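The 30%/70% random split described above can be illustrated with a minimal Python sketch. The authors used SPSS, so this is not their procedure; the function name `split_train_validation`, the seed and the integer identifiers standing in for appointment records are all hypothetical.

```python
import random

def split_train_validation(records, train_fraction=0.30, seed=2013):
    """Randomly assign a fraction of records to training and the rest to validation."""
    rng = random.Random(seed)      # fixed seed (assumed) so the split is reproducible
    shuffled = list(records)       # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical appointment identifiers standing in for the 2013 records.
appointments = list(range(1000))
train, validation = split_train_validation(appointments)
print(len(train), len(validation))  # 300 700
```

Every record lands in exactly one of the two sets, so performance on the validation set is measured on appointments the model has not seen during training.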
Results
A total of 75,677 new subsidised appointments were made in January–December 2013. Among them, 21,618 did not turn up, with a no-show rate of 28.6%. The significant factors predicting no-show were age, race, nationality, specialty, appointment lead time, appointment month, appointment day of the week, referral source, previous visit type and previous visit status (Table 1).
List of factors considered in the model
S/N: Serial number; CGH: Changi General Hospital; SG: Singapore; A&E: Accident and emergency; SOC: specialist outpatient clinic.
The largest age group comprised patients less than 25 years old (19.6%), and the no-show rate was higher for those less than 45 years old (Figure 1). No-show rates were significantly different across specialties (Figure 2). A total of 33.9% of appointments had a lead time of 8–30 days. The no-show rate increased linearly as the appointment lead time increased (Figure 3). The majority of the referrals were from polyclinics (subsidised primary care centres), but referrals from the ward and A&E had higher no-show rates (Figure 4). The no-show rate was highest among those who had previously been recorded as no-shows (Figure 5).

Appointment volume and no-show rate by age.

Appointment volume and no-show rate by specialty.

Appointment volume and no-show rate by appointment lead time.

Appointment volume and no-show rate by referral source.

Appointment volume and no-show rate by previous appointment visit status.
Table 2 shows the results of the logistic regression model. The sensitivity and specificity were 80.2% and 51.8% respectively. Applying the model to the validation (70%) data, the sensitivity and specificity were 80.2% and 52.6% respectively. The model had an area under the receiver operating characteristic curve (AUC) of 0.74. There was no multi-collinearity among the significant factors in the model, as no factor had a tolerance < 0.1 or a variance inflation factor (VIF) > 10, as shown in Table 3.
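As a reminder of how sensitivity and specificity are obtained from predicted and observed visit statuses, a minimal sketch follows. It assumes that no-show is treated as the positive class (coded 1), which the text does not state explicitly, and the labels below are made-up illustration data rather than study data.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity), treating 1 = no-show as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # no-shows predicted as no-shows
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # no-shows predicted as actualised
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # actualised predicted as actualised
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # actualised predicted as no-shows
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels for ten appointments (not study data).
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(actual, predicted)
print(sens, spec)  # 0.8 0.6
```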
List of significant factors
A&E: Accident and emergency; SOC: specialist outpatient clinic; B: Parameter Estimates; SE: standard error; CI: confidence interval; OR: odds ratio.
Test of collinearity among the significant factors
B: Parameter Estimates; SE: standard error; VIF: Variance inflation factors.
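For reference, the two collinearity statistics are related by the standard definitions below. Here R_j^2 denotes the coefficient of determination obtained when predictor j is regressed on the remaining predictors; this notation is assumed for illustration and does not appear in the original tables.

```latex
\[
\mathrm{Tolerance}_j = 1 - R_j^2,
\qquad
\mathrm{VIF}_j = \frac{1}{\mathrm{Tolerance}_j} = \frac{1}{1 - R_j^2}
\]
```

A tolerance of 0.1 therefore corresponds to a VIF of 10, and by the usual convention multi-collinearity is suspected only when tolerance falls below 0.1 (equivalently, when VIF exceeds 10).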
Table 4 shows the method of deriving the weighted scores using the OR generated by the logistic regression (in Table 2). First, the insignificant factor was excluded and the smallest OR was found to be 1.265. Next, the weighted score of each factor was obtained by dividing the OR by 1.265. Finally, these scores were assigned into individual variables in the original data to obtain the predictive score for each appointment. For example, an appointment made by a 20-year-old Malay patient who requested to see Specialty 1’s doctor, was given a slot two weeks later, had a referral from a polyclinic and had a previous appointment actualised in any specialty, will have a score of 3 (0 + 1 + 1 + 1 + 0 + 0).
The method to derive weighted score
OR: odds ratio; A&E: Accident and emergency; SOC: specialist outpatient clinic.
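The normalisation step described above can be sketched as follows. Only the smallest OR (1.265) and the worked example's component scores come from the text; the other ORs and the names `weighted_scores`, `factor_a`, `factor_b` and `factor_c` are hypothetical. Rounding to the nearest integer is an assumption inferred from the integer score range of 0–19, and reference categories are assumed to score 0 rather than being passed through the division.

```python
def weighted_scores(odds_ratios):
    """Divide each OR by the smallest OR and round to the nearest whole score."""
    smallest = min(odds_ratios.values())
    return {level: round(value / smallest) for level, value in odds_ratios.items()}

# Hypothetical ORs for illustration; only 1.265 (the smallest OR) comes from the text.
example_ors = {"factor_a": 1.265, "factor_b": 2.6, "factor_c": 3.9}
scores = weighted_scores(example_ors)
print(scores)  # {'factor_a': 1, 'factor_b': 2, 'factor_c': 3}

# Worked example from the text: the six component scores sum to the appointment score.
component_scores = [0, 1, 1, 1, 0, 0]
total_score = sum(component_scores)
print(total_score)  # 3
```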
The derived predictive score had a range of 0–19, as shown in Figure 6. Scores of 0–6 had higher proportions of actualised than no-shows, while scores of 7–19 had higher proportions of no-shows than actualised.

Distribution of predictive scores.
Using the predictive score as the predictor, the logistic regression model was generated, as shown in Table 5. The AUC was 0.729.
The logistic regression model using weighted score
B: Parameter Estimates; SE: standard error; CI: confidence interval; OR: odds ratio.
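When a single score is the predictor, the AUC can be read as the probability that a randomly chosen no-show has a higher score than a randomly chosen actualised visit, with ties counted as one half. The sketch below computes this pairwise form on made-up scores; it is not the SPSS ROC procedure the authors used, and the function name `auc` and the example scores are hypothetical.

```python
def auc(scores_positive, scores_negative):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for sp in scores_positive:
        for sn in scores_negative:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical predictive scores (not study data).
no_show_scores = [9, 7, 5, 11]
actualised_scores = [3, 5, 8, 2]
result = auc(no_show_scores, actualised_scores)
print(result)  # 0.84375
```

An AUC of 0.5 would mean the score ranks no-shows no better than chance, while 1.0 would mean every no-show scores above every actualised visit.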
For ease of use, the predictive scores (0–19) were grouped into risk groups. The groupings were guided by three factors: the statistical significance level and OR of each score with respect to Score 0; the training set population size of each score; and whether the percentage of no-shows was higher than that of actualised visits (Table 5).
Finally, Table 6 presents the five risk groups – Extremely low risk, Low risk, Medium risk, High risk and Extremely high risk – which are used for creating the predictive scoring tool. The AUC of the scoring tool is 0.72.
The final model presenting the 5 risk groups
B: Parameter Estimates; SE: standard error; OR: odds ratio; CI: confidence interval.
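Applying the scoring tool amounts to a lookup from predictive score to risk group. The sketch below uses the score bands reported in the abstract (0–4, 5–6, 7–8, 9–10 and 11–19); the names `RISK_BANDS` and `risk_group` are hypothetical and the code is an illustration, not the hospital's implementation.

```python
# Score bands as reported in the abstract: (lowest score, highest score, risk group).
RISK_BANDS = [
    (0, 4, "Extremely low risk"),
    (5, 6, "Low risk"),
    (7, 8, "Medium risk"),
    (9, 10, "High risk"),
    (11, 19, "Extremely high risk"),
]

def risk_group(score):
    """Map a predictive score (0-19) to its no-show risk group."""
    for low, high, label in RISK_BANDS:
        if low <= score <= high:
            return label
    raise ValueError(f"score {score} is outside the 0-19 range")

print(risk_group(3))   # Extremely low risk
print(risk_group(12))  # Extremely high risk
```

The appointment in the worked example, with a score of 3, would fall into the extremely low risk group.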
The scoring tool was tested using January–December 2014 data, as shown in Table 7. The results showed that the percentage of no-shows for the test data remains similar to the model-building data. The test data achieved an AUC of 0.718, which shows that the model is robust.
Validation of final model
Discussion
Strength
This study benefitted from large sample sizes of real-world appointments drawn from the hospital database. The large data set allowed for sufficient samples to be divided into data sets for training, testing and validation of the predictive model. The strength of our model is in its use of real-world appointment data for model development, unlike Samorani and LaGanga 2 who used simulated data.
Stochastic programming 2 is efficient only if one scenario is considered, and the computational times explode as the number of scenarios increases. Our study demonstrates that logistic regression can be used to develop a risk scoring algorithm that stratifies patients by their likelihood of no-show, performs well and is efficient for operations. Alaeddini et al. 4 used individual patient records of past attendance, which may not be available for new patients, the subject of interest in our study. Our model allows ease of operations by using only routine administrative data and basic patient demographic data, which are gathered by the call centre during appointment bookings, even for new patients.
Limitation
A limitation of this study is that patients’ socioeconomic status, education level and medical conditions, 1 which can be important factors contributing to no-shows, were not available in our study.
Conclusion
Our study has shown that it is possible to develop a prediction model with good performance that stratifies patients according to their risk of no-show using routinely collected administrative data already available in the system. This no-show risk scoring system can support a data-driven approach to overbooking, in which the slots recommended for overbooking are those occupied by patients at high risk of no-show, so as to improve clinics’ resource utilisation and access to care.
Currently, we are working on incorporating the model into a discrete event simulation model to examine the impact of evidence-guided overbooking on appointment scheduling and appointment lead times.
Footnotes
Acknowledgements
The authors thank Lee Lian Chin, Assistant Director (Specialist Clinics), CGH, for her support of this project and for granting access to anonymous attendance data; and Vicki Tan, Assistant Manager (Specialist Clinics), for her time and effort in sharing the clinic appointment setup with the authors.
Authors’ contributions
SL Chua was responsible for the study design, data analysis and writing of the manuscript. WL Chow was responsible for the conceptualization of the study design, data interpretation and critical editing of the manuscript.
Declaration of conflicting interests
None declared.
Ethical approval
Ethics approval was given by SingHealth Centralised Institutional Review Board A (CIRB) (Reference number: 2016/2621).
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Informed consent
Waiver of informed consent was approved by CIRB as this study involved the analysis of retrospective patient utilization data and has minimal risk to patients.
