Abstract
Objective
To investigate whether centrality bias is one of the contributing factors for patient mistriage in the emergency department.
Methods
A randomized, controlled, single-blinded trial was conducted in an emergency department triage station between April 1 and November 3, 2021. Experienced triage nurses were divided into control and treatment groups. The control group triaged patients using the Canadian Triage and Acuity Scale 1–5 triage scale, and the treatment group used a four-level triage scale (created by removing level 3 from the original Canadian Triage and Acuity Scale). Neither group was exposed to the other’s ranking, and the control group determined the patient’s actual triage ranking. The accuracy of each group’s ranking was determined by triage experts. Triage nurses’ levels of confidence were investigated, as was the correlation between triage ranking accuracy and confidence level.
Results
After excluding 58 patients with missing data, 146 assessments were analyzed. Statistical analysis was performed to compare three different aspects of triage rankings between the nurses’ groups and the control group. In the first and second analyses, accuracy levels of 49% and 68% (p = 0.003) and 43% and 68% (p < 0.0001) were found for the control and experimental groups, respectively. The third aspect showed no significant differences. Within the control and experimental groups, the difference in accuracy rate at levels 2 and 5 was the most significant, with 13% and 75% (p = 0.040) and 29% and 67% (p = 0.009), respectively.
Conclusions
Central tendency has the potential to affect the accuracy of ranking among triage nurses in the emergency department. Further research is needed.
Introduction
Background: Many emergency departments (EDs) worldwide use a triage system to prioritize patients for treatment. 1 In emergency care, the modern triage system is based upon the concept that patients with the most severe injuries or illnesses should be treated first.2,3 The decision-making process of triage is based on a balance between objective information (e.g. search for information, assessment of vital signs) and human judgment (e.g. experience, intuition, observation, and formation of an opinion). 4 Since human judgment is a critical element in these decisions, under conditions of uncertainty, decision-makers are more susceptible to cognitive bias. 5 Cognitive biases mainly occur when people process and interpret information in the world around them, often unconsciously. 6 Like most people, medical staff are also prone to cognitive bias.7–12
Centrality bias, a cognitive bias, is the tendency of raters to rate objects mainly around the middle/mean of a rating continuum while avoiding the selection of the extremes. 13 Evaluators may fall prey to centrality bias when they do not know how to differentiate sufficiently between high and low performers and tend to compress their rating. 14 Another indication of centrality bias is the rater’s subjective interpretation of the midpoint level; some raters might use the midpoint as a “dumping ground” due to a misinterpretation of the true meaning of the midpoint.15,16
Objective: The aim of the study is to investigate the impact of removing the midpoint (Canadian Triage and Acuity Scale (CTAS) level 3) from the five-level triage scale on mistriage rates and nurses’ accountability in a large-scale hospital ED. The study hypothesizes that removing the midpoint (CTAS level 3) will reduce mistriage rates. An additional hypothesis of the study is that eliminating the midpoint option (level 3) will increase nurses’ accountability, as it forces them to make a definitive choice rather than selecting a level that indicates “no opinion.” 16
Importance: The working environment in EDs is often chaotic, and medical staff are required to function under intense time pressure, with frequent interruptions and high levels of uncertainty while making rapid and complex decisions, which can contribute to triage errors.17–20 Mistriage in EDs has been associated with acute outcomes, while cases of under-triage have been found to increase the rate of patients hospitalized and critical outcomes.21–25 In contrast, cases of over-triage have been linked with overcrowding in the ED waiting room, overuse of ED resources and unneeded hospitalizations.26,27
Interestingly, in five-level triage scales the midpoint score receives the largest share of patients (the Centers for Disease Control and Prevention (CDC) reported that over 50% of triage cases in the U.S. alone are ranked at the midpoint 28 ) and also has the highest relative instances of mistriage.29,30 Based on this information, it appears that the compressed distribution of patients around the midpoint level in a five-level triage scale is influenced by centrality bias.28–30 The current study hypothesizes that removing the midpoint level (CTAS level 3) from the five-level triage scale will reduce mistriage rates.
Methods
Informed consent
The Institutional Ethics Committee of Shamir Medical Center (SMC) approved the study and waived the requirements for written informed consent from patients. Nurses and gold standard participants provided written informed consent (reference 0007-21-ASF).
Selection of participants
Certified ED triage nurses at SMC participated in the experiment. The gold standard evaluator panel consisted of seven members, all of whom are triage specialists, evaluators, and instructors. The panel also included members with expertise in academic triage research, triage debriefing, triage events investigation, acting ED directors, emergency medicine physicians, and nurse administrators.
Patients were randomly selected to participate in the experiment. Patient selection was based on the regular workflow routine of the ED. Patients were excluded if they were minors (under 18 years old) or had a condition that required immediate aid.
Location
The study was conducted in the ED of SMC, an academic medical facility that provides care for over one million residents of Israel’s central region 31 and treats about 160,000 patients each year, with 350–500 visits/day.
Sample size
No previous studies provided a basis for the power analysis of this trial. Therefore, we aimed for a statistical power of at least 0.8 with a moderate effect size of 0.5. We conducted a power analysis using output from StatsKingdom’s power analysis calculator 32 to calculate sample power for the Pearson’s correlation test between nurses in group 2 and the gold standard evaluator. Based on these parameters, the required sample size was estimated to be ~200 triage rankings. We were able to collect 204 observations during the study period. Ultimately, 146 of these observations were usable for statistical analysis. Based on a sample size of n = 146, an observed correlation coefficient of r = 0.545, and a significance level of α = 0.05, the analysis yielded a post hoc statistical power of 0.96 (96%). This calculation was based on Fisher’s Z-transformation method, which is commonly used to test the significance of Pearson’s correlations. This result indicates a very high likelihood (96%) of detecting a true correlation of this magnitude in the population, thereby reducing the risk of a type II error (false negative).
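The Fisher Z-transformation approach named above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and it is not the implementation used by the StatsKingdom calculator, so its output need not match the figures reported here.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for testing H0: rho = 0 with Pearson's r.

    Uses the Fisher z approximation: atanh(r_hat) is treated as normal
    with mean atanh(r) and standard deviation 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic under the alternative hypothesis.
    shift = abs(atanh(r)) * sqrt(n - 3)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def required_sample_size(r: float, power: float = 0.8, alpha: float = 0.05) -> int:
    """Smallest n whose approximate power reaches the target (assumes r != 0)."""
    n = 4
    while correlation_power(r, n, alpha) < power:
        n += 1
    return n
```

Calling `correlation_power(r, n)` with an assumed population correlation before data collection gives an a priori estimate; calling it with the observed correlation gives a post hoc one.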
Design and procedure
A randomized, controlled, single-blind design study was conducted between April 1, 2021 and November 3, 2021, at the ED of the SMC. Prior to the initiation of the study, nurses were assigned randomly to the treatment or control group using the Excel RAND() function. Research participants were blinded (i.e. participants consented and received only general information about the study purpose). Nurses in the control group were asked to rank the patients using the SMC ED standardized version of CTAS (the standard five-level scale), and nurses in the treatment group were asked to rank the patients using a modified version of CTAS (a four-level scale, by removing level 3; Figure 1).

Figure 1. Workflow of the study.
Participants in both groups triaged patients arriving at the ED triage station and a gold standard evaluator assessed whether a mistriage occurred. In order to check whether removing the midpoint score increased nurses’ accountability, nurses were asked to indicate their confidence level for each case on a scale from 0 to 100. The study data were extracted from the hospital’s electronic medical records.
In order to ensure patient safety, only the control group’s CTAS score was used to determine the prioritization of care. In order to minimize workflow interference and maximize the ease of use, a mobile digital app was designed for this specified goal.
A gold standard evaluator accompanied each assessment to evaluate the accuracy of the two evaluations. The gold standard evaluator was required to indicate the triage level of the patient, using the same standardized CTAS version used in the control condition. In addition, the gold standard evaluator was asked to collect information regarding the patient’s case. This information included the ward to which the patient should be assigned, the ward to which the patient was actually assigned, and the patient’s age, sex, arrival method (e.g. ambulance or a private car), chief complaint, state of consciousness, respiration state, pain level, emotional state, mobility state, and fever state (Figure 1).
Data analysis
We applied various statistical tests based on the specific research question. Independent-samples t-tests were used to compare continuous variables (e.g. differences in CTAS level 3 selections) between two independent groups (Control and Gold Standard). One-sample t-tests were used to compare group means (e.g. the proportion of CTAS level 3 selections) against known hospital population values (e.g. Gold Standard versus Hospital Population). Chi-square tests of independence were used to examine differences in accuracy rates between groups for categorical outcomes, such as correct versus incorrect triage (e.g. the match rate between nurses’ triage scores and gold standard evaluations). Fisher’s exact test was used instead of the chi-square test when expected cell counts in a 2 × 2 contingency table were below 5 (e.g. for CTAS level 2 analyses), to ensure result reliability. Two-way Analysis of Variance (ANOVA) tests were conducted to examine the effects of two independent variables—experimental group (Control versus Treatment) and accuracy (Match versus Mismatch)—on nurses’ confidence levels. This approach allowed for the assessment of both main effects and potential interaction effects between group and accuracy. All statistical tests were two-tailed unless otherwise specified, with a significance level of α = 0.05 applied to all hypothesis testing.
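The rule for choosing between the chi-square test and Fisher's exact test on a 2 × 2 table can be sketched as below. This is a minimal standard-library illustration, not the software used in the study: the function names are hypothetical, the chi-square statistic is computed without a continuity correction (the text does not say whether one was applied), and all row and column totals are assumed to be nonzero.

```python
from math import comb, erfc, sqrt


def expected_counts(table):
    """Expected cell counts under independence for [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    exp = expected_counts(table)
    stat = sum(
        (table[i][j] - exp[i][j]) ** 2 / exp[i][j]
        for i in range(2)
        for j in range(2)
    )
    # Survival function of a chi-square variable with 1 df.
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value: sum of all tables no more likely
    than the observed one, with the margins held fixed."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def compare_accuracy(table):
    """Use Fisher's exact test when any expected cell count is below 5."""
    if min(min(row) for row in expected_counts(table)) < 5:
        return "fisher", fisher_exact_2x2(table)
    return "chi-square", chi_square_2x2(table)[1]
```

Here a table holds match and mismatch counts for the two groups, for example `[[control_match, control_mismatch], [treatment_match, treatment_mismatch]]`; these labels are illustrative.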
Results
The study included 25 triage nurses and seven gold standard evaluators, who assessed 204 individual patients during the 8 months of research. Fifty-eight assessments were excluded from the final analysis due to missing triage score values in one of the participating medical staff groups (Table 1), and thus, the final sample included 146 patients (Tables 2 and 3).
Table 1. Medical staff demographics and statistics (n = 32).
Table 2. Patient demographics.
Table 3. Patient medical presentation.
CTAS triage-level distribution in the population and experimental groups
Distribution of CTAS triage-level among the general population in the ED
In the CTAS distribution analysis (Figure 2), the CTAS scores of all of the ED visiting population during the 8-month study period were collected (in addition to experimental groups’ CTAS distributions). The total population from which data was collected was 43,584. Fifty-five percent of cases of the total population were scored at CTAS level 3, 31% were scored at CTAS level 4, 9% were scored at CTAS level 5, 5% were scored at CTAS level 2, and <1% were scored at CTAS level 1.

Figure 2. Patients’ CTAS level distribution in the population and the experimental groups.
Distribution of CTAS triage-level ranking among study groups
The gold standard practitioners ranked CTAS level 4 as the most common at 42%. In the control group, CTAS level 3 accounted for 44% of cases. Finally, treatment group practitioners ranked CTAS level 4 most often at 64%. The CTAS level 3 proportions of the gold standard evaluators group (36%) and the control group (44%) were significantly lower than that of the entire population (55%): t (145) = −4.759, p < 0.0001 and t (145) = −2.598, p = 0.0103, respectively (Figure 2).
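The one-sample comparison above treats each assessment as a 0/1 indicator of whether CTAS level 3 was selected and tests the sample mean against the population proportion. A minimal sketch of that t statistic follows; the function name is hypothetical, and because only rounded percentages are reported here, the result approximates rather than reproduces the reported values.

```python
from math import sqrt


def one_sample_t_for_proportion(sample_proportion: float, n: int,
                                population_proportion: float):
    """t statistic and degrees of freedom for a 0/1 variable.

    The sample variance of a 0/1 variable with mean p is
    p * (1 - p) * n / (n - 1), using the n - 1 denominator.
    """
    variance = sample_proportion * (1 - sample_proportion) * n / (n - 1)
    standard_error = sqrt(variance / n)
    t_stat = (sample_proportion - population_proportion) / standard_error
    return t_stat, n - 1


# Rounded inputs taken from the text: 36% in the gold standard group,
# 55% in the ED population, 146 assessments.
t_stat, df = one_sample_t_for_proportion(0.36, 146, 0.55)
```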
Nurses’ CTAS accuracy rates
Accuracy for each case in the sample was determined separately by comparing the CTAS score assigned by the experimental group with that assigned by the gold standard evaluator group. All gold standard CTAS level 3 assessments were excluded from the analysis when calculating the treatment group’s match rate in order to ensure that the treatment results were consistent with the nature of the modified CTAS method (i.e. in the cases where the gold standard evaluators ranked CTAS level 3, the treatment group was bound to be 100% wrong).
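The match-rate calculation and its exclusion rules can be sketched as follows. The function name, parameter names, and example pairs are hypothetical and for illustration only; they are not study data.

```python
from typing import List, Optional, Tuple

Pair = Tuple[int, int]  # (nurse CTAS score, gold standard CTAS score)


def match_rate(pairs: List[Pair],
               drop_gold: Optional[int] = None,
               drop_nurse: Optional[int] = None) -> float:
    """Share of assessments where the nurse and gold standard scores agree.

    drop_gold removes cases the gold standard ranked at that level;
    drop_nurse removes cases the nurse ranked at that level.
    """
    kept = [(nurse, gold) for nurse, gold in pairs
            if gold != drop_gold and nurse != drop_nurse]
    if not kept:
        raise ValueError("no assessments left after exclusion")
    return sum(nurse == gold for nurse, gold in kept) / len(kept)


# Hypothetical pairs, not study data.
example = [(3, 3), (3, 4), (4, 4), (2, 2), (5, 4)]
all_cases = match_rate(example)                                # every assessment
without_gold_3 = match_rate(example, drop_gold=3)              # gold standard level 3 removed
without_any_3 = match_rate(example, drop_gold=3, drop_nurse=3) # level 3 removed from both
```

The three calls correspond to the three analysis conditions: all cases, cases without gold standard level 3, and cases without level 3 in either ranking.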
Analysis results for accuracy as a function of experimental groups (Table 4) show that the control group had an average match rate of 49% with the gold standard evaluator group, whereas the treatment group had 68%. A Chi-Square test of independence performed to assess the difference between the experimental groups showed a significant difference between the groups (χ2 (1, 240) = 8.79, p = 0.003). When replicated, excluding both groups’ gold standard CTAS level 3, the analysis revealed that the control group had a 43% match rate (relative to 49% with the CTAS level 3 cases). The chi-square test showed the difference between the groups remained significant (χ2 (1, 188) = 12.39, p < 0.0001). When the analysis was conducted again, this time excluding all CTAS level 3 from both the control and gold standard evaluator groups, the control group was found to have a 66% match rate. The chi-square test suggested no significant difference in accuracy between the control and treatment groups (χ2 (1, 155) = 0.11, p = 0.745).
Table 4. Accurate assessment rates of the CTAS as a function of analysis condition and experimental group.
CTAS: Canadian Triage and Acuity Scale.
Next, the accuracy rate of the experimental groups was checked for each CTAS level separately (excluding CTAS level 3; Figure 3). The results showed that at CTAS level 2, the control group had a 13% match rate, whereas the treatment group had a 75% match rate. Fisher’s exact test showed that this difference was significant (p = 0.040). At CTAS level 4, the control group had a 52% match rate, whereas the treatment group had a 68% match rate. However, the difference was only marginally significant (χ2 (1, 124) = 3.35, p = 0.067). Finally, at CTAS level 5, the control group had a 29% match rate, whereas the treatment group had a 67% match rate. The difference was significant (χ2 (1, 48) = 6.76, p = 0.009).

Figure 3. Nurses’ accuracy rates in CTAS levels 2, 4, and 5 as a function of the experimental groups. The treatment group presented significantly higher accuracy at CTAS levels 2 and 5 than the control group.
Nurses’ accountability
To assess to what degree the experimental groups were accurate and when they might be wrong, the mean confidence level of the experimental groups was calculated, and results were aggregated separately as a function of match and mismatch evaluations (by the gold standard evaluators). In the match cases, the control group reported a mean confidence of 95.2% (SD = 7.17), while the treatment group reported 95.1% (SD = 10.64). In the mismatch cases, the control group reported a mean confidence level of 94.2% (SD = 10.21), whereas the treatment group reported a mean confidence level of 91.0% (SD = 14.75; see Figure 4). A two-way ANOVA test performed to examine the effect of the experimental group (control/treatment) and accuracy (match/mismatch) on the confidence level revealed a main effect for accuracy on the confidence level (F (1, 288) = 4.13, p = 0.043), but not for the experimental condition (F (1, 288) = 2.18, p = 0.140). In addition, no significant interaction was found between the two factors (F (1, 288) = 1.18, p = 0.278).

Figure 4. Nurses’ average confidence level as a function of match and mismatched evaluations.
Discussion
The practice of triage is essential in ensuring patient well-being and optimizing the use of ED resources when executed correctly; however, since human judgment plays a critical role in triage, the influence of cognitive bias can affect patient outcomes. 5 The current study aimed to investigate the influence of centrality bias on triage accuracy and accountability among two experimental groups of triage nurses in an ED environment. The gold standard group was found to have reported a lower mean percentage of CTAS level 3 rankings than that of the nurse population. This suggested that among the nurses, choosing CTAS level 3 was not exclusively explained by the fact that most cases were indeed CTAS level 3, and confirmed the hypothesis that ED nurses exhibit centrality bias when using a five-level triage scale. Interestingly, the midpoint score in a five-level triage scale has been shown to be the most highly utilized (the CDC reported that over 50% of triage cases in the United States alone are ranked at the midpoint 28 ) and to have the highest relative instances of mistriage.29,30 Given the risk of increased morbidity and mortality imparted by mistriage, the impact of centrality bias upon mistriage rates warrants deeper exploration. 33
Our study also found that removing the midpoint resulted in higher mean accuracy rates for nurses using the modified CTAS than for nurses using the full standardized CTAS (control), a difference of 19 percentage points (Table 4). When the gold standard’s CTAS level 3 cases were also excluded from the control group’s analysis, the accuracy gap increased to 25 percentage points. Both differences were significant, indicating that the treatment group was more likely than the control group to match the gold standard’s CTAS ranking (excluding CTAS level 3). In addition, excluding CTAS level 3 from both the control and gold standard groups increased control accuracy from a 49% to a 66% match rate, and no significant difference in accuracy between the control and treatment groups was found. When the accuracy rate of the experimental groups was checked for each CTAS level separately (excluding CTAS level 3), the treatment group showed a better match rate than the control group at the edge values. These findings regarding accuracy are of particular importance because lower accuracy under the five-level scale may increase the risk of mistriage in cases requiring urgent care (i.e. CTAS level 2), potentially leading to critical outcomes.21,23,24
The literature on triage nurse accuracy highlights numerous internal and external factors that can affect decision-making. In a small study based on focus groups conducted with triage nurses, Reay et al. identified three main themes that contributed to the decision-making complexity experienced by nurses: the hospital systems, variability in patient volume, and triage fatigue. 34 In another qualitative study, nurses’ decision-making was found to rely on both intuition and early judgements, together with the current status of the ED environment and nurses’ own confidence in communicating with presenting patients. 35 Furthermore, other findings indicate that triage Registered Nurses consistently strive to balance the needs of the individual patient with those of the ED, a broader challenge that requires both supportive environments and appropriate technology for the best outcomes. 36
Studies that evaluated three-, four-, and five-level triage scales have demonstrated that the five-level triage scale showed the highest agreement among rankers, thus exhibiting the highest reliability. However, the three- and four-level triage scales are not as extensively researched as five-level triage scales such as the CTAS and the Australasian Triage Scale (ATS), which are supported and mandated by governments. 37 A recent large-scale qualitative study on the five-level Emergency Severity Index (ESI) triage system identified level three as having the highest rate of mistriage, thereby further supporting our study’s findings.
Our findings support the rejection of the null hypothesis associated with the first research question, indicating that removing the midpoint from the triage scale significantly reduced mistriage rates and supported the presence of centrality bias.
In contrast, the null hypothesis for the second research question—stating that removing the midpoint would not affect nurses’ accountability—was not rejected, as no statistically significant effect on self-reported confidence levels was observed.
With regard to nurse accountability, removal of the midpoint was not found to increase this measure. There was a main effect for accuracy on the confidence level but not for the experimental condition, and no significant interaction was found between the two factors, indicating that nurses, regardless of the experimental group, reported lower confidence in mismatch cases than in match cases. Although the difference is significant, it is small (between 1% and 4%), and reported confidence levels for all cases (matched and mismatched) were remarkably high (mean = 94%, SD = 11.28), suggesting that the confidence of nurses trained to make ED assessments may not be well calibrated to their actual accuracy level.
Implications and future directions
Our novel findings may provide valuable insights for future research and could be considered in the development of new triage scales, potentially using a four-level system rather than a five-level system. First, reducing mistriage through the removal of midpoint options may improve patient flow and reduce critical delays in high-urgency cases. This can be particularly impactful in high-volume EDs, where resource allocation hinges on accurate acuity assessments. Implementing modified triage tools with reduced scale centrality may support better prioritization without increasing the cognitive burden on triage staff.
Moreover, our results suggest that triage accuracy at extreme values (levels 2 and 5) significantly improved when the midpoint was removed, indicating that forced differentiation might help uncover clinically meaningful distinctions between cases. This insight is relevant for all domains where categorization bias and compressed scoring are prevalent (e.g. pain assessments, mental health evaluations).
Future research should evaluate whether these effects generalize across diverse hospital settings (e.g. pediatric, rural, trauma centers), and explore long-term outcomes such as morbidity, patient satisfaction, and ED overcrowding. In addition, it may be valuable to investigate how digital triage support tools or AI-based decision aids interact with cognitive biases like centrality and whether these tools can further enhance decision calibration without overriding clinical judgment.
Limitations
First, 44% of the control group was found to be ranked with CTAS level 3. This percentage was significantly lower than the CTAS level 3 proportion in the population. However, our analysis revealed that the difference was only marginally significant. Presumably, the relatively low percentage of CTAS level 3 cases in the control group stemmed from the fact that nurses knew their rankings were monitored. As a result, they may have paid more attention, invested more cognitive effort, and made more accurate (less biased) decisions.
Since the study lasted for several months, it was hoped that this so-called “Hawthorne effect” would diminish; however, due to the relatively small number of observations included in our sample, it remained strong.
Second, to measure triage accuracy, we used a gold standard evaluator. This is widely regarded as a proper tool, since there is no objective system to measure the triage level in real time. 33 In addition, our gold standards are also used to evaluate real people with actual medical issues and guide medical staff about their treatment. However, biased thinking among evaluators could have been a limitation since we could not rule out the possibility that our gold standard group had acted in a biased manner.
Finally, an additional methodological limitation stems from excluding CTAS level 1 patients, who were not eligible for participation due to ethical and clinical considerations. While this exclusion reduces the opportunity to evaluate potential extremity avoidance at the most urgent end of the triage spectrum, it was applied uniformly across all study groups. Therefore, although the absence of level 1 cases may constrain the full conceptualization of centrality bias, including avoidance of high-urgency extremes, it is unlikely to have introduced systematic bias between the treatment and control conditions. Further studies, including all acuity levels, may provide broader generalizability.
Conclusion
The current research extends our understanding of biased thinking in a complex decision-making process, in which the decisions directly impact the well-being of all people seeking emergency medical treatment. While this study did not find that increased accountability leads to better triage decisions, the results support the hypothesis that central tendency is a cognitive bias that exists among triage nurses in the ED, a finding that may be crucial for developing a safe and effective patient prioritization system. Given the potential impact of this bias on triage and patient outcomes, further study is recommended to identify tools and approaches to address centrality bias.
Footnotes
Ethical considerations
Ethical approval for this study was obtained from The Institutional Ethics Committee of Shamir Medical Center (SMC; reference 0007-21-ASF).
Consent to participate
The Institutional Ethics Committee of Shamir Medical Center waived the requirements for written informed consent from patients. Nurses and Gold Standard participants provided written informed consent (reference 0007-21-ASF).
Author contributions
D.S. was involved with conceptualization, methodology, data curation, data analysis, and drafting, reviewing and editing the article. A.C. was involved with project administration, data curation, and drafting, reviewing and editing the article. G.H. was involved with methodology, supervision and visualization, and drafting, reviewing and editing the article. G.P. was involved with data curation, and drafting, reviewing and editing the article. D.T. was involved with conceptualization, methodology, data curation, supervision, visualization, and drafting, reviewing and editing the article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The data supporting the findings of this study are available from the corresponding author upon request.
Trial registration
The purpose of the study was to examine the effect solely on healthcare providers, without involving or impacting patients directly. As the study did not interfere with clinical care, alter the existing triage process, or include any patient-level intervention, clinical trial registration was not required.
