Abstract
The Surprise Question is an effective tool to identify patients in need of palliative care. However, it is unknown whether the Surprise Question can effectively predict adverse outcomes in emergency patients. Therefore, this study aimed to determine the utility of the modified Surprise Question for risk stratification in emergency patients and to assess whether the modified Surprise Question could be used by different healthcare personnel. Nurses and patients’ families were asked to respond “Yes” or “No” to the modified Surprise Question for each patient. The outcome was resuscitation unit admission. Logistic regression was used to determine covariates significantly associated with resuscitation unit admission. The area under the curve for the second Surprise Question response by nurses was 0.620, which improved to 0.704 when the responses of nurses and patients’ families were in agreement. The clinical impression of nurses is a valuable tool to predict altered conditions for medium-acuity patients, and the diagnostic accuracy improves when the responses of patients’ families and nurses agree.
The Surprise Question: “Would I be surprised if this patient died in the next 12 months?” is an effective tool to identify patients in need of palliative care and was found to predict adverse outcomes in the emergency department.
This paper assessed if the modified Surprise Question could be used by nurses.
This paper provides evidence that the Surprise Question is a valuable tool to predict altered conditions for medium-acuity patients in the emergency department.
Background
With the rapid development of emergency medicine, visits to the emergency department have increased dramatically worldwide. 1 Important diagnostic and treatment decisions must be made within a short period after emergency department admission. However, it is more important to assign a matched care setting for patients based on the disease severity. 2 Therefore, a triage system is crucial to optimize emergency department resources, which mainly classifies patients who do and do not need immediate intervention based on urgency. 3
The Emergency Severity Index (ESI) is a widely adopted triage system with proven practicability and accuracy.4,5 The ESI includes five levels; Levels 1 and 2 are based on patient acuity and Levels 3 through 5 on resource allocation. 6 To create treatment space for high-acuity patients in the crowded environment of the emergency department, the “fast track” model for lower-acuity patients is now generally accepted. With these lower-acuity patients handled first, the medium-acuity (ie, ESI 3) cases now wait the longest for care and are often a lower priority for regular emergency department resources,7,8 despite the variation in their required resources, complexity, monitoring intensity, and nursing grades. Further, the initial ESI assessment may miss potentially critically ill patients, and the condition of emergency patients is prone to rapid changes. Importantly, in previous large emergency department cohort studies, most patients were classified as medium-acuity, ranging from 48% to 65%.9-11 Currently, there is no useful screening tool to identify medium-acuity patients who are likely to deteriorate to a life-threatening status, because the fast-paced emergency environment requires something that is easy to operate and quick to interpret.
The Surprise Question: “Would I be surprised if this patient died in the next 12 months?” is an effective tool to identify patients in need of palliative care12,13 and was found to predict adverse outcomes in the emergency department,14-16 although the accuracy varied widely. 17 Most studies used the Surprise Question for patients with cancer or organ failure, with long-term mortality as the primary endpoint.18-20 However, it is unknown whether the Surprise Question can effectively predict adverse outcomes in patients with Emergency Severity Index Level 3 in the emergency department. Patients might find it easier to open up to nurses and family members about their preferences, so nurses and patients’ families could be good assessors of the Surprise Question. Therefore, the aim of this study was to assess the predictive value of the Surprise Question when used by nurses working in the emergency department and by patients’ families for risk stratification in medium-acuity patients.
Methods
Study Design
The PSQS study was conducted in the emergency department observation unit of a general hospital in Sichuan, China from December 2020 to May 2021. All emergency patients were first triaged using the Emergency Severity Index (ESI) system. Patients with ESI Levels 1 or 2 were admitted to the resuscitation or intensive care units for treatment. Patients with ESI Levels 3 to 5 were evaluated by doctors in the general diagnosis unit. Patients who needed further intervention, monitoring, or hospitalization stayed in the emergency department, accompanied by family members.
Study Population
In total, 1411 patients with ESI 3 who visited the emergency department from December 1, 2020, to February 28, 2021 were consecutively recruited for this study. Patients were excluded if they were less than 18 years old, were pregnant, left the emergency department within 24 hours, had communication disorders, or refused to participate. Finally, 1186 patients were included. Figure 1 presents the study’s participant flow chart. Based on our pilot experiment, the incidence of resuscitation unit admission was about 10% among patients whose nurses answered “No” to the modified Surprise Question and about 2% among patients whose nurses answered “Yes.”

Figure 1. Study flow chart.
The Modified Surprise Question
The original Surprise Question is: “Would I be surprised if this patient died in the next 12 months?” Patients in our study were relatively stable and had very low mortality (less than 1%). Thus, nurses who cared for the patients were asked to answer the modified Surprise Question: “Would you be surprised if the patient’s condition deteriorated to life-threatening?” within two hours after admission to the observation unit. The answer choices were “Yes,” “No,” or “Unclear.” We also asked the family members the modified Surprise Question. Family members who were unwilling to answer were allowed to remain silent. Although family members lack relevant professional knowledge, they are the main caregivers outside the hospital and one of the most important sources of information on the patient’s condition. The modified Surprise Question was asked a second time approximately 24 hours later. We could only evaluate the modified Surprise Question twice since most patients stayed in the observation unit for less than 72 hours. We classified the answer “No” as a positive response, meaning that the nurse or family member predicted that the patient would experience a serious deterioration in the emergency department.
Competency Inventory for Registered Nurses
There were 65 nurses in the emergency department, of whom 58 participated in this study. The age, seniority, professional title, and core competence scores of the nurses who answered the modified Surprise Question were obtained. Core competence was measured using the Competency Inventory for Registered Nurses, designed by Liu et al. 21 The scale has an overall Cronbach’s α of 0.89 and covers seven dimensions: critical thinking and scientific research ability, clinical nursing, leadership ability, interpersonal relationships, ethical and legal practice, professional development, and educational consultation. In total, 55 items were scored on a 5-point Likert scale. The maximum total score was 220, and the higher the score, the stronger the core competence of the nurse. A total score of 164 to 220 (average item score >3) indicated strong core competence, 110 to 164 (average item score 2-3) indicated moderate core competence, and <110 (average item score <2) indicated weak core competence. The nurses were divided into two groups based on the cut-off value of 164, as only one nurse had a score <110.
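The score bands above can be sketched as a small classifier. This is only an illustration, not the authors' procedure: the function names are hypothetical, and because the text lists 164 in both the strong and the moderate band, the sketch assumes a score of exactly 164 counts as strong, in line with the ≥164 grouping used in the analysis.

```python
def classify_cirn(total_score: int) -> str:
    """Map a Competency Inventory for Registered Nurses total score (0-220)
    to a core-competence band. Assumption: exactly 164 counts as 'strong'."""
    if not 0 <= total_score <= 220:
        raise ValueError("total score must be between 0 and 220")
    if total_score >= 164:
        return "strong"
    if total_score >= 110:
        return "moderate"
    return "weak"


def study_group(total_score: int) -> str:
    """Two-group split at the cut-off of 164 described in the text."""
    return ">=164" if classify_cirn(total_score) == "strong" else "<164"


# A nurse with a total score of 157 falls in the moderate band
# and therefore in the <164 group.
print(classify_cirn(157), study_group(157))
```

Collapsing the moderate and weak bands into a single <164 group mirrors the two-group split described above, which was chosen because only one nurse scored below 110.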
Primary Endpoint
Patients with Emergency Severity Index Level 3 primarily wait for admission after the necessary treatment in the emergency department and are considered at relatively low mortality risk. However, patients with serious illnesses may not be identified at triage, and a patient’s condition is often complex and rapidly changing in the emergency department. Therefore, if a patient showed unstable vital signs and life-threatening symptoms in the observation unit (ie, Emergency Severity Index Level 1 or 2), they were transferred to the resuscitation unit after confirmation by emergency physicians. Thus, the primary endpoint was resuscitation unit admission.
Data Collection
The socio-demographic characteristics collected included the age, sex, and marital status of the patients and their accompanying family members. Vital signs (including temperature, pulse, respiration, blood pressure, and blood oxygen saturation), chief complaints, diagnosis, Charlson Comorbidity Index, and the length of stay were recorded by two independent, trained researchers. The Charlson Comorbidity Index is a validated and reliable tool for measuring the number and severity of comorbid diseases, with a total score between 0 and 31 that positively correlates with mortality. 22 The Emergency Severity Index levels were recorded based on triage nurses’ evaluations. The National Early Warning Score was calculated from heart rate, blood pressure, respiration rate, oxygen saturation, body temperature, and consciousness. 23 The Modified Early Warning Score is based on the National Early Warning Score but replaces oxygen saturation with urine volume. 24 Finally, the quick Sepsis-related Organ Failure Assessment consisted of blood pressure, respiration rate, and consciousness. 25 Differences between the two data checks were judged by another researcher.
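As an illustration of how the simplest of these scores is built from vital signs, the quick Sepsis-related Organ Failure Assessment can be sketched as follows. The thresholds are the commonly published Sepsis-3 criteria (systolic blood pressure ≤100 mm Hg, respiratory rate ≥22/min, altered mentation) and are not stated in this paper. Using a Glasgow Coma Scale below 15 as the marker of altered consciousness is likewise an assumption, as are the function and parameter names.

```python
def qsofa_score(systolic_bp: float, respiratory_rate: float, gcs: int) -> int:
    """Return a qSOFA score from 0 to 3, one point per criterion met.

    Thresholds follow the Sepsis-3 definition (assumed here, not taken
    from the study): SBP <= 100 mm Hg, RR >= 22/min, GCS < 15.
    """
    points = 0
    if systolic_bp <= 100:        # low systolic blood pressure
        points += 1
    if respiratory_rate >= 22:    # fast breathing
        points += 1
    if gcs < 15:                  # altered consciousness
        points += 1
    return points


# A patient with stable vital signs scores 0, which is typical of the
# medium-acuity population described in this study.
print(qsofa_score(systolic_bp=124, respiratory_rate=18, gcs=15))
```

Because every criterion is a threshold on a vital sign, patients whose vital signs are stable at admission will tend to receive the same low score, regardless of their later course.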
Statistical Analyses
Parametric variables were reported as means ± standard deviations and compared using analysis of variance. Nonparametric variables were reported as medians and interquartile ranges (25th-75th percentiles) and compared using the Mann–Whitney U test. Categorical variables were reported as frequencies and percentages and compared using the chi-square test.
Because only approximately 1% of answers from nurses or family members were “Unclear,” our analyses did not consider this response. A logistic regression model was used to examine the association of the modified Surprise Question responses with resuscitation unit admission. To determine whether the responses were independent risk factors, the model was adjusted for age, sex, vital signs on admission, emergency department length of stay, the Charlson Comorbidity Index, common chief complaints, and complicating malignant tumors. Univariate logistic regression was used to calculate unadjusted odds ratios. A multivariable logistic regression model was used to assess the interactions between the modified Surprise Question responses, the Competency Inventory for Registered Nurses score, and resuscitation unit admission. The areas under the receiver operating characteristic curve were calculated to estimate the discriminative ability of the modified Surprise Question for resuscitation unit admission, along with sensitivity, specificity, positive and negative predictive values, and accuracy.
A two-tailed P < .05 was statistically significant for all tests. All statistical analyses were performed using SPSS version 26.0 (IBM Corp, Armonk, NY, USA) and R software 3.5.0 (R Foundation for Statistical Computing, Vienna, Austria).
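For a single binary predictor such as a “No” versus “Yes” response, the unadjusted odds ratio from a univariate logistic regression equals the cross-product ratio of the 2 × 2 table, and its Wald confidence interval can be computed by hand. The sketch below is in Python for illustration only (the analyses themselves were run in SPSS and R). The function name and the counts are hypothetical and are not study data.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: "No" responses with admission     b: "No" responses without admission
    c: "Yes" responses with admission    d: "Yes" responses without admission
    z: normal quantile (1.96 gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) on the log scale
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
or_, lower, upper = odds_ratio_ci(a=40, b=460, c=20, d=580)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

The adjusted odds ratios reported in this study cannot be reproduced this way, since they come from a multivariable model that also includes age, sex, vital signs, and the other listed covariates.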
Ethical Approval
This study complied with the Declaration of Helsinki, and the local Human Ethical Committee approved the study protocol. All participants provided written informed consent.
Results
Baseline Characteristics
In total, 1186 participants (average age 51.1 ± 18.9 years) were included (Figure 1). Among them, 682 (57.5%) were men. Sixty-seven patients (5.6%) were admitted to the resuscitation unit, nine of whom (0.8%) died during hospitalization. Participants transferred to the resuscitation unit were older and had higher Charlson Comorbidity Index scores, higher Modified Early Warning Scores, and longer emergency department lengths of stay than those who were not transferred. Table 1 and Supplemental Table 1 compare the baseline characteristics.
Table 1. Baseline Characteristics of MA Patients.
Note. Values are expressed as N (%), mean ± SD, and median (25th-75th).
SBP = systolic blood pressure; DBP = diastolic blood pressure; ED = emergency department; NEWS = National Early Warning Score; MEWS = Modified Early Warning Score; qSOFA = quick Sepsis-related Organ Failure Assessment; RUA = resuscitation unit admission; MA = medium-acuity.
Nurses’ Characteristics
There were 58 participating registered nurses (median age 28 [range, 26-31] years); 10 (17.2%) were male. The nurses had a median work experience of 7 (range, 2-9) years and a median Competency Inventory for Registered Nurses score of 157 (range, 136-169). Of all nurses, 20 (37.8%) had a Competency Inventory for Registered Nurses score of ≥164.
The Modified Surprise Question and Resuscitation Unit Admission
For the first modified Surprise Question (Figure 2), the incidence of resuscitation unit admission was similar regardless of the answer from nurses (“No” 6.9% vs “Yes” 4.6%, P = .084) or patients’ families (“No” 7.2% vs “Yes” 5.3%, P = .309). Notably, nurses with a Competency Inventory for Registered Nurses score of ≥164 who answered “No” had a significantly higher percentage of patients admitted to the resuscitation unit than the “Yes” group (10.1% vs 3.9%, P < .001).

Figure 2. Associations between the first Surprise Question and resuscitation unit admission incidences.
For the second modified Surprise Question (Figure 3), nurses (“No” 8.3% vs “Yes” 3.2%, P < .001) and patients’ families (“No” 9.4% vs “Yes” 4.5%, P = .008) who answered “No” had a significantly higher percentage of patients admitted to the resuscitation unit than the “Yes” group. These results were consistent regardless of the nurses’ Competency Inventory for Registered Nurses score.

Figure 3. Associations between the second Surprise Question and resuscitation unit admission incidences.
Logistic Regression Analyses
The odds ratio for resuscitation unit admission in the multivariable adjusted logistic regression models (Table 2) comparing the “Yes” and “No” responses to the first modified Surprise Question was 1.452 (95% confidence interval: 0.871-2.419, P = .152) for all nurses, 0.851 (95% confidence interval: 0.252-1.686, P = .582) for nurses with Competency Inventory for Registered Nurses scores of <164, 2.802 (95% confidence interval: 1.384-5.672, P = .004) for nurses with Competency Inventory for Registered Nurses scores of ≥164, and 1.179 (95% confidence interval: 0.441-3.148, P = .743) for patients’ families.
Table 2. Adjusted and Unadjusted Odds Ratios for the Association of Surprise Question and Resuscitation Unit Admission by Logistic Regression Analysis.
Note. Multivariable logistic analysis was adjusted for age, sex, systolic blood pressure, diastolic blood pressure, heart rate, respiratory rate, oxygen saturation, common chief complaints, malignant tumor, emergency department length of stay, Emergency Severity Index, and Charlson Comorbidity Index. OR = odds ratio; SQ = Surprise Question; CIRN = Competency Inventory for Registered Nurses.
P < .05 for interaction of response to SQ, CIRN, and resuscitation unit admission.
After adjusting for confounding factors, a “No” response to the second modified Surprise Question (Table 2) was an independent predictor of resuscitation unit admission when given by all nurses (odds ratio: 2.801, 95% confidence interval: 1.620-4.844, P < .001), nurses with Competency Inventory for Registered Nurses scores of <164 (odds ratio: 2.759, 95% confidence interval: 1.116-6.822, P = .028), nurses with Competency Inventory for Registered Nurses scores of ≥164 (odds ratio: 4.461, 95% confidence interval: 1.904-10.45, P = .001), and patients’ families (odds ratio: 2.325, 95% confidence interval: 1.250-4.323, P = .008).
Discrimination Analyses
At admission, the area under the receiver operating characteristic curve (Table 3) was 0.543 (95% confidence interval: 0.473-0.612, P = .240) for the National Early Warning Score, 0.574 (95% confidence interval: 0.501-0.647, P = .043) for the Modified Early Warning Score, and 0.547 (95% confidence interval: 0.473-0.622, P = .194) for the quick Sepsis-related Organ Failure Assessment for resuscitation unit admission. The first modified Surprise Question response from nurses and patients’ families did not have predictive value for resuscitation unit admission (P > .05). However, the area under the curve for the second modified Surprise Question from all nurses was 0.620 (95% confidence interval: 0.553-0.686, P = .001), which was greater than the National Early Warning Score, the Modified Early Warning Score, and the quick Sepsis-related Organ Failure Assessment (P < .05). The modified Surprise Question answered by all nurses had 70% sensitivity, 54% specificity, 8% positive predictive value, and 97% negative predictive value. Notably, when the modified Surprise Question responses by patients’ families and nurses agreed, the area under the curve improved to 0.704 (95% confidence interval: 0.629-0.779, P < .001), and the diagnostic accuracy improved to 84%.
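The relationship between the reported sensitivity, specificity, predictive values, and area under the curve can be checked with a short sketch. For a binary predictor such as a “Yes”/“No” answer, the ROC curve has a single operating point, so the area under it equals (sensitivity + specificity) / 2, and (0.70 + 0.54) / 2 = 0.62 matches the reported value. The confusion-matrix counts below are hypothetical, back-calculated from the rounded percentages and the 67 admissions among 1186 patients. They are not the study's actual table, which also excluded “Unclear” responses.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2 x 2 confusion matrix."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        # Area under the ROC curve for a binary (single-threshold) test
        "auc_binary": (sensitivity + specificity) / 2,
    }


# Hypothetical counts: "No" is the positive response, admission is the event.
metrics = diagnostic_metrics(tp=47, fp=515, fn=20, tn=604)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With an event rate of only 5.6%, a low positive predictive value alongside a high negative predictive value is expected even for a moderately sensitive test, which is why a “Yes” answer is more informative for ruling out deterioration than a “No” answer is for ruling it in.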
Table 3. Discrimination, Sensitivity, Specificity, Accuracy, PPV, and NPV of Surprise Question and Emergency Severity Index for Resuscitation Unit Admission.
ROAUC = area under the receiver operating characteristic curve; SQ = Surprise Question; CIRN = Competency Inventory for Registered Nurses; NEWS = National Early Warning Score; MEWS = Modified Early Warning Score; qSOFA = quick Sepsis-related Organ Failure Assessment.
Compared with second SQ by all nurses, P < .05 for ROAUC comparison. The ROAUCs of SQ were calculated by bivariate model, those of NEWS and MEWS were calculated as continuous variables, and that of qSOFA was calculated by three-variable model. Sensitivity and specificity of NEWS, MEWS, and qSOFA for resuscitation unit admission were calculated according to the cut-off value with the largest Youden index.
Discussion
In this study, the association between the answers to the modified Surprise Question from nurses and patients’ families and resuscitation unit admission changed over time. The answers to the first modified Surprise Question, asked within two hours after observation unit admission, indicated that only nurses with a strong core competence (ie, Competency Inventory for Registered Nurses scores of ≥164) could predict rapid deterioration by their clinical intuition. Importantly, as the time in the observation unit increased (eg, after 24 hours), the modified Surprise Question answer “No” from nurses and patients’ families had independent predictive value regarding resuscitation unit admission.
To our knowledge, there is no specific assessment tool for the prognosis of medium-acuity patients in the emergency department. We compared the discrimination, sensitivity, specificity, and accuracy of the modified Surprise Question with those of the National Early Warning Score, the Modified Early Warning Score, and the quick Sepsis-related Organ Failure Assessment, which are used in different settings to detect patients at risk of deterioration, intensive care unit admission, and mortality in the emergency department.26-29 Our results suggested that, of these scores, only the initial Modified Early Warning Score had any predictive power for the deterioration of patients with Emergency Severity Index Level 3, and that power was poor. These results demonstrate that it is difficult to identify the prognosis of medium-acuity patients with stable vital signs using a screening tool composed only of vital signs. Further, the modified Surprise Question responses from nurses the next day could better discriminate resuscitation unit admission than the Modified Early Warning Score, suggesting that the clinical intuition of nurses considers more factors than vital signs, making it a more valuable tool to predict deterioration. These suspected at-risk patients also need more attention and monitoring so that treatment plans can be changed in time.
Previous research has posed the Surprise Question to patients themselves and validated its positive predictive value.30,31 In our pilot experiment, patients cooperated very poorly, likely because the question generated fear. Therefore, we conducted this study by questioning the patients’ family members and also found a negative to positive change in the predictive value, similar to the nurses. This result may be partly owing to physician-patient communication, but further study is required to explain this phenomenon. Similar to previous studies, our results also found lower discriminatory values for assessments by nonprofessionals than by professionals. More importantly, the diagnostic accuracy improved when professionals and nonprofessionals agreed.30,31 The specificity was also higher for answers provided by the patients’ families than by nurses in our study, supporting the importance of shared clinical decision-making. 32
The biggest difference between this study and previous studies is the primary endpoint. In this study, the primary endpoint is worsening disease rather than mortality owing to the differences in the clinical setting and the disease severity.13,14,18,33,34 A meta-analysis analyzed the association between the Surprise Question and 6 to 18-month mortality and found that the pooled prognostic characteristics of sensitivity, specificity, and the area under the curve were 0.67, 0.80, and 0.81, respectively. 12 However, our results show that sensitivity was greater than specificity when nurses judge the likelihood of worsening disease, perhaps because medical providers are more cautious when judging death versus disease progression. This also explains why more than 50% of nurses answered “No” in our study. The discrimination in our study (area under the curve: 0.62) is generally weaker than in other studies.12,20,35,36 However, a previous study also found lower discriminatory values for endpoints other than mortality. 30 Another study about the Surprise Question conducted at the emergency department measured acute morbidity as the endpoint and also resulted in a lower area under the curve for nurses (0.68) and patients (0.54), similar to our results. 31
Few studies have explored interactions between the Surprise Question responses, Competency Inventory for Registered Nurses scores, and outcomes. Our study demonstrated that nurses with a strong core competence (Competency Inventory for Registered Nurses score of ≥164) have a stronger predictive ability for prognosis than nurses without. Nurses with higher scores also had significantly more work experience in our study (data not shown). Consistently, previous studies found that nurses with more experience were more accurate in their Surprise Question prediction.13,37 These results emphasize the importance of empowering junior nurses to acquire higher core competencies through education and training programs to provide better patient care. 38
There are limitations to this study. First, we used a modified Surprise Question and used resuscitation unit admission as the primary endpoint, neither of which has been validated. However, some other studies have also used a modified Surprise Question in other clinical settings. 15 Second, the participating nurses had only a few years of professional experience, and we found that clinical intuition has more predictive value in more experienced professionals. Thus, the findings of this study may differ in other emergency departments. Third, previous studies found that the accuracy of patients’ prognoses was higher when both nurses and physicians agreed.15,30 However, physicians did not participate in this study due to the rapid rotation of emergency department unit physicians in our hospital. Thus, the importance of teamwork could not be investigated. Fourth, the next-day answers by nurses and patients’ families had higher predictive power, but that may be because the answers changed based on history, physical and laboratory examinations, and diagnostic tests. We will explore this possibility in a future study.
Conclusions
In the very early stage of emergency department treatment, the judgment of nurses with relatively few years of clinical experience regarding patient prognosis was not accurate. However, emergency department nurses should be aware that their clinical intuition is a valuable tool for predicting altered conditions of patients with stable vital signs. Thus, individualized nursing plans can be made in advance. The diagnostic accuracy of the modified Surprise Question improved when the responses of the patients’ families and the nurses were aligned, highlighting the importance of shared decision-making. Future studies are needed to verify the effectiveness of the modified Surprise Question used in this study and to confirm if implementing the modified Surprise Question in the emergency department changes medical decisions and improves patient prognosis.
Supplemental Material
Supplemental material, sj-docx-1-inq-10.1177_00469580231163089 for Usability of the Surprise Question by Nurses and Patients’ Families for Risk Stratification in Emergency Patients: A Prospective Cohort Study by Yi Liu, Dongze Li, Yu Jia, Jing Yu, Fanghui Li, Yongli Gao, Yan Ma, Zhi Wan, Zhi Zeng and Wei Zhang in INQUIRY: The Journal of Health Care Organization, Provision, and Financing
Supplemental Material
Supplemental material, sj-docx-2-inq-10.1177_00469580231163089 for Usability of the Surprise Question by Nurses and Patients’ Families for Risk Stratification in Emergency Patients: A Prospective Cohort Study by Yi Liu, Dongze Li, Yu Jia, Jing Yu, Fanghui Li, Yongli Gao, Yan Ma, Zhi Wan, Zhi Zeng and Wei Zhang in INQUIRY: The Journal of Health Care Organization, Provision, and Financing
Footnotes
Acknowledgements
We would like to thank all the participants of this project and investigators for collecting the data.
Author Contributions
YL, ZZ, and WZ conceived of the study design. DL, YJ, JY, FL, YL, YG, and ZW collected the epidemiological and clinical data. DL, YJ, YL, and YM summarized data, and performed the statistical analysis. DL, YJ, YM, and YC interpreted the data, and drafted the manuscript. All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Data Availability Statement
The corresponding author will provide the data that support the results of this research upon reasonable request.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported financially by grants from the National Key Research and Development Program of China (No. 2020AAA0105000, 2020AAA0105005), the Sichuan Science and Technology Program (No. 2022YFS0279, 2021YFQ0062, 2022JDRC0148, 2023YFS0240), the Sichuan Provincial Health Commission (ZH2022-101), and the Sichuan University West China Nursing Discipline Development Special Fund Project (HXHL20017, HXHL20046, HXHL21016).
Supplemental Material
Supplemental material for this article is available online.