
Abstract

Clinical reasoning and decision-making in prehospital contexts are complex, and patient assessments may be influenced by stress or biases, thus potentially risking patient safety. Previous research has shown mixed results regarding cognitive interventions designed to counteract biases and improve decision-making. In educational settings, there are no tools that assess clinical reasoning while also measuring important decision-making outcomes. This study employed a mixed-methods design and a novel assessment model to evaluate clinical reasoning and decision-making among Swedish prehospital nurse specialists. Additionally, the effect of the metacognitive TWED mnemonic was investigated. Thirteen participants were randomly assigned to two groups and assessed patients in simulation settings, with groups switching cases after brief training on the TWED mnemonic. The primary outcomes included point-based scoring on decision-making, grading of potential risks of patient harm, and analysis of clinical reasoning through reflections. The results showed large variation, without overall differences between groups or demographics. A complex case presentation resulted in lower scores and greater risks of potential patient harm. Qualitative analysis highlighted participants’ ability to handle conflicting data, which correlated with better outcomes. The use of the TWED mnemonic may have increased commission bias. Further research is needed to validate and understand these findings.

Keywords

clinical reasoning, decision making, prehospital, cognitive models, topics, cognitive processes

Background

In Sweden, roughly 1.3 million individuals a year require the attendance of emergency medical services. Most of these attendances are a response to an emergency call, and result in a prehospital nurse performing a patient assessment. The most frequent complaints are shortness of breath, abdominal pain, and chest pain (Swedish National Board of Health and Welfare, 2023a). After the prehospital patient assessment, 10–14% of patients are not considered to require conveyance to an emergency department (Höglund et al., 2020; Lederman et al., 2020). From an international perspective, studies on various populations have indicated significant variation in non-conveyance rates, with figures ranging from around 4% to as high as 98% of patients not requiring conveyance to an emergency department. There are concerns regarding competencies, support, and patient safety in non-conveyance decision-making (Ebben et al., 2017).

The Swedish emergency medical services are considered advanced life support units, with registered nurses or nurse specialists expected to deliver advanced medical care (Hjalmarsson et al., 2020). The staffing policy, as outlined in the Swedish regulations on prehospital care (SOSFS 2009:10), mandates that every ambulance must be staffed with at least one registered nurse. A nurse specialist, as defined by the Swedish Higher Education Regulation (SFS 1993:100), is a registered nurse with an additional one-year master’s degree. There are a variety of nurse specialist subtypes, such as intensive care or anaesthesiology, but most prehospital nurse specialists have a degree in prehospital emergency care. In some regions of Sweden, there are directives mandating that emergency medical services are staffed with a nurse specialist working together with either a registered nurse or an emergency medical technician (Swedish National Board of Health and Welfare, 2023b). Experienced nurse specialists sometimes work as single responders, performing individual patient assessments without the support of a team member (Carlström & Fredén, 2017).

Patient Assessments in a Prehospital Context

Performing a patient assessment in a prehospital context is complex. Prehospital nurses describe the assessments as influenced by patient-related factors such as vital signs, symptoms, and medical history, with associated issues regarding paradoxes in findings and difficulties obtaining reliable information from intoxicated patients. Other contextual factors, such as home and social situation, or the absence of available options other than transport to an emergency department, also influence the decision-making (Nilsson & Lindström, 2016). Prehospital nurses also address various contributors to stress, including traumatic or chaotic situations involving children or upset relatives, as well as the influence of fear and physical or emotional reactions. Furthermore, challenges in cooperation with team members or disrupting circumstances (e.g. limited resources, adverse weather conditions, the influence of family members, or malfunctioning technology) are highlighted as having a detrimental impact (Glawing et al., 2023).

During an ambulance attendance, the patient assessment relies on the gradual collection of variable and fragmented information. The attendance can be divided into several phases, starting with dispatch information and finding the address, followed by first contact and scene assessment. Primary and secondary surveys are then performed, generating a field diagnosis and possible interventions, followed by a decision on the level of care and transport. The final phases include handover and documentation (Andersson et al., 2022). Throughout the attendance, clues are identified and analysed, and hypotheses are defined that render solutions, actions, and evaluation (Gugiu et al., 2022).

Clinical Reasoning and Decision-Making

In order to conduct high-quality patient assessments in complex contexts, prehospital nurses need to develop skills in clinical reasoning and decision-making. The reasoning process could be described as the analysis that leads to a decision (Gugiu et al., 2022). Clinical reasoning is a central and complex process that involves defining the patient problem, therapeutic decisions, and prognosis (Yazdani & Hoseini Abardeh, 2019). In systems using prehospital nurses, it has been found that nurses are sometimes unaware of their clinical reasoning, and situational signals of lower severity can have a negative impact on the reasoning process (Andersson et al., 2022). Clinical decision-making is a term closely related to clinical reasoning and is often defined as the decision-making process of data gathering, interpretation, evaluation, and consideration (Tiffen et al., 2014). Clinical decision-making is at best based on rational, logical, and evidence-based thinking, but it can be affected by knowledge gaps, the incorrect use of decision aids, or different biases (Croskerry, 2017). Since reasoning and decision-making are highly interdependent, this study will observe and measure clinical reasoning as it occurs during the process of analysis and justification, leading up to the decision-making and its important outcomes.

Dual-Process Theory and Biases

To guide patient assessments, prehospital nurses use subconscious and conscious thinking in the decision-making process (Perona et al., 2019). This corresponds well to dual-process theory, which divides thinking into system 1 and system 2. System 1 is intuitive, fast, subconscious, automatic, requires low effort, and is used in situations that are familiar. System 2 is slow, conscious, controlled, requires higher effort, and is used in new situations (Evans, 2008). In patients with failing vital functions, system 1 is primarily used, while system 2 is more common in unclear patient presentations (Andersson et al., 2019).

Biases can be defined as systematic errors due to inappropriate mental models or processing limitations, and are more common in intuitive system 1 thinking (Hammond et al., 2021). Biases may negatively affect decision-making in healthcare, with stressful emergency care settings being more vulnerable (Daniel et al., 2017; Saposnik et al., 2016). Cognitive biases that influence the thought process include anchoring, availability bias, confirmation bias, diagnostic momentum, and representativeness restraint. All of these biases may lead to search satisficing and premature closure. Other common biases, which are more related to the clinician’s affective response, include the framing effect and visceral biases (Croskerry, 2003; Prakash et al., 2017). Of all biases, overconfidence is most frequently reported, not only in medical literature but in all fields of decision-making research (Berthet, 2022).

Croskerry et al. (2013) discuss a variety of interventions to counteract the influence of biases. Some research has demonstrated potential benefits of cognitive strategy interventions (Al-Khafaji et al., 2022; Lambe et al., 2016; Mamede et al., 2020; Prakash et al., 2019; Staal et al., 2022), but there are also studies that failed to prove the effect of such interventions (Gopal et al., 2021; O’Sullivan & Schofield, 2019; Sherbino et al., 2011; Sherbino et al., 2014). Furthermore, most of these previous studies on the topic were carried out in a non-clinical setting, with the primary measure being performances on written clinical vignettes. Hence, there are challenges in interpreting and validating these results (Berthet, 2022; Croskerry & Campbell, 2021; Graber et al., 2011; Kahneman & Klein, 2009).

One example of an intervention with the potential to reduce bias is the TWED mnemonic. It is a checklist for metacognitive support, developed and tested by Chew et al. (2016b). The letters in TWED stand for ‘Threat’, ‘What else/What if I’m wrong?’, ‘Evidence’, and ‘Dispositional factors’. The TWED mnemonic has shown potential in increasing diagnostic accuracy (Chew et al., 2016a; Chew et al., 2017, 2017, 2019).

Patient Safety

Improving prehospital assessments and decision-making has a great impact on patient safety (Bigham et al., 2011). Moreover, there is increasing pressure on the emergency medical services, particularly in terms of assessing patients with mental health issues, substance abuse, or socio-economic vulnerabilities (Andrew et al., 2020). The complexity in assessing patients within these categories is well-known, with higher risk of biases such as psych-out error, and additional difficulties regarding intoxicated patients with acute medical conditions (Croskerry & Campbell, 2021; Hoban, 2017; Singh et al., 2017). In Sweden, an estimated 1%–3% of non-conveyance decisions concern patients with an overlooked time-sensitive condition (Magnusson et al., 2022). Concerning patients conveyed to an emergency department with the highest priority (i.e. suspected life-threatening condition), protocol deviations and lapses in care have been identified, potentially compromising patient safety (Hagiwara et al., 2019).

To increase patient safety in emergency care, different types of checklists have been studied, with varying results (Ko et al., 2011). Evidence supports the use of checklists for specific situations or patient groups (Chen et al., 2016; Dryver et al., 2021; Kerner et al., 2017). Similarly, there is a link between correctly assessed prehospital conditions and a higher degree of correctly administered treatment (Ramadanov et al., 2019). However, studies have shown wide variation in the degree of agreement between the prehospital diagnosis and the final diagnosis (Wilson et al., 2019). Hence, checklists for specific conditions or situations may be perfectly designed for aiding decision-making, but they still rely heavily on the assumption that the clinical reasoning process is flawless. If the analysis of patient risk factors and symptom presentation is flawed, and the incorrect medical condition is considered, well-designed specific checklists may misguide the decision-making.

Patient Assessments in Educational Settings

A final consideration pertains to the evaluation and measurement of clinical reasoning and decision-making within the context of patient assessments. When evaluating medical staff in simulation training, several evidence-based tools are available. A large portion of research has been conducted on the Lasater Clinical Judgement Rubric and Script Concordance Tests (Brentnall et al., 2022). These are both based on Likert-scale observational assessments from one or more observers. The Lasater Clinical Judgement Rubric does not evaluate measurable outcomes, and has been associated with reliability issues (Lee, 2021). On the other hand, Script Concordance Tests produce reliable results, but are advanced, time-consuming, and require extensive panels of experts to ensure high-quality scoring (Fournier et al., 2008; Lubarsky et al., 2011). Another well-studied tool is the Health Sciences Reasoning Test, a 33-item multiple choice tool that is mainly used as a pre- and post-test examination (Macauley et al., 2017).

A widely accepted tool for assessment of clinical skills in different health care settings is the Objective Structured Clinical Examination. It was designed to overcome issues with subjective evaluations (Harden et al., 1975). However, this tool offers more of an overall competency assessment than an evaluation of reasoning and decision-making skills, and studies using it also face challenges regarding variations in validity and reliability (Brannick et al., 2011; Daniel et al., 2019; Walsh et al., 2009).

In a prehospital context, the Paramedic Global Rating Scale has recently been translated and validated for use in the Swedish context (Bremer et al., 2020). This Likert-scale observational tool focuses on different dimensions of clinical reasoning and decision-making, but does not assess measurable outcomes (Tavares et al., 2013). Hence, in order to evaluate the effect of a cognitive strategy intervention, such as the TWED mnemonic, on clinical reasoning and the decision-making outcomes, there remains a need for a simple and user-friendly tool.

Methods

Aim

The aim of this study was to assess prehospital nurse specialists’ clinical reasoning and decision-making skills, using a novel combined qualitative and quantitative assessment model, and to investigate the effect of the metacognitive TWED mnemonic.

Study Design

This was an experimental simulation study using a mixed methods convergent parallel crossover design. The quantitative and qualitative results were analysed separately and then discussed in an integrated analysis to demonstrate convergence or divergence (Creswell & Creswell, 2018). Triangulation of the data allows for exploring the divergence of results in mixed-methods research (Williamson, 2005).

Population and Setting

The study was conducted in a region in southern Sweden. Due to regional directives, most of the emergency medical services are staffed with a nurse specialist, working in a team with a registered nurse or an emergency medical technician. At the time of the study, the region also employed experienced nurse specialists as single responders. Nurse specialists therefore play a crucial part in patient assessments within the regional prehospital emergency response.

Triage System, Guidelines, and Educational Courses

The emergency medical services in the region use the Rapid Emergency Triage and Treatment System (RETTS) to support clinical decision-making. RETTS is based on emergency symptoms and signs (ESS), which could be viewed as the patient’s chief complaint (e.g. chest pain, dyspnoea, back pain, vomiting, and intoxication). ESS, in combination with vital signs, assigns each patient a triage colour. The triage colour, or priority level, is based on the severity and time-criticality of the patient’s condition, where red is the highest level, followed by orange, yellow, green, and blue (Widgren & Jourak, 2011). This five-level triage system sorts patients into stable and unstable, with red and orange (i.e. priority levels 1 and 2) representing the unstable, time-critical conditions (Wireklint et al., 2018).
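
The colour-to-level mapping and the stable/unstable split described above can be written down compactly. The following Python sketch is illustrative only: the names PRIORITY_LEVEL and is_unstable are hypothetical, and the sketch does not reproduce the RETTS algorithm itself, which derives the colour from ESS and vital signs.

```python
# Illustrative mapping of RETTS triage colours to priority levels,
# as described in the text (red = 1 ... blue = 5). Names are hypothetical.
PRIORITY_LEVEL = {"red": 1, "orange": 2, "yellow": 3, "green": 4, "blue": 5}


def is_unstable(colour: str) -> bool:
    """Return True for the unstable, time-critical block (levels 1 and 2)."""
    return PRIORITY_LEVEL[colour.lower()] <= 2
```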

Prehospital nurses in the region are expected to work according to RETTS and digitally available treatment guidelines. The treatment guidelines consist of care programmes, information sheets, and directives, based upon the national treatment guidelines for the ambulance organisations (Föreningen för Ledningsansvariga inom Svensk Ambulanssjukvård [FLISA], 2017). The regional guidelines also provide guidance on when to consult a physician for additional decision-support, but the nurse is expected to conduct advanced initial patient assessments. Regarding non-conveyance decisions, some guidelines exist, but nurses are not obligated to consult a physician as a general rule. Employees gradually receive training in specific courses, such as Pre-hospital Trauma Life Support (PHTLS) and Advanced Medical Life Support (AMLS) (NAEMT, 2017, 2020).

Data Collection

The acquisition of data was carried out at a regional training unit during the spring of 2023. On two occasions during a week, 15 nurse specialists were observed in simulations performing two assessments each, and upon completion of each assessment they received a reflection assignment. A total of 30 assessments and 30 reflections were performed. Two participants were excluded due to protocol deviations; thus, information was collected from 26 observed simulations and 26 reflections from 13 nurse specialists. The participants represented nearly all the ambulance districts of the region.

The training unit provided a staged apartment and the equipment necessary to make the simulation as realistic as possible. Actors were recruited to play the roles of patients and relatives and were explicitly instructed on how to respond to the nurses’ questions and assessments. As in reality, each simulation started with a short dispatch notice, to recreate an atmosphere as close as possible to a real-life patient assessment.

Experimental Design

To address the research question, an experimental crossover design was implemented. A summary of the study background and aims was distributed to all ambulance stations in the region, inviting participation in a research study as part of a regional educational training day. A convenience sample of 15 nurse specialists, employed by the regional ambulance organisation or any of its private partners, was included in the study. Any specialist subtype was accepted, but registered nurses without a specialist degree were excluded from participation.

On-site, a stratified block-randomisation process was conducted, initially dividing participants into two groups with similar prehospital clinical experience. Each participant completed two simulation scenarios with two different patient cases. After completing the first patient case, all nurses received brief training on the metacognitive TWED mnemonic, after which the groups switched cases (Figure 1).

Figure 1.

Experimental flow chart. * One participant per group excluded due to protocol deviations (final sample, N = 13). ** Case 2 with support from a metacognitive mnemonic. AMI = Acute myocardial infarction. DKA = Diabetic ketoacidosis.
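
The randomisation procedure is not described in further detail. One common way to carry out a stratified block randomisation of this kind is sketched below in Python, assuming (hypothetically) that participants are ranked by years of prehospital experience, grouped into blocks of two neighbours, and randomly split within each block. The function name, the block size, and the example data are all assumptions made for illustration.

```python
import random


def stratified_block_randomise(experience_by_id, seed=None):
    """Split participants into two groups with similar experience.

    Hypothetical procedure: rank by years of experience, form blocks of
    two neighbours, and randomly assign one member of each block to each
    group. A leftover participant (odd sample size) is assigned at random.
    """
    rng = random.Random(seed)
    ranked = sorted(experience_by_id, key=experience_by_id.get)
    group_a, group_b = [], []
    for i in range(0, len(ranked), 2):
        block = ranked[i:i + 2]
        rng.shuffle(block)
        if len(block) == 2:
            group_a.append(block[0])
            group_b.append(block[1])
        else:
            rng.choice((group_a, group_b)).append(block[0])
    return group_a, group_b


# Made-up participant IDs and years of experience, for illustration only
years = {"P01": 4, "P02": 25, "P03": 9, "P04": 11, "P05": 7, "P06": 17}
group_a, group_b = stratified_block_randomise(years, seed=1)
```

Because each block contains participants with neighbouring experience, the two groups end up with similar experience profiles regardless of how the individual coin flips fall.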

The Patient Cases

The acute myocardial infarction (AMI) case consisted of a patient with chronic back pain and risk factors for cardiac disease, presenting with a change in his back pain due to a posterior myocardial infarction. The key to solving the case was to make a second attempt to obtain a readable ECG, since the first ECG was poorly executed due to motion artefacts. The diabetic ketoacidosis (DKA) case consisted of a patient with type 1 diabetes who was vomiting and experiencing diabetic ketoacidosis. The key to solving the case was to check glucose and ketone levels. Potential biases were induced to interfere with the nurses’ reasoning processes, with more surrounding distractions in the diabetic ketoacidosis case (Appendix 2). The scenarios were inspired by recurrent assessment situations with risks of delayed treatment and where the final diagnosis correlated to a specific treatment guideline. The regional chief medical physician provided anonymous data for the synthesis of the cases.

The participants started off with receipt of the dispatch notice and were asked to proceed to the assessment as usual. The notice contained both relevant and distracting information (Appendixes 1 and 2). There was bedside digital access to RETTS and other implemented guidelines.

Educational Setup and Use of the Metacognitive Mnemonic

After completion of the first case assessment and written reflection, the participants were all introduced to the concepts of clinical reasoning and decision-making, as well as cognitive and emotional biases, along with the Swedish translation of the TWED mnemonic (Appendix 3). The short introduction lasted for 20 minutes and included definitions of clinical reasoning and decision-making, causes of distortion of thinking and reasoning (based on Croskerry [2017], regarding cognitive miserliness, mindware gaps, and mindware contamination), an introduction to cognitive and emotional biases, as well as contextual factors and stressors, and the metacognitive TWED mnemonic. The participants all received a pocket card with the TWED mnemonic on it. They were asked to use this when they assessed the patient in their second case. If the use of the pocket card was not observed, the participants were asked to reassess their final decision with the support of the TWED mnemonic. In most cases, the nurses were observed to explicitly use the TWED mnemonic in the later part of the assessment situation, whereas a few nurses used it earlier in the process.

Instrument and Outcome Measures

Qualitative Outcome Measure

The qualitative outcome measure involved analysing written reflections following completion of the respective patient case assessments. At this stage, participants were unaware of their performance. They were asked to describe the rationale and justify their choice of field diagnosis/ESS, priority, level of care, actions, and treatment. To triangulate the assessments, the questions were chosen to act as analogues for the quantitative measure described below. The Swedish translation of the Structure of Observed Learning Outcomes (SOLO) taxonomy (Swedish National Agency for Education, 2011) was used as support for the analysis. It was adapted to a clinical reasoning context by replacing ‘student’ with ‘nurse’ and by adding a short explanatory text providing the context of clinical reasoning (Appendix 4). The SOLO taxonomy consists of five levels of understanding: prestructural, unistructural, multistructural, relational, and extended abstract. From a clinical reasoning perspective, the emphasis is placed on the ability to effectively manage data, particularly in navigating conflicting evidence. It has been used in various types of research related to the assessment of knowledge, critical thinking, and clinical reasoning (Chrismawaty et al., 2023; Hazel et al., 2002; Ilgüy et al., 2014; Lucander et al., 2010; Pinto et al., 2016).

Quantitative Outcome Measure

The primary quantitative outcome measure was a point-based scoring of clinical decision-making, objectively measured under observation through predefined answer keys (Appendix 5). The 5-item scoring model was based on (1) differentials, (2) field diagnosis/ESS, (3) priority, (4) level of care, and (5) actions and treatment. The maximum score was 12 in the acute myocardial infarction case and 10 in the diabetic ketoacidosis case, with the possibility of a negative score in both cases (Table 1).

Table 1.

Quantitative Point-Based Scoring Model.

Moment	−1 point	0 points	1 point	2 points	Answer key^a
Differentials	—	Less than 3 differentials	At least 3 differentials	In addition, connects actions or clinical reasoning to at least 3 differentials	1
Field diagnosis/ESS	—	Incorrect field diagnosis/ESS	ESS correct to final diagnosis	Same as final diagnosis	2
Priority	Incorrect priority block +/− ≥ 2 levels	Incorrect priority block +/− 1 level	Correct priority block 1–2/3–4	Correct priority	3
Level of care	Non-conveyance; true priority 1 or 2	—	Correct level of care	Specific pathway, if applicable	4
Actions & treatment	Per harmful action	—	Per action according to treatment guideline	—	5
Participant:Case				Maximum score on case =	Score =

^aPredefined answer keys for the respective cases are available in Appendix 5.
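
To make the scoring rules in Table 1 concrete, the following Python sketch computes a total score from one observed assessment. It is an illustrative reading of the table, not the authors' instrument: the class and field names are hypothetical, the case-specific answer keys (Appendix 5) are not reproduced, and combinations for which the table gives no points value are assumed here to score 0.

```python
from dataclasses import dataclass


def priority_block(level: int) -> str:
    """RETTS priority block: levels 1-2 are unstable, 3-5 are stable."""
    return "unstable" if level <= 2 else "stable"


@dataclass
class Assessment:  # hypothetical record of one observed simulation
    n_differentials: int         # differentials mentioned
    n_differentials_linked: int  # differentials tied to actions/reasoning
    diagnosis_match: str         # "incorrect", "ess" or "final"
    assigned_priority: int       # 1-5
    true_priority: int           # 1-5
    conveyed: bool               # False = non-conveyance
    correct_level_of_care: bool
    specific_pathway: bool       # specific care pathway chosen, if applicable
    guideline_actions: int       # actions according to treatment guideline
    harmful_actions: int


def total_score(a: Assessment) -> int:
    # Differentials: 0 / 1 / 2 points
    if a.n_differentials < 3:
        differentials = 0
    elif a.n_differentials_linked >= 3:
        differentials = 2
    else:
        differentials = 1

    # Field diagnosis/ESS: 0 / 1 / 2 points
    diagnosis = {"incorrect": 0, "ess": 1, "final": 2}[a.diagnosis_match]

    # Priority: -1 / 0 / 1 / 2 points
    if a.assigned_priority == a.true_priority:
        priority = 2
    elif priority_block(a.assigned_priority) == priority_block(a.true_priority):
        priority = 1
    elif abs(a.assigned_priority - a.true_priority) == 1:
        priority = 0
    else:
        priority = -1

    # Level of care: -1 / 1 / 2 points (other combinations assumed to be 0)
    if not a.conveyed and a.true_priority <= 2:
        care = -1
    elif a.correct_level_of_care and a.specific_pathway:
        care = 2
    elif a.correct_level_of_care:
        care = 1
    else:
        care = 0

    # Actions and treatment: +1 per guideline action, -1 per harmful action
    actions = a.guideline_actions - a.harmful_actions

    return differentials + diagnosis + priority + care + actions
```

Because harmful actions, non-conveyance of an unstable patient, and a priority that misses the correct block by two or more levels each subtract a point, the total can become negative, as noted in the text.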

To further validate and strengthen the reliability of the scoring model, it was supplemented with an ordinal grading system representing the risks of potential patient harm. This grading was based on the major elements of the point-based model – that is, field diagnosis/ESS, priority, and level of care (Table 2).

Table 2.

Ordinal Risk Colour Grading Model.

Risk colour	Conveyance decision	Risk of potential harm to the patient
Red	Non-conveyance (when true priority level was 1–2 and final diagnosis required immediate actions)	High risk
Orange	Conveyance to ED, but incorrect field diagnosis/ESS and insufficient priority level (3–5 when true priority level was 1–2)	Moderate risk
Yellow	Conveyance to ED + correct field diagnosis/ESS or correct priority level	Low risk
Green	Correct level of care^a + correct field diagnosis/ESS + correct priority level	No risk

^aConveyance to ED in the DKA case and specific care pathway to percutaneous coronary intervention in the AMI case.

AMI = Acute myocardial infarction.

DKA = Diabetic ketoacidosis.

ED = Emergency department.

ESS = Emergency symptoms and signs (i.e. chief complaint).
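
The grading in Table 2 can likewise be expressed as a small decision function. The Python sketch below is one possible reading of the table and not the authors' implementation: the function and parameter names are hypothetical, the true priority level is assumed to be 1–2 with a diagnosis requiring immediate action (as in both study cases), and an assigned priority of 1 or 2 is assumed to count as correct while 3–5 counts as insufficient.

```python
def risk_colour(conveyed: bool, correct_level_of_care: bool,
                diagnosis_correct: bool, assigned_priority: int) -> str:
    """Grade the risk of potential patient harm for a time-critical case.

    Assumes the true priority level is 1-2, so an assigned priority of
    1 or 2 is treated as correct and 3-5 as insufficient.
    """
    priority_correct = assigned_priority <= 2
    if not conveyed:
        return "red"     # high risk: non-conveyance of a time-critical patient
    if correct_level_of_care and diagnosis_correct and priority_correct:
        return "green"   # no risk
    if diagnosis_correct or priority_correct:
        return "yellow"  # low risk
    return "orange"      # moderate risk: wrong diagnosis and insufficient priority
```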

Rationale Behind the Point-Based Scoring Model

One of the main reasons for developing a novel assessment tool was to measure important outcomes. Instead of Likert-scale evaluation, or other observational methods, the use of clinical outcome measures may be more relevant when assessing true effects of an intervention. However, no scientific consensus exists on what is considered important outcomes (Kersting et al., 2020). The authors decided on outcomes related to clinical decisions and patient safety. This resulted in the point-based scoring being complemented by the ordinal risk colour grading model. The scoring model included differential diagnosis, determination of correct medical condition, assessment of severity and level of care, as well as decisions on actions and treatment.

The model was based on the assumption that the initial assessment affects the level of care and treatment (FitzGerald & Hurst, 2017). Diagnostic agreement was graded in correlation with the final diagnosis as incorrect, correct ESS (i.e. chief complaint), or correct diagnosis, inspired by Magnusson et al. (2018). Priority level (i.e. the severity of the medical condition) was based on the distinction between stable and unstable conditions in priority blocks. This distinction includes priority level 3 to 5 versus priority level 1 or 2 according to RETTS (Wireklint et al., 2018). It was complemented with the Lund Outcome Set for Evaluation of Triage (LOSET), developed by Johansson et al. (2023), which is a set of 49 clinical outcomes linked to priority level 1 and/or 2 (e.g. receiving acute percutaneous coronary intervention, or that the patient’s pH was measured under 7.3, as examples of priority level 1 correlating to the myocardial infarction and diabetic ketoacidosis cases in the study). The level of care was linked to the priority level according to RETTS/LOSET, as well as to implemented regional guidelines for specific care pathways (i.e. care pathway to percutaneous coronary intervention). Actions and treatment were based on the implemented regional guidelines for the correct diagnosis in each patient case, as well as potentially harmful treatments available in other sections of the guidelines. The elements of the scoring model were reviewed by the main author of the LOSET study and the regional chief medical ambulance physician.

The Metacognitive Mnemonic

The TWED mnemonic (Chew et al., 2016a) was translated into Swedish following the general steps recommended for translating instruments. This included two separate translations, a discussion leading to synthesis, a blind back-translation to the original language, and a comparison with the original version. Subsequently, consensus was reached on a pilot-tested and refined version (Sousa & Rojjanasrirat, 2011). Contextual and linguistic support throughout the process was provided by the main author of the LOSET study, an English-born nurse, and an American-born PhD (Appendix 3).

Pilot Test

Prior to completing the combined qualitative and quantitative assessment model, as well as to test the translated TWED mnemonic, a pilot test was conducted at an emergency department as a small randomised controlled experiment (N = 6). Two groups performed the same patient case assessment in a simulation setting, with Group 2 having support from the metacognitive TWED mnemonic. The case consisted of a patient with several risk factors for cardiac disease presenting at the emergency department with a sudden onset of fatigue. The nurses had bedside access to the implemented triage guidelines. The maximum total score was 11, and there was the possibility of a negative score. The SOLO taxonomy was used in the analysis of the clinical reasoning from the written reflections. The reflections were first analysed separately and then compared, followed by a discussion to reach consensus.

The median total score differed between the two groups (3 vs. 7) in favour of Group 2, but the difference did not reach statistical significance. Nurses with a higher level of clinical reasoning (i.e. a higher SOLO taxonomy level) had superior clinical decision-making (i.e. a higher total score). The correlation was strong and statistically significant (r_s = .91, N = 6, p = .012). Similarly, nurses with a greater level of diagnostic agreement with the final diagnosis had a higher total score (r_s = .89, N = 6, p = .017). The SOLO taxonomy level did not differ between the groups and ranged from a multistructural level 3 to a relational level 4a. These results were used to perform a power calculation of the study design using repeated measures, with group differences regarding the metacognitive mnemonic in mind. The calculated total sample size was 15 using G*Power (version 3.1). It was based on matched pairs and the Wilcoxon signed-rank test, with a large effect size of 0.8, a standard deviation of 3, alpha level of .05, and a power of 80%.
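
For readers less familiar with the rank correlation reported above, the following Python sketch computes Spearman's r_s as the Pearson correlation of average ranks, which also handles tied values. The function names are hypothetical and the data in the usage example are invented; they do not come from the pilot test.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented ordinal reasoning levels and total scores, for illustration only
reasoning_levels = [1, 2, 2, 3, 4, 4]
total_scores = [2, 3, 5, 6, 8, 9]
rs = spearman_rs(reasoning_levels, total_scores)
```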

After the pilot, three points were included for determining the correct care pathway to secure vertical and horizontal consistency in the point-based scoring model. This was due to three points being given if the field diagnosis was in line with the final diagnosis, which correlates to three points for the care pathway to percutaneous coronary intervention in the acute myocardial infarction case. Points were also added for differentials and associated reasoning/examinations, and the questions in the reflection task were clarified to generate more depth. The same main questions remained (rationale and justification of field diagnosis/ESS, priority, level of care, and lastly actions and treatment), but they were reworded to build upon each other (e.g. ‘Based on your choice of field diagnosis/ESS, justify your choice of priority level’; ‘Based on your field diagnosis and priority level, justify your choice of care level’). These changes were made to generate deeper reasoning, since the pilot test generated limited text content for analysis. In the main study, the number of differentials with associated actions and reasoning needed to be included with the qualitative data on the case report forms. Otherwise, limited text content could produce a less reliable assessment of the nurses’ actual reasoning capabilities regarding differential diagnosis.

Data Analysis

The SOLO taxonomy of the written reflections was assessed individually by O.F. and U.H., and then compared. If the assessment differed, a discussion took place that led to a consensus. Kappa statistics on inter-rater agreement were calculated. Inter-rater agreement on the SOLO taxonomy of the 26 written reflections was moderate to substantial (.586 and .668 on the sublevels and main levels, respectively). After discussion, a complete consensus was reached on all levels. Observations and scoring were made by O.F. and U.H.
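
Cohen's kappa corrects the observed agreement between two raters for the agreement expected by chance. A minimal Python sketch follows. The text does not state whether a weighted or an unweighted kappa was used, so the unweighted form is assumed here; the function name is hypothetical and the ratings in the usage example are invented, not the study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical ratings."""
    n = len(rater_a)
    # Proportion of items on which the two raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Invented SOLO main levels assigned by two raters to four reflections
kappa = cohens_kappa(["3", "3", "4", "4"], ["3", "3", "4", "3"])
```

In this invented example the raters agree on three of four reflections (observed agreement .75), chance agreement is .50, and kappa is therefore .50.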

The data analysis was carried out using SPSS version 28. Demographic data were analysed by comparing medians, due to skewed distribution in continuous variables, and counts between groups. Total score, risk colour, and levels of SOLO taxonomy and diagnostic agreement were considered ordinal variables; therefore, non-parametric tests were used. The Wilcoxon signed-rank test and McNemar’s test were used for within-group analysis, while the Mann–Whitney U test, Kruskal–Wallis test, chi-square, or Fisher’s exact test were used for between-group analysis. Spearman’s rank test was used for the correlation analysis.

Results

Demographics

Data were collected from 26 simulations and reflections. The 13 participants, divided into 2 groups depending on whether they started with the acute myocardial infarction case or the diabetic ketoacidosis case (AMI/DKA vs. DKA/AMI), had a median age of 46 years and a median of 11 years of prehospital experience. There was a notable but non-significant gender difference between the groups (Table 3).

Table 3.

Demographic Characteristics of Participants.

Characteristics	AMI/DKA		DKA/AMI		Full sample
Characteristics	n	%	n	%	n	%
Age
30–38 years old	1	14.3	3	50.0	4	30.8
43–49 years old	4	57.1	1	16.7	5	38.4
51–55 years old	2	28.6	2	33.3	4	30.8
Years of experience in prehospital care
4–7 years	3	42.9	1	16.7	4	30.8
9–11 years	1	14.3	3	50.0	4	30.8
17–25 years	3	42.9	2	33.3	5	38.4
Years as a registered nurse
6–11 years	2	28.6	2	33.3	4	30.8
12–18 years	3	42.9	2	33.3	5	38.4
22–30 years	2	28.6	2	33.3	4	30.8
Gender^a
Female	1	14.3	4	66.7	5	38.4
Male	6	85.7	2	33.3	8	61.5
Active as single responder	3	42.9	5	83.3	8	61.5
Specialist subtype
Prehospital care	6	85.7	4	66.7	10	76.9
Other^b	1	14.3	2	33.3	3	23.1
Completed educational courses
AMLS	4	57.1	4	66.7	8	61.5
PHTLS	5	71.4	3	50.0	8	61.5

Note. N = 13. Continuous variables were binned into equal-sized groups. Median age was 46 years (Range = 25) and did not differ significantly between groups. Neither did years of experience in prehospital care (Median = 11, Range = 21) or years as a registered nurse (Median = 16, Range = 24).

^aNon-significant notable difference between groups (p = .103).

^bOther subtypes represented were intensive care and anaesthesiology.

AMI = Acute myocardial infarction.

AMI/DKA = The group that started with the AMI case, followed by the DKA case.

DKA = Diabetic ketoacidosis.

DKA/AMI = The group that started with the DKA case, followed by the AMI case.

AMLS = Advanced medical life support.

PHTLS = Prehospital trauma life support.

Qualitative Results

The SOLO taxonomy of the written reflections ranged from a single unistructural reasoning to 10 relational reasonings. The remaining 15 reflections were on a multistructural level. In the diabetic ketoacidosis case, nurses with 9–11 years of experience in prehospital care reasoned on a relational level, compared to multistructural-level reasoning among the other nurses (p = .01). No other group or demographic differences were found. Regarding the investigated effect of the metacognitive TWED mnemonic, there was a slight overall decrease in the median taxonomy level from a multistructural level 3a to a multistructural level 3 from the first to the second case (p = .124). This decrease occurred in both groups, with no regard to order of patient cases.

Table 4 contains quotes representing the different levels revealed. Note participant #13, whose quote is representative of multistructural level 3 reasoning, where conflicts were ignored and a decision on non-conveyance was made. Also note participant #5, whose quote is representative of the awareness of, but inability to interpret and explain, conflicting data (i.e. multistructural level 3a). The ability to handle conflicting data represents the relational reasoning levels (see participants #9 and #4).

Table 4.

Quotes Representing SOLO Taxonomy Levels Revealed in the Results.

Participant	SOLO	Quote
#7	2a	‘Back pain without trauma […]. Describes a slow onset […]. After palpation muscular pain without radiation […]. No loss of control over bladder or bowel, no saddle anaesthesia’.
#13	3	‘Diffuse symptoms but the patient’s perceived discomfort is vomiting […]. Alternatively abdominal pain […]. ESS abdominal pain or vomiting will make the patient a yellow priority, but respiratory rate 28 makes the patient orange […]. Relatively unaffected patient, but vital signs stand out and a somewhat diffuse medical history […]. I assess that primary care can be a good start […]. Affected breathing, mainly high respiratory rate, no action taken […]. Pulse slightly elevated, due to dehydration from vomiting/withdrawal/drugs? Due to pain? Hyperglycaemia? Alert but affected? […]. High blood glucose […]. No fever’.
#5	3a	‘I chose between ESS nausea/vomiting (ESS 8), hyperglycaemia (ESS 49), intoxication (ESS 40), and chest pain (ESS 5) […]. I decided that ESS 5 is the one that needs to be assessed/ruled out first. […]. ECG showed some deviations […]. Certainly, all “noise” can have an impact, but it feels important not to neglect these symptoms […]. Of course, the patient should be assessed and tested at the hospital […]. I chose to treat according to the guidelines for chest pain […]. Should be safe for the patient despite alcohol/mushroom intoxication’.
#9	4	‘Through information gathering. What do I see in front of me? What has happened? What underlying health conditions does the patient have? What deviates from normal? I Look at the ECG and request a new one since the first one could not be interpreted […]. Once I have determined that there is an ongoing ischemia, the patient requires acute cardiac care […]. It can be tricky to encounter a seemingly unaffected patient, as one can easily be misled’.
#4	4a	‘Patient has diabetes, has been vomiting since yesterday […]. Slightly high blood glucose and ketones over 3 […]. Rapid respiratory rate […]. Unclear regarding food and drink intake as well as insulin […]. Some drugs and drinking, history is somewhat unreliable […]. ESS 49 (hyperglycaemia), the patient is priority 2, as the patient has ketones >3 and respiratory rate >25 […]. Considering intoxication, but since the patient is GCS 15 and does not seem very drug influenced, I stick to the diabetes hypothesis […]. Needs admission with monitoring […]. Does the patient vomit due to diabetes or another cause? […]. Needs immediate emergency care’.

Note. This table contains quotes from participants representing the different SOLO taxonomy levels revealed in the results.

ECG = Electrocardiogram.

ESS = Emergency symptoms and signs (i.e. chief complaint).

GCS = Glasgow coma scale.

PCI = Percutaneous coronary intervention.

SOLO = Structure of observed learning outcomes.

2a = Two aspects of the case are discussed, but several aspects are missing and the nurse is unable to draw a correct conclusion.

3 = Multiple aspects of the case are discussed, but conflicts are ignored and a simplified conclusion is drawn.

3a = Multiple aspects of the presentation and conflicts are discussed but then overlooked, and a simplified conclusion is drawn.

4 = Broad reasoning based on aspects from several different types of data, including reflections on conflicts and inconsistencies.

4a = In addition to level 4, the premature nature of the field diagnosis is also suggested.

Quantitative Results

The median scores on the point-based model were 9 out of 12 in the acute myocardial infarction case, and 6 out of 10 in the diabetic ketoacidosis case. Nurses with 9–11 years of prehospital experience performed better on the diabetic ketoacidosis case; otherwise, no significant differences were found between groups or demographics (Table 5).

Table 5.

Demographic Characteristics and Total Score (TS).

Participants	AMI case median TS (range)	DKA case median TS (range)
Age
30–38 years old	10 (3–12)	6 (4–8)
43–49 years old	7 (3–12)	8 (6–10)
51–55 years old	7 (−2–10)	5 (3–6)
Years of experience in prehospital care
4–7 years	8 (−2–10)	4 (4–7)
9–11 years	6 (3–10)	9 (8–10)^a
17–25 years	9 (4–12)	6 (3–8)
Years as a registered nurse
6–11 years	8 (3–12)	6 (4–8)
12–18 years	9 (−2–10)	8 (4–10)
22–30 years	9 (4–12)	6 (3–8)
Gender
Female	9 (3–12)	8 (4–10)
Male	8 (−2–12)	6 (3–10)
Active as single responder
Yes	8 (3–12)	7 (3–10)
No	10 (−2–12)	4 (4–10)
Specialist subtype
Prehospital care	6 (−2–12)	7 (4–10)
Other^b	9 (9–10)	5 (3–8)
Completed AMLS
Yes	8 (3–12)	6 (3–10)
No	9 (−2–10)	8 (4–10)
Completed PHTLS
Yes	10 (3–12)	6 (3–10)
No	4 (−2–12)	7 (4–10)

Note. N = 13. Continuous variables were binned into equal-sized groups. The maximum total score in the AMI case was 12 points, with a median total score of 9 points (range = 14, from −2 to 12).

The maximum total score in the DKA case was 10 points, with a median total score of 6 points (range = 7, from 3 to 10).

^aStatistically significant difference between the experience groups (p = .024).

^bOther specialist subtypes represented were intensive care and anaesthesiology.

AMI = Acute myocardial infarction.

AMLS = Advanced medical life support.

DKA = Diabetic ketoacidosis.

PHTLS = Prehospital trauma life support.

Overall, 50% of the assessments resulted in the combination of a correct field diagnosis/ESS, correct priority, and correct level of care. This resulted in a categorisation of green risk colour, indicating no risk of potential harm to the patient. Only three out of thirteen nurses managed both their cases at this level (Table 7). Twenty-one out of the twenty-six assessments posed no or low risk of potential patient harm, with the remaining five assessments posing moderate or high risk. Four of these five assessments concerned the diabetic ketoacidosis case, equalling 31% of the assessments in this case.

The assessments in the acute myocardial infarction case yielded four different field diagnoses/ESS, of which acute myocardial infarction, dyspnoea (ESS 4), and chest pain (ESS 5) were considered correct. The diabetic ketoacidosis case yielded six different field diagnoses/ESS, of which diabetic ketoacidosis and hyperglycaemia (ESS 49) were considered correct. In the acute myocardial infarction case, only one assessment was considered incorrect (back pain; ESS 14), whereas several assessments in the diabetic ketoacidosis case were considered incorrect (Table 6). The difference in the number of incorrect field diagnoses/ESS between the cases was statistically significant (p = .016).

Table 6.

Variation in Field Diagnosis/ESS per Case.

Case	n	%
AMI case
AMI^a	5	38.5
ESS 4^b (shortness of breath)	2	15.4
ESS 5^b (chest pain)	5	38.5
ESS 14 (back pain)	1	7.7
DKA case
DKA^a	4	30.8
ESS 5 (chest pain)	3	23.1
ESS 6 (abdominal pain)	3	23.1
ESS 8 (vomiting)	1	7.7
ESS 40 (intoxication)	1	7.7
ESS 49^b (hyperglycaemia)	1	7.7

Note. Between the AMI case and the DKA case, there was a statistically significant difference regarding correct versus incorrect field diagnosis or ESS, meaning that significantly fewer participants in the DKA case chose the correct field diagnosis or ESS (p = .016).

^aField diagnosis correct to final diagnosis.

^bESS correct to final diagnosis.

AMI = Acute myocardial infarction.

DKA = Diabetic ketoacidosis.

ESS = Emergency symptoms and signs (i.e. chief complaint).

There was a significant positive correlation between diagnostic agreement and total score, meaning a closer match to final diagnosis resulted in better clinical outcomes of the decision-making and less potential risk of patient harm (Scatter plot 1). In the acute myocardial infarction case, the correlation coefficient was .84 and in the diabetic ketoacidosis case it was .857 (p < .001).

Scatter plot 1.

DKA case correlation of total score and level of diagnostic agreement. Note. N = 13, r_s (11) = .857, r² = .734, p < .001. The maximum total score was 10 points. The interpolation line shows the mean value of the total score. The level of diagnostic agreement was calculated from the point-based model answer key as follows: 2 points for DKA; 1 point for ESS 49 (Hyperglycaemia); 0 points for all others. DKA = Diabetic ketoacidosis. ESS = Emergency symptoms and signs (i.e. chief complaint).

Lastly, in the acute myocardial infarction case, 12 out of 13 nurses performed an initial ECG, whereas 8 performed a second (readable) ECG, this being the key action to solve the case. Regarding key actions in the diabetic ketoacidosis case, 10 out of the 13 nurses initially checked the plasma glucose level, whereas 6 checked the plasma ketone levels. The number of ECGs performed in the diabetic ketoacidosis case differed significantly between the AMI/DKA and the DKA/AMI group. All the participants who first assessed the patient with an acute myocardial infarction also performed an ECG in the diabetic ketoacidosis case. In the other group, only two participants performed an ECG (p = .021).

Validation of the Point-Based Assessment Model

The results showed a strong positive correlation between total score and risk colour in both cases. Nurses with superior clinical decision-making (i.e. higher scores) posed less risk of potential patient harm (i.e. superior risk colour), overall validating the point-based scoring model from a patient safety point of view (Scatter plots 2 and 3). The correlations were statistically significant, and the strongest correlation was found in the diabetic ketoacidosis case (r_s = .942).

Scatter plot 2.

AMI case correlation of total score and risk colour. Note. N = 13; r_s (11) = .867, r² = .752, p < .001. Maximum total score was 12 points. The interpolation line shows the mean value of the total score. AMI = Acute myocardial infarction. ED = Emergency department. ESS = Emergency symptoms and signs (i.e. chief complaint).

Scatter plot 3.

DKA case correlation of total score and risk colour. Note. N = 13; r_s (11) = .942, r² = .887, p < .001. Maximum total score was 10 points. The interpolation line shows the mean value of the total score. DKA = Diabetic ketoacidosis. ED = Emergency department. ESS = Emergency symptoms and signs (i.e. chief complaint).
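As a quick arithmetic check, each r² in the scatter plot notes is the square of the reported Spearman coefficient. The sketch below recomputes those values using only the coefficients quoted in the notes to Scatter plots 1 to 3. The dictionary labels and the helper name `r_squared` are illustrative, not taken from the study.

```python
# Reported (r_s, r-squared) pairs quoted in the notes to Scatter plots 1-3.
reported = {
    "DKA: total score vs diagnostic agreement": (0.857, 0.734),
    "AMI: total score vs risk colour": (0.867, 0.752),
    "DKA: total score vs risk colour": (0.942, 0.887),
}


def r_squared(r_s):
    """Square a correlation coefficient and round to three decimals."""
    return round(r_s ** 2, 3)


# Recompute r-squared from each reported coefficient.
recomputed = {label: r_squared(r_s) for label, (r_s, _) in reported.items()}

for label, (r_s, r2) in reported.items():
    print(f"{label}: r_s = {r_s:.3f}, reported = {r2:.3f}, "
          f"recomputed = {recomputed[label]:.3f}")
```

All three recomputed values match the reported ones to three decimals.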

Effect of the Metacognitive Mnemonic

There were no statistically significant changes in overall performance between the first and second cases or between the groups. This includes total score, risk colour, and diagnostic agreement. There was a non-significant decrease in the median total score, from 8 points to 6 points, between the first and second cases (Table 7). Similarly, the risk colour changed from green to yellow, indicating a shift in the median potential risk of patient harm from non-existent to low. However, this change was also not statistically significant (p = .783 for total score, p = .774 for risk colour).

Table 7.

Total Score per Participant and Change From Case 1 to Case 2.

Participant	Case 1 Total score	Case 2 Total Score	Change	Group
#1^a	10	10	±0	AMI/DKA
#2	4	12	+8	DKA/AMI
#3^a	12	8	−4	AMI/DKA
#4	10	3	−7	DKA/AMI
#5	4	7	+3	AMI/DKA
#6	3	10	+7	DKA/AMI
#7	−2	4	+6	AMI/DKA
#8	8	3	−5	DKA/AMI
#9	7	6	−1	AMI/DKA
#10^a	8	9	+1	DKA/AMI
#11	9	5	−4	AMI/DKA
#12	6	4	−2	DKA/AMI
#13	11	4	−7	AMI/DKA

Note. N = 13. The maximum score was 12 and 10 points for the AMI case and DKA case, respectively. Median total score in Case 1 was 8 (range = 14), with green risk colour, and median total score in Case 2 was 6 (range = 9), with yellow risk colour. The differences were non-significant (p = .783 for total score, p = .774 for risk colour).

^aParticipants #1, #3, and #10, whose patient assessments generated a green risk colour in both cases.

AMI = Acute myocardial infarction

AMI/DKA = The group that started with the AMI case, followed by the DKA case

DKA = Diabetic ketoacidosis

DKA/AMI = The group that started with the DKA case, followed by the AMI case
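The summary values in the note to Table 7 can be recomputed from the per-participant scores. The following is a minimal sketch using only Python's standard library. It assumes that participant #7's Case 1 score is −2, which is the value implied by the reported change of +6 and the stated Case 1 range of 14. The variable and function names are illustrative.

```python
from statistics import median

# (participant, Case 1 score, Case 2 score, group), transcribed from Table 7.
# Assumption: participant #7's Case 1 score is -2, as implied by the reported
# change of +6 and the Case 1 range of 14 given in the table note.
table7 = [
    (1, 10, 10, "AMI/DKA"),
    (2, 4, 12, "DKA/AMI"),
    (3, 12, 8, "AMI/DKA"),
    (4, 10, 3, "DKA/AMI"),
    (5, 4, 7, "AMI/DKA"),
    (6, 3, 10, "DKA/AMI"),
    (7, -2, 4, "AMI/DKA"),
    (8, 8, 3, "DKA/AMI"),
    (9, 7, 6, "AMI/DKA"),
    (10, 8, 9, "DKA/AMI"),
    (11, 9, 5, "AMI/DKA"),
    (12, 6, 4, "DKA/AMI"),
    (13, 11, 4, "AMI/DKA"),
]

case1 = [row[1] for row in table7]
case2 = [row[2] for row in table7]
# Change column: Case 2 score minus Case 1 score for each participant.
changes = [second - first for first, second in zip(case1, case2)]


def summarise(scores):
    """Return (median, range) for a list of total scores."""
    return median(scores), max(scores) - min(scores)


case1_median, case1_range = summarise(case1)
case2_median, case2_range = summarise(case2)
print("Case 1:", case1_median, case1_range)
print("Case 2:", case2_median, case2_range)
```

Under that assumption the sketch gives a median of 8 (range 14) for Case 1 and a median of 6 (range 9) for Case 2, in line with the table note. It does not reproduce the significance tests, which the authors ran in SPSS.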

There was a significant increase in the number of differential diagnoses between the first and second cases, from a median of three (range = 4) to a median of five differentials (range = 8), when using the metacognitive mnemonic (p = .018). The increased number of differential diagnoses had an overall non-significant negative correlation to risk colour (p = .073) and total score (p = .161). However, in the diabetic ketoacidosis case, the correlations were significant (p = .021), establishing a negative relationship between increasing number of differential diagnoses and performance (Scatter plot 4).

Scatter plot 4.

DKA case correlation of risk colour and number of differential diagnoses. Note. N = 13; r_s (11) = −.629, r² = .396, p = .021. The interpolation line shows the mean number of differential diagnoses. DKA = Diabetic ketoacidosis. ED = Emergency department. ESS = Emergency symptoms and signs (i.e. chief complaint).

Integrative Analysis

Validation of the Overall Qual–Quant Assessment Model

The three-part combined qualitative and quantitative assessment model showed strong consistency and convergence between the quantitative measures of clinical decision-making (total score and risk colour) in both patient cases. The SOLO taxonomy level showed a stronger correlation to the quantitative measures in the diabetic ketoacidosis case, indicating clinical reasoning and the outcomes of the decision-making were more interdependent in these assessments (Scatter plot 5). In contrast, the participants were able to demonstrate good clinical decision-making in the acute myocardial infarction case, without exhibiting comprehensive clinical reasoning skills, resulting in a weaker correlation (Scatter plot 6).

Scatter plot 5.

DKA case correlations of total score, risk colour, and SOLO taxonomy level. Note. N = 13. The table shows the combined qualitative and quantitative assessment model. Maximum total score was 10 points. Data labels show the SOLO taxonomy level per participant. The interpolation line shows the mean value of the total score. The correlation coefficient between total score and risk colour was .942 (p < .001), and the correlation coefficient between risk colour and SOLO taxonomy was .611 (r² = .37, p = .026). DKA = Diabetic ketoacidosis. ED = Emergency department. ESS = Emergency symptoms and signs (i.e. chief complaint). SOLO = Structure of observed learning outcomes. 3 = Multiple aspects of the case are discussed, but conflicts are ignored and a simplified conclusion is drawn. 3a = Multiple aspects of the presentation and conflicts are discussed but then overlooked, and a simplified conclusion is drawn. 4 = Broad reasoning based on aspects from several different types of data, including reflections on conflicts and inconsistencies. 4a = In addition to level 4, the premature nature of the field diagnosis is also suggested.

Scatter plot 6.

AMI case correlations of total score, risk colour, and SOLO taxonomy level. Note. N = 13. The table shows the combined qualitative and quantitative assessment model. Maximum total score was 12 points. Data labels show the SOLO taxonomy level per participant. The interpolation line shows the mean value of the total score. The correlation coefficient between total score and risk colour was .867 (p < .001), and the correlation coefficient between risk colour and SOLO taxonomy was .326 (r² = .11, p = .277). AMI = Acute myocardial infarction. ED = Emergency department. ESS = Emergency symptoms and signs (i.e. chief complaint). SOLO = Structure of observed learning outcomes. 2a = Two aspects of the case are discussed, but several aspects are missing and the nurse is unable to draw a correct conclusion. 3 = Multiple aspects of the case are discussed, but conflicts are ignored and a simplified conclusion is drawn. 3a = Multiple aspects of the presentation and conflicts are discussed but then overlooked, and a simplified conclusion is drawn. 4 = Broad reasoning based on aspects from several different types of data, including reflections on conflicts and inconsistencies.

In cases of weaker correlation between reasoning and performance, the model’s ability to triangulate and further explore the assessments can be exemplified through cases with similar total scores but different risk colours. In the diabetic ketoacidosis case, participant #13 had a total score of 4 points and a red risk colour, while participant #6 had a lower total score of 3 points but an orange risk colour. SOLO taxonomy analysis showed that the handling of conflicting data was a distinguishing factor between high and moderate risks of potential patient harm. Participant #13 ignored apparent conflicts in the data (SOLO taxonomy level 3), while participant #6 attempted to discuss them (SOLO taxonomy level 3a). Additionally, the sub-scores of the point-based model further explained the differences in clinical decision-making and potential patient harm (Table 8).

Table 8.

Exploring DKA Cases of Divergence in the Combined Quantitative and Qualitative Assessment Model.

Participant	Case	Total score	Risk colour	Points explained (points given)	SOLO level
#13	DKA	4	Red	Differential diagnosis (+2); Incorrect field diagnosis/ESS (0); Correct priority (+2); Non-conveyance decision (−1); B-glucose checked (+1)	3
#6	DKA	3	Orange	Differential diagnosis (+2); Incorrect field diagnosis/ESS (0); Incorrect priority block (0); Correct level of care (+1)	3a

Note. The maximum score was 10 points in the DKA case. This table shows the use of combined quantitative and qualitative data to explain results through the triangulation of clinical reasoning and decision-making. In this case, the handling of conflicting data explained why a nurse whose assessment rendered a higher score still posed a greater potential risk of patient harm. SOLO taxonomy levels 3 and 3a are both on a multistructural level, meaning a wrong conclusion is drawn, but with the difference of level 3 ignoring and level 3a discussing conflicts. DKA = Diabetic ketoacidosis. ESS = Emergency symptoms and signs (i.e. chief complaint). Risk colour = Risks of potential patient harm; red equals high risk; orange equals moderate risk. SOLO = Structure of observed learning outcomes. 3 = Multiple aspects of the case are discussed, but conflicts are ignored and a simplified conclusion is drawn. 3a = Multiple aspects of the presentation and conflicts are discussed but then overlooked, and a simplified conclusion is drawn.

In the acute myocardial infarction case, the results showed low SOLO taxonomy and high total scores, as well as a case of high SOLO taxonomy and low total score. In these cases of divergence, the combined model was able to explore the clinical reasoning and decision-making. Participant #5 had a low total score of 4, but a SOLO taxonomy level 4 (adequate handling of conflicts), leading to clinical decision-making with a low risk of potential patient harm. The key to solve the case was to perform a second, readable ECG, but the nurse did not complete that task. Despite this, the written reflection and differential diagnosis revealed management of conflicting data, maintaining a suspicion of cardiac or pulmonary cause, resulting in conveyance to the emergency department, with a second ECG likely to be the first action based on the RETTS triage system and the suspected condition.

Another contradictory finding concerned participant #2, who exhibited a low SOLO taxonomy level of 3 (no attention to conflicting data) but achieved a perfect score of all 12 possible points in the acute myocardial infarction case (Table 9). However, in the more complex diabetic ketoacidosis case, which was the first patient case for participant #2, the score was substantially lower (Table 7).

Table 9.

Exploring AMI Cases of Divergence in the Combined Quantitative and Qualitative Assessment Model.

Participant	Case	Total score	Risk colour	Points explained (points given)	SOLO level
#2	AMI	12	Green	Differential diagnosis (+2); Correct field diagnosis (+2); Correct priority (+2); Correct level of care (+2); All correct treatment given (+4)	3
#5	AMI	4	Yellow	Differential diagnosis (+2); Correct ESS (+1); Incorrect priority block (−1); Correct level of care (+1); Acetylsalicylic acid given (+1)	4

Note. The maximum score was 12 points in the AMI case. This table shows the use of combined quantitative and qualitative data to explain results through the triangulation of clinical reasoning and decision-making. A high total score and low SOLO taxonomy was possible, and the triangulation of this fact offered an opportunity to further expand the analysis of the case presentation and discuss possible issues with limited clinical reasoning. The SOLO taxonomy also offered an opportunity to analyse a case of low total score yet low risk of potential patient harm. The handling of conflicting data rendered an assessment with correct ESS and correct medication given, thus resulting in lower patient risk than the point-based model would predict. AMI = Acute myocardial infarction. ESS = Emergency symptoms and signs. Risk colour = Risks of potential patient harm; yellow equals low risk; green equals no risk. SOLO = Structure of observed learning outcomes. 3 = Multiple aspects of the case are discussed, but conflicts are ignored and a simplified conclusion is drawn. 4 = Broad reasoning based on aspects from several different types of data, including reflections on conflicts and inconsistencies.

Effect of the Metacognitive Mnemonic

The results did not support the hypothesis that the self-initiated TWED mnemonic would counter the biases induced in the patient cases. Total score, risk colour, and SOLO taxonomy were slightly lower in the second case, while the number of differential diagnoses significantly increased. This indicates that the TWED mnemonic, the way it was investigated within this study design, had a potential negative impact on clinical reasoning and decision-making.

Discussion

Overall Results

Clinical reasoning and decision-making varied, mainly between the patient cases rather than with individual demographics. Only three nurses managed both their patient assessments without any risks of potential patient harm. Regarding demographic differences, the only significant result was found in the diabetic ketoacidosis case, where nurses with 9–11 years of experience scored higher on the point-based model, compared to the other experience groups. As it was the only statistically significant difference, it was most likely due to random variation or low statistical power.

Four of the five patient assessments posing moderate to high risks of potential patient harm were carried out in the diabetic ketoacidosis case, indicating a higher level of difficulty compared to the acute myocardial infarction case. This could be explained by several intentional factors, one being a more stress-inducing surrounding setting. This is consistent with previous research that indicates a negative impact of stress on clinical decision-making (Leblanc et al., 2012; Morgado et al., 2015; Pottier et al., 2013). Relevant to this study design and the diabetic ketoacidosis case presentation, noise and conflict have been found to be two major contributors to impaired clinical decision-making (Groombridge et al., 2019). Stress can also potentiate various biases, possibly explaining why several nurses did not check the glucose level in the diabetic ketoacidosis case, despite the dispatch notice containing information on vomiting, abdominal pain, and a history of diabetes (Yu, 2016). The diabetic ketoacidosis case had a higher complexity in the patient and scene presentation, generating more conflicting data. The patient was intoxicated, with music playing and a dog barking from behind a closed door. An annoying friend was interrupting the assessment, and there were more diverse symptoms to evaluate (Appendix 1).

In the acute myocardial infarction case, the patient presented with thoracic back pain and risk factors for cardiac disease. He appeared outside his apartment, ready to take a ride to the emergency department, in an attempt to trigger negative emotions among the nurse specialists. Apart from that, there were some minor distractions and overall fewer symptoms to evaluate, along with fewer surrounding factors interrupting the assessment process (Appendix 2).

The ability to handle conflicting data (i.e. reasoning on a SOLO taxonomy level 4 or higher) was linked to fewer potential risks of patient harm, as seen in the integrative analysis scatter plots (Scatter plots 5 and 6). In contrast, an inability to handle conflicting data (i.e. reasoning on a SOLO taxonomy level 3a or less) had a greater potential of posing risks of patient harm. This was especially evident in the diabetic ketoacidosis case, where the SOLO taxonomy levels correlated with the elimination of potential risks of patient harm. In cases of stress, uncertainty, and conflicting evidence, previous research has suggested a tendency to accept higher levels of risk in the decision-making (Morgado et al., 2015).

Another potential explanation of the decreased performance in the diabetic ketoacidosis case could be a lack of initial risk stratification, either due to individual strategies or due to the design of the current regional guidelines used in the study. The specific guidelines applicable to the cases were based on diagnosis rather than initial risk stratification of patient groups or symptoms. Neither glucose testing of all diabetic patients nor performing an ECG for thoracic back pain was mandatory according to protocol (Föreningen för Ledningsansvariga inom Svensk Ambulanssjukvård, 2017; Predicare, 2023), leaving the decision solely to the nurses’ individual reasoning. To overcome this, initial risk stratification may be beneficial (Bentivegna et al., 2021; Demandt et al., 2022; Spangler et al., 2019; Yu et al., 2022). Digital support systems that generate initial differential diagnoses may also support better clinical reasoning and decision-making, but this research field is still new and uncertain (Kourtidis et al., 2022). Risk stratification is especially important when defining a chief complaint proves difficult. There is growing evidence of increased risks of unfavourable outcomes among patients presenting with unclear or undefined symptoms (Djärv et al., 2015; Ibsen et al., 2021, 2022; Ivic, 2020).

A basic guideline overview, with regard to the above-mentioned challenges, could also support clinical reasoning and decision-making. This needs to be further investigated, as adherence to prehospital guidelines has been found to be suboptimal, with applicability issues and with no generally effective implementation standard presented in past research (Ebben et al., 2013; Fishe et al., 2018; Martin-Gill et al., 2022). Proposed interventions are the use of reminders, education, end-user involvement, and adaptation of guidelines to the non-linear process that is clinical reasoning and decision-making in a prehospital context (Andersson et al., 2019; Ebben et al., 2018; Hagiwara et al., 2013).

The results of the study showed a strong positive correlation between diagnostic agreement and the quantitative outcome measures (i.e. total score and risk colour). This correlation was expected, but could have been exaggerated by the study design, as well as by regional treatment guidelines based on initial detection of a specific diagnosis (Föreningen för Ledningsansvariga inom Svensk Ambulanssjukvård, 2017). There is evidence of great promise in checklists to improve clinical decision-making, following the correct diagnosis of an acute medical condition (Dryver et al., 2021; Hall et al., 2020). However, in a prehospital setting, the diagnostic accuracy may vary between 58% and 70% (Ackerman & Waldron, 2006; Christie et al., 2016; Cummins et al., 2013; Koivulahti et al., 2020; Magnusson et al., 2018), with wide variation across specific conditions, where some exceptions show 100% accuracy (Wilson et al., 2019). This raises the question of whether prehospital guidelines should be based on symptoms, risk stratification, diagnosis, or a combination of all three. There is no scientific overview available on the topic.

The Metacognitive Mnemonic

Self-initiated use of the metacognitive TWED mnemonic had a potentially negative impact and did not improve patient safety in this study. In fact, having more differential diagnoses may have increased the risk of commission bias (Pelaccia et al., 2011). When anchoring effects are already in place, it also seems difficult to calibrate the decision effectively, even when increasing the number of differential diagnoses (Brewer et al., 2007). Another aspect is the high educational level and long work experience among the participants in the study. Previous research has discussed the more dominant system 1 processes associated with expert decision-making (Evans, 2008), and the educational and short-term setup of this study may have been insufficient to promote change.

The study findings indicated that it was hard to calibrate decisions late in the decision process. A crew member, or perhaps a consulting physician, might be a better facilitator of the metacognitive TWED mnemonic. On the other hand, technological debiasing interventions seem more likely to succeed than cognitive strategies (Ludolph & Schulz, 2018), and this study’s results are consistent with the 50% of studies on cognitive debiasing strategies that fail to show effectiveness. If this instrument is to be tested again in a similar setting, one might consider a different approach to the education and use of the mnemonic.

Validation of the Overall Qual–Quant Assessment Model

The combined qualitative and quantitative assessment model’s outcome measures were based on SOLO taxonomy (analysis of the reasoning process), point-based scoring, and grading of risks of potential harm (outcomes of the decision-making). There was a strong internal consistency within the quantitative measures of the combined model, with a moderate consistency within all three parameters of the model. In cases with inconsistencies, the different parts of the model could be used to triangulate and explore the reasoning and decision-making.

Through observation of multiple patient assessments, and analysis of written reflections, the model provided an opportunity to reflect on the impact of stress, biases, and system 1 and system 2 thinking in different situations. As shown in the results, only three nurses managed both cases without potential patient risks. This opportunity for reflection is important for nurses’ professional growth (Miraglia & Asselin, 2015). Using SOLO taxonomy as a qualitative outcome measure of the reflections may offer a possibility to oversee and quantify long-term individual and group progress (Brabrand & Dahl, 2009). However, the results from this study showed a lower rate of initial inter-rater agreement on SOLO taxonomy levels than expected.

Regarding the quantitative measures, the first item of the point-based model (i.e. number of differential diagnoses, and associated actions and reasoning) may need to be removed. The first reason for this is that a recalculation of total scores without this item did not affect the overall results. The second reason is that the other items of the scoring model directly affect the potential outcome for the patient (i.e. field diagnosis/ESS, priority level, level of care, actions, and treatment), whereas the number of differential diagnoses is a surrogate for something not necessarily affecting clinical outcome. The third reason is based on this study’s findings, namely that an explicit focus on increasing differential diagnoses has a potentially negative impact on decision-making.

This study should be viewed as a pilot study on a new concept of simulation training assessment in a combined qualitative and quantitative manner. The choice of a different approach or different outcome measures may provide additional insights and quality to the model.

Limitations and Further Research

This study has some limitations within its design. Nurses were evaluated on individual assessments in a simulation setting, while team assessments are more common in a prehospital context. However, nurse specialists active as single responders did not perform better than other participants, which is an important lesson learnt from this study. Convenience sampling makes it difficult to generalise results, despite the strategic representation of gender, age, and experience. Additionally, the main authors work in the same organisation as the participants, which may have influenced the data collection and analysis.

The authors lacked prior SOLO taxonomy experience, potentially impacting the analysis. Despite a thorough analysis process, additional analysis by a senior researcher with prior experience of SOLO taxonomy would lend more reliability to the qualitative results. Other qualitative approaches could also be considered (e.g. a talk-out-loud design with audio/video recording). However, given the limited number of participants, requiring audio or video recordings might have further reduced the sample size.

The sample size was calculated for a between-group comparison based on the pilot test results, while the main findings of the study were correlational in nature. The internal consistency of the quantitative measures was strong, but a larger sample size is needed to confirm the results in the overall combined model. This study should, therefore, be considered a pilot study, and further research with a larger sample size is needed to validate and understand the results.

This study suggests negative short-term effects of brief metacognitive training, and further research is warranted to confirm these results. Also, further research is needed on other possible interventions, such as guideline adaptation and implementation, or research with a long-term perspective on the effects of cognitive strategies on nurse specialists’ patient assessments.

Finally, to understand the results and their potential implications, one needs to be reminded of the context of Swedish prehospital healthcare, as described in the background section of this article. The ambulance crew is expected to work highly independently, making advanced patient assessments according to guidelines. The possibility to consult a physician by telephone exists but is not mandatory under these guidelines, for instance regarding non-conveyance decisions. This study sheds some light on aspects such as stress, cognitive biases, emotional response, and guideline adherence, all within the context of patient assessments, clinical reasoning, and decision-making among nurse specialists in Swedish prehospital emergency care.

Conclusions

Within a simulation setting, clinical reasoning and decision-making vary among prehospital nurse specialists, without apparent demographic differences. A higher complexity in the case presentation causes higher rates of assessments posing potential risks of patient harm. To facilitate good clinical decision-making in these circumstances, nurses need highly developed reasoning abilities to understand and interpret conflicting data. To assess reasoning abilities, and the subsequent outcomes of the decision-making, a novel combined qualitative and quantitative assessment model was developed. The model was based on SOLO taxonomy analysis and point-based scoring on important clinical decisions.

The novel assessment model showed satisfactory internal consistency, especially in a complex case presentation. The combined measures of the model integrated the objectivity of predefined answer keys with the depth of written reflections. It measured clinical outcomes and, from an educational point of view, offered opportunities for feedback and professional growth.

Based on the study findings, self-initiated use of the metacognitive TWED mnemonic did not show any benefits. In fact, the results indicated decreased performance and commission bias, due to a significant increase in differential diagnoses.

Supplemental Material

Supplemental Material for Assessing Clinical Reasoning and Decision-Making in Swedish Prehospital Emergency Care: A Mixed Methods Study With an Experimental Design by Olof Fager, Ulrika Hindsberg, André Johansson, Christer Axelsson and Magnus Andersson Hagiwara in Journal of Cognitive Engineering and Decision Making

Consent for Publication

All participants received information on the planned publication of the study results. This was clearly stated in the written information received prior to giving informed consent. All data were also pseudonymised to secure anonymity for the participants.

Footnotes

Acknowledgments

Special thanks to Christoffer Nordin and Fredrik Hindsberg, who, together with A.J., did tremendous work as actors during the simulation sessions. Also, thanks to registered nurse Louise Godden and Professor Stephen Marr for helping with the translation process of the mnemonic. We also want to thank the entire regional ambulance organisation with its educational coordinator and chief medical physician, along with the most important persons of all: the participating specialist nurses.

Authors’ Contributions

O.F. and U.H. are the main researchers and authors of this article. O.F. was responsible for all statistical analyses, tables, and figures. Professor M.A.H. supervised the entire process, and A.J. contributed additional knowledge and input regarding methods from the perspective of emergency medicine triage, as well as critical appraisal during the process. C.A. provided further valuable feedback on the content. All authors read and approved the final manuscript before submission.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Data Availability Statement

The source data, in the form of pseudonymised transcriptions of all written reflections and the SPSS dataset with all analysed variables, is stored on secure servers at the University of Borås and will be made available by the corresponding author upon reasonable request.

Supplemental Material

Supplemental material for this article is available online.

Olof Fager works as a nurse in the emergency department. He has a master’s degree in pre- and intrahospital emergency care.

Ulrika Hindsberg works as an ambulance nurse. She has a master’s degree in pre- and intrahospital emergency care.

André Johansson works at Lund University, training nurses in a specialist program related to emergency care. His research interest is triage in emergency care.

Christer Axelsson is an ambulance nurse and a professor of prehospital emergency care.

Magnus Andersson Hagiwara is an ambulance nurse and a professor of prehospital emergency care.

References

Ackerman & Waldron (2006). Difficulty breathing: Agreement of paramedic and emergency physician diagnoses. Prehospital Emergency Care, 10(1), 77–80. https://doi.org/10.1080/10903120500366888

Al-Khafaji, Townsend, R. F., Townsend, Chopra, & Gupta (2022). Checklists to reduce diagnostic error: A systematic review of the literature using a human factors framework. BMJ Open, 12(4), Article e058219. https://doi.org/10.1136/bmjopen-2021-058219

Andersson, Andersson Hagiwara, Wireklint Sundström, Andersson, & Maurin Söderholm (2022). Clinical reasoning among registered nurses in emergency medical services: A case study. Journal of Cognitive Engineering and Decision Making, 16(3), 123–156. https://doi.org/10.1177/15553434221097788

Andersson, Maurin Söderholm, Wireklint Sundström, Andersson Hagiwara, & Andersson (2019). Clinical reasoning in the emergency medical services: An integrative review. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 27(1), 76. https://doi.org/10.1186/s13049-019-0646-y

Andrew, Nehme, Cameron, & Smith (2020). Drivers of increasing emergency ambulance demand. Prehospital Emergency Care, 24(3), 385. https://doi.org/10.1080/10903127.2019.1635670

Bentivegna, Hulme, & Ebell, M. H. (2021). Primary care relevant risk factors for adverse outcomes in patients with COVID-19 infection: A systematic review. Journal of the American Board of Family Medicine, 34(Suppl), 113–126. https://doi.org/10.3122/jabfm.2021.S1.200429

Berthet (2021). The impact of cognitive biases on professionals' decision-making: A review of four occupational areas. Frontiers in Psychology, 12(4), Article 802439. https://doi.org/10.3389/fpsyg.2021.802439

Bigham, B. L., Bull, Morrison, Burgess, Maher, Brooks, S. C., & Morrison, L. J. (2011). Patient safety in emergency medical services: Executive summary and recommendations from the Niagara Summit. CJEM, 13(1), 13–18. https://doi.org/10.2310/8000.2011.100232

Brabrand & Dahl (2009). Using the SOLO taxonomy to analyze competence progression of university science curricula. Higher Education, 58(4), 531–549. https://doi.org/10.1007/s10734-009-9210-4

Brannick, M. T., Erol-Korkmaz, H. T., & Prewett (2011). A systematic review of the reliability of objective structured clinical examination scores. Medical Education, 45(12), 1181–1189. https://doi.org/10.1111/j.1365-2923.2011.04075.x

Bremer, Andersson Hagiwara, Tavares, Paakkonen, Nyström, & Andersson (2020). Translation and further validation of a global rating scale for the assessment of clinical competence in prehospital emergency care. Nurse Education in Practice, 47, Article 102841. https://doi.org/10.1016/j.nepr.2020.102841

Brentnall, Thackray, & Judd (2022). Evaluating the clinical reasoning of student health professionals in placement and simulation settings: A systematic review. International Journal of Environmental Research and Public Health, 19(2), 936. https://doi.org/10.3390/ijerph19020936

Brewer, N. T., Chapman, G. B., Schwartz, J. A., & Bergus, G. R. (2007). The influence of irrelevant anchors on the judgments and choices of doctors and patients. Medical Decision Making: An International Journal of the Society for Medical Decision Making, 27(2), 203–211. https://doi.org/10.1177/0272989X06298595

Carlström & Fredén (2017). The first single responders in Sweden: Evaluation of a pre-hospital single staffed unit. International Emergency Nursing, 32, 15–19. https://doi.org/10.1016/j.ienj.2016.05.003

Chen, Kan, Qiu, & Gui (2016). Use and implementation of standard operating procedures and checklists in prehospital emergency medicine: A literature review. The American Journal of Emergency Medicine, 34(12), 2432–2439. https://doi.org/10.1016/j.ajem.2016.09.057

Chew, K. S., Durning, S. J., & van Merriënboer (2016a). Teaching metacognition in clinical decision-making using a novel mnemonic checklist: An exploratory study. Singapore Medical Journal, 57(12), 694–700. https://doi.org/10.11622/smedj.2016015

Chew, K. S., van Merriënboer, & Durning, S. J. (2016b). A portable mnemonic to facilitate checking for cognitive errors. BMC Research Notes, 9(1), 445. https://doi.org/10.1186/s13104-016-2249-2

Chew, K. S., van Merriënboer, & Durning, S. J. (2017). Investing in the use of a checklist during differential diagnoses consideration: What's the trade-off? BMC Medical Education, 17(1), 234. https://doi.org/10.1186/s12909-017-1078-x

Chew, K. S., van Merriënboer, & Durning, S. J. (2019). Perception of the usability and implementation of a metacognitive mnemonic to check cognitive errors in clinical setting. BMC Medical Education, 19(1), 18. https://doi.org/10.1186/s12909-018-1451-4

Chrismawaty, B. E., Emilia, Rahayu, G. R., & Ana, I. D. (2023). Clinical reasoning pattern used in oral health problem solving: A case study in Indonesian undergraduate dental students. BMC Medical Education, 23(1), 52. https://doi.org/10.1186/s12909-022-03808-7

Christie, Costa-Scorse, Nicholls, Jones, & Howie (2016). Accuracy of working diagnosis by paramedics for patients presenting with dyspnoea. Emergency Medicine Australasia, 28(5), 525–530. https://doi.org/10.1111/1742-6723.12618

Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). Sage Publications.

Croskerry (2003). The importance of cognitive errors in diagnosis and strategies to minimize them. Academic Medicine: Journal of the Association of American Medical Colleges, 78(8), 775–780. https://doi.org/10.1097/00001888-200308000-00003

Croskerry (2017). A model for clinical decision-making in medicine. Medical Science Educator, 27(S1), 9–13. https://doi.org/10.1007/s40670-017-0499-9

Croskerry & Campbell, S. G. (2021). A cognitive autopsy approach towards explaining diagnostic failure. Cureus, 13(8), Article e17041. https://doi.org/10.7759/cureus.17041

Croskerry, Singhal, & Mamede (2013). Cognitive debiasing 2: Impediments to and strategies for change. BMJ Quality and Safety, 22(Suppl 2), ii65–ii72. https://doi.org/10.1136/bmjqs-2012-001713

Cummins, N. M., Dixon, Garavan, Landymore, Mulligan, & O'Donnell (2013). Can advanced paramedics in the field diagnose patients and predict hospital admission? Emergency Medicine Journal, 30(12), 1043–1047. https://doi.org/10.1136/emermed-2012-201899

Daniel, Khandelwal, Santen, S. A., Malone, & Croskerry (2017). Cognitive debiasing strategies for the emergency department. AEM Education and Training, 1(1), 41–42. https://doi.org/10.1002/aet2.10010

Daniel, Rencic, Durning, S. J., Holmboe, Santen, S. A., Lang, Ratcliffe, Gordon, Heist, Lubarsky, Estrada, C. A., Ballard, Artino, A. R., Jr., Sergio Da Silva, Cleary, Stojan, & Gruppen, L. D. (2019). Clinical reasoning assessment methods: A scoping review and practical guidance. Academic Medicine: Journal of the Association of American Medical Colleges, 94(6), 902–912. https://doi.org/10.1097/ACM.0000000000002618

Demandt, J. P. A., Zelis, J. M., Koks, Smits, G. H. J. M., van der Harst, Tonino, P. A. L., Dekker, L. R. C., van Het Veer, & Vlaar, P. J. (2022). Prehospital risk assessment in patients suspected of non-ST-segment elevation acute coronary syndrome: A systematic review and meta-analysis. BMJ Open, 12(4), Article e057305. https://doi.org/10.1136/bmjopen-2021-057305

Djärv, Castrén, Mårtenson, & Kurland (2015). Decreased general condition in the emergency department: High in-hospital mortality and a broad range of discharge diagnoses. European Journal of Emergency Medicine: Official Journal of the European Society for Emergency Medicine, 22(4), 241–246. https://doi.org/10.1097/MEJ.0000000000000164

Dryver, Lundager Forberg, Hård Af Segerstad, Dupont, W. D., Bergenfelz, & Ekelund (2021). Medical crisis checklists in the emergency department: A simulation-based multi-institutional randomised controlled trial. BMJ Quality and Safety, 30(9), 697–705. https://doi.org/10.1136/bmjqs-2020-012740

Ebben, R. H. A., Siqeca, Madsen, U. R., Vloet, L. C. M., & van Achterberg (2018). Effectiveness of implementation strategies for the improvement of guideline and protocol adherence in emergency care: A systematic review. BMJ Open, 8(11), Article e017572. https://doi.org/10.1136/bmjopen-2017-017572

Ebben, R. H. A., Vloet, L. C., Verhofstad, M. H., Meijer, Mintjes-de Groot, J. A., & van Achterberg (2013). Adherence to guidelines and protocols in the prehospital and emergency care setting: A systematic review. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 21(1), 9. https://doi.org/10.1186/1757-7241-21-9

Ebben, R. H. A., Vloet, L. C. M., Speijers, R. F., Tönjes, N. W., Loef, Pelgrim, Hoogeveen, & Berben, S. A. A. (2017). A patient-safety and professional perspective on non-conveyance in ambulance care: A systematic review. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 25(1), 71. https://doi.org/10.1186/s13049-017-0409-6

Evans (2008). Dual-processing accounts of reasoning, judgment, and social cognition. Annual Review of Psychology, 59(1), 255–278. https://doi.org/10.1146/annurev.psych.59.103006.093629

Fishe, J. N., Crowe, R. P., Cash, R. E., Nudell, N. G., Martin-Gill, & Richards, C. T. (2018). Implementing prehospital evidence-based guidelines: A systematic literature review. Prehospital Emergency Care, 22(4), 511–519. https://doi.org/10.1080/10903127.2017.1413466

FitzGerald & Hurst (2017). Implicit bias in healthcare professionals: A systematic review. BMC Medical Ethics, 18(1), 19. https://doi.org/10.1186/s12910-017-0179-8

Föreningen för Ledningsansvariga inom Svensk Ambulanssjukvård. (2017). Behandlingsriktlinjer: Nationella riktlinjer för ambulanssjukvården. https://www.s112.se/wp-content/uploads/2018/05/SLAS-behandlingsriktlinjer-Vuxen-och-barn-20180102-2.pdf

Fournier, J. P., Demeester, & Charlin (2008). Script concordance tests: Guidelines for construction. BMC Medical Informatics and Decision Making, 8(18), 18. https://doi.org/10.1186/1472-6947-8-18

Glawing, Karlsson, Kylin, & Nilsson (2024). Work-related stress, stress reactions and coping strategies in ambulance nurses: A qualitative interview study. Journal of Advanced Nursing, 80(2), 538–549. https://doi.org/10.1111/jan.15819

Gopal, D. P., Chetty, O'Donnell, Gajria, & Blackadder-Weinstein (2021). Implicit bias in healthcare: Clinical practice, research and decision making. Future Healthcare Journal, 8(1), 40–48. https://doi.org/10.7861/fhj.2020-0233

Graber, M. L., Kissam, Payne, V. L., Meyer, A. N., Sorensen, Lenfestey, Tant, Henriksen, Labresh, & Singh (2012). Cognitive interventions to reduce diagnostic error: A narrative review. BMJ Quality and Safety, 21(7), 535–557. https://doi.org/10.1136/bmjqs-2011-000149

Groombridge, C. J., Kim, Maini, Smit, & Fitzgerald, M. C. (2019). Stress and decision-making in resuscitation: A systematic review. Resuscitation, 144, 115–122. https://doi.org/10.1016/j.resuscitation.2019.09.023

Gugiu, McKenna, K. D., Platt, T. E., & Panchal, A. R. (2022). A proposed theoretical framework for clinical judgment in EMS. Prehospital Emergency Care. Advance online publication. https://doi.org/10.1080/10903127.2022.2048756

Hagiwara, M. A., Magnusson, Herlitz, Seffel, Axelsson, Munters, Strömsöe, & Nilsson (2019). Adverse events in prehospital emergency care: A trigger tool study. BMC Emergency Medicine, 19(1), 14. https://doi.org/10.1186/s12873-019-0228-3

Hagiwara, M. A., Suserud, B. O., Jonsson, & Henricsson (2013). Exclusion of context knowledge in the development of prehospital guidelines: Results produced by realistic evaluation. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 21(1), 46. https://doi.org/10.1186/1757-7241-21-46

Hammond, M. E. H., Stehlik, Drakos, S. G., & Kfoury, A. G. (2021). Bias in medicine: Lessons learned and mitigation strategies. JACC: Basic to Translational Science, 6(1), 78–85. https://doi.org/10.1016/j.jacbts.2020.07.012

Harden, R. M., Stevenson, Downie, W. W., & Wilson, G. M. (1975). Assessment of clinical competence using objective structured examination. British Medical Journal, 1(5955), 447–451. https://doi.org/10.1136/bmj.1.5955.447

Hazel, Prosser, & Trigwell (2002). Variation in learning orchestration in university biology courses. International Journal of Science Education, 24(7), 737–751. https://doi.org/10.1080/09500690110098886

Hjalmarsson, Holmberg, Asp, Östlund, Nilsson, K. W., & Kerstis (2020). Characteristic patterns of emergency ambulance assignments for older adults compared with adults requiring emergency care at home in Sweden: A total population study. BMC Emergency Medicine, 20(1), 94. https://doi.org/10.1186/s12873-020-00387-y

Hoban (2017). Assessing for head injury in alcohol-intoxicated patients. Emergency Nurse: The Journal of the RCN Accident and Emergency Nursing Association, 25(5), 30–33. https://doi.org/10.7748/en.2017.e1670

Höglund, Andersson-Hagiwara, Schröder, Möller, & Ohlsson-Nevo (2020). Characteristics of non-conveyed patients in emergency medical services (EMS): A one-year prospective descriptive and comparative study in a region of Sweden. BMC Emergency Medicine, 20(1), 61. https://doi.org/10.1186/s12873-020-00353-8

Ibsen, Dam-Huus, K. B., Nickel, C. H., Christensen, E. F., & Søvsø, M. B. (2022). Diagnoses and mortality among prehospital emergency patients calling 112 with unclear problems: A population-based cohort study from Denmark. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 30(1), 70. https://doi.org/10.1186/s13049-022-01052-y

Ibsen, Lindskou, T. A., Nickel, C. H., Kløjgård, Christensen, E. F., & Søvsø, M. B. (2021). Which symptoms pose the highest risk in patients calling for an ambulance? A population-based cohort study from Denmark. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 29(1), 59. https://doi.org/10.1186/s13049-021-00874-6

Ilgüy, Fişekçioğlu, & Oktay (2014). Comparison of case-based and lecture-based learning in dental education using the SOLO-taxonomy. Journal of Dental Education, 78(11), 1521–1527. https://doi.org/10.1002/j.0022-0337.2014.78.11.tb05827.x

International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. (2016). Guideline for good clinical practice (E6R2). https://database.ich.org/sites/default/files/E6_R2_Addendum.pdf

Ivic, Kurland, Vicente, Castrén, & Bohm (2020). Serious conditions among patients with non-specific chief complaints in the pre-hospital setting: A retrospective cohort study. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 28(1), 74. https://doi.org/10.1186/s13049-020-00767-0

Johansson, Ekwall, Lundager Forberg, & Ekelund (2023). Development of outcomes for evaluating emergency care triage: A Delphi approach. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 31(1), 10. https://doi.org/10.1186/s13049-023-01073-1

Kahneman & Klein (2009). Conditions for intuitive expertise: A failure to disagree. American Psychologist, 64(6), 515–526. https://doi.org/10.1037/a0016755

Kerner, Schmidbauer, Tietz, Marung, & Genzwuerker, H. V. (2017). Use of checklists improves the quality and safety of prehospital emergency care. European Journal of Emergency Medicine: Official Journal of the European Society for Emergency Medicine, 24(2), 114–119. https://doi.org/10.1097/MEJ.0000000000000315

Kersting, Kneer, & Barzel (2020). Patient-relevant outcomes: What are we talking about? A scoping review to improve conceptual clarity. BMC Health Services Research, 20(1), 596. https://doi.org/10.1186/s12913-020-05442-9

Ko, H. C., Turner, T. J., & Finnigan, M. A. (2011). Systematic review of safety checklists for use by medical care teams in acute hospital settings: Limited evidence of effectiveness. BMC Health Services Research, 11, 211. https://doi.org/10.1186/1472-6963-11-211

Koivulahti, Tommila, & Haavisto (2020). The accuracy of preliminary diagnoses made by paramedics - a cross-sectional comparative study. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 28(1), 70. https://doi.org/10.1186/s13049-020-00761-6

Kourtidis, Nurek, Delaney, & Kostopoulou (2022). Influences of early diagnostic suggestions on clinical reasoning. Cognitive Research: Principles and Implications, 7(1), 103. https://doi.org/10.1186/s41235-022-00453-y

Lambe, K. A., O'Reilly, Kelly, B. D., & Curristan (2016). Dual-process cognitive interventions to enhance diagnostic reasoning: A systematic review. BMJ Quality and Safety, 25(10), 808–820. https://doi.org/10.1136/bmjqs-2015-004417

Leblanc, V. R., Regehr, Tavares, Scott, A. K., Macdonald, & King (2012). The impact of stress on paramedic performance during simulated critical events. Prehospital and Disaster Medicine, 27(4), 369–374. https://doi.org/10.1017/S1049023X12001021

Lederman, Lindström, Elmqvist, Löfvenmark, & Djärv (2020). Non-conveyance in the ambulance service: A population-based cohort study in Stockholm, Sweden. BMJ Open, 10(7), Article e036659. https://doi.org/10.1136/bmjopen-2019-036659

Lee, K. C. (2021). The Lasater clinical judgment rubric: Implications for evaluating teaching effectiveness. Journal of Nursing Education, 60(2), 67–73. https://doi.org/10.3928/01484834-20210120-03

Lubarsky, Charlin, Cook, Chalk, & van der Vleuten (2011). Script concordance testing: A review of published validity evidence. Medical Education, 45(4), 329–338. https://doi.org/10.1111/j.1365-2923.2010.03863.x

Lucander, Bondemark, Brown, & Knutsson (2010). The structure of observed learning outcome (SOLO) taxonomy: A model to promote dental students' learning. European Journal of Dental Education: Official Journal of the Association for Dental Education in Europe, 14(3), 145–150. https://doi.org/10.1111/j.1600-0579.2009.00607.x

Ludolph & Schulz, P. J. (2018). Debiasing health-related judgments and decision making: A systematic review. Medical Decision Making: An International Journal of the Society for Medical Decision Making, 38(1), 3–13. https://doi.org/10.1177/0272989X17716672

Macauley, Brudvig, T. J., Kadakia, & Bonneville (2017). Systematic review of assessments that evaluate clinical decision making, clinical reasoning, and critical thinking changes after simulation participation. Journal of Physical Therapy Education, 31(4), 64–75. https://doi.org/10.1097/JTE.0000000000000011

Magnusson, Axelsson, Nilsson, Strömsöe, Munters, Herlitz, & Hagiwara, M. A. (2018). The final assessment and its association with field assessment in patients who were transported by the emergency medical service. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 26(1), 111. https://doi.org/10.1186/s13049-018-0579-x

Magnusson, Hagiwara, M. A., Norberg-Boysen, Kauppi, Herlitz, Axelsson, Packendorff, Larsson, & Wibring (2022). Suboptimal prehospital decision-making for referral to alternative levels of care - frequency, measurement, acceptance rate and room for improvement. BMC Emergency Medicine, 22(1), 89. https://doi.org/10.1186/s12873-022-00643-3

Mamede, de Carvalho-Filho, M. A., de Faria, R. M. D., Franci, Nunes, M. D. P. T., Ribeiro, L. M. C., Biegelmeyer, Zwaan, & Schmidt, H. G. (2020). 'Immunising' physicians against availability bias in diagnostic reasoning: A randomised controlled experiment. BMJ Quality and Safety, 29(7), 550–559. https://doi.org/10.1136/bmjqs-2019-010079

Martin-Gill, Brown, K. M., Cash, R. E., Haupt, R. M., Potts, B. T., Richards, C. T., Patterson, P. D., & Prehospital Guidelines Consortium. (2023). 2022 Systematic review of evidence-based guidelines for prehospital care. Prehospital Emergency Care, 27(2), 131–143. https://doi.org/10.1080/10903127.2022.2143603

Miraglia & Asselin, M. E. (2015). Reflection as an educational strategy in nursing professional development: An integrative review. Journal for Nurses in Professional Development, 31(2), 62–67. https://doi.org/10.1097/NND.0000000000000151

Morgado, Sousa, & Cerqueira, J. J. (2015). The impact of stress in decision making in the context of uncertainty. Journal of Neuroscience Research, 93(6), 839–847. https://doi.org/10.1002/jnr.23521

National Association of Emergency Medical Technicians. (2017). AMLS: Advanced medical life support: An assessment-based approach (2nd ed.). Jones & Bartlett Learning.

National Association of Emergency Medical Technicians & American College of Surgeons. (2020). PHTLS: Prehospital trauma life support (8th ed.). Jones & Bartlett Learning.

Nilsson & Lindström (2016). Clinical decision-making described by Swedish prehospital emergency care nurse students: An exploratory study. International Emergency Nursing, 27, 46–50. https://doi.org/10.1016/j.ienj.2015.10.006

83.

O’Sullivan

E. D.

Schofield

S. J.

(2019). A cognitive forcing tool to mitigate cognitive bias – a randomised control trial. BMC Medical Education, 19(1), 12. https://doi.org/10.1186/s12909-018-1444-3

84.

Pelaccia

Tardif

Triby

Charlin

(2011). An analysis of clinical reasoning through a recent and comprehensive approach: The dual-process theory. Medical Education Online, 16(10), 5890. https://doi.org/10.3402/meo.v16i0.5890

85.

Perona

Rahman

M. A.

O’Meara

(2019). Paramedic judgement, decision-making and cognitive processing: A review of the literature. Australasian Journal of Paramedicine, 16, 1–12. https://doi.org/10.33151/ajp.16.586

86.

Pinto Zipp

Maher

Donnelly

Fritz

Snowdon

(2016). Academicians and neurologic physical therapy residents partner to expand clinical reflection using the SOLO-taxonomy: A novel approach. Journal Of Allied Health, 45(2), 15–20.

87.

Pottier

Dejoie

Hardouin

J. B.

Le Loupp

A. G.

Planchon

Bonnaud

Leblanc

V. R.

(2013). Effect of stress on clinical reasoning during simulated ambulatory consultations. Medical Teacher, 35(6), 472–480. https://doi.org/10.3109/0142159X.2013.774336

88.

Prakash

Bihari

Need

Sprick

Schuwirth

(2017). Immersive high fidelity simulation of critically ill patients to study cognitive errors: A pilot study. BMC Medical Education, 17(1), 36. https://doi.org/10.1186/s12909-017-0871-x

89.

Prakash

Sladek

R. M.

Schuwirth

(2019). Interventions to improve diagnostic decision making: A systematic review and meta-analysis on reflective strategies. Medical Teacher, 41(5), 517–524. https://doi.org/10.1080/0142159X.2018.1497786

90.

Predicare . (2023). RETTS online. https://rettsonline-app.com/

91.

Ramadanov

Klein

Laue

Behringer

(2019). Diagnostic agreement between prehospital emergency and in-hospital physicians. Emergency Medicine International, 2019, Article 3769826. https://doi.org/10.1155/2019/3769826

92.

Saposnik

Redelmeier

Ruff

C. C.

Tobler

P. N.

(2016). Cognitive biases associated with medical decisions: A systematic review. BMC Medical Informatics and Decision Making, 16(1), 138. https://doi.org/10.1186/s12911-016-0377-1

93.

Sherbino

Kulasegaram

Howey

Norman

(2014). Ineffectiveness of cognitive forcing strategies to reduce biases in diagnostic reasoning: A controlled trial. Canadian Journal of Emergency Medicine, 16(1), 34–40. https://doi.org/10.2310/8000.2013.130860

94.

Sherbino

Yip

Dore

K. L.

Siu

Norman

G. R.

(2011). The effectiveness of cognitive forcing strategies to decrease diagnostic error: An exploratory study. Teaching and Learning in Medicine, 23(1), 78–84. https://doi.org/10.1080/10401334.2011.536897

95.

Singh

Khan

Scott

Jasper

Singh

(2017). Lesson of the month 2: A choroid plexus papilloma manifesting as anorexia nervosa in an adult. Clinical Medicine, 17(2), 183–185. https://doi.org/10.7861/clinmedicine.17-2-183

96.

Sousa

V. D.

Rojjanasrirat

(2011). Translation, adaptation and validation of instruments or scales for use in cross-cultural health care research: A clear and user-friendly guideline. Journal of Evaluation in Clinical Practice, 17(2), 268–274. https://doi.org/10.1111/j.1365-2753.2010.01434.x

97.

Spangler

Hermansson

Smekal

Blomberg

(2019). A validation of machine learning-based risk scores in the prehospital setting. PLoS One, 14(12), Article e0226518. https://doi.org/10.1371/journal.pone.0226518

98.

Staal

Hooftman

Gunput

S. T. G.

Mamede

Frens

M. A.

Van den Broek

W. W.

Alsma

Zwaan

(2022). Effect on diagnostic accuracy of cognitive reasoning tools for the workplace setting: Systematic review and meta-analysis. BMJ Quality and Safety, 31(12), 899–910. https://doi.org/10.1136/bmjqs-2022-014865

99.

Swedish Ethical Review Act . (2003). Ministry of Education and Research. https://rkrattsbaser.gov.se/sfst?bet=2003:460

100.

Swedish Higher Education Regulation . (1993). Ministry of education and research. https://rkrattsbaser.gov.se/sfst?bet=1993:100

101.

Swedish National Agency for Education . (2011). Kunskapsbedömning: Vad, hur och varför? https://www.skolverket.se/publikationsserier/kunskapsoversikter/2011/kunskapsbedomning---vad-hur-och-varfor

102.

Swedish National Board of Health and Welfare . (2023a). Sveriges prehospitala akutsjukvård (Swedish prehospital emergency care). https://www.socialstyrelsen.se/globalassets/sharepoint-dokument/artikelkatalog/ovrigt/2023-2-8337.pdf

103.

Swedish National Board of Health and Welfare . (2023b). Sveriges prehospitala akutsjukvård (Swedish prehospital emergency care) – Appendix 10. https://www.socialstyrelsen.se/globalassets/sharepoint-dokument/artikelkatalog/ovrigt/2023-2-8337-bilaga10-enkatsvar.pdf

104.

Swedish Regulations on Prehospital Care . (2009). Swedish national board of health and Welfare. https://www.socialstyrelsen.se/kunskapsstod-och-regler/regler-och-riktlinjer/foreskrifter-och-allmanna-rad/konsoliderade-foreskrifter/200910-om-ambulanssjukvard-m.m/

105.

Tavares

Boet

Theriault

Mallette

Eva

K. W.

(2013). Global rating scale for the assessment of paramedic clinical competence. Prehospital Emergency Care, 17(1), 57–67. https://doi.org/10.3109/10903127.2012.702194

106.

Tiffen

Corbridge

S. J.

Slimmer

(2014). Enhancing clinical decision making: Development of a contiguous definition and conceptual framework. Journal of Professional Nursing: Official Journal of the American Association of Colleges of Nursing, 30(5), 399–405. https://doi.org/10.1016/j.profnurs.2014.01.006

107.

Walsh

Bailey

P. H.

Koren

(2009). Objective structured clinical evaluation of clinical competence: An integrative review. Journal of Advanced Nursing, 65(8), 1584–1595. https://doi.org/10.1111/j.1365-2648.2009.05054.x

108.

Widgren

B. M.

Jourak

(2011). Medical emergency triage and treatment system (METTS): A new protocol in primary triage and secondary priority decision in emergency medicine. Journal of Emergency Medicine, 40(6), 623–628. https://doi.org/10.1016/j.jemermed.2008.04.003

109.

Williamson

G. R.

(2005). Illustrating triangulation in mixed-methods nursing research. Nurse Researcher, 12(4), 7–18. https://doi.org/10.7748/nr2005.04.12.4.7.c5955

110.

Wilson

Harley

Steels

(2018). Systematic review and meta-analysis of pre-hospital diagnostic accuracy studies. Emergency Medicine Journal: Engineering Management Journal, 35(12), 757–764. https://doi.org/10.1136/emermed-2018-207588

111.

Wireklint

S. C.

Elmqvist

Parenti

Göransson

K. E.

(2018). A descriptive study of registered nurses’ application of the triage scale RETTS©; a Swedish reliability study. International Emergency Nursing, 38, 21–28. https://doi.org/10.1016/j.ienj.2017.12.003

112.

World Medical Association . (2013). World Medical Association declaration of Helsinki: ethical principles for medical research involving human subjects. JAMA, 310(20), 2191–2194. https://doi.org/10.1001/jama.2013.281053

113.

Yazdani

Hoseini Abardeh

(2019). Five decades of research and theorization on clinical reasoning: A critical review. Advances in Medical Education and Practice, 10, 703–716. https://doi.org/10.2147/AMEP.S213492

114.

J. Y.

Xie

Nan

Yoon

Ong

M. E. H.

Y. Y.

Cha

W. C.

(2022). An external validation study of the Score for Emergency Risk Prediction (SERP), an interpretable machine learning-based triage score for the emergency department. Scientific Reports, 12(1), Article 17466. https://doi.org/10.1038/s41598-022-22233-w

115.

(2016). Stress potentiates decision biases: A stress induced deliberation-to-intuition (sidi) model. Neurobiology of Stress, 3, 83–95. https://doi.org/10.1016/j.ynstr.2015.12.006
