Abstract
Background
Delirium is a neuropsychiatric syndrome common in critical illness. Worsening delirium severity is associated with poorer clinical outcomes, yet its assessment remains under-reported with most severity assessment tools not validated for critical care. The DRS-R98 is a widely applied and validated tool. The aim of this project is to report the validation and utility of the DRS-R98 in critical illness.
Methods
This prospective cohort study was conducted in adults with delirium admitted to a critical care unit and predicted to stay for ≥ 24 h. We excluded patients with severe neurological or communication barriers that would have interfered with the DRS-R98 assessment. Patients were screened using a delirium detection algorithm and the Confusion Assessment Method for the Intensive Care Unit. Information on eligible patients was collected and reported to the qualified assessors, who then visited the clinical areas, confirmed the presence of delirium and undertook the DRS-R98 assessments. To assess the tool's construct validity, an intensivist completed the Clinical Global Impression-Severity scale (CGI-S). To calculate the inter-rater reliability (IRR), a subset of patients was evaluated simultaneously by two assessors.
Results
We assessed 22 patients; 73% were male, with a median age of 65 years (IQR 14). The DRS-R98 mean (SD) severity score was 24 (+/-7.7), the total scale score was 29 (+/-8.0), and the CGI-S score was 3.5 (+/-1.5). Assessment duration was 90 min (+/-55) for record data extraction and 15 min (+/-5) for clinical assessment. The CGI-S correlated significantly with the DRS-R98 severity (r = 0.626) and total (r = 0.628) scales. The DRS-R98 Cronbach's alpha was 0.896 for the severity scale and 0.886 for the total scale. The IRR was assessed in six patients, giving an intraclass correlation coefficient of 0.505 (p = 0.124) for the severity scale and 0.565 (p = 0.093) for the total scale.
Conclusions
In critical care, the Delirium Rating Scale R98 had good construct validity, excellent internal consistency, and moderate inter-rater reliability.
Background
Delirium is a neuropsychiatric presentation common in critical illness. The incidence is as high as 89% in mechanically ventilated patients.1 The cause is multifactorial, combining predisposing risk factors, including older age (> 65 years), frailty, alcohol dependency and medication with a high anticholinergic burden, with precipitating risk factors such as sepsis, exposure to deliriogenic medications (eg, benzodiazepines and opioids) and disturbance in circadian rhythm.2,3
Several outcomes are used as primary measures in delirium prevention and treatment. Prior to around 2010, there was greater emphasis on perceiving delirium as a binary condition, and outcomes that focused on delirium resolution dominated the literature.4
Since 2010, there has been greater interest in considering delirium as a continuum and in reporting outcomes that include delirium intensity and changes over time. Such outcomes can capture the different dimensions of delirium, with unique features that binary or categorical outcomes can overlook.
The concept of delirium as a continuum can support clinicians in monitoring a patient's progress. Greater delirium severity is generally associated with poorer outcomes, both during hospitalisation and on discharge,5 including an increase in long-term cognitive disturbance.6,7 Delirium severity assessment could therefore be a suitable outcome measure for clinical and research purposes. Rose et al recently included delirium severity as a core delirium outcome in their proposed core outcome set, which was developed by the “Del-COrS group”, a group of international delirium experts and organisations that support those surviving critical illness.8,9
Although a number of psychometric tools have been designed and validated to assess and numerically score delirium severity in hospital inpatients,4,10,11 the use of delirium severity assessment scales in critical illness remains limited.12,13 The majority of these tools were not designed or validated to assess delirium severity in patients admitted to a critical care setting.14
The revised delirium rating scale (DRS-R98) remains one of the most validated, widely translated, and extensively applied delirium severity assessment scales outside of critical care, including in medical, surgical, and older-age populations.13,15–47 The DRS-R98 has been applied in critical care, yet as far as we are aware, it has not been validated as a delirium severity assessment tool in this setting.13,14 Validating a tool in a specific setting is an essential step to justify its adoption in research and integration into clinical practice.
Study aim
The aim of this study is to report our experience in validating the DRS-R98 in a critical care setting.
Method
Study Design
Prospective, cross-sectional, single-centre cohort project.
Study Setting
The study assessment was undertaken in the adult tertiary referral critical care unit at Guy's and St Thomas' (GSTT) NHS Foundation Trust, London, UK, from 01/02/2016 to 20/09/2018.
Ethical Approval
Delirium severity assessment was included as a core aspect of the GSTT 2016 critical care analgesia, sedation, and delirium guidelines.48 A clinical service evaluation authorisation request was submitted to the GSTT research and development department (project 5711) to authorise the study. The study was approved, and the research and development department deemed a full application for ethical approval unnecessary. The intensivist in charge of the patient's care was contacted before each assessment, and the patient, family and visitors were informed whenever possible.
Participants
Patients were eligible for delirium severity assessment if they were admitted to critical care and expected to stay for longer than 24 h. A member of the assessment team confirmed that they met the DSM-IV diagnostic criteria for delirium on the assessment day.49 Patients were not eligible if there was a neurological or communication barrier that could significantly interfere with the ability to collect the data needed to evaluate most of the DRS-R98 scale items. This included severe cerebrovascular accident (CVA), post-traumatic brain injury (TBI) with evidence of hypoxic brain damage, and deep sedation [Richmond Agitation Sedation Scale (RASS) ≤ -3].
Procedures
The core team comprised a research fellow (EA), a psychiatric nurse/psychologist (MB), a neuropsychiatry registrar (AP), a consultant neuropsychiatrist (MP) and a consultant pharmacist (CM).
Prior to the study, a copy of the DRS-R98 administration manual was provided to the team.50 EA facilitated a meeting with the primary assessors (MP and MB) to discuss the approach to assessing different items in selected patients.
A delirium detection algorithm for ICU was used to screen and identify eligible patients.51 This involved applying the collected data to perform a delirium detection chart review, followed by formal delirium screening whenever feasible, using the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU).52,53 Patients were screened on average twice weekly over the study period, although sometimes considerably less often, depending on the team's clinical demands, which took precedence.
The collected data included delirium-relevant multidisciplinary team (MDT) notes from the previous 24 h, medication history and selected neurological assessments (eg, CAM-ICU and RASS). All data were collected and stored in an anonymised case report form (CRF). Historical, social, and clinical information relevant to delirium assessment and management was extracted by EA.
The assessors (MP and AP or MB) liaised with EA and arranged a time to visit the patient within 2 h of initial screening. While visiting, the primary assessors (MP or MB) confirmed that the patient met the DSM-IV diagnostic criteria for delirium.49 The DRS-R98 scale items were then scored by the assessors (MP and AP or MB) using the extracted data, clinical observations, and interviews with staff, family, and/or caregivers. The DRS-R98 form was completed concurrently, collected by EA and stored securely for analysis.
Concurrent validity of the DRS-R98 was evaluated by direct comparison of the DRS-R98 score with the CGI-S score assessed by the consultant intensivist in charge of the patient's care.54 To assess inter-rater reliability (IRR), a subset of patients was evaluated by MP and AP. The primary assessor (MP) undertook a clinical interview, and DRS-R98 forms were completed independently and simultaneously by both assessors.
To assess the tool's ability to detect change over time, EA tracked recruited patients to identify those who had recovered from delirium before critical care discharge, so that they could be evaluated after recovery using the DRS-R98 severity subscale and the Clinical Global Impression-Improvement scale (CGI-I).
Figure 1 illustrates the study flow chart.

DRS-R98 process, from screening to validation.
Data Management
All patient identifiers (eg, name, hospital number, gender, critical care admission and discharge dates) were coded and saved in a separate encrypted folder.
Definitions and Measurement Tools
Revised Delirium Rating Scale (DRS-R98)
The DRS-R98 consists of 16 items split into diagnostic and severity scales.55 The diagnostic scale includes three items and the severity scale 13 items. Each item is rated on a scale of 0 to 2 or 0 to 3, with higher scores indicating greater severity. At baseline, both the diagnostic and severity scales are assessed to give the total score, and follow-up assessments are completed using the severity scale only. The maximum total scale score is 46, and the maximum severity scale score is 39.
The DRS-R98 severity scale can then be subclassified into non-cognitive scores (items 1-8) and cognitive scores (items 9-13), which provide additional delirium severity monitoring strategies.26,42,56
Figure 2 shows the DRS-R98 content, structure and sub-classifications, together with the thresholds and ranges for the different severity stages used in this study.
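The scoring structure described above can be illustrated with a minimal Python sketch. The function name `drs_r98_scores` and the example ratings are assumptions made purely for illustration; they are not study data and are not taken from the DRS-R98 manual. The range check is deliberately coarse, since some items have a maximum rating of 2 rather than 3.

```python
def drs_r98_scores(items):
    """Sum 16 DRS-R98 item ratings (items[0] is item 1) into scale scores."""
    if len(items) != 16:
        raise ValueError("expected 16 item ratings")
    # Coarse check only: individual items are rated 0-2 or 0-3.
    if any(not 0 <= score <= 3 for score in items):
        raise ValueError("item ratings must lie between 0 and 3")
    severity = sum(items[:13])    # severity scale: items 1-13
    diagnostic = sum(items[13:])  # diagnostic scale: items 14-16
    return {
        "severity": severity,
        "total": severity + diagnostic,
        "non_cognitive": sum(items[:8]),  # items 1-8
        "cognitive": sum(items[8:13]),    # items 9-13
    }


# Hypothetical ratings for one patient (illustrative only).
example = [2, 1, 0, 2, 1, 2, 3, 0, 2, 2, 1, 1, 2, 2, 1, 2]
scores = drs_r98_scores(example)
```

At follow-up, when only the severity scale is assessed, the same sums over items 1-13 would apply and the total score would not be reported.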

DRS-R98 individual components and subclassifications.
Clinical Global Impression-Severity and Improvement Scales
The clinical global impression severity (CGI-S) and clinical global impression improvement (CGI-I) scales are clinician-rated scales commonly used in psychiatry; both can be applied by clinicians without specific training.57 The CGI-S is a single overall impression of the patient's psychiatric illness. It rates severity on a Likert-type scale ranging from 1 (normal) to 7 (most severely observed).
The CGI-I score reflects the degree of change in the patient's psychiatric illness status from baseline and is rated on a 7-point Likert scale ranging from 1 (very much improved) to 7 (very much worse).
Motor Subtype Classification
A motor subtype classification was applied using the DRS-R98 severity scale scores on item 7 (motor agitation) and item 8 (motor retardation). Patients with a score of ≥ 1 on item 7 only were classified as hyperactive, those with a score of ≥ 1 on item 8 only were classified as hypoactive, and those with a score of ≥ 1 on both items 7 and 8 were classified as having a mixed motor subtype.58
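The classification rule above can be written as a short decision function. This is a sketch only: the function name and the label returned when neither item scores ≥ 1 are assumptions, as the text does not name that category.

```python
def motor_subtype(item7, item8):
    """Classify motor subtype from DRS-R98 item 7 (agitation) and item 8 (retardation)."""
    agitation = item7 >= 1
    retardation = item8 >= 1
    if agitation and retardation:
        return "mixed"
    if agitation:
        return "hyperactive"
    if retardation:
        return "hypoactive"
    # Assumed label: the source does not name the case where both items score 0.
    return "no motor subtype"
```

Checking the mixed case first keeps the three subtypes mutually exclusive, so each patient receives exactly one label.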
Delirium Recovery
Delirium recovery was defined as resolution of delirium, after treatment was initiated by the clinician in charge, during the same critical care admission. It was determined by documentation of two consecutive negative CAM-ICU recordings within the previous 24 h and by applying the delirium detection algorithm. Recovery was further confirmed by MP or MB using the DSM-IV criteria.
Sample Size
The sample size was calculated a priori, with advice from medical statistics at King's College London (KCL).59 The analysis was based on values from the original validation study, in which the DRS-R98 total score correlated with the CGI-S with a correlation coefficient (r) of 0.62.59
To demonstrate the DRS-R98's validity in measuring delirium severity in critical illness, a minimum of 18 patients was required to detect a correlation of 0.62 or higher between DRS-R98 total and CGI-S scores (p ≤ 0.05, power = 80%).60 To determine the tool's inter-rater reliability (IRR), a minimum sample of 10 patients was needed to detect a concordance between assessors of 0.7 or more, calculated as the scale's intraclass correlation.60,61
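The figure of 18 patients is consistent with the widely used Fisher z approximation for the sample size needed to detect a correlation. The sketch below assumes a two-sided test against a null correlation of zero; the source does not state which method or software produced the calculation, so this is an illustration and not a reconstruction of it. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect correlation r (two-sided, Fisher z method)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    fisher_z = math.atanh(r)                       # Fisher z-transform of r
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)


n_required = n_for_correlation(0.62)
```

Under these assumptions the approximation returns 18 for r = 0.62, matching the minimum stated above; smaller expected correlations would require larger samples.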
Statistical Analysis
Continuous variables (eg, age and scale scores) were described as a mean (with standard deviation) or median (with range). Categorical variables (eg, gender) were presented as counts and/or percentages.
Construct validity was assessed by correlating DRS-R98 total and severity scale scores with the CGI-S scores using Pearson's correlation coefficient, while the internal consistency was evaluated by Cronbach's alpha coefficient.
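For readers less familiar with these two statistics, the following sketch shows how Pearson's correlation coefficient and Cronbach's alpha are computed from raw scores. The study used GraphPad Prism and SPSS; the functions and the example ratings below are hypothetical and only illustrate the standard formulas.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one patient's ratings on k items."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: five patients, four items (not study data).
ratings = [[1, 2, 1, 2], [2, 2, 3, 3], [0, 1, 1, 0], [3, 3, 2, 3], [1, 1, 2, 1]]
cgi_s = [3, 5, 2, 6, 3]
r = pearson_r([sum(row) for row in ratings], cgi_s)
alpha = cronbach_alpha(ratings)
```

The "alpha after item deletion" values reported for the scale correspond to calling `cronbach_alpha` on the same rows with one item column removed at a time.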
The inter-rater reliability (IRR) was estimated by calculating the intraclass correlation coefficient (ICC) between the paired assessors' scores [using a two-way mixed model, reported for single measures].62
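A two-way mixed, single-measure ICC can be derived from the mean squares of a two-way ANOVA. The sketch below assumes the consistency definition, often written ICC(3,1); the source does not state whether consistency or absolute agreement was selected, so this choice and the function name are assumptions.

```python
def icc_consistency_single(scores):
    """ICC(3,1): two-way mixed model, consistency, single measure.

    scores: one row per patient, one column per rater.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    ss_error = ss_total - ss_rows - ss_cols                 # residual
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
```

Under the consistency definition a constant offset between raters does not lower the coefficient, whereas an absolute-agreement ICC would penalise it.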
Missing or non-assessable items were substituted with the person mean value, calculated from the remaining scored items in the assessed case.63 All analyses were undertaken with GraphPad Prism 7 and IBM SPSS 24 software.
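The substitution rule described above (replacing a missing item with the mean of the remaining scored items in the same case) can be sketched as follows. The function name is hypothetical, missing items are represented by `None`, and whether the study rounded the imputed values is not stated, so no rounding is applied here.

```python
def impute_case_mean(items):
    """Replace missing ratings (None) with the mean of the case's scored items."""
    observed = [score for score in items if score is not None]
    if not observed:
        raise ValueError("no scored items available for imputation")
    case_mean = sum(observed) / len(observed)
    return [case_mean if score is None else score for score in items]


# Hypothetical case with one non-assessable item (illustrative only).
completed = impute_case_mean([2, None, 1, 3])
```

Because the substituted value comes from the same patient's other items, this approach leaves the mean of that patient's item ratings unchanged.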
Results
Participants and Scales
Twenty-two critical care patients were assessed with the DRS-R98 over the study period of 31.5 months, although the actual recruitment time was only about six months of the 31.5 because clinical work took precedence. The recruited patients were largely admitted to a level 3 (highest acuity) critical care unit (n = 18, 82%),64 and most were male (n = 16, 73%), with a mean age of 65 years (SD +/- 14). Two had a previous medical history (PMH) of dementia (9%), five had a history of psychiatric and/or neurological disease (23%) and three reported alcohol or drug dependence (14%).
Most were able to communicate verbally (n = 16, 73%), including patients communicating via a speaking valve fitted to a tracheostomy.
Fourteen patients (64%) had an atypical antipsychotic prescribed for agitation and/or delirium on assessment. The most frequently prescribed agent was quetiapine (n = 8), followed by olanzapine (n = 4).
Thirteen patients (59%) were prescribed a sedative on the assessment day; the most frequently prescribed agent was propofol (n = 7). Seventeen patients (77%) were also prescribed an analgesic, with the most frequently prescribed agents being paracetamol (n = 11) and fentanyl (n = 4). Patients' characteristics, including delirium-relevant previous medical history and treatments, are described in [Table 1].
Patients Demographics, Administered Medications and Delirium Motor Subtype.
N number of recruited subjects, PMH pre-admission medical history, * mean (standard deviation), ** count (percentage), ◊ on assessment day,
Thirty-five percent of patients were classified with mild, 40% with moderate and 25% with severe delirium.
In 17 patients (77%) the motor subtype was reported as mixed, four (18%) had hyperactive delirium and one (5%) had hypoactive delirium.
The mean (+/− standard deviation) DRS-R98 total and severity scores, overall and for individual items, and the Clinical Global Impression severity scale scores are summarised in [Table 2].
Mean and Standard Deviation of the DRS-R98 (Total and Severity) Overall and Individual Items Scores, Internal Consistency, and CGI-S Scores.
* Mean and standard deviation calculated using imputed data set, N number of patients, ** missing data analysed using the non-imputed data set scored by primary assessors, r Cronbach's Alpha coefficient after item deletion, NA not applicable, DRS-R98 severity scale includes items 1 to 13, DRS-R98 total scale includes items 1 to 16
Main Results
The DRS-R98 severity scale score (r = 0.626, p = 0.002) and total score (r = 0.628, p = 0.002) correlated with the CGI-S score. The DRS-R98's internal consistency was high, with a Cronbach's alpha coefficient of 0.896 for the DRS-R98 severity scale items and 0.886 for the total scale. When individual items were deleted, the coefficients ranged from 0.877 to 0.913 for the severity scale and from 0.867 to 0.903 for the total scale. Table 2 provides a summary of DRS-R98 internal consistency for both the total and severity scales (including item deletion).
The intraclass correlation coefficient (ICC) calculated between the scores of paired assessors in six patients was moderate. The ICC was 0.505 (95% CI −0.988 to 0.961, p = 0.124) for the DRS-R98 severity scale and 0.565 (95% CI −0.403 to 0.912, p = 0.093) for the total scale.
The scale's sensitivity to change over time was not calculated. None of the assessed patients recovered fully from delirium, and thereby qualified for post-recovery evaluation, during their critical care admission, and approval covered critical care areas only.
Scale Implementation, Duration and Missing Data
The mean time for the data extraction and collection process was 90 min (SD +/- 55), and the mean duration of clinical assessment and DRS-R98 scoring by the primary assessor was 15 min (SD +/- 5).
Fifteen (68%) of the assessed cases had incomplete data, with a total of 43 missing values (12.2%). The most common reason (n = 40, 93%) for incomplete data was inability to assess.
The DRS-R98 items with no missing data were item 1 (sleep-wake cycle), item 4 (lability of affect), item 7 (motor agitation), item 8 (motor retardation), item 14 (temporal onset), item 15 (fluctuation) and item 16 (physical disorder). The items with the most missing values, as rated by MP and MB respectively, were item 13 (visuospatial ability, 33% and 38%), item 12 (long-term memory, 33% and 31%) and item 11 (short-term memory, 33% and 25%).
The percentage of missing data values for DRS-R98 items scored varied between experts MP and MB (19.7% and 9.3%, respectively).
Discussion
In this project, we describe the validation of the DRS-R98 in 22 critical care patients with delirium. As far as we are aware, this is the first validation report for the DRS-R98 in a critical illness setting.
Our work demonstrates that the DRS-R98, when applied in critical care, exhibited a strong and significant correlation with the CGI-severity scale; this suggests excellent construct validity. The DRS-R98's Cronbach's alpha coefficient which assesses for internal consistency and homogeneity was excellent, even after individual item deletion. These findings were aligned with the DRS-R98 original validation work and validation studies in different inpatient settings.18,19,59
In 2018, Jones et al published a systematic review of delirium severity instruments, in which the DRS-R98 had one of the highest ratings.65 The DRS-R98's multidimensional structure gives a detailed description of delirium phenomenology and of prevention and treatment strategies.66
Many critically ill patients (eg, intubated or severely ill patients) have limited ability to engage in interviews and assessments, especially those with a predominant verbal component.67 One advantage of the DRS-R98 is that it integrates several information sources, including direct observation, staff interview and medical records,50 and is thus better suited to critical care. Furthermore, the DRS-R98 is one of the few severity assessment tools with a detailed application manual, in which individual items are anchored with clear descriptions.50 This further supports the severity assessment process. Finally, using a numeric scale rather than a categorical screen (eg, CAM-ICU) gives the opportunity to assess the severity of a patient's delirium over time and is arguably better suited to delirium's fluctuant nature.
The project had several strengths, including a pragmatic design with less restrictive recruitment criteria, and implementation by primary assessors who were experts and able to complete the assessment using the DRS-R98 manual without additional training. Moreover, the recruited patients spanned different stages of delirium severity and had higher mean delirium severity scores than those reported in earlier validation studies, where severe delirium is frequently underrepresented.13,15,16,18,20–22,59
We acknowledge that the project had its limitations. Firstly, the DRS-R98 construct validity was only assessed via correlation with CGI-S. At the design stage, other concurrent validity evaluation methods were considered. This included correlating the DRS-R98 scores with other delirium severity assessment scales, but the investigators considered this impossible to achieve as, at the time, there were no delirium severity scales that were validated in critical illness.
A second limitation was that the DRS-R98 scores were not correlated with clinical outcomes such as length of stay and mortality. This was too difficult to achieve because of the study's irregular collection schedule and lack of access to patient data once discharged from the critical care area (our approval to undertake the DRS-R98 as a service evaluation was strictly for critical care).
Moreover, the inter-rater reliability was determined by calculating the intraclass correlation coefficient (ICC). The ICC calculated between MP and AP scores was moderate and not statistically significant. This was lower than that reported in the original and the majority of validation studies, which consistently reported excellent (≥ 0.9) and statistically significant ICCs.15,16,18,19–22,59 There were two likely reasons: we were unable to recruit our a priori agreed sample size for this assessment and, although it was undertaken by experts, it was not preceded by extensive training. The IRR was assessed in only six patients because the second assessor had additional clinical commitments that took precedence, and we were unable to recruit a qualified replacement.
Additionally, despite planning to do so, we were unable to assess patients after delirium resolution. This was because patients were discharged from the critical care area before the assessors could evaluate them on a second occasion. As the aim of this study was to describe our entire experience of using the DRS-R98 in critical illness, we believe it was essential to report all planned activity, including the logistical challenges that prevented us from meeting our pre-planned goals. Future delirium severity validation projects should consider reassessing the IRR with a larger sample size, providing extensive training on the application of the DRS-R98, and assessing the DRS-R98's ability to detect changes in delirium over time in a critical care setting.
Finally, the principal reason for selecting the DRS-R98 was that it integrates data from multiple sources. However, this requires time and effort to collect, compare, and evaluate each of the individual items or symptoms. The DRS-R98 collection process consisted of two stages. Stage one required a detailed review of medical records, which was time consuming, although it could be conducted in part remotely from the ICU (Figure 3 describes the location and duration of each assessment stage). This extraction process could be improved by designing a data extraction form and by improving the skill and knowledge of the extractor. Further, the duration of this stage is expected to reduce over time, with greater familiarity with the process and the patient, and with automated electronic capture of key data.

DRS-R98 assessment location, duration and impacting factors.
The second stage of data collection, unlike the first, was short, with a mean duration of 15 min. The DRS-R98 is designed to be implemented by a psychiatrist or other healthcare professional who is trained in assessing psychiatric conditions and who can expertly elicit information from patients. The need for this expertise was demonstrated in our cohort, as several assessed patients could not communicate verbally, could not move, and/or declined to interact. Despite the assessors' best efforts, eliciting the required information was not always possible, and we were therefore not surprised that items dependent on interview and clinical assessment (eg, visuospatial ability, memory) were the most challenging to evaluate, whereas items that depended on other information sources (eg, sleep disorder, motor agitation) were easier to evaluate and complete. On reflection, a discussion after each assessment session may help determine how missing items could best be assessed.
Our overall impression is that the DRS-R98 is a potential tool for research and for complex delirium case evaluation in a critical care setting, although its lengthy process and the need for expert staff to extract and assess the relevant information could limit its application in routine clinical use.
Conclusion
This study has demonstrated that the DRS-R98 can successfully assess delirium severity in a critical care setting. It has demonstrated good construct validity, excellent internal consistency, and moderate inter-rater reliability. In clinical practice, the DRS-R98 could be applied as a structured approach to assessing complex delirium in critical care. In research, the tool could have a valuable role in describing delirium phenomenology and assessing response to delirium prevention and treatment approaches.
Footnotes
Acknowledgment and Credits
The authors recognise the contribution of Prof Paula T. Trzepacz.
Declaration of Conflicting Interest
The authors have no competing interests to declare that are relevant to the content of this article.
Funding
The research is part of the first author's PhD degree. Except for the student bench fees, the author(s) received no financial support for the research or authorship of this article. Open access for the article was provided by Southampton University.
