Abstract
Background:
The use of observational measures to assess palliative care patients’ level of consciousness may improve patient care and comfort. However, there is limited knowledge regarding the validity and reliability of these measures in palliative care settings.
Aim:
To identify and evaluate the psychometric performance of observational level of consciousness measures used in palliative care.
Design:
Systematic review; PROSPERO registration: CRD42017073080.
Data sources:
We searched six databases until November 2018, using search terms combining subject headings and free-text terms. Psychometric performance for each identified tool was appraised independently by two reviewers following established criteria for developing and evaluating health outcome measures.
Results:
We found 35 different level of consciousness tools used in 65 studies. Only seven studies reported psychometric performance data, covering just eight tools. All other studies used either ad hoc measures for which no formal validation had been undertaken (n = 21) or established tools mainly developed and validated in non-palliative care settings (n = 37). The Consciousness Scale for Palliative Care and a modified version of the Richmond Agitation–Sedation Scale received the highest ratings in our appraisal, but, since psychometric evidence was limited, no tool could be assessed for all psychometric properties.
Conclusion:
An increasing number of studies in palliative care are using observational measures of level of consciousness. However, only a few of these tools have been tested for their psychometric performance in that context. Future research in this area should validate and/or refine the existing measures, rather than developing new tools.
Keywords
The European Association for Palliative Care (EAPC) framework for sedative use recommends that patients’ level of consciousness should be evaluated as part of their periodical assessments during and after administering sedative medication.
Observational measures are frequently employed for monitoring consciousness levels in settings where sedatives and analgesics are commonly used.
The use of observational measures to assess palliative care patients’ level of consciousness may improve patient care and comfort; however, little is known about which measures are the most appropriate, valid and reliable to use in the palliative care setting.
An increasing number of studies are using observational tools for the assessment of palliative care patients’ level of consciousness.
Only eight of these tools have been tested for their psychometric performance with palliative care patients in single validation studies, and none have been tested for all measurement properties.
Most measures of level of consciousness used in primary studies are ad hoc tools for which no formal validation has been undertaken or tools developed and validated in non-palliative care settings.
Clinicians and researchers should be mindful of the limited evidence supporting the psychometric quality of existing level of consciousness measures, especially in terms of responsiveness, when using such scales in the palliative care setting.
Future research should focus on validating and refining the existing measures for use in palliative care, rather than developing new tools.
Background
Palliative care patients may experience alterations in their level of consciousness, either as a result of disease and symptom progression or as an effect of different pharmacological treatments. 1 Clinicians may intentionally reduce the consciousness of some patients, especially towards the end of life when symptom burden tends to increase, by administering sedative and/or analgesic medication. This practice aims to relieve patients’ intractable distress resulting from one or more treatment-resistant symptoms. 2
National and international palliative care organisations recommend using sedative medication for the alleviation of refractory symptoms at the end of life. 3 However, the prevalence and practice of sedative use vary considerably according to setting and country.4–6 Nevertheless, the majority of clinical practice guidelines on the use of sedatives in palliative care agree that sedative medication should be used proportionately, to the extent that distressing symptoms for each individual patient are adequately addressed.2,7,8
Inappropriate use of sedative and analgesic medication may have considerable consequences for the care and experience of patients and family members. A survey among palliative care nurses found that sedative use was considered insufficiently effective by approximately 40% of the respondents, 9 while another study reported suboptimal use of palliative sedation performed by general practitioners in 11 of the 27 described cases. 10 Inadequate symptom palliation can be traumatic for patients and a significant source of emotional distress for their families.10,11 Conversely, the use of disproportionately high doses of sedatives may be equally distressing for relatives due to the impaired ability of the patient to interact with family members and the possible risk of hastening death.12,13
The European Association for Palliative Care (EAPC) framework for sedative use recommends that patients’ level of consciousness should be evaluated as part of their periodical assessments during and after administering sedative medication. This is in order to avoid the effects of over- or under-sedation and fulfil the requirements of proportionality. 2 In settings where sedatives and analgesics are commonly used, observer-rated measures are frequently employed for monitoring consciousness levels.14–16 A review of sedation instruments in intensive care units identified 25 studies describing relevant tools. 14 Similarly, another review found that numerous tools measuring sedation depth had been used in clinical research on procedural sedation. 16 Although the authors of these reviews concluded that further research into the psychometric performance of the identified measures is needed, a number of measures achieved high ratings for validity and reliability in the settings/populations in which they were tested. Most of the instruments in these studies comprise a single item with a categorical grading representing decreasing levels of consciousness, usually assessed by patients’ response to stimulation of increasing intensity. This type of scale structure may create overlaps between different consciousness levels which are not necessarily mutually exclusive, but provides benefits in terms of simplicity and ease of use, so allowing for repeated administrations to be quickly performed and, consequently, enabling the close monitoring of responses to sedative and analgesic use. 17 Other advantages of using valid and reliable observational measures for the assessment of level of consciousness include improved consistency in medication administration, better communication among healthcare professionals, enabling the development of sedation guidelines and protocols, and facilitating comparison between research data and findings.18–20 Occasionally, level of consciousness scores may also provide an indication of disease progression and expected survival.21–23
Despite these benefits being highly applicable and relevant to the palliative care context, little is known about which measures are the most appropriate, valid and reliable to use with palliative care patients. The aim of the present systematic review, therefore, was to (1) identify all relevant observational level of consciousness tools used in primary research studies, (2) describe their content and (3) critically appraise their psychometric performance. This review was undertaken as part of the sedation work package of I-CAN-CARE (Improving care, assessment, communication and training at the end of life), a Marie Curie-funded research programme on prognosis and sedative use in palliative care.
Methods
This review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement 24 and the review protocol published in the International Prospective Register of Systematic Reviews (PROSPERO; registration number: CRD42017073080).
Search strategy
A four-step search strategy was employed (Table 1). An initial broad search was performed to identify primary research studies reporting the use of observational level of consciousness measures and produce a list of search terms. Six databases were then systematically searched using a combination of subject headings and free-text terms for palliative care, measurement instruments and sedative use, adjusted for each database. Subsequently, the reference lists of all included papers were hand-searched for relevant publications. When eligible articles were identified, the process of backward reference searching was repeated until no more relevant publications could be located. The same method was applied for finding newer studies citing the included papers. Finally, the authors of conference abstracts meeting inclusion criteria were contacted for full-text publications. Where relevant data were missing from included papers, authors were also contacted.
Table 1. Search strategy and eligibility criteria.
CENTRAL: Cochrane Central Register of Controlled Trials; CINAHL: Cumulative Index to Nursing and Allied Health Literature; WoS: Web of Science.
Eligibility criteria
Full-text publications of primary studies (prospective or retrospective, patient-based or clinician-based) describing the use of observational measures (validated or ad hoc) for the assessment and/or monitoring of level of consciousness/sedation depth in adult palliative care patients were included.
We excluded non-primary studies, such as systematic reviews, and studies providing no information about sample size. Due to resource constraints, non-English language publications were also excluded.
Study selection
After removing duplicates, 11,938 titles and abstracts were screened against eligibility criteria (A.M.K.). A second reviewer (E.M.) independently screened a random 10% selection. The inter-reviewer agreement for the initial title and abstract screening was κ = 0.71. Full-text publications which potentially met inclusion criteria after first screening were each independently assessed for eligibility by two reviewers from a group of six (A.M.K., J.S., E.M., S.M., B.V. and P.S.). Discrepancies at each stage of study selection were resolved through discussion.
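As an illustration of how an agreement statistic such as the κ reported above can be obtained, the following minimal sketch computes unweighted Cohen's kappa for two reviewers' include/exclude decisions. The function name `cohen_kappa` and the screening decisions are hypothetical and are not the review's data; the sketch is not intended to reproduce the reported value of 0.71.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical decisions."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of records on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for 100 abstracts (illustrative only).
reviewer_1 = ["include"] * 20 + ["exclude"] * 80
reviewer_2 = (["include"] * 15 + ["exclude"] * 5
              + ["include"] * 5 + ["exclude"] * 75)

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 4))
```

In this hypothetical example the reviewers agree on 90 of 100 records, and chance agreement is 0.68, so kappa is (0.90 − 0.68)/(1 − 0.68).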
Data extraction
We extracted the following information for each included study into a standardised form: first author, date of publication, country of origin, study aim(s), setting, sample size and participant characteristics. For each measure identified, tool name, measurement aim/purpose, number of subscales and items and response options were extracted. Data on the psychometric performance of instruments, where available, were also extracted.
Psychometric performance of included measures
We used a checklist (Table 2) to evaluate the psychometric performance of included measures. This checklist drew on that developed by Zwakhalen et al. 25 with some modifications, following discussion between A.M.K. and B.V., based on the established criteria for developing and evaluating health outcome measures.26–28
Table 2. Quality criteria for measure appraisal.
The psychometric properties appraised include the reported validity, reliability and responsiveness of measures. In addition, the feasibility and origin (source) of tool items were also evaluated.
Validity of an instrument was defined as an assessment of the extent to which it measures what it purports to measure. 26 It is generally understood that there are four types of validity; we assessed three of these: (1) content validity: the degree to which the construct of interest is comprehensively represented by the measure items, assessed through the extent of involvement of the target population in item selection and the provision of a clear description of the concept that the instrument is intended to measure; 28 (2) construct validity: correlation of the level of consciousness scale with other instruments that are known to measure the same construct. Pearson’s or Spearman’s correlation coefficient of 0.6 or above was considered acceptable in this review; 25 (3) structural validity: assessed through the degree of variance explained by factor analysis. There is no agreed ‘gold standard’ for measuring level of consciousness in palliative care, so we did not assess the fourth type of validity, (4) criterion validity: the extent to which a proposed new measure correlates with another instrument generally accepted to accurately measure the construct of interest (‘gold standard’). 26
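The construct validity criterion described above can be sketched in code. The following example computes Spearman's rank correlation (with average ranks for ties) and applies the 0.6 threshold used in this review. The helper names and the paired scores are hypothetical. The sketch also assumes that both instruments are coded in the same direction; inversely coded scales would need their sign handled before the threshold is applied.

```python
from math import sqrt

ACCEPTABLE_RHO = 0.6  # threshold for acceptable construct validity in this review


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def construct_validity_acceptable(rho):
    """Apply the review's criterion: coefficient of 0.6 or above."""
    return rho >= ACCEPTABLE_RHO


# Hypothetical paired scores from two consciousness scales (illustrative only).
scale_a = [1, 2, 3, 4, 5]
scale_b = [2, 1, 4, 3, 5]
rho = spearman_rho(scale_a, scale_b)
print(round(rho, 3), construct_validity_acceptable(rho))
```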
Reliability refers to the overall consistency and reproducibility of a measure. 26 Four types of reliability estimates were included in our assessment criteria: (1) homogeneity (internal consistency), assessed through Cronbach’s alpha coefficient; (2) inter-rater reliability; (3) intra-rater reliability; and (4) test–retest reliability. The common statistical methods for evaluating the latter three properties are intraclass correlation coefficient (ICC) for continuous measures and Cohen’s kappa for nominal/ordinal measures. 28 We took values of less than 0.6, between 0.6 and 0.8, and greater than 0.8 as indicative of low, adequate and high reliability, respectively.
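The reliability bands above translate directly into a small classification rule. The sketch below is illustrative; the function name `reliability_band` is hypothetical, and treating the boundary values 0.6 and 0.8 as 'adequate' is our reading of 'between 0.6 and 0.8'.

```python
def reliability_band(coefficient):
    """Classify an ICC or kappa value using the review's thresholds.

    < 0.6 -> low; 0.6 to 0.8 (inclusive, by assumption) -> adequate; > 0.8 -> high.
    """
    if coefficient < 0.6:
        return "low"
    if coefficient <= 0.8:
        return "adequate"
    return "high"


# Illustrative coefficients spanning the three bands.
for value in (0.44, 0.71, 0.807, 0.99):
    print(value, reliability_band(value))
```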
Responsiveness is the ability of an instrument to detect clinically meaningful changes over time in the construct measured. The most common approaches to assessing responsiveness are the correlations of change scores for an instrument over time with changes in other available variables, and the area under the receiver operating characteristic (ROC) curve (AUC).26,28
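To make the AUC approach concrete, the sketch below estimates the area under the ROC curve as the probability that a randomly chosen patient judged to have changed has a larger change score than a randomly chosen patient judged unchanged, with ties counted as one half. This is one standard way of computing the AUC, not a method taken from the included studies. The function name, the external judgement of change and the change scores are all hypothetical.

```python
def auc_from_change_scores(changed, unchanged):
    """AUC as the proportion of (changed, unchanged) pairs ranked correctly.

    A pair counts 1 when the changed patient's score is higher,
    0.5 when the two scores are tied, and 0 otherwise.
    """
    if not changed or not unchanged:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for c in changed:
        for u in unchanged:
            if c > u:
                wins += 1.0
            elif c == u:
                wins += 0.5
    return wins / (len(changed) * len(unchanged))


# Hypothetical change scores on a consciousness scale (illustrative only),
# grouped by an assumed external judgement of clinically meaningful change.
changed_group = [3, 2, 4, 1]
unchanged_group = [0, 1, 1, 2]

auc = auc_from_change_scores(changed_group, unchanged_group)
print(auc)
```

An AUC of 0.5 would indicate that change scores do not discriminate between the two groups, and 1.0 would indicate perfect discrimination.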
Feasibility is described as the user-friendliness of a measure in terms of administration and processing. 26 The burden on staff of collecting and processing data is an important parameter to consider when selecting a tool for use in clinical practice or for research purposes. 26
Origin of items refers to whether the measure items were specifically developed for use with the target population, modified, or taken from a scale developed for another population. 25
Evidence of psychometric performance was categorised according to the aforementioned criteria. For each property, measures were scored according to the following scheme: 2 if the property was evaluated and fully met criteria; 1 if criteria were partially met; and 0 when criteria were not met. If a property was not evaluated/not reported or the information provided was unclear, a rating was not given. Psychometric properties were independently evaluated by two raters (A.M.K. and E.M.), achieving a high initial agreement (κ = 0.91). Raters conferred over discrepancies until full consensus on ratings was reached.
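The scoring scheme can be represented as a simple data structure in which an unrated property is kept distinct from a score of 0. The following is a minimal sketch under that reading; the names `Rating` and `summarise`, the example ratings and the idea of totalling the scores are assumptions for illustration and are not taken from the appraisal itself.

```python
from enum import IntEnum
from typing import Dict, List, Optional, Tuple


class Rating(IntEnum):
    NOT_MET = 0        # property evaluated, criteria not met
    PARTIALLY_MET = 1  # criteria partially met
    FULLY_MET = 2      # property evaluated and criteria fully met


# None marks a property that was not evaluated, not reported or unclear,
# for which no rating is given (this differs from a rating of 0).
Ratings = Dict[str, Optional[Rating]]


def summarise(ratings: Ratings) -> Tuple[int, int, List[str]]:
    """Return (total score, number of rated properties, unrated properties).

    Totalling the scores is a convenience assumed here for illustration.
    """
    rated = [r for r in ratings.values() if r is not None]
    unrated = [name for name, r in ratings.items() if r is None]
    return sum(rated), len(rated), unrated


# Hypothetical ratings for one measure (illustrative only).
example: Ratings = {
    "content validity": Rating.FULLY_MET,
    "construct validity": Rating.FULLY_MET,
    "inter-rater reliability": Rating.PARTIALLY_MET,
    "responsiveness": None,
}

total, rated_count, unrated = summarise(example)
print(total, rated_count, unrated)
```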
Results
The database search yielded 13,827 results. After removing duplicates and initial screening of titles and abstracts, 491 potentially eligible articles remained, which were examined in full. Of these, 55 met criteria for inclusion. A further 10 eligible studies were identified through forward and backward citation searching, resulting in 65 included studies (see Figure 1). Only seven studies provided data on the psychometric performance of level of consciousness tools in the palliative population; 21 studies presented information on ad hoc measures (i.e. those developed specifically for the purposes of individual studies); and 37 reported using established scales, the majority of which had been validated in non-palliative care settings. Table 3 presents a summary of study and measure characteristics.

Figure 1. PRISMA flow diagram of study selection process. 24
Table 3. Description of identified studies and measures.
CPS: continuous palliative sedation; BIS: bispectral index; GCS: Glasgow Coma Scale; PS: palliative sedation; RASS: Richmond Agitation–Sedation Scale; CDSUD: continuous deep sedation until death; CSD: continuous sedation until death; RSS: Ramsay Sedation Scale; VAS: visual analogue scale; RLS85: Reaction Level Scale 85; RDOS: Respiratory Distress Observation Scale; MWSS: Modified Wilson Sedation Scale; CSPC: Consciousness Scale for Palliative Care; AVPU: Alert/Verbal/Painful/Unresponsive Scale; CF: cognitive function; OAA/S: Observer’s Assessment of Alertness/Sedation; ILD: interstitial lung disease; JCS: Japan Coma Scale; RASS-PAL: Richmond Agitation–Sedation Scale–Palliative version; PCT: patient-controlled therapy; MSAT: Minnesota Sedation Assessment Tool; VICS: Vancouver Interaction and Calmness Scale; KNMG: Sedation score proposed in the Guideline for Palliative Sedation of the Royal Dutch Medical Association.
Description of included studies
Morita et al.41,42 published two articles in which separate analyses of data collected from a single study were performed. Similarly, Barbato et al.,53,54 Campbell et al.,57,59 Claessens et al.1,97,98 and Van Deijck et al.48,49 reported distinct findings from one study in two or more papers. Each of these papers described discrete study aims and outcomes, so we defined them as separate studies. A large number of studies reporting on level of consciousness measures have been published recently, with 26 of the 65 (40%) included studies published after 2013.
Most included studies were patient-based (n = 58), with recruitment and data collection conducted prospectively (n = 49). In eight studies some or all relevant data were obtained retrospectively from patients’ medical records,4,35,38,41,42,55,79,90 while in one study patients were recruited both prospectively (on admission) and retrospectively (after death). 37 Another study reported mixed methods for data collection: a prospective quantitative survey and semi-structured interviews with general practitioners involved in the practice of palliative sedation. 10 Six studies used questionnaires as a means of data collection.31,43,46–49 In these, researchers asked clinicians (physicians (n = 4)43,47–49 or nurses (n = 2)31,46) to provide information about patients under their care who had received sedative medication.
Studies were mainly conducted in a single setting (n = 36); principally hospices, palliative care units or hospitals. Nine studies involved home care patients,4,10,31,56,62,67,81,84,90 and an equal number included nursing home participants.31,37,44,46,48–50,92,93 One study included patients recruited from a cancer centre. 81
Sample size varied considerably (median: 132 participants, interquartile range (IQR): 44–266). The most prevalent diagnosis among study participants was cancer (n = 29). Other reported diagnoses included dementia (n = 3)37,44,50 and interstitial lung disease (n = 1). 79 A total of 32 studies reported mixed diagnoses or did not provide this information. Patients in almost all studies were at an advanced or an end stage of disease.
Reflecting the wide diversity of study aims, level of consciousness tools in each study were employed to serve a number of distinct purposes. The most frequently reported were: to assess/monitor sedation depth after palliative sedation initiation (n = 29), to assess effects or side effects of opioid use (n = 7),29,34,45,54,56,81,91 to evaluate signs/symptoms of impending death (n = 8)21–23,37,42,60,61,79 and to examine associations between level of consciousness and discomfort or other symptoms (n = 6).40,44,50,60,61,64 It is noteworthy that only four studies sought to validate level of consciousness instruments in the palliative care setting.18,73,87,93 Of these, only one aimed to develop a new tool. 18
Description of identified measures
A total of 35 different measures assessing level of consciousness were described in the articles included in this review. Only eight were measures for which evidence of psychometric quality in the palliative setting was available. Fifteen were established instruments or single items taken from compound scales validated as a whole, and 17 were tools constructed for individual study purposes (ad hoc measures). Information on psychometric performance in palliative care was provided for five of the 15 established measures; there is therefore an overlap between the first two categories described (see Figure 2). Across all categories, the tool most frequently employed was the original Richmond Agitation–Sedation Scale (RASS) or its modified versions (n = 17).10,21,35,51,53–55,64,70,72,73,77,84,86,87,91,93
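Because the categories overlap, the category counts do not sum directly to 35. The short sketch below reconciles them by inclusion-exclusion. It assumes that the only overlap is the five established measures with palliative care evidence, and that the ad hoc measures overlap with neither of the other categories; the variable names are illustrative.

```python
# Category counts as reported in the text.
with_palliative_evidence = 8   # measures with psychometric evidence in palliative care
established = 15               # established instruments or single items
ad_hoc = 17                    # tools constructed for individual studies

# Established measures that also have palliative care evidence.
overlap = 5

# Inclusion-exclusion, assuming no other overlap between categories.
distinct_measures = with_palliative_evidence + established + ad_hoc - overlap
print(distinct_measures)
```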

Figure 2. Number of identified studies and measures by instrument category.
Three of the ad hoc measures were modified versions of the existing tools: the Glasgow Coma Scale (GCS),32,33 RASS19,35 and Riker Sedation–Agitation Scale.23,39 All other ad hoc measures comprised unique tools. None of the reported ad hoc measures had been formally validated before use.
The established measures most commonly used were the RASS 19 (n = 11)10,21,51,53–55,64,70,77,84,91 and Ramsay Sedation Scale (RSS) 52 (n = 7).4,65,78,84,88–90 Most established measures had been developed and validated for use in settings other than palliative care, mainly the intensive care unit. The studies with palliative care patients in which these measures were used provided no information on their validity or reliability.
Two of the existing measures used for the evaluation of level of consciousness consisted of items extracted from multi-item tools developed to assess constructs other than level of consciousness (i.e. the conscious level item of the Communication Capacity Scale (CCS)5,76,83 and the sedation item of the Pain Flow Sheet81,82). These tools had been evaluated psychometrically in palliative care settings, but validity and reliability have only ever been established for each measure as a whole, not for the individual items measuring levels of consciousness.
Almost all of the described measures consisted of one item with a range of mutually exclusive scoring options (n = 27), usually involving observation of spontaneous activities, such as eye opening, or responses to auditory and/or tactile stimuli performed in a logical progression. The majority of these tools (n = 23) evaluated a single construct: consciousness in terms of arousal, while the remaining measures (n = 4) incorporated the assessment of agitation into single scales for consciousness/sedation.
Evidence of psychometric performance was provided for: the Minnesota Sedation Assessment Tool (MSAT),93,94 RASS,19,93 Vancouver Interaction and Calmness Scale (VICS),93,95 Sedation score proposed in the Guideline for Palliative Sedation of the Royal Dutch Medical Association (KNMG),93,96 Modified RASS,19,73 Richmond Agitation–Sedation Scale–Palliative version (RASS-PAL),19,87 GCS1,33,97,98 and Consciousness Scale for Palliative Care (CSPC). 18
Dutch versions of original English language measures were created by researchers for the MSAT,93,94 RASS,19,93 VICS93,95 and GCS.1,33,97,98 The RASS modified by Benitez-Rosario et al. 73 was translated and further adjusted for use with Spanish palliative care patients. Modifications to the original RASS 19 included the removal of descriptors relating to the mechanical ventilation of patients and a clarification to the scoring instructions addressing the possibility that restless behaviour may be present in patients who are not fully alert. Similarly, Bush et al. 87 reported performing minor changes to the RASS 19 when testing its psychometric performance in the palliative care setting. The CSPC was validated in its source language (Portuguese) and, subsequently, translated by its authors into English. 18
Appraisal of psychometric performance
Evidence regarding structural validity, test–retest and intra-rater reliability was not provided for any of the evaluated measures, so we do not present findings relating to these properties. The CSPC 18 and a modified version of the RASS 73 achieved the highest ratings in our quality appraisal, but our evaluation was based on evidence obtained from just one study for each measure. Table 4 provides a summary of the quality appraisal process for each instrument.
Table 4. Appraisal of psychometric performance of observational level of consciousness measures.
MSAT: Minnesota Sedation Assessment Tool; NE: not evaluated; NR: not reported; MSATa: Minnesota Sedation Assessment Tool arousal subscale; ICC: intraclass correlation coefficient; CI: confidence interval; MSATm: Minnesota Sedation Assessment Tool motor activity subscale; MSATq: Minnesota Sedation Assessment Tool quality of sedation subscale; RASS: Richmond Agitation–Sedation Scale; VICS: Vancouver Interaction and Calmness Scale; VICSi: Vancouver Interaction and Calmness Scale interaction subscale; VICSc: Vancouver Interaction and Calmness Scale calmness subscale; KNMG: Sedation score proposed in the Guideline for Palliative Sedation of the Royal Dutch Medical Association; RASS-PAL: Richmond Agitation–Sedation Scale–Palliative version; GCS: Glasgow Coma Scale; CSPC: Consciousness Scale for Palliative Care.
Content validity
All studies provided a clear description of the construct measured by the reported instruments. However, the involvement of the target population in selecting or modifying scale items was described for only three of the eight evaluated measures: the CSPC, 18 RASS-PAL 87 and Modified RASS. 73 One study 18 reported receiving feedback on the content of the CSPC from seven palliative care doctors and nurses at the construction stage of the scale. Likewise, the input of palliative care professionals guided the modification of scale items for the RASS-PAL 87 and RASS modified by Benitez-Rosario et al. 73
Construct validity
Information on construct validity was available for six of the eight included measures: the MSAT,93,94 VICS,93,95 RASS,19,93 KNMG,93,96 CSPC 18 and Modified RASS. 73 For these, construct validity was evaluated through the correlation of the tested instrument with others that were assumed to measure the same construct (convergent validity). Discriminant validity was not assessed for any tool.
Correlations were reported per subscale for the MSAT and VICS.93–95 The MSAT arousal subscale performed better than the motor activity subscale, with Spearman’s correlation coefficient ranging from 0.48 to 0.83, depending on the measure with which it was correlated (RASS, KNMG and VICS). Low to moderate correlations were reported for the motor activity subscale of the MSAT (ρ = 0.42–0.61). Mostly moderate correlations were found between both subscales of the VICS and other tools measuring level of consciousness (interaction subscale: ρ = 0.31–0.72, calmness subscale: ρ = 0.31–0.57).93–95
Construct validity of the RASS and KNMG was supported by moderate to strong associations when compared with corresponding instruments.19,93,96 Strong correlations with other tools measuring level of consciousness were reported for the Modified RASS and CSPC.18,73 Spearman’s correlation coefficient for the Modified RASS ranged from 0.81 to 0.85 when compared with the GCS 33 and from 0.82 to 0.89 when compared with the RSS, 52 depending on the group of professionals scoring the scales (palliative care physicians or medical residents). 73 Likewise, the CSPC correlated highly with a 100 mm visual analogue scale (VAS) anchored in the terms ‘awake’ and ‘unarousable’ (ρ = 0.94–0.95) and with the GCS (ρ = 0.82–0.85).18,33
Homogeneity (internal consistency)
As the aim of some of the studies was not to address unique measure characteristics, homogeneity was evaluated for only one of the appraised measures, the CSPC. 18 For this instrument, the reported Cronbach’s alpha coefficient was very high (α = 0.99). 18
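For readers unfamiliar with the statistic, the sketch below computes Cronbach's alpha from item-level scores using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the item scores are hypothetical; they are not CSPC data and are not intended to reproduce the reported coefficient.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a multi-item scale.

    item_scores: one list per item, each holding that item's scores
    for the same respondents in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha requires at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("each item needs the same number (>= 2) of scores")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical scores for three items across four patients (illustrative only).
items = [
    [1, 2, 3, 4],
    [2, 2, 3, 5],
    [1, 3, 3, 4],
]

alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Because the formula requires at least two items, alpha cannot be computed for single-item scales.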
Inter-rater reliability
ICC or weighted Cohen’s kappa was used for the assessment of inter-rater reliability in all of the included studies. Of the tested measures, inter-rater reliability was found to be high for the CSPC (ICC = 0.99), 18 GCS (ICC = 0.807),1,33,97,98 RASS-PAL (ICC = 0.84–0.98) 87 and Modified RASS (κ = 0.85–0.95). 73 Moderate correlations within paired observational assessments were reported for the RASS (ICC = 0.71–0.73)19,93 and KNMG (ICC = 0.66–0.71).93,96 Of the MSAT and VICS subscales, the VICS interaction scale performed best with ICC ranging from 0.77 to 0.85, followed by the MSAT arousal scale (ICC = 0.59–0.64).93–95 Depending on the time interval between paired assessments, Cohen’s kappa coefficient ranged from 0.44 to 0.54 for the MSAT overall quality of sedation subscale, suggesting low agreement between scale assessors. No correlations were found for the MSAT motor activity and VICS calmness subscales.93–95
Responsiveness
Change scores indicating clinically meaningful change over time in consciousness/sedation levels were not described for any of the appraised measures. Bush et al. 87 provided some information on the floor and ceiling effects for the RASS-PAL but it is not adequate for the assessment of responsiveness.
Origin of items
Items for half of the measures for which evidence of psychometric performance was available originated from scales developed for non-palliative care patients. Specifically, aspects of the measurement properties of the Dutch versions of the MSAT,93,94 VICS,93,95 RASS19,93 and GCS1,33,97,98 were appraised by study authors adopting the original items of these scales without assessing their appropriateness for the palliative care setting.
For the other half of the scales, items were either modified (RASS-PAL 87 and Modified RASS 73 ) or specifically developed (KNMG 96 and CSPC 18 ) for monitoring palliative care patients’ level of consciousness.
Feasibility
In a comparison for user-friendliness between the Dutch versions of the RASS, 19 MSAT 94 and VICS, 95 Arevalo et al. 93 reported that most palliative care professionals found RASS the least time-consuming, clearest and easiest to use. Acceptable ratings were achieved for the MSAT, while the VICS was evaluated as the least clear and easy to use among the three tools. The RASS-PAL, 87 CSPC 18 and Modified RASS 73 were also regarded as feasible and useful tools by healthcare professionals.
Discussion
Main findings
This systematic review aimed to identify, describe and appraise the psychometric performance of observational level of consciousness measures used in palliative care. We found 35 different level of consciousness tools used in 65 studies. Evidence of psychometric performance, however, was available for only eight of these instruments. Two of these eight tools were specifically developed for palliative care populations (CSPC 18 and KNMG 96 ), two were versions of an existing tool (i.e. the RASS 19 ) modified for use in palliative care (Modified RASS 73 and RASS-PAL 87 ) and four were measures developed for different populations, tested for aspects of validity and/or reliability in the palliative setting (GCS,1,33,97,98 MSAT,93,94 RASS19,93 and VICS93,95). None of these tools had been evaluated across all relevant psychometric properties; hence no measures appraised had been fully validated.
The majority of measures identified were either ad hoc tools for which no formal validation had been undertaken (n = 17) or tools developed and validated mainly in non-palliative care settings (n = 15). This widespread use of non-validated measures raises questions regarding the methodological robustness of studies and the quality of reported evidence, 99 not least because, although tools’ psychometric performance may have been investigated in specific contexts, this does not transfer to other settings. 100 It is therefore essential, as with any measures to be used in palliative care, that tools assessing level of consciousness should be thoroughly validated with palliative care patients in order to be certain that they are reliable for this population.
Most measures identified sought to measure consciousness in terms of wakefulness and, therefore, mostly (n = 23) comprised one item with a range of levels describing patients’ responses to verbal and/or physical stimulation. Apart from consciousness, a small number of tools (n = 4) included the assessment of agitation, as a domain related to sedative and analgesic use, in a single scale. These tools have been criticised for various reasons, including the lack of clarity in the definition of different consciousness levels, and the poor standardisation of employed stimuli.16,18 Moreover, the assessment of patients presenting decreased consciousness and restlessness at the same time may be compromised when both conditions are evaluated on the same scale.14,16 Nevertheless, the most commonly employed measure was the RASS 19 (a tool assessing sedation and agitation on a single-item scale) or modified versions of it (n = 17). An explanation for this may be that the RASS requires minimal training and can be quickly and easily administered at the bedside. 19 These are particularly desirable features for a scale intending to measure level of consciousness, an often unstable characteristic, in clinical environments where patients are cared for by professionals of different backgrounds, as in palliative care. 18
Limited information was available on the measurement properties of tools, thus making it difficult to draw definitive conclusions about their psychometric performance. Our evaluation was based on evidence obtained from a single study, rather than a group of studies, for each measure. Some studies did not aim to specifically develop and/or validate level of consciousness measures.1,93,97,98 As a result, these studies assessed only certain psychometric properties on each occasion, and no tools were tested across all measurement properties. Our quality assessment outcomes should be treated with caution, therefore, until further evidence on the psychometric performance of the appraised measures becomes available.
Information on inter-rater reliability and internal consistency was provided by all studies, with most tools performing adequately on both properties. Due to the lack of a ‘gold standard’ level of consciousness measure in palliative care, criterion validity could not be assessed. Instead, in three studies the tested tools were compared with other instruments known to measure level of consciousness.18,73,93 However, although the reported correlations between the assessed measures and other comparable tools were acceptable to high, the reference measures were not themselves tested for their psychometric performance in a palliative care context.
No publications provided any information regarding test–retest or intra-rater reliability, although all studies described collecting data at more than one time point. This might be explained by the lack of stability of the construct measured, that is, palliative care patients’ fluctuating level of consciousness. Thus, the assessment of these psychometric properties may not be feasible for level of consciousness measures in this population.
The measures with the highest ratings in our appraisal were the CSPC, 18 a tool specifically developed to measure level of consciousness in palliative care, and a version of the RASS modified for use with palliative care patients. 73 However, the psychometric evidence available for both tools was limited to their initial validation studies and was insufficient for assessing all appraised measurement properties. Palliative care clinicians and researchers should therefore be mindful of these limitations when using level of consciousness measures.
Our findings agree with those of previously published reviews. In their review of level of sedation instruments, De Jonghe et al. 14 reported that responsiveness had not been tested for any of the scales identified. They commented that responsiveness is an important measurement property because it can inform the titration, initiation and withdrawal of sedative drugs. 14 Apart from these benefits, a measure that can reliably detect changes in patients’ level of consciousness over time may enable the longitudinal evaluation of patients and provide a useful outcome measure for palliative care research. Nevertheless, like De Jonghe et al., 14 we did not find adequate evidence to appraise responsiveness in our review. When seeking to determine clinically important changes in patients’ status or evaluate the effects of medical interventions it may be problematic to use measures that do not demonstrate satisfactory responsiveness, since changes in scores may result from measurement error rather than true changes in patients’ consciousness levels. Thus, it is important that clinicians and researchers are aware of the limited evidence regarding responsiveness when choosing measures to evaluate treatment/intervention outcomes or interpreting level of consciousness scale scores. In order to enable clinical assessment and decision-making, and support the testing of new interventions, future studies that seek to develop new level of consciousness tools or validate existing ones should aim to provide strong evidence on the responsiveness of these measures.
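The distinction between measurement error and true change can be illustrated with the smallest detectable change (SDC), a commonly used statistic derived from the standard error of measurement (SEM). The following is a minimal sketch only: it was not part of our appraisal, the function names are our own, and the standard deviation and reliability coefficient are hypothetical values rather than results from any included study.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability), with reliability (e.g. an ICC) in [0, 1]."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def smallest_detectable_change(sem: float, z: float = 1.96) -> float:
    """SDC = z * sqrt(2) * SEM.

    The sqrt(2) term reflects that a change score combines the
    measurement error of two separate assessments.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs, chosen for illustration only.
sd_scores = 1.5   # assumed standard deviation of scale scores
icc = 0.84        # assumed reliability coefficient

sem = standard_error_of_measurement(sd_scores, icc)
sdc = smallest_detectable_change(sem)
print(f"SEM = {sem:.2f}, SDC = {sdc:.2f}")  # SEM = 0.60, SDC = 1.66
```

Under these assumed values, a change of one scale point between two assessments would fall within measurement error, whereas a change of two points would exceed it. Without published reliability and responsiveness data for a given tool, such thresholds cannot be established.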
Brinkkemper et al. 101 identified seven scales measuring level of awareness reported in primary studies. Of these, similar to our findings, a significant proportion were ad hoc measures, while the RASS 19 was the most commonly used of the established scales. Brinkkemper et al. 101 found only one tool, the CCS, 76 for which information on psychometric performance was available. Although the authors presented this information, they did not formally evaluate the psychometric quality of the CCS 76 because this was outside the scope of their review. Our search identified the CCS,5,76,83 but we excluded it from our quality appraisal. The scale used to assess consciousness level is a single item extracted from a compound measure of terminally ill patients’ ability to communicate, and that measure was developed and tested as a whole. Hence, the psychometric evidence provided pertains to the CCS 76 as a whole, not to its individual items.
Brinkkemper et al. identified a substantially smaller number of tools than we did, because their review focused specifically on the effects of palliative sedation. Our inclusion criteria were broader, allowing the inclusion of studies reporting the use of observational measures regardless of the purpose for which these were employed. Moreover, an increasing number of studies using level of consciousness tools have been published since their review appeared in 2013. Of the 65 studies included in our review, 26 (40%) have been published since 2013. A possible explanation for this upward trend may be the recent publication of high-impact guidelines recommending the use of observational scales for monitoring the level of consciousness of palliative care patients receiving sedative medication.2,102
Strengths and limitations
A strength of this systematic review is its comprehensive and broad search strategy, which covered six databases without date restrictions. We also performed a thorough backward and forward citation search for all included articles and contacted abstract authors in order to ensure that all relevant publications were identified. A limitation is that we included only English language publications. It is possible that studies providing evidence on measurement properties of translated versions of tools were missed. We are aware of at least one validation study that was excluded from this review due to language restrictions. 103
Two reviewers (A.M.K. and E.M.) independently performed the appraisal of the psychometric performance of the identified measures against well-defined quality criteria. Nevertheless, comparability of evidence was hindered by the heterogeneity of studies reporting data on psychometric properties in terms of setting, sample size, participant population, study design and objectives, and of the purposes for which tools were employed on each occasion. Our evaluation, therefore, was based on the limited published evidence from individual studies for each appraised measure.
Conclusion
This systematic review demonstrates that although an increasing number of studies are using observational level of consciousness measures, only a few of these tools have been tested for their psychometric performance in the palliative care setting, and none across all relevant measurement properties. The CSPC and a modified version of the RASS achieved the highest ratings in our appraisal, but further evidence on their measurement properties is needed before either can be recommended as a valid and reliable measure for use in palliative care practice and research. Future research in this area should use, and seek to further validate and refine, existing level of consciousness measures, rather than developing new tools or using ad hoc instruments.
Footnotes
Acknowledgements
We thank Bridget Candy and Nurije Kupeli for their significant contribution to the development of search terms for the electronic databases and their overall support in designing this review. We would also like to thank all current and former members of the study advisory and working groups: Alice Colum, Anna Gola, Tariq Husain, Yana Kitova, Philip Lodge, Rebecca Lodwick, Jon Martin, Vinnie Nambisan, Denise O’Malley, Liz Sampson, Liz Thomas, Adrian Tookman and Tim Wehner. Particular thanks to Hilary Bird and Kathy Seddon (Marie Curie Expert Voices PPI representatives on the Advisory Group). Special thanks to Jimmy Arevalo, Tijn Brinkkemper, Wojciech Leppert, Staffan Lundström, Ryo Matsunuma, Jesús Mateos-Nozal, José Pereira and Jenny van der Steen for responding to requests for full-text publications and/or additional information regarding their studies.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by Marie Curie (grant numbers: MCCC-FPO-16-U and MCCC-FBFO-16-U).
