Abstract
Background/rationale:
This study tested the psychometric properties of the Quality of Life in Late-Stage Dementia (QUALID) Scale using Rasch analysis. The QUALID includes 11 items with a 5-point response scale. Scores range from 11 to 55, and lower scores indicate higher quality of life (QoL).
Methods:
Baseline data from a randomized clinical trial including 137 residents from 14 nursing homes were used. Psychometric testing included item mapping, evaluation of response categories, item reliability, construct validity based on INFIT and OUTFIT statistics, and convergent validity based on correlations between QoL and pain, agitation, depression, and function.
Results:
The Cronbach α was .89. All the items except “appears physically uncomfortable” fit the model. There was a significant relationship between QoL and depressive symptoms (r = .71, P = .001), pain (r = .26, P = .01), physical function (r = −.19, P = .03), and agitation (r = .56, P = .001). The categories were appropriately used. Item mapping suggested a need for easier items.
Quality of life (QoL) has been a difficult concept to conceptualize or measure. Among individuals with cognitive impairment, this is even more challenging. Most researchers and clinicians agree that QoL is a multidimensional concept that includes life satisfaction, physical, emotional, and mental health, and social and behavioral components of well-being. 1,2 The World Health Organization Quality of Life Group defined QoL as “an individual’s perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards, and concerns.” 3 A person’s state of health, as well as the factors that influence overall well-being, is believed to influence QoL.
There are known challenges to measuring QoL among those with cognitive impairment, including their ability to provide subjective input, the evaluators’ ability to know what is relevant to QoL for the individual with moderate to severe cognitive impairment, and potential biases that occur when proxy reporting is done. 1,4–11 Several measures have been used to evaluate QoL objectively and subjectively among older adults with cognitive impairment 12 including the Dementia Quality of Life Questionnaire (DQoL), 13 the Quality of Life in Dementia Scale (QoL-AD), 14 dementia care mapping, 15 and the Quality of Life in Late-Stage Dementia (QUALID) Scale.
The DQoL scale includes 29 items that reflect 5 domains: self-esteem, positive affect and humor, negative affect, feelings of belonging, and sense of aesthetics. Responses range from 1 = never to 5 = very often. In addition, there is a final item on the DQoL scale that is not scored. This item asks participants to describe their overall QoL as excellent, very good, good, fair, or bad. One advantage of this measure is that older adults with moderate to severe cognitive impairment have been able to respond to the items as they are focused on how the individual is feeling at the time of testing. Specifically, participants are asked basic things such as whether or not they like music or watching the birds. Initial psychometric testing provided sufficient evidence of reliability with an internal consistency of each subscale ranging from α of .67 to .89 and 2-week test–retest reliability ranging from .64 to .90. Evidence of validity was based on significant correlations between scores on the DQoL scale and measures of well-being and depression. 13 Ongoing use of this measure, however, raised concerns about the value of the subscales, “feelings of belonging” and “sense of aesthetics” as these were not statistically significantly related to overall QoL. 5,16
The QoL-AD is another commonly used measure to evaluate QoL among older adults with dementia. The QoL-AD 14 was originally developed for community-dwelling older adults with Alzheimer’s disease and includes questions about the individuals’ physical condition, mood, interpersonal relationships, ability to participate in meaningful activities, and financial situation. The QoL-AD has 13 items with higher scores indicating better QoL. The measure can be completed by the older individual and a proxy provider. Prior use of this measure has provided evidence of internal consistency (α coefficient = .84-.86) and test–retest reliability (r was .76 for patients and .92 for caregivers). 14,17 Validity was based on low to moderate correlations between scores on the QoL-AD and cognition, instrumental activities of daily living, depression, and engagement in pleasant events. 18 A major concern with use of this measure, however, was the low intraclass correlations between self-reports and caregiver reports across all cognitive levels (r = .14-.39), 18 raising concerns about the accuracy of either proxy or self-reports. The lack of consistency between self-report and proxy report on the QoL-AD may be due to the type of items included in the measure. Specifically, there were several items related to finances, marriage, family, and friends that may not be relevant for many older adults, particularly those in long-term care settings.
As an alternative to use of self-report questionnaires to evaluate QoL among older adults with dementia, Kitwood and Bredin 15 developed dementia care mapping. This is a comprehensive, objective assessment system that includes observations of behavior and feedback to staff aimed at improving care for older adults with dementia based on person-centered care approaches. Evaluators are trained to complete behavioral assessments of the individual from the older individual’s perspective. The ill-being and well-being states of the individual are calculated as well as a combined well-being and ill-being value. The older individual is observed for 6 hours, and their behavior is categorized every 5 minutes into 1 of 23 behavior categories. There is prior evidence of the reliability and validity of this approach to measurement of QoL, 19,20 and it is useful for the identification of change in QoL following interventions. 21 From a practical and feasibility perspective, this type of evaluation is difficult to perform and potentially could be invasive in that the older adult may become aware of being observed over a long period of time.
The QUALID 22 overcomes some of the challenges in these previously described measures of QoL in that it was developed specifically for individuals with late-stage dementia and those who cannot communicate coherently and may not be involved in activities widely accepted by others as affording QoL such as reading or watching a movie. The QUALID scale was developed based on the input from experts in care of older adults with late-stage dementia. The scale includes 11 observable behaviors thought to be indicative of QoL such as whether or not the individual smiles, appears sad, cries, has facial expressions of discomfort, appears emotionally calm and comfortable, or is irritable or aggressive. These observations are rated on a 5-point Likert scale which is different for each item but reflects the amount of time spent in the behavior. For example, for the item “enjoys eating,” the responses include the following: (1) at most meals and snacks, (2) twice a day, (3) at least once a day, (4) less than once each day, or (5) rarely or never. Conversely for the item “appears sad,” responses include the following: (1) rarely or never, (2) only in response to external stimuli less than once each day, (3) only in response to external stimuli at least once each day, (4) for no apparent reason less than once each day, or (5) for no apparent reason once or more each day. The measure takes 5 minutes to complete but should be done by someone who has had 30 or more hours of exposure to the resident over the previous week.
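The scoring described above can be sketched in a few lines. This is a minimal illustration, assuming the total is the simple sum of the 11 item ratings (consistent with the stated range of 11 to 55); the function name `qualid_total` is hypothetical and not part of the published instrument.

```python
def qualid_total(ratings):
    """Sum 11 item ratings (each 1-5) into a QUALID total (11-55).

    Assumption: the total is the unweighted sum of the items, with
    lower totals indicating higher quality of life.
    """
    if len(ratings) != 11:
        raise ValueError("the QUALID has 11 items")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    return sum(ratings)


# Boundary cases: every item rated 1 (best) or 5 (worst).
best_possible = qualid_total([1] * 11)
worst_possible = qualid_total([5] * 11)
print(best_possible, worst_possible)
```

Because every item is worded so that 1 is the most favorable response, no reverse coding is needed in this sketch.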
Initial psychometric testing was done on a sample of 42 residents from a single facility. Testing provided evidence of internal consistency with an α coefficient of .77, test–retest reliability with an intraclass correlation of .81, and interrater reliability with an intraclass correlation of .83. 22 There was evidence of validity based on significant correlations between scores on the QUALID and the Geriatric Depression Scale (r = .36, P = .04) and the Neuropsychiatric Inventory (r = .40, P = .01). 22 The purpose of this study was to expand the psychometric testing of the QUALID using a Rasch measurement model and a larger sample of nursing home residents from 14 different nursing homes across multiple states. Use of the Rasch measurement model has the advantage of providing a more realistic probabilistic framework, evaluating the items in detail and addressing whether or not the items cover the full continuum of QoL among those with dementia, and establishing if the responses are used appropriately and are consistent.
Methods
Design and Setting
This study used baseline data from the first cohort of a randomized clinical trial testing the implementation of the Evidence Integration Triangle for Behavioral and Psychological Symptoms of Dementia (EIT-4-BPSD). Fourteen nursing homes from 2 states on the East Coast were invited to participate in the study if they (1) agreed to actively partner with the research team on an initiative to change practice, (2) had at least 100 beds, (3) identified a staff member to be an internal champion and work with the research team in the implementation process, and (4) were able to access e-mail and websites via a phone, tablet, or computer.
Residents were eligible to participate if they (1) were living in a participating nursing home, (2) were 55 years of age or older, (3) had cognitive impairment as determined by a score of 0 to 12 on the Brief Interview for Mental Status (BIMS), 23,24 (4) were not enrolled in hospice, and (5) were not in the nursing facility for short-stay rehabilitation care. A list of all eligible residents was obtained from a designated staff member and residents were approached with the goal of recruiting 12 to 13 residents per setting. Potentially eligible residents were invited to complete the Evaluation to Sign Consent (ESC). 25 Evidence of ability to sign consent was based on correct responses to all 5 items on the ESC. If the resident was not able to independently sign consent, then assent was obtained from the resident, and the legally authorized representative (LAR) was approached to complete the consent process. A total of 305 residents were approached and 137 (45%) consented. A total of 141 residents (46%) were unwilling or unable to provide assent to participate, 5 (2%) residents refused to consent, 3 (1%) LARs refused to consent, 16 (5%) LARs were not reachable, and 3 (1%) individuals were not eligible due to being in hospice or too young.
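The recruitment figures above can be checked with simple arithmetic. The short sketch below uses only the counts reported in the text; the dictionary keys are descriptive labels chosen for illustration.

```python
approached = 305

# Counts as reported in the recruitment description.
outcomes = {
    "consented": 137,
    "unwilling or unable to assent": 141,
    "resident refused consent": 5,
    "LAR refused consent": 3,
    "LAR not reachable": 16,
    "not eligible (hospice or too young)": 3,
}

# The six outcome groups should account for everyone approached.
accounted_for = sum(outcomes.values())

# Percentages rounded to whole numbers, as in the text.
percentages = {label: round(100 * n / approached) for label, n in outcomes.items()}

print(accounted_for)
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

The six groups sum to the 305 residents approached, and the rounded percentages match those reported.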
Procedure
All data collection was done by research evaluators with prior experience working with nursing home residents with moderate to severe cognitive impairment and their caregivers. All the measures were completed based on direct observation of the resident or input from the nursing assistant who was providing care to the resident on the day of testing.
Measures
Descriptive information for residents included age, race, gender, cognitive status, comorbidities, and marital status. In addition to completing the QUALID, assessments of depression, physical function (ie, activities of daily living), pain, and agitation were completed on each participant. All assessments were completed by trained research assistants with prior experience working with this population.
The QUALID, 22 described above, was administered in an interview format to a formal caregiver who was familiar with the resident. The individual completing the scale was asked to respond to the questions based on observations of the resident over the past 7 days.
Comorbidities were calculated using the Cumulative Illness Rating Scale (CIRS). 26 The CIRS was designed for use with frail nursing facility residents. It contains ratings of both illness severity and comorbidity. Residents are evaluated based on 13 organs or systems and a psychiatric/behavioral rating. In prior research, CIRS ratings correlated with medication use and predicted mortality, hospitalization, and disability. 26,27
Depressive symptoms were measured using the Cornell Scale for Depression in Dementia (CSDD). 28 The CSDD is a 19-item survey that assesses depressive symptoms in individuals living with dementia. Prior research has provided evidence of reliability and validity of the CSDD based on agreement in testing between 2 psychiatrists (κ = 0.6) and internal consistency with a Cronbach α of .84. There was evidence of validity based on a correlation between the CSDD and the rank order of the Research Diagnostic Criteria measure of depression (r = .83). 28,29
Agitated behaviors were measured using the 14-item Cohen-Mansfield Agitation Inventory (CMAI). 30,31 This is a survey of disturbing behaviors commonly found in persons living with dementia. The 14-item version of the CMAI uses a 5-point Likert scale to rate the frequency of behavioral symptoms. 31,32 Prior use supported evidence of reliability based on internal consistency with a Cronbach α of .86 to .91, interrater reliability with agreement between 2 raters with a .82 correlation, and validity based on correlations between observations made by primary caregivers and research evaluators’ observations of behaviors of the older individual. 31,32
The Brief Interview for Mental Status (BIMS) 23,33 was used to evaluate cognition. The BIMS includes 3-item recall and orientation questions with scores ranging from 0 to 15. Prior use of the BIMS established validity based on significant correlations with criterion measures for cognition and evidence of reliability with a κ score of 0.95. 23,33
The Barthel Index (BI) 34 was used to evaluate function. This measure includes 10 items that address performance of activities of daily living. Items are weighted to account for the amount of assistance required, and a total score of 100 indicates complete independence. Estimates of internal consistency ranged from α coefficients of .62 to .80, interrater reliability was supported based on an intraclass correlation of .89 between 2 observers, and validity was based on correlations with the Functional Independence Measure (r = .97, P < .05). 34,35
Pain was evaluated using the Pain Assessment in Advanced Dementia (PAINAD) Scale. 36 The PAINAD includes 5 behaviors that are commonly noted among individuals with pain. Observations were done, as recommended, during periods of activity such as transferring or ambulating. Scoring ranges from 0 to 2 for each specific pain behavior. A total score of 1 to 3 is indicative of mild pain, 4 to 6 is moderate pain, and 7 to 10 is severe pain. There is some evidence of internal consistency with Cronbach α ranging from .50 to .67 and interrater reliability with correlations ranging from .82 to .97. There was evidence of validity based on significant correlations between viewers’ ratings of facial expressions and vocalizations of older adults as a measure of the presence of pain. 36
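The PAINAD scoring bands described above can be expressed as a small helper. This is a sketch under stated assumptions: the function name `painad_severity` is hypothetical, and the label used for a total of 0 is an assumption, since the text gives bands only for totals of 1 to 10.

```python
def painad_severity(behavior_scores):
    """Return (total, band) for five PAINAD behavior scores of 0-2 each."""
    if len(behavior_scores) != 5:
        raise ValueError("the PAINAD rates 5 behaviors")
    if any(s not in (0, 1, 2) for s in behavior_scores):
        raise ValueError("each behavior is scored 0, 1, or 2")
    total = sum(behavior_scores)
    if total == 0:
        # Assumption: the source does not name a band for a total of 0.
        band = "no pain behaviors observed"
    elif total <= 3:
        band = "mild"
    elif total <= 6:
        band = "moderate"
    else:
        band = "severe"
    return total, band


print(painad_severity([1, 0, 1, 0, 0]))
```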
Data Analysis
Descriptive statistics were done using SPSS version 24.0 to describe the sample and consider relationships between QoL and age, gender, race, cognition, and comorbidities. To evaluate the reliability and validity of the QUALID, a Rasch analysis was done using the Winsteps statistical program and bivariate Spearman correlations were done to determine whether there were associations between QoL and depressive symptoms, agitation, function, and pain. A P < .05 level of significance was used for all analyses.
Reliability Testing
Internal consistency of the QUALID was tested within the Rasch measurement model using item reliability and the item separation index. 37 The item separation index describes how well items can be discriminated from one another on the basis of their difficulty, and the associated reliability is analogous in interpretation to Cronbach α. The closer the reliability is to 1.0, the less the variability of the measurement can be attributed to measurement error. An equivalent to a Cronbach α of .70 was considered acceptable evidence of item reliability. 37
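In Rasch measurement, a separation index G and its reliability R are two expressions of the same information, linked by the standard relation R = G² / (1 + G²). The sketch below converts between them; the function names are hypothetical. The study does not report separation values, so the figures produced here are illustrative conversions only.

```python
import math


def reliability_from_separation(g):
    """Convert a separation index G to a reliability, R = G^2 / (1 + G^2)."""
    if g < 0:
        raise ValueError("separation must be non-negative")
    return g * g / (1.0 + g * g)


def separation_from_reliability(r):
    """Convert a reliability R (0 <= R < 1) to a separation index."""
    if not 0 <= r < 1:
        raise ValueError("reliability must be in [0, 1)")
    return math.sqrt(r / (1.0 - r))


# Separation implied by the .70 threshold and by a reliability of .89.
for r in (0.70, 0.89):
    print(r, round(separation_from_reliability(r), 2))
```

A reliability of .50 corresponds to a separation of exactly 1, meaning that the spread of the measures equals their average measurement error.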
Validity Testing
Validity testing was based on construct validity of the QUALID and evidence that each item fit the concept of QoL. The Winsteps statistical program was used to establish item fit based on INFIT and OUTFIT statistics. INFIT and OUTFIT statistics are based on conventional chi-square statistics. The INFIT statistic is more sensitive to unexpected patterns of observations by individuals on items that are generally targeted to their ability. OUTFIT statistics are more sensitive to unexpected observations by individuals on items that are relatively very easy or very hard for them. INFIT and OUTFIT statistics are considered acceptable if they are between 0.4 and 1.6. 37 An INFIT or OUTFIT value of less than 0.4 indicates that the item may not provide additional information beyond the rest of the items on the scale. An INFIT or OUTFIT value greater than 1.6 indicates that the item may not define the same construct as the rest of the items in the instrument, is poorly written and thus may have been misunderstood by participants, or is ambiguous. 38
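To make the two fit statistics concrete, the sketch below computes INFIT and OUTFIT mean squares for a single item. It is an illustration rather than the Winsteps procedure: it assumes a dichotomous Rasch model for simplicity (the QUALID items have 5 categories), treats the person measures and item difficulty as known, and uses hypothetical data and function names. OUTFIT is the unweighted mean of the squared standardized residuals, whereas INFIT weights each squared residual by its model variance.

```python
import math


def rasch_probability(theta, b):
    """Probability of endorsing a dichotomous item under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit, outfit) mean squares for one dichotomous item."""
    squared_residuals, variances, squared_z = [], [], []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        variance = p * (1.0 - p)
        residual_sq = (x - p) ** 2
        squared_residuals.append(residual_sq)
        variances.append(variance)
        squared_z.append(residual_sq / variance)
    infit = sum(squared_residuals) / sum(variances)  # information weighted
    outfit = sum(squared_z) / len(squared_z)         # outlier sensitive
    return infit, outfit


def is_acceptable(mean_square, low=0.4, high=1.6):
    """Apply the 0.4 to 1.6 range used in this study (bounds inclusive here)."""
    return low <= mean_square <= high


# Hypothetical person measures (logits) and responses to an item with b = 0.
thetas = [-1.5, -0.5, 0.0, 0.5, 1.5]
responses = [0, 0, 1, 1, 1]
infit, outfit = item_fit(responses, thetas, b=0.0)
print(round(infit, 2), round(outfit, 2), is_acceptable(infit), is_acceptable(outfit))
```

Applied to the values reported for "appears physically uncomfortable" (INFIT 1.79, OUTFIT 2.57), `is_acceptable` returns False for both.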
Validity testing for the QUALID was also evaluated based on convergent validity. It was hypothesized that scores on the QUALID would be significantly associated with depression, agitation, function, and pain such that there would be lower QoL for those who had more symptoms associated with depression, more agitation, lower function, and more pain. Bivariate correlations were used to test these associations.
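The convergent validity test relies on Spearman correlations, which are Pearson correlations computed on ranks. The sketch below implements this with the standard library only; the scores are hypothetical and the function names are illustrative, not the SPSS procedure used in the study. Because higher QUALID scores indicate lower QoL, a positive correlation with a symptom measure means that more symptoms accompany lower QoL.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for six residents (not study data).
qualid_scores = [11, 14, 19, 22, 30, 41]
depression_scores = [1, 3, 2, 6, 9, 12]
print(round(spearman(qualid_scores, depression_scores), 2))
```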
Based on Rasch analysis, additional psychometric properties of the measure were evaluated including item mapping and evaluation of the response categories. Item mapping was done to order the items from the easiest to the most difficult to endorse and to determine whether or not the items covered the full continuum of QoL among the participants. Evaluation of the categories was done to determine whether the categories were used appropriately by the participants based on the probability that the categories as used in this sample would be used similarly in other samples of older adults with dementia.
Results
Sample descriptives are shown in Table 1. The majority of the sample was female (69%), Caucasian (69%), non-Hispanic (98%), and not married (78%).
Table 1. Descriptive Statistics for Sample.
The mean age of the participants was 82 (SD = 11) and the mean score on the BIMS was 4.14 (SD = 3.50) indicating severe cognitive impairment. The mean CSDD score was 5.45 (SD = 4.33) reflecting few depressive symptoms, and the mean score on the CMAI was 22.44 (SD = 7.84) indicating low levels of agitation. There was little evidence of pain with a mean PAINAD score of 0.77 (SD = 1.59). There was evidence of functional impairment with a mean BI score of 43.72 (SD = 29.83). The QUALID mean was 20.34 (SD = 7.56). There was not a statistically significant relationship between QoL and age (r = .11, P = .19), race (r = .11, P = .19), cognition (r = −.02, P = .85), or comorbidities (r = −.01, P = .96). There was a significant relationship between gender and QoL such that women had lower QoL than men (r = .17, P = .04).
Reliability and Validity Testing
There was evidence of internal consistency of the items with an equivalent Cronbach α of .89. The items all fit the concept of QoL based on INFIT and OUTFIT statistics as shown in Table 2 with the exception of the item “appears physically uncomfortable,” which had an INFIT statistic of 1.79 and OUTFIT statistic of 2.57. As hypothesized, there was a statistically significant relationship between QoL and depressive symptoms (r = .71, P = .001), pain (r = .26, P = .01), function (r = −.19, P = .03), and agitation (r = .56, P = .001). Those who had more depressive symptoms, more pain, more agitation, and lower function had lower QoL.
Table 2. Item Means and INFIT and OUTFIT Statistics.
Item mapping showed that the easiest items to endorse were that the resident enjoys touching or being touched and the resident enjoys interacting or being with others. The 2 next equally more difficult items to endorse were that the resident appears sad or that he or she smiles. Following those items, the next 2 items equally difficult to endorse included the resident enjoys eating, and the resident has a facial expression of discomfort. The next 3 most difficult items to endorse included the resident cries, the resident appears emotionally calm and comfortable, and the resident makes statements or sounds that suggest discontent, unhappiness, or discomfort. The next most difficult item to endorse was that the resident appears physically uncomfortable and finally the most difficult item to endorse related to QoL was that the resident was irritable or aggressive. There were 36 individuals who scored so low in QoL that they could not be differentiated by the 11 items in the scale.
Overall, there was evidence of monotonicity of the category responses as the successive values moved in the same direction. This indicates that the categories were ordered appropriately. The response categories, as shown in Figure 1, were appropriate for response options 1 and 5. Responses for 2, 3, and 4 never peaked, which suggests that they were not used as often.

Figure 1. Category responses for the QUALID. QUALID indicates Quality of Life in Late-Stage Dementia.
Discussion
The findings from this study provide some continued support for the reliability and validity of the QUALID scale when used with older adults. Similar to the initial psychometric testing of the scale, 22 our study included individuals with moderate to severe cognitive impairment, of similar age, racial mix, and functional ability. There was a good distribution of scores on the QUALID among our sample with an overall mean of 20.34 (SD = 7.56) indicating a generally good QoL (possible range 11-55). There was evidence of internal consistency and validity based on item fit and significant correlations with depression, pain, function, and agitation as hypothesized. The fit of the items to the concept of QoL and item mapping suggested that the items covered the concept comprehensively. The item “appears physically uncomfortable” had an INFIT statistic of 1.79 and an OUTFIT statistic of 2.57, both above the 1.6 cutoff, and may reflect pain rather than QoL. Pain is an important aspect of QoL, and therefore, we recommend that the item remain in the measure while additional testing is done. It might be helpful to clarify the focus of the item by revising it to state, “appears physically uncomfortable due to pain.”
Item mapping of the QUALID indicated that there were individuals so low in QoL that they could not be differentiated. Easier items are needed to help differentiate these individuals. It has been noted repeatedly that depression and impaired function requiring the individual to be dependent on others for activities of daily living are associated with lower QoL among older adults with dementia. 39–41 Specifically, Kolanowski and colleagues 11 found that greater physical dependency in nursing home residents with moderate to severe cognitive impairment was associated with lower positive emotion and greater fluctuation in positive emotion regardless of mental status. Therefore, items that reflect depression and impaired function may be useful to add to the QUALID. Examples of items that could be added to address impaired function include needing help with personal care activities (eg, bathing, dressing, or eating). Items to consider that are associated with depression include changes in sleep patterns, evidence of feelings of worthlessness or hopelessness, or inappropriate or excessive guilt, irritability, or sadness.
The QUALID has the advantage of being short and feasible for a proxy to complete. There may, however, be other factors that are very relevant to QoL for residents with dementia that are not included in this measure. For example, personal factors such as maintaining one’s dignity, privacy, feeling safe, having a purpose in life, engaging in activities that are meaningful to the individual, and spiritual well-being, 42 as well as visits from individuals outside of the long-term care setting (eg, family and friends) may influence QoL. Additionally, setting characteristics such as activity programming, ability to go outside, interactions with staff, and staff satisfaction among other factors may also influence QoL. 39,40,43,44 Consideration of including some items to reflect these areas may be useful.
The response items seemed to be appropriate in terms of individuals high in QoL selecting the appropriate response item. In this sample, the majority of the responses were in categories indicative of high QoL. Given the limited use of response options 2 to 4, it may be useful to decrease the number of responses to 2 (1 and 5) so that the item could be endorsed or not endorsed (ie, appears happy, cries, enjoys eating). Future testing should consider this more simplified response option.
Although the QoL scores based on the QUALID were fairly well distributed, overall the participants in this study had good QoL with the mean being 20.34, the median 19.00, and the mode 11.00. This high level of QoL among long-term care residents with moderate to severe dementia has been noted in prior studies. 8,39,40 The lack of a consistent relationship between QoL and descriptive factors including cognition, age, race, and comorbidities has also been noted. 1,8,39,40 Gender, however, was associated with QoL, albeit with a low correlation. It is possible that there may be other factors moderating the association between gender and QoL such as social support, physical function, and opportunity for meaningful activities. 7,45
Study Limitation and Conclusion
This study was limited by the inclusion of a small, select sample of individuals who consented to participate in a study addressing behavioral and psychological symptoms associated with dementia. Rasch model analysis is noted to result in reliable findings even with sample sizes as small as 25 to 30 participants. Samples of 100 or greater are anticipated to generate sound statistics at a 95% to 99% confidence interval. 46,47 Despite having a sample size of 137, we did not do an a priori power analysis and our findings are limited as they cannot be generalized to all nursing home residents (eg, those who did not consent or were not eligible). Test–retest reliability and interrater reliability were not done and are important to establish the reliability of the measure. Testing of the measure should be considered when completed by individuals from different races and ethnicities as well. This can be done using a differential item functioning (DIF) analysis within the Winsteps statistical program and determining whether the items were responded to similarly between those of different races and ethnicities. Consideration should also be given to the race and ethnicity of the individual observing the resident. Likewise, the measure was not completed over time and there was no consideration given to the ability of the QUALID to identify change in QoL over time or when acute events occur that would be likely to influence QoL such as an illness and/or hospitalization. Despite these limitations, the psychometric testing of the QUALID among a group of moderately to severely impaired residents from 14 nursing homes across 2 states provides some additional support for the reliability and validity of this measure. Recommendations for future use of the measure include the addition of items that will help differentiate those who have low QoL and a reduction of the response options to just 2.
Ongoing use of observation tools such as the QUALID is needed to facilitate assessment of QoL among individuals with moderate to severe cognitive impairment and thereby allow for the testing of interventions to improve QoL.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
