Abstract
The Mattis Dementia Rating Scale (MDRS) is a multidimensional cognitive measure popular with clinicians for its brevity, diagnostic validity, and utility in monitoring impairment severity. In spite of the test’s significant value, one task can cause discomfort because the patient is asked to name items the examiner is wearing. This task also creates possible cultural bias and standardization issues. We studied 102 MDRS profiles that included this item. Adjusted scores were calculated by giving all patients full credit for the apparel-naming item. The median adjustment was just 1 point, and the resulting dementia-severity ratings remained unchanged in 97% of the patients. These results show that administration of the item can be defensibly skipped if there is concern about its appropriateness with an individual patient. The adjusted scores provide a viable and fair alternative that preserves the psychometric properties of this useful instrument.
Introduction
The Mattis Dementia Rating Scale (MDRS) 1 is a brief multidimensional cognitive test battery. It consists of 36 tasks grouped into 5 subscales: Attention, Initiation/Perseveration (I/P), Construction, Conceptualization, and Memory. The MDRS is popular with clinicians for several reasons. It is easy to administer, can be completed in about 30 minutes, and is a useful way to monitor dementia severity. It was designed to include easy items in order to avoid the floor effects that limit some tests’ discrimination at lower ability levels. 2
Several studies have demonstrated the diagnostic validity and utility of the MDRS. 3 –5 For example, Monsch and colleagues showed its validity in detecting Alzheimer’s disease in a community sample. 4 Matteau and associates reported that it can be used as a screening test for mild cognitive impairment (MCI) although not for differential diagnosis of MCI subtypes. 6 Other studies have shown the measure’s ability to identify the stages of impairment 4,7 –9 and its ability to predict dementia outcomes. 10,11 It is also useful for determining functional capacity and level of assistance required for daily care needs. 12
We have administered the MDRS extensively for clinical and research purposes in our testing laboratories, and it was included in the Mayo Older Americans Normative Studies (MOANS), extending the standardized comparisons beyond age 100. 10 In spite of its utility, we have long been aware that one task in the battery often causes discomfort and consternation on the part of examiners and, more importantly, patients. For item F in the I/P scale, the patient is asked to look at the examiner and name what that examiner is “wearing and holding.” In our practice, many examiners have commented that this task causes embarrassment. They frequently do not feel comfortable asking patients to focus on their personal attire, especially older patients or those of the opposite gender: “I feel embarrassed when they name my underwear. I can sense they feel uncomfortable with the question.” “Sometimes they don’t even look at me; they just name things that they think I should be wearing.”
Furthermore, serious questions have arisen about the cultural or religious appropriateness of the task for some patient groups (personal communication with Mayo Clinic Language Department interpreter). If the task is embarrassing or offensive to select patient groups, bias is introduced. Persons feeling discomfort might be more reluctant to name items the examiner is wearing, especially those not readily visible, and thus earn a lower score.
There are other concerns with the task as well. For instance, it is unclear whether the examiner should stand before the patient, giving them a full view of their attire, or remain seated. Scoring criteria are unclear when patients name body parts (eg, “brown eyes” or “blonde hair”), items that are not visible such as undergarments, or physical features such as “a smile.” Another issue is that item difficulty will vary for different patients depending on the number of items visible. The number of items actually worn and held by examiners can vary widely, ranging from a simple outfit of pants and a shirt to a layered outfit with many accessories. Further, the manual does not indicate how many items the examiner should “hold.” Thus, the pool of potential correct responses will vary, and some patients might not have the opportunity to earn the full credit of 8 points for the task.
These concerns about cultural sensitivity, standardization, clarity, and patient and administrator comfort led us to examine the characteristics of item F. Our purpose was to determine whether it makes an important contribution to the MDRS total score and dementia severity classification and whether a viable and fair alternative could be developed which would preserve the psychometric properties of the battery.
Methods
Patients
Item F is administered only to individuals who fail to reach a minimum of 14 points for the previous item, a 60-second supermarket fluency task. In this retrospective analysis, we selected 102 consecutive MDRS profiles of individuals who were administered item F on the I/P scale. Patients included 60 males and 42 females ranging in age from 58 to 95 years (M = 75.8, standard deviation [SD] = 7.8) with an average education of 14 years (SD = 3.2). These individuals were clinically referred to the Psychological Assessment Laboratory at the Mayo Clinic in Rochester, MN. The only profiles excluded were those where item F was not administered or MDRS scores were absent from the record.
Measures: MDRS, I/P Scale, and Item F
The only change in the newest edition of the test, the MDRS-2, is the inclusion of MOANS norms in the manual. 2 The test itself is unchanged from the original, so the term “MDRS” is used throughout this manuscript. The MDRS consists of 5 subscales, Attention, I/P, Construction, Conceptualization, and Memory, which comprise the total score. Total scores can range from 0 to 144. Within the I/P subscale, 37 points are possible. Item E, which is supermarket fluency (labeled “Complex Verbal Initiation/Perseveration”), accounts for the most points within the I/P scale (up to 20). Item F, the apparel-naming item being examined here, is labeled “Simple Verbal Initiation/Perseveration” and accounts for up to 8 points. For each of the fluency tasks 60 seconds are allowed. Consonant and vowel repetition tasks as well as motor and graphomotor tasks requiring alternating sequences account for the remaining 9 possible points. As noted previously, item F is administered only in cases where fewer than 14 points are earned on supermarket fluency. For all others (ie, those who earn 14 points or more on supermarket fluency), all 8 points are awarded for item F without administering the task.
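The administration rule and point allocation described above can be sketched in a few lines of Python. This is a minimal illustration, not the manual's scoring procedure: the function and constant names are hypothetical, and the only values used are those stated in this section (20, 8, and 9 points; a cutoff of 14 on item E).

```python
from typing import Optional

ITEM_E_MAX = 20     # supermarket fluency (item E)
ITEM_F_MAX = 8      # apparel naming (item F)
OTHER_IP_MAX = 9    # remaining I/P tasks combined
ITEM_E_CUTOFF = 14  # item F is administered only below this item E score


def item_f_credit(item_e: int, observed_item_f: Optional[int] = None) -> int:
    """Return the item F score under the administration rule in the text."""
    if not 0 <= item_e <= ITEM_E_MAX:
        raise ValueError("item E score must be between 0 and 20")
    if item_e >= ITEM_E_CUTOFF:
        # Item F is not administered; all 8 points are awarded.
        return ITEM_F_MAX
    if observed_item_f is None:
        raise ValueError("item F must be administered when item E is below 14")
    if not 0 <= observed_item_f <= ITEM_F_MAX:
        raise ValueError("item F score must be between 0 and 8")
    return observed_item_f


def ip_total(item_e: int, other_ip: int,
             observed_item_f: Optional[int] = None) -> int:
    """Sum the I/P subscale: item E + item F + the remaining I/P tasks."""
    if not 0 <= other_ip <= OTHER_IP_MAX:
        raise ValueError("remaining I/P points must be between 0 and 9")
    return item_e + item_f_credit(item_e, observed_item_f) + other_ip


if __name__ == "__main__":
    # Hypothetical examinee who passes item E: item F is credited automatically.
    print(ip_total(item_e=16, other_ip=9))                     # 16 + 8 + 9
    # Hypothetical examinee who fails item E: the observed item F score counts.
    print(ip_total(item_e=10, other_ip=7, observed_item_f=5))  # 10 + 5 + 7
```

The three maxima sum to the 37 points available on the I/P subscale, which serves as a quick consistency check on the allocation given above.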
Procedure
The MDRS raw scores for items E and F as well as the sum of scores for the I/P scale, scaled score for I/P, total raw score, and age-adjusted MOANS scaled score were extracted from the profiles. Next, we recalculated the scores after adjusting item F scores. Three adjusted scores were compared, including (1) giving everyone half credit (4 points) for the item, (2) giving everyone the mean score (6 points) for the item, and (3) giving everyone full credit (8 points) for the item (ie, all patients were given the same score for item F regardless of their actual performance on the task). Descriptive data for each of the variables are presented in Tables 1 and 2. Cognitive impairment classification from the MDRS manual was determined before and after score adjustment.
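As a rough sketch of the recalculation step, the three adjustment schemes amount to replacing the obtained item F score with a fixed credit and carrying the difference through to the I/P and total raw scores. The `Profile` class, the `adjust` function, and the example values below are hypothetical. Converting raw scores to MOANS scaled scores and impairment classifications requires the lookup tables in the manual, which are not reproduced here.

```python
from dataclasses import dataclass

# Fixed item F credit under each of the three schemes compared in the text.
SCHEMES = {"half_credit": 4, "mean_credit": 6, "full_credit": 8}


@dataclass(frozen=True)
class Profile:
    item_f: int     # obtained item F raw score (0-8)
    ip_raw: int     # I/P subscale raw score (0-37)
    total_raw: int  # MDRS total raw score (0-144)


def adjust(profile: Profile, credit: int) -> Profile:
    """Replace the obtained item F score with a fixed credit."""
    delta = credit - profile.item_f  # negative if the patient scored above the credit
    return Profile(
        item_f=credit,
        ip_raw=profile.ip_raw + delta,
        total_raw=profile.total_raw + delta,
    )


if __name__ == "__main__":
    # Hypothetical profile, for illustration only.
    obtained = Profile(item_f=7, ip_raw=25, total_raw=106)
    for name, credit in SCHEMES.items():
        print(name, adjust(obtained, credit))
```

Under the half-credit and mean-credit schemes the adjustment can lower a score as well as raise it, whereas full credit can only raise scores or leave them unchanged.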
Table 1. Original and Adjusted Scores.
Abbreviations: DRS, Dementia Rating Scale; I/P, Initiation/Perseveration; MOANS, Mayo Older Americans Normative Study; SD, standard deviation.
a Adjusted scores reflect full credit (8 points) on item F for all patients.
Table 2. Frequency Distribution of Dementia Classification for Pre- and Postscore Adjustment.
Abbreviations: AMSS, Age-Adjusted MOANS Scaled Score; MOANS, Mayo Older Americans Normative Study.
a AMSS and interpretation adopted from DRS-2 Manual. 2
This study was approved by the Mayo Clinic institutional review board.
Results
Raw score distributions for items E and F are shown in Figure 1. A Spearman ρ correlation, used because of the skewed distribution of scores, indicated a modest relationship between the 2 items, r(101) = .35, P < .001. The sample was impaired as a group, by definition, with a mean total score of just 106 (range 52-129). Preadjustment and postadjustment (using the full-credit adjustment) means and SDs are shown in Table 1. Preadjusted and postadjusted impairment classifications are presented in Table 2. When scores were adjusted to add full credit (ie, 8 points) for item F, 97% of the sample remained unchanged in their cognitive impairment classification. One patient’s score changed from the severely impaired to the moderately impaired range, and 2 patients’ scores changed from the moderately to the mildly impaired range. Classification was unchanged for the remaining 99 patients. In all, 88% remained unchanged with respect to MOANS total scaled score; of those whose MOANS total scaled score did change, most moved from a scaled score of 2 to 3 (remaining in the severely impaired range). Changes in the I/P subscale itself were also minimal: 71% of the sample remained unchanged, and most of the changes were from a MOANS scaled score of 2 to 3. The median number of points added to the obtained item F score to give full credit was just 1. The full-credit adjustment proved to be the best of the 3 alternatives we tried. When all patients were given 4 points for item F (ie, half credit), dementia severity rating changed in 12% of them; when all patients were given 6 points, dementia rating changed in 9%. The impact was, therefore, smallest when everyone was given the full 8 points, which changed dementia severity rating in 3%.
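For readers unfamiliar with the rank-based correlation used here, the following is a minimal sketch of Spearman's ρ computed as a Pearson correlation on average ranks, which handles the tied scores that are common with bounded items such as E and F. The function names are hypothetical, and this is not the software used for the analysis reported above, which is not specified.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the ordering of scores enters the calculation, the coefficient is insensitive to the skew in the raw score distributions, which is the reason given above for choosing it.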

Figure 1. Raw score distributions for items E and F.
Discussion
To address potential cultural bias, standardization issues, and patient and examiner discomfort, we explored a scoring alternative for MDRS item F. This item gives patients 60 seconds to name what the examiner is wearing and is administered to a subset of low-scoring examinees. Our analysis showed that giving all patients full credit for item F had little impact on MDRS scaled scores and no impact whatsoever on cognitive impairment severity rating in the vast majority (97%) of patients. Furthermore, this adjustment had a minimal impact on raw scores. About 40% (41 of 102) of the sample obtained full credit on item F on their own, and the median adjustment to give full credit was just 1 point.
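The proportions quoted in this paragraph follow directly from the counts reported in the Results and can be checked with a short calculation. The `percent` helper below is a hypothetical name, and the only inputs are counts stated in the text.

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("count must lie between 0 and a positive total")
    return 100.0 * count / total


SAMPLE_SIZE = 102
COUNTS = {
    "classification unchanged": 99,
    "classification changed": 3,      # 1 severe-to-moderate, 2 moderate-to-mild
    "full credit on item F unaided": 41,
}

if __name__ == "__main__":
    for label, count in COUNTS.items():
        print(f"{label}: {count}/{SAMPLE_SIZE} = {percent(count, SAMPLE_SIZE):.1f}%")
```

These work out to 97.1%, 2.9%, and 40.2%, respectively.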
The test developers intended a hierarchical design, wherein easier items are administered only if more difficult items are failed. 2 Item F (labeled “Simple Verbal Initiation/Perseveration”) is administered only if item E (supermarket fluency, labeled “Complex Verbal Initiation/Perseveration”) is failed. Thus, apparel naming serves as a presumably easier proxy for the supermarket fluency item. However, it is not clear whether item F is actually easier, as the mean number of words generated in 60 seconds is much lower. Furthermore, difficulty of the task probably varies depending on how many items the examiner is actually wearing and holding.
The correlation between the 2 tasks was only modest in our sample (r = .35), and it is not clear that they measure the same construct. We did not find descriptions of any tasks similar to item F in the literature, nor information on the construct validity of the item F task itself, independent of the I/P scale. Evidence for the construct validity of the I/P scale has been modest at best 13 and has usually consisted of comparing I/P with phonemic fluency scores on the Controlled Oral Word Association Test. 14 Smith and colleagues 10 found that the I/P scale has particularly poor internal consistency, again suggesting that the various items tap different constructs, and concluded that the items should not be aggregated into a single subscale score. Regardless of whether the tasks tap similar constructs, we found that using an adjusted score was an acceptable alternative to administering item F. The adjustment applies to only a small subset of patients who are quite impaired as a group and could be applied in select patients at the clinician’s discretion.
These findings are salient in the context of concerted efforts to reduce bias and improve cross-cultural assessment. 15 Our results do not obviate the need for continued efforts to generate culturally matched normative data, develop translations, and determine the validity of the battery as a whole in various populations. 3,16,17 Our results do, however, offer a psychometrically sound alternative to item F in cases where its administration could compromise the clinical milieu. Our study was limited to 102 patients using a retrospective design. Future studies on this issue could include a larger sample and prospectively survey participants for their perspectives.
Footnotes
Acknowledgments
The authors would like to thank Lori L. Solmonson for assistance with the manuscript and Tiffany J. Vedamuthu for assistance in abstracting test protocols and entering data.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
