Abstract
Many patients with presumptive Alzheimer’s disease (AD) or other dementias may show minimal impairment on the Boston Naming Test (BNT), a visual confrontation naming measure. We sought to determine whether a semantic naming test, the Auditory Naming Test (ANT), would improve accuracy for identifying naming deficits in patients diagnosed with dementia (
Language impairment, especially naming difficulty, occurs frequently in many dementias and may be present even prior to clinical diagnosis (Cummings & Benson, 1992; Jelcic et al., 2012; E. Miller, 1989). The severity of naming deficits is more pronounced in some dementias such as Alzheimer’s disease (AD), primary progressive aphasia (PPA) temporal lobe variant of frontotemporal lobar degeneration, and dementia with Lewy bodies (DLB) than in others such as subcortical vascular dementia (VaD) or Huntington’s disease (Brandt, Bakker, & Maroof, 2010) or in amnestic mild cognitive impairment (aMCI; Duong, Whitehead, Hanratty, & Chertkow, 2006). Naming deficits are not a clinical feature of behavioral variant frontotemporal dementia (FTD; Laforce, 2013). Hence, detection of these naming deficits may be important in the differential diagnosis of dementia or in distinguishing dementias such as AD from possible pre-dementia states such as aMCI.
Our understanding of the neural networks that underlie naming is informed by studies with both normal participants and patients suffering from neurological disorders (Baldo, Arévalo, Patterson, & Dronkers, 2013; Gleichgerrcht, Fridriksson, & Bonilha, 2015; Hamberger, Habeck, Pantazatos, Williams, & Hirsch, 2014; Hamberger & Seidel, 2003). Naming of objects or abstract entities is a multimodal process supported by cortical networks that include visual processing and recognition, both intentional and automatic semantic processing, abstract representation, and execution of speech output (Duong et al., 2006; Gleichgerrcht et al., 2015). Brain regions involved include the visual cortex in the occipital lobe (bilaterally), the occipitotemporal/fusiform regions (bilaterally), anterior temporal cortices (bilaterally), left posterior superior temporal gyrus, left angular gyrus, left inferior frontal gyrus, the left posterior inferior frontal gyrus, and subcortical structures. Pathology in one or more of these regions gives rise to the characteristic naming difficulties of different disorders including specific dementias (Gleichgerrcht et al., 2015). For example, the visual agnosia common to posterior cortical atrophy may be present in advanced, but not early-stage, AD. Pathology in more anterior regions (i.e., in the temporal cortex) may impair word meaning or word retrieval, including the popularly known tip-of-the-tongue phenomenon. These impairments are common in AD, PPA variant of frontotemporal lobar degeneration, and focal strokes (Gleichgerrcht et al., 2015).
The Boston Naming Test (BNT; Kaplan, Goodglass, & Weintraub, 1983), a visual confrontation naming test, is the most frequently used instrument for assessing naming disorders. Until recently, the BNT was the only naming test used in a typical neuropsychological dementia battery (Hobson et al., 2011; B. W. Williams, Mack, & Henderson, 1989). However, the BNT suffers from a high false negative rate (Domoto-Reilly, Sapolsky, Brickhouse, & Dickerson, 2012; Lansing, Ivnik, Cullum, & Randolph, 1999). In addition, there is age-related decline on the BNT in normal participants (Lansing et al., 1999; MacKay, Connor, & Storandt, 2005; Randolph, Lansing, Ivnik, Cullum, & Hermann, 1999), resulting in relaxed cutoff scores and unintentionally missed diagnoses in patients with mild dementia (B. W. Williams et al., 1989). BNT total scores are positively correlated with education in both normal and AD populations (Lansing et al., 1999; Randolph et al., 1999).
The BNT has been criticized on a number of grounds including poor psychometric properties, inadequate standardization, inadequate norms, inadequate sampling of categories, and failure to encompass all the processes involved in the multifaceted construct known as naming (Harry & Crowe, 2014). Severity of anomia measured on the BNT varies with the specific type of dementia and with visual perceptual problems that are sometimes contributory (Braaten, Parsons, McCue, Sellers, & Burns, 2006; Harnish et al., 2010; Harry & Crowe, 2014; Lukatela, Malloy, Jenkins, & Cohen, 1998; Stern, Richards, Sano, & Mayeux, 1993; V. G. Williams et al., 2007). The utility of the BNT in the diagnosis of dementia of the Alzheimer’s type has been criticized because naming deficits are typically evident only in moderate-to-severe AD, but not in mild AD or mild cognitive impairment (MCI; Testa et al., 2004). As many as 59% of patients with very mild or mild AD performed in the normal range on the BNT (Domoto-Reilly et al., 2012). The BNT is also susceptible to deficits in visual recognition (Baldo et al., 2013).
In recent years, another naming test has become available. Hamberger and Seidel (2003) developed the Auditory Naming Test (ANT) as an alternative to visual naming tests. The ANT requires participants to respond directly to semantic cues while bypassing the initial steps of visual perception and recognition, which are required with the BNT. Thus, rather than being presented with a line drawing of an object or animal as on the BNT, the patient is given a verbal cue (e.g., “what a king wears on his head”). The ANT may be less susceptible than the BNT to limited vocabulary because items are more familiar to most participants (Hamberger & Seidel, 2003; Yochim, Rashid, Raymond, & Beaudreau, 2013). As with the BNT, total correct item-response scores may be quantified on the ANT. The ANT also allows for determination of two other measures that are even more sensitive than total word response scores: tip-of-the-tongue extended latency scores and reaction times per response.
Functionally, there are both similarities and differences between the ANT and BNT: Both tests share common left temporal lobe neural networks involved in lexical–semantic naming (Hamberger et al., 2014). However, the ANT, but not the BNT, specifically activates the left hemisphere. Unlike the BNT, the ANT activates left frontal regions but not the left parietal lobe.
The ANT has been found to be a reliable, valid, and sensitive instrument for revealing naming deficits in non-geriatric patients with left temporal lobe epilepsy (Hamberger & Seidel, 2003). It was of interest to us to consider the ANT as an alternative to the BNT in diagnosing naming deficits in an older sample of patients with dementia. Considering both the overlap and the real differences between these two tests, we sought to determine whether the ANT would be more revealing than the BNT for naming deficits in dementia. Indeed, research indicates that cognitively intact elderly participants (Hanna-Pladdy & Choi, 2010) and those with dementia are more likely to show impairment on auditory naming tests than on visual naming tests (Brandt et al., 2010; K. M. Miller, Finney, Meador, & Loring, 2010). However, these studies used either another auditory naming test or an abbreviated form of the ANT with less well-characterized psychometrics than the complete ANT. Population samples were of limited size. The contributions of age, education, and sex were not considered in the dementia population.
Preliminary research with the full ANT demonstrated its clinical utility in the diagnosis of dementia (Cuesta, Hirsch, & Jordan, 2004; Hirsch, Cuesta, & Jordan, 2008). We sought to more fully explore this relationship with a larger population. Our goals were to determine (a) whether the ANT was more likely than the BNT to identify naming difficulties in patients with cognitive impairment, (b) which naming test was more likely to reveal naming deficits in amnestic MCI or specific subgroups of dementia, and (c) the role of age, education, and sex or degree of cognitive impairment on auditory versus visual naming in patients with dementia. Last, we sought to develop norms on the ANT for a normal older population and to determine whether normal aging influences auditory naming.
Method
Participants
Five hundred fifty-nine patients with memory or other cognitive complaints were referred by their physicians to our outpatient Memory Evaluation and Treatment Service for assessment. All were native English speakers or had learned English before age 5. Two hundred forty-nine were male, and 310 were female. Mean age was 76.9 (± 8.3) years, and mean education was 14.5 (± 3.1) years. Patients were evaluated over the course of a 10-year period by an interdisciplinary team of neurologists, geriatricians, and neuropsychologists using diagnostic criteria consistent with the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA; Dubois et al., 2007; see below).
A volunteer group of older orthopedic patients with no cognitive complaints was also evaluated. Using a cutoff score of 27 on the Mini Mental State Exam (MMSE), 31 participants were retained to serve as a non-demented normal sample (“ortho normal”) for normative and comparative purposes according to the procedure of O’Bryant et al. (2008). Mean age was 72.9 (± 6.4) years, and education was 16 (± 3.2) years.
Standard Protocol Approvals, Registration, and Consents
An ethical standards committee of the Independent Review Board for clinical research at our institution approved (a) the retrospective analysis of data presented in this article for determination of the effectiveness of tests used to identify naming deficits in a population presenting with complaints of cognitive impairment and (b) conducting a study on a normal sample of volunteers with no memory complaints.
Procedures
The diagnosis of dementia and specific subtypes was based on the clinical team’s consensus. Factors considered were clinical presentation, course of illness, neuroimaging, laboratory results, ratings on activity of daily living scales (Lawton & Brody, 1969; Pfeffer, Kurosaki, Harrah, Chance, & Filos, 1982), informants’ reports of behavior, and performance on neuropsychological tests. The neuropsychological battery, a modification of the one used by Stern et al. (1992), consisted of tests of mental status (MMSE), Standardized Assessment of Concussion (SAC), verbal memory (Buschke Selective Reminding Test, Logical Memory Test from the Wechsler Memory Scale–Third Edition or WMS-III), visuospatial ability and visual memory (Rey Complex Figure Test, Benton Visual Retention Test: matching and recognition), visual confrontation naming (BNT), auditory naming (ANT), letter fluency (Controlled Oral Word Association Test), semantic fluency (Animal Naming), repetition of phrases, verbal comprehension (Complex Ideational Material), verbal abstract reasoning (Similarities subtest from the Wechsler Adult Intelligence Scale–Third Edition or WAIS-III), visual abstract reasoning (Identity and Oddities subtest from the Mattis Dementia Rating Scale), verbal attention (Digit Span subtest from the WAIS-III), visual attention (Digit Symbol Coding subtest from the WAIS-III; Trail Making Test part A), and mental flexibility (Trail Making Test part B).
Patients were considered cognitively impaired on specific tests if they performed below recommended cutoff scores, or in the absence of cutoff scores, were at least two standard deviations below the means from published norms. To clinically diagnose a patient with dementia, there had to be impairment in verbal memory and at least one other cognitive domain (e.g., visuospatial ability, executive function, language) as well as impairment in activities of daily living. Impairment in memory only resulted in a clinical diagnosis of aMCI. Volunteer orthopedic patients who scored 27 or above on the MMSE served as an additional matched control normal sample of non-demented participants (O’Bryant et al., 2008).
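The psychometric part of the decision rule described above can be illustrated with a minimal sketch. It is a simplification: the actual diagnoses were reached by clinical consensus using many additional factors, and the names `DomainScore`, `is_impaired`, and `classify` are hypothetical, as are the domain labels and all numbers in the usage lines. The sketch also assumes that higher scores mean better performance, which does not hold for timed tests.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class DomainScore:
    """One test result (hypothetical structure, not from the study)."""
    domain: str                        # e.g. "verbal_memory", "language"
    score: float                       # raw score; higher is assumed better
    cutoff: Optional[float] = None     # recommended cutoff, if one exists
    norm_mean: Optional[float] = None  # published normative mean
    norm_sd: Optional[float] = None    # published normative SD


def is_impaired(result: DomainScore) -> bool:
    """Below the cutoff, or at least 2 SD below the norm mean if no cutoff."""
    if result.cutoff is not None:
        return result.score < result.cutoff
    if result.norm_mean is None or result.norm_sd is None:
        raise ValueError("need either a cutoff or normative mean and SD")
    z = (result.score - result.norm_mean) / result.norm_sd
    return z <= -2.0


def classify(results: Iterable[DomainScore], adl_impaired: bool) -> str:
    """Apply only the test-based portion of the rule described in the text."""
    impaired = {r.domain for r in results if is_impaired(r)}
    memory = "verbal_memory" in impaired
    other_domains = impaired - {"verbal_memory"}
    if memory and other_domains and adl_impaired:
        return "dementia"
    if memory and not other_domains:
        return "aMCI"
    return "unclassified"


# Hypothetical scores, for illustration only.
example = [
    DomainScore("verbal_memory", score=3, cutoff=5),
    DomainScore("language", score=40, norm_mean=55, norm_sd=5),
]
print(classify(example, adl_impaired=True))
```

Cases that satisfy neither branch are returned as "unclassified" because the text does not specify how they were handled.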
Statistical Method
The data were analyzed with SPSS, Version 23 for Mac (IBM, Armonk, New York) or JMP statistical discovery software, Version 12 (SAS, Cary, North Carolina). Descriptive analyses were run to determine frequencies and distributions of all variables. For the current article, only demographic variables and performance on the MMSE, BNT total score (correct spontaneous responses plus stimulus cued responses), and ANT total score responses are shown for specific dementia groups, patients with aMCI, and non-demented normal patients. Norms are also shown for a non-demented orthopedic sample. Because MMSE, ANT, and BNT raw scores were not normally distributed, non-parametric statistics were used for ANOVA (Kruskal–Wallis), matched pairs comparisons (Mann–Whitney
To delineate the differences in performance on the ANT and BNT, raw scores for each patient or participant were converted to dichotomous impairment scores (1 =
Cutoff Scores for ANT and BNT Correct Total Scores.
During the course of this evaluation and diagnosis of patients, two studies were published with improved stratification of elderly participants administered the BNT (Gangulia et al., 2013; Zec, Burkett, Markwell, & Larsen, 2007). Because we were concerned that some of our patients may have been misclassified as having no naming impairment on the BNT, we re-analyzed these data using the stricter cutoff scores; those results are also reported in this study (Table 1). Logistic regression analyses, receiver operating characteristic curves with area under the curve (ROC AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were determined for the major groups with dementia.
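For readers less familiar with these indices, the sketch below shows how sensitivity, specificity, PPV, and NPV follow from a 2 × 2 table of test outcome against reference diagnosis. The function name `diagnostic_metrics` and the counts in the usage lines are hypothetical and are not data from this study.

```python
from typing import Dict, Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Standard accuracy indices from a 2 x 2 table.

    tp: impaired on the test, condition present
    fp: impaired on the test, condition absent
    fn: not impaired on the test, condition present
    tn: not impaired on the test, condition absent
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),  # detected among those with the condition
        "specificity": _ratio(tn, tn + fp),  # cleared among those without it
        "ppv": _ratio(tp, tp + fp),          # condition present given a positive test
        "npv": _ratio(tn, tn + fn),          # condition absent given a negative test
    }


# Hypothetical counts, for illustration only.
for name, value in diagnostic_metrics(tp=80, fp=5, fn=20, tn=45).items():
    print(name, value)
```

PPV and NPV depend on how common the condition is in the sample, so values obtained in a memory clinic would not necessarily carry over to settings with a different base rate.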
Exploratory factor analysis with dementia patients was conducted on the entire battery to determine factor structure and thus more fully delineate the characteristics of our neuropsychological tests in this population. (Performances on the MMSE and SAC were excluded from this analysis because they are omnibus screening tests rather than instruments measuring distinct cognitive domains.) Initially, principal components analysis was conducted. This was followed by examination of the scree plot and eigenvalues. The final stage involved rotation using the Promax procedure.
Results
Characteristics of Clinical Groups
The vast majority of our patient population satisfied the clinical criteria for a diagnosis of dementia (
The characteristics of the AD, VaD, mixed AD/VaD, and aMCI patients are shown in Table 2. Also included for demographic comparisons only are normal patients with memory complaints and normal orthopedic volunteers. AD patients were slightly older and less educated than normal patients or normal orthopedic volunteers. The AD and mixed AD/VaD groups were generally the most similar in terms of overall scores on the MMSE, BNT, and ANT. The normal patients and MCI patients were indistinguishable on any of the variables.
Clinical Groups Analyzed.
Relationship Between ANT and BNT
Because the major focus of this study was a comparison of the ANT with the BNT, we examined the relationship between the two tests and then determined how the same patients performed on both tests. First, we determined that the correlation between the ANT and BNT was moderate for normal patients (ρ = 0.45,
Factor Analysis
Factor analysis of neuropsychological test score performance was conducted on our sample of patients diagnosed with dementia. On the basis of the scree plot and eigenvalues, a four-factor solution was derived (Table 3). As can be seen, the ANT and BNT total scores loaded highly on Factor 3, demonstrating that they broadly measure a similar construct, naming, in dementia patients. Other factors revealed were a visual factor (Factor 1), a verbal memory factor (Factor 2), and a visual memory factor (Factor 4).
Rotated Factor Loadings of the Neuropsychological Battery.
Frequency of Impairment
The McNemar test was performed for the entire dementia sample, as well as specific dementia subgroups, using cutoff scores from published norms at the time of initial diagnosis (Hamberger & Seidel, 2003; Spreen & Strauss, 1998). For the entire dementia sample, twice as many scored impaired on the ANT (68.3%) as the BNT (34.6%; Table 4;
Observed Frequencies of Impairment With ANT Versus BNT in All Dementias.
Observed Frequencies of Impairment With ANT Versus BNT in AD.
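The McNemar comparison used in this section depends only on the discordant pairs, that is, patients impaired on one naming test but not the other. A minimal sketch of the statistic follows. The function name `mcnemar_test` and the counts are hypothetical, and whether a continuity correction was applied in the reported analyses is not stated, so the sketch exposes it as an option.

```python
import math
from typing import Tuple


def mcnemar_test(ant_only: int, bnt_only: int, continuity: bool = True) -> Tuple[float, float]:
    """McNemar chi-square (1 df) for paired dichotomous outcomes.

    ant_only: patients impaired on the ANT but not on the BNT
    bnt_only: patients impaired on the BNT but not on the ANT
    Patients impaired on both or on neither do not enter the statistic.
    """
    discordant = ant_only + bnt_only
    if discordant == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    diff = abs(ant_only - bnt_only)
    if continuity:
        diff = max(diff - 1, 0)  # Edwards continuity correction
    chi2 = diff ** 2 / discordant
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical discordant counts, for illustration only.
chi2, p = mcnemar_test(ant_only=50, bnt_only=10)
print(f"chi2(1) = {chi2:.2f}, p = {p:.2g}")
```

With few discordant pairs, an exact binomial version of the test is usually preferred over the chi-square approximation shown here.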
As a check against misdiagnosis and possible spurious statistical effects, the BNT data were re-analyzed using more stringent cutoff scores from norms that were not available at the time of the initial diagnoses of many of these patients (Gangulia et al., 2013; Zec et al., 2007). These analyses revealed that there was a significant shift in diagnoses from less to more impaired on the BNT regardless of which set of newer norms were used (
Frequency of Impairment on the Boston Naming Test as a Function of Normative Data Used.
Logistic Regression: BNT Impaired Scores Versus ANT Impaired Scores.

ROC curve for impairment with BNT and ANT in AD.
The pattern of greater frequency of detecting naming disorders with the ANT versus the BNT in the VaD sample with the McNemar test was virtually indistinguishable from that in AD: χ2(1,
The pattern in the mixed AD/VaD sample was similar qualitatively, but far less pronounced in all parameters: χ2(1,
There were no significant differences in impairment frequencies on the ANT versus BNT in the DLB group or the aMCI group. There was no impairment with either the BNT or ANT in either of the two samples of normal individuals (i.e., normal patients seen at our memory evaluation clinic or an orthopedic control sample).
Interaction Effects
Normal sample
Correlation coefficients between ANT total score and age or education in the orthopedic control group (
AD sample
BNT total scores (spontaneous responses and response to stimulus cues) were directly related to level of education. The Kruskal–Wallis
The Kruskal–Wallis
Last, we sought to determine whether education or level of cognitive impairment attenuated or conversely accentuated the relative differences between the ANT and the BNT in each test’s ability to reveal a naming deficit in our largest group, those who were diagnosed with AD. To this end, we again determined the frequencies of impairment on the BNT versus ANT, using the cutoff scores of Zec et al. (2007) and Hamberger and Seidel (2003), but this time performed the analyses per stratum of each education level (discussed above) or range of scores on the MMSE. The latter was chosen as a proxy for dementia severity. Although not without controversy in terms of its sensitivity to detect dementia, especially in a highly educated sample, there is nevertheless some precedent for using different scores on the MMSE as a crude measure of dementia severity (Perneczky et al., 2006; Reisberg et al., 2011). Because we were analyzing these data in patients that we had already diagnosed as having dementia, we made slight modifications in the labels used that were associated with specific MMSE score ranges:
Frequency of Impairment in BNT or ANT Versus Level of Education.
Impairment in BNT or ANT Versus Level of Dementia.
Discussion
The present study supported our rationale for inclusion of the ANT in a neuropsychological battery for assessing patients for possible naming deficits in dementia. Using two instruments with well-defined reliability and validity, the ANT and the BNT, our research revealed that the ANT was more sensitive than the BNT in detecting naming difficulties in most patients with dementia, especially those with AD, VaD, and mixed AD/VaD, but not in those with a possible pre-dementia state, aMCI. Although word-finding difficulty may be the first cognitive difficulty reported by many patients with suspected dementia, it may not be revealed in a battery that uses only the BNT, the most commonly used visual confrontation naming test.
The failure to consistently demonstrate a naming difficulty with the BNT may be a consequence of the less than desirable ecological validity of this test. Specifically, visual confrontation naming test performance does not correlate as well with subjective complaints of word-finding difficulties as does auditory naming (Hamberger & Seidel, 2003). In part, this is related to the inclusion of low-frequency (harder) items on the BNT, such that what might be measured is less anomia than a limited expressive vocabulary. The ANT controls for this confound.
Another potentially significant consideration is the interaction between task and neuroanatomical demand. There are, indeed, distinctions in this regard between visual confrontation naming and auditory naming. However, the findings have been inconsistent with regard to neuroanatomical regional localization, choice of participants (e.g., temporal lobe epilepsy patients vs. normal participants), or experimental design. Earlier research suggested that auditory naming was more diffusely localized than visual confrontation naming (Hamberger, Seidel, McKhann, & Goodman, 2010; Tomaszewski, Harrington, Broom, & Seyel, 2005). However, a more recent study by Hamberger et al. (2014) demonstrated that the two modalities tapped both unique and overlapping brain regions. The authors suggest that auditory naming, which involves frontal activation, might reflect a higher level of response competition and resolution than visual confrontation naming. They also argue that the ANT may be more language-specific and left (dominant) hemisphere–focused. We posit that this greater neuronal and cognitive demand may contribute to the greater sensitivity of the ANT than the BNT in patients with specific dementias. However, the more extensive bilateral activation required in visual confrontation may not significantly interfere with performance in patients with mild AD or some other dementias when the integrity of these pathways has not been compromised through neuropathology.
Thus, the distinction between the auditory and visual confrontation naming tests is yet another example of the well-recognized phenomenon in neuropsychological assessment that few, if any, cognitive constructs are unitary, which justifies inclusion of two or more instruments that measure the same broad construct. Indeed, certain omnibus tests such as the WMS contain several highly inter-correlated but nevertheless distinct subtests of both verbal and visual memory. Therefore, rather than being redundant, the inclusion of the ANT alongside the BNT in our battery for assessing patients with suspected dementia actually improved clinical utility and supported the argument made by Hamberger et al. (2014), who concluded that using only one naming test runs the risk of failing to detect a genuine naming deficit (i.e., a false negative). Such was our rationale for both clinical and research purposes, as it had been our observation prior to such inclusion that, despite patient complaints of naming difficulties, we were often not able to demonstrate this impairment psychometrically when we administered the BNT alone.
Inclusion of the ANT in our battery substantially improved the odds of detecting naming deficits in patients with AD, VaD, and mixed AD/VaD dementia. Sensitivity, specificity, and PPV in diagnosing naming impairments in dementia were respectable when the ANT was added to our battery. Because the ANT failed to reveal significant naming deficits in our aMCI and normal samples, the false positive rate associated with the ANT was minimal.
Our research is consistent with that of other investigators who also found auditory naming to be sensitive to naming problems in patients with dementia. However, their findings are directly comparable neither with one another nor with ours because of differences in the tests used or the test items selected for assessment, the populations contrasted, or the statistical analyses performed. K. M. Miller et al. (2010) reported seemingly inconsistent findings. A mixed sample of patients with mild-to-moderate dementia of various etiologies performed more poorly on the ANT than on the Columbia Visual Naming test, a visual confrontation naming test developed by Hamberger and Seidel (2003). However, they used only a subset of items from the ANT and the Columbia Visual Naming test, each with unknown psychometrics. When the abbreviated version of the ANT was compared with a 15-item version of the BNT, the results were the converse: the patients from their combined dementia group showed greater accuracy on the ANT than on the BNT. It is important to note that a methodologically questionable procedure was used, in which the full 60-item BNT was administered to 10% of their patient sample and the score was then divided by four for comparative purposes. It is unknown what confound was introduced by administering the full test to some of their participants.
Brandt et al. (2010) reported that patients with mild-to-moderate AD performed more poorly than normal controls on an auditory naming test. Both patients and controls had greater difficulty with auditory naming than with visual confrontation naming. Of note, this study was performed with an auditory naming test that the authors themselves developed with stimuli consisting of sound files associated with objects (e.g., musical instrument, vehicles) or animals, not with semantic stimuli from the ANT. In addition, their comparison with visual naming was also with their own test and not with the BNT, although that test was also administered. In patients with mild AD, their auditory naming test had the identical sensitivity to their visual confrontation naming test, but with lower specificity. Consequently, an auditory naming test did not improve diagnostic accuracy relative to visual confrontation naming. Regardless of the differences among these three studies, auditory naming tests did show sufficient sensitivity to reveal naming problems in a population of patients with dementia. In that regard, the ANT serves a useful diagnostic role when assessing patients with putative dementia. This was apparent in our study groups with specific types of dementia such as AD, VaD, and mixed AD/VaD dementia, a finding consistent with the research of others (Brandt et al., 2010; K. M. Miller et al., 2010).
Because sample sizes in patients with other types of dementia were too small to meaningfully analyze in the present study, it was impossible to determine how well the ANT could reveal the presence or absence of naming problems in those groups. This limitation in sample size also applied to both our normal, non-demented patient group and our orthopedic controls and did not allow for statistical comparison with our dementia groups. Therefore, normative data for these elderly normal populations should be considered preliminary. Future research with larger samples may be more informative.
Last, consistent with previous research, both ANT and BNT scores were positively correlated with educational level but negatively correlated with severity of dementia in patients with AD. Most important, regardless of education level or severity of dementia, the greater sensitivity of the ANT relative to the BNT remained consistent. These findings support our contention that the ANT may be of particular diagnostic utility in revealing naming deficits in less severely impaired and better educated individuals with dementia, at a point where the BNT alone may be insensitive. However, AD patients’ BNT scores were inversely correlated with age, but ANT scores were not. The failure to establish an age-related decline in auditory naming was surprising but may reflect the relative insensitivity of accuracy scores on the ANT to this phenomenon. The ANT has two other parameters that are even more sensitive than accuracy scores: “tip-of-the-tongue” extended latencies and response times. We are presently exploring whether these scores will reveal a possible age effect in patients with dementia as well as naming difficulties on the ANT in patients with aMCI.
Acknowledgements
The authors gratefully acknowledge Dr. Marla Hamberger for providing them with stimuli for the Auditory Naming Test. They also thank Dr. Jessica Elder and Ms. Kristin Bonistall for assistance with the statistical analyses.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research and/or authorship of this article.
