Abstract
Background
In prior research, the Digital Assessment of Cognition (DAC), a brief digitally administered neuropsychological protocol that assesses verbal episodic memory, verbal working memory, and language, has been used to classify a small sample of memory clinic patients (n = 77) into four meaningful clinical groups.
Objective
The current research sought to extend these findings with a considerably larger sample.
Methods
The DAC was administered to 179 ambulatory care/memory clinic patients (45.30% female; 91.10% Caucasian). A comprehensive analysis of DAC core outcome measures and behavior reflecting process/errors was undertaken. Traditional paper/pencil assessment was also obtained. Using Jak, Bondi criteria (2009), paper/pencil test results classified patients into five groups: cognitively unimpaired (CU; n = 74), subtle cognitive impairment (SCI; n = 21), amnestic mild cognitive impairment (aMCI; n = 21), combined dysexecutive/mixed MCI (dys/mxMCI; n = 22), and mild dementia (n = 41).
Results
The aMCI group presented with many of the classic features consistent with amnesia, i.e., rapid forgetting, reduced free recall clustering, and profligate responding to recognition foils. Latency for correct recognition responding was slower for the aMCI group than for the CU group and appears to be associated with a neurocognitive network measuring both memory and language-related operations. SCI and dys/mxMCI groups tended to produce more perseverations on working memory test trials and obtained lower scores on DAC executive outcome measures that assessed auditory span and semantic fluency.
Conclusions
These findings support the criterion and construct validity of the DAC. When brought to scale the DAC could be an effective tool to assess for emergent MCI and dementia syndromes.
Introduction
Interest regarding how to use and deploy brief digital assessment tests is growing.1–6 Indeed, there are a number of advantages when suspected neurocognitive impairment is assessed using digital assessment technology. First, a critical issue revolves around the reliability for both test administration and scoring. Commonly used paper and pencil tests such as the Mini-Mental State Examination (MMSE), 7 the Montreal Cognitive Assessment (MoCA), 8 and the Saint Louis University Mental Status Examination 9 ask patients to write, draw, or copy geometric figures. How these test items are scored from examiner to examiner can be variable. Digital assessment technology obviates this problem as test administration and scoring are carried out automatically, thus ensuring the highest degree of objectivity and reliability. Accurate identification of patients for treatment targeting cognitive impairment, including potentially higher risk options like monoclonal antibody therapy, requires objective measurement. 10 Thus, the use of digital assessment technology can result in better clinical decision making. A second advantage of digital assessment technology revolves around the capacity to, as it were, draw back the curtain and reveal the occult or hidden content related to key neurocognitive constructs that underlie summary scores from commonly administered paper/ pencil neuropsychological tests via an analysis of the process and errors associated with both correct and incorrect responding.4–5,11–13
The Digital Assessment of Cognition (DAC) is a brief, 7-min, iPad administered and scored protocol of neuropsychological tests. 4 The DAC assesses verbal episodic memory using a 6-word version of the Philadelphia (repeatable) Verbal Learning Test 14 (P[r]VLT). Verbal working memory is assessed with three trials of 5 digits backward.15–18 Finally, language and lexical/ semantic operations are assessed with the ‘animal’ fluency test. 19 In previous research, Libon and colleagues 4 administered the DAC to a comparatively small group of memory clinic patients (n = 77). In this prior research, person-centered statistics using a core of four DAC outcome variables was able to classify patients into four clinically meaningful groups suggesting cognitively unimpaired (CU), mild dementia, amnestic MCI (aMCI), and dysexecutive MCI (dMCI).
In the current research, the DAC, along with a protocol of paper and pencil neuropsychological tests, was administered to a larger sample of memory clinic patients. Using the results from paper/pencil tests, patients were classified as presenting with normal cognitive abilities (CU), subtle cognitive impairment (SCI), 20 mild cognitive impairment (MCI)21,22 or mild dementia. The goal of this research was to extend prior findings 4 and assess how well the DAC can dissociate patients into clinical groups.
Methods
Participants
The current research examined a group of 185 memory clinic patients. Six protocols were removed because of internet connectivity problems. This resulted in a final sample of 179 participants (45.30% female; 91.10% Caucasian). Participants came from two sources including: (1) the Rowan-Virtua, School of Osteopathic Medicine (SOM), the New Jersey Institute for Successful Aging, (NJISA) Memory Assessment Program (MAP; n = 138) and (2) outpatient referrals for neuropsychological assessment for suspected dementia (n = 41). The NJISA MAP program provides a comprehensive, outpatient evaluation, and work-up for suspected alterations involving neurocognition and personality/ behavior. In addition to neuropsychological assessment, MAP patients were evaluated by a board certified psychiatrist.
In conjunction with the paper/pencil neuropsychological evaluation described below, the MMSE 7 was administered; a detailed clinical interview was conducted with patients and their family members; and family members were asked to rate functional disabilities using the Lawton and Brody Activities of Daily Living/Instrumental Activities of Daily Living questionnaire. 23 A CT or MRI study of the brain was routinely obtained to rule out potentially treatable medical illness. Serum tests including a CBC, CMP, thyroid/B12, folate, and an analysis of lipids were obtained. The sample does not contain any patients with sudden or de novo stroke, or patients diagnosed clinically with idiopathic Parkinson's disease or fronto-temporal lobe dementia.
Participants were excluded from this study if English was not their first language; or if there was any history of head injury, substance abuse, a significant psychiatric disorder such as major depression, other neurologic illness such as epilepsy, or metabolic disorders such as B12, folate, or a thyroid deficiency. This study was approved by the Rowan University Institutional Review Board with consent obtained consistent with the Declaration of Helsinki.
The digital assessment of cognition
The order of DAC test administration was identical to that described by Libon and colleagues 4: two 6-word Philadelphia (repeatable) Verbal Learning Test 14 [P(r)VLT] immediate free recall test trials; the semantic/‘animal’ fluency test (60 s); three trials of 5-digits backwards from the Backward Digit Span Test (BDST)17,18; a 6-item depression/anxiety screening inventory; and, to conclude, the P(r)VLT delay free recall and delay recognition test conditions.
The 6-word P(r)VLT was modeled after the California Verbal Learning Test 24 and the original P(r)VLT 9-word version as described by Price and colleagues. 14 For this test, 2 words were drawn from each of 3 semantic categories (fruits, tools, school supplies). The iPad spoke the words at a rate of one word per second. Two immediate free recall test trials were administered (range 0–6). After each immediate free recall test trial, the iPad asked the participant to verbally recall as many words as possible.
Throughout the test administration, participants spoke their responses aloud, and the iPad recorded all speech for later processing and analysis. The protocol was administered using an 11-inch Apple iPad Pro. Consistent with previous research, 4 all testing was proctored; instructions were delivered verbally by the iPad; and the iPad was positioned on a flat surface in the portrait orientation. As listed in Supplemental Table 1, the DAC contains a corpus of seven core outcome measures; and an additional corpus of 10 outcome measures that assess process and errors (see references11–13).
Core DAC outcome measures
The Philadelphia (repeatable) Verbal Learning Test. The four core outcome measures drawn from the P(r)VLT include: trial 1, immediate free recall; trial 2, immediate free recall; delayed free recall; and the delayed recognition ratio. On the delayed recognition test condition, participants saw and heard the iPad display groups of three words. Each group of three words contained one of the original target items along with one prototypic semantic foil (e.g., apple, hammer) and one generic semantic foil (e.g., peaches, wrench). Six recognition trials were administered, and participants were asked to touch the one word that was part of the original word list. The relation between identifying targets from the original word list (i.e., correct hits) and correctly rejecting recognition foils was expressed with a correct hit versus false positive recognition ratio, i.e., [recognition hits / (recognition hits + total incorrect recognition foils)]. This formula (range = 0.00–1.00) was modeled after Rascovsky and colleagues. 25 A higher score reflects more correctly identified recognition hits relative to incorrectly endorsed recognition foils.
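The recognition ratio described above can be sketched in a few lines of Python. This is only an illustration of the stated formula; the function name and the handling of a case with no responses are assumptions, not part of the DAC scoring software.

```python
def recognition_ratio(hits: int, foils: int) -> float:
    """Correct hit versus false positive ratio: hits / (hits + foils).

    `hits` is the number of original list words chosen; `foils` is the
    total number of prototypic plus generic foils chosen.
    """
    if hits < 0 or foils < 0:
        raise ValueError("counts must be non-negative")
    total = hits + foils
    if total == 0:
        # Assumption: with no scored responses the ratio is undefined.
        return float("nan")
    return hits / total


# Hypothetical participant: 4 targets and 2 foils chosen over six trials.
example = recognition_ratio(hits=4, foils=2)
```

With six forced-choice trials, hits and foils sum to six, so the ratio reduces to the proportion of trials on which the target word was selected.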
The Backward Digit Span Test (BDST). Outcome measures designed to assess auditory span and verbal working memory were operationalized using the BDST.15–18 On this test, participants were asked to repeat three trials of 5 numbers backwards. Two core outcome measures were obtained from this test: percent ANY order recall and percent SERIAL order recall. All three test trials contained contiguous numbers that were placed in strategic positions (e.g., 1
Percent ANY order recall is the sum total of every digit correctly recalled regardless of serial order position, divided by the total possible correct responses (i.e., 3 trials of 5 digits = 15 total responses). This measure was created to assess for less complex aspects of working memory characterized mainly by auditory span and rehearsal mechanisms. Percent SERIAL order recall tallied the total number of digits correctly recalled in accurate serial position also divided by the total possible correct responses. As described by Lamar and colleagues,17,18 this measure was created to assess verbal working memory and mental manipulation.
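The two BDST measures can be illustrated with a short Python sketch. The digit strings below are hypothetical, and two scoring details are assumptions made for the example: ANY order credit is computed as the multiset overlap between the reversed target and the response, and SERIAL order credit is given position by position against the reversed target. The published scoring rules may treat omissions or repeated digits differently.

```python
from collections import Counter


def score_trial(presented: str, response: str):
    """Return (any_order_correct, serial_order_correct) for one trial."""
    target = presented[::-1]  # the correct answer is the reversed sequence
    # ANY order: digits recalled regardless of position (multiset overlap,
    # so a repeated digit is not credited twice).
    any_correct = sum((Counter(target) & Counter(response)).values())
    # SERIAL order: digits recalled in the exact serial position.
    serial_correct = sum(1 for t, r in zip(target, response) if t == r)
    return any_correct, serial_correct


def bdst_percent(trials):
    """Percent ANY and percent SERIAL order recall across all trials."""
    possible = sum(len(presented) for presented, _ in trials)  # 3 x 5 = 15
    any_sum = serial_sum = 0
    for presented, response in trials:
        a, s = score_trial(presented, response)
        any_sum += a
        serial_sum += s
    return 100 * any_sum / possible, 100 * serial_sum / possible


# Hypothetical responses: perfect, one transposition, and a truncated recall.
trials = [("47382", "28374"), ("38142", "24138"), ("52946", "649")]
pct_any, pct_serial = bdst_percent(trials)  # 13/15 and 11/15 of digits
```

In this example the transposed trial keeps full ANY order credit but loses SERIAL order credit for the two swapped digits, which is the dissociation the two measures are meant to capture.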
The ‘Animal’ Fluency Test. Lexical/ semantic operations were assessed with the ‘animal’ fluency test 19 where participants were given 60 s to verbally generate animal exemplars. The number of correct responses was tallied, and all responses were recorded by the iPad.
DAC Memory and Executive Index Scores. Standardized index scores (z-scores) designed to express severity of episodic memory and dysexecutive impairment were compiled from the grand mean and standard deviation from the entire sample. The DAC Memory Index score averaged P(r)VLT total delayed free recall and the recognition ratio.14,26,27 The DAC Executive Index score averaged total output on the ‘animal’ fluency test and BDST percent SERIAL order recall. Neither of the DAC summary indices was used for SCI or MCI classification.
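As a minimal sketch of how such an index could be computed, the Python below standardizes each measure against the whole-sample mean and standard deviation and then averages the two z-scores per participant. The function names and toy data are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption; the source does not state which estimator was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values against the grand mean and SD of the sample."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD with n - 1 in the denominator
    return [(v - m) / s for v in values]


def index_scores(measure_a, measure_b):
    """Average two z-scored measures per participant (e.g., Memory Index)."""
    za, zb = z_scores(measure_a), z_scores(measure_b)
    return [(a + b) / 2 for a, b in zip(za, zb)]


# Toy data for three hypothetical participants.
delayed_recall = [6, 4, 2]
recognition_ratio = [1.0, 0.5, 0.0]
memory_index = index_scores(delayed_recall, recognition_ratio)
```

The Executive Index would follow the same pattern with ‘animal’ fluency output and BDST percent SERIAL order recall as the two inputs.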
DAC error and process outcome measures
P(r)VLT free recall semantic cluster responses, extra-list intrusion errors, and perseverations. All free recall semantic cluster responses, extra-list intrusion errors, and perseverations were scored consistent with prior research.14,20,24 All free recall cluster responses, extra-list intrusion errors, and perseverations were tallied to create three single scores, respectively.
Percent P(r)VLT savings score (range 0–100%). This behavior was assessed consistent with prior research 26 by dividing all words recalled on the delay free recall test condition by all words recalled on the second immediate free recall test condition.
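The savings computation can be sketched as follows. Two choices here are assumptions rather than documented DAC behavior: the score is treated as undefined when no words were recalled on trial 2, and values are capped at 100% to match the stated range.

```python
def percent_savings(delay_recall: int, trial2_recall: int, cap: bool = True):
    """Delay free recall as a percentage of trial 2 immediate free recall."""
    if trial2_recall == 0:
        # Assumption: savings cannot be computed without a trial 2 baseline.
        return None
    pct = 100.0 * delay_recall / trial2_recall
    # Assumption: cap at 100 to respect the stated 0-100% range.
    return min(pct, 100.0) if cap else pct


# Hypothetical participant: 6 words on trial 2, 3 words after the delay.
example = percent_savings(delay_recall=3, trial2_recall=6)
```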
P(r)VLT recognition foils and recognition latency. Separate tallies were compiled for recognition prototypic (range = 0–6) and generic (range = 0–6) recognition foils. The mean latency or reaction time (milliseconds) only for correct recognition test trials was tallied and averaged.
BDST transposition and perseveration errors. The total number of out-of-sequence or transposition errors was tallied as described by Hurlstone and colleagues 28 and Emrani and colleagues. 29 A variety of errors were coded including (1) within-trial perseveration errors where patients repeated a digit within a given trial (i.e., 16579 – ‘9756
The ‘animal’ Fluency Association Index (AI). 19 All animal exemplars were coded (yes = 1, no = 0) on six attributes: size (big, small); geographic location (foreign, local); habitat (farm, pet, water, prairie, forest, African-jungle, Australian, widespread); zoological class (insects, mammals, birds, fish, amphibians, reptiles); zoological orders, families, and related groupings (feline, cervidae, rodenta), and diet (herbivore, carnivore, omnivore). The ‘animal’ fluency AI is the cumulative number of shared attributes between all successive responses divided by the total number of words generated minus one. The sum of the shared attributes was divided by the number of responses minus one to guard against inflating the AI as attributes from the first response are never actually figured into the sum of the scaled attributes. This index was devised to measure the strength of the semantic association between consecutive responses.
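The Association Index arithmetic can be sketched in Python. For brevity each response is represented as a set of attribute labels rather than a full yes/no coding grid, and the attribute codes assigned to each animal below are hypothetical illustrations, not the published coding scheme.

```python
def association_index(responses):
    """Shared attributes between successive responses / (responses - 1)."""
    if len(responses) < 2:
        # Assumption: the index is undefined with fewer than two responses.
        return None
    shared = sum(len(a & b) for a, b in zip(responses, responses[1:]))
    return shared / (len(responses) - 1)


# Hypothetical attribute codes for three successive responses.
dog = {"small", "local", "pet", "mammal", "carnivore"}
cat = {"small", "local", "pet", "mammal", "feline", "carnivore"}
lion = {"big", "foreign", "African-jungle", "mammal", "feline", "carnivore"}

# dog -> cat share 5 attributes; cat -> lion share 3; (5 + 3) / (3 - 1)
ai = association_index([dog, cat, lion])
```

Dividing by the number of responses minus one matches the number of successive pairs, since the first response has no predecessor to be compared with.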
Paper and pencil neuropsychological evaluation
Six scores from paper/pencil neuropsychological tests were obtained and scored using available norms. Executive abilities were assessed with the letter (‘FAS’) fluency test 30 and the Trail Making Test–Part B. 30 Language was assessed with the Boston Naming Test 31 and the Wechsler Adult Intelligence Scale-III Similarities subtest. 32 Normative values as provided by Heaton and colleagues 30 were used for letter fluency, Trail Making Test–Part B, and the Boston Naming Test. Verbal episodic memory was assessed with the 9-word California Verbal Learning Test-short form 33 delayed free recall and recognition discriminability test conditions.
Determination of subtle cognitive impairment and mild cognitive impairment subtypes
The six paper/pencil outcome measures described above were used to classify participants into groups consistent with SCI as suggested by Edmonds and colleagues 20 ; or MCI as suggested by Bondi and colleagues.21,22
Subtle cognitive impairment. Participants were classified as presenting with SCI (n = 21) when one score from one neurocognitive domain (say, memory), and a second score from another neurocognitive domain (say, executive abilities) were below one standard deviation using available norms.
Single domain and multi-domain mild cognitive impairment. Single domain MCI syndromes were determined when participants scored one standard deviation below normative expectations on two measures from a single cognitive domain. Mixed MCI syndromes were determined when participants scored one standard deviation below normative expectations on two measures from two or more cognitive domains. On the basis of these procedures, 21 patients were diagnosed with single domain amnestic MCI (aMCI). Because of the small number of mixed MCI patients (n = 8), these patients were combined with the dysexecutive MCI patients (n = 14) to create a combined dysexecutive/mixed MCI group (dys/mxMCI; n = 22).
Normal cognitive abilities. Many participants did not meet criteria for either SCI or MCI and obtained scores above one standard deviation on all of the six paper/ pencil outcome measures described above (n = 66) or scored below one standard deviation only on one of the six paper/pencil outcome measures (n = 8). All of these participants were combined into a single cognitively unimpaired (CU) group.
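The actuarial rules for CU, SCI, and MCI described above can be summarized in a short Python sketch. Several details are assumptions made for illustration: the test names are hypothetical keys, all scores are taken to be normed z-scores oriented so that lower is worse, "below one standard deviation" is read as z < −1, and mixed MCI is read as two low scores in each of at least two domains. Mild dementia is not modeled because that diagnosis was made clinically.

```python
# Two paper/pencil measures per domain (hypothetical key names).
DOMAINS = {
    "executive": ("letter_fluency", "trails_b"),
    "language": ("boston_naming", "similarities"),
    "memory": ("cvlt_delayed_recall", "cvlt_recognition"),
}


def classify(z, cutoff=-1.0):
    """Classify one participant from six normed z-scores."""
    # Number of scores below the cutoff within each domain.
    low = {d: sum(z[t] < cutoff for t in tests) for d, tests in DOMAINS.items()}
    mci_domains = [d for d, n in low.items() if n >= 2]
    if len(mci_domains) >= 2:
        return "mixed MCI"
    if len(mci_domains) == 1:
        return f"single-domain MCI ({mci_domains[0]})"
    # One low score in each of two or more different domains.
    if sum(n >= 1 for n in low.values()) >= 2:
        return "SCI"
    # No low scores, or a single isolated low score.
    return "CU"


# Hypothetical participant with both memory scores below the cutoff.
profile = {
    "letter_fluency": 0.2, "trails_b": -0.4,
    "boston_naming": 0.1, "similarities": 0.5,
    "cvlt_delayed_recall": -1.6, "cvlt_recognition": -1.3,
}
label = classify(profile)
```

A single-domain result in the memory domain corresponds to aMCI; in the study, the dysexecutive and mixed results were pooled into the dys/mxMCI group.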
Mild dementia. The classification for dementia was made clinically using Diagnostic and Statistical Manual of Mental Disorders-5 (DSM-5) 34 criteria. Patients diagnosed with dementia were impaired on many of the six paper/pencil neuropsychological tests. Also, based on information provided by the family, instrumental activities of daily living (IADL 23 ) were significantly compromised, often requiring some degree of supervision from the family. Using the procedures described above, one of us (DJL) classified patients into their respective groups. In order to ensure accuracy, another of us (LP), blinded to the initial classification, also classified all participants into their respective groups. The MMSE was not used to determine SCI or MCI classification.
Statistical analysis
Using criteria described above20–22 as a grouping or independent variable, subsequent between-group and within-group analyses were carried out with MANOVA, ANOVA, or within-group t-tests, as indicated. Effect sizes were expressed using Eta Squared or Cohen's D-prime metrics. Age, education, and sex were covaried. Relations between DAC P(r)VLT recognition latency and other DAC measures, and between the DAC ‘animal’ Association Index and other DAC measures, were assessed with stepwise regression analyses. For these analyses, DAC recognition latency or the ‘animal’ Association Index was the dependent variable; age, education, and sex were entered into block 1; and selected DAC outcome measures were entered into block 2.
Results
Demographic and clinical information
No between-group differences were found for age. With respect to education (F = 11.49, df = 4, 174, p < 0.001), the CU participants presented with more years of education than the aMCI group (p < 0.008). CU and dys/mxMCI participants presented with more years of education than the DEM group (p < 0.050, all tests). On the MMSE (F = 81.19, df = 4, 172, p < 0.001), CU participants obtained higher MMSE scores compared to other groups (p < 0.044, all analyses); and the DEM group scored lower compared to all other groups (p < 0.001, all analyses; Table 1).
Table 1. Demographic and clinical information: means and standard deviations.
SCI: subtle cognitive impairment; MCI: mild cognitive impairment; MMSE: Mini-Mental State Examination; FAQ: functional assessment questionnaire; ADL: activities of daily living; IADL: instrumental activities for daily living.
Between-group differences were found on the Lawton and Brody ADL scale (F = 18.78, df = 4, 174, p < 0.001; η2 = 0.302) where the DEM group was rated with greater disability compared to other groups (p < 0.001, all tests). Between-group differences were also seen on the Lawton and Brody IADL scale (F = 45.95, df = 4, 174, p < 0.001; η2 = 0.514). The DEM group continued to be rated with greater disability compared to other groups (p < 0.001, all tests). Both the aMCI and dys/mxMCI groups were rated with marginally greater disability than the CU group (p < 0.050, both tests). For aMCI and dys/mxMCI patients, further querying of family members suggests that IADL concerns were due to orthopedic, visual acuity, or related problems.
DAC P(r)VLT test performance
P(r)VLT immediate free recall. Figure 1 displays performance on the four core P(r)VLT test conditions. A repeated measures ANOVA found a significant 5 group×4 test condition interaction (F = 4.51, df = 12, 475, p < 0.001; η2 = 0.102; Table 2, Figure 1). On the first immediate free recall test trial, CU participants recalled more words than SCI and dys/mxMCI participants (p < 0.001); and DEM participants recalled fewer words than all other groups (p < 0.005, all analyses). On the second immediate free recall test trial, the CU group outperformed all other groups (p < 0.007), and DEM participants continued to recall fewer words than all groups (p < 0.006).

Figure 1. Philadelphia (repeatable) verbal learning test: patterns of performance. P(r)VLT: Philadelphia (repeatable) Verbal Learning Test; immed: immediate free recall; 95% confidence interval.
Table 2. Digital assessment of cognition – core outcome measures: means and standard deviations (raw scores).
SCI: subtle cognitive impairment; MCI: mild cognitive impairment.
P(r)VLT free recall cluster, extra list intrusion errors, and perseverations. The multivariate effect for group was significant (F = 3.09, df = 12, 476, p < 0.001; η2 = 0.072; Table 3). Subsequent analyses found that CU participants produced more free recall cluster responses than all other groups (p < 0.050; all analyses). There were no differences for extra-list intrusion and perseveration errors.
Table 3. Digital assessment of cognition – error, process and related outcome measures: means and standard deviations (raw scores).
SCI: subtle cognitive impairment; MCI: mild cognitive impairment; P(r)VLT: Philadelphia (repeatable) Verbal Learning Test; BDST: Backward Digit Span Test.
P(r)VLT delay free recall and savings analyses. After a filled delay, CU participants continued to score better than all other groups (p < 0.001, all analyses); DEM participants recalled fewer words than all other groups (p < 0.001, all analyses); and borderline effects were found suggesting that aMCI participants recalled fewer words than SCI and dys/mxMCI participants (p < 0.088, both analyses).
The ANOVA for the savings measure was significant (F = 21.95, df = 4, 162, p < 0.001, η2 = 0.352). Subsequent analyses found better savings for the CU compared to all other groups (p < 0.004); SCI and dys/mxMCI participants exhibited greater savings than aMCI and DEM participants (p < 0.017, both analyses). There was no difference when aMCI and DEM groups were compared.
P(r)VLT delayed free recall/ recognition contrast analysis. Delayed free recall versus recognition contrast analyses were undertaken with paired t-tests. SCI and dys/mxMCI participants obtained a better score for recognition correct responses versus words recalled on the delayed free recall test condition (p < 0.023 and p < 0.050, respectively). By contrast, there was no difference on these test conditions for the aMCI group; and DEM participants scored worse on the recognition versus the delayed free recall test condition (p < 0.031).
P(r)VLT recognition correct responses and foils. For delayed recognition correct responses, DEM and aMCI groups did not differ and both of these groups scored lower than all other groups (p < 0.009, all analyses); and there were no differences between the CU, SCI, and dys/mxMCI participants. The multivariate effect for group for prototypic and generic recognition foils was significant (F = 18.53, df = 8, 320, p < 0.001, η2 = 0.317). Both aMCI and DEM groups endorsed more prototypic recognition foils than all other groups (p < 0.029, all analyses). aMCI and DEM groups also endorsed more generic recognition foils than CU and dys/mxMCI groups (p < 0.026, all analyses).
P(r)VLT recognition latency for correct responses. The effect for group was significant (F = 13.56, df = 4, 162, p < 0.001; η2 = 0.261). The recognition latency for correct responses was faster for CU participants compared to the aMCI and DEM groups (p < 0.005; both analyses). Recognition latency was also faster for SCI and dys/mxMCI participants compared to the DEM group (p < 0.005).
DAC backward digit span performance
BDST ANY and SERIAL order recall. The multivariate effect of group for BDST ANY and SERIAL order recall was significant (F = 14.68, df = 8, 320, p < 0.001; η2 = 0.269). For ANY order recall, DEM participants scored lower than all groups (p < 0.001, all analyses); CU scored better than dys/mxMCI participants (p < 0.037), and there was a trend for CU participants to score better than SCI participants (p < 0.073). For SERIAL order recall, DEM participants continued to score lower than all groups (p < 0.003, all analyses), and CU participants obtained a better score than the SCI group (p < 0.038). The CU and aMCI groups did not differ on this measure.
BDST perseveration errors. The effect for group was significant (F = 11.35, df = 4, 162, p < 0.001; η2 = 0.219). CU participants produced fewer perseverations compared to the SCI, dys/mxMCI, and DEM groups (p < 0.005, all analyses); and aMCI participants produced fewer perseverations than DEM participants (p < 0.001). CU and aMCI groups did not differ on this measure.
DAC semantic (‘animal’) fluency performance
The multivariate effect for group was significant for total output and the Association Index (F = 19.19, df = 8, 312, p < 0.001; η2 = 0.330). Subsequent analyses for total output found that CU participants generated more correct responses compared to all other groups (p < 0.037, all analyses); the aMCI group outperformed SCI and dys/mxMCI participants (p < 0.028, both analyses); and DEM participants performed worse compared to SCI and aMCI participants (p < 0.042, both analyses). For the ‘animal’ Association Index, the DEM group produced a lower score than CU and aMCI groups (p < 0.043, both analyses).
Regression analyses: P(r)VLT recognition latency and the ‘animal’ association index
Recognition latency and DAC outcome measures. A series of four stepwise regression analyses were undertaken that assessed relations between P(r)VLT recognition latency for correct responses and selected DAC outcome measures. In all analyses age, education, and sex were entered into block 1, and variables of interest were entered into block 2. Full statistics can be found in Supplemental Tables 2–5.
For P(r)VLT total free recall cluster and extra-list intrusion responses, only cluster responses entered the final model (beta = −0.199, p < 0.007) where slower latency (i.e., a larger raw score) was associated with the production of fewer numbers of total free recall cluster responses. For P(r)VLT recognition prototypic and generic responses, prototypic foils entered the final model first (beta = 0.468, p < 0.001), followed by generic foils (beta = 0.188, p < 0.003) where a slower recognition latency (i.e., a larger raw score) was associated with the production of greater numbers of foil responses.
A third regression analysis investigated how P(r)VLT recognition latency for correct responses was related to BDST SERIAL order recall and total ‘animal’ fluency output. Here, slower recognition latency for correct responses was associated only with fewer ‘animal’ fluency responses (beta = −0.361, p < 0.001). Finally, the analysis looking at recognition latency and BDST perseveration and transposition errors was not significant.
Semantic (‘animal’) fluency association index. Three regression analyses were undertaken to assess the relationship between the ‘animal’ fluency AI and selected DAC outcome measures. With respect to P(r)VLT recognition foils, a lower ‘animal’ AI was associated with greater numbers of P(r)VLT prototypic recognition foils (beta = −0.265, p < 0.001; Supplemental Table 6). The analyses relating the ‘animal’ Association Index to P(r)VLT free recall cluster responses and extra-list intrusion errors, and to BDST total transposition and perseveration errors, were not significant.
DAC index scores
Between-group analyses. As described above, memory and executive index scores were created by averaging z-scores for P(r)VLT delayed free recall and the recognition ratio; and BDST SERIAL order recall and total ‘animal’ fluency output, respectively. The multivariate effect for group was significant (F = 39.20, df = 8, 320, p < 0.001; η2 = 0.495). For the DAC Executive Index, subsequent analyses found that CU participants obtained a better score than SCI, dys/mxMCI, and DEM participants (p < 0.035, all analyses); the aMCI and CU groups did not differ; however, aMCI participants outperformed the SCI and dys/mxMCI groups (p < 0.035, both analyses). DEM participants scored lower than all other groups (p < 0.002, all analyses). For the DAC Memory Index, subsequent analyses found that CU participants performed better than SCI, aMCI, and dys/mxMCI groups (p < 0.001, all analyses); SCI and dys/mxMCI participants obtained a better score than the aMCI group (p < 0.001, both analyses). DEM and aMCI participants did not differ on this outcome measure; however, DEM participants scored lower than CU, SCI and dys/mxMCI groups (p < 0.001, all analyses; Table 1).
Within-group analyses. Paired t-tests examined DAC Executive and Memory indices within each group. The aMCI group scored lower on the Memory versus the Executive Index (p < 0.001). No other analyses were significant.
Discussion
Accurate characterization of MCI subtypes has important implications for illness treatment and disease management. For example, there is evidence to suggest that subtle impairment on tests assessing delayed free recall, semantic fluency for animals, and tests sensitive to dysexecutive impairment can predict MCI and dementia onset. 35 Prior research also suggests that patients diagnosed with a mixed MCI phenotype may convert to dementia faster than those characterized with single domain amnestic MCI.36,37 Huey and colleagues 38 found that mixed MCI patients were less likely to present with indications for AD-related pathology and were more likely to have stroke as seen on MRI scans. The need for accurate characterization of MCI phenotypes has become particularly acute given the availability of disease modifying medications to treat MCI and dementia associated with AD pathology.39,40 A digital neurocognitive assessment protocol that could detect impairment that is not yet apparent to families, caregivers, or healthcare teams may guide timely work-up for medication and other treatment strategies.41,42
In addition to dementia such as AD, millions of Americans suffer with known cardiovascular illness including heart disease, hypertension, and diabetes.43–45 Nonetheless, even when these common medical illnesses are thought to be successfully treated, measurable problems can be found on neuropsychological tests that assess executive abilities and verbal episodic memory.46–48 Moreover, chronic, common cardiovascular risks have been linked to the eventual emergence of AD pathology. 49 It is possible that routine testing using digital assessment technology deployed in both primary and specialty medical care venues50,51 could flag the deleterious effects of emergent neurodegenerative and cardiovascular illness and target patients toward appropriate therapies and interventions, resulting in improved healthcare outcomes.
In prior research, 4 a core group of digital outcome measures used in the current research were able to classify patients into four well-known clinical groups. Moreover, our prior report illustrated how a wide array of process and error measures11–13 were associated with these index scores, thus providing additional evidence for the construct and criterion validity of the DAC. However, the sample of patients investigated in our prior research was modest.
In the current research, patients underwent assessment with a protocol of paper and pencil neuropsychological tests and were classified as presenting with SCI, aMCI, and dys/mxMCI subtypes using criteria suggested by Bondi and colleagues20–23 and compared to CU and DEM patients. Evidence for the efficacy of these actuarial criteria has been provided by Edmonds and colleagues. 52 These researchers re-analyzed data from the Alzheimer's Disease Cooperative Study (ADCS) vitamin E and Donepezil trial for MCI. 53 After Jak, Bondi21,22 criteria for MCI were applied, a greater drug effect was found. In the current research, patients classified as aMCI using paper and pencil tests were disproportionately impaired on DAC verbal episodic memory outcome measures. Similarly, patients classified with SCI and dys/mxMCI were disproportionately impaired on DAC executive and semantic fluency measures.
The clinical features associated with aMCI have been previously described.54–56 On tests similar to the memory measures used in the current research,14,24 amnestic patients obtain scores on immediate free recall test trials that are statistically in the normal range followed by a precipitous decline after a delay that is not ameliorated upon assessment with a recognition test condition. Also, amnestic patients generally are not able to benefit from the embedded semantic structure in free recall test trials, often generate extra-list free recall intrusion errors, and endorse many recognition false positive foils.14,26,57
Most, but not all, of this behavior was displayed by the amnestic group as described above. As shown in Figure 1, P(r)VLT free recall test scores for the aMCI group were statistically reduced compared to other groups. Nonetheless, both immediate free recall test scores were statistically WNL. After a delay, the trial 2 immediate free recall versus delayed free recall savings score was markedly reduced for aMCI participants compared to CU, SCI, and dys/mxMCI groups. Equally important was the observation that the reduced performance displayed by aMCI patients on the delayed recognition test was driven, in part, by profligate responding to prototypic rather than generic semantic foils. Patients in the DEM group also produced much the same profile. Absent in the current research was the generation of significant numbers of free recall, extra-list intrusion errors. Nonetheless, aMCI patients presented with all other well-known features of amnesia.54–56
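To make the savings comparison concrete, the following is a minimal sketch of one common way a savings score can be computed, assuming it is the percentage of trial 2 immediate free recall that is retained at delayed free recall. The function name and the example values are hypothetical; the exact DAC formula is not stated in this section.

```python
def savings_score(trial2_recall, delayed_recall):
    """Percent of trial 2 immediate free recall retained after the delay.

    Assumed definition (not confirmed by the source):
        100 * delayed_recall / trial2_recall
    Returns None when nothing was recalled on trial 2, because the
    ratio is undefined in that case.
    """
    if trial2_recall == 0:
        return None
    return 100.0 * delayed_recall / trial2_recall


# Hypothetical example: 5 words recalled on trial 2, 4 after the delay.
example = savings_score(5, 4)
```

Under this assumed definition, a lower percentage reflects more rapid forgetting over the delay.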
Interestingly, the SCI and dys/mxMCI groups also produced P(r)VLT immediate free recall test scores that were statistically intact. The analysis of their savings scores suggests some decline after a delay. However, unlike their aMCI counterparts, both SCI and dys/mxMCI groups improved when assessed with the recognition test condition and endorsed far fewer recognition foils. This retrieval-based pattern of performance is well-known and has been documented in other dementia and MCI patient groups.14,58–60 These results suggest that the DAC is able to define behavior consistent with amnesia.14,57
An additional well-documented neurocognitive disability associated with dementia and MCI is dysexecutive difficulty. The DAC assessed executive abilities, in part, with a short version of the Backward Digit Span Test.15–18 As described above, the BDST ANY order index was created to assess for less demanding aspects of attention/concentration and auditory span, while the SERIAL order index was created to assess the capacity for mental manipulation and working memory ability. In prior research with dementia patients grouped according to the severity of MRI-defined WMAs, no differences were seen for ANY order recall. Still, between-group differences were observed for SERIAL order recall.17,18 Emrani and colleagues 15 found similar patterns of performance among their MCI patients. Also, Emrani and colleagues 15 observed the production of numerous perseveration errors in their mixed/dysexecutive group.
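The distinction between the two BDST indices can be illustrated with a short sketch. It assumes that ANY order recall credits every target digit reported regardless of position, and that SERIAL order recall credits only digits reported in the correct backward position. The function name and the example trial are hypothetical, and the DAC's own scoring rules may differ in detail.

```python
def score_backward_span(presented, response):
    """Return (any_order, serial_order) counts for one backward span trial.

    presented: digits in the order they were presented.
    response:  digits in the order the examinee reported them.
    """
    target = list(reversed(presented))  # the correct backward sequence

    # ANY order: target digits reported anywhere in the response.
    # Each target digit is credited at most once, so a repeated digit
    # earns no extra credit.
    remaining = list(target)
    any_order = 0
    for digit in response:
        if digit in remaining:
            remaining.remove(digit)
            any_order += 1

    # SERIAL order: digits reported in the correct backward position.
    serial_order = sum(1 for t, r in zip(target, response) if t == r)
    return any_order, serial_order


# Hypothetical trial: all five digits are reported, but two are transposed.
trial = score_backward_span([3, 8, 1, 6, 9], [9, 6, 8, 1, 3])
```

In this hypothetical trial the ANY order count is 5 while the SERIAL order count is 3, which shows how intact span can coexist with reduced accuracy for serial position.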
In the current research, both SCI and dys/mxMCI participants scored lower compared to the CU group for BDST ANY order recall to suggest problems with auditory span. For BDST SERIAL order recall, SCI participants scored lower compared to the CU group. Perseveration is another feature suggesting the presence of dysexecutive difficulty. 61 Consistent with this, BDST performance produced by SCI and dys/mxMCI participants was typified by greater perseveration compared to the CU group. With respect to output on the ‘animal’ fluency test, both CU and aMCI groups outperformed the SCI and dys/mxMCI groups.
On the DAC executive index, CU and aMCI participants obtained better scores than the SCI and dys/mxMCI groups. The positive findings suggesting greater dysexecutive difficulty when CU participants were compared to SCI and dys/mxMCI groups, as well as the negative findings demonstrating no statistical differences when CU participants were compared to the aMCI group, suggest that the DAC is able to uncover and operationally define circumscribed neurocognitive syndromes revolving around dysexecutive difficulty.
In prior research, Carew and colleagues 19 found that ‘animal’ fluency output among AD and vascular dementia (VaD) patients was quite reduced. Nonetheless, the VaD group obtained a better AI score than AD patients, suggesting relatively intact lexical/semantic operations. In the current research, reduced ‘animal’ AI scores were associated with the production of greater numbers of P(r)VLT prototypic recognition foils. A circumscribed language-related/executive neurocognitive network could underlie these observations. However, additional research is needed to test this supposition.
In our prior report, Libon and colleagues 4 found that P(r)VLT latency for correct recognition test items was faster for CU and dysexecutive MCI groups compared to aMCI and dementia participants. This observation was largely sustained in the current research. Also, as detailed above, slower recognition latency for correct responses was associated with other P(r)VLT outcome measures including the production of fewer numbers of free recall cluster responses but greater numbers of recognition foils. Moreover, slower recognition latency was also associated with reduced ‘animal’ fluency output, but not BDST SERIAL order recall or BDST perseveration errors.
As with the analyses involving the ‘animal’ AI described above, the relations between P(r)VLT recognition latency and elements from the ‘animal’ and BDST subtests could suggest that these outcome measures are tapping into a circumscribed neurocognitive network involving executive, language, and memory abilities. Emrani and colleagues 16 also found different patterns of performance for latencies for correctly repeating five numbers backwards among memory clinic patients judged to be cognitively normal versus patients meeting criteria for MCI. When brought to scale, such latency outcome parameters could constitute neurocognitive biomarkers for emergent neuropsychological problems related to dementia or cardiovascular illness.
The current research is not without limitations. First, while the current report was able to recruit a much larger sample than our prior report, the sample size for some of our groups was modest. Additional data are required to verify the statistical relationships described above. Second, research using additional types of neurologic and neurodegenerative illness would expand upon the statistical findings described above. Third, there is a need to assess DAC digital neuropsychological outcome parameters in relation to data obtained using serum biomarkers, MRI parameters, and common cardiovascular illness. Fourth, although we feel that the analyses described above are compelling and are consistent with past research documenting differing patterns of impairment seen in MCI and dementia, future research would benefit from additional analyses such as a train/test split or cross-validation procedure, followed by evaluation of a classifier. Metrics such as ROC AUC, sensitivity, specificity, precision, recall, and confusion matrices would also be helpful.
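As an illustration of the evaluation metrics named in the fourth limitation, the following sketch computes confusion-matrix counts, sensitivity, specificity, precision, and ROC AUC for a binary classifier, and builds simple k-fold train/test index splits. All function names are hypothetical, and the labels and scores are invented toy values, not data from this study. No such analysis was performed in the current research.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 1 = impaired, 0 = unimpaired."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summary_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity, and precision from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive case outscores
    a random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k):
    """Return k (train_indices, test_indices) pairs covering n cases."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [(sorted(set(range(n)) - set(fold)), fold) for fold in folds]


# Toy values for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = summary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
splits = k_fold_indices(len(y_true), 4)
```

In practice, the classifier would be fitted on each training fold and the metrics computed on the held-out fold, so that the estimates reflect performance on patients not used to build the model.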
Nonetheless, the current research has several strengths. First, new time-based outcome measures are described that hold promise in identifying early, emergent dementia-related illness. Second, the results detailed above generally replicate the findings described in the initial report of Libon and colleagues 4 and suggest that the DAC is able to uncover and define circumscribed amnestic and dysexecutive syndromes. Third, all of the digital subtests described above were derived from classic clinical assessment procedures and research paradigms that have been thoroughly researched for decades. Fourth, digital administration and scoring maximize reliability for test administration and eliminate subjectivity in scoring and error tabulation. In sum, the results of the current research suggest that when brought to scale, and deployed in longitudinal research or clinical environments, the DAC could uncover subtle, highly nuanced behavior that might predict the emergence of dementia and MCI syndromes.
Supplemental Material
Supplemental material, sj-docx-1-alz-10.1177_13872877261418280 for Digital neuropsychological assessment–part 1: Defining mild cognitive impairment subtypes by David J. Libon, Deborah Drabick, Rod Swenson, Sean Tobyne, Sheina Emrani, Ondrej Bezdicek, Laura Salciunas, Aeysha J. Brown, Terri Ginsberg, Mitchel Kling, Kevin Overbeck, Leonard Powell, Christian White, Adaora Okoli-Umeweni, Christopher Janson and Stephen Scheinthal in Journal of Alzheimer's Disease
Footnotes
Acknowledgements
The authors have no acknowledgments to report.
Ethical considerations
The study was conducted in accordance with the Declaration of Helsinki and was approved by the IRB Committee of Rowan University (no. pro 2016001115) on June 11, 2025, with the need for written informed consent waived.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Author contribution(s)
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Drs. Libon and Swenson consult to Linus Health, Inc; Dr Libon receives royalties from Linus Health, Inc.; Drs. Libon and Swenson receive royalties from Oxford University Press. Dr Tobyne is an employee of Linus Health. Drs. Libon and Swenson are Editorial Board members of this journal but were not involved in the peer-review process of this article, nor did they have access to any information regarding its peer review.
Data availability statement
The data that support the findings of this study are available from the corresponding author with Rowan University IRB approval. Some restrictions may apply.
Supplemental material
Supplemental material for this article is available online.
