Abstract
Objective:
To first validate the diagnostic accuracy of the “Triana Test,” a new story recall test based on emotional material.
Method:
A phase I study of validation. We included 55 patients with amnestic Mild Cognitive Impairment and 69 healthy controls, diagnosed according to the “Memory Associative Test of the district of Seine-Saint-Denis” (TMA-93), and matched by age, gender, and educational level. The Triana Test’s diagnostic accuracy was calculated by ROC curve analysis and Spearman correlations estimated its convergent validity with a hippocampal memory test, the Free and Cued Selective Reminding Test with Immediate Recall (FCSRT+IR).
Results:
The “Triana Test” immediate and delayed recalls showed adequate diagnostic accuracy (AUC ≥ 0.74). The delayed free recall showed the highest diagnostic accuracy (AUC = 0.86). Correlations with the FCSRT+IR were moderate to strong.
Conclusions:
The “Triana Test” demonstrated accuracy for discriminating amnestic Mild Cognitive Impairment patients from healthy controls and convergent validity with the FCSRT+IR.
Keywords
Introduction
Diagnosing Alzheimer’s disease (AD) at a prodromal phase is a primary aim for clinicians who evaluate patients with memory complaints. 1,2 We need sensitive memory tests and specific AD biomarkers. 1,2 Particularly, memory tests focused on some characteristics that suggest hippocampal dysfunction are highly recommended. 1 Among these tests, we find the Free and Cued Selective Reminding Test (FCSRT), 3 which examines facilitation with cueing, and tests focused on binding, 4 the ability to learn by association, such as the “Memory Associative Test of the district of Seine-Saint-Denis” (TMA-93). 5
A less developed research line for diagnosing AD is challenging the amygdala by tests focused on emotional memory. 6 -9 Experiments on animals’ behavior under emotional situations and studies on patients with selective bilateral amygdala lesions have demonstrated that this structure is critical in memorizing emotional materials. 6,7 In AD, amygdala and hippocampal neurodegeneration proceed simultaneously, 10,11 so testing the amygdala’s function by memory tests with emotional contents may be an interesting diagnostic approach.
Emotional stimuli relative to neutral stimuli enhance memory and this effect has been called “the emotional memory effect.” 12 Young people better recall negative stimuli whereas older people better recall positive stimuli. This “positivity effect” with aging has been associated with improved regulation of emotion. 13,14
Emotional memory has been mainly studied in patients with affective disorders by tests based on lists of words with positive, negative, or neutral valence, such as the “Emotional Verbal Learning Test,” 15 the “Affective Auditory Verbal Learning,” 16 or the “Verbal Affective Memory Test.” 17 In patients with cognitive impairment, studies on “the emotional memory effect” have yielded contradictory results: some studies have found that AD patients better recalled the emotional passage of the stories 8,18 whereas other studies have demonstrated that the “emotional memory effect” is impaired in the AD group by comparison with healthy controls. 19
Story recall tests closely resemble everyday memory demands from conversations or the media. In this type of memory test, the participants are asked to memorize a short story and then to recall it immediately, after a 30-minute interval, and by a recognition test. 20 The story must be easy to understand and memorize, usually involving a robbery or an accident. 20 The “Wechsler Memory Scale Logical Memory test,” 21 and the “Story Recall Test” 22,23 are among the most used story recall tests.
In this paper, a new story recall test, the Triana Test (TT), is presented. The test consists of a narrative text based on the captivating love and heartbreak story between a flamenco dancer and a Japanese student. This arousing story has been intentionally designed to be different from the relatively neutral stories that make up classical story recall tests. 21 -23 Attention is focused on arousing rather than neutral stimuli, according to the theory of selectivity of attention, 24,25 and we hypothesize the TT arousing story could better catch examinees’ attention and facilitate learning. On the other hand, as emotional arousal enhances the declarative memory in healthy older people, 12,26,27 but this effect is less consistent in cognitively impaired patients, 19 we hypothesize that the TT arousing story could facilitate the discrimination between amnestic Mild Cognitive Impairment and healthy controls.
The aim was to first validate the TT for distinguishing aMCI patients from HCs and analyze its convergent validity with the FCSRT.
Methods
Design
A cross-sectional, case–control study with convenience sampling was planned to first evaluate the TT’s discriminative validity for distinguishing patients with aMCI from HCs.
Study Population
The sample consisted of 124 participants from an urban area of Spain. All participants were aged 50 years or older and spoke Spanish as their native language. We considered age, gender, and educational attainment (“less than first grade,” “first grade completed,” and “higher than first grade”) as sociodemographic variables. Participants comprised 2 groups: 55 patients with aMCI and 69 HCs matched by age, gender, and educational level. All patients were selected by convenience sampling of consecutive cases who had been diagnosed with aMCI at the Memory Units of 2 hospitals: Virgen del Rocio University Hospital (Seville, Spain) and Juan Ramon Jimenez University Hospital (Huelva, Spain). The diagnostic procedures had consisted of general, neurological, neuropsychological, laboratory, and neuroimaging examinations.
The neuropsychological evaluation had included 2 hippocampal memory tests: the TMA-93 and the picture version of the FCSRT. 3,5 The TMA-93, 5 which examines visual relational binding, provided the normative data used to assign participants to a diagnostic group. 28 The picture version of the FCSRT with immediate recall (FCSRT+IR) was later used to study the convergent validity. 3 Permission had been obtained from GRECO and Albert Einstein College for using, respectively, the TMA-93 and the FCSRT+IR. For the latter, a picture version was chosen to be more feasible for the elderly low-educated population we attend in Andalusia. Both tests had been administered following their authors’ instructions. 3,5 We registered the TMA-93 total score and the total free recall, total recall, and cued index from the FCSRT+IR. 3,5 The cued index, from the FCSRT+IR, is the ratio (“total free recall” − “total recall”) / (“total free recall” − 48) and measures the cueing efficiency: the examinee’s ability to retrieve with the cue when items are not freely recalled. The higher the cued index, the better preserved the cueing efficiency. 29,30
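The cued index ratio can be written out as a short function. The following is a minimal sketch, not the test authors' scoring software: the function name, the input validation, and the choice to return NaN when every item is freely recalled (the ratio is then 0/0) are our assumptions. The ratio is written in the equivalent positive form (total recall − total free recall) / (48 − total free recall).

```python
import math

FCSRT_MAX_SCORE = 48  # constant used in the cued index ratio

def cued_index(total_free_recall: int, total_recall: int) -> float:
    """Share of items missed in free recall that were retrieved with the cue."""
    if not 0 <= total_free_recall <= total_recall <= FCSRT_MAX_SCORE:
        raise ValueError("expected 0 <= total free recall <= total recall <= 48")
    missed_in_free_recall = FCSRT_MAX_SCORE - total_free_recall
    if missed_in_free_recall == 0:
        # Assumption: the ratio is undefined (0/0) when nothing is left to cue.
        return math.nan
    return (total_recall - total_free_recall) / missed_in_free_recall

# Hypothetical examinee: 20 items freely recalled, 41 recalled in total.
print(cued_index(20, 41))  # 21 of the 28 missed items were retrieved with the cue
```

A value near 1 means that cueing recovered almost everything the examinee failed to recall freely; lower values mean cueing helped less.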
The diagnosis of aMCI had been made according to the International Working Group on Mild Cognitive Impairment recommendations, 31 which were operationalized as follows: (a) memory complaints corroborated by a reliable informant, (b) a clinician had determined that the person was neither normal nor demented, based on a low total score on the MMSE without significant functional decline in activities of daily living [score up to 39 on the “Interview for Deterioration in Daily Living Activities in Dementia” (IDDD)], 32 (c) objective memory impairment measured by a score equal to or below the 10th percentile on the “Memory Associative Test of the district of Seine-Saint-Denis” (TMA-93), 28 (d) non-memory domains (language, executive function, or visuospatial skills) intact. HCs were recruited among the caregivers and relatives of patients attending both units. They met the following inclusion criteria: (a) absence of memory complaints, (b) absence of objective memory impairment (score equal to or above the 25th percentile on the TMA-93), 28 (c) intact level in activities of daily living (score between 33 and 36 on IDDD). 32 The exclusion criteria for both groups were: (a) absence of a reliable informant, (b) current history of other neurological diseases that potentially cause cognitive impairment, (c) poor vision or hearing despite correction, (d) clinically significant, advanced, or unstable systemic disease that might interfere with cognitive evaluation, (e) current Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-V), diagnosis of active major depression, schizophrenia, or bipolar disorder, and (f) current history of alcohol or other substance abuse.
Procedures
The TT was administered on a different day and by a different rater (ALT) blinded to participants’ diagnosis and cognitive testing.
In the TT, the participant is asked to carefully listen to the story (in supplementary material, an English version of the TT is available). The examiner reads the story and, immediately, asks the examinee to recall it verbatim. This “immediate free recall” is scored according to the recall of 12 items selected from the story (maximum score: 12). Then, 12 questions about the same items with yes/no answers are asked. This “immediate recognition recall” is also scored over a maximum of 12 points. After a delay interval of 20 minutes during which other non-memory tests may be undertaken, both recalls, now called the “delayed free recall” and the “delayed recognition recall,” are again tested and similarly scored. In this validation study, time was taken on both phases of the test administration and registered as “time for immediate recall” and “time for delayed recall.”
Ethics
This study was approved by the Virgen del Rocio University Hospital (Seville, Spain) ethics committee and conducted according to the World Medical Association Declaration of Helsinki. All participants accepted the study procedures by signing informed consent.
Statistical Study
Descriptive results are presented as frequencies for categorical variables; mean, standard deviation, and range for normally-distributed continuous variables; and median, interquartile range, and range for non-normally-distributed continuous variables.
Between-group comparisons of continuous variables were performed with Student’s t-test or the alternative non-parametric Mann-Whitney U. Effect size was calculated and considered as small (0-0.20), medium (0.20-0.50), or large (>0.50). Between-group comparisons of categorical variables were performed with the Chi square test.
The diagnostic utility of the TT was calculated by conducting receiver operating characteristic (ROC) curve analysis and estimated by the “area under the curve” (AUC), considering it as follows: “excellent” (> 0.90), “good” (> 0.80), “adequate” (> 0.70), or poor (< 0.70). 33 The Youden index was used to determine the optimum cutoffs to provide the best balance between sensitivity and specificity. 34
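To make the AUC and the Youden index concrete, here is a minimal sketch in plain Python. It is not the SPSS procedure used in the study. The function names and the toy scores are invented for illustration, and we assume that lower TT scores indicate impairment, so a score at or below the cutoff counts as a positive test. How to label an AUC of exactly 0.70 is also our assumption, since the stated bands leave it open.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the probability that a control outscores a case (ties count 0.5)."""
    concordant = 0.0
    for control in control_scores:
        for case in case_scores:
            if control > case:
                concordant += 1.0
            elif control == case:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))

def youden_cutoff(case_scores, control_scores):
    """Return (cutoff, sensitivity, specificity, J) maximizing J = sens + spec - 1."""
    best = None
    for cutoff in sorted(set(case_scores) | set(control_scores)):
        sensitivity = sum(s <= cutoff for s in case_scores) / len(case_scores)
        specificity = sum(s > cutoff for s in control_scores) / len(control_scores)
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best

def label_auc(auc):
    """Accuracy bands from the text; exactly 0.70 is treated as poor (assumption)."""
    if auc > 0.90:
        return "excellent"
    if auc > 0.80:
        return "good"
    if auc > 0.70:
        return "adequate"
    return "poor"

# Toy scores (0-12 scale), invented for illustration; not study data.
cases = [2, 3, 3, 4, 5, 5]
controls = [4, 6, 7, 7, 8, 9]
auc = roc_auc(cases, controls)
print(round(auc, 3), label_auc(auc), youden_cutoff(cases, controls))
```

The pairwise definition of the AUC used here is equivalent to the area under the empirical ROC curve, which makes it easy to check by hand on small samples.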
Comparisons between ROC curves for the TT variables were made by the DeLong method. 35 By this method, the Standard Error of the AUCs and the differences between the AUCs were calculated.
The effect of diagnosis, age, gender, and education on TT total scores was analyzed by linear regression. For educational level, dummy variables were created, with “first grade completed” as the reference category. A stepwise forward method was followed. For the first step, the diagnosis, age, gender, and the 2 dummy variables were included as independent variables while each of the TT total scores was included as a dependent variable. Non-significant independent variables were left out of the final model, which only included those demonstrating significance.
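The dummy coding of educational level can be illustrated as follows. This is a sketch only: the variable names are hypothetical and the regression itself is not reproduced. With “first grade completed” as the reference category, the 3 levels are represented by 2 indicator variables, and each coefficient is read as a difference from the reference group.

```python
EDUCATION_LEVELS = (
    "less than first grade",
    "first grade completed",
    "higher than first grade",
)
REFERENCE_LEVEL = "first grade completed"

def education_dummies(level):
    """Map one educational level to 2 indicator variables (reference coded 0, 0)."""
    if level not in EDUCATION_LEVELS:
        raise ValueError(f"unknown educational level: {level!r}")
    return {
        f"edu[{candidate}]": int(level == candidate)
        for candidate in EDUCATION_LEVELS
        if candidate != REFERENCE_LEVEL
    }

for level in EDUCATION_LEVELS:
    print(level, education_dummies(level))
```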
Spearman correlations estimated the convergent validity between the TT and the FCSRT+IR. Correlations were considered as follows: very weak (0.00-0.19), weak (0.20-0.39), moderate (0.40-0.59), strong (0.60-0.79), and very strong (≥ 0.80). 36
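As an illustration of this step, the sketch below computes a Spearman coefficient as the Pearson correlation of average ranks and maps its absolute value onto the strength bands listed above. It is a plain-Python stand-in for the SPSS routine; the function names are ours, and the function does not handle constant input (zero variance in the ranks).

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def correlation_strength(r):
    """Strength bands from the text, applied to the absolute coefficient."""
    magnitude = abs(r)
    if magnitude >= 0.80:
        return "very strong"
    if magnitude >= 0.60:
        return "strong"
    if magnitude >= 0.40:
        return "moderate"
    if magnitude >= 0.20:
        return "weak"
    return "very weak"

print(correlation_strength(0.478), correlation_strength(0.674))
```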
The analysis was performed using SPSS version 25, except for comparisons between ROC curves, for which we used MedCalc. Statistical significance was set at p < 0.05 and 95% confidence intervals were calculated.
Results
Sociodemographic characteristics and neuropsychological background for the aMCI and HC groups are shown in Table 1. There were no significant differences in age, gender, or educational attainment between groups. More than 30% of the sample was composed of individuals with “less than the first grade” of educational attainment.
Table 1. Sociodemographic Characteristics and Neuropsychological Test Results by Diagnostic Group.
Abbreviations: HCs, healthy controls; aMCI, amnestic mild cognitive impairment; <first grade, participants who did not complete primary studies; First grade, participants who only completed primary studies; >first grade, participants who had higher than primary studies; FCSRT+IR TFR, Free and Cued Selective Reminding Test with Immediate Recall, Total Free Recall; FCSRT+IR TR, Free and Cued Selective Reminding Test with Immediate Recall, Total Recall; FCSRT+IR CI, Free and Cued Selective Reminding Test with Immediate Recall, Cued Index; TT IFR, Triana Test, Immediate Free Recall; TT IRR, Triana Test, Immediate Recognition Recall; TT DFR, Triana Test, Delayed Free Recall; TT DRR, Triana Test, Delayed Recognition Recall; TIR, Triana Test Immediate Recall (duration in minutes); TDR, Triana Test Delayed Recall (duration in minutes).
Note: Total scores for MMSE, TMA-93, TT, and FCSRT are shown as median, interquartile range = P25-P75, and range.
The aMCI group scored significantly lower than the HCs group on MMSE, TMA-93, and FCSRT+IR (Mann-Whitney U test, p < 0.001), the tests used during the diagnosis before the validation (Table 1).
All participants completed the TT. TT administration times were longer for aMCI patients than for HCs, but the difference was only significant for the immediate phase of the test (Mann-Whitney U test, p < 0.005) (Table 1).
The aMCI group had significantly lower total scores than the HCs group on “immediate free recall,” “immediate recognition recall,” “delayed free recall,” and “delayed recognition recall” (Mann-Whitney U test, p < 0.001) (Table 1). The effect size was medium to large (Table 1). TT “recognition recalls” (“immediate” and “delayed”) achieved “adequate” (>0.70) diagnostic accuracy. TT “free recalls” (“immediate” and “delayed”) demonstrated “good” (>0.80) diagnostic accuracy (Table 2). Pairwise comparisons of ROC curves by the DeLong Method showed that the AUC for the “delayed free recall” was significantly higher than those for the “immediate free recall,” the “immediate recognition recall,” or the “delayed recognition recall” (DeLong method, p < 0.05) (Table 3, Figure 1).

Figure 1. Comparison of ROC curves among TT variables. AUC for TT DFR was significantly higher than AUC for TT IFR, TT IRR, and TT DRR.
Table 2. ROC Curve Analysis and Best Cutoffs for TT Scores.
Abbreviations: TT IFR, Triana Test, Immediate Free Recall; TT IRR, Triana Test, Immediate Recognition Recall; TT DFR, Triana Test, Delayed Free Recall; TT DRR, Triana Test, Delayed Recognition Recall.
Table 3. Pairwise Comparisons of ROC Curves by DeLong Method.a
Abbreviations: TT IFR, Triana Test, Immediate Free Recall; TT IRR, Triana Test, Immediate Recognition Recall; TT DFR, Triana Test, Delayed Free Recall; TT DRR, Triana Test, Delayed Recognition Recall.
a Significant “P” in bold.
In the stepwise forward linear regression, age, gender, and the dummy variable “less than the first grade of education” were non-significant and were left out of the final models, whereas the diagnosis group and the dummy variable “more than the first grade of education” were significant and remained in the final models (Table 4). For the dummy variable “more than the first grade of education,” the effect consisted of higher scores on all TT recalls for the more educated group compared to participants with only the first grade of education (the reference category) (Table 4).
Table 4. Linear Regression Analysis for TT Total Scores.
Abbreviations: TT IFR, Triana Test, Immediate Free Recall; TT IRR, Triana Test, Immediate Recognition Recall; TT DFR, Triana Test, Delayed Free Recall; TT DRR, Triana Test, Delayed Recognition Recall. >first grade, participants who had higher than primary studies; Std Error, Standard Error; t, Student’s t-test.
Correlations between the TT total scores and the FCSRT+IR scores were moderate to strong (“r” range = 0.478-0.674) (Table 5).
Table 5. Convergent Validity Analysis Between TT and FCSRT+IR.
Abbreviations: TT IFR, Triana Test, Immediate Free Recall; TT IRR, Triana Test, Immediate Recognition Recall; TT DFR, Triana Test, Delayed Free Recall; TT DRR, Triana Test, Delayed Recognition Recall; FCSRT+IR TFR, Free and Cued Selective Reminding Test with Immediate Recall, Total Free Recall; FCSRT+IR TR, Free and Cued Selective Reminding Test with Immediate Recall, Total Recall; FCSRT+IR CI, Free and Cued Selective Reminding Test with Immediate Recall, Cued Index; r = Spearman correlation coefficient.
Discussion
This study is a preliminary validation of the TT, a new story recall test based on an emotional story. The test proved accurate for distinguishing aMCI patients from HCs and feasible in settings with limited time per patient, including for low-educated individuals.
The TT arousing story was able to discriminate aMCI patients from healthy controls. The TT recalls’ diagnostic accuracy ranged from “adequate” to “good” (AUC, 0.74 to 0.86). Sensitivity and specificity ranged from 78% to 85% and from 55% to 74%, respectively. For discriminating patients with MCI from healthy controls, a meta-analysis on the diagnostic accuracy of memory tests, covering the entire spectrum of word lists, stories, cued and selective reminding paradigms, and associative learning tasks, considered values of sensitivity and specificity equal to or above 70% as “adequate.” 37 In this sense, the TT “delayed recalls” with pairs of sensitivity and specificity at 84%/74% for the “delayed free recall” and 78%/68% for the “delayed recognition recall” may be considered “adequate.”
By comparison, these TT results are better than those described for other story recall tests for discriminating patients with MCI from healthy controls. In another meta-analysis focused on memory scores for entry into Alzheimer’s disease trials, the “delayed recall” of the “Wechsler Memory Scale Logical Memory test” showed low accuracy (AUC <0.75) for distinguishing between normal cognition and MCI. 38 The accuracy of the “Story Recall Test” has also been reported. 23 For this test, the “immediate recall,” the “delayed recall,” and the “recognition recall” demonstrated AUCs of 0.62, 0.73, and 0.68, respectively. 23 Comparing the TT and the “Story Recall Test” results, we can conclude that both tests follow a similar pattern of discriminative validity, with free and delayed recalls being more discriminative than recognition and immediate ones.
To explain the basis for the adequate discriminative validity of the TT, we have to consider previous studies on emotional memory that demonstrated that an arousing story enhances declarative memory. 8,18,19 An example is the study by Cahill et al. 8 They used the “paradigm of illustrated history,” including 2 practically identical short stories. 18 These stories differed only in 1 passage, charged with emotion in 1 of them but neutral in the other. 8 AD patients recalled fewer items than HCs for both stories; however, both AD patients and HCs better recalled the story with the emotional passage. 18 Conversely, in another study using the “International Affective Picture System,” composed of photographs showing positive, negative, or neutral contents, “the emotional memory effect” was shown to be impaired in the AD group. 19 This “emotional memory effect” seems to be more consistent for healthy controls than for cognitively impaired patients. 18,19 This differential effect by group could ease the TT discrimination between aMCI patients and healthy controls.
In this validation study, the TT scores correlated with those of the FCSRT+IR, the gold-standard among the hippocampal memory tests. 3 This result on convergent validity strengthens the TT diagnostic potential. On the other hand, compared to hippocampal memory tests that use neutral material, the TT, using emotional contents, could additionally challenge the amygdala function which would be early impaired in aMCI patients, a state of risk for AD, 10,11 and this approach based on examining the amygdala may be interesting for diagnosing AD.
The TT demonstrated good feasibility. All participants completed the test, including the more than 30% who had not completed the first grade of education. For the aMCI patients’ group, TT administration time averaged less than 3 minutes in the immediate phase (including the examiner’s initial instructions and her reading the story aloud), and less than 2 minutes in the delayed phase. Neuropsychologists can spend the 20-minute interval between the 2 test phases on other non-memory evaluations. By comparison, the “Wechsler Memory Scale Logical Memory test” consists of 2 stories with 14 and 25 items, respectively, to be recalled in both immediate and delayed phases. In addition, 8 and 15 items, respectively, are tested during the recognition task. 21 Its administration time is necessarily longer than that for the TT. The TT’s shorter administration time makes the test more feasible for contexts with limited face-to-face time per patient. The TT’s administration time could be closer to that for the “Story Recall Test,” which only includes 1 short story with 24 words to be recalled immediately, after a 20-minute interval, and then by a recognition test based on 10 different questions about the story. 23 However, both tests’ administration times would have to be compared on the same sample to conclude which one takes less time.
This study presents some limitations. As a phase I validation study for a diagnostic test, we analyzed whether the TT scores discriminate patients with aMCI from healthy controls. In this preliminary phase of validation, there are a few requirements. First, the 2 groups being compared must be distinctly different in diagnosis. 39 –41 For this reason, our design widely separated the aMCI group (score equal to or below the TMA-93 10th percentile) from the control group (score equal to or above the TMA-93 25th percentile). There must not be borderline cases at this phase of validation. 39,40,41 This rule imposed a limitation: middle percentiles (for example, the 16th percentile of the TMA-93 total score) could not be included in either group of this convenience sample. Second, both groups have to be similar in sociodemographic variables that may explain differences in scores. In other words, the groups must differ only by diagnosis. The results encourage us to advance to phases II and III of validation for the TT. Phase II will also consist of a cross-sectional study, but the sample has to include all cases of the disease, including the borderline ones, with frequencies similar to the real-life conditions in which the test will be administered. 39,40,41 This phase makes it possible to know more about the discriminative validity of the test but not about its predictive ability, for which a phase III, a prospective cohort study, is needed. 39,40,41 Another limitation of this validation study is that AD biomarkers were not used to define the patients’ group as prodromal AD, 31 so patients were at risk for AD, but AD could not be confirmed. Future TT validation studies should target prodromal AD.
We can conclude that the TT is discriminative for diagnosing aMCI, convergent with the gold-standard FCSRT+IR, and feasible for contexts with limited time per patient. These results encourage us to plan a future study comparing the TT’s discriminative validity with that of the gold standard in story recall tests, the “Wechsler Memory Scale Logical Memory test,” testing the hypothesis that an emotional story discriminates better than a non-emotional one between aMCI and HCs.
Supplemental Material
Supplemental Material, sj-pdf-1-aja-10.1177_15333175211025911 for Preliminary Validation of the Triana Test: A New Story Recall Test Based on Emotional Material by Andrea Luque-Tirado, Silvia Rodrigo-Herrero, María Bernal Sánchez-Arjona and Emilio Franco-Macías in American Journal of Alzheimer's Disease & Other Dementias
Footnotes
Abbreviations
AD, Alzheimer’s disease; aMCI, Amnestic Mild Cognitive Impairment; AUC, Area Under the Curve; DSM-V, Diagnostic and Statistical Manual of Mental Disorders, 5th Edition; FCSRT, Free and Cued Selective Reminding Test; FCSRT+IR, Free and Cued Selective Reminding Test with Immediate Recall; IDDD, Interview for Deterioration in Daily Living Activities in Dementia; HC, Healthy Control; ROC curve analysis, Receiver Operating Characteristic curve analysis; TMA-93, the Memory Associative Test of the district of Seine-Saint-Denis; TT, Triana Test.
Acknowledgments
We are thankful to all participants involved in this research.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
References
