Abstract
Background:
People with Down syndrome (DS) are at high risk of developing Alzheimer’s disease dementia (AD). Behavioral and psychological symptoms of dementia (BPSD) are common and may also serve as early signals of dementia. However, comprehensive evaluation scales for BPSD, adapted to DS, are lacking. Therefore, we previously developed the
Objective:
To optimize and further study the scale (discriminative ability and reliability) in a large representative DS study population.
Methods:
Optimization was based on item irrelevance and clinical experiences obtained in the initial study. Using the shortened and refined
Results:
Comparing item change scores between groups revealed prominent changes in frequency and severity for anxious, sleep-related, irritable, restless/stereotypic, apathetic, depressive, and eating/drinking behavior. For most items, the proportion of individuals displaying an increased frequency was highest in DS + AD, intermediate in DS + Q, and lowest in DS. For various items within sections about anxious, sleep-related, irritable, apathetic, and depressive behaviors, the proportion of individuals showing an increased frequency was already substantial in DS + Q, suggesting that these changes may serve as early signals of AD in DS. Reliability data were promising.
Conclusion:
The optimized scale yields results largely similar to those obtained with the initial version. Systematically evaluating BPSD in DS may increase caregivers’ understanding of changes and facilitate (timely) adaptation of care/treatment.
Keywords
INTRODUCTION
Down syndrome (DS, trisomy 21) is the most frequent genetic cause of intellectual disability (ID), occurring in approximately 1 in 900 live births [1, 2]. People with DS are at high risk of developing dementia due to Alzheimer’s disease (AD). Despite substantial variation between studies, prevalences rise strongly from age 40 [3]. Indeed, by that age, virtually all people with DS have extensive AD-like pathology in the brain [4]. Nevertheless, onset of clinical symptoms varies substantially in time [5, 6]. Consequently, predicting and monitoring decline and onset of dementia is a diagnostic challenge and essential in daily care and support for people with DS [7].
Behavioral alterations are very common in AD in addition to decline in cognitive and functional skills. These so-called behavioral and psychological symptoms of dementia (BPSD), or neuropsychiatric symptoms, are defined as “a heterogeneous range of psychological reactions, psychiatric symptoms, and behaviors resulting from the presence of dementia” [8]. Nearly all people with dementia in the general population experience at least one BPSD symptom at some point during their disease course [9, 10]. Various contributing factors have been identified to explain the heterogeneity of these symptoms, among others, factors relating to the person with dementia, factors related to caregivers and environmental factors, and the interaction between these factors [11]. In the general population, BPSD are associated with a reduced quality of life and earlier institutionalization for people with dementia, and increased burden for caregivers [8, 12]. Additionally, BPSD are a key reason for referral to specialist services [13].
Professional caregivers of people with DS + AD found it particularly difficult to respond to the unpredictability of behavioral changes [14] and might not seek intervention until the behavior becomes more difficult to manage [15, 16]. Moreover, changes may be perceived as disability-specific instead of related to dementia (i.e., diagnostic overshadowing) [14, 17]. Therefore, systematic evaluation of BPSD is important to increase awareness, acceptance, and understanding among family members and care professionals, which in turn may contribute to adapting care/support [15, 18]. Furthermore, BPSD are reported in prodromal and early stages of dementia and might, as such, serve as early ‘alarm signals’ [19]. For daily practice, anticipating the development of symptoms enables, among other things, timely adaptation of daily support, work and day-care, and the living environment [20]. Importantly, BPSD can (partially) be treated, either non-pharmacologically or pharmacologically [17, 21–26].
In the general population, BPSD are regularly evaluated using validated scales such as the
METHODS
Scale optimization
The initial
Translation
The optimized Dutch version (source document) was forward and back translated into French, Italian, and Spanish by a certified professional translation company (DBF Communicatie B.V., Alphen aan den Rijn, The Netherlands). Translations were performed according to the International Organization for Standardization (ISO) standard for translation services (ISO 17100:2015) and the company’s standard operating procedures based on this standard. Firstly, the
Digitization
To facilitate administration, improve data quality (completeness of entered data, guidance of interviewers), and facilitate data processing, a digital version of the optimized scale was developed in various rounds of optimization and problem-solving. An online version of the
Multidisciplinary consortium
Expanding on the previously established consortium [30], a total of 17 Dutch care institutions and 4 European expertise centers took part in the study. This broad, multidisciplinary consortium enabled the study of a large, representative study population of people with DS in a daily clinical practice setting. The following Dutch care institutions participated, providing care, diagnostics and therapy to people with ID/DS throughout nearly the entire country: Amerpoort, Aveleijn, Cosis, De Twentse Zorgcentra, Dichterbij, Elver, Ipse de Bruggen, Nieuw Woelwijck, Philadelphia, Reinaerde, ’s Heeren Loo, Severinus, Sherpa, Sprank, Talant (part of Alliade Care Group), Vanboeijen, and Zuidwester. In addition, University of Antwerp and its Flemish network of care institutions (Belgium), Institut Jérôme Lejeune (France), Policlinico Gemelli (Italy), and Hospital de la Santa Creu i Sant Pau (Spain) participated in administering the scale.
Scoring
Frequency and severity were scored for each scale item. To identify behavioral changes over time and account for pre-existing behavior, frequency and severity were scored for two periods of time: (a) last six months and (b) typical/characteristic behavior before any deterioration occurred. Frequency was scored on a five-point scale: 0 = never or once only, 1 = less than once a month, 2 = monthly, not weekly, 3 = weekly, not daily, or 4 = daily or continuously. The resulting frequency change score (score for sub-item (a) – score for sub-item (b)) is a measure of behavioral change over time and ranged from –4 to +4. Severity was considered from the perspective of the person with DS and based on two aspects: personal suffering and degree of impact on daily life. Severity was scored on a four-point scale: 0 = none, 1 = slight, 2 = moderate, or 3 = serious. The resulting severity change score (score for sub-item (a) – score for sub-item (b)) is a measure of behavioral change over time and ranged from –3 to +3.
Depending on the residential circumstances, informant(s) may not always be aware of the person’s sleep behavior. The answer option ‘unknown’ (?) was therefore provided to items in section 2 about sleep problems. In addition, for some items the answer option ‘not applicable’ (N/A) was provided. Depending on the person’s physical disability and/or freedom-restricting measures (items 1.2, 1.4, 2.3, 5.2, and 5.4, see Results section) and verbal (in)abilities (items 3.3, 5.5, 6.1, and 10.2, see Results section), the interviewer could select the answer ‘not applicable’ if a symptom could not occur.
Finally, care(giver) burden was evaluated in each section from the perspective of caregivers/family members. Care burden score per section was based on three aspects: 1) manageability of symptoms, 2) additional time required, and 3) emotional burden. Care burden was scored on a four-point scale: 0 = none, 1 = slight, 2 = moderate, or 3 = serious. Per section, the resulting care burden change (score for sub-item (a) – score for sub-item (b)) ranged from –3 to +3.
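The change scores described above can be illustrated with a minimal Python sketch. The function name `change_score`, the constants, and the use of `None` for ‘unknown’ or ‘not applicable’ answers are assumptions made for this illustration and are not taken from the scale manual.

```python
from typing import Optional

FREQUENCY_MAX = 4  # 0 = never or once only ... 4 = daily or continuously
SEVERITY_MAX = 3   # 0 = none ... 3 = serious (same range as care burden)


def change_score(recent: Optional[int], baseline: Optional[int],
                 max_score: int) -> Optional[int]:
    """Return sub-item (a) minus sub-item (b), or None if either is missing.

    `recent` is the score for the last six months (a); `baseline` is the
    typical/characteristic behavior before deterioration (b). None stands in
    for an 'unknown' or 'not applicable' answer (assumption of this sketch).
    """
    if recent is None or baseline is None:
        return None
    for value in (recent, baseline):
        if not 0 <= value <= max_score:
            raise ValueError(f"score {value} outside 0..{max_score}")
    return recent - baseline


# Hypothetical example: behavior that used to occur monthly (2), now daily (4).
print(change_score(4, 2, FREQUENCY_MAX))     # 2
print(change_score(1, 1, SEVERITY_MAX))      # 0
print(change_score(None, 3, FREQUENCY_MAX))  # None
```

Because both sub-items share the same scale, the change score is bounded by plus or minus the scale maximum, which gives the ranges stated above.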
Interviewers
The
Informants
Interviews were conducted with at least one key informant of the person with DS, such as a caregiver working in a day-care center/residential facility or a family member. Informants had to be able to provide an accurate description of the behavior in the last six months as well as the typical/characteristic behavior in the past before decline occurred. Additional key informants were recruited, especially in cases where a single informant could not provide answers to all questions and both time periods. If informants were not able to describe the typical/characteristic behavior and this became apparent during the interview, this resulted in exclusion. In case of multiple informants, they were interviewed in a single session (not separately). Interviews were conducted in absence of the person with DS. Prior to the interview, set-up and scoring system were explained to the informant(s). In case of disagreement between informants, consensus on the score was reached during the interview. If an informant did not understand an item, the interviewer provided clarification. If informants evidently exaggerated or trivialized symptoms, the interviewer addressed this. Informants were asked to give short, succinct answers, and were reminded to do so if they gave long-winded or anecdotal responses. For each person with DS, the
Ethics
The Medical Ethics Review Board of the University Medical Center Groningen (UMCG) evaluated the study protocol (no. 2018/220) and concluded that the Dutch Medical Research Human Subjects Act did not apply. The study was registered in the UMCG Research Register (no. 201800252) and compliant with the EU General Data Protection Regulation and standards for medical research in humans recommended by the Declaration of Helsinki. Local institutional review committees gave their approval, whenever applicable. In Flanders, Belgium, the Institutional Review Boards of the Hospital Network Antwerp (no. 5058) and the University Hospital Antwerp/University of Antwerp (no. 17/50/566) approved the study. In France, the study was authorized under research standard MR-004 by the Commission Nationale de l’Informatique et des Libertés (no. 2214487 v 0). In Italy, the study was approved by the Ethical Committee of the Università Cattolica del Sacro Cuore (no. 2731520). In Spain, the study was approved by the Sant Pau Ethics Committee and reported to the Minister of Justice according to the Spanish law for research in people with ID.
Study population, recruitment, and consent
To ensure a representative study population, participants were recruited through the aforementioned care institutions (various backgrounds, regions, living situations) based on inclusion and exclusion criteria. Inclusion criteria: phenotypical diagnosis of DS, aged ≥30 years, and a stable dose of psychoactive medications (if any). Exclusion criteria: profound ID, long-term admission to hospital in the past six months, bed-ridden or in terminal care (e.g., end-stage dementia), presence of a confirmed cerebrovascular accident, and absence of at least one informant able to describe the person’s behavior in the past six months and typical/characteristic behavior in the past. People who faced a recent life event, e.g., moving home or death of a family member, with continued impact on behavior were excluded, according to clinical judgement. Furthermore, individuals were excluded if they presented behavioral changes that, according to clinical judgement, were due to another diagnosed condition (comorbidity), for example (un)treated depression, epilepsy, hypothyroidism, vitamin B12 deficiency, hearing problems, vision problems, sleep apnea, and chronic pain. A diagnosis of such a comorbidity per se did not result in exclusion if the person functioned normally (e.g., due to effective treatment). ID level, medical conditions, and medication use were based on (medical) records, and if necessary, inquiries with involved care institutions, clinicians, and ID (neuro)psychologists. After selection, an information letter with consent form was sent. Except for a few individuals with DS capable of providing consent themselves (adapted informed consent form with pictograms), written informed consent was generally obtained from legal representatives (proxy consent). Consent was provided for evaluation of behavioral changes using the
Dementia diagnosis
Three diagnostic study groups were distinguished in this cross-sectional study: 1) DS without dementia (DS), 2) DS with questionable dementia (DS + Q), i.e., (slight) deterioration that is suggestive of dementia, but does not (yet) clearly meet the diagnostic criteria, and 3) DS with a clinical diagnosis of dementia (DS + AD). People with DS were assigned to one of the three study groups on the basis of expert clinical judgement by clinicians and/or ID (neuro)psychologists at the participating care institutions. Clinical diagnosis of dementia in people with DS is valid and reliable [16, 31]. This judgement was generally based on routine multidisciplinary clinical evaluation, informant interview(s), information from medical records, and general dementia criteria [32, 33]. People with DS were not subjected to new dementia assessments. The diagnosis, and thus assignment to the three study groups, was established in advance and not based on outcomes of the
Validity
This study builds further on the initial development process in which face and content validity of the
Discriminative ability
In the context of discriminative ability, 1) item (ir)relevance, 2) total scale scores, and 3) sensitivity, specificity and predictive values were analyzed. First of all, we aimed to confirm relevance of behavioral items in the optimized
Reliability
Reliability was studied by evaluating interrater reliability (IRR) and test-retest reliability (TRR) for a subset of individuals with DS. For IRR, the same interview was scored by two interviewers blinded to each other’s scores. For this purpose, the first interviewer conducted the interview, and a second interviewer was present as a ‘fly on the wall’ (not involved in the interview). IRR assessments were performed in various combinations of interviewers in multiple centers. For TRR, a second interview by the same interviewer and with the same informant(s) was conducted within 1–7 weeks after the first interview. Originally, we aimed to have a retest conducted within 4 weeks. However, due to practical difficulties, this was extended in the course of the study. IRR and TRR may be calculated using percent agreement, correlations or Cohen’s kappa [35]. Because of the categorical nature of the item scores in the
Data processing and quality control
Each completed scale, including those administered for reliability testing, was thoroughly checked for any lack of clarity, missing data, inclusion/exclusion criteria, and compliance with the instructed method and rules described in the manual, including rules regarding the interviewer, informants, and scoring (see above). In particular, rules concerning the option to answer ‘not applicable’ were double-checked and scores adapted, if necessary. That is, ‘not applicable’ was only allowed if a person was non-verbal or physically impaired, or if freedom-restricting measures were in place. If required, the involved interviewers were consulted, and issues were solved through consensus.
Statistics
For population characteristics (Table 1), Pearson’s chi-squared tests were applied to compare categorical data between groups. ANOVA tests were used to compare normally distributed continuous data (age and IQ-scores) between groups.
Characteristics of the three diagnostic study groups
ID level refers to the highest level of functioning (baseline) before dementia-related decline occurred. Dependence on a wheelchair was defined as requiring a wheelchair not only outdoors for longer distances, but also indoors. If the person with DS does not need a wheelchair indoors, they may show most behavioral items related to physical activity in the
In the context of discriminative ability, the analysis first focused on the (ir)relevance of individual items. ‘Unknown’ and ‘not applicable’ answers were treated as ‘missing values’. To compare individual item scores (frequency change and severity change) between the three groups, Kruskal-Wallis tests were used. Statistical analysis was conducted using original underlying frequency change scores (–4 to +4), severity change scores (–3 to +3), and care burden change scores (–3 to +3). In Figs. 3–9 and Supplementary Figures 1–6, however, a simplified graphical representation is provided in which the changes were reduced to ‘decrease’, ‘unaltered’, and ‘increase’. Secondly, frequency change and severity change scores for the total scale were calculated as the sum of individual item change scores for frequency and severity, respectively, and were compared between groups using Kruskal-Wallis tests. Thirdly, using total scale frequency change scores (all items), ROC analyses were performed. In addition to sensitivity and specificity, positive and negative predictive values were calculated as well [34].
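To illustrate the three-group comparison described above, the following is a minimal pure-Python sketch of the Kruskal-Wallis H test with tie correction. The function name `kruskal_wallis_3` and the example change scores are hypothetical and are not study data. The p-value uses the chi-square approximation, which for three groups (2 degrees of freedom) has the closed form exp(−H/2). Missing values are assumed to have been removed beforehand, and in practice a statistical package would normally be used.

```python
import math
from collections import Counter


def kruskal_wallis_3(group_a, group_b, group_c):
    """Return (H, p) for exactly three non-empty groups of numeric scores."""
    groups = (group_a, group_b, group_c)
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Assign mid-ranks: tied values share the mean of their rank positions.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    tie_counts = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_counts) / (n ** 3 - n)
    if correction == 0:  # all values identical: no evidence of a difference
        return 0.0, 1.0
    h /= correction
    # Chi-square survival function with 2 degrees of freedom is exp(-H / 2).
    return h, math.exp(-h / 2)


# Hypothetical frequency change scores for one item in the three groups.
ds = [0, 0, 0, 1, 0, -1]
ds_q = [0, 1, 2, 0, 1, 1]
ds_ad = [2, 3, 1, 4, 2, 0]
h_stat, p_value = kruskal_wallis_3(ds, ds_q, ds_ad)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

A rank-based test fits here because the change scores are ordinal and heavily tied at zero, which is also why the tie correction matters.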
Concerning reliability, IRR and TRR were calculated as percent agreement for frequency change scores and severity change scores per item and for the total scale. Total scale scores for frequency change could range from –208 to +208 points and for severity change from –156 to +156 points. Given the fact that the
Additional analyses were performed to evaluate the effect of age, sex, ID level, and presence of depression on total scale scores for frequency change and severity change. To assess the effect of age, a linear regression analysis was performed within the study group without dementia (DS) using age as independent variable and total scale frequency change or severity change as dependent variable. The effects of sex (male/female) and depression (presence/absence) were studied using Mann-Whitney U tests on the entire study group of 524 participants (regardless of dementia status). Finally, the effect of ID level was studied in the entire study group by comparing total scale scores (one-way ANOVA) between groups with mild, moderate, and severe ID.
RESULTS
Scale optimization
Based on prior results and clinical experiences obtained in the initial study [30], the

Schematic overview of optimization of the
The
Population demographics
Using informant interviews, the

Schematic overview of included and excluded interviews and the three study groups. BPSD-DS II, Behavioral and Psychological Symptoms of Dementia in Down Syndrome II scale; CVA, cerebrovascular accident; DS, Down syndrome without dementia; DS + Q, Down syndrome with questionable dementia; DS + AD, Down syndrome with diagnosed AD dementia; ID, intellectual disability.
Validity
Face and content validity were already ensured for the initial version of the
Discriminative ability: item (ir)relevance
Here, we aimed to confirm the relevance of the remaining and refined 52 items by comparing frequency change and severity change across the three study groups. Hereafter, items are described per clinically defined section with the corresponding

Significant frequency changes for items in section 1 (anxious behavior). Underlying frequency change scores are depicted in a simplified way: the proportion of individuals (%) per group showing any decreased, unaltered, or any increased frequency comparing the last six months to the typical/characteristic behavior in the past. Proportions of individuals (%) with missing values are depicted in dark grey, but not in numbers. Item descriptions and

Significant frequency changes for items in section 2 (sleep problems). Underlying frequency change scores are depicted in a simplified way: the proportion of individuals (%) per group showing any decreased, unaltered, or any increased frequency comparing the last six months to the typical/characteristic behavior in the past. Proportions of individuals (%) with missing values are depicted in dark grey, but not in numbers. Item descriptions and
Section 1: Anxious behavior
The first section addressed worrying about upcoming activities/events (item 1.1; pfq < 0.001, psv < 0.001), going to the toilet unusually often or for an unusually long time without any (apparent) physical reason (1.2; pfq < 0.001, psv < 0.001), being tense (1.3; pfq < 0.001, psv < 0.001), avoiding situations/places that make the person nervous (1.4; pfq < 0.001, psv < 0.001), being scared to be left alone (1.5; pfq < 0.001, psv < 0.001), and being easily panicked (1.6; pfq < 0.001, psv < 0.001). Figure 3 shows that increased anxiety was prominent in people with DS + AD, but also in those with DS + Q. For items 1.1–1.4, the difference between DS + Q and DS + AD was rather small, suggesting that increased anxiety may already occur in an early stage of dementia.
Section 2: Sleep problems
This section evaluated finding it hard to fall asleep (2.1; pfq = 0.001, psv = 0.001), waking repeatedly during the night (2.2; pfq < 0.001, psv < 0.001), wandering around at night (2.3; pfq < 0.001, psv < 0.001), waking long before it is time to get up/the alarm goes off (2.4; pfq < 0.001, psv < 0.001), finding it hard to get up in the morning (2.5; pfq < 0.001, psv = 0.002), being tired or complaining of fatigue (2.6; pfq < 0.001, psv < 0.001), and sleeping in the daytime (2.7; pfq < 0.001, psv < 0.001). Figure 4 demonstrates that the proportion of individuals showing an increase in sleep problems was consistently highest in DS + AD, intermediate in DS + Q, and lowest in DS. Specifically, for items 2.6 and 2.7, an increased frequency was already reported for more than one third of individuals with DS + Q.
Section 3: Irritable behavior
Three items were evaluated: being irritable, touchy (3.1; pfq < 0.001, psv < 0.001), being impatient (3.2; pfq < 0.001, psv < 0.001), and being short-spoken, responding grumpily (3.3; pfq < 0.001, psv < 0.001). In general, the proportion of individuals showing an increase was higher in both DS + Q and DS + AD than in DS (Fig. 5). The difference between DS + Q and DS + AD was rather small, suggesting that increased irritable behavior may already occur in an early stage of dementia.

Significant frequency changes for items in section 3 (irritable behavior), section 4 (obstinate behavior), section 5 (restless & stereotypic behavior), and section 6 (aggressive behavior). Underlying frequency change scores are depicted in a simplified way: the proportion of individuals (%) per group showing any decreased, unaltered, or any increased frequency comparing the last six months to the typical/characteristic behavior in the past. Item descriptions and
Section 4: Obstinate behavior
Being self-willed (4.1; pfq = 0.436, psv = 0.362), being argumentative, uncooperative, or obstructive (4.2; pfq = 0.089, psv = 0.007), not willing to accept necessary help (4.3; pfq = 0.420, psv = 0.415), and sighing/groaning (4.4; pfq < 0.001, psv < 0.001) were evaluated in this section. Only the latter item differed significantly between groups in frequency change, with the proportion of individuals showing an increased frequency being lowest in the DS group and higher in DS + Q and DS + AD groups (Fig. 5). Though not significant for frequency change, items 4.1–4.3 showed an interesting bidirectional change: for a substantial proportion of individuals a decreased frequency was reported, while for another substantial proportion an increased frequency was reported. In the DS + AD group, this applied to items 4.1 (14.2% decreased frequency; 15.9% increased frequency), 4.2 (13.3%; 24.8%), and 4.3 (15.0%; 23.9%).
Section 5: Restless & stereotypic behavior
This section included items on general restlessness (5.1; pfq < 0.001, psv < 0.001), wandering (5.2; pfq < 0.001, psv < 0.001), stereotypic behavior (5.3; pfq < 0.001, psv < 0.001), repeatedly dressing and undressing (more than necessary) (5.4; pfq = 0.001, psv < 0.001), verbal stereotypy (5.5; pfq < 0.001, psv < 0.001), and compulsive behavior (5.6; pfq = 0.018, psv = 0.005). Between groups, the same pattern was observed for all items, with the proportion of individuals showing an increase being consistently highest in DS + AD and lowest in DS (Fig. 5).
Section 6: Aggressive behavior
Verbally aggressive behavior (6.1; pfq < 0.001, psv < 0.001), destructive behavior (6.2; pfq = 0.004, psv = 0.014), and physically aggressive behavior towards others (6.3; pfq = 0.003, psv = 0.020) were evaluated. Figure 5 demonstrates that an increased frequency of verbal aggression was more pronounced in DS + AD than in DS. Destructive and physically aggressive behavior showed a rather similar pattern, though less pronounced.
Section 7: Apathetic behavior
Different possible symptoms of apathetic behavior were assessed: lack of initiative (7.1; pfq < 0.001, psv < 0.001), lack of interest in the direct living environment (7.2; pfq < 0.001, psv < 0.001), hard to motivate to get involved in familiar activities/tasks, appearing lazy (7.3; pfq < 0.001, psv < 0.001), not independently completing activities/tasks, needs encouragement or help (7.4; pfq < 0.001, psv < 0.001), not participating much in conversation (7.5; pfq < 0.001, psv < 0.001), social withdrawal (7.6; pfq < 0.001, psv < 0.001), lack of sympathy or empathy (7.7; pfq < 0.001, psv = 0.109), and jaded emotional responses (7.8; pfq < 0.001, psv = 0.002). Among all sections, apathetic behavior was increased most evidently in relation to the status of dementia, with proportions of individuals showing an increase up to 73.5% (item 7.4). The proportion of individuals showing an increase was consistently highest in DS + AD, intermediate in DS + Q, and lowest in DS for all items (Fig. 6), with the exception of item 7.6, for which DS + Q and DS + AD were relatively similar. A substantial proportion of individuals with DS + Q already demonstrated an increased frequency of most apathetic items, suggesting that increased apathetic symptoms may already be present in an early phase of dementia.

Significant frequency changes for items in section 7 (apathetic behavior). Underlying frequency change scores are depicted in a simplified way: the proportion of individuals (%) per group showing any decreased, unaltered, or any increased frequency comparing the last six months to the typical/characteristic behavior in the past. Item descriptions and
Section 8: Depressive behavior
Although many symptoms overlap, apathy and depression are regarded as two distinct neuropsychiatric conditions [29]. Items in this section emphasized depressive over apathetic characteristics, including rapid mood swings (8.1; pfq < 0.001, psv < 0.001), being sad and/or weeping a lot (8.2; pfq < 0.001, psv < 0.001), being very downhearted and appearing to be in low spirits (8.3; pfq < 0.001, psv < 0.001), having physical complaints without any apparent illness or injury (8.4; pfq = 0.001, psv = 0.007), and moving and responding slowly (general slowness) (8.5; pfq < 0.001, psv < 0.001). For items 8.1, 8.2, and 8.5 the proportion of individuals showing an increase was evidently highest in DS + AD, intermediate in DS + Q, and lowest in DS (Fig. 7). For items 8.3 and 8.4, the proportion of individuals in DS + Q and DS + AD was rather similar as compared to DS. The proportion of individuals in the DS + Q group showing an increased frequency was already substantial for items 8.2 and 8.5, suggesting that these symptoms may present early in the course of the disease.

Significant frequency changes for items in section 8 (depressive behavior) and section 9 (psychotic behavior). Underlying frequency change scores are depicted in a simplified way: the proportion of individuals (%) per group showing any decreased, unaltered, or any increased frequency comparing the last six months to the typical/characteristic behavior in the past. Item descriptions and
Section 9: Psychotic behavior
This section concerned incorrect beliefs/thoughts (delusions) (9.1; pfq < 0.001, psv < 0.001) and abnormal sensory experiences not experienced by others (hallucinations) (9.2; pfq < 0.001, psv < 0.001). Psychotic symptoms were reported substantially less often than symptoms in other sections. Figure 7 shows that the frequency of psychotic behavior increased in a larger proportion of individuals with dementia compared to those without dementia.
Section 10: Disinhibited behavior
Items addressed behaving impulsively (10.1; pfq = 0.001, psv = 0.111), making inappropriate comments or jokes (10.2; pfq = 0.526, psv = 0.071), and behaving in impolite or indecent ways (loss of decorum) (10.3; pfq < 0.001, psv = 0.302). The proportion of individuals demonstrating an increased frequency of disinhibited behavior was highest in DS + AD and lowest in DS (Fig. 8).

Significant frequency changes for items in section 10 (disinhibited behavior) and section 11 (eating & drinking behavior). Underlying frequency change scores are depicted in a simplified way: the proportion of individuals (%) per group showing any decreased, unaltered, or any increased frequency comparing the last six months to the typical/characteristic behavior in the past. Item descriptions and
Section 11: Eating & drinking behavior
This last section addressed drinking poorly, having to be encouraged to drink (11.1; pfq < 0.001, psv < 0.001), poor appetite, having to be encouraged to eat (11.2; pfq < 0.001, psv < 0.001), eating slowly (11.3; pfq < 0.001, psv < 0.001), being picky about food and drink (11.4; pfq = 0.078, psv = 0.116), and putting substances/objects in the mouth that are not intended for consumption (pica) (11.5; pfq = 0.002, psv = 0.162). For items 11.1–11.3, the proportion of individuals showing an increase was highest in DS + AD and lowest in DS (Fig. 8).
Items with unaltered frequency change and severity change scores (change = 0) for ≥85% of DS + Q and DS + AD were regarded as irrelevant. Only item 11.5 (pica) fulfilled this criterion: frequency change was unaltered for 98.3% of DS + Q and 98.1% of DS + AD, and severity change was unaltered for 100% of DS + Q and 99.1% of DS + AD. None of the other items were found to be irrelevant, as substantial changes were observed between groups.
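The irrelevance criterion above can be expressed as a short Python sketch. The function names, the example data, and the choice to exclude missing values (`None`) from the denominator are assumptions of this illustration; the text does not state how missing values entered the percentages.

```python
def proportion_unaltered(change_scores):
    """Share of non-missing change scores equal to 0 (None = missing)."""
    valid = [s for s in change_scores if s is not None]
    if not valid:
        raise ValueError("no non-missing change scores")
    return sum(1 for s in valid if s == 0) / len(valid)


def is_irrelevant(freq_q, sev_q, freq_ad, sev_ad, threshold=0.85):
    """Apply the >=85% 'unaltered' rule to both change scores in both groups.

    freq_q / sev_q: frequency and severity change scores in DS + Q;
    freq_ad / sev_ad: the same for DS + AD.
    """
    return all(
        proportion_unaltered(scores) >= threshold
        for scores in (freq_q, sev_q, freq_ad, sev_ad)
    )


# Hypothetical item: almost nobody changes in either group.
rare_item = is_irrelevant(
    freq_q=[0] * 19 + [1],
    sev_q=[0] * 20,
    freq_ad=[0] * 19 + [2],
    sev_ad=[0] * 19 + [1],
)
print(rare_item)  # True: every proportion is at least 0.95
```

Requiring all four proportions to pass the threshold keeps an item whenever it changes appreciably in either group or on either measure.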
Discriminative ability: Total scale scores
Kruskal-Wallis tests were used to compare sum scores for the total scale between the three groups. Total scale frequency change scores differed significantly between groups (
Discriminative ability: Sensitivity, specificity, and predictive values
Although a diagnosis of dementia cannot exclusively be based on a purely behavioral assessment (without addressing cognitive and functional decline), scores on a behavioral scale may aid clinicians and ID (neuro)psychologists in the diagnostic process. Therefore, we aimed to identify cut-off scores for discrimination between groups, i.e., DS versus DS + Q/DS + AD (cut-off 1) and DS/DS + Q versus DS + AD (cut-off 2). Between groups, frequency changes were more pronounced than severity changes. Therefore, Supplementary Figure 7 shows the ROC curves for total scale frequency change scores for DS versus DS + Q/DS + AD and for DS/DS + Q versus DS + AD. Since sensitivity, specificity, positive predictive value and negative predictive value approached each other, Table 2 presents a range of cut-off scores.
Cut-off scores with corresponding sensitivity, specificity, positive predictive value, and negative predictive value
The range of cut-off scores provided here starts with the first cut-off score reaching a specificity ≥70%. DS, Down syndrome without dementia; DS + Q, Down syndrome with questionable dementia; DS + AD, Down syndrome with diagnosed AD dementia; PPV, positive predictive value; NPV, negative predictive value.
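For readers who want to see how the four measures follow from a single cut-off, here is a minimal Python sketch. The function name `diagnostic_metrics`, the rule that a total score at or above the cut-off counts as a positive result, and the example values are assumptions of this illustration and are not study data.

```python
def _ratio(numerator, denominator):
    """Safe division: NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(scores, has_condition, cutoff):
    """Sensitivity, specificity, PPV, and NPV for 'score >= cutoff'.

    scores: total scale frequency change scores, one per person.
    has_condition: True if the person belongs to the target group
    (e.g., DS + AD for cut-off 2), False otherwise.
    """
    tp = fp = tn = fn = 0
    for score, positive in zip(scores, has_condition):
        predicted = score >= cutoff
        if predicted and positive:
            tp += 1
        elif predicted and not positive:
            fp += 1
        elif not predicted and positive:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Hypothetical total scores and group membership.
example_scores = [0, 1, 4, 3, 6, 8]
example_labels = [False, False, False, True, True, True]
print(diagnostic_metrics(example_scores, example_labels, cutoff=4))
```

Repeating the calculation over a range of cut-offs yields the kind of trade-off between sensitivity and specificity that a ROC curve summarizes. Note that PPV and NPV depend on how common the target group is in the sample.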
Care burden
Care burden was scored per section considering those items within each section that were answered positively (frequency ≥1). Kruskal-Wallis tests demonstrated that care burden change scores differed significantly between groups for all sections: 1) anxious behavior (

Care burden changes for each section. Underlying frequency change scores are depicted in a simplified way: the proportion of individuals (%) per group showing a decreased, unaltered, or increased care burden when comparing the last six months with the typical/characteristic behavior in the past.
Reliability
IRR was determined for a subset of N = 82 individuals (15.6%): DS (
Interrater reliability and test-retest reliability
IRR and TRR for frequency change and severity change are provided as percent agreement. For each section, the range of agreement for individual items is given, i.e., the lowest and the highest percent agreement for items in each section. For each individual item, IRR, TRR, and internal consistency are provided in Supplementary Table 1. For the total scale, both IRR and TRR were calculated for perfect agreement (identical scores) as well as with margins of plus or minus 1, 2, or 3 points. DS, Down syndrome without dementia; DS + Q, Down syndrome with questionable dementia; DS + AD, Down syndrome with diagnosed AD dementia; IRR, interrater reliability; TRR, test-retest reliability.
In addition, IRR and TRR were calculated as percent agreement for section care burden change scores: 1) anxious behavior (IRR = 95.1; TRR = 78.0), 2) sleeping problems (96.3; 86.0), 3) irritable behavior (97.6; 80.0), 4) obstinate behavior (95.1; 76.0), 5) restless & stereotypic behavior (95.1; 84.0), 6) aggressive behavior (100.0; 80.0), 7) apathetic behavior (98.8; 86.0), 8) depressive behavior (97.6; 66.0), 9) psychotic behavior (98.8; 96.0), 10) disinhibited behavior (98.8; 92.0), and 11) eating & drinking behavior (97.6; 92.0). IRR and TRR were also calculated for the total scale care burden change score for perfect agreement (IRR = 82.9; TRR = 40.0) and with margins of –1 to +1 point (93.9; 64.0), –2 to +2 points (98.8; 86.0), and –3 to +3 points (100.0; 92.0).
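Percent agreement with a tolerance margin can be illustrated with a short Python sketch. It assumes that two scores agree when their absolute difference does not exceed the margin (margin 0 corresponds to perfect agreement); the function name and the toy ratings are hypothetical and are not taken from the study.

```python
def percent_agreement(ratings_a, ratings_b, margin=0):
    """Percentage of paired scores that differ by at most `margin` points."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both raters (or both occasions) must score the same individuals.")
    agreeing = sum(1 for a, b in zip(ratings_a, ratings_b) if abs(a - b) <= margin)
    return 100.0 * agreeing / len(ratings_a)


# Hypothetical total scores for five individuals from two raters
# (or from the test and the retest interview).
toy_first = [0, 2, 5, 3, 1]
toy_second = [0, 3, 2, 3, 1]

for margin in (0, 1, 2, 3):
    print(margin, percent_agreement(toy_first, toy_second, margin))
```

Because a sum score accumulates small differences across many items, perfect agreement on the total scale is a stricter criterion than agreement on a single item, which is one reason for also reporting agreement within margins.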
Internal consistency was evaluated by determining Cronbach’s alphas for frequency change and severity change of individual items against the total scale scores for frequency change and severity change, respectively. Individual items had Cronbach’s alphas above 0.839 (frequency change) and above 0.788 (severity change) (Supplementary Material Table 1). Among all items together, overall Cronbach’s alpha was 0.845 (frequency change) and 0.799 (severity change). In summary, reliability and consistency data for frequency change and severity change were good and confirm previous results [30].
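For readers less familiar with this measure, the standard formula for Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of sum scores), can be sketched in Python as follows. The item-level variant shown here (alpha recomputed with each item left out) is an assumption about how item-level alphas may be obtained and is not necessarily the procedure used in this study; the function names and toy scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of scores per individual."""
    k = len(item_scores)
    # Sum score per individual across all items.
    totals = [sum(values) for values in zip(*item_scores)]
    summed_item_variance = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - summed_item_variance / variance(totals))


def alpha_if_item_deleted(item_scores):
    """Alpha recomputed with each item left out in turn (assumed item-level variant)."""
    return [
        cronbach_alpha(item_scores[:i] + item_scores[i + 1:])
        for i in range(len(item_scores))
    ]


# Hypothetical toy data: three items scored for four individuals.
toy_items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]

print(round(cronbach_alpha(toy_items), 3))
print([round(a, 3) for a in alpha_if_item_deleted(toy_items)])
```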
Effect of age, sex, depression, and ID level
Various potential confounding factors were already addressed in advance through clinical judgement (see Methods section). In addition, effects of age, sex, depression, and ID level were evaluated. Since age is the major risk factor for dementia [38], the DS + AD group was, as expected, older. A linear regression analysis in the group without dementia (DS) showed that age did neither significantly influence total scale scores for frequency change (coefficient
DISCUSSION
Using the shortened and refined
Study strengths
Virtually all individuals with dementia present with one or more behavioral changes [9, 10]. Whereas these symptoms are well studied in the general population [8, 39], BPSD have not been comprehensively studied in DS [29]. To that end, we developed the
Diagnosing dementia in DS is rather complex due to the baseline level of functioning, the presence of (life-long) characteristic behavior, and the frequent presence of comorbidities which may contribute to dementia-like symptoms [29, 40–42]. Considering this diagnostic complexity and the specific circumstances relating to individuals with DS, an important strength of this scale is its specific adaptation to the target group. To account for life-long characteristic behavior, the central aim of the scale is to identify ‘change’. The scoring system has been designed such that each individual with DS is compared with himself or herself. As people with ID often find it difficult to verbally express their feelings and emotions, or may not understand the items, identification of changes is based on interviews with key informants [43].
In this study, the diagnosis of (questionable) dementia was based on existing clinical multidisciplinary assessment, the current gold standard [16]. Participants were not subjected to new dementia assessments. To minimize the effect of other potential causes of decline than dementia and to reduce the risk of erroneously attributing changes to dementia, each individual was carefully evaluated for other causes of decline/change, such as major life events or comorbidities that may present with dementia-like symptoms.
Results from the initial study [30] guided optimization of the scale. The
Study limitations
Among the 524 eligible interviews, the interviewer was previously involved in the diagnostic procedure of (questionable) dementia of 92 individuals with DS. Although one could argue that this is a strength (well aware of the diagnostic process and its thoroughness), this might also be regarded as a risk of bias. Therefore, it is important to note that the scores were provided by informants (not the interviewer). Moreover, diagnosis of dementia, and thus division into study groups, was established without considering the outcomes of the
In retrospective interviews, recall bias and a degree of subjectivity—especially regarding (variable) behavior—may influence results. Although the selection and quality of informants was considered in advance by the interviewer, differences in informants’ personal attentiveness to signal changes could be a potential limitation. In the context of assessing dementia in individuals with ID, however, Jamieson-Craig et al. showed that ‘retrospective carer report of change in everyday function was as good as, if not better than, prospective ratings to identify dementia’ [42]. Nevertheless, certain informants may have exaggerated or trivialized behavioral alterations. That is why an interview design was chosen rather than self-completion by informants. Experienced interviewers may, in part, recognize and address this during the interview.
Another possible limitation is the fact that this study did not consider individuals with profound ID who often face other disabilities as well, such as motor or sensory disabilities. They require a specific approach to identify symptoms of dementia [44] as (many) specific skills have never been developed or care professionals have taken over. Consequently, items addressing such skills are not relevant as they cannot demonstrate decline, i.e., they cannot be indicative of dementia [16, 45]. Currently, no adapted scales are available for dementia in this subpopulation. Potentially, a future selection of
Lastly, we faced practical difficulties in scheduling more retest interviews. Although we extended the time interval from a maximum of 4 weeks to 7 weeks, scheduling was still not feasible on many occasions due to, e.g., holidays, illness, long travel distances to the interview location, and full agendas.
Future implications
The primary goal of this study was to optimize and further study an adapted assessment tool for BPSD in adults with DS. We have shown that the
The
In addition, it would be valuable to apply the
CONCLUSION
The optimized
Footnotes
ACKNOWLEDGMENTS
This study was financially supported by the J. Th. Guepin Stichting Onderzoek Down Syndroom and further supported by in kind contributions from the participating care institutions and European expertise centers. Founding work of this project [
] was financially supported by the Research School Behavioral and Cognitive Neurosciences (RUG/UMCG) and the Gratama Stichting/Stichting Groninger Universiteitsfonds (2015-04). In Belgium, Institute Born-Bunge/University of Antwerp was granted a subsidy from Research Foundation Flanders (G053218N). In Spain, this study was supported by Fondo de Investigaciones Sanitario, Instituto de Salud Carlos III (PI14/01126, PI17/01019 to JF) and the CIBERNED program, partly jointly funded by Fondo Europeo de Desarrollo Regional, Unión Europea, Una manera de hacer Europa. This work was also supported by the National Institutes of Health (NIA grants 1R01AG056850–01A1, R21AG056974, R01AG061566 to JF), Fundació La Marató de TV3 (20141210 to JF), Fundació Catalana Síndrome de Down, Fundació Víctor Grífols i Lucas and Generalitat de Catalunya (SLT006/17/00119 to JF).
The authors are grateful to all caregivers/family members for participating as informants, as well as to the support staff of the various care institutions for the local organization and administration. We wish to thank all interviewers who took part in the project (per institute): Mylou Pool, Marije Ravesteijn, Nadi de Vos (Amerpoort), Lizan Exterkate, Rachel Kemna, Nardine Lukassen, Danielle Oosterling, Carla Wensink (Aveleijn), José Eleveld, Desiree van Leth, Sharina Grefelman, Cobien Wever (Cosis), Sanne Apperloo, Eline de Jong, Sandra Kleijer (De Twentse Zorgcentra), Jade Oostrom (Dichterbij), Marscha Brunia, Rosanne Derksen, Esther Scholten, Marije van Dijk-van der Werff (Elver), Lyanne Hassefras, Marloes Lansbergen, Simone Wilson-Koudenburg (Ipse de Bruggen), Janette Drolinga, Tineke Arts-Wiegmink (Nieuw Woelwijck), Hannelore Broers, Regine van Duijvenboden, Maarten Faber, Annebeth Maarsman, Lilian van der Meer, Romy Scheggetman, Karen van de Weijer (Philadelphia), Merel Duim, Kirsten Hendrickx, Britta te Nijenhuis, Loes Velner (Reinaerde), Eden van den Akker, Lobke Berendsen (Severinus), Fenna Blaas, Herco Elbertsen, Marjon van der Poel (Sherpa), Marieke Groen, Corien Rikkers-van Nes, Corine van Essen (’s Heeren Loo), Annemarieke Bronswijk, Ybelina de Jong-van der Meulen (Sprank), Natascha Albers, Eva Smit, Nienke Stap (Talant/Alliade Care Group), Aart-Jan Lenstra, Tijs van der Linden, Marian Roffel-de Jong (Vanboeijen), Monique Bomer-Veenboer, Lisette Meinster, Deliah Ormskerk (Zuidwester), Gianluca Radice (Institut Lejeune), and Laura Videla (Sant Pau).
