Abstract
Background:
Visuo-cognitive impairment is common in patients with Parkinson’s disease with mild cognitive impairment (PD-MCI) and constitutes a prognostic factor for the conversion to Parkinson’s disease dementia (PDD). However, systematic analyses on which neuropsychological tests are most suitable to assess visuo-cognition in PD-MCI and PDD and to differentiate these cognitive stages are lacking.
Objective:
To review neuropsychological tests used to assess visuo-cognition including visuo-perceptual and visuo-spatial processing, visuo-constructive copying and drawing on command abilities; and to identify the visuo-cognitive subdomain as well as tests most suitable to discriminate between PD-MCI and PDD.
Methods:
MEDLINE, PsycINFO, Web of Science Core Collection, and CENTRAL were systematically searched for relevant studies assessing visuo-cognitive outcomes in patients with PD-MCI and PDD. Risk of bias was assessed using a customized form based on well-established tools. Random-effects meta-analyses were conducted.
Results:
33 studies were included in the systematic review. Data of 19 studies were entered in meta-analyses. Considerable heterogeneity regarding applied tests, test versions, and scoring systems exists. Data indicate that visuo-constructive command tasks are the subdomain best suited to discriminate between PD-MCI and PDD. Furthermore, they indicate that the Rey-Osterrieth-Complex-Figure Test (ROCF), Corsi Block-Tapping Test, Judgment of Line Orientation (JLO), and Clock Drawing Test (CDT) are tests able to differentiate between the two stages.
Conclusion:
We provide suggestions for suitable visuo-cognitive tests (Corsi Block-Tapping Test, or JLO, ROCF, CDT) to improve diagnostic accuracy. Methodological challenges (e.g., heterogeneity of definitions, tests) are discussed and suggestions for future research are provided.
Registration:
https://www.crd.york.ac.uk/prospero/, ID: CRD42018088244
INTRODUCTION
Non-motor symptoms (NMS) have gained increased interest in the processes of diagnosing and treating Parkinson’s disease (PD) within the last decade. NMS can occur in an early disease stage, precede motor symptoms, and can even serve as a pre-diagnostic marker of PD [1]. Within the heterogeneous field of NMS, cognitive changes have been identified as one of the most challenging NMS affecting patients’ mood and quality of life [2]. Cognitive functioning can be considered as a continuous spectrum ranging from healthy cognition to dementia. On this spectrum, mild cognitive impairment (MCI) is an intermediate stage between intact cognitive functioning and dementia in PD. At the time of diagnosis, approximately 32% of patients with PD fulfill diagnostic criteria for PD-MCI [3, 4]. In the course of disease progression, patients with PD-MCI have an increased risk of developing Parkinson’s disease dementia (PDD) [5, 6]. A meta-analysis suggests that 20% of the patients with PD-MCI convert to PDD within a time period of 3 years [7]. The conversion rate increases to 34% when patients are followed for more than 3 years after diagnosis. Further studies suggest that PDD is associated with increasing disease duration, motor severity, and older age [8–10]. Regarding the progression from PD-MCI to PDD from a neural perspective, the Dual Syndrome Hypothesis suggests two independent but partially overlapping profiles of (early) cognitive impairment in patients with PD [11, 12]: on the one hand, dopamine-modulated fronto-striatal impairments resulting in a cognitive profile that is characterized by early deficits in executive functions and attention, and on the other hand, cholinergic-modulated more posterior and temporal lobe impairments leading to dysfunctions in language (e.g., semantic fluency), memory, and visuo-cognition (e.g., visuo-spatial abilities). The latter profile in particular is frequent in patients rapidly converting from PD-MCI to PDD.
On the spectrum of cognitive functioning, PDD and dementia with Lewy bodies (DLB) share overlapping pathological bases (α-synuclein deposition) and similar clinical manifestations (including cognitive impairment and possible hallucinations). In particular, deficits in visuo-cognitive abilities may be prominent and may occur early [13]. The diagnosis of PDD is made when cognitive impairment manifests itself more than 1 year after the onset of parkinsonian motor symptoms, whereas DLB is diagnosed when cognitive symptoms appear before or within 1 year of the onset of motor symptoms [14, 15].
Visuo-cognition: A theoretical framework
In general, “visuo-cognition” describes the interaction between vision and cognition across multiple levels of information processing [16]. This domain includes all nonverbal cognitive abilities operating upon perceptual stimuli as well as mental images allowing individuals to interact with the environment [17], e.g., recognize objects or estimate distances. As indicated by this broad definition, visuo-cognition summarizes several processes and cannot be considered as a uniform concept. The term visuo-cognition seems suitable as an umbrella term for these processes and is thus used here.
Based on previous research and established frameworks [18–22], a 3-component model of visuo-cognitive abilities can be suggested: (1) visuo-perceptual, (2) visuo-spatial, and (3) visuo-constructive abilities with (3a) copy tasks and (3b) command tasks. Visuo-perceptual abilities (1) are involved in elementary processing with no or minimal requirement of additional cognitive functions, e.g., executive functions, memory, attention [18–20]. This domain comprises those mental operations involved in analysis, synthesis, and identification of visual stimuli as in the process of object recognition [21, 22]. Visuo-spatial abilities (2) can be considered as more ‘complex’ processing stages requiring additional cognitive resources, e.g., executive functions, memory, or attention [18–20]. This domain refers to those processes involved in perceiving, integrating, and modulating spatial information, such as location, orientation, direction, and distance [21, 22]. The distinction between visuo-perceptual and visuo-spatial skills is supported by Ungerleider & Mishkin’s classical categorization of visuo-cognitive processing [23]. According to this model, the ventral pathway (known as the “what” stream) is an occipito-temporal network involved in identifying and recognizing visual stimuli, e.g., object recognition (visuo-perceptual abilities), whereas the dorsal pathway (the “where” stream) is an occipito-parietal network involved in processing of spatial information, e.g., object localization (visuo-spatial abilities). As the third component, visuo-construction (3) broadly summarizes the ability to construct a visually perceived or imagined stimulus [22]. Construction involves organizing and understanding spatial relations and requires additional cognitive resources such as action planning (executive functions) or recall (memory) and fine motor control. This domain is usually assessed by graphomotor tasks (drawing) or assembly tasks (arranging objects).
A further division of visuo-constructive abilities into two categories seems appropriate [18]: first, copy tasks, which involve the reconstruction of a visible stimulus (e.g., copy a geometrical object or assemble sticks in a presented arrangement); second, command tasks, which involve construction without visible input (e.g., draw a geometrical object from memory or assemble sticks in a pre-learned arrangement). This 3-component model of visuo-cognition summarizes findings that support a theoretical tripartite division; yet some degree of overlap between the three subdomains must be acknowledged.
Visuo-cognitive impairment in patients with Parkinson’s disease
Visual dysfunctions, e.g., in contrast sensitivity or color discrimination, can already be observed in an early stage of the disease and may be partly due to retinal dopamine deficiency [24]. Visual symptoms, e.g., double vision or misjudging objects, increase as the disease progresses to PDD [25]. Apart from basic visual dysfunctions, visuo-cognitive impairment becomes frequent in the course of the disease, and there is a large body of evidence reporting a significant decline of performance as the disease progresses [26, 27]. Visuo-cognitive symptoms may entail difficulties in everyday life situations (in the interplay with other cognitive or motor functions), e.g., patients report problems with navigating their environment, reading maps, or bumping into doorways [28]. Consequently, those difficulties contribute to a reduced quality of life (QoL) [29].
To shed light on the precise profile of visuo-cognitive impairment in PD, previously published reviews have examined patients’ performances on neuropsychological outcomes. An early literature review by Lazaruk [30] summarized consistent effects of impaired facial recognition and information extraction from embedded material. Shortly after, Waterfall and Crowe [31] conducted a meta-analysis defining 13 different categories of visuo-cognitive functions; patients performed significantly worse on three out of these 13 compared to healthy controls, i.e., the Rey-Osterrieth-Complex Figure Test (ROCF, copy trial) [32, 33] and Raven’s Progressive Matrices or Block Design from the Wechsler Adult Intelligence Scale-Revised (WAIS-R) [34]. Strikingly, the authors did not identify impaired performance on the widely used Judgment of Line Orientation test (JLO) [35], although more recent single studies consistently found impairments in this test [36, 37]. Additionally, recent investigations have revealed specific deficits, for example in visuo-spatial working memory (n-back test) [38], position discrimination (subtest of Visual Object and Space Perception battery, VOSP [39]) [40], and mental rotation (Mental Rotation Test) [41].
Visuo-cognition in PD-MCI and PDD
The predictive role of different visuo-cognitive functions, among other cognitive predictors, for a progression from PD-MCI to PDD is well-documented: for instance, Galtier et al. [42] identified performance in visuo-perceptual and visuo-spatial tasks as predictors, and Williams-Gray et al. [43] identified visuo-constructive abilities as a prognostic factor. Thus, early diagnosis of visuo-cognitive dysfunction seems suitable for identification of patients at risk for PDD and for therapeutic intervention as early as possible.
A theoretical frame to define subdomains of visuo-cognitive skills is largely lacking in the context of PD, and the choice of tests to assess dysfunction in this domain seems rather random. Notably, in the clinical diagnostic criteria for PDD of the Movement Disorder Society (MDS), a categorization is proposed including visual-spatial orientation, perception, and construction/praxis, and these are all covered by the label of “visuo-spatial functions” [44]. Further, in the associated recommendations for neuropsychological tests in Level II-diagnostics, this tripartite distinction is also applied [45], and tests to assess each of the categories are suggested. However, a theoretical basis for the categorization is not presented, and data on the question whether all these tests are suitable to the same degree to differentiate between PD-MCI and PDD are lacking.
Research challenges
Even though some decline of visuo-cognitive functioning from PD-MCI to PDD is well-established, the multidimensional nature of visuo-cognitive processes is rarely taken into consideration, leading to an only fragmentary assessment of this domain in empirical studies and clinical settings. Visuo-cognitive abilities are frequently assessed by means of a single neuropsychological test (e.g., JLO or pentagon copying). In cases where more than one test is used, reporting of composite scores derived from various tests measuring different visuo-cognitive processes is common. Moreover, the use of heterogeneous definitions, diverse labeling of domains (e.g., visual cognition vs. visuo-spatial abilities vs. visual processing skills), a wide range of neuropsychological tests, and varying assignments of tests to domains (e.g., clock copying as visuo-spatial test vs. clock copying as visuo-constructive test) may impact empirical findings. As a result, diagnostic accuracy might decrease.
Aim of the present review and meta-analyses
Even though visuo-cognitive impairments are common in PD-MCI, negatively impact QoL, and have a prognostic value for the conversion to PDD, to the best of our knowledge there is to date no systematic review and meta-analysis investigating visuo-cognitive abilities and their neuropsychological assessment in patients with PD-MCI and PDD. It remains to be answered which visuo-cognitive subdomain is most affected and which tests are best suited to discriminate between PD-MCI and PDD in order to increase diagnostic accuracy, identify patients at risk of PDD, and derive early interventions. Therefore, the aim of the present review and meta-analysis was to systematically investigate the following research questions with reference to our literature-based 3-component model of visuo-cognition: (1) which neuropsychological tests are used to assess visuo-cognition in PD-MCI and PDD in empirical studies, (2) which visuo-cognitive subdomain (visuo-perceptual, visuo-spatial, visuo-constructive abilities: copy and command tasks) discriminates best between PD-MCI and PDD, and (3) which test is the best within each subdomain to discriminate between PD-MCI and PDD.
MATERIAL AND METHODS
The present systematic review and meta-analysis is part of a larger project on visuo-cognition in PD. The project was preregistered in PROSPERO (https://www.crd.york.ac.uk/prospero, ID: CRD42018088244). The overall project consists of further sub-projects addressing visuo-cognitive test performance in patients with PD in other cognitive stages/categories. We chose the comparison PD-MCI versus PDD as the first sub-project to be published due to its clinical relevance, e.g., different drug indications for treating cognitive impairment in the two stages [46] and dementia as a surgical exclusion criterion for deep brain stimulation (DBS) [47]. The reporting in the present article follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline [48]. The PRISMA checklists for abstracts and systematic reviews are displayed in the Supplementary Material.
Search strategy
According to recent recommendations [49, 50], we conducted a systematic literature search in the following databases: MEDLINE (via Ovid), PsycINFO (via Ovid), Web of Science Core Collection (Science Citation Index), and CENTRAL. Databases were searched from January 2008 to July 2018 to identify relevant literature published. An update search was then conducted until 29 May 2020 in the same databases. Search strategies included combinations of free text words and MeSH terms and were adapted to the individual interface of each database. The search consisted of key words on the population (P) and the outcome of interest (O). Key words employed were “Parkinson’s disease” (P) combined with the following visuo-cognition-related terms (O): “cogn*” or “neuropsych*” or “visual*” or “visuo*” or “spatial” or “spatio” or “orientation” or “percept*” or “mental rotation” or “nonverbal” or “non-verbal” or “constructive” or “intelligence”. A broad search strategy was selected to minimize the risk of omitting relevant literature. Full search strings for each database are listed in the Supplementary Material. Reference lists of all identified trials were hand searched for further literature.
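As an illustration of how the population (P) and outcome (O) terms listed above combine, the following sketch assembles a generic Boolean string. It is not the Ovid or Web of Science syntax actually used (the full strings are in the Supplementary Material); the function name `build_query` and the plain AND/OR notation are assumptions made for this example.

```python
# Illustrative only: generic Boolean syntax, not the database-specific
# search strings reported in the Supplementary Material.
POPULATION = ["Parkinson's disease"]
OUTCOME = [
    "cogn*", "neuropsych*", "visual*", "visuo*", "spatial", "spatio",
    "orientation", "percept*", "mental rotation", "nonverbal",
    "non-verbal", "constructive", "intelligence",
]


def build_query(population, outcome):
    """Join terms within a block with OR, then join the two blocks with AND."""
    def block(terms):
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    return f"{block(population)} AND {block(outcome)}"


query = build_query(POPULATION, OUTCOME)
print(query)
```

Keeping the outcome block broad, as here, trades a larger screening load for a lower risk of missing relevant studies.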
Eligibility criteria
Original human studies of all study designs published in peer-reviewed journals in English or German language were eligible for inclusion. Further eligibility criteria were defined in terms of the population (P) and the outcome (O) of interest. We included studies with a total sample size of N≥50 patients with idiopathic PD diagnosed according to the UK Brain Bank Criteria (UKBBC) [51] of all age groups with and without cognitive impairment (P). We chose a total sample size of N≥50 to ensure sufficient statistical power. However, considering the focus of the present systematic review and meta-analysis, we included only the subgroup of publications reporting information about visuo-cognitive testing (tests used, scores) in patients with PD-MCI and PDD. Therefore, subsample sizes of patients with PD-MCI or PDD could consist of n≤50 participants. There were no restrictions with regard to diagnostic criteria used for PD-MCI and PDD. Studies were included when at least one neuropsychological test assessing visuo-cognition and a measure of global cognition were reported (O). The assessment of global cognition was defined as an inclusion criterion in the overall project in order to ensure a Level I characterization [52] of the patients’ cognitive status. In the overall process of study selection, n = 11 studies were not eligible for inclusion since the authors did not report the administration of a global cognitive measure (e.g., Mini-Mental State Examination (MMSE), Montreal Cognitive Assessment (MoCA)). We retrospectively reassessed these excluded studies to ensure that we did not miss a study relevant for the present analyses. None of them matched the scope of the present review.
Study selection and data extraction
Two reviewers assessed all titles and abstracts for inclusion according to predefined eligibility criteria using Covidence systematic review software [53]. Then, full texts of studies meeting the inclusion criteria were further reviewed for inclusion. In cases where full texts were unavailable, we contacted the authors and asked them to provide full text publications. In the title and abstract as well as full text screening, all identified studies were reviewed by HLJ and a second reviewer. The role of the second reviewer was shared by a team of four trained research assistants. In all cases where no consensus could be obtained between the two reviewers, the cases were discussed with a third reviewer until a final consensus was reached.
Data were independently extracted by two review authors (HLJ, LB) using a customized data extraction form. We only extracted raw scores of neuropsychological tests; age- and/or education-adjusted scores were not extracted. The two review authors compared the two extracted datasets for accuracy, and disagreements were resolved by discussion until consensus was reached. If required information and data were not available, the authors of the reviewed studies were contacted and asked to supply the data needed within a time frame of 2 weeks. Of the 23 contacted authors, 10 replied, and the requested data were received for 8 studies. Publications based on the same or overlapping sample with the same visuo-cognitive measure were further evaluated. In such cases, preference was given to the study with the largest sample size. In studies with multiple time points of testing (e.g., longitudinal studies), baseline data were extracted. Visuo-cognitive tests were classified according to the literature-based 3-component model of visuo-cognition outlined above with the subdomains (1) visuo-perceptual abilities, (2) visuo-spatial abilities, (3a) visuo-constructive abilities: copy tasks, and (3b) visuo-constructive abilities: command tasks. We derived a definition and exemplary tests for each subdomain from previous research. Tests used in the included studies were then classified independently into the matching subdomain by two review authors; classifications were compared and disagreements were resolved by consensus. Table 1 provides an overview of the assignment of each identified test to the specific visuo-cognitive subdomain. Furthermore, test descriptions are provided reflecting similarities and differences of tests.
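The rule for overlapping samples (keep the publication with the largest sample size for a given sample and visuo-cognitive measure) can be sketched as follows. The record fields (`cohort`, `test`, `n`) and all values are hypothetical; in practice, deciding whether two publications share a sample requires manual judgment that this sketch does not capture.

```python
def keep_largest_per_sample(publications):
    """For each (cohort, test) pair keep the publication with the largest n."""
    best = {}
    for pub in publications:
        key = (pub["cohort"], pub["test"])
        if key not in best or pub["n"] > best[key]["n"]:
            best[key] = pub
    return list(best.values())


# Hypothetical records: cohort labels and sample sizes are invented.
publications = [
    {"id": "A2015", "cohort": "cohort-1", "test": "JLO", "n": 80},
    {"id": "A2017", "cohort": "cohort-1", "test": "JLO", "n": 120},
    {"id": "B2016", "cohort": "cohort-2", "test": "CDT", "n": 60},
]

kept = keep_largest_per_sample(publications)
print([p["id"] for p in kept])
```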
Table 1. Overview with descriptions of identified visuo-cognitive tests
Risk of bias assessment
Risk of bias was examined using a customized quality assessment tool based on the Newcastle-Ottawa Quality Assessment Scale for Case Control Studies [54], the Downs and Black checklist for the assessment of methodological quality [55], and Cochrane’s assessment tool for non-randomized studies of interventions (ROBINS-I) [56]. Quality judgements were made across three domains: sample selection, comparability of groups, and reporting of methods and results. The risk of bias was rated as low, medium, or high for each domain. Based on the rating of the three domains, an overall risk of bias was rated for each study. A detailed description of the rating system is provided in the Supplementary Material. The assessment was conducted independently by two reviewers (HLJ, LB). Cases of disagreement were discussed until a consensus was reached.
Data synthesis and statistical analyses
The meta-analyses were conducted using RStudio [57]. We conducted one overall analysis including all studies that provided data on group differences between patients with PD-MCI and patients with PDD in all investigated visuo-cognitive domains (visuo-perceptual abilities, visuo-spatial abilities, visuo-construction: copy tasks, visuo-construction: command tasks) to identify the most sensitive domain for the differentiation between the two investigated groups. Furthermore, we conducted subgroup analyses of each of the four domains to identify the most sensitive neuropsychological test within each domain. All analyses included all studies regardless of their risk of bias rating. Some studies assessed the same visuo-cognitive subdomain with several different tests. In such cases, the study was entered into the meta-analysis once per test. Therefore, the same study sample was included multiple times in the same analysis, so that the total number of test results exceeds the total number of participants in the analysis.
We conducted sensitivity analyses including only studies with a low overall risk of bias rating (for results, see the Supplementary Material). Furthermore, we conducted additional sensitivity analyses for each visuo-cognitive domain by using a meta-analytic methodology that allowed us to integrate and control for data of multiple tests performed by the same study population (cf. Chapter 4 in Schwarzer et al. [58]). Results are also displayed in the Supplementary Material.
The statistics used for the meta-analyses were the mean test score (raw score), the standard deviation, and the number of evaluated participants for each group. The alpha level was set at 0.05 for all analyses. Random effects models were calculated using standardized mean differences (SMDs) as effect sizes. Effects from 0.2 to under 0.5 were regarded as small, effects from 0.5 to under 0.8 as medium, and effects of 0.8 or higher as large [59]. We used the 95% confidence interval (CI) as a measure of uncertainty. The I² statistic was used to quantify heterogeneity across the included studies. As recommended in the Cochrane Handbook for Systematic Reviews of Interventions [50], we interpreted heterogeneity as follows: 0% to 40%: not important/low heterogeneity; 30% to 60%: moderate heterogeneity; 50% to 90%: substantial heterogeneity; 75% to 100%: considerable heterogeneity.
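To make the computation concrete, the following sketch derives an SMD from group means, standard deviations, and sample sizes, pools several SMDs in a random effects model, and reports I². It assumes Hedges' g as the SMD and the DerSimonian-Laird estimator for the between-study variance; the text above does not state which variants were used, so this is an illustration and not a reproduction of the actual analysis. All function names are chosen for this example.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance of g) for two independent groups (1 minus 2)."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d


def random_effects(effects, variances):
    """DerSimonian-Laird pooling: pooled effect, 95% CI, and I^2 in percent."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


def effect_label(smd):
    """Thresholds as stated in the text (small, medium, large)."""
    size = abs(smd)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical raw scores (mean, SD, n) for PD-MCI vs. PDD in three studies.
studies = [
    (24.0, 4.0, 40, 20.0, 5.0, 25),
    (10.5, 2.0, 60, 8.0, 2.5, 30),
    (30.0, 6.0, 35, 26.0, 7.0, 20),
]
results = [hedges_g(*s) for s in studies]
pooled, ci, i2 = random_effects([g for g, _ in results],
                                [v for _, v in results])
print(round(pooled, 2), [round(x, 2) for x in ci], round(i2, 1),
      effect_label(pooled))
```

With PD-MCI entered as the first group, a positive SMD indicates better performance in PD-MCI than in PDD, provided that higher raw scores mean better performance on the test in question.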
To identify possible publication bias, funnel plots were calculated for those meta-analyses including five or more studies.
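A funnel plot places each effect size against its standard error and compares the scatter with pseudo 95% limits around the pooled estimate. The sketch below computes these coordinates and applies the five-study minimum described above; the function name, the return format, and the example values are assumptions for illustration, and no formal asymmetry test is implied.

```python
def funnel_points(effects, standard_errors, pooled, min_studies=5):
    """Return (effect, se, inside_pseudo_95_limits) per study, or None
    when fewer than `min_studies` effects are available."""
    if len(effects) < min_studies:
        return None
    points = []
    for effect, se in zip(effects, standard_errors):
        lower = pooled - 1.96 * se
        upper = pooled + 1.96 * se
        points.append((effect, se, lower <= effect <= upper))
    return points


# Hypothetical SMDs and standard errors around a pooled estimate of 1.0.
points = funnel_points([1.0, 1.5, 0.9, 1.1, 1.0],
                       [0.10, 0.10, 0.20, 0.30, 0.05],
                       pooled=1.0)
for effect, se, inside in points:
    print(effect, se, "inside" if inside else "outside")
```

Visual symmetry of the points around the pooled estimate is then judged by eye, which is why funnel plots with few studies are hard to interpret.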
RESULTS
Search results
We identified a total of 36,280 studies through our systematic literature search for initial screening. After removing 12,944 duplicates, 23,336 articles remained for title and abstract screening. After screening titles and abstracts, 4,173 articles were reviewed in the full text screening. A total of 524 studies met the eligibility criteria and were considered for inclusion. We identified 55 studies comparing visuo-cognitive performance in patients with PD-MCI to PDD. In this pool of n = 55 studies, multiple publications were based on the same or overlapping sample. According to our predefined eligibility criteria, we only included the publication with the largest sample size. Therefore, we further excluded 22 studies whose sample was smaller than the same or overlapping sample reported in another included publication. This left n = 33 articles eligible for inclusion in the systematic review. One study reported questionable and potentially incorrect sociodemographic as well as neuropsychological data and was therefore excluded from meta-analytic examinations [60]. Thus, sociodemographic and neuropsychological data of this study are described as missing data in the following sections. Among the remaining 32 publications, relevant data of visuo-cognitive measures were not reported in 13 articles, leading to a total of n = 19 studies included in the main meta-analysis. The study selection process is illustrated in Fig. 1.
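The counts reported above can be checked with simple arithmetic. The sketch below recomputes each stage from the numbers in the text; the stage labels are paraphrased, and the consistency check is an addition for illustration.

```python
# Counts taken from the text; stage labels are paraphrased.
flow = [
    ("records identified", 36_280),
    ("title and abstract screening after duplicate removal", 36_280 - 12_944),
    ("full texts reviewed", 4_173),
    ("met eligibility criteria", 524),
    ("compared PD-MCI with PDD", 55),
    ("systematic review after removing overlapping samples", 55 - 22),
    ("after excluding one study with questionable data", 33 - 1),
    ("main meta-analysis (usable visuo-cognitive data)", 32 - 13),
]


def is_non_increasing(stages):
    """Each stage must retain no more records than the stage before it."""
    return all(later <= earlier
               for (_, earlier), (_, later) in zip(stages, stages[1:]))


for name, count in flow:
    print(f"{count:>6,}  {name}")
print("consistent:", is_non_increasing(flow))
```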

Fig. 1. PRISMA diagram showing the study selection process.
Descriptive characteristics of included studies
Study characteristics of all included studies are depicted in Table 2. The eligible 33 studies provided a pool of N = 3,494 patients with PD, of whom n = 2,239 were diagnosed with PD-MCI and n = 1,255 with PDD. Sample sizes ranged from n = 12 [61] to 353 [62] in the PD-MCI group and from n = 9 [63] to 193 [64] in the PDD group. In most included studies, subsample sizes were larger in the PD-MCI group than in the PDD group. Four studies [65–68] included more participants with PDD than PD-MCI, and one study had an equal number of patients in the two groups [69]. The sample across studies varied demographically in terms of sex, age, years of formal education, disease duration, and disease severity as follows: in the PD-MCI group, 18 of 33 studies [61–63, 68–81] reported a sex distribution with more male than female participants, with the proportion of male participants ranging from 57% [63] to 83% [61, 79]. Six studies reported an almost equal number of female and male participants in the PD-MCI group [60, 82–84]. In the PDD group, 15 studies included more male than female participants [61–64, 81–84], with the proportion of male participants ranging from 57% [81] to 91% [62]. Further, ten studies reported an almost equal sex ratio in the PDD group [65, 85–87]. Four studies did not report sex distribution among participants in both groups [67, 88–90]. In the PD-MCI group, age ranged from 59.2 [90] to 75.7 years [61] and years of formal education from 6.8 [80] to 16.4 [74]. Patients in the PDD groups were 61.4 [84] to 78.7 [79] years old with formal education ranging from 5.4 [91] to 16.3 years [74]. Information about mean ages was unavailable in two studies [60, 64]. Mean years of formal education were not reported in five studies [60, 90]. One study provided information about education in percentages [64]. Disease duration was reported either in months (n = 4) or years (n = 23) with missing data in six trials [60, 88].
Table 2. Study, participant, and outcome characteristics of the included studies
Note: n.a., not available (test was conducted in study, but result not reported); empty table cell, no test was conducted; BVDT, Benton Visual Form Discrimination Test; CAMCOG, Cambridge Cognition Examination; CANTAB, Cambridge Neuropsychological Test Automated Battery; CDT, Clock Drawing Test; CERAD, Consortium to Establish a Registry for Alzheimer’s Disease; DRS 2, Dementia Rating Scale; FMT, Figural Memory Test; H&Y, Hoehn & Yahr; JLO, Judgment of Line Orientation Test; LPS, Leistungspruefsystem; ROCF, Rey-Osterrieth-Complex Figure; MMSE, Mini Mental State Examination; MoCA, Montreal Cognitive Assessment; PANDA, Parkinson Neuropsychometric Dementia Assessment; PD-CRS, The Parkinson’s Disease – Cognitive Rating Scale; VOSP, Visual Object and Space Perception Battery; WISC-R, Wechsler Intelligence Scale for Children – Revised; WMS III, Wechsler Memory Scale III.
In the group of patients with PD-MCI, disease duration ranged from 1.3 [66] to 10.7 years [77] and in the PDD group from 2.4 [91] to 13.8 years [73], showing, as expected, a longer disease duration in PDD patients due to the progressive nature of cognitive impairments in PD. Disease severity was assessed using UPDRS-III (motor examination) and/or Hoehn and Yahr scale. Information about patients’ disease severity was unavailable in three studies [60, 88]. UPDRS-III scores ranged from 15.2 [83] to 44.5 [81] in the PD-MCI group and from 22.5 [66] to 61.0 [81] in patients with PDD. Hoehn and Yahr stages were reported as means and standard deviations in 18 included publications [63, 89–92] with scores varying between 1.6 [63] and 2.9 [65] in persons with PD-MCI and 2.0 [63] to 3.7 [81] in PDD. Hoehn and Yahr stages were expressed as median and range in 3 studies [61, 73] and in percentages in 2 studies [64, 77]. The presence of hallucinations was evaluated in 5 studies [64, 90] with patients with PD-MCI showing a greater frequency of hallucinations than patients with PDD. Some studies defined the presence of hallucinations as an exclusion criterion for study participation. Participants’ global cognitive status was assessed by the MMSE in 26 studies [60, 90–92] with unreported scores in 5 studies [60, 92]. Average MMSE scores ranged from 19.27 [90] to 28.4 [84] in the PD-MCI group and from 14.04 [81] to 24.81 [71] in the PDD group. Other global cognitive screening tests (e.g., MoCA, DRS) were used in the remaining 7 included publications.
Risk of bias assessment
Results of the risk of bias assessment are depicted in Table 3. The overall quality of the majority of studies was moderate to high. However, methodological weaknesses could be observed in the domain sample selection which also included the rating of the diagnostic elaboration of the patients’ cognitive impairment. The quality of the comparability of groups (e.g., comparable levels of education, same method of data collection for the two groups) as well as the reporting of methods and results was moderate to high in the majority of studies.
Table 3. Results of the risk of bias assessment
Customized risk of bias assessment based on the Newcastle-Ottawa Quality Assessment Scale for Case Control Studies [54], the Downs and Black checklist for the assessment of methodological quality [55], and the Risk Of Bias In Non-randomized Studies – of Interventions assessment tool (ROBINS-I, Cochrane Collaboration) [56] across three domains (sample selection, comparability of groups, reporting of methods and results) plus the total risk of bias. Red color indicates a high risk of bias, yellow color indicates a moderate risk of bias, green color indicates a low risk of bias.
Overall meta-analysis across all visuo-cognitive domains
In total, 19 studies were included in the meta-analytic examination (N = 2,474; PD-MCI: n = 1,599; PDD: n = 875). As described above, some studies examined patients’ performance on more than one visuo-cognitive test. Therefore, n = 14 studies were included multiple times (according to the number of tests) in the overall analysis. As a consequence, the total number of test results exceeds the above stated total number of participants. In total, 5,411 test results were analyzed in the overall meta-analysis: 3,508 test results of patients with PD-MCI and 1,903 test results of patients with PDD (see Fig. 2). The overall effect size, the standardized mean difference (SMD), was 1.03 in our random effects model (95% CI: 0.94; 1.12), indicating a large effect. The overall heterogeneity was I² = 46%, which can be interpreted as moderate. The results show that, across all investigated subdomains, the patients with PD-MCI performed better than the patients with PDD.
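For readers who want to relate the reported interval to a standard error, the following sketch back-calculates it from the pooled SMD and its 95% CI. It assumes a symmetric, normal-approximation interval (critical value 1.96), which the text does not confirm; the function name is chosen for this example.

```python
Z_95 = 1.96  # assumed critical value for a normal-approximation 95% CI


def se_from_ci(lower, upper, z=Z_95):
    """Standard error implied by a symmetric confidence interval."""
    return (upper - lower) / (2 * z)


smd, lower, upper = 1.03, 0.94, 1.12   # values reported in the text
se = se_from_ci(lower, upper)
z_value = smd / se

# Average number of test results contributed per participant.
results_per_participant = 5_411 / 2_474

print(round(se, 3), round(z_value, 1), round(results_per_participant, 2))
```

Under this assumption the implied standard error is roughly 0.046. Because the same participants contribute on average more than two test results, the interval treats correlated results as independent, which is the issue the additional sensitivity analyses in the Supplementary Material are meant to address.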

Fig. 2. Forest plot of meta-analysis over all visuo-cognitive subdomains.
When focusing on the four different visuo-cognitive subdomains, the effect sizes were large for all domains except for the visuo-spatial domain, where we identified a medium effect size (SMD = 0.68, 95% CI: 0.50; 0.87). Further, command tasks of the visuo-constructive domain showed the largest effect (SMD = 1.09, 95% CI: 0.97; 1.22) of all subdomains.
The heterogeneity was low in all domains, except in the visuo-constructive domain (copy tasks: I2=44% & command tasks: I2= 31%), where the heterogeneity was moderate. The results show that all measurements within each visuo-cognitive domain, except in the visuo-constructive domain, measure fairly homogeneous constructs. No evidence for publication bias was found; the funnel plot appears reasonably symmetrical (see the Supplementary Material).
When including only studies that were rated as having a high quality (i.e., with a low risk of bias) in the meta-analysis, the direction of effects was similar. However, the tendency of visuo-constructive command tasks to be the subdomain with the largest effect was even stronger (see the Supplementary Material).
Visuo-perceptual abilities
According to our definition, visuo-perceptual abilities were examined in 9 of 33 studies with 7 different neuropsychological tests [63, 92]. The recognition trial of the ROCF was used in three studies [66, 87] and the subtest Incomplete letters of the VOSP in two studies [70, 92]. All other visuo-perceptual measures were employed once (Table 2).
In the visuo-perceptual domain, 5 studies were integrated in the meta-analysis including 5 different measures (see Fig. 3). In total, 263 test results of patients with PD-MCI and 151 test results of patients with PDD were integrated in the model. The overall random effects model shows an effect size of SMD = 1.04 (95% CI: 0.81; 1.27), again indicating a large effect [59]. The overall heterogeneity was low, I² = 12%. The recognition trial of the ROCF was the only test that was used by two different studies. The effect size was SMD = 1.0 (95% CI: 0.68; 1.32) and there was no statistical heterogeneity, I² = 0%. The effect size of the performance on the Benton Visual Form Discrimination Test was comparable to ROCF performance (SMD = 1.0, 95% CI: 0.18; 1.83); however, this test was only used in one study [74]. Goldman et al. [73] used the subtest Figure Learning of the Figural Memory Test with the highest effect size (SMD = 1.59, 95% CI: 1.04; 2.15) within the visuo-perceptual domain. The funnel plot shows a slightly unequal distribution of the SMDs, with more measures showing an effect smaller than 1.
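For readers who want to see how a single-test SMD and its confidence interval of the kind reported above can be obtained from group summary statistics, the following is a minimal sketch. It assumes the bias-corrected SMD (Hedges' g) with a normal-approximation 95% CI, and it assumes that the labels "medium" and "large" follow Cohen's conventional benchmarks (0.5 and 0.8); the text does not state either choice explicitly. The function names and input values are hypothetical.

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (group A minus group B) with 95% CI."""
    df = n_a + n_b - 2
    # Pooled standard deviation of the two groups
    pooled_sd = math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df)
    d = (mean_a - mean_b) / pooled_sd
    # Small-sample correction factor
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    g = j * d
    # Approximate sampling variance of d, then of g
    var_d = (n_a + n_b) / (n_a * n_b) + d ** 2 / (2.0 * (n_a + n_b))
    var_g = j ** 2 * var_d
    se = math.sqrt(var_g)
    return g, var_g, (g - 1.96 * se, g + 1.96 * se)


def cohen_label(smd):
    """Verbal label under Cohen's conventional benchmarks (an assumption)."""
    size = abs(smd)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


if __name__ == "__main__":
    # Hypothetical test scores: PD-MCI group (A) vs. PDD group (B).
    g, var_g, ci = hedges_g(30.0, 4.0, 40, 26.0, 4.0, 25)
    print(round(g, 3), [round(x, 3) for x in ci], cohen_label(g))
```

The sampling variance returned here is the quantity that a pooling model needs as input alongside the effect size itself.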

Forest plot of visuo-perceptual tests.
The results are supported by an additional meta-analysis including only studies with a low risk of bias (see the Supplementary Material).
Visuo-spatial abilities
Visuo-spatial abilities were assessed in 11 studies with 6 different neuropsychological tests [62, 92]. The JLO was the most frequently used test in the visuo-spatial domain. It was administered in a total of 6 publications, with 4 studies using the long version [67, 72] and 2 studies using short versions of the test [62, 73]; however, none of the studies specified the exact short version. Several short forms of the original 30-item version have been developed, varying in the total number of items and psychometric properties [93, 94]. The second most frequently used visuo-spatial measure was the Corsi Block-Tapping Test (forward condition), which was utilized in 3 studies [72, 92]. All other visuo-spatial tests were each used in one study.
Five studies were integrated in the visuo-spatial domain, contributing 6 measures that covered two different tests (see Fig. 4). In total, 567 test results of patients with PD-MCI and 249 test results of patients with PDD were integrated in the model. The overall random effects model shows an effect size of SMD = 0.68 (95% CI: 0.50; 0.87), indicating a medium effect [59]. The overall heterogeneity was low, I² = 15%. Three different versions of Benton’s JLO were used in five studies. Only one study used the Corsi Block-Tapping Test, which showed the largest effect within the visuo-spatial domain (SMD = 1.03, 95% CI: 0.33; 1.74). The funnel plot does not indicate asymmetry.

Forest plot of visuo-spatial tests.
An additional meta-analysis restricted to studies with a low risk of bias yielded a different result: the effect size of the overall model increased, and the long version of Benton’s JLO was the test with the largest effect size (see the Supplementary Material).
Visuo-constructive abilities: Copy tasks
Copy tasks as part of visuo-constructive abilities were assessed in 26 of 33 studies with 13 different neuropsychological tests [60, 89–92]. An overview of included tests is provided in Table 2. The copy trial of the ROCF was the most frequently used test in a total of 11 publications [66, 92]. Other frequently used copy tasks were pentagon copying of the MMSE (5 of 26 studies), different versions of clock copy test (4 of 26 studies), and the copy figures subtest of the CERAD-Plus (copy of four geometric figures) (3 of 26 studies).
The quantitative analysis of visuo-constructive copy tasks includes 16 studies, 14 measures and seven different tests (see Fig. 5). In total, 1,295 test results of patients with PD-MCI and 744 test results of patients with PDD were integrated in the model. The overall random effects model shows an effect size of SMD = 1.06 (95% CI: 0.91; 1.20), indicating a large effect [59]. The overall heterogeneity was moderate, I² = 44%. Six studies used the copy trial of the ROCF, showing an overall effect size of SMD = 1.27 (95% CI: 0.98; 1.55), indicating a large effect. This test seems to be the most appropriate for assessing copy abilities in the visuo-constructive domain, meaning that it differentiates best between patients with PD-MCI and PDD. Three studies used the copy figures subtest of the CERAD-Plus, also showing a large effect, SMD = 1.07 (95% CI: 0.87; 1.28). This copy figures subtest shows no statistical heterogeneity, I² = 0%. There were two studies measuring performance on clock copying and a further two studies applying pentagon copying of the MMSE; however, different versions and/or scoring systems were used. No evidence for publication bias was found; the funnel plot appears reasonably symmetrical.

Forest plot of visuo-constructive copy tasks.
Again, results are supported by a further meta-analysis with low risk of bias studies. The effect of the ROCF copy trial as the most appropriate test within this subdomain even increased (see the Supplementary Material).
Visuo-constructive abilities: Command tasks
Command tasks were used in 27 of 33 studies to assess visuo-constructive abilities, with 17 different tests [60, 79–92]. Clock drawing was the most frequently applied command task, used in 17 of 27 studies. However, a total of 9 CDT versions with different scoring systems were used. The CDT version was not specified in 4 studies [60, 83]. The recall trial of the ROCF was the second most frequently used test and was applied in a total of 11 studies, with 6 studies employing the immediate recall trial [66, 91] and 9 publications the delayed recall trial [66, 92]. Sobreira et al. [83] did not specify the ROCF recall interval.
The quantitative analysis of visuo-constructive command tasks includes 14 studies, 20 measures and four different tests (see Fig. 6). In total, 1,383 test results of patients with PD-MCI and 759 test results of patients with PDD were integrated in the model. The overall random effects model shows an effect size of SMD = 1.09 (95% CI: 0.97; 1.22), indicating a large effect [59]. There was moderate heterogeneity, I² = 31%. Six studies used the delayed recall trial of the ROCF, showing an overall effect size of SMD = 1.05 (95% CI: 0.82; 1.28), indicating a large effect. There were nine studies using 7 versions of the CDT. One study [68] examined an unspecified 10-point scoring version of the CDT, yielding the largest effect size within this subdomain (SMD = 1.54, 95% CI: 1.22; 1.85). Again, no evidence for publication bias was found; the funnel plot appears reasonably symmetrical.

Forest plot of visuo-constructive command tasks.
In the meta-analysis restricted to high-quality studies, the effect of the delayed recall trial of the ROCF increased further. The analysis suggests that different versions of the CDT and the delayed recall trial of the ROCF yield comparably large effect sizes.
DISCUSSION
The aim of the present systematic review and meta-analysis was to shed light on visuo-cognitive impairment and its neuropsychological assessment in patients with PD-MCI and PDD. Based on our eligibility criteria, 33 studies were identified for the systematic review. Of those, 19 studies could be included in the meta-analysis. Our main results are the following: (1) Regarding our first question which neuropsychological tests are used to assess visuo-cognition in PD-MCI and PDD, we found a high heterogeneity regarding visuo-cognitive tests used and regarding conducted test versions or scoring systems. (2) Regarding our second question which visuo-cognitive subdomain discriminates the best between PD-MCI and PDD and referring to our 3-component model of visuo-cognition, data indicate that visuo-constructive command tasks are best suited to discriminate between PD-MCI and PDD as they yielded the largest effect size. (3) Regarding our third question which are the best tests to discriminate between the two stages, we (carefully) suggest the following tests based on our meta-analyses: ROCF recognition trial to assess visuo-perceptual abilities, Corsi Block-Tapping Test or JLO to assess visuo-spatial abilities, ROCF copy trial to measure visuo-constructive copy abilities, and ROCF delayed recall trial or CDT to assess visuo-constructive command tasks.
Sociodemographic and clinical sample characteristics
It is well established that demographic (e.g., older age, lower education) and clinical characteristics (e.g., longer disease duration, hallucinations) negatively impact cognitive functioning in PD and predict the progression to PDD [95, 96]. Our review indicates that the included studies are highly heterogeneous with regard to sample characteristics in terms of age, years of formal education, and disease severity. Moreover, sex distribution was unequal in the majority of studies, with more male than female participants, matching the higher prevalence of PD in men. Strikingly, some studies did not report some of these sociodemographic characteristics. Only a minority of studies evaluated the presence of hallucinations. Visual hallucinations are associated with poorer visuo-cognitive test performance [97] and may therefore confound results. Their impact should be the subject of future research. The present review underlines the need for more comparable groups of patients and for adequate reporting of sample characteristics in order to generate reliable and more consistent results and to gain a better understanding of cognitive changes from PD-MCI to PDD.
Which neuropsychological tests are used to assess visuo-cognition?
In order to systematically overview visuo-cognitive assessment used in the included studies, we derived a 3-component model of visuo-cognitive abilities (visuo-perceptual, visuo-spatial, and visuo-constructive abilities with copy and command tasks) with exemplary tests from literature. Based on derived definitions, we assigned visuo-cognitive tests into subdomains (Table 1).
The results of the present review show that the choice of tests was highly heterogeneous between studies in all subdomains. Visuo-perceptual abilities were the subdomain assessed least often in the included studies, with the recognition trial of the ROCF as the most frequently used test. In the visuo-spatial domain, the JLO was used in the majority of studies; however, the choice of test version (short vs. long) varied between studies, with some studies not even specifying the exact version used. Further, our review revealed that the visuo-constructive domain was assessed most often in the included studies. Visuo-constructive copy abilities were mainly assessed by the ROCF copy trial. Pentagon copying of the MMSE and clock copying were also identified as frequent tasks; the latter, however, was assessed with a variety of test versions. Focusing on visuo-constructive command tasks, the CDT was the test most frequently applied, though with a variety of scoring systems. CDT versions mainly differ in administration (e.g., position of clock hands to be drawn, drawing freely on a blank sheet vs. drawing in a pre-printed clock face) and in the scoring system (e.g., elements to be scored, number of total points). Even though the test versions share commonalities, they seem to be associated with different grey matter regions and might therefore reflect different areas of brain damage [98]. Due to those differences, we did not collapse CDT data but entered them separately in our systematic review and meta-analysis. The scoring systems suggested by Sunderland et al. [99] and Manos and Wu [100] were the only versions used twice each; all other versions were used in only one study.
Which visuo-cognitive subdomain discriminates best between PD-MCI and PDD?
In line with previous research, our meta-analytic results support visuo-cognitive deterioration and provide evidence that patients with PDD have greater impairments in all above mentioned visuo-cognitive subdomains compared to patients with PD-MCI. Therefore, visuo-cognition seems to be a relevant domain in the characterization of cognitive changes from PD-MCI to PDD. However, since we could only enter 19 of 33 studies in our meta-analyses due to unreported results, we would like to underline the importance of transparent reporting of raw test scores (e.g., in the Supplementary Material) in future studies even if visuo-cognitive outcomes are assessed as secondary outcomes.
Regarding the three subdomains of visuo-cognition, results of the present overall meta-analysis indicate that visuo-constructive command tasks are most suitable to differentiate visuo-cognitive changes between PD-MCI and PDD. That is, visuo-constructive command tasks yielded the largest effect size in our meta-analysis and therefore show the greatest dynamic range across the spectrum of cognitive impairment from PD-MCI to PDD. However, it should be kept in mind that, clinically, the key aspect for differentiating PD-MCI from PDD is dysfunction in activities of daily living. Visuo-constructive command tasks require an interplay between fine motor skills and visuo-perceptual, visuo-spatial, and further cognitive abilities to draw or assemble an object without a visual model. Therefore, those tasks are more complex and go far beyond visuo-perceptual and visuo-spatial abilities; command tasks reveal information about a person’s ability to recall an internal representation of a target object from memory, maintain the internal representation, and then construct it as a whole with all component elements in their correct spatial relation without omitting parts [18]. Due to the complexity of those tasks, various factors impact performance. On the one hand, impairments in fine hand motor skills (e.g., tremor) increase in the course of disease in PD and are more pronounced and frequent in PDD than in PD-MCI [101]. Even if impaired fine motor skill performance usually does not affect scoring of visuo-constructive tasks, it might negatively impact a person’s motivation and patience to complete a task. As patients with PDD might suffer from more severe motor symptoms, completing a visuo-constructive command task might be more challenging for them than for patients with PD-MCI. 
On the other hand, visuo-constructive command tasks also rely on memory and fronto-striatal functions such as executive functions and working memory components (maintenance, updating), which are impaired early and frequently in PD [12]. Presumably, this interplay of early and frequently impaired functions might explain the diagnostic superiority of this subdomain in the present review. On the basis of the aforementioned aspects, we carefully suggest visuo-constructive command tasks as the optimal measure to capture visuo-cognitive deterioration from PD-MCI to PDD, although possible confounders (motor and cognitive variables) have to be considered.
Which test is the best within each subdomain to discriminate between PD-MCI and PDD?
A research question that has remained open in Level II testing of PDD is whether all tests within a visuo-cognitive subdomain are equally suitable to differentiate between PD-MCI and PDD. Within the visuo-perceptual domain, we identified the subtest Figure Learning of the FMT as the measure discriminating best between PD-MCI and PDD. However, this result should be interpreted with caution due to the small number of studies included in this meta-analysis. The present results underline the need for more elaborate research on visuo-perceptual abilities and suitable tests in PD-MCI and PDD. Heterogeneity could also be observed in the assessment of visuo-spatial abilities. The Corsi Block-Tapping Test yielded the strongest effect size. Notably, the effect changed in favor of the JLO long version when only high-quality studies were considered in our sensitivity analysis. Again, we suggest a careful interpretation of results due to the small number of included studies in the main meta-analysis and the sensitivity analysis. The JLO is a commonly used measure with high reliability in PD [102]. Furthermore, short versions of this test have yielded reliability and internal consistency comparable to the long version, especially a 20-item version suggested by Winegarden et al. [36, 94]. Against this psychometric background, we endorse the use of short versions, given their efficiency and patients’ increasing fatigue during testing. As some JLO short versions were unspecified in the included studies, we would like to encourage researchers to specify exact short versions to ensure transparent reporting and to enable replication of empirical studies.
Within the visuo-constructive domain, the ROCF was not only identified as most frequently used visuo-constructive copy measure but also as copy task most suitable to differentiate between PD-MCI and PDD. When comparing the ROCF copy trial to other copy tasks, differences in stimulus size, administration time, and difficulty can be observed. In general, the ROCF is a complex figure consisting of 18 abstract geometrical shapes requiring a mean time to copy of 199.3 s (3.32 min) in healthy adults [103]. In contrast, other identified copy tasks involve only one or two simple shapes to copy at a time (e.g., MMSE: two intersecting pentagons or CERAD: successively, circle, rhombus, two rectangles, cube) resulting in shorter administration times. For these methodological reasons, ROCF might be a more complex and challenging copy task leading to better diagnostic discrimination between PD-MCI and PDD. At the same time, administration and scoring of this test are time-consuming and time is frequently limited in clinical practice. In addition, the intense administration time might lead to exhaustion in patients. Beside its diagnostic value, characteristics of the setting (e.g., time resources) and the patient (e.g., well-being, fatigue) should be considered when deciding whether to conduct the ROCF or not.
When analyzing visuo-constructive command measures, different CDT versions (unspecified 10-point CDT version, Goodglass and Kaplan [104] and Rouleau et al. [105]) yielded the largest effects, though each effect was based on a single result. Further, the ROCF immediate and delayed recall trials were used in multiple studies and also revealed large effects in the meta-analysis. Even though we assigned CDT and ROCF to the same visuo-cognitive subdomain, the two tests differ in some fundamental aspects. Drawing a clock requires the recall of an everyday item by referring to clock-related concepts acquired over the life span. This involves knowledge of clock semantics as part of language functions to understand instructions and to complete the task [106]. As outlined above, the ROCF is a complex geometrical stimulus without reference to a well-known object. In the context of visuo-cognition, it is advantageous that the ROCF can be considered a ‘purer’ visuo-constructive command measure due to its minimized verbal impact and lower reliance on recall from semantic memory. Bearing this conceptual difference in mind, we suggest administering either CDT or ROCF to cover visuo-constructive drawing-to-command abilities.
Strengths and limitations
To the best of our knowledge, this is the first review and meta-analysis systematically investigating visuo-cognition in patients with PD-MCI and PDD. Strengths include the applied methods: a review question based on the P(IC)O-system, an elaborate systematic literature search in multiple databases yielding a large number of studies to assess, a customized data extraction form with individualized risk of bias assessment based on well-established forms, the inclusion of well-characterized patient samples with regard to motor and cognitive parameters, and reporting according to PRISMA guidelines. Further, we conducted meta-analyses with two sorts of sensitivity analyses.
Limitations must be considered when interpreting the results. Publications were only eligible for inclusion in this review when the total sample size was N≥50 to avoid undue influence of small studies, which in general are less precise and more subject to publication bias, especially since the random effects model gives more weight to small studies than the fixed effects model. It follows that we omitted studies with relevant target outcomes but smaller sample sizes. Furthermore, we defined this eligibility criterion with regard to the size of the total sample and not to the subsamples. This means that we included a study when the total number of patients with PD (e.g., PD-NC, PD-MCI, and PDD) was ≥50 even though the subgroup sizes of PD-MCI and PDD were smaller. A further limitation regarding the included samples must be considered. The diagnosis of cognitive impairment was established using neuropsychological Level I or Level II diagnostics; biomarkers were not considered in most of the studies. Even though neuropsychological tests are sensitive in detecting cognitive impairment, they do not provide information about the impairment’s actual cause. As patients with idiopathic PD may develop an independent coexisting neurodegenerative disease, such as Alzheimer’s disease [107], they may be incorrectly classified as PDD when biomarkers are not considered. Therefore, we cannot ensure that all analyzed patients actually suffered from PDD. A further diagnostic issue concerns the continuum between PDD and DLB. As the two dementia syndromes are characterized by overlapping motor, cognitive, and neuropsychiatric characteristics, it remains uncertain whether all patients were correctly classified as PDD. Another methodological limitation is the restriction of publication date. Due to the increasing awareness of NMS in PD within the last decade, we decided to include only studies published within the last ten years. We then conducted an updated search two years later. 
Therefore, we assessed studies within a time frame of 12 years which should provide a thorough overview of the relevant literature. However, it is possible that we omitted relevant studies impacting our results.
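The remark in the limitations above that a random effects model gives relatively more weight to small studies than a fixed effects model can be illustrated numerically. The sketch below assumes standard inverse-variance weighting, in which the between-study variance tau² is added to each study's sampling variance; the function name and the variances are hypothetical and not taken from the included studies.

```python
def relative_weights(variances, tau2=0.0):
    """Normalized inverse-variance weights.

    tau2 = 0 corresponds to a fixed effects model; tau2 > 0 to a random
    effects model with that between-study variance.
    """
    raw = [1.0 / (v + tau2) for v in variances]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    # Hypothetical sampling variances: a small study and a large study.
    variances = [0.20, 0.02]
    fixed = relative_weights(variances)              # fixed effects weights
    random = relative_weights(variances, tau2=0.10)  # assumed tau^2 of 0.10
    print("fixed:", [round(w, 3) for w in fixed])
    print("random:", [round(w, 3) for w in random])
```

Adding the same tau² to every variance shrinks the differences between weights, so the imprecise small study gains relative influence as heterogeneity grows.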
To shed light on the heterogeneity of visuo-cognition, we derived a 3-component model from literature and assigned neuropsychological tests into domains according to our definition. It should be noted that this 3-component model requires a systematic empirical investigation in future research in order to examine whether the three distinct but overlapping subdomains can be confirmed. Further, categorization of tests into domains was based on expert rating. Although a consensus was reached between two independent raters, there are tests which could be assigned to other categories. For example, we classified the Hooper Visual Organization Test (HVOT) as a visuo-spatial measure, although it could also be argued to be a visuo-perceptual task. Future research will have to use statistical methods, such as factor analysis, to assign tests to factors. Another limitation refers to our meta-analyses. We included all identified visuo-cognitive tests in our analyses. There were studies conducting multiple tests of the same subdomain. In such cases we entered the same study population with different outcome measures in the same analysis. We performed this procedure due to the relatively small amount of available data. However, we acknowledge that we violated the assumption of independence of effect sizes, which possibly reduced heterogeneity and limits the validity of our findings. To address this issue, we conducted a sensitivity analysis for each visuo-cognitive subdomain, allowing us to integrate data of multiple outcomes performed by the same study population in one meta-analysis.
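One commonly used way to respect the independence assumption when a study contributes several outcomes from the same sample is to collapse them into a single composite effect before pooling. The sketch below illustrates that general idea only; the text does not describe how the sensitivity analysis combined outcomes, and the correlation r between outcomes is an assumed value that is rarely reported and usually has to be varied. The function name and inputs are hypothetical.

```python
import math


def composite_effect(effects, variances, r=0.5):
    """Average several correlated effects from one sample into one effect.

    effects   -- SMDs of the different tests within one study
    variances -- their sampling variances
    r         -- assumed correlation between outcomes (0 = independent,
                 1 = perfectly correlated)
    Returns the mean effect and its variance.
    """
    m = len(effects)
    mean = sum(effects) / m
    # Variance of a sum: individual variances plus all pairwise covariances
    var_sum = sum(variances)
    for i in range(m):
        for j in range(m):
            if i != j:
                var_sum += r * math.sqrt(variances[i] * variances[j])
    return mean, var_sum / m ** 2


if __name__ == "__main__":
    # Hypothetical study reporting two visuo-constructive tests.
    print(composite_effect([1.0, 1.2], [0.04, 0.04], r=0.5))
```

With r = 1 the composite is no more precise than a single outcome, whereas r = 0 treats the outcomes as independent; entering each outcome as a separate study, as in the main analysis, implicitly corresponds to the latter.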
CONCLUSION
To summarize, this is the first systematic review with meta-analysis investigating visuo-cognitive impairment in patients with PD-MCI and PDD. The present publication provides consistent evidence for deterioration in visuo-perceptual, visuo-spatial, visuo-constructive copy and command tasks from PD-MCI to PDD with more severe impairment in all subdomains in PDD. However, there was a high heterogeneity regarding visuo-cognitive tests and scoring systems used in studies. We (carefully) suggest visuo-constructive command tasks as most suitable to differentiate between PD-MCI and PDD. Our results indicate that an elaborate assessment of visuo-cognition could involve the following tests: ROCF recognition trial (visuo-perceptual abilities), Corsi Block-Tapping Test or JLO (visuo-spatial abilities), ROCF copy trial (visuo-constructive copy abilities), ROCF delayed recall trial or CDT (visuo-constructive command tasks). However, the present review showed that there is still room for improvement in this research area. Future studies should focus on comprehensive reporting of sample characteristics, applied test versions and results. Controlling for the impact of demographic and clinical sample characteristics on visuo-cognitive outcomes should be a goal of future meta-analyses.
Footnotes
ACKNOWLEDGMENTS
We thank all researchers who provided us with requested data for our analyses and all staff members of the University Hospital Cologne who contributed to the project’s conceptualization and study selection (title/abstract screening, full text screening), especially Katharina Göke, Paulina Olgemöller, Laura Rauser, and Constanze Weber. Further, we would like to thank Michael Fanning for supporting us to develop search strategies in Ovid and Maria-Inti Metzendorf for her professional judgement with regard to our selection of databases.
CONFLICT OF INTEREST
HLJ, MR, LB, JF, and EK declare no conflicts of interest.
