Abstract
In the field of stem cell technologies, exciting advances are leading to translational research to develop cell-based therapies which may replace the dopamine-releasing neurons lost in patients with Parkinson’s disease (PD). A major influence on trial design has been the assumption that sham operated comparator groups are required in randomised double-blind trials to evaluate the placebo response and the effects associated with the surgical implantation of cells. The aim of the present review is to identify the improvements in motor functioning and striatal dopamine release in patients with PD who have undergone sham surgery. Across the nine published trials, the pooled average improvement at the designated endpoints was 4.3 units (95% confidence interval: 3.1 to 5.6) on the motor subscale of the Unified Parkinson’s Disease Rating Scale in the ‘OFF’ state. This effect size indicates a moderate degree of improvement in the motor functioning of the patients in the sham surgical arms of the trials. Four of the nine trials reported the results of 18F-Fluorodopa PET scans, which indicated no improvement in dopaminergic nigrostriatal neurones following sham surgery. Therefore, while the initial randomised trials relying on sham operated controls were justified on methodological grounds, we suggest that the evidence generated by the completed and published trials indicates that placebo controlled trials are not necessary to advance and evaluate the safety and efficacy of emerging regenerative therapies for PD.
INTRODUCTION
Parkinson’s disease (PD) is a neurodegenerative disorder marked by the progressive loss of neurons in multiple regions of the central nervous system (CNS) [1, 2]. The impairment of the dopaminergic (DA) neurons of the nigrostriatal projections is of particular importance, as these systems are causally linked to the movement disorders characterising PD. The discovery of the benefits of levodopa, and the subsequent development of a variety of DA agonists, have greatly improved motor functioning and quality of life in people with PD. With the progressive degeneration of the DA nigrostriatal projections, there is a gradual decline in the therapeutic benefits of pharmacological agents and the emergence of adverse side effects. Currently, there is a dedicated international research effort in progress to develop novel treatments for PD, including regenerative therapies which aim to reverse the progress of the disease or replace the lost dopaminergic neurons [3–5]. Research in this field has taken the following approaches, the outcomes of which have been evaluated using double-blind randomised trials: implanting infusion devices to deliver neuroprotective and/or neurotrophic agents, such as nerve growth factors; using viral vectors to deliver enzymes for enhancing DA synthesis; and implanting progenitors to generate replacements for impaired or lost DA-producing cells.
Three decades ago, breakthrough research involving the transplantation of human fetal cells emerged as a promising means for restoring motor functions and quality of life by partially reconstructing the neural networks damaged by the progress of PD [3, 6]. Initially, the efficacy of neural transplantation was evaluated in open-label studies generally indicating positive outcomes [6]. In the mid-1990s, there was a call for the introduction of randomised controlled trials (RCTs) to ensure the rigorous evaluation of the safety and efficacy of the transplantation of fetal cells for PD [7, 8]. The perceived need for RCTs is consistent with the generally held position that these designs are necessary to complete the translation pathway for the development of novel treatments. As many of the regenerative therapies for PD involve neurosurgical procedures to introduce the active agents into the CNS, it was argued that the adoption of double-blind surgical placebo controlled trials (SPTs) was essential to minimise the probability of false positive outcomes associated with confounding and bias [9, 10].
In contrast to the comparatively moderate risks and burdens involved in pharmacological double-blind trials, the participants assigned to the control group of an SPT are exposed to a process which is implemented to imitate the harms and burdens associated with undergoing a neurosurgical procedure for PD. Placebo surgery imitates active neurosurgical procedures to the extent required to convince the research participants that they have undergone an actual experimental procedure. Researchers conducting sham surgical procedures attempt to ensure a minimal level of harm and risk to the participants, for example by administering a burr hole without penetrating the dura and the brain when imitating the intracerebral transplantation of cells (Table 1).
Overview of the procedures and results for the nine placebo controlled groups
(*) Primary outcomes were assessed on UPDRS (Motor, Off), with the exception of Freed et al. (2001), where the UPDRS (motor, off) score for the control group was estimated from the ‘Total UPDRS outcome’.
Being assigned to the placebo arm of an SPT does not simply involve undergoing a sham surgical procedure; depending on the mechanism of the active treatment, it may also involve the administration of potent but therapeutically useless medications, such as immunosuppressants. Initially, the objections to sham surgical procedures were based on the argument that research designs which impose additional harms and burdens on participants were, in principle, ethically unacceptable [11, 12]. Despite initial opposition, SPTs were justified within the framework of utilitarian ethics. That is, the advocates of sham surgery argued that what they considered a reasonable degree of risk and harm to the trial volunteers was necessary to ensure the ‘greater good’, referring to the accurate and unbiased determination of the efficacy and safety of regenerative therapies [7, 9].
A fundamental assumption advanced for justifying the need for double-blind trials has been that the placebo response is very large, particularly in surgical trials [1, 2]. However, Hróbjartsson and Gøtzsche [13] demonstrated that it is not sufficient to simply hold opinions regarding the magnitude of the placebo response; rather these opinions needed to be tested and confirmed through the evidence provided by the critical analysis of the results of relevant RCTs [14].
The aims of the current review are to: (1) conduct a meta-analysis to quantify the magnitude of the placebo response on the primary outcome reported in published SPTs of novel regenerative interventions for PD; (2) estimate whether the magnitude of the placebo effect in sham surgery treated patients represents a clinically meaningful improvement; and (3) provide an evidence-based critique of the necessity of SPTs in comparison to alternative research designs for evaluating the safety and efficacy of emerging regenerative therapies.
After a quiescence in human trials of cell implantation for PD, exciting advances are being made in this technology. The present meta-analysis is focused on the critical evaluation of the evidence applicable to the design of RCTs for the translation of stem cell technologies to develop safe and effective treatment options for PD in the future.
METHODS
The systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement, and the Joanna Briggs Institute guidelines for systematic reviews and meta-analyses were followed. It must be noted that the present meta-analysis does not include the active treatment arms of the trials. Rather, only the results of the sham operated, placebo control arms were included in this review (Table 1).
Search strategy
The following online databases were systematically searched for all available years of publication up to 30 July 2020: Medline, CINAHL, PsycINFO, and ProQuest Central.
Key terms: Parkinson’s disease, PD, Parkinson’s Disease (MeSH), Regenerative therapy, Regenerative medicine, Cell therapy, Gene therapy, Sham surgery, Sham treatment.
Criteria for including studies were: being published in peer reviewed journals; using a double-blind trial design; and relying on surgical strategies for introducing active agents.
Criteria for excluding studies were: addressing diseases other than PD; and being published in languages other than English.
Participants
The review considered studies that included participants aged 18 years and older who were clinically diagnosed with Parkinson’s disease by the expert medical teams conducting the randomised trials. All the participants in trials continued to receive their medications for PD.
Study selection
Following the search, all identified citations were collated and uploaded into EndNote X9 and duplicates were removed. Titles and abstracts were screened by two independent reviewers (LK & MB) using the inclusion/exclusion criteria for the review. Potentially relevant studies were retrieved in full and their citation details imported into the Covidence System for the Unified Management Assessment and Review of Information [15] and included in the reference list. Two independent reviewers assessed the full text of selected citations in detail against the inclusion criteria. The search results for article selection are presented in Fig. 1.

Systematic review flowchart.
Data extraction
Data were extracted from the sham operated control groups by two independent reviewers and included in the review based on the standardised data extraction tools in Covidence. The data extracted were the changes from baseline to trial endpoint on UPDRS (motor, off), a measure of efficacy which was used as the primary outcome measure in eight out of the nine trials. The result for Freed et al. [16] was estimated from the Total UPDRS (off) outcomes. The Unified Parkinson’s Disease Rating Scale (UPDRS), revised by the Movement Disorder Society [17], is recognised as a valid and reliable measure of the non-motor and motor symptoms of PD. The results of UPDRS measures are presented here so that negative (–) outcomes indicate a reduction of motor functioning, while positive numbers (+) represent improvements in motor functioning.
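The sign convention described above can be illustrated with a minimal sketch. Higher UPDRS scores indicate more severe symptoms, so improvement corresponds to a fall in the raw score. The function name and the example scores are hypothetical and are not taken from any of the trials.

```python
def motor_change(baseline: float, endpoint: float) -> float:
    """Change from baseline to endpoint on UPDRS (motor, off).

    Higher UPDRS scores mean more severe motor symptoms, so subtracting the
    endpoint from the baseline yields a positive number for improvement and
    a negative number for deterioration, as in the review's convention.
    """
    return baseline - endpoint


# Hypothetical scores, for illustration only (not trial data).
improved = motor_change(baseline=40.0, endpoint=35.0)   # score fell by 5
worsened = motor_change(baseline=40.0, endpoint=46.0)   # score rose by 6
print(improved, worsened)
```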
Four of the nine trials reported the results of 18F-Fluorodopa PET scans providing indirect evidence for improvements in DA capacity and storage. 18F-Fluorodopa imaging has been shown to be associated with the regeneration of the nigrostriatal dopaminergic system in patients with PD participating in trials and studies of regenerative therapies [18, 19].
Meta-analysis
The meta-analysis graphs using effect size were produced using Comprehensive Meta-Analysis software (version 3.3.070).
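The review does not state which pooling model was selected in the software. As a sketch only, and assuming a fixed-effect inverse-variance model (an assumption, not the authors' documented method), a pooled mean and its 95% confidence interval could be computed as follows. The function name and the input values are hypothetical.

```python
import math


def pool_fixed_effect(means, std_errors, z=1.96):
    """Inverse-variance (fixed-effect) pooling of per-trial mean changes.

    Each trial is weighted by 1 / SE^2, so more precise trials count more.
    Returns the pooled mean, its standard error, and a Wald-type CI.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * m for w, m in zip(weights, means)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    ci = (pooled - z * pooled_se, pooled + z * pooled_se)
    return pooled, pooled_se, ci


# Hypothetical per-trial mean changes and standard errors (not trial data).
example_means = [2.0, 4.0, 6.0]
example_ses = [1.0, 1.0, 2.0]
pooled, pooled_se, ci = pool_fixed_effect(example_means, example_ses)
print(round(pooled, 2), round(pooled_se, 2), tuple(round(x, 2) for x in ci))
```

A random-effects model would add a between-trial variance component to each weight and would generally give a wider interval when trials are heterogeneous.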
Assessment of methodological quality
The risk of bias within the citations was evaluated with the Joanna Briggs Institute (JBI) critical appraisal tools for RCTs. The main criteria in assessing risk of bias included the appropriateness of study design, adequacy of sample size, methods and measurements, and data analysis.
It is essential to note that the nine randomised trials were published in high impact, peer reviewed journals, with one [16] in the New England Journal of Medicine, one [18] in Brain, one [20] in Parkinsonism & Related Disorders, three [21–23] in Lancet Neurology, and three papers in Annals of Neurology. There have been concerns that sham procedures, such as partial thickness burr holes in the participants’ skulls, did not adequately imitate the neurobiological effects of actual procedures involving the transplantation of cells or the delivery of gene vectors or trophic factors directly into the brain. In the nine trials reviewed, the closest replication of the active procedures [18] involved the implantation of a skull-mounted port in all the participants for the 9-month duration of the trial.
Given this limited scope of the present review, the extraction and analysis of evidence did not go beyond that of the placebo group in each of the trials. As further randomised trials are completed in the future, a more extensive and critical systematic review and meta-analysis comparing the factors which determine the safety and efficacy of emerging regenerative therapies for PD will be required.
RESULTS
When assessing the magnitude of the effect sizes evident in the sham operated control groups it is relevant to distinguish between the terms ‘placebo response’ and ‘placebo effect’ [27]. The placebo response refers to the empirically determined change in performance of the placebo controls on designated primary or secondary outcomes, while the placebo effect implies clinical benefits inferred from, but only partly determined by the magnitude of the placebo response (Fig. 2). The placebo effect depends on the aims of the novel treatment under development and evaluation. For example, when the aim of the novel therapy is to restore or protect the dopaminergic nigrostriatal system, then the placebo effect will be demonstrated by evidence for the partial restoration or protection of this neural system at the trial endpoint.

Placebo response and placebo effects in SPTs: conceptual framework. a: Active group, b: Placebo group, c: True effect size; c = a-b, d: Placebo effect; d = b-e, e: Observer bias and confounding; e = b-d. As there was no third group included in any of the current SPTs, the magnitude of ‘e’ cannot be estimated, therefore, ‘d’ could not be determined from the data provided by the present meta-analysis.
Trial characteristics
This meta-analysis summarised the results of the primary outcome measure for the sham operated control arms of nine SPTs, evaluating the safety and efficacy of emerging regenerative therapies for PD. Four of the trials evaluated the benefit of administering neurotrophic factors, one of the trials the use of viral vector for enhancing DA synthesis, and four trials evaluated the outcomes of cell transplantation (Table 1).
There were 174 sham operated participants in the nine trials with an average sample size of 19.3 (SD = 8.4) per trial. The average age of the participants was 57.7 (SD = 6.9) years at the time of surgery, which is considerably lower than the average age of the population of people with PD. Considering the detailed selection criteria for volunteers being included in each of the trials there is no implication that the 174 participants constitute a representative sample of the population diagnosed with PD.
Trial duration from surgery to the predetermined endpoint of the trials was an average of 11.3 (SD = 5.8) months with a range of 6 to 24 months. Taking into account that the optimal time for trial endpoints to evaluate the benefits of regenerative therapies may be 3 years or longer [19], it is likely that insufficient time was allowed for the regenerative process to reach optimal levels in the active groups. In contrast, it may be that the placebo controls would have shown diminished efficacy if the duration of the SPTs were 3 years or longer.
What is the magnitude of placebo response?
The placebo response was defined and assessed in the nine randomised trials included in the present analysis as the average change from baseline in the sham operated group on the primary outcome measure at the predetermined endpoint of a trial. As stated earlier the primary outcome measure selected to determine efficacy was UPDRS motor (off), a scale which is regarded as a valid, and reliable tool for assessing the motor symptoms of PD [17].
The Weighted Mean Average (WMA) representing the pooled placebo response in the nine trials was 4.3 (95% Confidence Interval: 3.1 to 5.6, Fig. 3). When the two outliers [21, 25] were excluded from the analysis, the WMA placebo response was 3.9 (95% CI: 2.6 to 5.2) for the seven trials (Fig. 4). These results indicate a level of improvement in motor functioning which corresponds to the 10.1% increase on the primary outcome measure reported in a recent review [14].

Meta-analysis including the 9 double-blind trials. Favours A refers to UPDRS (motor, OFF) scores which show decline in motor functioning. Favours B refers to improvement on UPDRS (motor, OFF). ♦ Indicates the Weighted Mean Average (WMA). 5 of the trials indicated statistical significance.

Meta-analysis excluding two outliers - Olanow et al. (2003) and Gross et al. (2011). Favours A refers to UPDRS (motor, OFF) scores which show decline in motor functioning. Favours B refers to improvement on UPDRS (motor, OFF). ♦ Indicates the Weighted Mean Average (WMA). 5 of the trials indicated statistical significance. Note that the exclusion of the two outliers, one negative and one positive in outcome, did not significantly change the WMA when moving to 7 groups. The results indicate that regardless of the heterogeneity of the trials, the WMA for the sham operated groups is a robust estimate.
Based on a detailed study of 653 patients with PD undergoing pharmacotherapy, Shulman and colleagues [28] proposed that degrees of recovery could be represented on a 3 point ordinal scale. The concordance among multiple measures of changes can be expressed in UPDRS (motor) scores, such that a 2.5-point change represents minimal, 5.2 a moderate, and 10.8 a large, clinically important difference.
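A minimal sketch of how these anchors might be applied to a change score is given below. The binning rule (a change is assigned the highest anchor it reaches) is an assumption made for illustration, as are the function and label names; the cited study reports the anchor values, and this sketch does not reproduce its method.

```python
# Anchor values on UPDRS (motor) for clinically important differences [28].
CID_ANCHORS = (("large", 10.8), ("moderate", 5.2), ("minimal", 2.5))


def classify_change(points: float) -> str:
    """Label a UPDRS (motor) change by the highest anchor it reaches.

    Assumption: anchors are treated as lower thresholds. The absolute value
    is used, so the label describes the size of the change, not its direction.
    """
    magnitude = abs(points)
    for label, threshold in CID_ANCHORS:
        if magnitude >= threshold:
            return label
    return "below minimal"


# Hypothetical change scores, for illustration only.
for change in (1.0, 3.0, 6.0, 11.0):
    print(change, classify_change(change))
```

Under a different rule, such as assigning the nearest anchor, values lying between two anchors could be labelled differently, so any label for an intermediate value depends on the rule chosen.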
The results of the present meta-analysis indicate a statistically significant placebo response, the magnitude of which represents a moderate level of clinical importance. Previously, Freed [29] suggested that large treatment benefits are required to justify the effort, risks and costs associated with implementing neurosurgical interventions for PD. It follows that an effect size which can be regarded, and aimed for, as an adequate goal for regenerative therapies is an improvement of 10.8 points or greater on UPDRS motor [28]. Given that the evidence-based estimate of the WMA score of the sham operated, placebo groups was found to be 4.3, it is unlikely that this moderate level of improvement would be identified as a successful disease modifying intervention. Therefore, the overall evidence does not support the notion of a strong, clinically meaningful placebo response on the primary outcome associated with the sham operated arm of an SPT.
As indicated in Fig. 2, the true therapeutic effect of the active treatment is determined by subtracting the mean of the placebo group from the mean of active group. The point is that if the criteria for indicating efficacy are set and confirmed as adequately high to justify the implementation of a regenerative treatment in health care settings then a placebo response of approximately 4.3 on UPDRS (motor) would not impede accurate decisions regarding the statistical and clinical significance of an emerging intervention being evaluated.
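The subtraction described above can be sketched as follows. The pooled placebo response (4.3) and the large clinically important difference (10.8) are taken from this review; the active-arm values are hypothetical, and the function name is introduced only for illustration.

```python
PLACEBO_RESPONSE = 4.3  # pooled sham-arm improvement reported in this review
LARGE_CID = 10.8        # large clinically important difference [28]


def true_effect(active_change: float,
                placebo_change: float = PLACEBO_RESPONSE) -> float:
    """True effect size: active-arm mean change minus placebo-arm mean change."""
    return active_change - placebo_change


# Hypothetical active-arm improvement from baseline (not trial data).
hypothetical_active = 16.0
net = true_effect(hypothetical_active)
print(round(net, 1), net > LARGE_CID)
```

On these assumed numbers, an active arm would need to improve by roughly 15 points from baseline for the net effect to reach the large-difference anchor, which shows how a placebo response of this size shifts, but does not obscure, a demanding efficacy criterion.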
What is the variability of the placebo response on the primary outcome?
As indicated in Figs. 3–5, the placebo response varies both within and between the placebo groups in the nine double-blind trials. The between group average differences ranged from a mean decrease of 8.4 points [25] to a mean improvement of 10.1 points in the motor symptoms [21].

Relationship between time of publication and magnitude of the placebo response. a. This graph represents the weighted mean averages (WMAs) and the 95% CI for each group of 3 double-blind trials, as well as the combined data for the 9 trials. b. Group 1 includes the first 3 published trials [16, 25]. c. Group 2 includes the next set of 3 trials [21–23]. d. Group 3 includes the 3 most recently published trials [18, 26]. e. Combined data for the 9 trials (see Fig. 3). f. A statistically significant, 6.54-point difference is indicated between Groups 1 and 2. g. The graph illustrates that even though there is a considerable degree of variability between trials, the combined estimate (4.31, 95% CI 3.01–5.55) is likely to represent the placebo responses of sham operated participants.
The pooled variance of the nine sham treated control groups is represented in Fig. 3 by the 95% CI of 3.1 to 5.6. Similarly, the 95% CI for the seven groups with the exclusion of the two outliers was 2.6 to 5.2 (Fig. 4). Therefore, the confidence intervals are consistent with a minimal to moderate degree of placebo response associated with sham surgery for PD. Given the limitations of the available database for analysing the multiple factors influencing the outcomes, it is not the purpose of the present descriptive review to explain the mechanisms which determine the magnitude of the placebo responses between the specific trials. It is interesting to note, however, that one of the factors which appears to be associated with the variability in the degree of recovery reported is the date of publication [14].
Consistent with this suggestion, the first three SPTs [16, 25] shown in Fig. 5, published between 2001 and 2006, demonstrated an overall average estimate of a placebo response of 0.08 (95% CI: –3.531 to +3.692), a pooled result which lacked both statistical and clinical significance. The next three SPTs, which evaluated the benefits of viral gene-delivery techniques [21–23] (Fig. 5), were published between 2010 and 2011 indicating a moderate pooled effect size of 6.62 (95% CI: 4.55 to 8.69). As shown in Fig. 5, there was a statistically significant increase of 6.54 points in the second group of three trials in comparison to the first group. The most recently published three trials [18, 26] (Fig. 5), published between 2015 and 2019, demonstrated a WMA of 3.68 units of improvement, with a 95% CI of 1.97 to 5.40. These statistics indicate a modest to moderate level of improvement falling within the probable range of the overall placebo response for the nine groups as shown in Fig. 5.
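The review does not state how the significance of the difference between the first two groups was tested. As a rough check only, the sketch below recovers approximate standard errors from the reported 95% confidence intervals and applies a two-sided z-test. It assumes symmetric Wald-type intervals, a normal approximation, and independent groups; the function names are hypothetical.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate standard error implied by a symmetric 95% CI."""
    return (upper - lower) / (2.0 * z)


def compare_pooled(mean_1, ci_1, mean_2, ci_2):
    """Two-sided z-test for the difference between two independent pooled means."""
    se_1 = se_from_ci(*ci_1)
    se_2 = se_from_ci(*ci_2)
    diff = mean_2 - mean_1
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    z_score = diff / se_diff
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))  # two-sided normal p
    return diff, z_score, p_value


# Pooled estimates and 95% CIs for Groups 1 and 2 as reported in the text.
diff, z_score, p_value = compare_pooled(
    0.08, (-3.531, 3.692),
    6.62, (4.55, 8.69),
)
print(round(diff, 2), round(z_score, 2), p_value < 0.05)
```

Under these assumptions the difference of 6.54 points corresponds to a z-score of about 3.1, which is consistent with the statement that the increase was statistically significant.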
There are multiple factors which determine the placebo responses as assessed by UPDRS (motor, off). The present analysis indicates that an estimate of a variance of 0.633 represents numerically the pooled variance of the placebo response (Fig. 3). These statistics, along with the combined 95% CI of 3.1 to 5.6, indicate that obtaining a sample mean in a placebo control group significantly greater than 10.8 on UPDRS (motor) is highly unlikely.
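As a consistency check, the sketch below rebuilds a Wald-type 95% interval around the pooled mean of 4.31 under two readings of the reported dispersion value of 0.633: as a standard error, and as a variance (in which case the standard error is its square root). Which reading applies is an assumption here; the text labels the value a variance. The names are introduced for illustration only.

```python
import math

POOLED_MEAN = 4.31   # pooled placebo response reported in this review
DISPERSION = 0.633   # value reported alongside Fig. 3
LARGE_CID = 10.8     # large clinically important difference [28]


def wald_ci(mean: float, se: float, z: float = 1.96):
    """Symmetric normal-approximation confidence interval."""
    return mean - z * se, mean + z * se


# Reading 1: 0.633 is the standard error of the pooled mean.
ci_if_se = wald_ci(POOLED_MEAN, DISPERSION)
# Reading 2: 0.633 is the variance, so the standard error is its square root.
ci_if_variance = wald_ci(POOLED_MEAN, math.sqrt(DISPERSION))
# Distance of the 10.8 anchor from the pooled mean, in units of reading 1.
distance_in_se = (LARGE_CID - POOLED_MEAN) / DISPERSION

print([round(x, 2) for x in ci_if_se])
print([round(x, 2) for x in ci_if_variance])
print(round(distance_in_se, 1))
```

The first reading gives an interval of roughly 3.07 to 5.55, close to the reported 3.1 to 5.6, while the second gives roughly 2.75 to 5.87. On the first reading the 10.8 anchor lies about ten standard errors above the pooled mean; this describes the precision of the pooled estimate and not the spread of results expected from a single future trial.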
There are interesting questions regarding the possible reasons and causes for the variability of the placebo response between control groups and the times at which the trials were conducted. The factors impacting on such wide variability between the results of the placebo controlled groups may be considered worthy of investigation by the researchers who are interested in the mechanisms of the placebo response.
What is the estimated magnitude of the placebo effect?
The placebo effect is a component of the placebo response, representing the ‘true’ therapeutic benefits of the sham surgical procedure (Fig. 2). In the context of SPTs, the placebo effect may be estimated from statistics which represent the magnitude of the placebo response on the primary outcome measure. In the present review, given that the WMA was 4.31, the average placebo effect is assumed to be somewhat less than the UPDRS (motor, off) improvement observed for sham operated controls. It is important to note that none of the nine published SPTs included an additional non-surgical, best-practice treatment arm for comparison, so it is not possible to quantify the influence of various confounders on placebo responses and effects (Fig. 2). Although we are unable to estimate accurately the magnitude of the true therapeutic benefit of being assigned to a sham operated group, it appears to be a modest to moderate (i.e., less than 4.3) improvement in the participants’ motor functioning. This improvement may not be significantly greater than that of a control group provided with best available treatment included in a randomised trial evaluating the efficacy of a treatment requiring surgical interventions.
Given the difficulties in estimating the true placebo effect size on the basis of data representing motor functioning only, it is relevant to note that possible therapeutic improvements in the placebo groups have been associated previously with significant increases in dopaminergic activity in the CNS, particularly in the nigrostriatal system [30, 31]. Four of the nine trials included 18F-Fluorodopa neuroimaging, which indicated no significant increases in the uptake of the ligand within the neostriatum (Table 2). Therefore, there was no evidence reported in any of the trials for either the regeneration of neurons in the impaired basal ganglia system, or the increased synthesis of dopamine, in any of the placebo control groups. In the next section, we will approach the question of the efficacy of sham surgery by looking for evidence of recovery in persons with PD following sham surgery.
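Using the notation of Fig. 2 (b for the placebo response, d for the placebo effect, e for observer bias and confounding), the reasoning behind this upper bound can be written out as follows. The condition that e is non-negative is an assumption made here for illustration; it is not stated in the framework.

```latex
b = d + e, \qquad e \ge 0 \;\Longrightarrow\; d = b - e \le b = 4.31
```

Because no trial included a third arm from which e could be estimated, only this upper bound on d, and not its value, follows from the pooled data.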
Evidence for changes in putaminal DA turnover in the placebo arms of 4 trials reporting the results of brain imaging
• The differences in the reporting of the imaging results by the 4 groups prevented synthesising the results. However, the absence of statistically or biologically significant increases in any of the groups rules out evidence for improvements in the dopamine releasing synapses of the putamen.
• It may be relevant to note that Ko and colleagues [33], using 3-dimensional metabolic brain imaging, found that network analysis indicated the greatest activity in a cerebello-limbic circuit associated with the placebo response.
How do we identify a placebo reactor?
A placebo reactor or responder has been defined as a person assigned to the sham operated control group who has demonstrated true therapeutic gains at the endpoints of the trials [30]. Identifying the persons who have experienced meaningful improvements following sham surgery would provide evidence for the extent to which the burdens of sham surgery may be mitigated by the putative therapeutic benefits of the placebo effect [32]. There have been several different approaches to identifying PD patients as placebo reactors, for example by nominating a specific score on the primary outcome measure as the criterion. Ko and colleagues [33], for instance, relied on an improvement of 2 points or greater on UPDRS motor (off) to identify subjects for a study of the neural basis for the placebo effect. Although the majority of the participants scored above the 2-point criterion, setting such a minimal criterion on one specific outcome measure has been seen as adequate for categorising PD patients as clinically improved [34]. In the trial [22] which was the source of the participants [33], a criterion of a minimum 9-point improvement on UPDRS (motor) was set to indicate a clinically meaningful recovery. Applying this standard of efficacy to both the actively treated and the sham operated groups left only 3 out of 21 participants (14%) classified as placebo responders. Similarly, when Whone and colleagues [18] set a standard of a 10-point improvement on the primary outcome as the indicator for efficacy, none (0%) of the sham operated patients were identified as placebo responders.
The problem with identifying placebo responders on the basis of moderate benefits in motor functioning under no drug conditions is that these benefits may well be due to expectations of improvement resulting in a transient motivation to perform. The term placebo (from the Latin, ‘I shall please’) captures this: the participant pleases by demonstrating that ‘I am feeling and functioning better’, just as was suggested by the researchers and consequently anticipated by ‘me’ as the trial participant. Variations in symptom intensity are not unusual in people experiencing movement disorders. There have been numerous credible anecdotal descriptions of placebo responders, as well as reports of the phenomenon of kinesia paradoxica describing extreme conditions, such as life-threatening fires, where people with advanced PD demonstrate remarkable but temporary levels of activity [35].
The issue here is whether evidence for moderate and periodic improvements in motor functioning constitutes ‘proof of concept’ for the benefits of sham surgical procedures in persons with progressive neurodegenerative disorders. Brim & Miller [32] suggest that this is the case, stating that “we argue that the potential benefits from the placebo effect should be considered in the risk-benefit assessment, and the informed consent for sham-controlled trials of procedures to treat PD and pain” (p. 704). While the suggestion by Brim and Miller [32] may well be applicable to randomised trials of pain management [27], it is problematic with reference to regenerative therapies of the CNS, as these interventions are targeted to restore, at least partially, the neural networks impaired by disease or injury. In relation to PD, it is essential to demonstrate targeted long term structural improvements in the basal ganglia system before we may claim that functional improvements, such as those on UPDRS (motor), are indicative of genuine recovery [36]. However, the trials which reported the results of brain imaging did not indicate that there was a reversal of the neuropathology associated with PD following sham surgery (Table 2). Therefore, it is worth posing the question of whether, in the light of the problematic evidence for long term benefits, the placebo effect should be presented to prospective trial participants and regulatory authorities as a meaningful clinical benefit which mitigates the burdens of sham surgery.
DISCUSSION
With the renewed interest in research aiming to develop treatments for the reconstruction of the nigrostriatal system using stem cells [4, 5], the methodological implications of two ground breaking trials transplanting progenitors of fetal mesencephalic dopaminergic cells [16, 25] are of particular relevance. Given the complex challenges associated with the reconstruction of the human brain, there were well founded concerns that these two double-blind trials were prematurely implemented [3, 37]. Nevertheless, the two SPTs resulted in outcomes which had a strong influence on the program of reconstructive therapies by: identifying severe drug independent graft induced dyskinesias (GIDs) in some of the participants, and failing to demonstrate statistically and clinically significant differences between the transplanted and sham operated groups at the designated trial endpoints.
As discussed by Barker and colleagues [38], the failure to demonstrate significant benefits in these two trials, and the identification of adverse effects, resulted in the temporary abandonment of further research aimed at advancing cell-based approaches to neural reconstruction. The negative results of the two trials [16, 25], which were in marked contrast to the promising evidence of the initial pre-clinical and open-label studies [3, 39], became regarded as compelling justification for implementing placebo controlled trials [40, 41]. A strong consensus has since emerged among researchers working in the field that placebo controlled surgical trials for regenerative therapies for PD are methodologically necessary and therefore ethically justified [9, 10]. This opinion is illustrated by Marks et al. [23]: “although once thought controversial, the use of sham brain surgery is now regarded as essential to control for placebo effects in surgical interventions for Parkinson’s disease. The failure of multiple double-blind trials of surgical therapies for Parkinson’s disease to confirm the results of open-label trials illustrates this point” (p. 1170).
The claim that the two RCTs [16, 37] provided conclusive results based on the comparison between the transplanted and sham operated groups was challenged on the grounds of the published results [42, 43]. It was argued that the evidence provided by the two trials did not depend on the comparisons between the sham operated controls and the transplanted groups; rather, the absence of meaningful clinical benefits was evident from the fact that neither the active nor the sham operated groups improved sufficiently from baseline to demonstrate the efficacy of cell transplantation. Also, as there were no cases of GIDs found in the sham operated control groups, it is probable that selecting an alternative best available control group to conduct the trial would have led to the same conclusions regarding these severe adverse effects. It is also relevant that, in contrast to the controls, the patients with fetal mesencephalic grafts showed significant increases in DA activity as indicated by the 18F-Fluorodopa assessments, and there was post mortem evidence for the survival and growth of the transplanted cells [6].
The promising overall results of open-label studies initially suggested proof of concept for the benefits of cell transplantation [39]. For example, a relevant review [44] reported very positive overall improvements on UPDRS (motor, off) of 43.86% and increases of neostriatal 18F-Fluorodopa uptake of 40.32% in the 4 grafted groups. Importantly, there was a positive correlation between increases in 18F-Fluorodopa uptake and the improvements in motor functioning. These results are consistent with a causal association between structural and functional recovery and strongly imply that the discontinuation of cell transplantation for PD was premature [38, 42] and that it was perhaps a ‘false negative’ decision [42].
As illustrated by the SPTs listed in Table 2, several approaches other than cell transplantation have been investigated with the aim of developing disease modifying treatments for PD [45]. An analysis of the results of 6 sham surgery controlled trials evaluating emerging regenerative interventions for PD, and of the open-label trials which preceded them, indicated a pooled average effect size of 35.6% in the open-label trials but an average improvement of only 17.0% in the active groups in the SPTs [46]. The approximately 50% reduction of the effect size in the active groups in SPTs in comparison to open-label trials was further replicated in more recent analyses [14, 42]. Although the reduction in the active arms of SPTs may be attributed to a larger placebo response in open-label trials [46], the evidence does not support this hypothesis [35].
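As a rough check on the "approximately 50%" figure, the relative reduction can be computed directly from the two pooled averages reported in [46]. This is an illustrative calculation only, assuming both percentages are measured on the same scale; the symbol R is introduced here for convenience and does not appear in the cited analyses.

$$
R = \frac{35.6 - 17.0}{35.6} = \frac{18.6}{35.6} \approx 0.52
$$

That is, the active groups in the SPTs retained a little under half of the improvement seen in the open-label trials.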
The pooled average of the sham operated control arms in the present analysis indicated a moderate and variable placebo response associated with sham surgery in participants with PD. In the absence of reports of long-term recovery of the progressive neuropathology characteristic of PD, in particular of the dopaminergic nigrostriatal projections, there was no evidence for placebo effects. The overall evidence provided by our meta-analysis does not support the notion of a strong placebo response, which serves as the foundation for the necessity of relying upon sham surgical arms when designing evaluative trials (Table 3). Rather, as argued previously [14, 42], there are alternative methods available to control for inflated estimates of efficacy due to bias and to identify the true benefits of emerging regenerative therapies [14].
Overview of the results of the meta-analysis for evaluating the efficacy of emerging reconstructive therapies for PD
There are many reasons why apparently promising scientific discoveries fail to be translated into safe and effective treatments [47]. First, it is possible that the emerging treatments have been validly evaluated and correctly identified as being unsuitable to be developed and introduced as new treatments. Second, the methodological decisions guiding the translation process may have been flawed and may have generated invalid ‘false negative’ decisions regarding treatment safety and efficacy.
In a recent critical review, Deaton and Cartwright [48] examined the advantages and limitations of using RCTs to test hypotheses regarding causal effects and to identify the probable therapeutic benefits of novel interventions. They called attention to the erroneous practice of dismissing the results of observational studies (open-label trials, in the present context) simply because the findings contradicted those of RCTs. In relation to the conflicting findings of the open-label and double-blind evaluations of cell transplantation, adopting a theoretical framework which postulates that the mechanisms for recovery include the actions of the cell recipients [42, 49] may contribute to understanding the reasons for the differences between the outcomes. It has been argued that depriving the participants of accurate knowledge of the actual treatment they received (placebo or active) may account for a considerable component of the approximately 50% loss of the recovery of the active groups in double-blind trials [42].
Conclusion
After a period of quiescence in human trials of cell implantation for PD, exciting advances are being made in this technology. Stem cells are being manufactured to the highest standards of ‘good manufacturing practice’ and engineered to reduce the need for immunosuppression, and standards are being set for the quality of dopamine neuron differentiation. This means we are likely to see clinical trials in the near future, and it is therefore necessary to consider trial designs. A strong influence on trial design has been the assumption that significant placebo responses and effects associated with the surgical implantation of cells require the use of sham operated comparator groups to enable the implementation of double-blind trials. The evidence from our analysis of published SPTs does not support the claim for their ongoing use as the ‘gold standard’ research design.
A variety of research designs and analytical methods are now available to obtain the evidence required to conduct trials to develop and determine the safety and efficacy of emerging regenerative therapies [14, 42]. Rather than relying on sham operated control groups, comparisons with best available evidence-based treatments for PD would make false positive decisions highly improbable, provided sufficiently high standards were set for determining the efficacy of the novel treatment.
ACKNOWLEDGMENTS
We would like to acknowledge Michele Bernshaw for editing and proofreading the paper.
CONFLICT OF INTEREST
The authors have no conflict of interest to report.
