Abstract
INTRODUCTION
By 2012, a series of observations by multiple groups [1–11] had indicated that Lewy-type α-synucleinopathy (LTS) may be present in the colon of many subjects with prodromal and clinically-manifest Parkinson’s disease (PD) and other Lewy body disorders (LBD), raising the potential for an endoscopically-obtained molecular diagnosis. However, a number of subsequently-published studies [12–27] have reported conflicting results regarding the sensitivity and specificity of colonic biopsies for the detection of LTS. Under the sponsorship of the Michael J. Fox Foundation for Parkinson’s Research (MJFF), two studies were directed at comparing the ability of several immunohistochemical (IHC) methods to accurately differentiate PD subjects from normal controls on the basis of colonic LTS.
The first study, now published [28], examined LTS in archived colonic biopsies. The conclusion of this study was that adequate diagnostic accuracy was not achieved by any of the staining methods or morphologically-defined staining types. Positive staining, in densities similar to those seen in PD cases, was seen in most of the slides from control cases, with most of the staining methods used, by most of the raters. Limitations of this prior study included the lack of neuropathological confirmation of the diagnosis of PD, the absence and/or relatively small amounts of submucosa in many of the slides and the probable scarcity of neuronal elements in the typically small colonic biopsy samples.
The objective of this second study, therefore, was to evaluate several IHC methods in large, full-thickness sigmoid colon sections from autopsied PD and control subjects. There were at least two reasons for using colon autopsy specimens in this second study, rather than biopsy specimens: 1) the unequivocal presence or absence of PD in each autopsied subject had been established through clinical evaluation and a full neuropathological examination, whereas subjects with archived biopsies are generally still alive or, if they have died, often have not had neuropathological confirmation of PD; the diagnostic error rate of a clinical diagnosis of PD ranges from 15 to 50%, depending on disease duration [29]; 2) autopsy specimens are much larger than biopsy specimens, and include abundant submucosa as well as the muscularis layer with their respective neuronal plexuses, so there is a greater probability that the specimen will contain adequate densities of neuronal structures. The plexus within the muscularis layer has been reported to have, within the colon, the greatest density of LTS [3, 12]. The biopsy study may have failed due to sparse or absent neuronal structures in the much smaller samples.
MATERIALS AND METHODS
Consecutive or near-consecutive sections of formalin-fixed paraffin-embedded (FFPE) autopsy-derived sigmoid colon, from 5 PD and 5 control subjects, were provided by the Banner Sun Health Research Institute Brain and Body Donation Program (BBDP; www.brainandbodydonationprogram.org) [30]. The operations of the BBDP have been approved by the Western Institutional Review Board. Subjects (Table 1) were chosen on the basis of the absence of clinical or incidental colonic pathology as well as clinicopathological confirmation of PD and both clinical (absence of dementia or parkinsonism) and pathological control status (absence of LTS in the brain and/or previously-sampled peripheral nervous system). As the sigmoid colon had not previously been tested for the presence of LTS in most of these subjects, the PD subjects were chosen from amongst those with documented higher density scores in the submandibular gland and esophagus (two of the peripheral sites routinely tested for LTS by the BBDP), on the assumption that these subjects would also be likely to have higher LTS density scores in other peripheral nervous system locations. As the intent of the study was not to provide an unbiased estimate of LTS density in a representative sampling of PD subjects, but rather to test IHC methods, we hoped to instead select PD subjects with higher-than-average colonic LTS densities.
Alternating sets of 5 sections (e.g. Set 1 consisted of section number 1 plus every nth section thereafter; Set 2 consisted of section number 2 and every nth section thereafter, etc.), as well as positive and negative control sections taken from autopsy-obtained cerebral cortex with and without LTS, were sent to four participating laboratories, each of which stained one or more of the section sets with its own optimized method and/or an assigned method (Table 2). These were the same four labs that had participated in the prior colon biopsy study [28]. Seven different IHC methods were used (Table 2). Primary antibodies included one polyclonal and three different monoclonal antibodies against alpha-synuclein phosphorylated at serine 129 (Methods 1–6) and one monoclonal antibody against alpha-synuclein without post-translational modifications (Method 7). Five different combinations of epitope exposure pretreatments and signal development protocols were used.
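The interleaved allocation of sections to sets can be sketched as follows. This is only an illustration of the "section k plus every nth section thereafter" rule; the function name and the choice of n = 4 sets drawn from 20 consecutive sections are assumptions made for the example, not values reported in the study.

```python
def alternating_sets(num_sections: int, num_sets: int) -> list[list[int]]:
    """Split consecutive section numbers 1..num_sections into interleaved sets.

    Set k holds section k plus every num_sets-th section thereafter, so that
    each set samples the whole depth of the tissue block.
    """
    return [
        list(range(start, num_sections + 1, num_sets))
        for start in range(1, num_sets + 1)
    ]


# Hypothetical example: 20 consecutive sections divided into 4 sets of 5.
section_sets = alternating_sets(num_sections=20, num_sets=4)
for set_number, sections in enumerate(section_sets, start=1):
    print(f"Set {set_number}: {sections}")
```

Interleaving in this way gives each laboratory sections spread across the block rather than a contiguous run, so that no single set depends on one local region of tissue.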
After preliminary examination, three staining patterns were considered as candidates for specific LTS. All three of these were morphologically consistent with nerve fibers and/or puncta. Stained sections were graded blinded to diagnosis by the same four raters that performed this function for the prior biopsy study, using three custom templates devised after initial examination of the morphologies of presumptive specifically-stained structures (Fig. 1). Method 4 was graded by only three raters. The major template used to assign density grades, as with the prior biopsy study [27], was termed “fibers and dots” (dots synonymous with puncta), and was applied to slides stained with all methods (Fig. 2). The second template (“fibers only”) was used only for slides stained with Method 7, due to the presence of frequent non-specific puncta. The third template, also previously-used [28], was termed “perivascular dots”, and was used only for slides stained with Method 3, as it was not present in slides stained with other methods. This template was used only for a particular morphological appearance of dots embedded in vascular walls. As for the prior biopsy study, semi-quantitative density scores (none, sparse, moderate and frequent) were assigned based on the highest density of stained structures at any single location and separately for mucosa, submucosa and muscularis (Fig. 2). The perivascular pattern of staining, being relatively infrequent, was graded without reference to layer. As non-specific, clearly artifactual staining (defined as being without meaningful morphological characteristics and/or found in slides from all subjects regardless of diagnosis) was present to some extent in slides produced with all methods, raters were instructed not to grade these types of staining. As non-specific staining of epithelial cells was particularly common (diffuse staining of cytoplasm and intraluminal fecal material; an example is shown in Fig. 1d), mucosal grading was limited to the lamina propria with adjacent muscularis mucosa. Another relatively common type of non-specific staining was a granular pattern in the lamina propria, as described in the prior colonic biopsy study [28]. Following blinded grading, the score sheets were sent to a statistician, for whom the blind was then lifted.
The primary analysis assessed the ability of a subject-level positive or negative designation to predict diagnostic status (PD or control). Any LTS score greater than zero, in any slide, from any layer, was regarded as defining that subject as positive for LTS. The diagnostic performance was expressed as sensitivity, specificity, Youden Index and inter-rater agreement (intra-class correlation coefficient, ICC). The Youden Index, a measure of overall diagnostic accuracy, equals sensitivity + specificity, expressed as decimals, minus one. It has a zero value when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e. when the test has no discriminatory value at all. A Youden Index of one indicates that there are no false positives or false negatives, i.e. that the test does not misclassify a single subject. Hereafter, to be concise, “diagnostic accuracy” will refer to the relative performance of methods and raters as expressed by the Youden Index.
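As a worked illustration of the Youden Index as defined above (sensitivity + specificity - 1), the following minimal sketch computes it from subject-level counts. The function name and the example counts are hypothetical and are not taken from the study tables.

```python
def youden_index(true_pos: int, false_neg: int, true_neg: int, false_pos: int) -> float:
    """Return sensitivity + specificity - 1, with both expressed as decimals."""
    sensitivity = true_pos / (true_pos + false_neg)   # PD subjects rated positive
    specificity = true_neg / (true_neg + false_pos)   # controls rated negative
    return sensitivity + specificity - 1.0


# Hypothetical rater/method combination with 5 PD and 5 control subjects:
# 4 of 5 PD subjects rated positive, all 5 controls rated negative.
j_good = youden_index(true_pos=4, false_neg=1, true_neg=5, false_pos=0)

# A test that calls 3 of 5 subjects positive in both groups does not discriminate.
j_null = youden_index(true_pos=3, false_neg=2, true_neg=2, false_pos=3)

print(round(j_good, 3), round(j_null, 3))
```

The second example shows why the index is zero when the proportion of positive results is the same in both groups: sensitivity (0.6) and specificity (0.4) then sum to exactly one.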
The density measures, along with prevalence of positive staining, were used in secondary analyses to compare the three colonic layers assessed.
The primary analysis aimed to first calculate diagnostic accuracy and inter-rater agreement with all colonic layers included, as it was considered likely that the muscularis layer, with its reportedly higher prevalence of LTS [3, 12], would be critical for this, and then to repeat the calculations with the muscularis excluded, so as to model the biopsy setting, where the muscularis layer is never obtained.
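The subject-level rule (any score greater than zero, in any slide, from any included layer) and the effect of excluding the muscularis can be sketched as below. The data layout, the function name, and the 0 to 3 numeric coding of none/sparse/moderate/frequent are assumptions made for illustration only.

```python
from typing import Iterable

# Assumed numeric coding of the semi-quantitative grades (not specified in the text).
GRADE = {"none": 0, "sparse": 1, "moderate": 2, "frequent": 3}


def subject_is_positive(
    scores: Iterable[tuple[str, int]],
    excluded_layers: frozenset[str] = frozenset(),
) -> bool:
    """Return True if any (layer, score) pair from an included layer has score > 0."""
    return any(score > 0 for layer, score in scores if layer not in excluded_layers)


# Hypothetical subject whose only stained structures lie in the muscularis.
example_scores = [
    ("mucosa", GRADE["none"]),
    ("submucosa", GRADE["none"]),
    ("muscularis", GRADE["moderate"]),
]

all_layers = subject_is_positive(example_scores)
biopsy_model = subject_is_positive(example_scores, frozenset({"muscularis"}))
print(all_layers, biopsy_model)
```

In this hypothetical case the subject counts as positive when all layers are included but negative when the muscularis is excluded, which is the loss of sensitivity that the second set of calculations is designed to detect.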
The results from these initial analyses were used to compare the histological methods and raters in terms of their ability to predict a PD diagnosis. The most accurate method and raters were then used to compare LTS density in the three colonic layers.
Two-color immunolabelling was used to confirm the neuronal origin of the “fibers and dots” staining morphology (Fig. 3). The same monoclonal antibody against phosphorylated α-synuclein utilized for the immunoperoxidase procedures in Methods 1 and 6 (P-syn/81a, Biolegend [31]) was used in series with a rabbit polyclonal antibody against phosphorylated and non-phosphorylated neurofilament (Abcam ab 7795, Cambridge, MA). Signal development utilized an alkaline-phosphatase-activated red chromogen (Impact Red, Vector Laboratories, Burlingame, CA), in combination with a peroxidase-coupled bluish-black chromogen (ABC, Vector Laboratories, Burlingame, CA; 3,3’-diaminobenzidine with saturated nickel ammonium sulfate and imidazole, Sigma, St. Louis, MO) [36].
RESULTS
The appearance of the staining with several different methods is depicted in Figs. 1–3. Overall, in most subjects, tissue elements exhibiting even the most common “fibers and dots” staining pattern were present at low densities. Double-staining of selected sections with an antibody raised against a pan-neuronal marker (neurofilament protein) confirmed the impression gained from morphology alone that the “fibers and dots” morphology represents neuronal structures (Fig. 3i-p). Additionally, most of the presumptive specifically-stained structures were present in an anatomical context consistent with neuronal tissue, such as within nerve fascicles in the submucosa or within the intermyenteric plexus of the muscularis.
Inter-rater agreement and mean diagnostic performance were both poor to moderate when data from all colonic regions (lamina propria, submucosa and muscularis), templates and raters were included (Table 3). Only Methods 4, 5 and 6 achieved a mean Youden Index greater than 0.5 as well as ICC scores greater than 0.5. As diagnostic performance with the “perivascular dots” template, used only for Method 3, was uniformly poor with all raters, this pattern of staining was regarded as non-specific and scores using this template were not utilized in subsequent calculations. The diagnostic performance for Method 3 improved to an average Youden Index of 0.5 (from 0.0) after exclusion of results obtained with the perivascular template.
The next set of analyses (Table 4) was performed after exclusion of results from the muscularis layer, to model the biopsy situation. Diagnostic accuracy was overall fairly similar to that obtained with inclusion of the muscularis layer (results not shown). Results again showed poor to moderate inter-rater agreement and diagnostic performance, with average ICC values ranging from –0.07 to 0.65 and average Youden indices varying from –0.05 to 0.75. Again, only Methods 4, 5 and 6 achieved both a mean Youden Index greater than 0.5 and ICC scores greater than 0.5.
In a third set of analyses (Table 5), only the results from the two raters with mean Youden Indices greater than 0.5 were used (Raters 1 and 4 from Table 4, with mean Youden Indices of 0.7 and 0.625, respectively). The mean diagnostic performance and inter-rater agreement were both moderate to very good for Methods 1–6. Method 5 achieved 100% accuracy with both raters. Method 6 was 80% accurate with both raters while three other methods all achieved 70% accuracy, averaged across the two raters. In general, specificity was better than sensitivity; six methods had 100% specificity.
The results from Method 5 and the two most-accurate readers were then used to assess the prevalence, distribution and density of LTS within the three colon layers assessed (Table 6). Combining the scores from both raters, the submucosa had the highest prevalence of pathological LTS staining; 5/5 PD cases were positive for both raters in the submucosa versus 4/5 for the muscularis and 3/5 for the mucosa. The mean LTS density score within the mucosa was approximately 63% of that within the submucosa and muscularis, which were roughly equivalent.
DISCUSSION
This autopsy-based study, in contrast with the prior biopsy-based study by this same group [28], was successful in identifying several IHC methods with high accuracy for PD. The differing outcomes of the two studies most likely had several causes, including: 1) the autopsy specimen was much larger and hence more likely to contain a much larger number of neuronal elements, which may have been absent or very sparse in the biopsy samples; 2) the diagnosis of PD in each autopsied subject had been established through a full neuropathological examination, whereas the biopsied subjects had only a less-accurate, clinically-based diagnosis; 3) positive or specific staining was defined as that with a morphology consistent with a neuronal origin and/or a presence in an anatomical context consistent with a neuronal origin, whereas in the prior biopsy study all staining morphologies were accepted as potentially specific.
Several methods used in this study had acceptable diagnostic accuracy. Method 5 was the most accurate, in terms of both ICC and Youden Index. However, with only 5 subjects in each group, the 95% confidence intervals for sensitivity and specificity generally were very wide, ranging between 48% and 100%. Therefore the rank order of the top several methods might not be preserved with much larger subject numbers.
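To illustrate why intervals are so wide with 5 subjects per group, the sketch below computes an exact (Clopper-Pearson) binomial interval using only the standard library. The text does not state which interval method was used, so Clopper-Pearson is an assumption here; the function names are likewise illustrative. Under that assumption, 5 of 5 correct classifications gives a lower bound of roughly 48%, consistent with the range quoted above.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target: float, increasing: bool) -> float:
    """Solve func(p) = target on [0, 1] for a monotone func by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    if successes == 0:
        lower = 0.0
    else:
        # P(X >= successes | p) increases with p; find where it equals alpha/2.
        lower = _bisect(lambda p: 1 - binom_cdf(successes - 1, n, p), alpha / 2, True)
    if successes == n:
        upper = 1.0
    else:
        # P(X <= successes | p) decreases with p; find where it equals alpha/2.
        upper = _bisect(lambda p: binom_cdf(successes, n, p), alpha / 2, False)
    return lower, upper


low, high = clopper_pearson(successes=5, n=5)
print(f"5/5 correct: 95% CI {low:.3f} to {high:.3f}")
```

Even a method that classifies every subject in a group of five correctly is therefore statistically compatible with a true sensitivity or specificity of just under one half, which is why the rank order of the best methods should be treated as provisional.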
Raters may have widely varying accuracy. It is recommended that raters should receive more formalized training prior to commencing slide judging, in order to more clearly distinguish between truly positive LTS and common artifactual (non-specific or false-positive) staining patterns. Although a detailed training program, in slide-show format, was circulated to all raters, it would be preferable to conduct a training program in a videoconference format or in-person under a multi-headed microscope.
Positive and negative control slides were used, but control sections stained with omission of the primary antibody were not used; this may have resulted in a somewhat higher false-positive rate. However, it is noted that, with the two most accurate raters, specificities with several methods were very high (6 methods had 100% specificity), indicating a low rate of false-positive decisions. Also, all the antibodies used in the study have been extensively characterized.
Although this autopsy study identified several methods with useful accuracy for identifying PD subjects, positively- and specifically-stained nerve fibers were scarce and widely scattered in some subjects, raising concerns that randomly-acquired biopsies may often miss positive sites, resulting in a high false-negative rate with respect to a true PD diagnosis. Also, the amount of submucosa obtained in typical biopsies is often minimal. If colonic biopsies are to be used for studies of diagnostic accuracy, or to select subjects for clinical trials, multiple biopsy samples from each PD subject are therefore advisable to increase sensitivity. It is possible that future methods may have higher sensitivities.
Although prior studies had suggested that the muscularis layer had a considerably higher density and prevalence of LTS than the submucosa [3, 12], in this study they were roughly equivalent. This is a critical finding as biopsies do not contain the muscularis layer and therefore, to be a potentially useful diagnostic site, the submucosa and/or mucosa must contain an acceptable pathology prevalence. However, this was a small sample size, so this issue would need further exploration with a much larger and more diverse case series.
Due to our deliberate design to ensure adequate LTS densities in the PD colonic tissue, rather than sampling subjects to be representative of the PD population as a whole, we selected subjects with a long clinical symptom duration, averaging more than 15 years, and who had previously-demonstrated higher peripheral nervous system (submandibular gland and esophagus) LTS scores. Also, by coincidence, all the PD subjects were male. As a result of this deliberately-biased sampling, the prevalences and densities of colonic LTS in this small subset of PD cases are likely to be higher than those expected for the general PD population.
CONFLICTS OF INTEREST
The authors have no conflicts of interest to report.
ACKNOWLEDGMENTS
This study was supported by a grant from the Michael J. Fox Foundation for Parkinson’s Research (Grant ID: 9035.01). The Banner Sun Health Research Institute Brain and Body Donation Program is supported by the National Institute of Neurological Disorders and Stroke (U24 NS072026 National Brain and Tissue Resource for Parkinson’s Disease and Related Disorders), the National Institute on Aging (P30 AG19610 Arizona Alzheimer’s Disease Core Center), the Arizona Department of Health Services (contract 211002, Arizona Alzheimer’s Research Center), the Arizona Biomedical Research Commission (contracts 4001, 0011, 05-901 and 1001 to the Arizona Parkinson’s Disease Consortium) and the Michael J. Fox Foundation for Parkinson’s Research. AGC is a recipient of a “poste d’accueil Inserm”.
