Abstract
INTRODUCTION
Parkinson’s disease (PD) is a common neurodegenerative disease with varying degrees of cardinal manifestations, including resting tremor, rigidity and bradykinesia [1]. Based on distinct major symptoms, the heterogeneous entity of PD patients can be grouped into the postural instability and gait difficulty– dominant (PIGD), the tremor-dominant (TD) and the mixed subtypes [2, 3]. Patients with the PIGD subtype have a relatively malignant course [3], including shorter life expectancy, faster progression, worse prognosis, and higher risk of complications [4]. Routine dopamine replacement therapies [5] and surgical procedures [6] work with less success in treating predominant PIGD symptoms in the long term [6]. The PIGD subtype is thus a refractory group of PD population that needs further discrimination from other subtypes of PD. In clinical practice, the identification of subtypes is based on the clinical scales, which may incorporate subjective judgement. For this reason, an efficient and unbiased way to distinguishing patients with the PIGD subtype is essential for making appropriate treatment plans.
In the last few years, there has been increased interest in subtype-specific brain changes of PD and in particular, of PD patients with the subtype of PIGD [7–12]. It has been generally recognized that multiple neural mechanisms participate in the presence of distinct phenotypes of PD [2, 14]. Brain alterations in structure [12, 16] and function [12, 18] have been frequently reported in comparisons between the PIGD and the non-PIGD subtypes; however, the consensus of the specific brain biomarkers for the PIGD subtype is still far from being reached. Hence, a workable integration of multi-modal data is required to facilitate differential diagnosis of the PIGD subtype.
Along with the advances in machine learning algorithms, the machine learning-classifiers make it possible to effectively analyze complex and multivariate functional magnetic resonance imaging (fMRI) data [19]. Machine learning-based pattern recognition techniques such as support vector machine (SVM) [20, 21] have been widely applied to patient classification across various neurological and psychiatric disorders, with excellent results have been obtained [22–26]. The SVM algorithm allows studying brain characterizations at the individual level and is sensitive to subtle effects in the brain, exhibiting its high translational potential in clinical practice [24].
In the present study, we aimed to first introduce an SVM-based classification to distinguish patients with the PIGD subtype of PD from those with the non-PIGD subtypes, and then predicted classification outcomes would be compared to the existing clinical diagnosis categorization results. This proposed classification was done by preparing features from structural and functional MRI datasets of PD patients, constructing the classifier with the SVM algorithm, and assessing its performance using indices of sensitivity, specificity and accuracy, receiver operating characteristic (ROC) analysis, and the individual-by-individual agreements test as compared to the acknowledged categorization. As we would show, our newly proposed classification appeared to give satisfactory diagnostic results of expectation.
MATERIAL AND METHODS
Subjects
A total of 97 subjects (M/F 55/42, mean age 57.99 ± 9.50y) participated in the present study, of which 52 were PD patients (M/F 31/21, mean age 58.86 ± 9.02y) and 45 were normal controls (NC, M/F 24/21, mean age 56.98 ± 10.02y). All the patients were diagnosed according to the UK Parkinson’s Disease Society Brain Bank criteria for PD in the Department of Neurology at the Second Affiliated Hospital of Zhejiang University School of Medicine. The NCs were recruited from the communities in Hangzhou who were voluntary to participate in the study. Subjects with history of self-reported or clinically observed neurological/psychiatric disorders were excluded from the study. Using the clinical categorization proposed by Jankovic et al. [3], the recruited PD patients were grouped into the PIGD subtype (19 patients, M/F 11/8, mean age 58.68 ± 8.38y), TD subtype (25 patients, M/F 16/9, mean age 56.96 ± 8.85y) and mix subtype (8 patients, M/F = 4/4, mean age 65.25 ± 9.19y). Here the TD and mix subtypes were merged into one single group termed the non-PIGD group due to the main concern of the present study focusing on patients with the PIGD subtype of PD. Sixteen in 52 of the recruited patients were newly diagnosed and untreated. In the rest of the 36 treated patients, one used to receive irregular Traditional Chinese Medicine treatment. At least a 12-hour withdrawal from medication was required for each treated patient before the examination. Clinical examinations of the Unified Parkinson’s Disease Rating Scale (UPDRS), the Hoehn and Yahr scale (H&Y), duration of disease, and the Mini-Mental State Examination (MMSE) were investigated for all the recruited patients. The duration of disease was defined as the time since the self-conscious parkinsonian signs were present until the time when he/she participated in the study. Clinical examinations ensured the subjects were able to complete independent examination without extra aids. All the subjects provided written informed consent to give permission to participate in the study. The entire protocol has been reviewed and approved by the Ethics Committee of the second affiliated hospital of Zhejiang university school of medicine.
MRI acquisition
Multi-modal MR data were acquired on a 3.0T GE Signa EXCITE scanner with an 8-channel phase-array head coil for all the subjects at rest. The functional images were acquired with an echo-planar imaging (EPI) MRI sequence (TR/TE = 2000/30 ms; matrix = 64×64; FOV = 24×24 cm2; flip angle =80°; 23 slices; and thickness = 5mm). The DTI scans were acquired in the axial plane with EPI sequence. Thirty-one non-collinear diffusion sensitization gradients (b = 1000 s/mm2) and another non-weighted diffusion image (b0 = 0 s/mm2) were performed once with the following parameters: TR/TE = 2000/30 ms; FOV = 24 × 24 cm2; acquisition matrix = 128×128, 38 slices, thickness = 3 mm and flip angle = 90°. To improve anatomical structures, high resolution T1-weighted MR images were scanned using an axial fast spoiled gradientrecalled-echo (3D-FSGPR) sequence with following parameters: TR/TE = 5100/1.2 ms; matrix =256×256; FOV = 24×24 cm2; thickness = 1.2 mm; space = 0 mm; and 124 slices. During scanning, all the subjects were at rest with their eyes closed to remain immobile and physiologically stable. Foam pads and earplugs were used to reduce head motion artifacts.
Data preprocessing
Resting-state fMRI (rs-fMRI) data
The rs-fMRI data were analyzed using the Statistical Parametric Mapping toolbox (SPM8, Wellcome Trust Centre for Neuroimaging, London, UK) and REST toolbox (version 1.8) [27] in MATLAB R2009b (The MathWorks Inc, Natick, MA, USA). The first ten points of the data were discarded for magnetization equilibrium within the initial MR signals. The preprocessing workflow was performed including slice timing correction, realignment, normalization to the MNI template (McGill University, Montreal QC, Canada), spatial smoothing with an 8 mm Gaussian full width at half maximum (FWHM) to reduce space noise, removal of linearly signal-drifting trend, and a temporal band-pass filter (0.01–0.08 Hz) for all voxel series that eliminated physiological signal noise. The time series of the white matter (WM), cerebrospinal fluid (CSF), and 6 rigid body motion parameters were regressed out as nuisance variables. Two radiologists thoroughly visually checked all the data in case severe artifacts and distortion existed. Some subtle motor distortions can be adjusted, whereas the subjects with severe head motion in any direction greater than 2.5 mm or a rotational degree more than 2.5 degree during scanning were excluded from this study. For rs-fMRI data, we measured the indices of regional homogeneity (ReHo) [28] and global amplitude of low-frequency fluctuation (ALFF) [29] for each subject. In recent years, ALFF has been widely used to delineate the extent of spontaneous neuronal activities in disease states [29–32]. ReHo is another approach for reflecting regional spontaneous activities [28]. Regional changes of ALFF and ReHo in PD have been extensively found in subcortical and cortical regions [33–35]. For details of these measurements, please refer to our previous work [23].
T1-weighted data
The T1-weighted images were processed using SPM8 in the MATLAB R2009b. The voxel based morphology (VBM) analysis is a commonly used method to study brain structural changes [12, 37], which was performed with a refined registration method of DARTEL-VBM [38] to examine volumetric differences in both whole-brain and regional grey matter (GM) between the three groups. In patients with PD, GM loss was observed in areas involving the basal ganglia nuclei [12, 39–41], calcarine cortex [42, 43] and visual cortex [44, 45] and in particular, in the PIGD subtype compared to the TD subtype [12]. All the T1-weighted images were first manually reoriented along the anterior-posterior commissure before starting analysis and were screened by eyes for any apparent artifact. Then the reoriented images were segmented into GM, WM and CSF probability maps in the individual native space. The DARTEL method was performed by modeling the shape of individual brain to generate a series of increasingly accurate group-level GM, WM and CSF templates [38]. Lastly, with the final study-specific templates, all the GM, WM and CSF images in the native space were normalized into the MNI standard space and then preserved by the “modulation” option respectively, where the modulated images allow us to achieve absolute volumes of grey matter tissue while unmodulated images represent the relative concentration of these structures [46]. The normalized GM images were smoothed (FWHM = 8 mm) and the voxel size was resliced to 1.5 mm × 1.5 mm × 1.5 mm. To screen for signals outside the brain, a whole brain mask was used.
Diffusion tensor imaging (DTI) data
For all the subjects, the DTI data were analyzed using the FSL package and the BET toolbox (version 4.1.0, FMRIB Analysis Group, Oxford, UK). The head motions and eddy currents were first corrected with an affine registration approach. To eliminate effects outside the brain, intracranial signals were extracted to generate a binary brain mask. Using the tensor construction at a voxel level, the indices of fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD), and axial diffusivity (AD) were calculated in the native space for each subject. Clear evidence has shown that WM loss in PD is a frequent occurrence (reviewed in [47]). In particular, WM patterns differ in distinct PD subtypes [15, 49]. The WM group-wise statistics were performed in a voxel-by-voxel fashion in SPM. Maps of FA, MD, RD and AD were then normalized to the MNI space within a white matter mask. The normalized images were resampled into a voxel size of 2 × 2 × 2 mm3 followed by smooth with an 8 mm FWHM. No subjects were excluded in this study.
Feature extraction and selection
After data prepressing, for each subject, we have obtained GM, WM and CSF maps from T1-weighted data, ALFF, ReHo maps from rs-fMRI data and FA, MD, RD, AD maps from DTI data. To get rid of the features that were excessively redundant for the patients with PD, those multi-modal measurements of both the NC and the merged PD patients (merged by groups of the PIGD and the non-PIGD) were first compared, respectively. Then, the features that are not conducive to the classification of PD subtypes were removed, reducing computational load for the following feature selection. Unpaired two-sample t tests at a voxel-wise group level were performed with a statistical threshold set at P < 0.001 and cluster size >10 voxels with no correction limited. The regionsshowing significant functional and/or structural alterations in the merged PD group were set as regions of interest (ROI) for the subsequent subtype-specific feature extraction. Multi-modal features were extracted for each ROI using the Marsbar toolbox (version 0.43, http://marsbar.sourceforge.net/).
The SVM is a supervised, multivariate classification method consisting of training and testing phases [21, 50], where the training phase is to develop an algorithm for discriminating the two groups while the testing phase is to evaluate the classification performance on a blind prediction of unseen data [19, 24]. The classifier algorithm was developed using MATLAB platform (version 8.3.0.532) and the LIBSVM toolbox (http://www.csie.ntu.edu.tw/cjlin/libsvm/). We applied an SVM-Recursive Feature Elimination (SVM-RFE) algorithm embedded in 100 times of a 5-fold cross validation to construct SVM classification models. The nonlinear radial base function (RBF) kernel function was used in transforming features derived from training dataset to a higher space [51]. To identify the effective features with the highest discriminating power, we employed the SVM-RFE algorithm that is an feature selection strategy proposed by Guyon et al. [52, 53]. The SVM-RFE algorithm uses the weighted coefficients in SVM classification models to rank features and iteratively eliminates non-informative features with the smallest ranking criteria [52], which has been used in many clinical applications [19, 54–59]. The SVM-RFE algorithm was implemented in this study to select features with the highest discriminating power as inputs to the SVM algorithm. In each iteration, a new SVM classification model was trained and a new weight coefficient map was generated. The ranking criterion we used was the value of each weight coefficient vector ∥w i ∥. Features with low weight coefficient vector ∥w i ∥ were iteratively removed from the feature dataset, whilst features with high weight coefficient vector ∥w i ∥ were retained [52, 60]. The output of SVM-RFE algorithm is a ranked feature vector from the highest weight coefficient to the smallest weight coefficient. We repeated a 5-fold cross validation for 100 times to determine the feature size for each patient. The final feature size was determined when the classification model reached the highest averaged classification accuracy on the testing dataset after 100 times of 5-fold cross validation. The overall procedure of feature extraction and selection was displayed in Fig. 1. Our implementation of the SVM-RFE was delineated in a pseudo-code provided in the Supplementary materials (S-Figure 1).
Classifier performance assessment
After feature selection, the dimensions of feature vectors for each PD patient were reduced from 319 to 20. The preserved features for each patient were aligned into a single vector, which were then used to construct the SVM classifier for the PIGD and non-PIGD groups of PD. The main procedure included two steps: grid search for parameter C and G, and the leave-one-out cross-validation (LOOCV). The grid search method was employed to select the optimal parameter C and the kernel function parameter G. We used the LOOCV method to estimate the performance of the classifier. Given a training set of data samples, the classifier removes one data sample in each trial and trains the classifier on the rest of the data samples, with the removed sample used for model testing [61] (Fig. 2). The output of the proposed SVM classifier was a set of predicted values, according to which we identified the type of the subtypes to which subtype the test samples belong. Using these output values of the classifier, we drew a scatter plot for the two PD subtypes.
The performance of the SVM model was assessed using measures of accuracy, sensitivity, specificity and maximum area under the curve in the ROC analysis, which were defined as follows:
RESULTS
Demographic characteristics and clinical assessments were displayed in Table 1. All the PD patients and NCs age- and sex-mached. For the two PD subtypes, there were no statistic differences observed in age, sex composition, disease duration, the H&Y stage, MMSE scale and UPDRS. The comparisons of multi-modal brain patterns between the NC group and the merged PD group showed significant alterations in regions covering frontal, parietal, occipital, temporal cortices and cerebellum in patients with PD. More details please refer to Supplementary material (Fig. 3 and S-Table 1).
In the performance analysis of classification, the proposed classifier discriminated patients with the PIGD subtype with a diagnostic accuracy, sensitivity and specificity of 92.31%, 84.21% and 96.97%, respectively. The positive predictive value (PPV) was up to 94.12% and the negative predictive value (NPV) was 91.43%. The ROC analysis showed the maximum area under the ROC curve (AUCmax) reached 0.9585 (Fig. 4). The individual-by-individual agreement test showed the Kappa value was 0.83, suggesting an almost perfect diagnostic agreement of the proposed classifier with the existing clinical categorization. The scatterplot of the predicted values of the classifier exhibited the power of effectively discriminating the PIGD and non-PIGD subtypes (Fig. 5).
DISCUSSION
The present study for the first time introduced a SVM-based classifier for the differential diagnosis of patients with the PIGD subtype of PD using multi-modal MRI data. In general, our proposed classifier exhibited satisfactory classification power, with the diagnostic accuracy, sensitivity, specificity, and the maximum AUC up to 92.31%, 84.21%, 96.97%, and 0.9585, respectively. Moreover, the Kappa value of 0.83 suggested an excellent diagnostic agreement between the proposed classifier performance and the current clinical categorization.
Patients with the PIGD and non-PIGD subtypes of PD have been linked to different associated clinical manifestations [3, 63], pathologic patterns [14, 64], genetics [65], and degrees of risk in developing complications [66–68]. These findings have been revealed and confirmed by means of neuroimaging techniques, where functional and structural differences between the two subtypes have been often observed by studies across various modalities, including positron emission tomography (PET) [69], single photon emission computed tomography (SPECT) [17, 70] and fMRI [12, 72]. However, conclusive and definitive brain biomarkers specific for the PIGD subtype are still lacking. Thus, based on prior studies [57, 74] investigating single modality data, we integrated functional and structural MR data (including T1-weighted and DTI data) in one single study to analyze patients with specific PD subtypes. Multi-modal features of the ALFF, ReHo, brain volumes, and DTI properties of WM have been extensively used to detail alterations in brain functions and structures [12, 76], giving substantial support for better understanding clinical disorders.
Multi-modal MR data give massive features for constructing the classifier though, where features excessively more than examples can lead to overfitting [19]. One way to overcome the adverse effect of overfitting is to select only effective features, which is essential for developing a classifier with a high predictive accuracy [77]. The SVM-RFE algorithm uses weighted coefficients as the selection criterion, which facilitates the full use of the training dataset and avoids overfitting [52, 77]. Recently, advances in the SVM-based classifiers have allowed us to classify patients with PD and in particular, with the phenotypically specific PD patients. The SVM-based imaging recognition classifiers can help distinguish PD patients from healthy controls with SPECT images [78] and the combination of T1-weighted and functional MR images [73], separate PD patients and suspected PD patients with susceptibility-weighted images (SWI) [79] and DTI [80] scans and, of note, diagnose patients with predominant tremor from those having essential tremor with rest tremor with structural MR images [25]. In these previous attempts, the SVM-based classifications have exhibited excellent performance.
Previous successful applications have given us substantial support for the use of SVM-based classifier in differentiating patients with the PIGD subtype of PD. To our best knowledge, there have been no studies reporting the exact diagnostic rates for phenotypic patients with PD. In the performance analysis, our classifier showed a high diagnostic accuracy, sensitivity, specificity, and maximum AUC value, exhibiting a promising diagnostic power in discrimination of the patients with the PIGD subtype of PD. However, it was shown that some PD patietns were wrongly grouped by our classifier, which could be due to the following two reasons. First, the diagnostic rates of PD itself need to be improved. The misdiagnosis and missed diagnosis probably have occurred at the first unpaired two-sample t test procedure. Patients with predominant parkinsonian symptoms but not PD were wrongly recruited in the present study. In a prospective clinicopathological study, an autopsy confirmed diagnosis of PD was only 76% [81]. Even though using the UK Parkinson’s disease brain bank criteria for clinical diagnosis, there are about 10% of patients that have been diagnosed as PD in life still subject to an alternative diagnosis during post-mortem inspections [82]. Besides, these disabling clinical phenomena, such as freezing of gait, balance disturbance, postural instability can separately exist in PD [83–85], which are also often observed in many other parkinsonisms [86]. Thus, current diagnostic criteria of PD in clinical practice are not an exact science, which may constrain the effectivity and validity of the pattern recognition in the consequent procedures for specific PD patients. Second, the diagnosis of the PIGD subtype of PD is dynamic, with subtype transition from the non-PIGD to PIGD subtype in patients with PD have been observed [66]. Thus, overlapping brain patterns existing between patients of the two subtypes may by nature interfere MRI pattern discrimination in the machine learning procedure. It should be stressed again that the main purpose of our present work is to develop a fast, automatic classification comparable to the existing clinical categorization, and thus to substitute for the complex and subjective large-scale computing. In light of the aforementioned, the Kappa test exactly fitted our objective to estimate the diagnostic agreement between the proposed classifier performance and the existing clinical categorization. Since the Kappa value ranges from –1 to 1, where the value calculated for classifiers approaching “1” represents that the classifier performance is more realistic rather than by random chance and the statistic of 0.81–1.00 indicates an almost perfect agreement [87, 88]. The Kappa value of 0.83 in the diagnostic agreement for the predicted classification results again demonstrated that this classifier was comparable to the existing clinical categorization.
In conclusion, we for the first time introduced the machine learning-based classification to distinguish patients with the PIGD subtype of PD from those with the non-PIGD subtype at the individual level with multi-modal MRI scans. This classifier showed a promising diagnostic accuracy of 92.31% (AUCmax = 0.9585) and an almost perfect agreement for differential diagnosis with clinical categorization (Kappa statistic = 0.83). With these satisfactory results, we successfully proposed an automatic, accurate and specified classification for distinguishing patients with the PIGD subtype of PD; moreover, we demonstrated the availability and validity of this SVM-based classification in the differential diagnosis of specific PD populations.
CONFLICTS OF INTEREST
None.
Footnotes
ACKNOWLEDGMENTS
This study was supported by the National Science & Technology Supporting Program, China (Grant No. 2012BAI10B04) and the Natural Science Foundation of Zhejiang Province, China (Grant No. LY12H09006). We would like to express our gratitude to all the participants in this study. Furthermore, we also thank Dr. Dan Long for his initial attempts in this project and Dr. Hsu-lei Lee for her critical and careful amendments.
